New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Live stream stalls with "playback stalling in low buffer" #1172
Comments
looks like you just have rebuffering due to network conditions. |
@mangui This issue may have multiple different manifestations. The variety that I experience consistently is represented with successive level info requests without requests for any new fragments. The streamcontroller appears to believe that the end of the buffer is greater than the live edge, as defined by the comment and section of code provided below. This is regardless of bandwidth availability in this case and proceeds for an arbitrary amount of time, but can go on ad infinitum. A log is pasted below. You can see the successive level loading begin starting at the first instance of the following log line.
In the case of the log it was roughly a two minute delay, however as mentioned this can go on for hours. There is an AMP link below seeded with the ABC Television live stream, however I am able to reproduce the issue consistently at the link Kiran provided with the HSN stream. Whereas the HSN stream tends to recover more readily, the Uplynk stream in the AMP link does not. Screen cast Comment
Code block [lines 4129-4131 of hls.js 0.7.9] Log
|
there is an issue parsing the fragment below. this is completely messing up the
I what would be great is to have a playlist extract when that happens. |
@mangui Attached are three playlist files requested in sequence. The DISCONTINUITY and DISCONTINUITY-SEQUENCE tags are present. As mentioned before this issue is occurring often during content transitions, specifically in the ABC Television Uplynk streams, here. Some quick research yielded this ExoPlayer ticket, which potentially seems related. I merely glanced at what the spec stipulates, but I don't see any removal of DISCONTINUITY tags from successive playlist requests. Please let us know what else you might possibly need. In the meantime we are going to move forward and document other manifestations of the same (visual) issue. Many thanks for your assistance. |
ok i think i understand what is going on.
|
@mangui We will check ourselves but do you know if this discrepancy violates the spec? The pushback we're going to get is that this works in Safari in relation to that question. |
If my assumption is correct, yes that should violate the spec. but Safari might be more robust to handle that cases. |
Guillaume - Been a while!
OK, we weren't sure if a VOD capture would be handled in the same way
that would encounter this issue. Let us see if we can do that.
Also, I have noticed if I have two of our AMP players sides by side in
two tabs (using link Justin provided above), sometimes one will get the
issue and their other would not. Not sure what that means.
Also, FYI, I have also tested the upLynk player and it seems so far to
handle it ok.
On Thu, Jun 8, 2017, at 12:36 AM, Guillaume du Pontavice wrote:
If my assumption is correct, yes that should violate the spec. but
Safari might be more robust to handle that cases.> I would be glad to enhance hls.js robustness but I would need a
stream capture highlighting the issue.> — You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub[1], or mute the
thread[2].>
|
@mangui, We were able to capture a representation of the issue using a perl script that meets the primary characteristics. That script is below in code. The script we're using doesn't insert DISCONTINUITY tags however, thus the issue in the capture could merely be hls.js reacting to the absence of the tag. That said, the issue as represented in the captured stream behaves exactly like the issue in the actual live stream, aside from the absence of the discontinuity tags and some additional and different logging. Here's the stream, which fails 100% of the time @ 6:59: http://projects.mediadev.edgesuite.net/customers/abctv/streamcapture/j003/manifest.m3u8 Let us know what you're findings are. In the meantime we have engaged Uplynk VDMS (Verizon) and asked for assistance producing a capture of the stream with the DISCONTINUITY tags inserted.
|
perfect, will check tonight |
@mangui Many thanks for your assistance! |
…e of gap if delta between expected timestamp and received timestamp is bigger than 10s, reset initPTS. this helps overcoming issues in streams in which DISCONTINUITY are sometimes missing related to #1172
ok there is definitely discontinuities in timestamps with your playlist. |
@mangui thanks again. I patched our current build using 0.7.9 with the change but I'm not seeing an improvement. You can take a look here if you like. Can you let me know which portion of the spec it is that you believe the playlist is violating? The relevant sections seemed to be: Section 3: https://tools.ietf.org/html/draft-pantos-http-live-streaming-21#section-3 All of those are referenced under the discontinuity section: https://tools.ietf.org/html/draft-pantos-http-live-streaming-21#section-4.3.2.3 I don't see a clear indication of the expectation around timestamps therein. If there is, we can take this to Verizon and ask them to fix the instances under which the timestamps diverge. Thank you! |
https://tools.ietf.org/html/draft-pantos-http-live-streaming-23#section-3
|
@mangui Thank you. I have to be honest - I agree with you that it should violate the spec, but I think that section can be interpreted such that the timestamps can diverge provided the discontinuity is present and the timestamps are received in an uninterrupted fashion. I suppose it depends upon the definition of "uninterrupted" in this context. Channeling my inner Bill Clinton ;) We'll ask internally about the spec with a resource who knows it well. Apologies regarding the CORS issues. Unfortunately this is the nature of ABCTV's environment, and we work around those issues locally in a couple of different ways (Charles and a websocket that serves as a proxy for the CORS). In the meantime, if you want to work around the CORS issue, you can emulate the attached Charles rewrite. Note that we have all had trouble importing the .chls, thus we have had to input it manually. I've added screen shots for that. Thanks again! |
in your case there is a timestamp discontinuity without |
@mangui The reason the discontinuity is missing is because the script we use to capture the live stream doesn't insert them. We're trying to get you a better capture including the discontinuity tags. |
@mangui We are having some trouble procuring a capture that retains all of the HLS characteristics and surfaces the issue. Would it be possible for you to try the following with this sample page and see if you are able to reproduce the issue?
This issue is slightly different than the one we have been trying to capture in that it is triggered exclusively by a bitrate switch, but appears to run through the same routine as most of the logging is identical. Let us know if you have any trouble with the sample page. Thank you! |
…r on level switch with unknown PTS this helps keeping levels in sync with PTS and avoid setting up a wrong start offset, especially when level switch happens on a discontinuity related to #1172
@mdtjrfg I spotted an issue while testing your stream : while switching to a new level on a discontinuity: if we load straight away a fragment which does not use the same timebase, the playlist start offset is wrong. |
@mangui Thanks! I have been unable to reproduce the discontinuity use case recently and with your latest dist. We will continue to test and determine if it is still reproducible at some rate. The bitrate switch case is still consistently reproducible however, at a fairly close to 100% rate, as defined by the following steps:
Here's a screen cast that depicts the issue. We see roughly the same output as is seen with the discontinuity case. https://www.screencast.com/t/vK0JOANt Could this case be addressed by your change if leveraged outside of the discontinuity case? |
interesting. there should be something wrong in https://github.com/video-dev/hls.js/blob/master/src/helper/level-helper.js#L9 in theory hls.js should be able to find overlapping fragments here: if it was not able to find an overlapping frag with PTS info, then this should be either because we didnt update PTS info (should be done here: hls.js/src/controller/stream-controller.js Line 1148 in 4f551cc
i will add more logs to understand what is going on. |
in case a live playlist has been refreshed between frag loading and updateFragPTSDTS(), fragments array will not contain the initial frag object. As updateFragPTSDTS() is doing the PTS update on the original frag object, we need to update fragments array with this original frag, so that PTS info could be propagated in the playlist. this to avoid invalid sliding computation related to #1172
ok plz recheck with latest diff, I found an issue that should affect a bunch of live playlists ... basically when switching to a new rendition, if it happens that the playlist gets refreshed before the end of the parsing of the first fragment at this new rendition, then sliding computation will be incorrect. scenario is:
once sliding is wrong, timestamps are messed up, and it ends up in |
@mangui Our testing indicates that this change addresses the issue. For the time being we are using the dist. What steps are required on your end to apply this fix to an official release? Thanks again for your assistance, it is greatly appreciated. |
https://github.com/video-dev/hls.js/releases/tag/v0.7.11 includes the fix |
looks like Vevo is already using 0.7.11 |
@mangui - @vitalibozhko noted anecdotally that this issue is more reproducible when both a level switch and a discontinuity are present concurrently. My understanding is that in or outside those conditions this issue is relatively difficult to reproduce. |
Here are some more logs We can see two scenarios from these logs:
We're trying to figure out what's wrong in each individual scenario but unfortunately we still don't have a solid understanding of the whole picture of startPTS management across demuxer/remuxer and stream controller. |
@mangui - We (@vitalibozhko and we) are stuck on this one right now. Our QA with lesser connections can reproduce easily, but many others cannot. Do you have any guidance or suggestion for how we can dig further into this? Thanks |
Date() of the beginning of the first fragment related to #1172
on your first scenario : both level switch + discontinuity : hls.js cannot compute the sliding. we would need to use PROGRAM DATE TIME info as a time reference to sync the playlist. if playlist A sliding is 1000 and its first frag PROGRAM-DATE-TIME is 2017-08-20 1:10:00 AM this logic should be added here, only for live playlist for which we still have I just pushed some code to expose the date of each level. it will simplify the computation. |
@mangui Thank you, we'll give this a shot. Note that there was some confusion on the ABCTV end in which the original issue you addressed wasn't working because of a reference to the wrong version of the hls.js dist. That is working on their end now. |
ok then I guess now they should only face issue 1 (level switch + discontinuity) |
@mangui That is correct. Thank you. |
@mangui Is b1bd73e and the related addition of |
b1bd73e helps on level switch / discontinuity when there are overlapping continuity ranges between two rendition, which was the case in your scenario 1.
this should address most of the issues. |
…M-DATE-TIME (if available) related to #1172
@mangui Thank you for implementing the changes. I had gotten as far as the sliding calculation, however due to my unfamiliarity with the codebase, hadn't yet advanced to adjusting the fragment values. This confirms at least that my thinking was correct using the changes in b1bd73e as a guide, and I am becoming more familiar with project architecture and the raw code. I stashed my changes in a local branch. Unfortunately when throttled at "Regular 3G" on a Nexus 6 with Android 7.0, the issue is still present and as bad as it's ever been. I'll continue looking into it along with you, and thank you for your time. Logs are attached of an instance in which the issue occurred and recovered over different time frames. |
@mangui These changes seem to be more impactful but I do still encounter the issue. Attached logs and screen shots show two different cases: logs-3 / ptsadjustment-3 in which the PTS diverges significantly prior to the issue occurring, and logs-4 / ptsadjustment-4 in which the PTS ends up as NaN; noting that this particular case is isolated in Will continue to dig using newly available info on my end. [logs-4.txt](https://github.com/video-dev/hls.js/files/1261975/logs-4.txt) |
…tinuity/PDT sliding computation related to #1172
I added one more check so that it could use the new PDT fallback logic in NaN case.
|
@mangui It appears as if the NaN case that you added the check for is different than the one I am running into; that appears to be the primary source of the stalling. I am seeing a level (in this case level 2) with an array of fragments that all have NaN I then encounter successive cases in which Followed by another level load (level 2 referred to earlier) that looks like this: So the question we need to answer is where these fragments are being seeded with NaN start values. I'll continue to investigate. |
@mangui It seems that the aforementioned NaN issue could be happening in relation to code added for the scenario 1 use case. See the following three screen shots: This screen shot represents fragment metadata at the point that the fragment object is populated in This screen shot represents fragment metadata at the point that the fragment object is populated in This screen shot represents the fragment array produced by This screen shot represents the "last level" (level 2) when it is referenced through I will continue to try and determine what is causing the NaN value, but this information at least seems to indicate the general location within the source in this case was level 2 / level 3 and he scenario 1 use case. Thanks again for your help! |
this was messing up frag.endPTS computation. and propagating NaN values related to #1172
playlist-loader produces worth rechecking with new pushed commit. thanks for your patience 🥇 |
@mangui, Thus far testing seems to indicate that your latest change addresses the issue. I am seeing the same logging sequence, but absent the NaN values and presence of the stalling. As of now I have not reproduced either of the NaN related issues, when a discontinuity throws off the fragment processing, I see fast recovery. I'll continue testing and report back with further results. |
\o/ keep us posted |
@mangui we don't see the issue happening anymore, thank you very much for your help. Is it possible to make a 0.8.2 release anytime soon? |
Environment
Steps to reproduce
Thanks for your help! Expected behavior Actual behavior Console output [log] > main stream:FRAG_LOADING->PARSING logger.js:37 [log] > Parsing 44 of [40 ,44],level 4, cc 8 logger.js:37 [log] > Parsed audio,PTS:[264.734,270.750],DTS:[264.734/270.750],nb:564,dropped:0 logger.js:37 [log] > Parsed video,PTS:[264.798,270.804],DTS:[264.731/270.737],nb:180,dropped:0 logger.js:37 [log] > main stream:PARSING->PARSED logger.js:37 [log] > main buffered : [6.008,84.216][96.265,114.281][126.328,138.338][150.473,270.737] logger.js:37 [log] > latency/loading/parsing/append/kbps:69/8/21/5/70602 logger.js:37 [log] > main stream:PARSED->IDLE logger.js:37 [log] > set loadLevel:2 logger.js:37 [log] > switching to level 2 logger.js:37 [log] > loading playlist for level 2 logger.js:37 [log] > main stream:IDLE->WAITING_LEVEL logger.js:37 [log] > main stream:WAITING_LEVEL->IDLE logger.js:37 [log] > main stream:IDLE->WAITING_LEVEL logger.js:37 [log] > main stream:WAITING_LEVEL->IDLE logger.js:37 [log] > main stream:IDLE->WAITING_LEVEL logger.js:37 [log] > main stream:WAITING_LEVEL->IDLE logger.js:37 [log] > live playlist, reload in 5441 ms logger.js:37 [log] > level 2 loaded [41,45],duration:30.03 logger.js:37 [log] > live playlist - outdated PTS, unknown sliding logger.js:37 [log] > No frag in previous level to align on logger.js:37 [log] > live playlist, switching playlist, unknown, load middle frag : 44 logger.js:37 [log] > Loading 44 of [41 ,45],level 2, currentTime:241.424,bufferEnd:270.737 logger.js:37 [log] > main stream:IDLE->FRAG_LOADING ?src=http%3A%2F%2Fsyd.foxbat.live%2Fyatk2506v%2Fmaster.m3u8&enableStreaming=true&autoRecoverError=true&enableWorker=true&dumpfMP4=false&levelCapping=-1&defaultAudioCodec=undefined:370 parsing level duration :216us,count:39 logger.js:37 [log] > Loaded 44 of [41 ,45],level 2 logger.js:37 [log] > main stream:FRAG_LOADING->PARSING logger.js:37 [log] > Parsing 44 of [41 ,45],level 2, cc 0 logger.js:37 [log] > main:discontinuity detected logger.js:37 [log] > main:switch detected blob:http://streambox.fr/2335b399-2779-439a-85a7-41a9f920d019:516 [log] > manifest codec:mp4a.40.2,ADTS data:type:2,sampleingIndex:0[96000Hz],channelConfig:2 blob:http://streambox.fr/2335b399-2779-439a-85a7-41a9f920d019:516 [log] > parsed codec:mp4a.40.5,rate:96000,nb channel:2 blob:http://streambox.fr/2335b399-2779-439a-85a7-41a9f920d019:516 [log] > audio sampling rate : 96000 logger.js:37 [log] > InitPTS for cc:0 found from video track:666544 logger.js:37 [log] > main track:audio,container:audio/mp4,codecs[level/parsed]=[mp4a.40.2/mp4a.40.5] logger.js:37 [log] > main track:video,container:video/mp4,codecs[level/parsed]=[avc1.4d0028/avc1.4d4028] logger.js:37 [log] > Parsed audio,PTS:[18.018,24.034],DTS:[18.018/24.034],nb:564,dropped:0 logger.js:37 [log] > Parsed video,PTS:[18.085,24.091],DTS:[18.018/24.024],nb:180,dropped:0 logger.js:37 [log] > main stream:PARSING->PARSED logger.js:37 [log] > main buffered : [6.008,84.216][96.265,114.281][126.328,138.338][150.473,270.737] logger.js:37 [log] > latency/loading/parsing/append/kbps:155/9/25/4/20520 logger.js:37 [log] > main stream:PARSED->IDLE logger.js:37 [log] > loading playlist for level 2 logger.js:37 [log] > live playlist, reload in 5776 ms logger.js:37 [log] > level 2 loaded [42,46],duration:30.03 logger.js:37 [log] > live playlist sliding:6.006 ?src=http%3A%2F%2Fsyd.foxbat.live%2Fyatk2506v%2Fmaster.m3u8&enableStreaming=true&autoRecoverError=true&enableWorker=true&dumpfMP4=false&levelCapping=-1&defaultAudioCodec=undefined:370 parsing level duration :213us,count:40 logger.js:37 [log] > loading playlist for level 2 logger.js:37 [log] > live playlist, reload in 5532 ms logger.js:37 [log] > level 2 loaded [43,47],duration:30.03 logger.js:37 [log] > live playlist sliding:12.012 ?src=http%3A%2F%2Fsyd.foxbat.live%2Fyatk2506v%2Fmaster.m3u8&enableStreaming=true&autoRecoverError=true&enableWorker=true&dumpfMP4=false&levelCapping=-1&defaultAudioCodec=undefined:370 parsing level duration :211us,count:41 logger.js:37 [log] > loading playlist for level 2 logger.js:37 [log] > live playlist, reload in 5766 ms logger.js:37 [log] > level 2 loaded [44,48],duration:30.03 logger.js:37 [log] > live playlist sliding:18.018 ?src=http%3A%2F%2Fsyd.foxbat.live%2Fyatk2506v%2Fmaster.m3u8&enableStreaming=true&autoRecoverError=true&enableWorker=true&dumpfMP4=false&levelCapping=-1&defaultAudioCodec=undefined:370 parsing level duration :212us,count:42 logger.js:37 [log] > loading playlist for level 2 logger.js:37 [log] > live playlist, reload in 5544 ms logger.js:37 [log] > level 2 loaded [45,49],duration:30.03 logger.js:37 [log] > discontinuity sliding from playlist, take drift into account logger.js:37 [log] > live playlist sliding:24.091 ?src=http%3A%2F%2Fsyd.foxbat.live%2Fyatk2506v%2Fmaster.m3u8&enableStreaming=true&autoRecoverError=true&enableWorker=true&dumpfMP4=false&levelCapping=-1&defaultAudioCodec=undefined:370 parsing level duration :208us,count:43 logger.js:37 [log] > loading playlist for level 2 logger.js:37 [log] > live playlist, reload in 5545 ms logger.js:37 [log] > level 2 loaded [46,50],duration:30.03 logger.js:37 [log] > discontinuity sliding from playlist, take drift into account logger.js:37 [log] > live playlist sliding:30.097 logger.js:37 [warn] > playback stalling in low buffer @270.713359 Here's a typical m3u8 sequence: #EXTM3U #EXT-X-VERSION:3 #EXT-X-TARGETDURATION:8 #EXT-X-MEDIA-SEQUENCE:2 #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0003.ts #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0004.ts #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0005.ts #EXT-X-DISCONTINUITY #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0001.ts #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0002.ts #EXTM3U #EXT-X-VERSION:3 #EXT-X-TARGETDURATION:8 #EXT-X-MEDIA-SEQUENCE:3 #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0004.ts #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0005.ts #EXT-X-DISCONTINUITY #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0001.ts #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0002.ts #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0003.ts #EXTM3U #EXT-X-VERSION:3 #EXT-X-TARGETDURATION:8 #EXT-X-MEDIA-SEQUENCE:4 #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0005.ts #EXT-X-DISCONTINUITY #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0001.ts #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0002.ts #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0003.ts #EXTINF:6.0060, http://204.236.220.132/encodes/default_1080p_0004.ts |
I was able to fix my issue. I added EXT-X-DISCONTINUITY-SEQUENCE tags to the manifest and the player continues to download segments after discontinuities and rendition changes. Makes sense now that I have realized what was going on. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Environment
Steps to reproduce
http://video-dev.github.io/hls.js/demo/?src=http%3A%2F%2Fhsn.mpl.miisolutions.net%2Fhsn-live01%2F_definst_%2Fsmil%3AHSN1_ipad.smil%2Fplaylist.m3u8&enableStreaming=true&autoRecoverError=true&enableWorker=true&dumpfMP4=false&levelCapping=-1&defaultAudioCodec=undefined
Expected behavior
Playback should not stop
Actual behavior
playback stops after few minutes
Console output
The text was updated successfully, but these errors were encountered: