Youtube: Only downloading last 2 hours of 3 hour video. (with hack as a workaround) #26330
Comments
|
YouTube is definitely up to something new: I haven't seen this behavior before, and it started about 5-6 days ago. As most of you know, livestreams used to show only the last 2 hours, even on the web page, for some amount of time until they were processed into a VOD. I may be repeating what has already been said, but since the time it takes to process into a VOD could differ from person to person depending on region, I wanted to note that I'm experiencing exactly the same thing, which likely everybody is. |
|
Same issue here.
$ command youtube-dl --version
2020.07.28
$ command youtube-dl --verbose 'https://www.youtube.com/watch?v=Mm0KCzYpMhQ'
[debug] System config: []
[debug] User config: ['--ignore-errors', '--no-mtime', '--console-title']
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.youtube.com/watch?v=Mm0KCzYpMhQ']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2020.07.28
[debug] Python version 3.8.3 (CPython) - Linux-5.7.9-arch1-1-x86_64-with-glibc2.2.5
[debug] exe versions: ffmpeg 4.3.1, ffprobe 4.3.1, rtmpdump 2.4
[debug] Proxy map: {}
[youtube] Mm0KCzYpMhQ: Downloading webpage
[youtube] Mm0KCzYpMhQ: Downloading m3u8 information
[youtube] Mm0KCzYpMhQ: Downloading MPD manifest
[debug] Default format spec: bestvideo+bestaudio/best
[debug] Invoking downloader on 'https://manifest.googlevideo.com/api/manifest/dash/expire/1597700011/ei/S6M6X_zOHpOcs8IPi7SF-Ak/ip/50.999.999.85/id/Mm0KCzYpMhQ.1/source/yt_live_broadcast/requiressl/yes/tx/23908007/txs/23908006%2C23908007/hfr/all/as/fmp4_audio_clear%2Cwebm_audio_clear%2Cwebm2_audio_clear%2Cfmp4_sd_hd_clear%2Cwebm2_sd_hd_clear/force_finished/1/vprv/1/keepalive/yes/fexp/23883098/itag/0/playlist_type/DVR/sparams/expire%2Cei%2Cip%2Cid%2Csource%2Crequiressl%2Ctx%2Ctxs%2Chfr%2Cas%2Cforce_finished%2Cvprv%2Citag%2Cplaylist_type/sig/AOq0QJ8wRQIgaBMCXhRtrGe2SeAHys3agvoV10DHyPqygiCOG-_PfjcCIQCBWI3JoZSHDc-Eu5xkAU2Xi_Jpll4aYzVP1m4HEE0g3A%3D%3D'
[dashsegments] Total fragments: 3600
[download] Destination: 【APEX耐久】ダイアモンドになるまで終わらないラストバトル!!【湊あくあ】-Mm0KCzYpMhQ.f299.mp4
[download] 100% of 3.01GiB in 10:12
[debug] Invoking downloader on 'https://manifest.googlevideo.com/api/manifest/dash/expire/1597700011/ei/S6M6X_zOHpOcs8IPi7SF-Ak/ip/50.999.999.85/id/Mm0KCzYpMhQ.1/source/yt_live_broadcast/requiressl/yes/tx/23908007/txs/23908006%2C23908007/hfr/all/as/fmp4_audio_clear%2Cwebm_audio_clear%2Cwebm2_audio_clear%2Cfmp4_sd_hd_clear%2Cwebm2_sd_hd_clear/force_finished/1/vprv/1/keepalive/yes/fexp/23883098/itag/0/playlist_type/DVR/sparams/expire%2Cei%2Cip%2Cid%2Csource%2Crequiressl%2Ctx%2Ctxs%2Chfr%2Cas%2Cforce_finished%2Cvprv%2Citag%2Cplaylist_type/sig/AOq0QJ8wRQIgaBMCXhRtrGe2SeAHys3agvoV10DHyPqygiCOG-_PfjcCIQCBWI3JoZSHDc-Eu5xkAU2Xi_Jpll4aYzVP1m4HEE0g3A%3D%3D'
[dashsegments] Total fragments: 3600
[download] Destination: 【APEX耐久】ダイアモンドになるまで終わらないラストバトル!!【湊あくあ】-Mm0KCzYpMhQ.f140.m4a
[download] 100% of 151.43MiB in 06:39
[ffmpeg] Merging formats into "【APEX耐久】ダイアモンドになるまで終わらないラストバトル!!【湊あくあ】-Mm0KCzYpMhQ.mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel repeat+info -i 'file:【APEX耐久】ダイアモンドになるまで終わらないラストバトル!!【湊あくあ】-Mm0KCzYpMhQ.f299.mp4' -i 'file:【APEX耐久】ダイアモンドになるまで終わらないラストバトル!!【湊あくあ】-Mm0KCzYpMhQ.f140.m4a' -c copy -map 0:v:0 -map 1:a:0 'file:【APEX耐久】ダイアモンドになるまで終わらないラストバトル!!【湊あくあ】-Mm0KCzYpMhQ.temp.mp4'
Deleting original file 【APEX耐久】ダイアモンドになるまで終わらないラストバトル!!【湊あくあ】-Mm0KCzYpMhQ.f299.mp4 (pass -k to keep)
Deleting original file 【APEX耐久】ダイアモンドになるまで終わらないラストバトル!!【湊あくあ】-Mm0KCzYpMhQ.f140.m4a (pass -k to keep)
The result was exactly the last 2 hours of the 6:54:07 video. Using Firefox for Linux or the YouTube app for Android, the video plays without any problems, though Chat Replay hasn't been made available yet. |
|
Same problem. I spent a whole evening downloading the video repeatedly and found that the cause is the audio, which is limited to a 2-hour duration. Regardless of bitrate, both 139 and 140 have a duration of 2:00:00, while the video is exactly 3:17:51 long. |
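To put a number on that mismatch, here is a minimal Python sketch that compares the two durations quoted above. The helper names (`parse_hms`, `format_hms`, `missing_audio`) are made up for illustration and are not part of youtube-dl.

```python
def parse_hms(text):
    """Convert 'H:MM:SS' (or 'MM:SS') into a number of seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def format_hms(seconds):
    """Convert a number of seconds back into 'H:MM:SS'."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def missing_audio(video_length, audio_length):
    """Seconds of video that have no matching audio (never negative)."""
    return max(0, parse_hms(video_length) - parse_hms(audio_length))


# Durations reported in the comment above.
gap = missing_audio("3:17:51", "2:00:00")
print(format_hms(gap))  # 1:17:51
```

So in this case the first 1:17:51 of the stream is what the audio formats do not cover.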
|
Same problem, but it seems to go away within about 2 days after the live stream has finished (please keep this in mind when trying to reproduce the issue). |
|
OP of the linked issue here (on a new account, don't worry about it, I'm just dumb...). Yes, they are basically the same issue. Or rather, your issue is more current, because YouTube has literally changed how they handle livestream recordings in the past week. When I posted my issue, livestream recordings were clipped on the YouTube site until processing finished, just like what youtube-dl is giving. Now the full video is available to browser users, but youtube-dl is still somehow finding the clipped one. So I agree this now looks like a genuine bug.
(As an aside, this is not the first time the full recording has been available to regular desktop browser users. There was a period of about a week, a month or two ago, where I could consistently get the full video, but only in Chrome, not Firefox. That was possibly a trial run for what we're seeing now. After that period it went back to the 2 hour clipped recording that has been standard for years.)
The workaround you posted, editing the json to include the missing fragments, is very neat, but without trying a bunch of cases I'm not sure how robust it is. If you would rather just filter these videos and wait for processing to occur (for people with relatively automated workflows), I've had success with using
As a word of warning for anyone else this comes up for, I've run into some very weird cases experimenting with youtube-dl on unprocessed livestreams over 2 hours in the past week. In one case, the video and audio were both downloaded normally (well, slowly, since downloading fragmented files is slow, but with no errors or anything), but when they were combined I was left with just a 15 second long video from somewhere in the middle, and the audio was gibberish. I also saw a case where, when I called youtube-dl, there were literally 0 formats available, as if it saw there was a video but couldn't get any data at all; subsequent calls couldn't reproduce that result. |
|
I went to grab this livestream 5 hours after it finished: https://www.youtube.com/watch?v=VMfaFRPGvvM. It downloaded in full via the dash_manifest using youtube-dl. Interested to see whether anything had changed in the manifest, I wrote out the info.json and inspected it. The manifest was still in fragment form, but instead of 2 second fragments it listed 5 second fragments. This is a change on YouTube's end, and might be YouTube's way of fixing this problem for us? The previous manifests showed 2 second fragments with a limit of 3600 fragments (2 hours), beyond which the earliest fragments were dropped. With 5 second fragments, 3600 fragments would be 5 hours, so the bug would only show up on livestreams over 5 hours long?
At 5 second fragments, that's the full file length. As an aside, the YouTube page source makes references to html5_manifest stuff, which I know nothing about. I wonder if the player is using an HTML5 video streaming mechanism while the DASH and HLS manifests are there for backward compatibility with older browsers. |
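The arithmetic in this comment can be checked with a short Python sketch. It assumes, as the comment does, that the manifest keeps only a fixed number of the most recent fragments; the function names are hypothetical and the model is a simplification, not a description of what YouTube actually does.

```python
import math


def manifest_window_seconds(fragment_count, fragment_seconds):
    """Length of stream covered by a manifest that lists fragment_count fragments."""
    return fragment_count * fragment_seconds


def first_listed_fragment(total_seconds, fragment_seconds, max_fragments):
    """Number of the first fragment still listed if only the newest max_fragments are kept."""
    total_fragments = math.ceil(total_seconds / fragment_seconds)
    return max(0, total_fragments - max_fragments)


print(manifest_window_seconds(3600, 2) / 3600)  # 2.0 hours
print(manifest_window_seconds(3600, 5) / 3600)  # 5.0 hours

# A 3:01:49 stream with 2 second fragments and a 3600 fragment limit.
print(first_listed_fragment(3 * 3600 + 1 * 60 + 49, 2, 3600))  # 1855
```

Under these assumptions, the last line happens to agree with the first fragment number (1855) mentioned in the issue description for a 3:01:49 video.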
No. Here is a live archive that finished streaming just minutes ago: https://www.youtube.com/watch?v=Esl7kGD5FdE. Although the web player has no problem with it, fetching the HLS variant manifest directly and passing it to ffmpeg gives the same clipped result:
url=$(curl --silent 'https://www.youtube.com/watch?v=Esl7kGD5FdE' | grep -o 'https:\\/\\/[^"]\+googlevideo.com\\/[^"]\+' | grep 'hls_variant' | sed 's/\\//g')
ffmpeg -i "${url}" out.mkv #=> The same result.
The problem is not specific to youtube-dl. |
|
This behavior existed before; in fact the 2 hour duration is new and was added when YouTube made several improvements to their livestream platform, like increasing the supported max resolution and adding low latency options. Depending on the settings for the livestream, the fragments can be 1s, 2s, 5s, and maybe other lengths. I have seen the clipped version of the video be a maximum of 2 hours, 4 hours, 5 hours, or maybe more. It can also be less than these if the livestream goes offline and back online again: only the final X hours of the stream, as it was shown in real time, will be available, so you may see 1:58:30 clipped recordings if there were connection issues, for example.
The settings that matter in determining this are (most importantly) the latency setting, the (source) resolution, the framerate, and possibly the bitrate. Higher settings lead to shorter durations, but I do not know exactly what formula YouTube is using. It is not just a fixed number of fragments; you will find 2 hour videos with 3600 fragments or 7200 (2s vs 1s), and a 4 hour clip with 5s fragments is 2880, which I'm pretty sure I've seen.
I have no way of knowing for sure, but it is likely the stream you were looking at was not using low latency settings, and while it is 1080p, it is only 30fps. So being able to get 4 or 5 hours of the video rather than 2 (which was enough for the full video in this case) is not a surprise. That is what you would have seen in a browser as well up until about a week ago. |
|
Here's a quick bash script to insert the fragments into a .info.json file. It assumes the same number of fragments is missing from every fragmented stream (which is true for every case I've seen so far). It relies on GNU seq and GNU sed.
It also doesn't determine or insert the fragment duration, but seems to do well enough without it for the full stream. |
|
While we're on the topic of "inserting missing fragments" scripts, I cooked this up in Python.
I wrote this before the update with 5s fragments, so it would need to be updated for that. |
|
I have encountered several further complications, which I figured should be documented somewhere.
Unfortunately it still seems there is no completely foolproof automated method for ensuring that the full video will be downloaded or for filtering incomplete videos. Simply filtering based on fragmentation is not viable because of issue 2, and |
|
Here is a batch script to generate the missing fragments (you first have to replace the first fragment number, the last fragment number and the duration):
|
Where do you get this data? |
I did the same but only got audio.
Checklist
Verbose log
Description
On that page, the YouTube player in Firefox shows the video as 3:01:49 long and plays it in full at that length. While youtube-dl downloads and exits cleanly, it only downloads the last 2:00:00 of it. This happens with both the DASH and HLS manifests I've tried.
Is there a workaround or a proper method to grab the full video that is available to play on the site?
This is likely related to issue #26290.
Edit: the output is from an older youtube-dl, but the results are the same on my box with v. 2020.07.28
Edit 2: streamlink also gets only 2 hours from HLS. How does the YouTube player do it?!
Edit 3: Well, I'm not sure what the youtube player is doing, but I was able to hack the missing fragments into place...
First, I wrote out the info.json for the url.
Then I edited the json and found the first fragment in the stream:
Then I generated all the fragments from 0 to 1854:
I then inserted these missing fragments before fragment 1855 in the json and saved it.
Then I was able to run youtube-dl against this .json with:
And got the full length stream saved, as the fragments are on the server, just not enumerated in the manifest.
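The manual steps above could be automated roughly as follows. This is only a sketch under assumptions not confirmed in this thread: that the .info.json has a `formats` list, that each fragmented format carries a `fragments` list, and that entries look like `{"path": "sq/<number>"}`. The function name `prepend_missing_fragments` is hypothetical.

```python
import copy
import re

# Assumption: fragment entries are shaped like {"path": "sq/1855"}.
SEQ_PATTERN = re.compile(r"^sq/(\d+)$")


def prepend_missing_fragments(info):
    """Return a copy of info with fragments 0..N-1 added before the first listed fragment N."""
    patched = copy.deepcopy(info)
    for fmt in patched.get("formats", []):
        fragments = fmt.get("fragments")
        if not fragments:
            continue  # not a fragmented format
        match = SEQ_PATTERN.match(fragments[0].get("path", ""))
        if match is None:
            continue  # unknown layout, leave this format untouched
        first = int(match.group(1))
        missing = [{"path": "sq/%d" % n} for n in range(first)]
        fmt["fragments"] = missing + fragments
    return patched


# Tiny stand-in for a real .info.json, using the fragment number from the steps above.
example = {
    "formats": [
        {"format_id": "140", "fragments": [{"path": "sq/1855"}, {"path": "sq/1856"}]}
    ]
}
patched = prepend_missing_fragments(example)
print(len(patched["formats"][0]["fragments"]))  # 1857
```

With a real file, the dict would be read with `json.load` and written back with `json.dump` before being handed to youtube-dl again; whether the server still holds every generated fragment has to be checked case by case.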
It would seriously surprise me if this is what the web player is doing to get the full stream.
Maybe the YouTube extractor can check whether the fragments start at 0 and, if not, check whether fragment 0 is on the server and compensate for the incomplete manifest as a workaround?
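A minimal Python sketch of that check, kept independent of youtube-dl's internals. It assumes fragment entries look like `{"path": "sq/<number>"}`, and it takes the server probe as a caller-supplied function (`fragment_exists`, a hypothetical name) so that no network code is implied here.

```python
import re


def first_sequence_number(fragments):
    """Return the number in the first fragment's 'sq/<n>' path, or None if it cannot be read."""
    if not fragments:
        return None
    match = re.match(r"^sq/(\d+)", fragments[0].get("path", ""))
    return int(match.group(1)) if match else None


def should_backfill(fragments, fragment_exists):
    """True if the manifest does not start at 0 but fragment 0 is still on the server.

    fragment_exists is a callable taking a fragment number and returning a bool;
    how it probes the server is left to the caller.
    """
    first = first_sequence_number(fragments)
    if first is None or first == 0:
        return False
    return bool(fragment_exists(0))


listed = [{"path": "sq/1855"}, {"path": "sq/1856"}]
print(should_backfill(listed, lambda n: True))   # True
print(should_backfill(listed, lambda n: False))  # False
```

Passing the probe in as a function keeps the decision logic testable without touching the network.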