[BUG] Missing video(s) in channel download on YouTube #16212

Open
fastily opened this issue Apr 18, 2018 · 18 comments

@fastily fastily commented Apr 18, 2018

Please follow the guide below

  • You will be asked some questions and requested to provide some information, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your issue (like this: [x])
  • Use the Preview tab to see what your issue will actually look like

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2018.04.16. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

  • I've verified and I assure that I'm running youtube-dl 2018.04.16

Before submitting an issue make sure you have:

  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones
  • Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

This YouTube channel (https://www.youtube.com/user/ChildrenofPoseidon2) has 3 videos, but youtube-dl only finds 2 videos.

Command: youtube-dl -s -v https://www.youtube.com/user/ChildrenofPoseidon2

Log output:

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'https://www.youtube.com/user/ChildrenofPoseidon2', u'-s', u'-v']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2018.04.16
[debug] Python version 2.7.10 (CPython) - Darwin-17.5.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 3.4.2, ffprobe 3.4.2
[debug] Proxy map: {}
[youtube:user] ChildrenofPoseidon2: Downloading channel page
[youtube:playlist] UUyQUl9M6OfRX7EKhi8SPN_A: Downloading webpage
[download] Downloading playlist: Uploads from Jeffabel and Friends Extras
[youtube:playlist] playlist Uploads from Jeffabel and Friends Extras: Downloading 2 videos
[download] Downloading video 1 of 2
[youtube] 55DF8tRSMco: Downloading webpage
[youtube] 55DF8tRSMco: Downloading video info webpage
[youtube] 55DF8tRSMco: Extracting video information
[debug] Default format spec: bestvideo+bestaudio/best
[download] Downloading video 2 of 2
[youtube] Lm-_ZUiEQDk: Downloading webpage
[youtube] Lm-_ZUiEQDk: Downloading video info webpage
[youtube] Lm-_ZUiEQDk: Extracting video information
[debug] Default format spec: bestvideo+bestaudio/best
[download] Finished downloading playlist: Uploads from Jeffabel and Friends Extras

This video (https://www.youtube.com/watch?v=tZPI1mmVPCk) is missing from the output.

@yuri-sevatz yuri-sevatz commented Oct 23, 2018

Someone recently managed to reproduce this on Reddit here:
https://www.reddit.com/r/DataHoarder/comments/9qrlbp/i_wrote_a_pythonselenium_based_crawler_to_really/?st=jnmcv30j&sh=66312301

They were trying to download this user:

https://www.youtube.com/user/LanaDelRey/

Pasting my analysis here:


Just trying some things to see what's up here:

youtube-dl -s -v https://www.youtube.com/user/LanaDelRey/videos

This yields:

...
[youtube:user] LanaDelRey: Downloading channel page
[youtube:playlist] UUqk3CdGN_j8IR9z4uBbVPSg: Downloading webpage
...

Which means youtube-dl converted that user to this playlist:

https://www.youtube.com/playlist?list=UUqk3CdGN_j8IR9z4uBbVPSg

Which uses the exact same ID as the channel in all of YouTube's links (except for a different prefix):

https://www.youtube.com/channel/UCqk3CdGN_j8IR9z4uBbVPSg

The channel's id is this:

UCqk3CdGN_j8IR9z4uBbVPSg

Which gets converted with this code:

    if channel_playlist_id and channel_playlist_id.startswith('UC'):
        playlist_id = 'UU' + channel_playlist_id[2:]
        return self.url_result(compat_urlparse.urljoin(url, '/playlist?list=%s' % playlist_id), 'YoutubePlaylist')

From this line in YoutubeChannelIE: https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/youtube.py#L2485

YoutubeChannelIE is inherited by YoutubeUserIE (probably because the two have so much in common). I gather that youtube-dl's assumption -- that this playlist URL is the best place to find all of a channel's videos -- is probably incorrect.
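
For readers who want to try the mapping outside of youtube-dl, the prefix swap in the quoted code can be sketched as a standalone function. This is only an illustration: the function names `uploads_playlist_id` and `uploads_playlist_url` are made up here and are not part of youtube-dl.

```python
def uploads_playlist_id(channel_id):
    """Map a channel ID ('UC...') to its uploads playlist ID ('UU...').

    Returns None when the ID does not carry the 'UC' prefix, mirroring the
    startswith('UC') guard in the quoted snippet.
    """
    if not channel_id or not channel_id.startswith('UC'):
        return None
    return 'UU' + channel_id[2:]


def uploads_playlist_url(channel_id):
    """Build the playlist URL that the channel ID maps to (hypothetical helper)."""
    playlist_id = uploads_playlist_id(channel_id)
    if playlist_id is None:
        return None
    return 'https://www.youtube.com/playlist?list=%s' % playlist_id


# Example with the channel ID from this comment:
# uploads_playlist_url('UCqk3CdGN_j8IR9z4uBbVPSg')
```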

Imo this looks like a fairly drastic bug, but it shouldn't be too hard to fix.


Edit: Ironically, even YouTube's "Play All" button on the user's "Uploads" page jumps to a playlist with only 22 videos, despite there being more than 60 uploads. This might be exacerbated by a bug on YouTube's server:

https://www.youtube.com/user/LanaDelRey/videos

@dstftw dstftw (Collaborator) commented Oct 24, 2018

The rationale behind delegating to the playlist extractor is explained in the comment above the code mentioned: the user rendition is limited to 100 pages of 30 videos each (35 pages at the time the code was written), whereas the playlist rendition has no such limit and serves 100 videos per page with no cap on the number of pages (see https://www.youtube.com/user/drighk/videos).
"Fixing" it by removing the delegation will make it impossible to download whole channels with >3k videos.

@yuri-sevatz yuri-sevatz commented Oct 24, 2018

@dstftw Wow, that makes things a lot harder. It sounds like the only way to mitigate what clearly looks like a server-side bug would be to request both the playlist and the user rendition, and then take the union of the videos in both.

Yuck. That would destroy network performance, and could show up on Google's radar a lot more too.

Maybe we can just file a bug with YouTube to get them to fix the "Play All" playlist instead.
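
For what it's worth, the "union of both renditions" idea is cheap on the merging side; the cost is in the extra requests. A minimal sketch, assuming each rendition has already been reduced to a list of video IDs (the function name `merge_video_ids` is made up for illustration, and the channel-page list in the example is invented):

```python
def merge_video_ids(playlist_ids, channel_page_ids):
    """Return the union of two ID lists, keeping first-seen order and no duplicates."""
    seen = set()
    merged = []
    for video_id in list(playlist_ids) + list(channel_page_ids):
        if video_id not in seen:
            seen.add(video_id)
            merged.append(video_id)
    return merged


# IDs taken from the original report: the playlist rendition returned two
# videos, and the third one was only visible on the channel page.
from_playlist = ['55DF8tRSMco', 'Lm-_ZUiEQDk']
from_channel_page = ['tZPI1mmVPCk', '55DF8tRSMco']  # invented ordering, for illustration
all_ids = merge_video_ids(from_playlist, from_channel_page)
```

Playlist entries come first here so that the playlist ordering wins where both sources agree; that is a design choice of this sketch, not something youtube-dl does.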

@marabu88 marabu88 commented Feb 2, 2019

As I understand it, the problem is that live videos are not added to the channel playlist. I have a suggestion: add a parameter that forces youtube-dl to extract videos from the exact URL given (with the query parameters specified by the user). For example, if you specify the URL https://www.youtube.com/channel/UCjzHeG1KWoonmf9d5KBvSiw/videos?view=2, then only live broadcasts would be listed.

There is also a fairly elegant way to solve the problem without adding new parameters, though it is less general.

It makes no sense to load the playlist when --playlist-end n is specified and n is less than the number of videos on one page. Skipping the playlist in that case would also reduce the number of requests to YouTube's servers.

I think this is quite a serious problem, since live streams on YouTube are now used very often and do not always persist after they end. The only way to keep them is to record them during the live broadcast.
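
The --playlist-end idea above could be sketched roughly like this. Everything here is hypothetical: `should_use_playlist` and `CHANNEL_PAGE_SIZE` are names made up for illustration, and the page size of 30 is taken from the "30 videos per page" figure mentioned earlier in this thread, not from youtube-dl's code.

```python
# Videos per page in the user rendition, as quoted earlier in the thread (assumption).
CHANNEL_PAGE_SIZE = 30


def should_use_playlist(playlist_end=None, page_size=CHANNEL_PAGE_SIZE):
    """Decide whether delegating to the uploads playlist is worthwhile.

    playlist_end is the n from --playlist-end n, or None when no limit was given.
    When the user wants fewer videos than one channel page holds, the channel
    page alone is enough and the playlist request can be skipped.
    """
    if playlist_end is None:
        return True
    return playlist_end >= page_size
```

With this rule, a request for the 10 most recent videos would stay on the channel page, while an unlimited download would still go through the playlist and keep its "no page limit" advantage.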
 

@samsepiol59 samsepiol59 commented Mar 5, 2019

I'm a bit of a noob, please don't eat me alive, but the "issue" is still there right? There's no way to go around this for the time being?

@fastily fastily (Author) commented Mar 5, 2019

@samsepiol59 I'm no longer able to reproduce the issue with the latest version of youtube-dl.

Do you have any examples of it breaking?

@samsepiol59 samsepiol59 commented Mar 5, 2019

@fastily I'm currently still using the 2018.10.05 version - are you able to do a dry run of LanaDelRey and verify if it's downloading 22 videos or more? Because with my version I'm still stuck at 22.

@fastily fastily (Author) commented Mar 6, 2019

Using the latest version (2019.03.01) I'm also only able to retrieve 22 videos out of 64. I suppose this has not been fixed yet

@ghost ghost commented Mar 8, 2019

I'm completely new to working with YouTube-dl and started this thread discussion in r/DataHoarder https://www.reddit.com/r/DataHoarder/comments/awxjcl/youtubedl_archiving_projects_complete_list_of/ for constructing near-perfect set-it-and-forget-it scripts in the use of archiving at-risk YouTube channel/content for digital & cultural preservation.

It's my understanding, after thoroughly combing through the README, browsing others' posted queries on here, and reading comments from other people on Reddit, that there is no easy way to download an entire channel, include playlists, but also pull videos from the 'uploads', while not downloading the same video twice (e.g. if it's categorized under uploads and also in a specific playlist).

In addition, per this thread, it appears the issue of Youtube-dl still skipping or not downloading all videos in a given playlist has been left unresolved, either on Youtube-dl end or YouTube's end.

Can we please receive some sort of notification or acknowledgement that this is even being worked on by YT-dl developers?

Thank you!

*Edit. I've since started a new thread to further this issue, link below.

@remitamine remitamine (Collaborator) commented Mar 9, 2019

> there is no easy way to download an entire channel, include playlists, but also pull videos from the 'uploads', while not downloading the same video twice

--download-archive option should help with this.

> In addition, per this thread, it appears the issue of Youtube-dl still skipping or not downloading all videos in a given playlist has been left unresolved, either on Youtube-dl end or YouTube's end.
>
> Can we please receive some sort of notification or acknowledgement that this is even being worked on by YT-dl developers?

#16212 (comment)

@ghost ghost commented Mar 9, 2019

@remitamine Thank you. I already have --download-archive in the script I'm using. I'm still stuck on constructing a script that outputs something like this as a file/folder tree:

'YouTube-Dl Archiving Projects > Channel > Playlist Name (if applicable) > Video Folder > Video + Metadata files. (Description File. JSON, Thumbnail, etc.)' while at the same time configuring the same script to ignore if a channel doesn't have playlists:

'YouTube-Dl Archiving Projects > Channel > Uploads (if no applicable playlists) > Video Folder > Video + Metadata files. (Description File. JSON, Thumbnail, etc.)'

Or am I confusing two things that are one and the same? Basically, incorporate the playlist name, ID, folder, and a video folder for each video in a playlist, but ignore those arguments if the channel just contains videos under 'Uploads' without designated playlists.

I'm also trying to export the same metadata to a CSV automatically, like this https://imgur.com/a/HF016ue in addition to exporting the entire CMD history for every channel / video / metadata file logged and archived.
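
One possible way to get the CSV part is to post-process the `.info.json` files that `--write-info-json` leaves next to each video. The sketch below is an assumption-laden illustration, not a youtube-dl feature: `info_json_to_csv` and `FIELDS` are names made up here, and it assumes the JSON files carry the keys `id`, `title`, `uploader`, `upload_date` and `webpage_url` (missing keys are written as empty cells).

```python
import csv
import glob
import json
import os

# Columns to export; assumed to be present in the info JSON (blank if missing).
FIELDS = ['id', 'title', 'uploader', 'upload_date', 'webpage_url']


def info_json_to_csv(root_dir, csv_path, fields=FIELDS):
    """Collect every *.info.json under root_dir into one CSV; return the row count."""
    pattern = os.path.join(root_dir, '**', '*.info.json')
    rows = []
    for path in sorted(glob.glob(pattern, recursive=True)):
        with open(path, encoding='utf-8') as handle:
            info = json.load(handle)
        rows.append({name: info.get(name, '') for name in fields})
    with open(csv_path, 'w', newline='', encoding='utf-8') as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Running it after each download pass, for example `info_json_to_csv('Some Channel', 'metadata.csv')`, rebuilds the CSV from scratch, so nothing has to be tracked between runs.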

Is there a way to do any of this? I've asked and nobody seems to know how to go about this.

Thank you!

@remitamine remitamine (Collaborator) commented Mar 9, 2019

> 'YouTube-Dl Archiving Projects > Channel > Playlist Name (if applicable) > Video Folder > Video + Metadata files. (Description File. JSON, Thumbnail, etc.)'

    youtube-dl --download-archive ARCHIVE_FILENAME --write-description --write-info-json -o '%(uploader)s/%(playlist)s/%(title)s/%(title)s.%(ext)s' PLAYLISTS_URL

> 'YouTube-Dl Archiving Projects > Channel > Uploads (if no applicable playlists) > Video Folder > Video + Metadata files. (Description File. JSON, Thumbnail, etc.)'

    youtube-dl --download-archive ARCHIVE_FILENAME --write-description --write-info-json -o '%(uploader)s/Uploads/%(title)s/%(title)s.%(ext)s' UPLOADS_URL

> while at the same time configuring the same script to ignore if a channel doesn't have playlists

Just start with the playlists, then the uploads, and make sure that you're using the same archive.

> Is there a way to do any of this? I've asked and nobody seems to know how to go about this.

Just a reminder: this is mainly an issue tracker for reporting problems or proposing features. For help with specific use cases it's preferable to use other channels, like IRC or Reddit.
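
For anyone scripting this, the two-pass approach described above (playlists first, then uploads, sharing one archive file) could be wrapped roughly as follows. The helper name `build_archive_commands` is made up for illustration; the flags and output templates are the ones from the two commands above, and the sketch only builds the argument lists without running anything.

```python
def build_archive_commands(archive_file, playlists_url, uploads_url):
    """Return the two youtube-dl invocations, in the order they should run."""
    common = [
        'youtube-dl',
        '--download-archive', archive_file,  # same archive for both passes
        '--write-description',
        '--write-info-json',
    ]
    playlists_cmd = common + [
        '-o', '%(uploader)s/%(playlist)s/%(title)s/%(title)s.%(ext)s',
        playlists_url,
    ]
    uploads_cmd = common + [
        '-o', '%(uploader)s/Uploads/%(title)s/%(title)s.%(ext)s',
        uploads_url,
    ]
    return [playlists_cmd, uploads_cmd]


# To actually run them (requires youtube-dl on PATH):
#   import subprocess
#   for cmd in build_archive_commands('archive.txt', PLAYLISTS_URL, UPLOADS_URL):
#       subprocess.run(cmd)
```

Because the second pass reads the archive written by the first, videos already saved under a playlist folder are skipped when the uploads are processed.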

@ghost ghost commented Mar 9, 2019

Noted - Much appreciated for your assistance!

@ghost ghost mentioned this issue Aug 13, 2019

@biodrone biodrone commented Aug 15, 2019

Looks like this issue can be closed. I've just tried to download @fastily's problem user (https://www.youtube.com/user/ChildrenofPoseidon2) and I get 5 out of 5 videos. I can't test with the LanaDelRey user, probably because it's been gobbled up by VEVO and the channel name is now different.

@samsepiol59 samsepiol59 commented Aug 15, 2019

@biodrone On an unrelated note, can I ask how you're downloading YouTube videos now? I'm using Hetzner and I get 429 errors, while my SOCKS5 proxies keep getting peer errors...

@biodrone biodrone commented Aug 18, 2019

@samsepiol59 I can't say I do a great deal of it nowadays so it's either from my desktop or a VPS. Don't seem to have many issues thankfully!

@RevSnowfox RevSnowfox commented Sep 21, 2019

Is there any workaround for this issue? What would be the best method to ensure youtube-dl downloads all publicly viewable, non-geoblocked videos uploaded on a specific channel?

Should one try playlist URLs first, and then the main channel URL while using the download archive option? Are there better methods?

@samsepiol59 samsepiol59 commented Sep 22, 2019

I resolved it by adding all videos/playlists that appear in the "video" section!
