YouTube channel pages have infinite scrolling page count. #5555

Closed
10 tasks done
a-raccoon opened this issue Nov 16, 2022 · 15 comments · Fixed by #6621
Assignees
Labels
external issue Issue with an external tool site-bug Issue with a specific website

Comments

@a-raccoon

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I remove or skip any mandatory* field

Checklist

Region

United States

Provide a description that is worded well enough to be understood

YouTube has some automatically generated channels and playlists for musicians via a partner named TuneCore. The playlist page for these channels scrolls infinitely (as of today). This causes yt-dlp to fetch an infinite number of playlist pages. There is no option to tell yt-dlp to stop fetching playlist pages once they start repeating.

yt-dlp will need to add playlist repetition detection to abort continued grabbing of playlist pages.

Example:

https://www.youtube.com/channel/UCH0_ywht5OYCdRYWZUjBQwQ/playlists

This channel and all its playlists and videos are auto-generated by YouTube via partner TuneCore.

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below

Complete Verbose Output

>yt-dlp -vU "https://www.youtube.com/channel/UCH0_ywht5OYCdRYWZUjBQwQ/playlists"
[debug] Command-line config: ['-vU', 'https://www.youtube.com/channel/UCH0_ywht5OYCdRYWZUjBQwQ/playlists']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8 (No VT), error utf-8 (No VT), screen utf-8 (No VT)
[debug] yt-dlp version 2022.11.11 [8b64402] (win_exe)
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-7-6.1.7601-SP1 (OpenSSL 1.1.1k  25 Mar 2021)
[debug] exe versions: ffmpeg n4.4-6-g7e9b9f24df (setts), ffprobe n4.4-6-g7e9b9f24df
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.09.24, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1723 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.11.11, Current version: 2022.11.11
yt-dlp is up to date (2022.11.11)
[debug] [youtube:tab] Extracting URL: https://www.youtube.com/channel/UCH0_ywht5OYCdRYWZUjBQwQ/playlists
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ/playlists: Downloading webpage
[debug] [youtube:tab] Selected tab: 'playlists' (playlists), Requested tab: 'playlists'
[download] Downloading playlist: Pogo - Topic - Playlists
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 1: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 2: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 3: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 4: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 5: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 6: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 7: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 8: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 9: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 10: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 11: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 12: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 13: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 14: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 15: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 16: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 17: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 18: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 19: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 20: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 21: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 22: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 23: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 24: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 25: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 26: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 27: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 28: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 29: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 30: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 31: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 32: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 33: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 34: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 35: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 36: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 37: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 38: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 39: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 40: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 41: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 42: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 43: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 44: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 45: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 46: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 47: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 48: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 49: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 50: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 51: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 52: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 53: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 54: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 55: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 56: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 57: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 58: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 59: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 60: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 61: Downloading API JSON
[youtube:tab] UCH0_ywht5OYCdRYWZUjBQwQ page 62: Downloading API JSON

ERROR: Interrupted by user
@a-raccoon a-raccoon added site-bug Issue with a specific website triage Untriaged issue labels Nov 16, 2022
@pukkandan
Member

This is a bug on Youtube's side and will likely fix itself in a few days. In the meantime, you can use --playlist-end

@pukkandan pukkandan added external issue Issue with an external tool and removed triage Untriaged issue labels Nov 16, 2022
@a-raccoon
Author

a-raccoon commented Nov 16, 2022

Thank you.

Two things, however. I don't believe --playlist-end or the -I switch is intended to control how many pages of playlists are downloaded, but rather how many items within a specific playlist are downloaded. You see, yt-dlp is unable to download the page of playlists (aka "playlist tab") due to infinite scrolling.

Secondly, I doubt this will fix itself suddenly "in a few days". I only stumbled onto it for the first time, and this may be a very old and persistent issue that has gone unreported.

It might still be worth looking at detecting infinite dynamic page loading on YouTube.

@pukkandan
Member

I don't believe --playlist-end or the -I switch is intended to control how many pages of playlists are downloaded, but rather how many items within a specific playlist are downloaded. You see, yt-dlp is unable to download the page of playlists (aka "playlist tab") due to infinite scrolling.

Did you try?

Secondly, I doubt this will fix itself suddenly "in a few days". I only stumbled onto it for the first time, and this may be a very old and persistent issue that has gone unreported.

Even if the issue does not fix itself, the bug is still on Youtube's side. Similar issues have been observed in the past.

We cannot have a simple duplicate detection since playlists may genuinely have repeated entries. Workarounds are possible (e.g. check if we are in the playlists tab of a channel and encounter a duplicate entry), but will be tedious to implement. Plus, the workaround will likely only solve this specific issue, and similar bugs in the Youtube API will require even more workarounds.

@coletdjnz Since similar issues have now been observed multiple times, do you have any idea if we can somehow detect this reliably?

@coletdjnz
Member

@coletdjnz Since similar issues have now been observed multiple times, do you have any idea if we can somehow detect this reliably?

If the continuations are the same, we could maybe keep a cache of ones we have seen in a given context and break based on that. But I have a feeling all of them are unique for tabs.
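
The idea in the comment above can be sketched roughly as follows. This is a minimal illustration, not yt-dlp's implementation: `fetch_page`, `iter_entries`, the page shape, and the toy `PAGES` feed are all hypothetical names made up for the example. It tracks continuation tokens rather than entries, so a playlist that genuinely repeats an entry is not cut short. It only helps if the looping feed actually hands back a token that was seen before; if every token is unique, as suspected above for tabs, nothing is detected.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical page shape: (entries on the page, next continuation token or None).
Page = Tuple[List[str], Optional[str]]


def iter_entries(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield entries page by page, stopping when a continuation token repeats."""
    seen_tokens = set()
    token: Optional[str] = None  # None requests the first page
    while True:
        entries, next_token = fetch_page(token)
        yield from entries
        if next_token is None:
            return  # normal end of feed
        if next_token in seen_tokens:
            return  # token already used: assume the feed is looping
        seen_tokens.add(next_token)
        token = next_token


# Toy feed (assumption): after the last real page, it points back to token "t1".
PAGES: Dict[Optional[str], Page] = {
    None: (["a", "b"], "t1"),
    "t1": (["c"], "t2"),
    "t2": (["a", "b"], "t1"),
}


def fake_fetch(token: Optional[str]) -> Page:
    return PAGES[token]


if __name__ == "__main__":
    print(list(iter_entries(fake_fetch)))
```

In this toy feed the loop is only noticed after the first page's entries have been emitted a second time, so a caller would still need to tolerate one repeated page.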

@pukkandan
Member

The issue still exists. We may want to try your suggestion

@coletdjnz coletdjnz changed the title YouTube TuneCore auto-generated playlist pages have infinite scrolling page count. YouTube channel pages have infinite scrolling page count. Mar 24, 2023
@coletdjnz coletdjnz self-assigned this Mar 24, 2023
@coletdjnz
Member

This seems to be an intentional change rather than a backend issue, and has started rolling out to more lately.

Unlike the comments loop issue, the observed behaviour here is that the looping starts at the known end of the feed/channel page rather than somewhere in the middle.

The above suggestion I made looks like it'll work for channel pages at least. Not sure if other feeds are impacted.

@dirkf
Contributor

dirkf commented Jun 23, 2023

Has this gone away now? Or should yt-dl be acquiring some variant of #6621?

@coletdjnz
Member

coletdjnz commented Jun 23, 2023

Has this gone away now? Or should yt-dl be acquiring some variant of #6621?

Have not seen or heard of any reports recently. Must have been a short-lived bug on YouTube's side.

@a-raccoon
Author

Has this gone away now? Or should yt-dl be acquiring some variant of #6621?

Have not seen or heard of any reports recently. Must have been a short-lived bug on YouTube's side.

The infinite scrolling is still goofy on some /playlists urls like the one reported in this issue.

@jmbannon

jmbannon commented Jul 1, 2023

Encountered this bug today on this channel: https://www.youtube.com/channel/UCaqOj5uQl-733nBtBpEEGBw/playlists

@jmbannon

jmbannon commented Jul 1, 2023

and a ton of others:
https://www.youtube.com/channel/UCMki-b0zfAQiMQ0nbsrIuBQ/playlists
https://www.youtube.com/channel/UCJdP0RSsGpcdI8S_aH5Y7yQ/playlists

Basically any auto-generated music channel that doesn't have a /release tab has this issue, from what I've seen.

@jmbannon

jmbannon commented Jul 1, 2023

Is there a work-around to limit the number of pages loaded in the initial scan? playlist_end and max_downloads only apply after the UUsItMF6_fP754ihIsSRLk5A page 9: Downloading API JSON stage

@bashonly
Member

bashonly commented Jul 1, 2023

--lazy-playlist will process the video results per page instead of downloading all pages beforehand
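
The difference between fetching every page up front and processing results per page can be sketched in general terms like this. It is an illustration only, not yt-dlp's code: `fetch_pages`, `iter_items_lazily`, and the item limit of 5 are hypothetical. It shows why a lazy consumer with an item limit can finish against a feed that never ends, while an eager one cannot. Whether a limit actually stops the real extractor on these channels is not confirmed in this thread.

```python
from itertools import count, islice
from typing import Iterator, List


def fetch_pages() -> Iterator[List[str]]:
    """Hypothetical feed that never ends: every page request succeeds."""
    for page_number in count(1):
        yield [f"page{page_number}-item{i}" for i in range(1, 4)]


def iter_items_lazily() -> Iterator[str]:
    """Hand out items as each page arrives instead of collecting all pages first."""
    for page in fetch_pages():
        yield from page


# Lazy: only as many pages are requested as the limit requires (two pages here).
first_five = list(islice(iter_items_lazily(), 5))

# Eager (do not run): building the full list first would never return,
# because fetch_pages() has no last page.
# all_items = [item for page in fetch_pages() for item in page]

if __name__ == "__main__":
    print(first_five)
```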

@jmbannon

jmbannon commented Jul 1, 2023

I think you'd also need to set break_on_existing to True to avoid infinite scrolling

@jmbannon

jmbannon commented Jul 1, 2023

Actually I think you'd still hit infinite scrolling unless you ctrl+c:

[download] Skipping already downloaded playlist: „These Foolish Things“ - Lester Young

Since it skips already downloaded playlists, it won't be able to hit the ExistingVideoReached exception.

coletdjnz added a commit that referenced this issue Jul 15, 2023
Closes #5555

Note: the first page may still be repeated, however this is better than nothing.

Authored by: coletdjnz
Projects
Status: Working around youtube's bugs