YouTube extractor won't load more than 20000 videos (for big news channels) #26092

Open
adan89lion opened this issue Jul 23, 2020 · 3 comments


@adan89lion adan89lion commented Jul 23, 2020

Checklist

  • I'm reporting a broken site support issue
  • I've verified that I'm running youtube-dl version 2020.06.16.1
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar bug reports including closed ones
  • I've read bugs section in FAQ

Verbose log

[debug] System config: [] , 코로나19 대응 관련 브리핑 / 연합뉴스TV (YonhapnewsTV) has already been recorded in archive
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--format', '(bestvideo[vcodec^=av01][height>=1080][fps>30]/bestvideo[vcodec=vp9.2][height>=1080][fps>30]/bestvideo[vcodec=vp9][height>=1080][fps>30]/bestvideo[vcodec^=av01][height>=1080]/bestvideo[vcodec=vp9.2][height>=1080]/bestvideo[vcodec=vp9][height>=1080]/bestvideo[height>=1080]/bestvideo[vcodec^=av01][height>=720][fps>30]/bestvideo[vcodec=vp9.2][height>=720][fps>30]/bestvideo[vcodec=vp9][height>=720][fps>30]/bestvideo[vcodec^=av01][height>=720]/bestvideo[vcodec=vp9.2][height>=720]/bestvideo[vcodec=vp9][height>=720]/bestvideo[height>=720]/bestvideo)+(bestaudio[acodec=opus]/bestaudio)/best', '--verbose', '--force-ipv4', '--ignore-errors', '--no-continue', '--no-overwrites', '--download-archive', 'archive.log', '--add-metadata', '--write-description', '--write-info-json', '--write-annotations', '--write-thumbnail', '--embed-thumbnail', '--all-subs', '--sub-format', 'srt', '--embed-subs', '--output', '%(uploader)s/%(uploader)s - %(upload_date)s - %(title)s/%(uploader)s - %(upload_date)s - %(title)s.%(ext)s', '--merge-output-format', 'mkv', '--proxy', '[REDACTED]', '--datebefore', '20200731', '--batch-file', 'Source - Channels.txt']
[debug] Batch file urls: ['https://www.youtube.com/user/JTBC10news']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2020.06.16.1
[debug] Python version 3.5.1 (CPython) - Linux-4.4.59+-x86_64-with-glibc2.3.4
[debug] exe versions: ffmpeg 2.7.1
[debug] Proxy map: {[REDACTED]}
[youtube:user] JTBC10news: Downloading channel page
[download] Downloading playlist: Uploads from JTBC News
[youtube:playlist] UUsU-I-vHLiaMfV_ceaYz5rQ: Downloading page #1
[youtube:playlist] UUsU-I-vHLiaMfV_ceaYz5rQ: Downloading page #2
[youtube:playlist] UUsU-I-vHLiaMfV_ceaYz5rQ: Downloading page #3
[youtube:playlist] UUsU-I-vHLiaMfV_ceaYz5rQ: Downloading page #4
...
[youtube:playlist] UUsU-I-vHLiaMfV_ceaYz5rQ: Downloading page #199
[youtube:playlist] playlist Uploads from JTBC News: Downloading 20000 videos
[download] Downloading video 1 of 20000
WARNING: video doesn't have subtitles
[info] Writing video description to: JTBC News/JTBC News - 20200723 - 2020년 7월 23일 (목) JTBC 정치부회의 다시보기 - 이인영 통일부 장관 후보자 청문회···'주체사상' 공방도/JTBC News - 20200723 - 2020년 7월 23일 (목) JTBC 정치부회의 다시보기 - 이인영 통일부 장관 후보자 청문회···'주체사상' 공방도.description

[Following with regular video downloading logs]

Description

I'm trying to archive several news channels on YouTube. Since some day after June (I'm not sure of the exact date), the programme only loads the first ~199 pages of a channel and exactly 20000 videos in total, so older videos cannot be downloaded. Downloading individual videos works normally, but the playlist extractor fails to load more than 20000 videos from YouTube.

So far I've tried with and without proxies, which doesn't affect this issue: downloads behave normally, but videos beyond no. 20000 cannot be reached. Tested OSes are Linux (DSM) and Windows 10. I will provide more info if needed.


@thulle thulle commented Jul 28, 2020

If I go to https://www.youtube.com/user/JTBC10news (or Bloomberg's channel, for that matter), press "Play all", and check the playlist below the video, it says "1 / 20000", so this might be a server-side limit.


@brokeharvard brokeharvard commented Aug 28, 2020

Has anyone found a way to get around this?


@DankMemeGuy DankMemeGuy commented Sep 23, 2020

Idea:

Since it pulls 20,000 videos from newest to oldest, would a workaround be to sort from oldest to newest once that limit is hit, pull another 20,000 from that end, and merge both ends? That would give a theoretical maximum of 40,000 videos instead of 20,000.
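
To make the merge step of this idea concrete, here is a minimal Python sketch. It assumes that an oldest-first listing can actually be obtained somehow, which nothing in this thread confirms. The function `merge_listings` and the sample IDs are hypothetical, not part of youtube-dl.

```python
# Hypothetical sketch of the "merge both ends" idea; not youtube-dl code.
# Assumption: two capped listings of video IDs are already available,
# one sorted newest-to-oldest and one sorted oldest-to-newest.

def merge_listings(newest_first, oldest_first):
    """Return the union of both listings, ordered newest to oldest, without duplicates."""
    seen = set()
    merged = []
    # Reverse the oldest-first listing so its IDs are appended newest to oldest.
    for video_id in list(newest_first) + list(reversed(list(oldest_first))):
        if video_id not in seen:
            seen.add(video_id)
            merged.append(video_id)
    return merged


if __name__ == "__main__":
    # Toy channel with 10 uploads (v0 oldest, v9 newest) and a cap of 4 per listing.
    newest = ["v9", "v8", "v7", "v6"]
    oldest = ["v0", "v1", "v2", "v3"]
    print(merge_listings(newest, oldest))
```

If the channel has more uploads than twice the cap, the middle of the channel is still missing (`v4` and `v5` in the toy data). If it has fewer, the two listings overlap and the duplicates are dropped.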
