[regression] [patch available] Youtube channel playlists stopped working #16323

Closed
haasn opened this issue Apr 29, 2018 · 7 comments
@haasn haasn commented Apr 29, 2018

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2018.04.25. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

  • I've verified and I assure that I'm running youtube-dl 2018.04.25

Before submitting an issue make sure you have:

  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones
  • Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

$ ./youtube-dl -v https://www.youtube.com/user/viperkeeper/videos
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', 'https://www.youtube.com/user/viperkeeper/videos']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2018.04.25
[debug] Python version 3.6.5 (CPython) - Linux-4.16.0-gentoo-pcireset-x86_64-AMD_Ryzen_Threadripper_1950X_16-Core_Processor-with-gentoo-2.4.1
[debug] exe versions: ffmpeg N-90883-g29fd44adf1, ffprobe N-90883-g29fd44adf1, rtmpdump 2.4
[debug] Proxy map: {}
[youtube:user] viperkeeper: Downloading channel page
[youtube:playlist] UUUtdmXEDHdSHPZP37n_Bymw: Downloading webpage
[download] Downloading playlist: Uploads from viperkeeper
[youtube:playlist] UUUtdmXEDHdSHPZP37n_Bymw: Downloading page #1
Traceback (most recent call last):
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "./youtube-dl/__main__.py", line 19, in <module>
  File "./youtube-dl/youtube_dl/__init__.py", line 471, in main
  File "./youtube-dl/youtube_dl/__init__.py", line 461, in _real_main
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 1993, in download
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 800, in extract_info
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 861, in process_ie_result
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 800, in extract_info
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 960, in process_ie_result
  File "./youtube-dl/youtube_dl/extractor/youtube.py", line 278, in _entries
KeyError: 'content_html'

I've bisected the issue:

0fe7783eced5c62dbd95780c2150fd1080bd3927 is the first bad commit
commit 0fe7783eced5c62dbd95780c2150fd1080bd3927
Author: Sergey M․ <dstftw@gmail.com>
Date:   Sat Apr 28 01:59:15 2018 +0700

    [extractor/common] Add _download_json_handle

:040000 040000 0c314b881219767d3cfd1c5208417bd689839131 4bb685f1488b271d522d5c81a690aa1ebdbb8cb2 M	youtube_dl

The commit before that works fine.

@haasn haasn commented Apr 29, 2018

Some comments:

  • I went back and forth between 0fe7783 and 0fe7783^ multiple times to make absolutely sure the commit definitely triggers it for me.
  • Other channels seem to work fine, only this one is affected as far as I can tell. (I've tried a few others)
  • The same URL apparently works for other people on the same commit.
  • The issue has persisted for several hours.

It may be a transient issue caused by some CDN or caching layer, but that wouldn't explain why this specific commit breaks it for me. It's hard for me to tell what the commit is actually doing, but it seems to replace _download_webpage with _download_webpage_handle, which differs only in that it doesn't catch compat_http_client.IncompleteRead and that it also returns urlh, which is ignored.

@haasn haasn commented Apr 29, 2018

Just to demonstrate further:

$ git status && git clean -fdx && make youtube-dl && ./youtube-dl -v -g https://www.youtube.com/user/viperkeeper/videos
HEAD detached at 0fe7783ec
nothing to commit, working tree clean
Removing youtube-dl
Removing youtube-dl.bash-completion
Removing youtube-dl.fish
Removing youtube-dl.zsh
Removing youtube_dl/__pycache__/
Removing youtube_dl/downloader/__pycache__/
Removing youtube_dl/extractor/__pycache__/
Removing youtube_dl/postprocessor/__pycache__/
mkdir -p zip
for d in youtube_dl youtube_dl/downloader youtube_dl/extractor youtube_dl/postprocessor ; do \
  mkdir -p zip/$d ;\
  cp -pPR $d/*.py zip/$d/ ;\
done
touch -t 200001010101 zip/youtube_dl/*.py zip/youtube_dl/*/*.py
mv zip/youtube_dl/__main__.py zip/
cd zip ; zip -q ../youtube-dl youtube_dl/*.py youtube_dl/*/*.py __main__.py
rm -rf zip
echo '#!/usr/bin/env python' > youtube-dl
cat youtube-dl.zip >> youtube-dl
rm youtube-dl.zip
chmod a+x youtube-dl
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', '-g', 'https://www.youtube.com/user/viperkeeper/videos']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2018.04.25
[debug] Python version 3.6.5 (CPython) - Linux-4.16.0-gentoo-pcireset-x86_64-AMD_Ryzen_Threadripper_1950X_16-Core_Processor-with-gentoo-2.4.1
[debug] exe versions: ffmpeg N-90883-g29fd44adf1, ffprobe N-90883-g29fd44adf1, rtmpdump 2.4
[debug] Proxy map: {}
Traceback (most recent call last):
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "./youtube-dl/__main__.py", line 19, in <module>
  File "./youtube-dl/youtube_dl/__init__.py", line 471, in main
  File "./youtube-dl/youtube_dl/__init__.py", line 461, in _real_main
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 1993, in download
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 800, in extract_info
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 861, in process_ie_result
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 800, in extract_info
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 960, in process_ie_result
  File "./youtube-dl/youtube_dl/extractor/youtube.py", line 278, in _entries
KeyError: 'content_html'

$ checkout HEAD^
Previous HEAD position was 0fe7783ec [extractor/common] Add _download_json_handle
HEAD is now at c84eae4f6 [funk:channel] Improve extraction (closes #16285)

$ git status && git clean -fdx && make youtube-dl && ./youtube-dl -v -g https://www.youtube.com/user/viperkeeper/videos
HEAD detached at c84eae4f6
nothing to commit, working tree clean
Removing youtube-dl
mkdir -p zip
for d in youtube_dl youtube_dl/downloader youtube_dl/extractor youtube_dl/postprocessor ; do \
  mkdir -p zip/$d ;\
  cp -pPR $d/*.py zip/$d/ ;\
done
touch -t 200001010101 zip/youtube_dl/*.py zip/youtube_dl/*/*.py
mv zip/youtube_dl/__main__.py zip/
cd zip ; zip -q ../youtube-dl youtube_dl/*.py youtube_dl/*/*.py __main__.py
rm -rf zip
echo '#!/usr/bin/env python' > youtube-dl
cat youtube-dl.zip >> youtube-dl
rm youtube-dl.zip
chmod a+x youtube-dl
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', '-g', 'https://www.youtube.com/user/viperkeeper/videos']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2018.04.25
[debug] Python version 3.6.5 (CPython) - Linux-4.16.0-gentoo-pcireset-x86_64-AMD_Ryzen_Threadripper_1950X_16-Core_Processor-with-gentoo-2.4.1
[debug] exe versions: ffmpeg N-90883-g29fd44adf1, ffprobe N-90883-g29fd44adf1, rtmpdump 2.4
[debug] Proxy map: {}
[debug] Default format spec: bestvideo+bestaudio/best
https://r4---sn-4g5e6nss.googlevideo.com/videoplayback?source=youtube&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278&clen=428986506&ipbits=0&signature=8FCC6A1DC110B2321713ECE301AA60179FAE7FA1.95B4C63DC705BAF205FA5489EEA87990A5C57990&lmt=1524887944106824&expire=1525011090&fvip=4&keepalive=yes&requiressl=yes&mime=video%2Fmp4&gir=yes&mt=1524989382&dur=1793.424&mv=m&initcwndbps=1360000&ms=au%2Conr&itag=137&sparams=aitags%2Cclen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cexpire&fexp=23724337&id=o-AE0xu0YtdNXar-PLCS4CPDy30mn_Qm_QLHcdPsEvi8GK&mm=31%2C26&ip=5.56.186.133&mn=sn-4g5e6nss%2Csn-h0jeened&pl=21&ei=Mn7lWuaECZen1wKMlaqgCw&c=WEB&key=yt6&ratebypass=yes
https://r4---sn-4g5e6nss.googlevideo.com/videoplayback?source=youtube&clen=20581210&ipbits=0&signature=85DB84B2752F96901D91C580C63C183060BB3967.216BE6216AD6C44D750D3DD63A9C4D7E9309BEA4&lmt=1524902411816573&expire=1525011090&fvip=4&keepalive=yes&requiressl=yes&mime=audio%2Fwebm&gir=yes&mt=1524989382&dur=1793.441&mv=m&initcwndbps=1360000&ms=au%2Conr&itag=251&sparams=clen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cexpire&fexp=23724337&id=o-AE0xu0YtdNXar-PLCS4CPDy30mn_Qm_QLHcdPsEvi8GK&mm=31%2C26&ip=5.56.186.133&mn=sn-4g5e6nss%2Csn-h0jeened&pl=21&ei=Mn7lWuaECZen1wKMlaqgCw&c=WEB&key=yt6&ratebypass=yes
[debug] Default format spec: bestvideo+bestaudio/best
https://r5---sn-4g5ednle.googlevideo.com/videoplayback?key=yt6&initcwndbps=1420000&lmt=1524231604115815&source=youtube&dur=1472.103&mime=video%2Fmp4&pl=21&clen=289931625&expire=1525011091&ip=5.56.186.133&mm=31%2C26&mn=sn-4g5ednle%2Csn-h0jeen7d&keepalive=yes&sparams=aitags%2Cclen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cexpire&ei=M37lWqLfCYXA1wLet5iQBw&itag=137&ms=au%2Conr&mt=1524989382&mv=m&gir=yes&signature=ADCC34D413C402E6C47A2661DC40E16DBB084189.6EFDEDE153DB45446E21D76566D631368E79EFD2&ipbits=0&requiressl=yes&fvip=5&id=o-AKfGPkOPMUmheBs840NcS8PulBZCpKoLSXMdIok3A8tU&fexp=23724337&c=WEB&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278&ratebypass=yes
https://r5---sn-4g5ednle.googlevideo.com/videoplayback?key=yt6&initcwndbps=1420000&lmt=1524237718307692&source=youtube&dur=1472.121&mime=audio%2Fwebm&pl=21&clen=16876250&expire=1525011091&ip=5.56.186.133&mm=31%2C26&mn=sn-4g5ednle%2Csn-h0jeen7d&keepalive=yes&sparams=clen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cexpire&ei=M37lWqLfCYXA1wLet5iQBw&itag=251&ms=au%2Conr&mt=1524989382&mv=m&gir=yes&signature=07982613CA4AA90E872799C45BE93EE34F0DF3D0.5783E65E37B068CCDF6742C22D51119F01BABA12&ipbits=0&requiressl=yes&fvip=5&id=o-AKfGPkOPMUmheBs840NcS8PulBZCpKoLSXMdIok3A8tU&fexp=23724337&c=WEB&ratebypass=yes
[debug] Default format spec: bestvideo+bestaudio/best
https://r6---sn-4g5edne6.googlevideo.com/videoplayback?expire=1525011092&ei=M37lWtmJPIPY1gLT8YiwBg&itag=137&keepalive=yes&requiressl=yes&signature=2D90E07F647C422732717F7E641BF6AC44279974.B39E3D95C53C81A9F19D11A20677B7032A30D2B0&ms=au%2Crdu&fvip=4&mv=m&sparams=aitags%2Cclen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cexpire&mt=1524989382&pl=21&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278&mime=video%2Fmp4&id=o-AKjJQrTLt3l_RsggzyAqcFD5FosjFBTmHyVN357y-Kfn&gir=yes&mn=sn-4g5edne6%2Csn-4g5e6nze&key=yt6&ip=5.56.186.133&ipbits=0&c=WEB&lmt=1523630615226181&initcwndbps=1327500&source=youtube&dur=1227.626&clen=244191650&mm=31%2C29&fexp=23724337&ratebypass=yes
https://r6---sn-4g5edne6.googlevideo.com/videoplayback?expire=1525011092&ei=M37lWtmJPIPY1gLT8YiwBg&itag=251&keepalive=yes&requiressl=yes&signature=4EC045E5F6B004F513A01A770CC02FB5437724D4.24BFD994679C6DA3A2E154427DF3111DFF82D40D&ms=au%2Crdu&fvip=4&mv=m&sparams=clen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cexpire&mt=1524989382&pl=21&mime=audio%2Fwebm&id=o-AKjJQrTLt3l_RsggzyAqcFD5FosjFBTmHyVN357y-Kfn&gir=yes&mn=sn-4g5edne6%2Csn-4g5e6nze&key=yt6&ip=5.56.186.133&ipbits=0&c=WEB&lmt=1523636104269150&initcwndbps=1327500&source=youtube&dur=1227.641&clen=15985574&mm=31%2C29&fexp=23724337&ratebypass=yes
[debug] Default format spec: bestvideo+bestaudio/best
https://r2---sn-4g5ednsk.googlevideo.com/videoplayback?requiressl=yes&source=youtube&itag=137&ei=NH7lWtOYM8fTgAeM5rjACA&pl=21&ipbits=0&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278&initcwndbps=1327500&c=WEB&mm=31%2C29&gir=yes&mn=sn-4g5ednsk%2Csn-4g5e6nsr&id=o-AP9Hfw51jxEQOrh18PdpNiCA6kI5DZnZ7K-TBO8RUSbh&fvip=2&mt=1524989382&mv=m&ms=au%2Crdu&ip=5.56.186.133&signature=7AAEB4ECA5043A9A02A0F2B017464C01FA1017B2.2864742B628CC3AA8F13F9F1712B23B82C74ECB5&lmt=1523180548043666&keepalive=yes&clen=347599481&dur=1539.137&expire=1525011092&fexp=23724337&mime=video%2Fmp4&key=yt6&sparams=aitags%2Cclen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cexpire&ratebypass=yes
https://r2---sn-4g5ednsk.googlevideo.com/videoplayback?requiressl=yes&source=youtube&itag=251&ei=NH7lWtOYM8fTgAeM5rjACA&pl=21&ipbits=0&initcwndbps=1327500&c=WEB&mm=31%2C29&gir=yes&mn=sn-4g5ednsk%2Csn-4g5e6nsr&id=o-AP9Hfw51jxEQOrh18PdpNiCA6kI5DZnZ7K-TBO8RUSbh&fvip=2&mt=1524989382&mv=m&ms=au%2Crdu&ip=5.56.186.133&signature=C95EA58DC63410457A4A123472C6C8719BCBB8D7.8E8DD4E0FF737B21571BC744DEC3DF1E0DE68C8D&lmt=1523194686889302&keepalive=yes&clen=19982465&dur=1539.161&expire=1525011092&fexp=23724337&mime=audio%2Fwebm&key=yt6&sparams=clen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cexpire&ratebypass=yes
[debug] Default format spec: bestvideo+bestaudio/best
https://r2---sn-4g5e6nld.googlevideo.com/videoplayback?ms=au%2Crdu&mv=m&mt=1524989382&source=youtube&clen=308276452&sparams=aitags%2Cclen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cexpire&dur=1084.383&id=o-AFNxSLzY7CNQ_G8HmMAdPhIJx-hRcoZ3ZoL-VMfmn5El&mn=sn-4g5e6nld%2Csn-4g5ednz7&c=WEB&gir=yes&mime=video%2Fmp4&signature=6A58D0009081B4AE7E3F8D768467F53693EF4A64.C1010DB152F8A3CE0CE6851561957B2FEBB98FC0&aitags=133%2C134%2C135%2C136%2C137%2C160%2C242%2C243%2C244%2C247%2C248%2C278&mm=31%2C29&expire=1525011094&lmt=1522442225164016&fexp=23724337&pl=21&ipbits=0&initcwndbps=1420000&ip=5.56.186.133&key=yt6&requiressl=yes&ei=NX7lWrKBLs6D8gOp2YnABQ&itag=137&fvip=2&keepalive=yes&ratebypass=yes
https://r2---sn-4g5e6nld.googlevideo.com/videoplayback?ms=au%2Crdu&mv=m&mt=1524989382&source=youtube&clen=13109848&sparams=clen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Ckeepalive%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cexpire&dur=1084.401&id=o-AFNxSLzY7CNQ_G8HmMAdPhIJx-hRcoZ3ZoL-VMfmn5El&signature=5414A3A9CD7AE5C41E99167AE8B7936F03375CE6.DFFB73F21EADABB332850BD2C17A3882C0A82C78&c=WEB&gir=yes&mime=audio%2Fwebm&mn=sn-4g5e6nld%2Csn-4g5ednz7&mm=31%2C29&expire=1525011094&lmt=1522448280974244&fexp=23724337&pl=21&ipbits=0&initcwndbps=1420000&ip=5.56.186.133&key=yt6&requiressl=yes&ei=NX7lWrKBLs6D8gOp2YnABQ&itag=251&fvip=2&keepalive=yes&ratebypass=yes
^C
ERROR: Interrupted by user

@haasn haasn commented Apr 29, 2018

Digging deeper: Sprinkling some print()s here and there reveals that the data actually returned by _download_webpage_handle is {"reload":"now"}. If this is the server telling us to reload the page, then maybe it worked because of the while success is False loop implicit in _download_webpage?
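
A minimal sketch of why that payload would end in the KeyError from the traceback. It assumes the failing line in _entries does a plain dictionary lookup of 'content_html' on the parsed response; the helper name extract_content_html is hypothetical and not part of youtube-dl.

```python
import json

# The payload reported above, parsed the way a JSON response would be.
payload = json.loads('{"reload":"now"}')


def extract_content_html(more):
    """Hypothetical stand-in for the lookup that fails in _entries."""
    # A plain subscript raises KeyError when the key is absent.
    return more['content_html']


missing_key = None
try:
    extract_content_html(payload)
except KeyError as exc:
    # exc.args[0] holds the name of the key that was not found.
    missing_key = exc.args[0]

print("missing key:", missing_key)
```

If that assumption holds, the traceback is only a symptom: the real question is why the server answers with a reload instruction instead of the playlist data.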

However, to investigate this theory, I tried simply duplicating the _download_webpage_handle call as separate lines, in the hopes of getting the server to be happy about the reload. This did not change anything.

Approaching the problem from a different angle, I managed to confirm that replacing _download_webpage_handle with _download_webpage again solves the issue even on that commit. So whatever the problem is, it has to be isolated very specifically to the use of this function over the other.

However, even this line of thinking has me befuddled. This works:

    def _download_json_handle(
            self, url_or_request, video_id, note='Downloading JSON metadata',
            errnote='Unable to download JSON metadata', transform_source=None,
            fatal=True, encoding=None, data=None, headers={}, query={}):
        """Return a tuple (JSON object, URL handle)"""


        res = self._download_webpage(
            url_or_request, video_id, note, errnote, fatal=fatal,
            encoding=encoding, data=data, headers=headers, query=query), 'foo'


        print("_download_json_handle:", res)
        if res is False:
            return res
        json_string, urlh = res
        return self._parse_json(
            json_string, video_id, transform_source=transform_source,
            fatal=fatal), urlh

This does not:

    def _download_json_handle(
            self, url_or_request, video_id, note='Downloading JSON metadata',
            errnote='Unable to download JSON metadata', transform_source=None,
            fatal=True, encoding=None, data=None, headers={}, query={}):
        """Return a tuple (JSON object, URL handle)"""


        # Default parameters
        tries=1
        timeout=5

        # Full body of self._download_webpage (excluding the return)
        success = False
        try_count = 0
        while success is False:
            try:
                res = self._download_webpage_handle(url_or_request, video_id, note, errnote, fatal, encoding=encoding, data=data, headers=headers, query=query)
                success = True
            except compat_http_client.IncompleteRead as e:
                try_count += 1
                if try_count >= tries:
                    raise e
                self._sleep(timeout, video_id)


        print("_download_json_handle:", res)
        if res is False:
            return res
        json_string, urlh = res
        return self._parse_json(
            json_string, video_id, transform_source=transform_source,
            fatal=fatal), urlh

Literally replacing the call that works with the body of its own definition no longer works. At this point I've gone back and forth between the working and the non-working variant probably a good dozen times, so it can't be pure chance.

@haasn haasn commented Apr 29, 2018

Digging even deeper, I managed to find a discrepancy in the sequence of function calls that were made. Specifically, the difference is in the value of the query parameter.

When calling _download_webpage directly, the value of query is always {'disable_polymer': 'true'} - even though it's passed as {} by the caller! Observe this sequence of prints:

Working: (using _download_webpage)

query inside _download_webpage {'disable_polymer': 'true'}
query inside _download_webpage_handle: {'disable_polymer': 'true'}
query inside _download_webpage {'disable_polymer': 'true'}
query inside _download_webpage_handle: {'disable_polymer': 'true'}
query inside _download_json_handle {}
query inside _download_webpage {'disable_polymer': 'true'}
query inside _download_webpage_handle: {'disable_polymer': 'true'}
query inside _download_json_handle {'disable_polymer': 'true'}
query inside _download_webpage {'disable_polymer': 'true'}
query inside _download_webpage_handle: {'disable_polymer': 'true'}
query inside _download_json_handle {'disable_polymer': 'true'}
query inside _download_webpage {'disable_polymer': 'true'}
query inside _download_webpage_handle: {'disable_polymer': 'true'}
^C
ERROR: Interrupted by user

I've double and triple-checked to make sure there's no typo of query anywhere. The prints are added immediately after the functions' signatures.

When calling _download_webpage_handle directly, the query is not magically replaced in the same way, which is why it breaks. Observe:

query inside _download_webpage {'disable_polymer': 'true'}
query inside _download_webpage_handle: {'disable_polymer': 'true'}
query inside _download_webpage {'disable_polymer': 'true'}
query inside _download_webpage_handle: {'disable_polymer': 'true'}
query inside _download_json_handle {}
query inside _download_webpage_handle: {}

Since it stays as {} instead of {'disable_polymer': 'true'}, that explains why the download fails for me. It also explains why it may or may not work for random URLs and random people - isn't Google still in the process of rolling out Polymer?

This also suggests that a solution might be making sure that _download_json_handle is called with {'disable_polymer': 'true'} wherever it matters. However, more puzzlingly, it doesn't explain why Python decides to randomly replace query here even though the caller specifies it as {}.

@haasn haasn commented Apr 29, 2018

Okay, after stumbling my way through some more god-awful Python debugging, I managed to figure out the issue: the reason _download_webpage was seemingly overriding keyword args at random is that the specific subclass of the class I'm looking at was interfering with things. More specifically, this diff fixes it:

diff --git a/youtube_dl/extractor/youtube.py b/youtube_dl/extractor/youtube.py
index e7bd1f18f..04aeb91af 100644
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -246,9 +246,9 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
 
         return True
 
-    def _download_webpage(self, *args, **kwargs):
+    def _download_webpage_handle(self, *args, **kwargs):
         kwargs.setdefault('query', {})['disable_polymer'] = 'true'
-        return super(YoutubeBaseInfoExtractor, self)._download_webpage(
+        return super(YoutubeBaseInfoExtractor, self)._download_webpage_handle(
             *args, **compat_kwargs(kwargs))
 
     def _real_initialize(self):
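
To illustrate the mechanism behind the diff, here is a simplified sketch, not the actual youtube-dl code. The class names Base, BrokenYoutube and FixedYoutube are hypothetical, the "download" just echoes back the query it was given, and compat_kwargs is left out on the assumption that it is only a compatibility shim.

```python
class Base:
    def _download_webpage_handle(self, url, query=None):
        # Pretend download: return (the query that was sent, a fake URL handle).
        return dict(query or {}), 'urlh'

    def _download_webpage(self, url, query=None):
        # Thin wrapper that drops the handle.
        content, _ = self._download_webpage_handle(url, query=dict(query or {}))
        return content

    def _download_json_handle(self, url, query=None):
        # After the refactoring this goes straight to the *_handle variant.
        return self._download_webpage_handle(url, query=dict(query or {}))


class BrokenYoutube(Base):
    # Injects the flag only on the wrapper, so direct *_handle calls bypass it.
    def _download_webpage(self, *args, **kwargs):
        kwargs.setdefault('query', {})['disable_polymer'] = 'true'
        return super()._download_webpage(*args, **kwargs)


class FixedYoutube(Base):
    # Injects the flag on the lowest-level method, so every path passes through it.
    def _download_webpage_handle(self, *args, **kwargs):
        kwargs.setdefault('query', {})['disable_polymer'] = 'true'
        return super()._download_webpage_handle(*args, **kwargs)


broken_json_query = BrokenYoutube()._download_json_handle('u')[0]
fixed_json_query = FixedYoutube()._download_json_handle('u')[0]
print("broken:", broken_json_query)
print("fixed:", fixed_json_query)
```

In this sketch the override on _download_webpage never runs when _download_json_handle is used, which matches the empty query seen in the prints above; moving the override one level down covers both call paths.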

Shall I submit a pull request?

@haasn haasn changed the title [regression] Youtube channel playlists stopped working [regression] [patch available] Youtube channel playlists stopped working Apr 29, 2018

@dstftw dstftw commented Apr 29, 2018

Shall I submit a pull request?

Yes.

dstftw added a commit that referenced this issue Apr 29, 2018
Rather than just the ones that use the _download_webpage helper. The need
for this was made apparent by 0fe7783, which refactored
_download_json in a way that completely avoids the use of
_download_webpage, thus breaking youtube.

Fixes #16323

@Hrxn Hrxn commented Apr 29, 2018

[..] - isn't google still in the process of rolling out polymer?

Yes. They're still changing things, e.g. the edit functionality of a playlist view on YouTube.
