NPR Tiny Desk Concert: some concerts fail #24531

Open
DreadPirateShawn opened this issue Mar 29, 2020 · 1 comment


@DreadPirateShawn DreadPirateShawn commented Mar 29, 2020

Checklist

  • I'm reporting a broken site support
  • I've verified that I'm running youtube-dl version 2020.03.24
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar issues including closed ones

Verbose log

Failure variation 1:

$ youtube-dl https://www.npr.org/2019/06/06/729312182/tomberlin-tiny-desk-concert --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.npr.org/2019/06/06/729312182/tomberlin-tiny-desk-concert', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2020.03.24
[debug] Python version 3.8.1 (CPython) - Linux-5.4.14-2-MANJARO-x86_64-with-glibc2.2.5
[debug] exe versions: ffmpeg 4.2.2, ffprobe 4.2.2, rtmpdump 2.4
[debug] Proxy map: {}
[Npr] 729312182: Downloading JSON metadata
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 797, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 530, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/npr.py", line 114, in _real_extract
    self._sort_formats(formats)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 1327, in _sort_formats
    raise ExtractorError('No video formats found')
youtube_dl.utils.ExtractorError: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Failure variation 2:

$ youtube-dl https://www.npr.org/2019/03/04/700102793/meg-myers-tiny-desk-concert --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.npr.org/2019/03/04/700102793/meg-myers-tiny-desk-concert', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2020.03.24
[debug] Python version 3.8.1 (CPython) - Linux-5.4.14-2-MANJARO-x86_64-with-glibc2.2.5
[debug] exe versions: ffmpeg 4.2.2, ffprobe 4.2.2, rtmpdump 2.4
[debug] Proxy map: {}
[Npr] 700102793: Downloading JSON metadata
[Npr] 700116253: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 404: Not Found
[Npr] 700116253: Downloading SMIL file
[Npr] 700116253: Checking http-500 video format URL
[Npr] 700116253: http-500 video format URL is invalid, skipping
[Npr] 700116253: Checking http-1500 video format URL
[Npr] 700116253: http-1500 video format URL is invalid, skipping
[Npr] 700116253: Checking http-1000 video format URL
[Npr] 700116253: http-1000 video format URL is invalid, skipping
[Npr] 700116253: Checking http-200 video format URL
[Npr] 700116253: http-200 video format URL is invalid, skipping
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 797, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 530, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/npr.py", line 114, in _real_extract
    self._sort_formats(formats)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 1327, in _sort_formats
    raise ExtractorError('No video formats found')
youtube_dl.utils.ExtractorError: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description

Some NPR Tiny Desk Concert links work, others do not.

Sample comparison:


@seandjones92 seandjones92 commented Apr 7, 2020

I'm taking a look at this. So far, what I've found is that "story" gets created in the npr.py extractor.

I've made a small change to print "story" to see what the difference is between the successful and failed downloads. The output for each link is included here:

Change to file:

        story = self._download_json(
            'http://api.npr.org/query', playlist_id, query={
                'id': playlist_id,
                'fields': 'audio,multimedia,title',
                'format': 'json',
                'apiKey': 'MDAzMzQ2MjAyMDEyMzk4MTU1MDg3ZmM3MQ010',
            })['list']['story'][0]
        playlist_title = story.get('title', {}).get('$text')

        KNOWN_FORMATS = ('threegp', 'm3u8', 'smil', 'mp4', 'mp3')
        quality = qualities(KNOWN_FORMATS)

        print(str(story))

https://www.npr.org/2011/07/28/138616627/they-might-be-giants-tiny-desk-concert

{u'multimedia': [{u'dataUrl': {}, u'stream': {u'active': u'false'}, u'format': {u'mp4': {u'$text': u'https://ondemand.npr.org/npr-mp4/npr/asc/2011/07/20110728_asc_tmbgvideo.mp4?orgId=1&topicId=1110&aggIds=92071316&d=593&story=138616627&ft=nprml&f=138616627', u'type': u'video'}, u'smil': {u'$text': u'https://www.npr.org/npr-mp4/npr/asc/2011/07/20110728_asc_tmbgvideo.smil', u'type': u'video'}, u'm3u8': {u'$text': u'http://ivideo-i.akamaihd.net/i/npr-mp4/npr/asc/2011/07/20110728_asc_tmbgvideo-,200000,500000,1000000,1500000,.mp4.csmil/master.m3u8', u'type': u'video'}}, u'credit': {u'$text': u'NPR'}, u'rightsHolder': {}, u'height': {u'$text': u'351'}, u'altImageUrl': {u'$text': u'https://media.npr.org/assets/music/studiosessions/tinydesk/2011/07/theymightbegiants_1280wide-ebac57e723111029567b6b84839d596f7074eb68.jpg'}, u'caption': {}, u'width': {u'$text': u'624'}, u'duration': {u'$text': u'593'}, u'title': {}, u'id': u'138789856', u'permissions': {u'download': {u'allow': u'false'}, u'embed': {u'allow': u'true'}, u'stream': {u'allow': u'true'}}}], u'audio': [{u'rightsHolder': {}, u'description': {u'$text': u'null'}, u'stream': {u'active': u'false'}, u'format': {u'wm': {u'$text': u'https://www.npr.org/templates/dmg/dmg_wmref_em.php?id=138787048&type=1&mtype=WM&orgId=1&topicId=1110&aggIds=92071316&d=591&story=138616627&ft=nprml&f=138616627'}, u'hls': {u'$text': u'http://ivideo-i.akamaihd.net/i/npr-mp4/npr/asc/2011/07/20110728_asc_tmbg,,.mp4.csmil/master.m3u8?orgId=1&topicId=1110&aggIds=92071316&d=591&story=138616627&ft=nprml&f=138616627'}, u'mp4': {u'$text': u'https://ondemand.npr.org/npr-mp4/npr/asc/2011/07/20110728_asc_tmbg.mp4?orgId=1&topicId=1110&aggIds=92071316&d=591&story=138616627&ft=nprml&f=138616627'}, u'mp3': [{u'$text': u'https://ondemand.npr.org/anon.npr-mp3/npr/asc/2011/07/20110728_asc_tmbg.mp3?orgId=1&topicId=1110&aggIds=92071316&d=591&story=138616627&ft=nprml&f=138616627', u'distribution': u'Download', u'type': u'mp3', u'method': u'fms'}, {u'$text': 
u'http://api.npr.org/m3u/1138787048-4eb416.m3u?orgId=1&topicId=1110&aggIds=92071316&d=591&story=138616627&ft=nprml&f=138616627', u'distribution': u'Streaming', u'type': u'm3u', u'method': u'fms'}], u'threegp': {u'$text': u'https://ondemand.npr.org/npr-3gp/npr/asc/2011/07/20110728_asc_tmbg.3gp?orgId=1&topicId=1110&aggIds=92071316&d=591&story=138616627&ft=nprml&f=138616627'}, u'mediastream': {u'$text': u'rtmp://flash.npr.org/ondemand/mp3:anon.npr-mp3/npr/asc/2011/07/20110728_asc_tmbg.mp3'}}, u'region': {u'$text': u'all'}, u'title': {u'$text': u'They Might Be Giants: Tiny Desk Concert'}, u'duration': {u'$text': u'591'}, u'type': u'primary', u'id': u'138787048', u'permissions': {u'download': {u'allow': u'true'}, u'embed': {u'allow': u'true'}, u'stream': {u'allow': u'true'}}}], u'link': [{u'$text': u'https://www.npr.org/2011/07/28/138616627/they-might-be-giants-tiny-desk-concert?ft=nprml&f=138616627', u'type': u'html'}, {u'$text': u'http://api.npr.org/query?id=138616627&apiKey=MDAzMzQ2MjAyMDEyMzk4MTU1MDg3ZmM3MQ010', u'type': u'api'}], u'id': u'138616627', u'title': {u'$text': u'They Might Be Giants: Tiny Desk Concert'}}

https://www.npr.org/2019/06/06/729312182/tomberlin-tiny-desk-concert

{u'multimedia': [{u'rightsHolder': {}, u'dataUrl': {}, u'stream': {u'active': u'false'}, u'format': {}, u'credit': {u'$text': u'NPR/Claire Harbage'}, u'title': {}, u'height': {u'$text': u'351'}, u'altImageUrl': {u'$text': u'https://media.npr.org/assets/img/2019/06/05/_dsc1075-edit-copy_wide-ec48a7ea21faabfddba32fc3b3d4bf23aa5a25d3.jpg'}, u'caption': {}, u'width': {u'$text': u'624'}, u'duration': {u'$text': u'747'}, u'type': u'primary', u'id': u'729312902', u'permissions': {u'download': {u'allow': u'false'}, u'embed': {u'allow': u'true'}, u'stream': {u'allow': u'true'}}}], u'link': [{u'$text': u'https://www.npr.org/2019/06/06/729312182/tomberlin-tiny-desk-concert?ft=nprml&f=729312182', u'type': u'html'}, {u'$text': u'http://api.npr.org/query?id=729312182&apiKey=MDAzMzQ2MjAyMDEyMzk4MTU1MDg3ZmM3MQ010', u'type': u'api'}], u'id': u'729312182', u'title': {u'$text': u'Tomberlin: Tiny Desk Concert'}}
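
The difference between the two dumps is easier to see when only the format keys are listed per multimedia entry. The following is a minimal sketch of that comparison, not part of youtube-dl: `format_keys` is a hypothetical helper name, and the two sample dicts are trimmed copies of the output above with placeholder URLs.

```python
def format_keys(story):
    """Return {multimedia id: sorted format names} for one story dict."""
    summary = {}
    for media in story.get('multimedia', []):
        # 'format' may be missing or an empty dict; treat both as "no formats".
        formats = media.get('format') or {}
        summary[media.get('id')] = sorted(formats)
    return summary


# Trimmed stand-ins for the two API responses (URLs are placeholders).
tmbg = {'multimedia': [{'id': '138789856', 'format': {
    'mp4': {'$text': 'https://example.invalid/video.mp4', 'type': 'video'},
    'smil': {'$text': 'https://example.invalid/video.smil', 'type': 'video'},
    'm3u8': {'$text': 'https://example.invalid/master.m3u8', 'type': 'video'},
}}]}
tomberlin = {'multimedia': [{'id': '729312902', 'format': {}}]}

print(format_keys(tmbg))       # {'138789856': ['m3u8', 'mp4', 'smil']}
print(format_keys(tomberlin))  # {'729312902': []}
```

The working story lists three video formats, while the failing one lists none.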

This one (https://www.npr.org/2019/03/04/700102793/meg-myers-tiny-desk-concert) fails differently for me; I might have more time to look into it later:

WARNING: Failed to download m3u8 information: HTTP Error 404: Not Found

Shifting my focus to the other two links, the main difference I can find is this: the npr.py extractor loops through the multimedia part of the story object to build the formats, then passes them to the _sort_formats method in common.py.

When we pass this address (https://www.npr.org/2019/06/06/729312182/tomberlin-tiny-desk-concert) to youtube-dl, the story object contains no formats (u'format': {}), so there is nothing to sort. This appears to be the root of the issue: _sort_formats raises "No video formats found" when it is given an empty list.
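
The failure path described above can be reproduced outside youtube-dl with a simplified stand-in. This is only a sketch under stated assumptions: `collect_formats`, `sort_formats`, and `NoFormatsError` are hypothetical names that mimic the behavior seen in the traceback, and the real extractor does more work (for example, expanding m3u8 and SMIL entries).

```python
KNOWN_FORMATS = ('threegp', 'm3u8', 'smil', 'mp4', 'mp3')


class NoFormatsError(Exception):
    """Hypothetical stand-in for youtube_dl.utils.ExtractorError."""


def collect_formats(story):
    """Build a flat list of format dicts from the story's multimedia entries."""
    formats = []
    for media in story.get('multimedia', []):
        for format_id, entry in (media.get('format') or {}).items():
            if format_id not in KNOWN_FORMATS:
                continue
            url = entry.get('$text') if isinstance(entry, dict) else None
            if url:
                formats.append({'format_id': format_id, 'url': url})
    return formats


def sort_formats(formats):
    """Sort in place by preference; raise if there is nothing to sort."""
    if not formats:
        raise NoFormatsError('No video formats found')
    formats.sort(key=lambda f: KNOWN_FORMATS.index(f['format_id']))


# Trimmed stand-in for the failing response: 'format' is an empty dict.
tomberlin = {'multimedia': [{'id': '729312902', 'format': {}}]}

try:
    sort_formats(collect_formats(tomberlin))
except NoFormatsError as err:
    print('failed:', err)  # failed: No video formats found
```

With an empty `format` dict the collected list is empty, so the error comes from the data returned for the story rather than from the sorting itself.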

I'm trying to understand the code base better to know whether youtube-dl is supposed to provide the formats included in the story object, or whether this is something that gets passed to us by NPR and there's nothing we can do about it.
