[extractor/common] _parse_mpd_formats raises ExtractorError on AdaptationSet with contentType="text" and mimeType="application/mp4" #14630

Closed
tobijjah opened this issue Oct 30, 2017 · 2 comments
@tobijjah tobijjah commented Oct 30, 2017

  • I've verified and I assure that I'm running youtube-dl 2017.10.29
  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones
  • Bug report (encountered problems with youtube-dl)

Description

If you try to parse the provided example MPD manifest file with _parse_mpd_formats (or _extract_mpd_formats), the method raises an ExtractorError. The error occurs when the parser reaches the AdaptationSet with contentType="text" and mimeType="application/mp4". In the attached code snippet (where the error occurs), it looks like the parser should skip an AdaptationSet with this contentType, but it does not. Instead, content_type is derived from the mimeType and set to 'application', so the condition is not fulfilled (see the comments in the snippet). I attached three possible solutions because I don't know whether this behavior is intended or a bug in the parser. I also don't know whether the proposed solutions have side effects (I tested them with only a few examples, and in my case they worked as they should).

for representation in adaptation_set.findall(_add_ns('Representation')):
    if is_drm_protected(representation):
        continue
    representation_attrib = adaptation_set.attrib.copy()
    representation_attrib.update(representation.attrib)  # contentType is text
    # According to [1, 5.3.7.2, Table 9, page 41], @mimeType is mandatory
    mime_type = representation_attrib['mimeType']  # mimeType is application/mp4
    content_type = mime_type.split('/')[0]  # sets content_type to application
    if content_type == 'text':  # therefore this condition is not fulfilled and this AdaptationSet is not skipped
        # TODO implement WebVTT downloading
        pass
    elif content_type in ('video', 'audio'):
        ...
    else:  # and it raises an error
        self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
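The mismatch can be reproduced outside youtube-dl. The following is a minimal sketch, assuming a hand-written AdaptationSet without an XML namespace (the manifest from the report is not reproduced here) and a hypothetical helper `merged_attribs` that mirrors the attribute merge in the snippet above:

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written fragment; NOT the manifest from the report.
MPD_FRAGMENT = """
<AdaptationSet contentType="text" mimeType="application/mp4" lang="en">
  <Representation id="sub-en" bandwidth="1000"/>
</AdaptationSet>
"""


def merged_attribs(adaptation_set, representation):
    """Merge attributes; Representation values override AdaptationSet values."""
    attribs = dict(adaptation_set.attrib)
    attribs.update(representation.attrib)
    return attribs


adaptation_set = ET.fromstring(MPD_FRAGMENT)
representation = adaptation_set.find('Representation')
attribs = merged_attribs(adaptation_set, representation)

mime_type = attribs['mimeType']
content_type = mime_type.split('/')[0]  # major type of the MIME string

print(attribs.get('contentType'))  # text
print(content_type)                # application
```

The contentType attribute says "text", but the value actually compared against 'text' comes from the MIME type, so the check fails.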
  • Possible solutions for common.InfoExtractor._parse_mpd_formats():
# First: tested, and my preferred solution
representation_attrib.update(representation.attrib)
mime_type = representation_attrib['mimeType']
content_type = mime_type.split('/')[0]
if content_type == 'text' or representation_attrib.get('contentType') == 'text':  # change
    pass

# Second: untested
representation_attrib.update(representation.attrib)
mime_type = representation_attrib['mimeType']
content_type = representation_attrib.get('contentType')  # change
if content_type == 'text':
    pass

# Third: tested
representation_attrib.update(representation.attrib)
mime_type = representation_attrib['mimeType']
content_type = mime_type.split('/')[0]
if content_type in ('text', 'application'):  # change
    pass
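To compare the three proposals side by side, here is a rough sketch that reduces each one to a standalone function. The function names, the return labels ('skip', 'video', 'audio', 'warn'), and the test cases are hypothetical and only illustrate the branching; the sketch also assumes that contentType may be absent from a manifest.

```python
def classify_first(mime_type, content_type_attr=None):
    """First proposal: skip if the MIME major type OR contentType is 'text'."""
    major = mime_type.split('/')[0]
    if major == 'text' or content_type_attr == 'text':
        return 'skip'
    return major if major in ('video', 'audio') else 'warn'


def classify_second(mime_type, content_type_attr=None):
    """Second proposal: decide on contentType alone (mime_type is ignored)."""
    if content_type_attr == 'text':
        return 'skip'
    if content_type_attr in ('video', 'audio'):
        return content_type_attr
    return 'warn'


def classify_third(mime_type, content_type_attr=None):
    """Third proposal: treat the 'application' major type like 'text'."""
    major = mime_type.split('/')[0]
    if major in ('text', 'application'):
        return 'skip'
    return major if major in ('video', 'audio') else 'warn'


# (mimeType, contentType) pairs; None means the attribute is absent.
CASES = [
    ('application/mp4', 'text'),
    ('text/vtt', None),
    ('video/mp4', None),
    ('application/mp4', None),
]

for mime_type, attr in CASES:
    print(mime_type, attr,
          classify_first(mime_type, attr),
          classify_second(mime_type, attr),
          classify_third(mime_type, attr))
```

Under these assumptions, all three proposals skip the reported case. They differ elsewhere: the second one warns whenever contentType is missing, even for video/mp4, and the third one silently skips every application/* type.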
Collaborator

@dstftw dstftw commented Oct 31, 2017

_parse_mpd_formats does not throw ExtractorError.

@dstftw dstftw closed this Oct 31, 2017
@dstftw dstftw added the invalid label Oct 31, 2017
Author

@tobijjah tobijjah commented Oct 31, 2017

No? Then why this traceback? Did you even read my description or test it with the provided valid MPD manifest file?
Edit:
Sorry, I just had a closer look at the traceback and found that the ExtractorError is thrown by test_download.py. Does this mean that during normal execution of youtube-dl, warnings are just reported? Nevertheless, the described issue exists and leads to avoidable warnings, so maybe it should be fixed.
And it would be nice if you didn't answer with just one line on an issue that obviously took a lot of work (debugging, writing, etc.).

Traceback (most recent call last):
  File "/home/ilex/Documents/code/python/projects/yt-dl/youtube-dl/youtube_dl/YoutubeDL.py", line 784, in extract_info
    ie_result = ie.extract(url)
  File "/home/ilex/Documents/code/python/projects/yt-dl/youtube-dl/youtube_dl/extractor/common.py", line 434, in extract
    ie_result = self._real_extract(url)
  File "/home/ilex/Documents/code/python/projects/yt-dl/youtube-dl/youtube_dl/extractor/hbo.py", line 309, in _real_extract
    return self._extract_from_path('series/%s' % path)
  File "/home/ilex/Documents/code/python/projects/yt-dl/youtube-dl/youtube_dl/extractor/hbo.py", line 154, in _extract_from_path
    return self._extract_from_xml(api_data, path)
  File "/home/ilex/Documents/code/python/projects/yt-dl/youtube-dl/youtube_dl/extractor/hbo.py", line 147, in _extract_from_xml
    data = self._extract_mpd_formats(url, 12)
  File "/home/ilex/Documents/code/python/projects/yt-dl/youtube-dl/youtube_dl/extractor/common.py", line 1753, in _extract_mpd_formats
    formats_dict=formats_dict, mpd_url=mpd_url)
  File "/home/ilex/Documents/code/python/projects/yt-dl/youtube-dl/youtube_dl/extractor/common.py", line 2007, in _parse_mpd_formats
    self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
  File "/home/ilex/Documents/code/python/projects/yt-dl/youtube-dl/youtube_dl/extractor/common.py", line 702, in report_warning
    '[%s] %s%s' % (self.IE_NAME, idstr, msg))
  File "/home/ilex/Documents/code/python/projects/yt-dl/youtube-dl/test/helper.py", line 244, in _report_warning
    real_warning(w)
  File "/home/ilex/Documents/code/python/projects/yt-dl/youtube-dl/test/test_download.py", line 52, in report_warning
    raise ExtractorError(message)
youtube_dl.utils.ExtractorError: [hbo:series] Unknown MIME type application/mp4 in DASH manifest; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
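The last frames of the traceback suggest that the test downloader overrides report_warning to raise instead of print. The following is a simplified sketch of that pattern, not the actual youtube-dl code: every class and function name in it is a hypothetical stand-in.

```python
class ExtractorError(Exception):
    """Stand-in for youtube_dl.utils.ExtractorError."""


class Downloader:
    """Stand-in for the regular downloader: warnings are only printed."""

    def report_warning(self, message):
        print('WARNING: %s' % message)


class StrictTestDownloader(Downloader):
    """Stand-in for the test subclass: any warning fails the test."""

    def report_warning(self, message):
        raise ExtractorError(message)


def parse_unknown_mime(downloader, mime_type):
    """Mimic the 'else' branch of the parser: warn, then carry on."""
    downloader.report_warning(
        'Unknown MIME type %s in DASH manifest' % mime_type)
    return []  # no formats were collected for this Representation


# Normal run: the warning is printed and parsing continues.
formats = parse_unknown_mime(Downloader(), 'application/mp4')

# Test run: the same call surfaces as an ExtractorError.
try:
    parse_unknown_mime(StrictTestDownloader(), 'application/mp4')
except ExtractorError as err:
    print('raised: %s' % err)
```

With this pattern the parser itself never raises. The same warning shows up as an exception only when the strict test downloader is in use.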
@Madball73 Madball73 mentioned this issue Mar 2, 2018