[Crunchyroll] Potential "AttributeError: 'bool' object has no attribute 'find'" #17991

Closed
Arfrever opened this issue Oct 27, 2018 · 0 comments

I use the newest youtube-dl from the git repository.
While the Crunchyroll site was apparently temporarily down, I managed to reproduce an AttributeError exception in youtube-dl:

[crunchyroll] 774790: Downloading webpage
[crunchyroll] 774790: Downloading audio-jaJP m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-deDE m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-itIT m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-ptBR m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-esES m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-ruRU m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-frFR m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-esLA m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-enUS m3u8 information
[crunchyroll] 774790: Downloading media info
WARNING: Unable to download XML: HTTP Error 500: Internal Server Error
Traceback (most recent call last):
  File "/root/youtube-dl/bin/youtube-dl", line 6, in <module>
    youtube_dl.main()
  File "/root/youtube-dl/youtube_dl/__init__.py", line 472, in main
    _real_main(argv)
  File "/root/youtube-dl/youtube_dl/__init__.py", line 462, in _real_main
    retcode = ydl.download(all_urls)
  File "/root/youtube-dl/youtube_dl/YoutubeDL.py", line 2001, in download
    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
  File "/root/youtube-dl/youtube_dl/YoutubeDL.py", line 792, in extract_info
    ie_result = ie.extract(url)
  File "/root/youtube-dl/youtube_dl/extractor/common.py", line 507, in extract
    ie_result = self._real_extract(url)
  File "/root/youtube-dl/youtube_dl/extractor/crunchyroll.py", line 601, in _real_extract
    season = xpath_text(metadata, 'series_title')
  File "/root/youtube-dl/youtube_dl/utils.py", line 313, in xpath_text
    n = xpath_element(node, xpath, name, fatal=fatal, default=default)
  File "/root/youtube-dl/youtube_dl/utils.py", line 294, in xpath_element
    n = _find_xpath(xpath)
  File "/root/youtube-dl/youtube_dl/utils.py", line 291, in _find_xpath
    return node.find(compat_xpath(xpath))
AttributeError: 'bool' object has no attribute 'find'

Judging by the code, youtube_dl/extractor/common.py:_download_xml() returned False. It can do so when called with fatal=False, although its docstring does not document this:

    def _download_xml(
            self, url_or_request, video_id,
            note='Downloading XML', errnote='Unable to download XML',
            transform_source=None, fatal=True, encoding=None,
            data=None, headers={}, query={}, expected_status=None):
        """
        Return the xml as an xml.etree.ElementTree.Element.

        See _download_webpage docstring for arguments specification.
        """
        res = self._download_xml_handle(
            url_or_request, video_id, note=note, errnote=errnote,
            transform_source=transform_source, fatal=fatal, encoding=encoding,
            data=data, headers=headers, query=query,
            expected_status=expected_status)
        return res if res is False else res[0]
               ^^^^^^^^^^^^^^^^^^^

Next, youtube_dl/extractor/crunchyroll.py:_call_rpc_api() returned this False unchanged:

    def _call_rpc_api(self, method, video_id, note=None, data=None):
        data = data or {}
        data['req'] = 'RpcApi' + method
        data = compat_urllib_parse_urlencode(data).encode('utf-8')
        return self._download_xml(
            'https://www.crunchyroll.com/xml/',
            video_id, note, fatal=False, data=data, headers={
                'Content-Type': 'application/x-www-form-urlencoded',
            })

That result was then assigned to metadata here:

        metadata = self._call_rpc_api(
            'VideoPlayer_GetMediaMetadata', video_id,
            note='Downloading media info', data={
                'media_id': video_id,
            })
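
The failure can be reproduced in isolation, without youtube-dl or network access. The following is a minimal sketch: `find_text` is a hypothetical, simplified stand-in for `xpath_text` in youtube_dl/utils.py (not the real helper), and `metadata_failed` simulates the False that a non-fatal download returns after the HTTP 500.

```python
import xml.etree.ElementTree as ET


def find_text(node, xpath):
    # Hypothetical, simplified stand-in for youtube_dl.utils.xpath_text:
    # it calls node.find() directly, as _find_xpath does in the traceback.
    n = node.find(xpath)
    return None if n is None else n.text


# What a successful metadata download would roughly look like (made-up XML).
metadata_ok = ET.fromstring(
    '<media><series_title>Example</series_title></media>')

# What a non-fatal download yields when the server answers with HTTP 500.
metadata_failed = False


def try_lookup(metadata):
    # Returns the looked-up text, or a description of the AttributeError.
    try:
        return find_text(metadata, 'series_title')
    except AttributeError as e:
        return 'AttributeError: %s' % e


print(try_lookup(metadata_ok))
print(try_lookup(metadata_failed))
```

The second call fails for the same reason as the traceback above: False has no find() method.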

The XML parsing helpers in youtube_dl/utils.py do not work with False, so it would be better to raise an error earlier.
Probably something like this:

--- a/youtube_dl/extractor/crunchyroll.py
+++ b/youtube_dl/extractor/crunchyroll.py
@@ -581,6 +581,8 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
             note='Downloading media info', data={
                 'media_id': video_id,
             })
+        if metadata is False:
+            raise ExtractorError('crunchyroll returned error')
 
         subtitles = {}
         for subtitle in media.get('subtitles', []):
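
The guard in the patch above can also be sketched as a standalone function to show its intended behaviour. This is only an illustration under assumptions: `ExtractorError` here is a local stand-in for the class in youtube_dl/utils.py, and `require_metadata` and its error message are hypothetical names, not part of youtube-dl.

```python
import xml.etree.ElementTree as ET


class ExtractorError(Exception):
    """Local stand-in for youtube_dl.utils.ExtractorError (assumption)."""


def require_metadata(metadata, video_id):
    # Hypothetical helper: fail early with a clear message instead of
    # passing False on to the XML helpers.
    if metadata is False:
        raise ExtractorError(
            '%s: Crunchyroll media metadata request failed' % video_id)
    return metadata


# A parsed element passes through unchanged.
element = ET.fromstring('<media><series_title>Example</series_title></media>')
checked = require_metadata(element, '774790')

# False is turned into an ExtractorError at the point of failure.
try:
    require_metadata(False, '774790')
    message = None
except ExtractorError as e:
    message = str(e)
print(message)
```

The check uses `is False` rather than plain truthiness because an ElementTree element with no children is also falsy, so `if not metadata` could reject a valid but empty response.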