[Crunchyroll] Potential "AttributeError: 'bool' object has no attribute 'find'" #17991

Arfrever · 2018-10-27T20:03:10Z

I am using the latest youtube-dl from the git repository.
While the Crunchyroll site was apparently temporarily down, I hit an AttributeError exception in youtube-dl:

[crunchyroll] 774790: Downloading webpage
[crunchyroll] 774790: Downloading audio-jaJP m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-deDE m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-itIT m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-ptBR m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-esES m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-ruRU m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-frFR m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-esLA m3u8 information
[crunchyroll] 774790: Downloading audio-jaJP-hardsub-enUS m3u8 information
[crunchyroll] 774790: Downloading media info
WARNING: Unable to download XML: HTTP Error 500: Internal Server Error
Traceback (most recent call last):
  File "/root/youtube-dl/bin/youtube-dl", line 6, in <module>
    youtube_dl.main()
  File "/root/youtube-dl/youtube_dl/__init__.py", line 472, in main
    _real_main(argv)
  File "/root/youtube-dl/youtube_dl/__init__.py", line 462, in _real_main
    retcode = ydl.download(all_urls)
  File "/root/youtube-dl/youtube_dl/YoutubeDL.py", line 2001, in download
    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
  File "/root/youtube-dl/youtube_dl/YoutubeDL.py", line 792, in extract_info
    ie_result = ie.extract(url)
  File "/root/youtube-dl/youtube_dl/extractor/common.py", line 507, in extract
    ie_result = self._real_extract(url)
  File "/root/youtube-dl/youtube_dl/extractor/crunchyroll.py", line 601, in _real_extract
    season = xpath_text(metadata, 'series_title')
  File "/root/youtube-dl/youtube_dl/utils.py", line 313, in xpath_text
    n = xpath_element(node, xpath, name, fatal=fatal, default=default)
  File "/root/youtube-dl/youtube_dl/utils.py", line 294, in xpath_element
    n = _find_xpath(xpath)
  File "/root/youtube-dl/youtube_dl/utils.py", line 291, in _find_xpath
    return node.find(compat_xpath(xpath))
AttributeError: 'bool' object has no attribute 'find'
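
The last frame of the traceback can be reproduced in isolation. This is only a minimal sketch using the standard library: `find_series_title` is a hypothetical stand-in for the `node.find(...)` call in `_find_xpath`, not the actual youtube-dl code.

```python
import xml.etree.ElementTree as ET


def find_series_title(node):
    # Hypothetical stand-in for the failing call: node.find(xpath)
    return node.find('series_title')


# Normal case: node is a parsed XML element.
doc = ET.fromstring('<metadata><series_title>Example</series_title></metadata>')
title = find_series_title(doc).text

# Failure case: node is False because the download did not succeed.
try:
    find_series_title(False)
except AttributeError as exc:
    error_message = str(exc)
```

On Python 3, `error_message` should match the last line of the traceback above.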

Looking at the code, it seems that youtube_dl/extractor/common.py:_download_xml() returned False, which it can do when called with fatal=False (although this is not documented in its docstring):

    def _download_xml(
            self, url_or_request, video_id,
            note='Downloading XML', errnote='Unable to download XML',
            transform_source=None, fatal=True, encoding=None,
            data=None, headers={}, query={}, expected_status=None):
        """
        Return the xml as an xml.etree.ElementTree.Element.

        See _download_webpage docstring for arguments specification.
        """
        res = self._download_xml_handle(
            url_or_request, video_id, note=note, errnote=errnote,
            transform_source=transform_source, fatal=fatal, encoding=encoding,
            data=data, headers=headers, query=query,
            expected_status=expected_status)
        return res if res is False else res[0]
               ^^^^^^^^^^^^^^^^^^^
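
The return contract that the marked line implies can be sketched as follows. This is an illustration under assumptions, not the real implementation: `download_handle` and `download` are hypothetical stand-ins for `_download_xml_handle` and `_download_xml`, and the `succeed` flag simulates whether the HTTP request worked.

```python
def download_handle(succeed, fatal=True):
    # Hypothetical stand-in: returns a (document, url_handle) tuple on
    # success, raises when fatal, and returns False otherwise.
    if succeed:
        return ('<parsed xml>', '<url handle>')
    if fatal:
        raise RuntimeError('Unable to download XML')
    return False


def download(succeed, fatal=True):
    res = download_handle(succeed, fatal=fatal)
    # Same shape as the marked line: pass False through unchanged,
    # otherwise unwrap the first tuple element.
    return res if res is False else res[0]


ok_result = download(True, fatal=False)
failed_result = download(False, fatal=False)
```

With `fatal=False`, a caller therefore gets either the document or `False` and has to check which before using it.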

Next youtube_dl/extractor/crunchyroll.py:_call_rpc_api() returned this False unchanged:

    def _call_rpc_api(self, method, video_id, note=None, data=None):
        data = data or {}
        data['req'] = 'RpcApi' + method
        data = compat_urllib_parse_urlencode(data).encode('utf-8')
        return self._download_xml(
            'https://www.crunchyroll.com/xml/',
            video_id, note, fatal=False, data=data, headers={
                'Content-Type': 'application/x-www-form-urlencoded',
            })

That return value was then assigned to metadata here:

        metadata = self._call_rpc_api(
            'VideoPlayer_GetMediaMetadata', video_id,
            note='Downloading media info', data={
                'media_id': video_id,
            })

The XML parsing helpers in youtube_dl/utils.py do not handle False, so it would be better to raise an error earlier.
Perhaps something like this:

--- a/youtube_dl/extractor/crunchyroll.py
+++ b/youtube_dl/extractor/crunchyroll.py
@@ -581,6 +581,8 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
             note='Downloading media info', data={
                 'media_id': video_id,
             })
+        if metadata is False:
+            raise ExtractorError('crunchyroll returned error')
 
         subtitles = {}
         for subtitle in media.get('subtitles', []):
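
The same guard can be tried outside the extractor. This is a sketch under assumptions: the `ExtractorError` class below is a local stand-in for `youtube_dl.utils.ExtractorError`, and `require_xml` is a hypothetical helper that does not exist in youtube-dl.

```python
import xml.etree.ElementTree as ET


class ExtractorError(Exception):
    """Local stand-in for youtube_dl.utils.ExtractorError (illustration only)."""


def require_xml(result, what):
    # Hypothetical helper: fail early with a clear message instead of
    # letting False reach the XML helper functions.
    if result is False:
        raise ExtractorError('Unable to download %s' % what)
    return result


# Successful download: the element passes through and can be queried.
metadata = require_xml(
    ET.fromstring('<m><series_title>S</series_title></m>'), 'media info')
season = metadata.find('series_title').text

# Failed non-fatal download: the False sentinel becomes a clear error.
try:
    require_xml(False, 'media info')
except ExtractorError as exc:
    guard_message = str(exc)
```

Checking with `is False` rather than plain truthiness is deliberate here, since an `Element` without child elements can also evaluate as false.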

dstftw closed this as completed in 08c7d3d Oct 28, 2018

lkho referenced this issue in lkho/youtube-dl Dec 24, 2018

[crunchyroll] Improve extraction failsafeness (closes #17991)

c5bfbc9
