Morningstar URL does not scrape #2729
Comments
Thanks for the report.
Fixed in the current version of youtube-dl. Type
Looks like it's not just a matter of fixing the regex to accept "Cover" as well as "cover" in the Morningstar URLs; the lowercase URL http://www.morningstar.com/cover/videoCenter.aspx?id=641059 fails in an identical way.
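For illustration, here is a minimal sketch of a URL match that ignores case in every path segment, so that "Cover"/"cover" and "videoCenter"/"videocenter" are all accepted. The pattern and the helper name `morningstar_video_id` are hypothetical and written for this example only; the extractor's actual pattern is not shown in this thread, so this is not necessarily how youtube-dl does it.

```python
import re

# Hypothetical pattern for illustration only; NOT youtube-dl's actual _VALID_URL.
# re.IGNORECASE makes every literal segment ("cover", "videocenter") case-insensitive.
MORNINGSTAR_URL_RE = re.compile(
    r'https?://(?:www\.)?morningstar\.com/cover/videocenter\.aspx\?id=(?P<id>[0-9]+)',
    re.IGNORECASE,
)


def morningstar_video_id(url):
    """Return the numeric video id from a Morningstar video URL, or None if it does not match."""
    match = MORNINGSTAR_URL_RE.match(url)
    return match.group('id') if match else None


if __name__ == '__main__':
    for url in (
        'http://www.morningstar.com/Cover/videoCenter.aspx?id=641059',
        'http://www.morningstar.com/cover/videoCenter.aspx?id=641059',
    ):
        print(url, '->', morningstar_video_id(url))
```

Without the `re.IGNORECASE` flag, a pattern written entirely in lowercase would reject both URLs above because of the capital "C" in "videoCenter", which would be one possible explanation for the identical failure.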
python -m youtube_dl --skip-download --write-info-json -v http://www.morningstar.com/Cover/videoCenter.aspx?id=641059
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['--skip-download', '--write-info-json', '-v', 'http://www.morningstar.com/Cover/videoCenter.aspx?id=641059']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2014.04.07.4
[debug] Python version 2.7.5 - Darwin-13.1.0-x86_64-i386-64bit
[debug] Proxy map: {}
[generic] videoCenter: Requesting header
WARNING: Falling back on generic information extractor.
[generic] videoCenter: Downloading webpage
[generic] videoCenter: Extracting information
ERROR: Unsupported URL: http://www.morningstar.com/Cover/videoCenter.aspx?id=641059; please report this issue on https://yt-dl.org/bug . Be sure to call youtube-dl with the --verbose flag and include its complete output. Make sure you are using the latest version; type youtube-dl -U to update.
Traceback (most recent call last):
  File "/Users/jill/june/.virtualenv/lib/python2.7/site-packages/youtube_dl/extractor/generic.py", line 387, in _real_extract
    doc = parse_xml(webpage)
  File "/Users/jill/june/.virtualenv/lib/python2.7/site-packages/youtube_dl/utils.py", line 1377, in parse_xml
    return xml.etree.ElementTree.XML(s.encode('utf-8'), **kwargs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1300, in XML
    parser.feed(text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: not well-formed (invalid token): line 21, column 78
Traceback (most recent call last):
  File "/Users/jill/june/.virtualenv/lib/python2.7/site-packages/youtube_dl/YoutubeDL.py", line 514, in extract_info
    ie_result = ie.extract(url)
  File "/Users/jill/june/.virtualenv/lib/python2.7/site-packages/youtube_dl/extractor/common.py", line 161, in extract
    return self._real_extract(url)
  File "/Users/jill/june/.virtualenv/lib/python2.7/site-packages/youtube_dl/extractor/generic.py", line 627, in _real_extract
    raise ExtractorError('Unsupported URL: %s' % url)
ExtractorError: Unsupported URL: http://www.morningstar.com/Cover/videoCenter.aspx?id=641059; please report this issue on https://yt-dl.org/bug . Be sure to call youtube-dl with the --verbose flag and include its complete output. Make sure you are using the latest version; type youtube-dl -U to update.
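The first traceback above shows the generic extractor trying to parse the downloaded page as XML and `xml.etree.ElementTree.XML` raising `ParseError` because the page is not well-formed XML. The sketch below reproduces that behaviour in isolation with the standard library. The helper name `try_parse_xml` and both sample strings are made up for this example, and returning `None` on failure is an assumption about how a caller might fall through to other checks, not a description of youtube-dl's code.

```python
import xml.etree.ElementTree as ET


def try_parse_xml(text):
    """Parse text as XML and return the root element, or None if it is not well-formed."""
    try:
        # Same standard-library call as in the traceback: encode to UTF-8, then parse.
        return ET.XML(text.encode('utf-8'))
    except ET.ParseError:
        return None


# Made-up samples: a small well-formed XML document, and an HTML-like page
# with an unquoted attribute value, which an XML parser rejects.
well_formed = '<feed><title>ok</title></feed>'
html_like = '<html><body><a href=page?x=1&y=2>link</a></body></html>'

if __name__ == '__main__':
    print(try_parse_xml(well_formed) is not None)
    print(try_parse_xml(html_like) is None)
```

An ordinary HTML page will usually fail this parse, so a `ParseError` of this kind does not by itself explain why the URL was unsupported; it only indicates the page is not an XML document.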