NBCNEWS falls back to generic extractor #6922
Going through the NBCNewsIE class, the regex for valid url is:
If I am right, it only accepts pages from the sections video, watch, feature and nightly-news. On testing the video section, it looks like the URL format has changed: the id is no longer separated by /. For this issue, adding business to the matcher makes the program use the correct extractor, but the "unable to extract bootstrap json" error comes up again.
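To illustrate why the section list matters, here is a minimal sketch using a hypothetical pattern. It is not the actual `_VALID_URL` from NBCNewsIE; `build_pattern`, `SECTIONS`, and the assumption that the id is the trailing `-n<digits>` part of the URL are mine, based only on the example URL in this issue.

```python
import re

# Hypothetical section whitelist, taken from the sections named above.
SECTIONS = ["video", "watch", "feature", "nightly-news"]


def build_pattern(sections):
    """Build an illustrative URL matcher (NOT the real NBCNewsIE._VALID_URL).

    Assumption: the id is the digits after a trailing "-n" in the last
    path component, i.e. it is not separated from the slug by "/".
    """
    alternation = "|".join(re.escape(s) for s in sections)
    return re.compile(
        r"https?://(?:www\.)?nbcnews\.com/(?:%s)/.+?-n(?P<id>\d+)$" % alternation
    )


URL = (
    "http://www.nbcnews.com/business/autos/"
    "volkswagen-11-million-vehicles-could-have-suspect-software-"
    "emissions-scandal-n431456"
)

# Without "business" the URL is rejected, so a generic fallback would apply.
before = build_pattern(SECTIONS).match(URL)

# With "business" added, the URL is accepted and the id can be captured.
after = build_pattern(SECTIONS + ["business"]).match(URL)

print(before is None)
print(after.group("id"))
```

This only covers URL matching. It does not address the separate "unable to extract bootstrap json" failure, which happens after the correct extractor has been selected.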
URL: http://www.nbcnews.com/business/autos/volkswagen-11-million-vehicles-could-have-suspect-software-emissions-scandal-n431456
C:>youtube-dl -v "http://www.nbcnews.com/business/autos/volkswagen-11-million-v
ehicles-could-have-suspect-software-emissions-scandal-n431456"
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.nbcnews.com/business/autos/volks
wagen-11-million-vehicles-could-have-suspect-software-emissions-scandal-n431456'
]
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2015.09.09
[debug] Python version 2.7.8 - Windows-7-6.1.7601-SP1
[debug] exe versions: ffmpeg N-71346-gdf4fca2
[debug] Proxy map: {}
[generic] volkswagen-11-million-vehicles-could-have-suspect-software-emissions-s
candal-n431456: Requesting header
WARNING: Falling back on generic information extractor.
[generic] volkswagen-11-million-vehicles-could-have-suspect-software-emissions-s
candal-n431456: Downloading webpage
[generic] volkswagen-11-million-vehicles-could-have-suspect-software-emissions-s
candal-n431456: Extracting information
ERROR: Unsupported URL: http://www.nbcnews.com/business/autos/volkswagen-11-mill
ion-vehicles-could-have-suspect-software-emissions-scandal-n431456
Traceback (most recent call last):
File "youtube_dl\extractor\generic.pyo", line 1222, in _real_extract
File "youtube_dl\utils.pyo", line 1656, in parse_xml
File "xml\etree\ElementTree.pyo", line 1300, in XML
File "xml\etree\ElementTree.pyo", line 1642, in feed
File "xml\etree\ElementTree.pyo", line 1506, in _raiseerror
ParseError: syntax error: line 1, column 0
Traceback (most recent call last):
File "youtube_dl\YoutubeDL.pyo", line 660, in extract_info
File "youtube_dl\extractor\common.pyo", line 287, in extract
File "youtube_dl\extractor\generic.pyo", line 1820, in _real_extract
UnsupportedError: Unsupported URL: http://www.nbcnews.com/business/autos/volkswa
gen-11-million-vehicles-could-have-suspect-software-emissions-scandal-n431456
Thanks
Ringo