radio-canada.ca site support request #4020

Open
anarcat opened this issue Oct 24, 2014 · 9 comments

@anarcat anarcat commented Oct 24, 2014

hello

it would be great if this would work:

anarcat@angela:Downloads$ python ./youtube-dl --verbose http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7184272
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['--verbose', 'http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7184272']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2014.10.24
[debug] Python version 2.7.3 - Linux-3.2.0-4-amd64-x86_64-with-debian-7.6
[debug] Proxy map: {}
[generic] 7184272: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 7184272: Downloading webpage
[generic] 7184272: Extracting information
ERROR: Unsupported URL: http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7184272; please report this issue on https://yt-dl.org/bug . Be sure to call youtube-dl with the --verbose flag and include its complete output. Make sure you are using the latest version; type  youtube-dl -U  to update.
Traceback (most recent call last):
  File "./youtube-dl/youtube_dl/extractor/generic.py", line 553, in _real_extract
    doc = parse_xml(webpage)
  File "./youtube-dl/youtube_dl/utils.py", line 1550, in parse_xml
    tree = xml.etree.ElementTree.XML(s.encode('utf-8'), **kwargs)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1301, in XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1643, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1507, in _raiseerror
    raise err
ParseError: syntax error: line 1, column 0
Traceback (most recent call last):
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 526, in extract_info
    ie_result = ie.extract(url)
  File "./youtube-dl/youtube_dl/extractor/common.py", line 193, in extract
    return self._real_extract(url)
  File "./youtube-dl/youtube_dl/extractor/generic.py", line 933, in _real_extract
    raise ExtractorError('Unsupported URL: %s' % url)
ExtractorError: Unsupported URL: http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7184272; please report this issue on https://yt-dl.org/bug . Be sure to call youtube-dl with the --verbose flag and include its complete output. Make sure you are using the latest version; type  youtube-dl -U  to update.

this is with the latest version.

@anarcat anarcat commented Dec 7, 2014

another example:

anarcat@marcos:youtube-dl-2014.10.30$ ./youtube-dl --verbose https://ici.radio-canada.ca/nouvelles/societe/2014/12/05/001-entrevue-mere-monique-marc-lepine-tueur-polytechnique-25-ans.shtml?isAutoPlay=1
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['--verbose', 'https://ici.radio-canada.ca/nouvelles/societe/2014/12/05/001-entrevue-mere-monique-marc-lepine-tueur-polytechnique-25-ans.shtml?isAutoPlay=1']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2014.10.30
[debug] Python version 2.7.8 - Linux-3.16.0-4-amd64-x86_64-with-debian-jessie-sid
[debug] exe versions: avconv 11-6, avprobe 11-6
[debug] Proxy map: {}
[generic] 001-entrevue-mere-monique-marc-lepine-tueur-polytechnique-25-ans: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 001-entrevue-mere-monique-marc-lepine-tueur-polytechnique-25-ans: Downloading webpage
[generic] 001-entrevue-mere-monique-marc-lepine-tueur-polytechnique-25-ans: Extracting information
ERROR: Unsupported URL: https://ici.radio-canada.ca/nouvelles/societe/2014/12/05/001-entrevue-mere-monique-marc-lepine-tueur-polytechnique-25-ans.shtml?isAutoPlay=1; please report this issue on https://yt-dl.org/bug . Be sure to call youtube-dl with the --verbose flag and include its complete output. Make sure you are using the latest version; type  youtube-dl -U  to update.
Traceback (most recent call last):
  File "./youtube-dl/youtube_dl/extractor/generic.py", line 581, in _real_extract
    doc = parse_xml(webpage)
  File "./youtube-dl/youtube_dl/utils.py", line 1629, in parse_xml
    tree = xml.etree.ElementTree.XML(s.encode('utf-8'), **kwargs)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1300, in XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
ParseError: syntax error: line 2, column 0
Traceback (most recent call last):
  File "./youtube-dl/youtube_dl/YoutubeDL.py", line 533, in extract_info
    ie_result = ie.extract(url)
  File "./youtube-dl/youtube_dl/extractor/common.py", line 198, in extract
    return self._real_extract(url)
  File "./youtube-dl/youtube_dl/extractor/generic.py", line 962, in _real_extract
    raise ExtractorError('Unsupported URL: %s' % url)
ExtractorError: Unsupported URL: https://ici.radio-canada.ca/nouvelles/societe/2014/12/05/001-entrevue-mere-monique-marc-lepine-tueur-polytechnique-25-ans.shtml?isAutoPlay=1; please report this issue on https://yt-dl.org/bug . Be sure to call youtube-dl with the --verbose flag and include its complete output. Make sure you are using the latest version; type  youtube-dl -U  to update.
remitamine added a commit that referenced this issue May 24, 2016

@baldurmen baldurmen commented Jun 20, 2016

@remitamine even with the latest git clone, this still fails.

Commit is indeed there:

$ git log 444417edb55a5bf471697a3b2353fdbfb6f7e26d
commit 444417edb55a5bf471697a3b2353fdbfb6f7e26d
Author: remitamine <remitamine@gmail.com>
Date:   Tue May 24 15:58:27 2016 +0100

    [radiocanada] Add new extractor(#4020)

I'm not familiar with youtube-dl development. Do you need to add something else for this to work? From the trace, it does not seem to use your extractor...

$ youtube-dl --verbose http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7554948
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'--verbose', u'http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7554948']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2016.02.22
[debug] Python version 2.7.11+ - Linux-4.4.0-1-amd64-x86_64-with-debian-stretch-sid
[debug] exe versions: avconv 2.8.6-1, avprobe 2.8.6-1, ffmpeg 2.8.6-1, ffprobe 2.8.6-1, rtmpdump 2.4
[debug] Proxy map: {}
[generic] 7554948: Requesting header
WARNING: Falling back on generic information extractor.
[generic] 7554948: Downloading webpage
[generic] 7554948: Extracting information
ERROR: Unsupported URL: http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7554948
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/youtube_dl/extractor/generic.py", line 1308, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/lib/python2.7/dist-packages/youtube_dl/compat.py", line 248, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=etree.TreeBuilder(element_factory=_element_factory)))
  File "/usr/lib/python2.7/dist-packages/youtube_dl/compat.py", line 237, in _XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1653, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1517, in _raiseerror
    raise err
ParseError: syntax error: line 1, column 0
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/youtube_dl/YoutubeDL.py", line 666, in extract_info
    ie_result = ie.extract(url)
  File "/usr/lib/python2.7/dist-packages/youtube_dl/extractor/common.py", line 316, in extract
    return self._real_extract(url)
  File "/usr/lib/python2.7/dist-packages/youtube_dl/extractor/generic.py", line 1950, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7554948

@yan12125 yan12125 (Collaborator) commented Jun 20, 2016

> [debug] youtube-dl version 2016.02.22

It seems you have multiple youtube-dl versions installed. However, the latest version is also broken:

$ youtube-dl --verbose http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7554948 
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['--verbose', 'http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7554948']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2016.06.19.1
[debug] Git HEAD: 8197079
[debug] Python version 3.5.1 - Linux-4.6.2-1-ARCH-x86_64-with-arch
[debug] exe versions: avconv v12_dev0-2785-g1e9c5bf, avprobe v12_dev0-2785-g1e9c5bf, ffmpeg 3.0.2, ffprobe 3.0.2, rtmpdump 2.4
[debug] Proxy map: {}
[radiocanada] 7554948: Downloading flash XML
[radiocanada] 7554948: Downloading metadata XML
Traceback (most recent call last):
  File "<string>", line 23, in <module>
  File "/home/yen/Projects/youtube-dl/youtube_dl/__init__.py", line 420, in main
    _real_main(argv)
  File "/home/yen/Projects/youtube-dl/youtube_dl/__init__.py", line 410, in _real_main
    retcode = ydl.download(all_urls)
  File "/home/yen/Projects/youtube-dl/youtube_dl/YoutubeDL.py", line 1740, in download
    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
  File "/home/yen/Projects/youtube-dl/youtube_dl/YoutubeDL.py", line 687, in extract_info
    return self.process_ie_result(ie_result, download, extra_info)
  File "/home/yen/Projects/youtube-dl/youtube_dl/YoutubeDL.py", line 733, in process_ie_result
    return self.process_video_result(ie_result, download=download)
  File "/home/yen/Projects/youtube-dl/youtube_dl/YoutubeDL.py", line 1386, in process_video_result
    self.process_info(new_info)
  File "/home/yen/Projects/youtube-dl/youtube_dl/YoutubeDL.py", line 1451, in process_info
    if len(info_dict['title']) > 200:
TypeError: object of type 'NoneType' has no len()
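
The last frame fails because the extractor handed back `None` as the title, and `len(None)` is undefined. Below is a minimal sketch of that failure mode and of one possible guard. The helper `title_too_long` is hypothetical and is not part of youtube-dl; it only illustrates the check, not the project's actual fix.

```python
def title_too_long(info_dict, limit=200):
    """Hypothetical helper: a length check that tolerates a missing title."""
    title = info_dict.get('title')
    if title is None:
        # An extractor that found no title leaves None here;
        # calling len() on it is what raises the TypeError in the trace.
        return False
    return len(title) > limit


print(title_too_long({'title': None}))
print(title_too_long({'title': 'x' * 201}))
```

A guard like this only hides the symptom; the extractor still needs to supply a usable title.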

@baldurmen baldurmen commented Jun 20, 2016

@yan12125 oh damn, my bad, I wasn't using the right exec path -_-'

But your trace is accurate; I get the same thing. It does work fine for other URLs, though (http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7184272). Maybe it's an issue on the website side.
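
For anyone else who ends up running an older copy by accident, here is a rough sketch that lists every `youtube-dl` executable on the search path, in the order the shell would find them. The function name `find_executables` is made up for this example, and it assumes a Unix-like system where the file is called exactly `youtube-dl`.

```python
import os


def find_executables(name, path=None):
    """Return every executable file called `name` on the search path, in lookup order."""
    if path is None:
        path = os.environ.get('PATH', '')
    found = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            found.append(candidate)
    return found


if __name__ == '__main__':
    # The first entry printed is the one a bare `youtube-dl` command runs.
    for exe in find_executables('youtube-dl'):
        print(exe)
```

If more than one path is printed, comparing the `[debug] youtube-dl version` line of each copy's `--verbose` output shows which one is stale.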

@remitamine remitamine (Collaborator) commented Jun 20, 2016

The extractor works for most of the content from the supported URLs, but it didn't work for this URL because the data returned from the API doesn't contain a title:

<Meta name="AV-nomEmission">lasoireeestencorejeune</Meta>
<Meta name="Title" />
<Meta name="TitleID" />
<Meta name="Author" />
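
To make the empty-title case concrete, here is a small sketch that parses `Meta` elements like the ones above and treats an empty element as missing. The `<Metas>` wrapper and the `get_meta` helper are assumptions made for this example, and falling back to the show name is only one possible choice, not what the extractor does.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element; only the <Meta> lines come from the API response.
SAMPLE = '''<Metas>
<Meta name="AV-nomEmission">lasoireeestencorejeune</Meta>
<Meta name="Title" />
<Meta name="TitleID" />
<Meta name="Author" />
</Metas>'''


def get_meta(doc, name):
    """Return the stripped text of <Meta name=...>, or None if absent or empty."""
    for node in doc.iter('Meta'):
        if node.get('name') == name:
            text = (node.text or '').strip()
            return text or None
    return None


doc = ET.fromstring(SAMPLE)
# "Title" is present but empty, so the lookup yields None and the
# show name is used instead (an illustrative fallback only).
title = get_meta(doc, 'Title') or get_meta(doc, 'AV-nomEmission')
print(title)
```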

@yan12125 yan12125 (Collaborator) commented Jun 20, 2016

I guess the title can be extracted from the webpage in such cases.

By the way, seems this issue can be closed after all titles are correctly extracted?
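
A rough sketch of what such a webpage fallback could look like, using only the standard library. The function `title_from_webpage` and the sample markup are invented for illustration; the sketch assumes double-quoted attributes with `property` written before `content`, and makes no claim about what the real page contains.

```python
import re
from html import unescape


def title_from_webpage(webpage):
    """Best-effort title: the og:title meta tag first, then the <title> element."""
    patterns = (
        # Assumes double-quoted attributes, with property before content.
        r'<meta[^>]+property="og:title"[^>]+content="([^"]+)"',
        r'<title[^>]*>([^<]+)</title>',
    )
    for pattern in patterns:
        match = re.search(pattern, webpage, re.IGNORECASE)
        if match:
            return unescape(match.group(1)).strip()
    return None


# Invented sample markup, not taken from radio-canada.ca.
sample = ('<html><head><title>Fallback &amp; more</title>'
          '<meta property="og:title" content="Example title" /></head></html>')
print(title_from_webpage(sample))
```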

@remitamine remitamine (Collaborator) commented Jun 20, 2016

> By the way, seems this issue can be closed after all titles are correctly extracted?

I added support for two types of URLs; I didn't check the articles that contain videos (like the second URL in the issue).
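
For context, youtube-dl picks an extractor by matching the URL against each extractor's URL pattern and falls back to the generic one when nothing matches. The sketch below shows why a widget URL is handled while an article URL is not. The regular expression is a hypothetical stand-in that covers only the widget-style URLs; the real extractor's pattern may differ.

```python
import re

# Hypothetical pattern for the widget-style URLs; the real extractor's
# expression may differ and also covers a second URL type.
WIDGET_URL = re.compile(
    r'https?://ici\.radio-canada\.ca/widgets/mediaconsole/(?P<app>[^/]+)/(?P<id>\d+)')


def pick_extractor(url):
    """Return 'radiocanada' when the dedicated pattern matches, else 'generic'."""
    return 'radiocanada' if WIDGET_URL.match(url) else 'generic'


urls = [
    'http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7554948',
    'https://ici.radio-canada.ca/nouvelles/societe/2014/12/05/'
    '001-entrevue-mere-monique-marc-lepine-tueur-polytechnique-25-ans.shtml?isAutoPlay=1',
]
for url in urls:
    print(pick_extractor(url), url)
```

Article pages would need their own pattern plus a step that finds the embedded media ID in the page.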


@nlevitt nlevitt commented Jun 23, 2017

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'http://ici.radio-canada.ca/nouvelle/780619/pensionnats-autochtones-ottawa-acadie-terre-neuve-et-labrador']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.06.23
[debug] Git HEAD: e6b5770f6c
[debug] Python version 3.5.2 - Darwin-15.6.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 3.2.2, ffprobe 3.2.2
[debug] Proxy map: {}
[generic] pensionnats-autochtones-ottawa-acadie-terre-neuve-et-labrador: Requesting header
WARNING: Falling back on generic information extractor.
[generic] pensionnats-autochtones-ottawa-acadie-terre-neuve-et-labrador: Downloading webpage
[generic] pensionnats-autochtones-ottawa-acadie-terre-neuve-et-labrador: Extracting information
ERROR: Unsupported URL: http://ici.radio-canada.ca/nouvelle/780619/pensionnats-autochtones-ottawa-acadie-terre-neuve-et-labrador
Traceback (most recent call last):
  File "/Users/nlevitt/workspace/brozzler/brozzler-ve35/lib/python3.5/site-packages/youtube_dl/YoutubeDL.py", line 762, in extract_info
    ie_result = ie.extract(url)
  File "/Users/nlevitt/workspace/brozzler/brozzler-ve35/lib/python3.5/site-packages/youtube_dl/extractor/common.py", line 433, in extract
    ie_result = self._real_extract(url)
  File "/Users/nlevitt/workspace/brozzler/brozzler-ve35/lib/python3.5/site-packages/youtube_dl/extractor/generic.py", line 2796, in _real_extract
    raise UnsupportedError(url)
youtube_dl.utils.UnsupportedError: Unsupported URL: http://ici.radio-canada.ca/nouvelle/780619/pensionnats-autochtones-ottawa-acadie-terre-neuve-et-labrador

@jnbdz jnbdz commented Jul 12, 2017

Here is another example, but this one has two videos:

youtube-dl -v http://ici.radio-canada.ca/nouvelle/1044381/anne-marie-dussault-entrevue-omar-khadr-canada-prison?fromBeta=true
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'http://ici.radio-canada.ca/nouvelle/1044381/anne-marie-dussault-entrevue-omar-khadr-canada-prison?fromBeta=true']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2017.07.09
[debug] Python version 2.7.12 - Linux-4.4.0-72-generic-x86_64-with-Ubuntu-16.04-xenial
[debug] exe versions: ffmpeg 2.8.11-0ubuntu0.16.04.1, ffprobe 2.8.11-0ubuntu0.16.04.1
[debug] Proxy map: {}
[generic] anne-marie-dussault-entrevue-omar-khadr-canada-prison?fromBeta=true: Requesting header
WARNING: Falling back on generic information extractor.
[generic] anne-marie-dussault-entrevue-omar-khadr-canada-prison?fromBeta=true: Downloading webpage
[generic] anne-marie-dussault-entrevue-omar-khadr-canada-prison?fromBeta=true: Extracting information
ERROR: Unsupported URL: http://ici.radio-canada.ca/nouvelle/1044381/anne-marie-dussault-entrevue-omar-khadr-canada-prison?fromBeta=true
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 2043, in _real_extract
    doc = compat_etree_fromstring(webpage.encode('utf-8'))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2539, in compat_etree_fromstring
    doc = _XML(text, parser=etree.XMLParser(target=_TreeBuilder(element_factory=_element_factory)))
  File "/usr/local/bin/youtube-dl/youtube_dl/compat.py", line 2528, in _XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1653, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1517, in _raiseerror
    raise err
ParseError: syntax error: line 66, column 0
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 762, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 433, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/generic.py", line 2893, in _real_extract
    raise UnsupportedError(url)
UnsupportedError: Unsupported URL: http://ici.radio-canada.ca/nouvelle/1044381/anne-marie-dussault-entrevue-omar-khadr-canada-prison?fromBeta=true