Invalid \escape error while trying to download a youtube page #2353

Closed
vis15 opened this issue Feb 9, 2014 · 2 comments

@vis15 vis15 commented Feb 9, 2014

command:
youtube-dl -v http://www.youtube.com/channel/UCjbfdFm4786wi6hFXjDKxSA/videos

Output that I get:

[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-v', 'http://www.youtube.com/channel/UCjbfdFm4786wi6hFXjDKxSA/videos']
[debug] Encodings: locale 'UTF-8', fs 'UTF-8', out 'UTF-8', pref: 'UTF-8'
[debug] youtube-dl version 2014.02.08.2
[debug] Python version 2.7.5+ - Linux-3.11.0-15-generic-x86_64-with-LinuxMint-16-petra
[debug] Proxy map: {}
[youtube:channel] UCjbfdFm4786wi6hFXjDKxSA: Downloading webpage
[youtube:channel] UCjbfdFm4786wi6hFXjDKxSA: Downloading page #1
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "../youtube-dl/__main__.py", line 18, in <module>
  File "../youtube-dl/youtube_dl/__init__.py", line 800, in main
  File "../youtube-dl/youtube_dl/__init__.py", line 790, in _real_main
  File "../youtube-dl/youtube_dl/YoutubeDL.py", line 982, in download
  File "../youtube-dl/youtube_dl/YoutubeDL.py", line 493, in extract_info
  File "../youtube-dl/youtube_dl/extractor/common.py", line 158, in extract
  File "../youtube-dl/youtube_dl/extractor/youtube.py", line 1596, in _real_extract
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 381, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid \escape: line 1 column 40052 (char 40051)

@stwl5 stwl5 commented Feb 9, 2014

I got exactly the same error today and found that it is caused by an uppercase "\U" instead of a lowercase "\u" at the start of a Unicode escape sequence: Python's json decoder (I am also on Python 2.7.x) only recognizes the lowercase variant. As a workaround, I replaced page = json.loads(page) with page = json.loads(page.replace('\\U','\\u')) on line 1596 of "youtube_dl/extractor/youtube.py". The same replacement may be needed in the other information extractors in youtube.py, because I only tried it with "/videos" pages. I also tried the URL you posted above, and it worked after this fix. :)
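
For anyone who wants to reproduce the decoder behavior described above, here is a minimal sketch. It is written for Python 3 (the report is from Python 2.7, where the exception is a plain ValueError; in Python 3 it is a subclass of ValueError). The sample strings are made up for illustration and are not taken from the YouTube response.

```python
import json

# A lowercase \u escape with 4 hex digits is valid JSON and decodes fine.
ok = json.loads('"\\u00e9"')

# An uppercase \U escape is not part of JSON, so the decoder rejects it.
try:
    json.loads('"\\U000000e9"')
    error = None
except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
    error = str(exc)

print(repr(ok))
print(error)
```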

@phihag phihag closed this in 81c2f20 Feb 9, 2014
@phihag phihag (Contributor) commented Feb 9, 2014

Thank you for the report, this will be fixed in the next version. Since I'm currently mobile with a very spotty connection, the release may take some time, but should be out within 10 hours.

Simply replacing \U with \u will not work: YouTube uses \U to describe a character by its code point with 8 hexadecimal digits. In contrast, \u expects exactly 4 hexadecimal digits, so after a plain replacement the decoder reads only the first 4 digits as the character and leaves the remaining 4 as superfluous literal text, which gives an incorrect result.
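
To illustrate the difference, here is a rough Python 3 sketch. It is not the code from the commit referenced above: the helper name `fix_uppercase_escapes` and the sample payload are hypothetical. It rewrites each 8-digit \U escape into its JSON-safe equivalent before decoding and compares that with the plain string replacement.

```python
import json
import re


def fix_uppercase_escapes(s):
    """Rewrite each \\UXXXXXXXX escape as a JSON-compatible escape.

    Hypothetical helper for illustration. It assumes the backslash before
    the U is not itself escaped, and that the 8 digits are a valid code point.
    """
    def to_json_escape(match):
        char = chr(int(match.group(1), 16))
        # json.dumps yields a quoted, escaped string; strip the quotes.
        return json.dumps(char)[1:-1]

    return re.sub(r'\\U([0-9a-fA-F]{8})', to_json_escape, s)


# Made-up payload containing a literal backslash-U escape for U+1F600.
raw = '{"title": "smile \\U0001F600"}'

# Plain replacement: \u0001 is decoded, and "F600" is left over as text.
naive = json.loads(raw.replace('\\U', '\\u'))

# Converting all 8 digits gives the intended single character.
fixed = json.loads(fix_uppercase_escapes(raw))

print(repr(naive["title"]))
print(repr(fixed["title"]))
```

Going through `json.dumps` rather than inserting the raw character keeps the input valid JSON even when the code point happens to be a quote, a backslash, or a control character.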
