[vlive] UnicodeEncodeError on Hangul URLs #9352

Closed
Kagami opened this issue Apr 30, 2016 · 4 comments

Comments

@Kagami
Contributor

@Kagami Kagami commented Apr 30, 2016

  • I've verified and I assure that I'm running youtube-dl 2016.04.24
  • At least skimmed through README and most notably FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

If the purpose of this issue is a bug report, site support request or you are not completely sure provide the full verbose output as follows:

$ python -m youtube_dl -v -F 'http://www.vlive.tv/video/7740/매드타운-조타의-고민상담소MADTOWN-JOTAs-COUNSELING-CENTER'       
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-v', '-F', 'http://www.vlive.tv/video/7740/매드타운-조타의-고민상담소MADTOWN-JOTAs-COUNSELING-CENTER']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2016.04.24
[debug] Git HEAD: 7691184
[debug] Python version 3.4.3 - Linux-4.5.0-gentoo-x86_64-Intel-R-_Core-TM-_i7-3820_CPU_@_3.60GHz-with-gentoo-2.2
[debug] exe versions: ffmpeg N-79663-ge639f50, ffprobe N-79663-ge639f50, rtmpdump 2.4
[debug] Proxy map: {}
[vlive] 7740: Downloading webpage
[vlive] 7740: Downloading JSON status
Traceback (most recent call last):
  File "/usr/lib64/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.4/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/__main__.py", line 19, in <module>
    youtube_dl.main()
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/__init__.py", line 419, in main
    _real_main(argv)
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/__init__.py", line 409, in _real_main
    retcode = ydl.download(all_urls)
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/YoutubeDL.py", line 1732, in download
    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/YoutubeDL.py", line 673, in extract_info
    ie_result = ie.extract(url)
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/extractor/common.py", line 341, in extract
    return self._real_extract(url)
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/extractor/vlive.py", line 46, in _real_extract
    headers={'Referer': url})
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/extractor/common.py", line 533, in _download_json
    encoding=encoding, data=data, headers=headers, query=query)
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/extractor/common.py", line 501, in _download_webpage
    res = self._download_webpage_handle(url_or_request, video_id, note, errnote, fatal, encoding=encoding, data=data, headers=headers, query=query)
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/extractor/common.py", line 408, in _download_webpage_handle
    urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query)
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/extractor/common.py", line 388, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/YoutubeDL.py", line 1942, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib64/python3.4/urllib/request.py", line 463, in open
    response = self._open(req, data)
  File "/usr/lib64/python3.4/urllib/request.py", line 481, in _open
    '_open', req)
  File "/usr/lib64/python3.4/urllib/request.py", line 441, in _call_chain
    result = func(*args)
  File "/home/kagami/code/tmp/youtube-dl/youtube_dl/utils.py", line 750, in http_open
    req)
  File "/usr/lib64/python3.4/urllib/request.py", line 1182, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "/usr/lib64/python3.4/http/client.py", line 1088, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib64/python3.4/http/client.py", line 1121, in _send_request
    self.putheader(hdr, value)
  File "/usr/lib64/python3.4/http/client.py", line 1064, in putheader
    values[i] = one_value.encode('latin-1')
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 31-34: ordinal not in range(256)
@yan12125
Collaborator

@yan12125 yan12125 commented Apr 30, 2016

In general, URLs should be correctly percent-encoded:

$ youtube-dl -vF "http://www.vlive.tv/video/7740/%EB%A7%A4%EB%93%9C%ED%83%80%EC%9A%B4-%EC%A1%B0%ED%83%80%EC%9D%98-%EA%B3%A0%EB%AF%BC%EC%83%81%EB%8B%B4%EC%86%8CMADTOWN-JOTAs-COUNSELING-CENTER"
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-vF', 'http://www.vlive.tv/video/7740/%EB%A7%A4%EB%93%9C%ED%83%80%EC%9A%B4-%EC%A1%B0%ED%83%80%EC%9D%98-%EA%B3%A0%EB%AF%BC%EC%83%81%EB%8B%B4%EC%86%8CMADTOWN-JOTAs-COUNSELING-CENTER']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2016.04.24
[debug] Git HEAD: cef3f30
[debug] Python version 3.5.1 - Linux-4.5.1-1-ARCH-x86_64-with-arch-Arch-Linux
[debug] exe versions: avconv v12_dev0-2591-gd12b5b2, avprobe v12_dev0-2591-gd12b5b2, ffmpeg 3.0.2, ffprobe 3.0.2, rtmpdump 2.4
[debug] Proxy map: {}
[vlive] 7740: Downloading webpage
[vlive] 7740: Downloading JSON status
[vlive] 7740: Downloading m3u8 information
[vlive] 7740: Downloading m3u8 information
[vlive] 7740: Downloading m3u8 information
[vlive] 7740: Downloading m3u8 information
[info] Available formats for 7740:
format code  extension  resolution note
audio        mp4        audio only   70k , mp4a.40.2
1200-meta    mp4        multiple   Quality selection URL 
250-meta     mp4        multiple   Quality selection URL 
400-meta     mp4        multiple   Quality selection URL 
audio-meta   mp4        multiple   Quality selection URL 
250          mp4        184x320     170k , avc1.77.21, mp4a.40.2
1200         mp4        368x640     249k , avc1.66.41, mp4a.40.2
400          mp4        368x640     549k , avc1.77.30, mp4a.40.2 (best)

When I copy the URL from Firefox, it's automatically escaped by Firefox. How did you get the URL?
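For reference, the escaping that the browser applies can be approximated with the standard library. This is a minimal sketch, not youtube-dl's own code: `percent_encode_url` is a hypothetical helper name, and the choice of `safe` characters is an assumption that keeps existing `%` escapes intact so an already-encoded URL is not encoded twice.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def percent_encode_url(url):
    """Hypothetical helper: percent-encode non-ASCII characters in path and query."""
    parts = urlsplit(url)
    # Keep '/' and existing '%' escapes; non-ASCII characters become UTF-8 %XX triplets.
    path = quote(parts.path, safe='/%')
    query = quote(parts.query, safe='=&%')
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


raw = 'http://www.vlive.tv/video/7740/매드타운-조타의-고민상담소MADTOWN-JOTAs-COUNSELING-CENTER'
encoded = percent_encode_url(raw)
print(encoded)

# The encoded form is plain ASCII and decodes back to the original URL.
assert unquote(encoded) == raw
```

An encoded URL contains only ASCII characters, so it can be sent as a header value without tripping the latin-1 encoding step.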

@Kagami
Contributor Author

@Kagami Kagami commented Apr 30, 2016

(Ironically, that's my code, but I'm not sure how best to fix it. 'Referer': url.encode('utf-8') works, but I think _download_json should understand unicode headers?)

Re: "How did you get the URL?"

I have network.standard-url.escape-utf8 set to false in about:config.
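The idea of handling unicode headers in one place, rather than in each extractor, could look roughly like the sketch below. It is only an illustration under assumptions: `encode_header_values` is a hypothetical name and not part of youtube-dl, and it targets Python 3, where text is `str`.

```python
def encode_header_values(headers, encoding='utf-8'):
    """Hypothetical helper: return a copy of headers with str values encoded to bytes."""
    return {
        name: value.encode(encoding) if isinstance(value, str) else value
        for name, value in headers.items()
    }


url = 'http://www.vlive.tv/video/7740/매드타운'
headers = encode_header_values({'Referer': url})

# The Referer is now a bytestring, so no latin-1 encoding of text is needed later.
print(type(headers['Referer']).__name__)
```

Values that are already bytes pass through unchanged, so the helper can be applied to any headers dict.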

@dstftw
Collaborator

@dstftw dstftw commented Apr 30, 2016

Per RFC 2616, the default charset is ISO-8859-1 (latin-1). So for now you should either pass bytestrings directly (as you suggested) or pass latin-1-encodable strings.
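The failure can be reproduced without any network access. The sketch below mirrors the `one_value.encode('latin-1')` call shown at the bottom of the traceback; the shortened URL is an assumption made for brevity.

```python
referer = 'http://www.vlive.tv/video/7740/매드타운'

# Text header values are encoded as latin-1, which cannot represent Hangul.
try:
    referer.encode('latin-1')
    latin1_ok = True
except UnicodeEncodeError as exc:
    latin1_ok = False
    error_start = exc.start  # index of the first character that cannot be encoded
    print(exc)

# Workaround from this thread: pass a UTF-8 bytestring instead of text.
as_bytes = referer.encode('utf-8')
print(as_bytes)
```

The first Hangul character sits right after the 31-character ASCII prefix, which matches the position reported in the original error message.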

@Kagami
Contributor Author

@Kagami Kagami commented Apr 30, 2016

Could you apply this fix then?

diff --git a/youtube_dl/extractor/vlive.py b/youtube_dl/extractor/vlive.py
index 7f9e99e..a672ea9 100644
--- a/youtube_dl/extractor/vlive.py
+++ b/youtube_dl/extractor/vlive.py
@@ -43,7 +43,7 @@ class VLiveIE(InfoExtractor):
         status_params = self._download_json(
             'http://www.vlive.tv/video/status?videoSeq=%s' % video_id,
             video_id, 'Downloading JSON status',
-            headers={'Referer': url})
+            headers={'Referer': url.encode('utf-8')})
         status = status_params.get('status')
         air_start = status_params.get('onAirStartAt', '')
         is_live = status_params.get('isLive')

This is too small for a PR.

@dstftw dstftw closed this in d41ee7b Apr 30, 2016