UnicodeEncodeError when adding metadata to the output file #9292

Closed
fridtjof opened this issue Apr 23, 2016 · 3 comments

Comments

@fridtjof fridtjof commented Apr 23, 2016

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2016.04.19. If it's not read this FAQ entry and update. Issues with outdated version will be rejected.

  • I've verified and I assure that I'm running youtube-dl 2016.04.19

Before submitting an issue make sure you have:

  • At least skimmed through README and most notably FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your issue


If the purpose of this issue is a bug report, site support request or you are not completely sure provide the full verbose output as follows:

Add -v flag to your command line you run youtube-dl with, copy the whole output and insert it here. It should look similar to one below (replace it with your log inserted between triple ```):

I don't have a command line because I use youtube-dl directly from Python.
These are my options:

ydl_opts = {
    'format': 'bestaudio',
    'postprocessors': [
        {'key': 'FFmpegMetadata'},
        {'key': 'FFmpegExtractAudio',
         'preferredcodec': 'mp3',
         'preferredquality': '192'}],
    'outtmpl': u'/some/path/%(uploader)s - %(title)s.%(ext)s'
}

I'm running Python 3.5.1 via CGI on Lighttpd

Description of your issue, suggested solution and other information

When I download something from SoundCloud (other sites may be affected too) whose description contains non-ASCII characters, the following error is raised on line 163 of youtube_dl/postprocessor/ffmpeg.py, in run_ffmpeg_multiple_files:

UnicodeEncodeError: 'ascii' codec can't encode character '\u21bb' in position 86: ordinal not in range(128)

The parameter cmd contains:

['ffmpeg', '-y', '-i', 'file:/var/lib/mpd/fire/ephixa - Sushi_Killer_-_Waifu_Dream_Ephixa_Remix.wav', '-c', 'copy', 
'-metadata', 'artist=ephixa', '-metadata', 'date=20150915', 
'-metadata', 'description=I remixed https://soundcloud.com/sushi_killer\n\nLike the track? Click the [↻ REPOST] button!\n\nArtwork: http://www.therydesigns.com/\n\n\n', 
'-metadata', 'comment=I remixed https://soundcloud.com/sushi_killer\n\nLike the track? Click the [↻ REPOST] button!\n\nArtwork: http://www.therydesigns.com/\n\n\n', 
'-metadata', 'purl=http://soundcloud.com/ephixa/sushi-killer-waifu-dream-ephixa-remix', 
'-metadata', 'title=Sushi Killer - Waifu Dream (Ephixa Remix)', 
'file:/some/path/ephixa - Sushi_Killer_-_Waifu_Dream_Ephixa_Remix.temp.wav']

Oddly, this never happens when I use the CLI or run the CGI script directly. I don't see how lighttpd could cause it, though.
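
The error itself can be reproduced without ffmpeg or lighttpd. The following is a minimal sketch, assuming the argument ends up being encoded with the ASCII codec, as the error message indicates. The `argument` string is a shortened stand-in for the metadata value above, and `try_encode` is a helper made up for this illustration.

```python
# Shortened stand-in for the '-metadata' value shown above (assumption).
argument = 'description=Click the [\u21bb REPOST] button!'


def try_encode(text, encoding):
    """Return the encoded bytes, or the UnicodeEncodeError that was raised."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


ascii_result = try_encode(argument, 'ascii')
utf8_result = try_encode(argument, 'utf-8')

print(type(ascii_result).__name__)  # UnicodeEncodeError
print(utf8_result)
```

The same string encodes without trouble as UTF-8, which suggests the encoding chosen by the interpreter matters more than the command itself.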

Collaborator

@dstftw dstftw commented Apr 24, 2016

Add 'verbose': True to ydl_opts and post full output.

Author

@fridtjof fridtjof commented Apr 24, 2016

Output via lighttpd log:

WARNING: Assuming --restrict-filenames since file system encoding cannot encode all characters. Set the LC_ALL environment variable to fix this.
[debug] Encodings: locale ANSI_X3.4-1968, fs ascii, out ANSI_X3.4-1968, pref ANSI_X3.4-1968
[debug] youtube-dl version 2016.04.19
[debug] Python version 3.5.1 - Linux-3.16.0-4-amd64-x86_64-with-arch
[debug] exe versions: ffmpeg 3.0.1, ffprobe 3.0.1
[debug] Proxy map: {}
Traceback (most recent call last):
   File "/srv/http/addmix.py", line 45, in <module>
     ydl.download([url])
   File "/usr/lib/python3.5/site-packages/youtube_dl/YoutubeDL.py", line 1730, in download
     url, force_generic_extractor=self.params.get('force_generic_extractor', False))
   File "/usr/lib/python3.5/site-packages/youtube_dl/YoutubeDL.py", line 682, in extract_info
     return self.process_ie_result(ie_result, download, extra_info)
   File "/usr/lib/python3.5/site-packages/youtube_dl/YoutubeDL.py", line 727, in process_ie_result
     return self.process_video_result(ie_result, download=download)
   File "/usr/lib/python3.5/site-packages/youtube_dl/YoutubeDL.py", line 1376, in process_video_result
     self.process_info(new_info)
   File "/usr/lib/python3.5/site-packages/youtube_dl/YoutubeDL.py", line 1712, in process_info
     self.post_process(filename, info_dict)
   File "/usr/lib/python3.5/site-packages/youtube_dl/YoutubeDL.py", line 1776, in post_process
     files_to_delete, info = pp.run(info)
   File "/usr/lib/python3.5/site-packages/youtube_dl/postprocessor/ffmpeg.py", line 426, in run
     self.run_ffmpeg(filename, temp_filename, options)
   File "/usr/lib/python3.5/site-packages/youtube_dl/postprocessor/ffmpeg.py", line 172, in run_ffmpeg
     self.run_ffmpeg_multiple_files([path], out_path, opts)
   File "/usr/lib/python3.5/site-packages/youtube_dl/postprocessor/ffmpeg.py", line 163, in run_ffmpeg_multiple_files
     p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
   File "/usr/lib/python3.5/subprocess.py", line 950, in __init__
     restore_signals, start_new_session)
   File "/usr/lib/python3.5/subprocess.py", line 1483, in _execute_child
     restore_signals, start_new_session, preexec_fn)
UnicodeEncodeError: 'ascii' codec can't encode character '\u21bb' in position 82: ordinal not in range(128)

I suspect this is a locale problem: the necessary environment variables don't seem to be passed on to Python when lighttpd calls it. Both os.environ.get("LANG") and os.environ.get("LC_ALL") return None.
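
A check like this can be run from inside the CGI script to compare it with a shell session. This is only a sketch: `encoding_report` is a name chosen for this example, and the set of values it collects is an assumption about what is useful to look at.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the locale-related settings visible to this interpreter."""
    return {
        'LANG': os.environ.get('LANG'),
        'LC_ALL': os.environ.get('LC_ALL'),
        'filesystem': sys.getfilesystemencoding(),
        'preferred': locale.getpreferredencoding(False),
        'stdout': getattr(sys.stdout, 'encoding', None),
    }


for name, value in encoding_report().items():
    print('{}: {}'.format(name, value))
```

If the `filesystem` entry reads `ascii` under the web server but `utf-8` in a terminal, the difference in environment is the likely culprit.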

I solved it by adding this to my lighttpd config:

server.modules += ( "mod_setenv" )
setenv.add-environment = ( "LANG" => "en_US.UTF-8" )
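
The effect of that setting can be approximated from Python by starting a child interpreter with and without LANG. This is a rough sketch, not a test of lighttpd itself: `child_fs_encoding` and `PROBE` are names made up here, and the result depends on the Python version and on which locales are installed (newer Python versions may report UTF-8 in both cases).

```python
import os
import subprocess
import sys

# One-liner run in the child interpreter.
PROBE = 'import sys; print(sys.getfilesystemencoding())'


def child_fs_encoding(lang=None):
    """Run a child interpreter and return the filesystem encoding it reports.

    Locale variables are stripped from the inherited environment; LANG is
    set only when `lang` is given.
    """
    env = {k: v for k, v in os.environ.items()
           if k not in ('LANG', 'LANGUAGE') and not k.startswith('LC_')}
    if lang is not None:
        env['LANG'] = lang
    out = subprocess.check_output([sys.executable, '-c', PROBE], env=env)
    return out.decode('ascii').strip()


print('no locale vars:', child_fs_encoding())
print('LANG=en_US.UTF-8:', child_fs_encoding('en_US.UTF-8'))
```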
Collaborator

@yan12125 yan12125 commented Apr 24, 2016

In the latest CPython, Popen() internally encodes arguments using the filesystem encoding with the 'surrogateescape' error handler. Maybe encodeArgument() should do the encoding itself, even on Python 3:

diff --git a/youtube_dl/utils.py b/youtube_dl/utils.py
index f333e47..08a5154 100644
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@@ -486,10 +486,6 @@ def encodeFilename(s, for_subprocess=False):

     assert type(s) == compat_str

-    # Python 3 has a Unicode API
-    if sys.version_info >= (3, 0):
-        return s
-
     # Pass '' directly to use Unicode APIs on Windows 2000 and up
     # (Detecting Windows NT 4 is tricky because 'major >= 4' would
     # match Windows 9x series as well. Besides, NT 4 is obsolete.)
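
What explicit encoding could buy is sketched below. This is not youtube-dl's code: `encode_argument_lossy` is a hypothetical helper, and the `'ignore'` error handler is an assumption made for illustration. The sketch shows that 'surrogateescape' still raises for a character such as U+21BB, while a lenient handler drops it instead.

```python
def encode_argument_lossy(arg, encoding):
    """Hypothetical helper: encode arg, dropping unencodable characters."""
    return arg.encode(encoding, 'ignore')


arg = 'description=[\u21bb REPOST]'

# 'surrogateescape' only round-trips lone surrogates (U+DC80..U+DCFF);
# U+21BB is an ordinary character, so the ASCII codec still rejects it.
try:
    arg.encode('ascii', 'surrogateescape')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(strict_failed)                        # True
print(encode_argument_lossy(arg, 'ascii'))  # b'description=[ REPOST]'
print(encode_argument_lossy(arg, 'utf-8'))
```

On POSIX, subprocess.Popen accepts bytes arguments, so values encoded this way could be passed through unchanged. The trade-off is that metadata silently loses characters when the encoding is ASCII.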

Before:

$ env -i python ./youtube_dl/__main__.py -v test:youtube --add-metadata
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-v', 'test:youtube', '--add-metadata']
WARNING: Assuming --restrict-filenames since file system encoding cannot encode all characters. Set the LC_ALL environment variable to fix this.
[debug] Encodings: locale ANSI_X3.4-1968, fs ascii, out ANSI_X3.4-1968, pref ANSI_X3.4-1968
[debug] youtube-dl version 2016.04.19
[debug] Git HEAD: 2a7dee8
[debug] Python version 3.5.1 - Linux-4.5.0-1-ARCH-x86_64-with-arch-Arch-Linux
[debug] exe versions: avconv v12_dev0-2591-gd12b5b2, avprobe v12_dev0-2591-gd12b5b2, ffmpeg 3.0.1, ffprobe 3.0.1, rtmpdump 2.4
[debug] Proxy map: {}
[TestURL] Test URL: http://www.youtube.com/watch?v=BaW_jenozKc&t=1s&end=9
[youtube] BaW_jenozKc: Downloading webpage
[youtube] BaW_jenozKc: Downloading video info webpage
[youtube] BaW_jenozKc: Extracting video information
[youtube] BaW_jenozKc: Downloading MPD manifest
[debug] Invoking downloader on 'https://r5---sn-5njj-u2xe.googlevideo.com/videoplayback?id=05a5bf8de9e8cca7&itag=137&source=youtube&requiressl=yes&mv=m&ms=au&initcwndbps=6000000&mn=sn-5njj-u2xe&mm=31&pl=17&ratebypass=yes&mime=video/mp4&gir=yes&clen=2208750&lmt=1387961822987808&dur=9.800&fexp=9412777,9413140,9414671,9416126,9416891,9417250,9419451,9419670,9422596,9425077,9425945,9426927,9428398,9430935,9431012,9432024,9432604,9432683,9433097,9433192,9433425,9433495,9433947,9434022&mt=1461497772&upn=pUeFNacOluI&signature=3D64B6C17510FDE3BED2CEB7874333AB936929DC.068C6EA31AF29841ED61E91174AC0A9EE8AD63EA&key=dg_yt0&sver=3&ip=140.112.230.216&ipbits=0&expire=1461519567&sparams=ip,ipbits,expire,id,itag,source,requiressl,mv,ms,initcwndbps,mn,mm,pl,ratebypass,mime,gir,clen,lmt,dur'
[download] Destination: youtube-dl_test_video-BaW_jenozKc.f137.mp4
[download] 100% of 2.11MiB in 00:00
[debug] Invoking downloader on 'https://r5---sn-5njj-u2xe.googlevideo.com/videoplayback?id=05a5bf8de9e8cca7&itag=141&source=youtube&requiressl=yes&mv=m&ms=au&initcwndbps=6000000&mn=sn-5njj-u2xe&mm=31&pl=17&ratebypass=yes&mime=audio/mp4&gir=yes&clen=315992&lmt=1387961817988214&dur=9.891&fexp=9412777,9413140,9414671,9416126,9416891,9417250,9419451,9419670,9422596,9425077,9425945,9426927,9428398,9430935,9431012,9432024,9432604,9432683,9433097,9433192,9433425,9433495,9433947,9434022&mt=1461497772&upn=pUeFNacOluI&signature=06D10776E4BC9183FD3B6F61730236F6393A2B0E.92E178B49A25747A313B522A605E7B826DC9FE1C&key=dg_yt0&sver=3&ip=140.112.230.216&ipbits=0&expire=1461519567&sparams=ip,ipbits,expire,id,itag,source,requiressl,mv,ms,initcwndbps,mn,mm,pl,ratebypass,mime,gir,clen,lmt,dur'
[download] Destination: youtube-dl_test_video-BaW_jenozKc.f141.m4a
[download] 100% of 308.59KiB in 00:00
[ffmpeg] Merging formats into "youtube-dl_test_video-BaW_jenozKc.mp4"
[debug] ffmpeg command line: avconv -y -i file:youtube-dl_test_video-BaW_jenozKc.f137.mp4 -i file:youtube-dl_test_video-BaW_jenozKc.f141.m4a -c copy -map 0:v:0 -map 1:a:0 file:youtube-dl_test_video-BaW_jenozKc.temp.mp4
Deleting original file youtube-dl_test_video-BaW_jenozKc.f137.mp4 (pass -k to keep)
Deleting original file youtube-dl_test_video-BaW_jenozKc.f141.m4a (pass -k to keep)
[ffmpeg] Adding metadata to 'youtube-dl_test_video-BaW_jenozKc.mp4'
[debug] ffmpeg command line: avconv -y -i file:youtube-dl_test_video-BaW_jenozKc.mp4 -c copy -metadata 'purl=https://www.youtube.com/watch?v=BaW_jenozKc' -metadata 'comment=test chars:  "'"'"'/\
test URL: https://github.com/rg3/youtube-dl/issues/1892

This is a test video for youtube-dl.

For more information, contact phihag@phihag.de .' -metadata 'description=test chars:  "'"'"'/\
test URL: https://github.com/rg3/youtube-dl/issues/1892

This is a test video for youtube-dl.

For more information, contact phihag@phihag.de .' -metadata 'title=youtube-dl test video "'"'"'/\' -metadata 'artist=Philipp Hagemeister' -metadata date=20121002 file:youtube-dl_test_video-BaW_jenozKc.temp.mp4
Traceback (most recent call last):
  File "./youtube_dl/__main__.py", line 19, in <module>
    youtube_dl.main()
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/__init__.py", line 419, in main
    _real_main(argv)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/__init__.py", line 409, in _real_main
    retcode = ydl.download(all_urls)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 1732, in download
    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 684, in extract_info
    return self.process_ie_result(ie_result, download, extra_info)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 736, in process_ie_result
    extra_info=extra_info)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 684, in extract_info
    return self.process_ie_result(ie_result, download, extra_info)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 729, in process_ie_result
    return self.process_video_result(ie_result, download=download)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 1378, in process_video_result
    self.process_info(new_info)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 1714, in process_info
    self.post_process(filename, info_dict)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 1778, in post_process
    files_to_delete, info = pp.run(info)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/postprocessor/ffmpeg.py", line 426, in run
    self.run_ffmpeg(filename, temp_filename, options)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/postprocessor/ffmpeg.py", line 172, in run_ffmpeg
    self.run_ffmpeg_multiple_files([path], out_path, opts)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/postprocessor/ffmpeg.py", line 163, in run_ffmpeg_multiple_files
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
  File "/usr/lib/python3.5/subprocess.py", line 950, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.5/subprocess.py", line 1483, in _execute_child
    restore_signals, start_new_session, preexec_fn)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 25-27: ordinal not in range(128)

After:

$ env -i python ./youtube_dl/__main__.py -v test:youtube --add-metadata        
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-v', 'test:youtube', '--add-metadata']
WARNING: Assuming --restrict-filenames since file system encoding cannot encode all characters. Set the LC_ALL environment variable to fix this.
[debug] Encodings: locale ANSI_X3.4-1968, fs ascii, out ANSI_X3.4-1968, pref ANSI_X3.4-1968
[debug] youtube-dl version 2016.04.19
[debug] Git HEAD: 2a7dee8
[debug] Python version 3.5.1 - Linux-4.5.0-1-ARCH-x86_64-with-arch-Arch-Linux
[debug] exe versions: avconv v12_dev0-2591-gd12b5b2, avprobe v12_dev0-2591-gd12b5b2, ffmpeg 3.0.1, ffprobe 3.0.1, rtmpdump 2.4
[debug] Proxy map: {}
[TestURL] Test URL: http://www.youtube.com/watch?v=BaW_jenozKc&t=1s&end=9
[youtube] BaW_jenozKc: Downloading webpage
[youtube] BaW_jenozKc: Downloading video info webpage
[youtube] BaW_jenozKc: Extracting video information
[youtube] BaW_jenozKc: Downloading MPD manifest
[debug] Invoking downloader on 'https://r5---sn-5njj-u2xe.googlevideo.com/videoplayback?id=05a5bf8de9e8cca7&itag=137&source=youtube&requiressl=yes&mn=sn-5njj-u2xe&mm=31&mv=m&pl=17&ms=au&initcwndbps=6000000&ratebypass=yes&mime=video/mp4&gir=yes&clen=2208750&lmt=1387961822987808&dur=9.800&mt=1461497772&sver=3&signature=93012DEAFC7630BAE908F186FA57ABB409CE6303.88A6C92A9C6F1C2AC4223EA2E6C8AFC244481987&upn=i9iWOyhereE&key=dg_yt0&fexp=9416126,9416891,9422596,9426927,9428398,9431012,9433097,9433223,9433947&ip=140.112.230.216&ipbits=0&expire=1461519500&sparams=ip,ipbits,expire,id,itag,source,requiressl,mn,mm,mv,pl,ms,initcwndbps,ratebypass,mime,gir,clen,lmt,dur'
[download] Destination: youtube-dl_test_video-BaW_jenozKc.f137.mp4
[download] 100% of 2.11MiB in 00:00
[debug] Invoking downloader on 'https://r5---sn-5njj-u2xe.googlevideo.com/videoplayback?id=05a5bf8de9e8cca7&itag=141&source=youtube&requiressl=yes&mn=sn-5njj-u2xe&mm=31&mv=m&pl=17&ms=au&initcwndbps=6000000&ratebypass=yes&mime=audio/mp4&gir=yes&clen=315992&lmt=1387961817988214&dur=9.891&mt=1461497772&sver=3&signature=94222849A2D6FFFCE4615E31F0264FA9926A26A3.4E3AC2105807DD96372D88409F5191369058042D&upn=i9iWOyhereE&key=dg_yt0&fexp=9416126,9416891,9422596,9426927,9428398,9431012,9433097,9433223,9433947&ip=140.112.230.216&ipbits=0&expire=1461519500&sparams=ip,ipbits,expire,id,itag,source,requiressl,mn,mm,mv,pl,ms,initcwndbps,ratebypass,mime,gir,clen,lmt,dur'
[download] Destination: youtube-dl_test_video-BaW_jenozKc.f141.m4a
[download] 100% of 308.59KiB in 00:00
[ffmpeg] Merging formats into "youtube-dl_test_video-BaW_jenozKc.mp4"
[debug] ffmpeg command line: avconv -y -i file:youtube-dl_test_video-BaW_jenozKc.f137.mp4 -i file:youtube-dl_test_video-BaW_jenozKc.f141.m4a -c copy -map 0:v:0 -map 1:a:0 file:youtube-dl_test_video-BaW_jenozKc.temp.mp4
Deleting original file youtube-dl_test_video-BaW_jenozKc.f137.mp4 (pass -k to keep)
Deleting original file youtube-dl_test_video-BaW_jenozKc.f141.m4a (pass -k to keep)
[ffmpeg] Adding metadata to 'youtube-dl_test_video-BaW_jenozKc.mp4'
[debug] ffmpeg command line: avconv -y -i file:youtube-dl_test_video-BaW_jenozKc.mp4 -c copy -metadata 'title=youtube-dl test video "'"'"'/\' -metadata 'comment=test chars:  "'"'"'/\
test URL: https://github.com/rg3/youtube-dl/issues/1892

This is a test video for youtube-dl.

For more information, contact phihag@phihag.de .' -metadata 'purl=https://www.youtube.com/watch?v=BaW_jenozKc' -metadata date=20121002 -metadata 'description=test chars:  "'"'"'/\
test URL: https://github.com/rg3/youtube-dl/issues/1892

This is a test video for youtube-dl.

For more information, contact phihag@phihag.de .' -metadata 'artist=Philipp Hagemeister' file:youtube-dl_test_video-BaW_jenozKc.temp.mp4