Files Title encoding (UTF-8) #836

Open
thiagof opened this issue May 11, 2013 · 9 comments

Comments

@thiagof thiagof commented May 11, 2013

YouTube uses UTF-8 encoding for its pages, but I'm experiencing issues when downloading files whose titles contain some special characters (Brazilian Portuguese).

Under puppy linux 528, ROX-Filer tells me that the files (written on ext4) have "bad utf-8". youtube-dl still downloads them.

But if I try to download directly to a FAT/NTFS file system, youtube-dl is unable to create the files and gives the following error:

sh-4.1# youtube-dl http://www.youtube.com/watch?v=MvmUcd7bokY
[youtube] Setting language
[youtube] MvmUcd7bokY: Downloading video webpage
[youtube] MvmUcd7bokY: Downloading video info webpage
[youtube] MvmUcd7bokY: Extracting video information
ERROR: unable to open for writing: [Errno 84] Invalid or incomplete multibyte or wide character: 'M\xe1rcia Peltier Entrevista Dr. Alberto - Parte 18-MvmUcd7bokY.flv.part'

Seems like a string conversion check needs to be made.
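
For illustration, here is a minimal Python sketch of why the name in the error message is rejected. It assumes the byte 0xE1 shown as `\xe1` is the Latin-1 encoding of "á"; that is an inference from the error output, not something confirmed in this thread. The helper `is_valid_utf8` is hypothetical and not part of youtube-dl.

```python
# Bytes as they appear in the error message: 0xE1 is "á" in Latin-1,
# but a lone 0xE1 followed by "r" is not a complete UTF-8 sequence.
raw = b"M\xe1rcia Peltier"


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Assumption: the title was encoded as Latin-1 somewhere along the way.
text = raw.decode("latin-1")
fixed = text.encode("utf-8")

print(is_valid_utf8(raw))    # False
print(is_valid_utf8(fixed))  # True
print(fixed)                 # b'M\xc3\xa1rcia Peltier'
```

A file system driver that insists on valid UTF-8 names would refuse the first byte string and accept the second.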

Collaborator

@FiloSottile FiloSottile commented May 18, 2013

On my system it works fine.
What's your locale on the puppy?

When downloading to a dumb filesystem, instead, you'll probably want to use --restrict-filenames.
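
One way to answer the locale question is to ask Python which encodings it has picked up from the environment. The sketch below uses only standard-library calls; the function name `report_encodings` is hypothetical, and this is not necessarily how youtube-dl itself gathers the values.

```python
import locale
import sys


def report_encodings() -> dict:
    """Collect the encodings Python derives from the current environment."""
    return {
        # Encoding implied by the user's locale settings (LANG, LC_ALL, ...).
        "locale": locale.getpreferredencoding(False),
        # Encoding Python uses to convert file names to bytes.
        "fs": sys.getfilesystemencoding(),
        # Encoding of standard output; may be None when output is redirected.
        "out": getattr(sys.stdout, "encoding", None),
    }


encodings = report_encodings()
for name, value in sorted(encodings.items()):
    print("%s: %s" % (name, value))
```

If any of these is not UTF-8 on the affected machine, non-ASCII titles may be written with a different byte encoding than the file system expects.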

Contributor

@yasoob yasoob commented Jul 3, 2013

@thiagof did --restrict-filenames help?

Author

@thiagof thiagof commented Jul 3, 2013

Yes, --restrict-filenames works. It helps by stripping the "bad characters", not by fixing their encoding.
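
The difference between stripping and fixing can be sketched as follows. The function `restrict_to_ascii` is a hypothetical illustration of the stripping approach, not youtube-dl's actual implementation: it drops every non-ASCII character and replaces spaces with underscores.

```python
def restrict_to_ascii(name: str) -> str:
    """Hypothetical helper: drop non-ASCII characters, replace spaces."""
    kept = (ch for ch in name if ord(ch) < 128)
    return "".join(kept).replace(" ", "_")


title = "M\u00e1rcia Peltier Entrevista"  # "Márcia Peltier Entrevista"
print(restrict_to_ascii(title))  # Mrcia_Peltier_Entrevista
```

The result is safe on any file system, but the "á" is lost rather than preserved in a correct encoding.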

@thiagof thiagof closed this Jul 3, 2013
Contributor

@phihag phihag commented Jul 3, 2013

Well, this should work without --restrict-filenames, which is a hack and unsatisfactory. Reopening.

@phihag phihag reopened this Jul 3, 2013
Contributor

@yasoob yasoob commented Feb 4, 2014

@thiagof can you confirm whether this issue still exists?

@laalex laalex commented Mar 13, 2014

It still exists.

Contributor

@phihag phihag commented Mar 13, 2014

@roshkattu Can you post the full output you get when you run youtube-dl with the -v option? That output will tell us how to reproduce the problem.

@smnuman smnuman commented Jun 13, 2014

@phihag Here is what I've got...

[debug] System config: []
[debug] User config: ['--retries', '1024', '--no-overwrites', '--continue', '--write-description', '--write-info-json', '--write-annotations', '--write-thumbnail', '--console-title', '-f', 'mp4', '--all-subs', '-k', '--no-post-overwrites', '--embed-thumbnail', '--add-metadata', '-o', '~/Videos/youtube-dl/%(extractor)s/%(uploader)s/%(title)s-%(id)s.%(ext)']
[debug] Command-line args: ['-v', 'https://www.youtube.com/watch?v=B2YZ1_4IDkQ']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2014.06.09
[debug] Python version 2.7.3 - Linux-3.8.0-34-generic-x86_64-with-Ubuntu-12.04-precise
[debug] Proxy map: {}
[youtube] Setting language
[youtube] B2YZ1_4IDkQ: Downloading webpage
[youtube] B2YZ1_4IDkQ: Downloading video info webpage
[youtube] B2YZ1_4IDkQ: Extracting video information
WARNING: video doesn't have subtitles
[youtube] B2YZ1_4IDkQ: Searching for annotations.
ERROR: Error in output template: incomplete format (encoding: 'UTF-8')
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/youtube_dl/YoutubeDL.py", line 446, in prepare_filename
    filename = tmpl % template_dict
ValueError: incomplete format

Could you please trace anything out of it?

EDIT: Update:

I tried running it without the -o output template and got the following:

89% @ 21:25 [1992:63] (~) -bash $ \youtube-dl https://www.youtube.com/watch?v=B2YZ1_4IDkQ
[youtube] Setting language
[youtube] B2YZ1_4IDkQ: Downloading webpage
[youtube] B2YZ1_4IDkQ: Downloading video info webpage
[youtube] B2YZ1_4IDkQ: Extracting video information
WARNING: video doesn't have subtitles
[youtube] B2YZ1_4IDkQ: Searching for annotations.
[info] Writing video description to: How to manage stories in Drupal - Drupal Tutorial-B2YZ1_4IDkQ.mp4.description
[info] Writing video annotations to: How to manage stories in Drupal - Drupal Tutorial-B2YZ1_4IDkQ.mp4.annotations.xml
[info] Writing video description metadata as JSON to: How to manage stories in Drupal - Drupal Tutorial-B2YZ1_4IDkQ.info.json
[youtube] B2YZ1_4IDkQ: Downloading thumbnail ...
[youtube] B2YZ1_4IDkQ: Writing thumbnail to: How to manage stories in Drupal - Drupal Tutorial-B2YZ1_4IDkQ.jpg
[download] Destination: How to manage stories in Drupal - Drupal Tutorial-B2YZ1_4IDkQ.mp4
[download] 100% of 2.65MiB in 01:00
[ffmpeg] Adding metadata to 'How to manage stories in Drupal - Drupal Tutorial-B2YZ1_4IDkQ.mp4'
ERROR: AtomicParsley was not found. Please install.

And I tried installing AtomicParsley as said:

73% @ 21:27 [1992:64] (~) -bash $ sudo apt-get install AtomicParsley
Reading package lists... Done
Building dependency tree       
Reading state information... Done
W: Duplicate sources.list entry http://extras.ubuntu.com/ubuntu/ precise/main amd64 Packages (/var/lib/apt/lists/extras.ubuntu.com_ubuntu_dists_precise_main_binary-amd64_Packages)
W: Duplicate sources.list entry http://extras.ubuntu.com/ubuntu/ precise/main i386 Packages (/var/lib/apt/lists/extras.ubuntu.com_ubuntu_dists_precise_main_binary-i386_Packages)
W: You may want to run apt-get update to correct these problems
E: Unable to locate package AtomicParsley

...to no avail.

Collaborator

@jaimeMF jaimeMF commented Jun 13, 2014

@smnuman The issue is unrelated. The first error occurs because there must be an s after %(ext); the correct output template is ~/Videos/youtube-dl/%(extractor)s/%(uploader)s/%(title)s-%(id)s.%(ext)s.
It seems that the Ubuntu package is named atomicparsley.
If you have further problems, please open a new issue.
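
The "incomplete format" error can be reproduced with plain Python string formatting, which is what the `tmpl % template_dict` line in the traceback uses. In this sketch the dictionary values and the `render` helper are made up for illustration; only the template syntax comes from the thread.

```python
# Illustrative values; a real run would fill these from the video metadata.
info = {"title": "Drupal Tutorial", "id": "B2YZ1_4IDkQ", "ext": "mp4"}

broken = "%(title)s-%(id)s.%(ext)"    # conversion type "s" missing at the end
correct = "%(title)s-%(id)s.%(ext)s"


def render(template: str, values: dict) -> str:
    """Hypothetical helper: format the template or describe the failure."""
    try:
        return template % values
    except ValueError as exc:
        return "error: %s" % exc


print(render(broken, info))   # error: incomplete format
print(render(correct, info))  # Drupal Tutorial-B2YZ1_4IDkQ.mp4
```

Each `%(name)` placeholder needs a conversion type such as `s` after the closing parenthesis, otherwise the `%` operator raises `ValueError`.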

7 participants