Alternative solution for issue #4787 #5527

Open
MrS0m30n3 opened this issue Apr 25, 2015 · 9 comments

Comments

@MrS0m30n3

MrS0m30n3 commented Apr 25, 2015

@phihag

Currently the # 1 solution (3a0d2f5) provided by @dstftw is not the best in my opinion, since it does not completely solve the encoding problems.

On Windows you lose part of the filename, which is not acceptable for most users.
See issues #5045, #5182, etc., and MrS0m30n3/youtube-dl-gui#39 on my repository.

Can we implement the # 2 solution suggested by @dstftw? (#4787)

Perform all the subprocessing on temporary file with plain ASCII name and rollback to original name right after it's done.

This way we can completely fix those issues.

You can assign me to this issue if you want.
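
The rename-and-rollback idea quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ascii_alias` is a hypothetical helper that does not exist in youtube-dl, the random `uuid` name is an arbitrary choice, and the sketch assumes the external tool works on the file in place.

```python
import contextlib
import os
import uuid


@contextlib.contextmanager
def ascii_alias(path):
    """Temporarily rename `path` to a plain-ASCII name in the same directory.

    Hypothetical helper, not part of youtube-dl. Yields the temporary path
    and renames the file back when the block exits, even on error.
    """
    directory, name = os.path.split(path)
    ext = os.path.splitext(name)[1]
    # Keep the extension only if it is itself ASCII.
    try:
        ext.encode('ascii')
    except UnicodeEncodeError:
        ext = ''
    tmp_path = os.path.join(directory, uuid.uuid4().hex + ext)
    os.rename(path, tmp_path)
    try:
        yield tmp_path
    finally:
        # Roll back to the original name if the temporary file still exists.
        if os.path.exists(tmp_path):
            os.rename(tmp_path, path)
```

Usage would look like `with ascii_alias(filename) as tmp: ...`, with the post-processing call receiving `tmp` instead of the original name. Note that only the filename is replaced here; the directory part of the path is left untouched.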

@maleficarium

I have been using this Python script that I wrote for merging the files that youtube-dl failed to merge.

I'm currently looking into how to integrate it into youtube-dl but since I am not familiar with the code yet it will probably take me some time. If someone else wishes to do so, go ahead.

@MrS0m30n3
Author

MrS0m30n3 commented Apr 29, 2015

Hi, @maleficarium

You also need to consider the path name.
For example if the filename that we pass to the post_process method (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L1502) is something like
/home/user/Downloads/λήψεις/test.flv then you can't pass that to the subprocess Popen.

Thus I'm starting to see some drawbacks in the # 2 solution. You can't really rename the whole path.
Maybe you could change the working directory and then pass only the filenames into the subprocess Popen, or maybe we could use a temporary path to download the files when we detect that the post_process won't be able to handle them.
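
The working-directory variant might look something like the sketch below. `run_in_file_dir` is a hypothetical name, not an existing youtube-dl function, and whether `os.chdir` accepts such a directory depends on the platform and Python version. The filename itself still has to be safe to pass on the command line.

```python
import os
import subprocess


def run_in_file_dir(cmd, path):
    """Run `cmd` with the file's bare name as the last argument.

    Hypothetical helper: the process is started from the file's own
    directory, so a non-ASCII directory never appears in the arguments.
    """
    directory, name = os.path.split(os.path.abspath(path))
    previous = os.getcwd()
    os.chdir(directory)
    try:
        # Only the filename is passed; the child inherits the working directory.
        return subprocess.call(cmd + [name])
    finally:
        # Always restore the original working directory.
        os.chdir(previous)
```

Changing the working directory affects the whole process, so this would not be safe if several downloads were post-processed in parallel threads.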

Also, we should consider the # 3 solution suggested by dstftw.
It's not so hard to write a minimalistic wrapper in C for the CreateProcess to support unicode.

@maleficarium

@MrS0m30n3 I finally got some time to work on this.
Thankfully youtube-dl is passing only the filenames to ffmpeg so encoding the path is not necessary. Simply hashing the filenames before processing them and renaming them back afterwards solves the problem.
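
As a rough illustration of the hashing approach (this is not the code from the pull request; `hashed_path` and `process_with_hashed_names` are hypothetical names, SHA-256 is an arbitrary choice, and file extensions are assumed to be ASCII):

```python
import hashlib
import os


def hashed_path(path):
    """Return `path` with its filename stem replaced by an ASCII hash."""
    directory, name = os.path.split(path)
    stem, ext = os.path.splitext(name)
    # The hash is used only to get a stable ASCII name, not for security.
    digest = hashlib.sha256(stem.encode('utf-8')).hexdigest()[:16]
    return os.path.join(directory, digest + ext)


def process_with_hashed_names(paths, process):
    """Rename files to hashed names, call `process`, then rename them back."""
    mapping = [(p, hashed_path(p)) for p in paths]
    for original, hashed in mapping:
        os.rename(original, hashed)
    try:
        return process([hashed for _, hashed in mapping])
    finally:
        # Restore the original names for files that still exist.
        for original, hashed in mapping:
            if os.path.exists(hashed):
                os.rename(hashed, original)
```

Here `process` stands for whatever callable runs the external tool on the renamed files. Any output file the tool creates would still need to be renamed separately.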

Even using youtube-dl.exe -f bestvideo+bestaudio https://www.youtube.com/watch?v=XXXXX -o "C:/週連続/%(title)s-%(id)s.%(ext)s" will process the files properly and store them in whatever your system interprets C:/週連続/ as (C:/###/ in my case). This is an issue with Unicode support in the command line, and it's the user's responsibility to use valid paths when using -o.

I will send a pull request for this fix: maleficarium/youtube-dl@21aa58a.

@dogancelik

I would be happy if someone could fix this.

@tushevorg

Why not just add a transliteration option? For instance, Чернобыль would be transliterated to Chernobyl, and almost everyone understands transliteration.

This could make people from #8529 #5982 #8641 and many other issues happy.

@dstftw
Collaborator

dstftw commented Apr 2, 2016

@tushevorg how will you transliterate non-cyrillic Unicode?

@tushevorg

@dstftw Japanese has romaji, and Greek also has its own system.

I'm not a Python developer, but afaik there are modules for this already, like https://pypi.python.org/pypi/transliterate/1.7.6
All we need is to use them (or extend 'em with romaji/chinese/etc.)

@dstftw
Collaborator

dstftw commented Apr 2, 2016

Unicode is not bound to languages' characters only.

@tushevorg

@dstftw then nothing changes - it simply drops the characters as it does now
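
A minimal sketch of that behaviour, transliterating what a table covers and dropping the rest. The table below is deliberately tiny and purely illustrative, `to_ascii` is a hypothetical name, and this does not use the `transliterate` package mentioned earlier:

```python
# Illustrative table only; a real one would cover whole alphabets.
CYRILLIC_TO_LATIN = {
    'Ч': 'Ch', 'е': 'e', 'р': 'r', 'н': 'n', 'о': 'o',
    'б': 'b', 'ы': 'y', 'л': 'l', 'ь': '',
}


def to_ascii(text, table=CYRILLIC_TO_LATIN):
    """Transliterate known characters and drop unknown non-ASCII ones."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            # ASCII characters pass through unchanged.
            out.append(ch)
        else:
            # Characters missing from the table are simply dropped.
            out.append(table.get(ch, ''))
    return ''.join(out)


print(to_ascii('Чернобыль'))  # Chernobyl
```

Anything outside the table, such as symbols or scripts without an entry, ends up dropped, which is the same outcome as today for those characters.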
