Replace "œ" by "oe" in restrict-filename mode #9463

Closed
sebma opened this issue May 12, 2016 · 9 comments

@sebma commented May 12, 2016

Hi,

Could you replace "œ" with "oe" in restrict-filename mode?

$ youtube-dl --version
2016.05.10
$ youtube-dl -f 18 -e IYGUF2d5Azc
Connaitre le cœur du père pour gérer la famille (Shora KUETU - 03/03/2015)

The command line youtube-dl -f 18 --get-filename --restrict-filename IYGUF2d5Azc should return:

Connaitre_le_coeur_du_pere_pour_gerer_la_famille_Shora_KUETU_-_03_03_2015__18__IYGUF2d5Azc.mp4

instead of what it currently returns:

Connaitre_le_c_ur_du_pere_pour_gerer_la_famille_Shora_KUETU_-_03_03_2015__18__IYGUF2d5Azc.mp4

@yan12125 (Collaborator) commented May 12, 2016

This is the current mapping:

ACCENT_CHARS = dict(zip('ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ',
                        itertools.chain('AAAAAA', ['AE'], 'CEEEEIIIIDNOOOOOOUUUUYP', ['ss'],
                                        'aaaaaa', ['ae'], 'ceeeeiiiionoooooouuuuypy')))

Is there anything missing besides œ? I hope to include all possible characters this time.
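
As a rough illustration of the request, here is a minimal sketch that extends the mapping quoted above with "Œ" and "œ". The names `EXTENDED_CHARS` and `restrict_title` are hypothetical, and `restrict_title` is a simplified stand-in for restricted-mode sanitization, not youtube-dl's actual implementation or necessarily the change that was committed.

```python
import itertools

# Mapping as quoted in the thread (youtube-dl 2016.05.10).
ACCENT_CHARS = dict(zip(
    'ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ',
    itertools.chain('AAAAAA', ['AE'], 'CEEEEIIIIDNOOOOOOUUUUYP', ['ss'],
                    'aaaaaa', ['ae'], 'ceeeeiiiionoooooouuuuypy')))

# Hypothetical extension for this request (assumption, not the committed fix).
EXTENDED_CHARS = dict(ACCENT_CHARS)
EXTENDED_CHARS.update({'Œ': 'OE', 'œ': 'oe'})


def restrict_title(title, mapping):
    """Simplified stand-in for restricted-mode sanitization (assumption)."""
    out = []
    for ch in title:
        if ch in mapping:
            # Known accented character: use its ASCII replacement.
            out.append(mapping[ch])
        elif ord(ch) < 128 and (ch.isalnum() or ch in '-_.'):
            # Plain ASCII letters, digits and a few safe symbols pass through.
            out.append(ch)
        else:
            # Everything else (spaces, unmapped non-ASCII) becomes "_".
            out.append('_')
    return ''.join(out)


print(restrict_title('le cœur du père', ACCENT_CHARS))    # le_c_ur_du_pere
print(restrict_title('le cœur du père', EXTENDED_CHARS))  # le_coeur_du_pere
```

With the original mapping the unmapped "œ" falls through to "_", which matches the "c_ur" seen in the reported filename.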

@sebma (Author) commented May 12, 2016

Let me check on my keyboard layout (which is French Alternative) ...
[Screenshot: systemsettings_003]

@sebma (Author) commented May 12, 2016

OK, it seems fine, but you could also add its capital counterpart: Œ

@remitamine (Collaborator) commented May 12, 2016

> I hope to include all possible characters this time.

There are a lot of accented characters, and most of them are not widely used.
You can find many of them in:
https://en.wikipedia.org/wiki/List_of_Unicode_characters

@yan12125 (Collaborator) commented May 12, 2016

This table does not list replacements for all characters; for those, help from native speakers is necessary.
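
One generic alternative, not proposed in this thread, is Unicode decomposition. The sketch below uses Python's standard `unicodedata` module to drop combining marks after NFKD normalization; `strip_accents` is a hypothetical helper name. It handles accented letters without a hand-written table, but ligatures such as "œ" have no decomposition, so they still need explicit replacements chosen by someone who knows the language.

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks after NFKD decomposition."""
    decomposed = unicodedata.normalize('NFKD', text)
    # Keep only characters that are not combining marks.
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents('père gérer'))  # pere gerer
print(strip_accents('cœur'))        # cœur (unchanged: no decomposition exists)
```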

@yan12125 closed this in 778a1cc on May 12, 2016

@yan12125 (Collaborator) commented May 12, 2016

Thanks @sebma for the information. The fix will be included in the next version.

@sebma (Author) commented May 12, 2016

Thank you for your responsiveness.

@remitamine (Collaborator) commented Aug 17, 2016

It looks like there are lots of libraries that specialize in transforming strings; some of them extend the rules to other languages and even to symbols.
Examples:
https://github.com/cocur/slugify/blob/master/src/RuleProvider/DefaultRuleProvider.php
https://github.com/dodo/node-slug/blob/master/slug.js

@yan12125 (Collaborator) commented Aug 17, 2016

Seems great, but both use the MIT license.
