Bug with filename charset on export under Windows. #853

Closed
3 of 9 tasks
M-O-Z-G opened this issue Oct 5, 2018 · 4 comments
Labels
bug It's a bug

Comments

@M-O-Z-G

M-O-Z-G commented Oct 5, 2018

Operating system

  • Windows 7 x64
  • macOS
  • Linux
  • Android
  • iOS

Application

  • Desktop
  • Mobile
  • Terminal

Joplin version

  • 1.0.111 (Portable)

When a user exports notes as MD and a title contains non-Latin characters (Cyrillic in my case), those characters are replaced with underscores in the filename.
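
The symptom can be reproduced with a few lines of JavaScript. This is only an illustration of the reported behavior, not Joplin's actual export code; the function name `asciiOnlyFilename` and its regular expression are assumptions made for the example.

```javascript
// Hypothetical helper (not Joplin's code): replaces every character
// outside printable US-ASCII with an underscore.
function asciiOnlyFilename(title) {
  return title.replace(/[^\x20-\x7E]/g, '_');
}

// A Cyrillic title loses all of its letters; only the extension survives.
console.log(asciiOnlyFilename('Заметка.md')); // → _______.md
```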

@photonxp

photonxp commented Oct 27, 2018

I think the fix in version 1.0.114 is kind of weird, as described in the release announcement:

Fixes #853: Replace characters to equivalent US-ASCII ones when exporting files

If it's possible to import titles containing non-Latin words, it's better to export them as they were. Besides, JavaScript has good support for universal encodings such as UTF-8.

Because file names have to be exchanged between software environments (think network file transfer, file system storage, backup and file synchronization software, configuration management, data compression and archiving, etc.), it is very important not to lose file name information between applications. This led to wide adoption of Unicode as a standard for encoding file names, although legacy software might be non-Unicode-aware.
https://en.wikipedia.org/wiki/Filename

@laurent22
Owner

In theory it would be nice to export the filenames as UTF-8 but in practice various tools and systems fail to handle non-ASCII characters. Many times I saw the "é" or "à" in some of my filenames cause problems when transferring them to other systems or when backing them up, so it's better to keep it simple and use ASCII.

@heyeshuang

heyeshuang commented Nov 14, 2018

After this fix, the export function still replaces folder names with underscores. I have one folder named "研究" and another named "生活", but the notes in them are exported into the same folder, named "__".

By the way, I'm with @photonxp on this topic. A Chinese (or Japanese, etc.) filename converted into the Latin alphabet (or "pinyin") is confusing and unreadable, much like writing English in a phonetic alphabet.

Edit: Maybe sanitize-filename or filenamify will help?
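
As a rough illustration of the idea behind such a sanitizer, here is a minimal sketch that removes only what Windows forbids in filenames and leaves other Unicode text alone. It does not use or reproduce the API of either package mentioned above; `sanitizeKeepingUnicode` is a hypothetical name, and the sketch deliberately ignores reserved device names such as `CON` and length limits.

```javascript
// Hypothetical helper: strips only characters that Windows rejects in
// filenames, keeping Cyrillic, CJK and other Unicode characters intact.
function sanitizeKeepingUnicode(name, replacement = '_') {
  const cleaned = name
    // Reserved characters < > : " / \ | ? * and ASCII control characters.
    .replace(/[<>:"\/\\|?*\x00-\x1F]/g, replacement)
    // Windows also rejects names that end in a dot or a space.
    .replace(/[. ]+$/, '');
  // Never return an empty name.
  return cleaned.length > 0 ? cleaned : replacement;
}

console.log(sanitizeKeepingUnicode('研究'));      // → 研究
console.log(sanitizeKeepingUnicode('a/b:c?.md')); // → a_b_c_.md
```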

@laurent22
Owner

Right, those are good points. The export module should indeed support proper UTF-8 filenames, so the next version will do so, and then we'll see whether we run into any issues.

I've also fixed the duplicate filename issue.
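
The fix itself is not shown in this thread. As a sketch of one common way to avoid such collisions (an assumption, not Joplin's implementation), a hypothetical `uniqueName` helper can append a counter whenever a candidate name is already taken:

```javascript
// Hypothetical helper: returns `candidate` if it is free, otherwise the
// first "candidate (n)" that is not yet in `usedNames`. The chosen name
// is recorded in the set so later calls cannot reuse it.
function uniqueName(candidate, usedNames) {
  let name = candidate;
  let counter = 1;
  while (usedNames.has(name)) {
    name = `${candidate} (${counter})`;
    counter += 1;
  }
  usedNames.add(name);
  return name;
}

const used = new Set();
console.log(uniqueName('__', used)); // → __
console.log(uniqueName('__', used)); // → __ (1)
```

On a case-insensitive file system such as the Windows default, the comparison would also need to ignore case; the sketch leaves that out for brevity.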

@lock lock bot locked and limited conversation to collaborators Oct 15, 2019
4 participants