Bug with filename charset on export under Windows. #853

Closed
M-O-Z-G opened this Issue Oct 5, 2018 · 4 comments

@M-O-Z-G

M-O-Z-G commented Oct 5, 2018

Operating system

  • Windows 7 x64
  • macOS
  • Linux
  • Android
  • iOS

Application

  • Desktop
  • Mobile
  • Terminal

Joplin version

  • 1.0.111 (Portable)

When a user exports notes as MD and a title contains non-Latin characters (Cyrillic in my case), those characters are replaced with underscores in the filename.

@laurent22 laurent22 closed this in b880be8 Oct 5, 2018

@photonxp

photonxp commented Oct 27, 2018

I think the fix in version 1.0.114 is kind of weird, as described in the release announcement:

Fixes #853: Replace characters to equivalent US-ASCII ones when exporting files

If titles with non-Latin words can be imported, it's better to export them as they are. Besides, JavaScript has good support for universal encodings such as UTF-8.

Because file names have to be exchanged between software environments (think network file transfer, file system storage, backup and file synchronization software, configuration management, data compression and archiving, etc.), it is very important not to lose file name information between applications. This led to wide adoption of Unicode as a standard for encoding file names, although legacy software might be non-Unicode-aware.
https://en.wikipedia.org/wiki/Filename

@laurent22

Owner

laurent22 commented Nov 5, 2018

In theory it would be nice to export the filenames as UTF-8, but in practice various tools and systems fail to handle non-ASCII characters. I have often seen the "é" or "à" in some of my filenames cause problems when transferring them to other systems or backing them up, so it's better to keep it simple and use ASCII.

@heyeshuang

heyeshuang commented Nov 14, 2018

After this fix, the export function still replaces folder names with underscores. I have one folder named "研究" and another named "生活", but the notes in both are exported into the same folder, named "__".

By the way, I'm with @photonxp on this topic. A Chinese (or Japanese, etc.) filename converted into the Latin alphabet (or “pinyin”) is dizzying and unreadable, just like writing English in a phonetic alphabet.

Edit: Maybe sanitize-filename or filenamify will help?
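
A minimal sketch of why the two folders collide, assuming the exporter replaces every non-ASCII character with an underscore. The function name `asciiOnly` is hypothetical and is not taken from Joplin's code:

```javascript
// Hypothetical naive sanitizer: every character outside printable ASCII
// is replaced with "_". This only illustrates the collision; it is an
// assumption about the behavior, not the project's actual code.
function asciiOnly(name) {
  return name.replace(/[^\x20-\x7E]/g, '_');
}

const a = asciiOnly('研究');
const b = asciiOnly('生活');

// Both names are two non-ASCII characters long, so both become "__"
// and the two folders end up at the same path.
console.log(a, b, a === b);
```

Any two names of the same length that contain only non-ASCII characters collapse to the same result this way.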

@laurent22

Owner

laurent22 commented Nov 20, 2018

Right, those are good points. The export module should indeed support proper UTF-8 filenames, so the next version will support this, and then we'll see if we run into any issues.

I've also fixed the duplicate filename issue.
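
For illustration only, one way to keep Unicode titles while avoiding invalid and duplicate names could look like the sketch below. It is not the code that went into the release; `sanitizeName`, `uniqueName`, and the `'untitled'` fallback are hypothetical, and the sketch does not handle reserved Windows device names such as `CON` or `NUL`:

```javascript
// Characters that Windows does not allow in file names, plus ASCII
// control characters. Everything else, including non-Latin text, is kept.
const RESERVED = /[<>:"\/\\|?*\x00-\x1F]/g;

// Hypothetical helper: keep Unicode, replace only reserved characters.
function sanitizeName(title) {
  const cleaned = title
    .normalize('NFC')           // one canonical form for composed characters
    .replace(RESERVED, '_')     // replace reserved characters only
    .replace(/[. ]+$/, '');     // Windows rejects trailing dots and spaces
  return cleaned || 'untitled'; // assumed fallback for empty results
}

// Hypothetical helper: append " (n)" until the name is unused.
// Comparison is case-insensitive because Windows paths usually are.
function uniqueName(name, used) {
  let candidate = name;
  let n = 1;
  while (used.has(candidate.toLowerCase())) {
    candidate = `${name} (${n})`;
    n += 1;
  }
  used.add(candidate.toLowerCase());
  return candidate;
}

const used = new Set();
console.log(uniqueName(sanitizeName('研究'), used));
console.log(uniqueName(sanitizeName('生活'), used));
console.log(uniqueName(sanitizeName('a/b:c'), used));
```

With this approach the two folders from the report keep their own names, and a title is only altered where the file system would reject it.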

@laurent22 laurent22 closed this Nov 20, 2018
