Sanitize non-ASCII characters out of memorial file names. #3600

Merged

2 participants
@ianestrachan
Contributor

commented Oct 12, 2013

Fixes #2845. I think.

I don't actually have a way to test this as I have no idea how to type Chinese characters into Cataclysm. It's a simple enough patch, and it SHOULD work, but I'd appreciate one of the international users confirming that it does what it's supposed to, at which point I'll take the [WIP] tag off.

@yobbobanana

Contributor

commented Nov 17, 2013

With this change, internationalized character names become just a long line of underscores (especially long because each non-ASCII Unicode character takes 2-4 bytes in UTF-8). I think I'd actually prefer removing the name and just using the timestamp.

Without it, UTF-8 memorial filenames work fine for me, but I guess that's because my filesystem uses UTF-8 encoding for filenames.
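
To illustrate the underscore problem described in the comment above, here is a minimal sketch of a byte-wise replacement. The function name `replace_non_ascii` is hypothetical and this is not the PR's actual code; it only assumes the name is stored as a UTF-8 `std::string`.

```cpp
#include <string>

// Hypothetical helper (not taken from the PR): replace every byte outside
// the 7-bit ASCII range with '_'. Because UTF-8 encodes each non-ASCII
// character as 2-4 bytes, one character becomes 2-4 underscores.
std::string replace_non_ascii(const std::string &name)
{
    std::string out = name;
    for (char &c : out) {
        // Cast to unsigned char so the comparison is well defined even
        // where plain char is signed.
        if (static_cast<unsigned char>(c) >= 0x80) {
            c = '_';
        }
    }
    return out;
}
```

Under these assumptions, a single three-byte character such as `"\xE6\x9D\x8E"` comes back as three underscores, while a plain ASCII name is returned unchanged.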

@ianestrachan

Contributor Author

commented Nov 17, 2013

Perhaps just removing the offending characters - that way, if there's a single odd one in the middle of an otherwise-okay name, most of the name will still be there.
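
The removal approach suggested above could look roughly like the following. This is a sketch under the assumption that names are UTF-8 `std::string` values; `strip_non_ascii` is a hypothetical name, not necessarily what the patch uses.

```cpp
#include <string>

// Hypothetical helper (not taken from the PR): drop every byte outside the
// 7-bit ASCII range and keep the rest of the name as it is.
std::string strip_non_ascii(const std::string &name)
{
    std::string out;
    out.reserve(name.size());
    for (char c : name) {
        if (static_cast<unsigned char>(c) < 0x80) {
            out += c;
        }
    }
    return out;
}
```

With a single odd character in an otherwise ASCII name, most of the name survives. A name made up entirely of non-ASCII characters comes back empty.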

@yobbobanana

Contributor

commented Nov 17, 2013

Well, I mean for a Chinese name they'll all be non-ASCII. Even Russian characters that look the same as Latin letters use a different section of Unicode, so they would all be non-ASCII too.

I guess it could check how many characters were replaced, and if it's all or almost all of them, not use the name?

@ianestrachan

Contributor Author

commented Nov 17, 2013

Should be better now - if a name is entirely made of non-ASCII characters, it just has the timestamp, without the - in front.
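
As a rough sketch of the behaviour described above: the `<name>-<timestamp>.txt` layout, the function name `memorial_file_name`, and the `.txt` extension are all assumptions made for illustration, not details confirmed in this thread.

```cpp
#include <string>

// Hypothetical helper: build the memorial file name from an already
// sanitized character name and a timestamp string. The file name layout
// and extension are assumed for this example.
std::string memorial_file_name(const std::string &sanitized_name,
                               const std::string &timestamp)
{
    if (sanitized_name.empty()) {
        // Nothing left of the name: use the timestamp alone, with no
        // leading '-'.
        return timestamp + ".txt";
    }
    return sanitized_name + "-" + timestamp + ".txt";
}
```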

@yobbobanana

Contributor

commented Nov 18, 2013

hmm, now it's removing everything except the spaces in between the names ><. Would a threshold work, like if the sanitized name is less than 1/5 of the original name it just uses the timestamp?

@ianestrachan

Contributor Author

commented Nov 19, 2013

Added the threshold: if the sanitized name is <= 20% of the original length, the name is removed entirely.
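
A minimal sketch of such a threshold check, assuming lengths are measured in bytes via `std::string::size()` (the merged commit may measure them differently). The name `apply_length_threshold` is hypothetical.

```cpp
#include <string>

// Hypothetical helper: return the sanitized name, or an empty string when
// the sanitized name is at most 20% of the original length. Lengths are
// byte counts here, which is an assumption.
std::string apply_length_threshold(const std::string &original,
                                   const std::string &sanitized)
{
    // sanitized / original <= 1/5, written with integers to avoid
    // floating point and division by zero.
    if (sanitized.size() * 5 <= original.size()) {
        return "";
    }
    return sanitized;
}
```

This covers the case raised earlier where only the spaces between names survive: a 10-byte original reduced to a single space is 10% of the original length, so the name is dropped and only the timestamp would be used.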

@yobbobanana yobbobanana merged commit f55bee3 into CleverRaven:master Nov 19, 2013

1 check failed

default Build finished.
@yobbobanana

Contributor

commented Nov 19, 2013

thanks :)

@ianestrachan ianestrachan deleted the ianestrachan:sanitize_memorial_file_names branch Nov 19, 2013

@kevingranade kevingranade changed the title Sanitize non-ASCII characters out of memorial file names. [WIP] Sanitize non-ASCII characters out of memorial file names. Sep 20, 2014
