FileSystem.exists() does not seem to be unicode-aware #5935

Open
larsiusprime opened this Issue Jan 11, 2017 · 4 comments

@larsiusprime
Contributor
larsiusprime commented Jan 11, 2017 edited

Reproduction case:

  1. Create this file on your hard drive (I'm using Windows):
    C:\Денисカタカナ冰淇淋\api\test.txt

  2. From your code, using the hxcpp target, call:
    FileSystem.exists("C:\Денисカタカナ冰淇淋\api\test.txt")

What should happen:
The function should return true

What happens instead:
The function returns false
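
For convenience, the reproduction steps can be put into a small standalone program. This is only a sketch: the class name Main is arbitrary, and the backslashes are doubled because a Haxe string literal needs them escaped.

```haxe
import sys.FileSystem;

class Main {
    static function main() {
        // Backslashes are escaped so the literal holds the real Windows path.
        var path = "C:\\Денисカタカナ冰淇淋\\api\\test.txt";

        // Expected: true when the file exists.
        // Reported on the hxcpp target: false.
        Sys.println(FileSystem.exists(path));
    }
}
```

Compile it for the cpp target and run it on a machine where the file from step 1 exists.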

Speculation:
The error is in this call:
https://github.com/HaxeFoundation/haxe/blob/875ad19432abc2cec6b345cc49a880f5c7f3c98a/std/neko/_std/sys/FileSystem.hx#L34

To this function:
https://github.com/HaxeFoundation/haxe/blob/875ad19432abc2cec6b345cc49a880f5c7f3c98a/std/neko/_std/sys/FileSystem.hx#L103

And this function:
https://github.com/HaxeFoundation/haxe/blob/875ad19432abc2cec6b345cc49a880f5c7f3c98a/std/haxe/io/Path.hx#L262

They use the standard string functions rather than UTF-aware ones.
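
To illustrate the distinction I mean (this does not prove where the bug is), here is a small sketch comparing the plain string length with a UTF-8-aware count. It assumes Haxe 3.x, where haxe.Utf8 is available; the results of the first call depend on the target, so no output is listed.

```haxe
import haxe.Utf8;
import haxe.io.Bytes;

class Main {
    static function main() {
        var name = "Денис";

        // Plain String API: on byte-oriented targets this may count bytes.
        Sys.println(name.length);

        // UTF-8-aware count of characters.
        Sys.println(Utf8.length(name));

        // Number of bytes in the encoded string, for comparison.
        Sys.println(Bytes.ofString(name).length);
    }
}
```

If the first and second numbers differ on a given target, any path code that indexes into the string with the plain functions is working on bytes, not characters.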

I'm not sure what the proper solution is. Left as-is, though, a naive call to FileSystem.exists() on the computer of a random person in, say, Russia, China, or Japan could very well return false for files that do exist, e.g. anything under their documents directory, whose path likely includes their username.

UPDATE

If you skip the FileSystem.exists() check and just call File.getContent() directly within a try/catch block to handle the case where the file does not exist, you get an exception:

[file_contents,C:\Денисカタカナ冰淇淋\api\test.txt]
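
Roughly, the code for this second attempt looks like the sketch below. The class name and the way the exception is printed are illustrative, not copied from my project.

```haxe
import sys.io.File;

class Main {
    static function main() {
        // Same path as in the reproduction case, with escaped backslashes.
        var path = "C:\\Денисカタカナ冰淇淋\\api\\test.txt";

        try {
            // Read the file directly instead of checking FileSystem.exists() first.
            var content = File.getContent(path);
            Sys.println(content);
        } catch (e:Dynamic) {
            // On hxcpp this branch is reached even though the file exists.
            Sys.println(Std.string(e));
        }
    }
}
```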

So it seems there is a need for UTF-aware filepath handling functions. I understand it might be a burden to change the standard library, but perhaps some new functions could be added, or provided in a separate library?

@starry-abyss
starry-abyss commented Jan 11, 2017 edited

Happens for me too on both Haxe 3.2.1 and Haxe 3.4-rc2.
I didn't use your filename, since I already have some files with Russian and English names. I also used filenames in the root directory instead of long paths.
I assume the backslashes in your path are single only because GitHub ate them?

@ncannasse
Member

Can you try with HashLink? It uses UTF-16 system functions.

@larsiusprime
Contributor

Sure, what's the best way to compile a simple app with HashLink? I've never used it before.

@Justinfront
Contributor
Justinfront commented Jan 12, 2017 edited

@larsiusprime

It's pretty similar to Neko.
If you're on Windows, get hl.exe from the hl downloads page.
On Linux or Mac you can build it yourself. It's simpler to run make hl than make all, since the full build needs SDL and other dependencies installed. My attempt with MacPorts did not work, but it's probably simpler on Linux.

git clone https://github.com/HaxeFoundation/hashlink.git
cd hashlink
make hl

You may need to specify ARCH=32 or similar, and I can't remember whether the clone needs submodules (--recursive).
Once you have hl, make sure it can be found from the terminal (on your PATH) or put it next to your code.
For your actual program, it's the same as running the Neko VM: use an hxml file and a nightly build of Haxe.

-hl output.hl
-main Main
# Mac, probably (keep only the -cmd line for your platform)...
-cmd ./hl output.hl
# Windows, probably...
-cmd hl.exe output.hl

Alternatively, read http://hashlink.haxe.org/

Hope that helps. It's probably simpler than you thought, unless you're on a Mac and want to do graphics with it!
