PROJ_LIB environment variable non ANSI characters in the path #1765

Closed
aharondavid opened this issue Dec 2, 2019 · 8 comments

@aharondavid aharondavid commented Dec 2, 2019

Expected behavior and actual behavior.

PROJ fails to read the PROJ_LIB environment variable when the file path
encoding is different from the system locale.

Steps to reproduce the problem.

  1. Set the PROJ_LIB environment variable to a path containing characters outside your current locale's code page (for example c:\temp\档案文件\proj).

  2. Add an egm96_15.gtx file to that directory.

  3. The transform fails when it tries to open the GTX file.

Operating system

Windows 10

PROJ version and provenance

PROJ 4.9.3

Member

@kbevers kbevers commented Dec 2, 2019

PROJ 4.9.3 is quite ancient. Is this also a problem in PROJ 6.2.1?

Member

@rouault rouault commented Dec 2, 2019

We didn't change how we interact with files, so I'd bet yes. The issue is that we use the ANSI C API. On Windows, we should likely use the Win32 Unicode API, as GDAL does, and possibly have a variable, as GDAL also does, to say whether paths are in the ANSI encoding or in UTF-8. This comes from a GDAL ticket originally, but I suggested opening it here as it is fundamentally a pure PROJ issue.
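To illustrate the approach described above, here is a minimal sketch of opening a file from a UTF-8 path, assuming the Windows branch converts to UTF-16 with `MultiByteToWideChar` and opens with `_wfopen`. The helper names `utf8_to_wide` and `open_utf8` are hypothetical and are not part of PROJ or GDAL; this is not how either library actually implements it.

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <wchar.h>
#include <windows.h>

// Hypothetical helper: convert UTF-8 to UTF-16 for the wide-character APIs.
// Returns an empty string if the input is empty or is not valid UTF-8.
static std::wstring utf8_to_wide(const std::string &utf8) {
    if (utf8.empty()) return std::wstring();
    const int len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                        utf8.data(),
                                        static_cast<int>(utf8.size()),
                                        nullptr, 0);
    if (len <= 0) return std::wstring();
    std::wstring wide(static_cast<size_t>(len), L'\0');
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, utf8.data(),
                        static_cast<int>(utf8.size()), &wide[0], len);
    return wide;
}
#endif

// Hypothetical helper: open a file whose path is given in UTF-8.
// On Windows the path bypasses the ANSI code page entirely; elsewhere
// fopen() already takes the bytes as they are.
std::FILE *open_utf8(const std::string &path, const std::string &mode) {
#ifdef _WIN32
    const std::wstring wpath = utf8_to_wide(path);
    const std::wstring wmode = utf8_to_wide(mode);
    if (wpath.empty() || wmode.empty()) return nullptr;
    return _wfopen(wpath.c_str(), wmode.c_str());
#else
    return std::fopen(path.c_str(), mode.c_str());
#endif
}
```

With a helper like this, a grid file under c:\temp\档案文件\proj could be opened regardless of the active code page, provided the caller supplies the path as UTF-8.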

Member

@kbevers kbevers commented Dec 2, 2019

I figured, but my instinct tells me that a bug report for an unmaintained version of PROJ is not applicable.

> And possibly have some variable like GDAL also does to say if paths are in ANSI encoding or UTF-8.

Is it really necessary to support both ANSI and Unicode in parallel? Wouldn't it be enough to just switch to the Unicode C API?

Member

@rouault rouault commented Dec 2, 2019

> Wouldn't it be enough to just switch to the Unicode C API?

That would be a possibility. It is just that, at least in the old days when I tried on Windows, specifying a Unicode string in Windows cmd.exe was challenging. The default was to use the current code page, which is more or less what fopen() expects on Windows.

Member

@kbevers kbevers commented Dec 2, 2019

I'm no expert, but I do work on Windows most of the time and it is my impression that Unicode generally works quite well. Unicode support would be a nice feature for PROJ 7. I guess Linux/OSX already works fine with Unicode?

Member

@rouault rouault commented Dec 2, 2019

> it is my impression that Unicode generally works quite well

My relationship with Unicode and Windows is definitely more complicated :-)

> I guess Linux/OSX already works fine with Unicode?

They've been UTF-8 by default for the last 20 years or so :-) Nothing to do on that front.

Member

@kbevers kbevers commented Dec 2, 2019

> my relationship with Unicode and Windows is definitely more complicated :-)

You are at least correct in saying that cmd.exe is horrible when it comes to Unicode. PowerShell seems to be the same. Maybe it is too early to go full Unicode... :|

Member

@busstoptaktik busstoptaktik commented Dec 2, 2019

> Maybe it is too early to go full Unicode

The cmd.exe Unicode support is horrible, to say the least. It may need to wait for another couple of decades^H^H^H^H^H^H^H centuries^H^H^H^H^H^H^H^H^H millenniums^H^H^H^H^H^H^H^H^H^H^H aeons.

rouault added a commit to rouault/PROJ that referenced this issue Jan 10, 2020
…o#1765)

For backward compatibility, if the PROJ_LIB content is found not to be UTF-8 or
to point to a non-existing directory, then an attempt is made to interpret it
in the ANSI code page encoding.

proj_context_set_search_paths() now assumes strings to be in UTF-8, and
functions returning paths will also return values in UTF-8.
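
The fallback rule in this commit message can be sketched as follows. This is an illustration only, not the code from the commit: `is_valid_utf8`, `PathEncoding` and `guess_path_encoding` are hypothetical names, and the directory-existence check is passed in as a boolean so the example stays portable.

```cpp
#include <cstddef>
#include <string>

// Returns true if `s` is well-formed UTF-8. Overlong forms, UTF-16
// surrogates and code points above U+10FFFF are rejected.
bool is_valid_utf8(const std::string &s) {
    const unsigned char *p = reinterpret_cast<const unsigned char *>(s.data());
    const std::size_t n = s.size();
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const unsigned long min_cp[] = {0x0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = p[i];
        std::size_t extra = 0;  // number of continuation bytes expected
        unsigned long cp = 0;   // decoded code point
        if (c < 0x80) { ++i; continue; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte
        if (n - i <= extra) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = p[i + k];
            if ((cc & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min_cp[extra]) return false;               // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;     // surrogate
        if (cp > 0x10FFFF) return false;                    // out of range
        i += extra + 1;
    }
    return true;
}

enum class PathEncoding { Utf8, AnsiCodePage };

// Hypothetical decision helper: keep the UTF-8 interpretation only when the
// bytes are valid UTF-8 AND that interpretation names an existing directory;
// otherwise fall back to the ANSI code page interpretation.
PathEncoding guess_path_encoding(const std::string &value,
                                 bool utf8_directory_exists) {
    return (is_valid_utf8(value) && utf8_directory_exists)
               ? PathEncoding::Utf8
               : PathEncoding::AnsiCodePage;
}
```

Checking for the directory as well as for byte validity matters because some code page strings also happen to be valid UTF-8; pure ASCII paths are the trivial case, where both interpretations agree.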
rouault added a commit that referenced this issue Jan 22, 2020
[RFC4_dev] Use Win32 Unicode APIs and expect all strings to be UTF-8 (fixes #1765)
@kbevers kbevers added this to the 7.0.0 milestone Jan 22, 2020