Change text encoding in UTFGrid driver to work with Windows #5342

Closed

geographika commented Oct 30, 2016

The use of the iconv encoding "UCS-4LE" in maputfgrid.cpp causes junk output on Windows:

utf8 = msConvertWideStringToUTF8 (string, "UCS-4LE");

Changing this to "UCS-2LE" produces the correct output on Windows. I'm not sure if this also works
without issues on Linux. Hopefully there can be one encoding that works on both.

The only other place an encoding is specified is the Windows-only MS SQL driver, which uses "UCS-2LE":
https://github.com/mapserver/mapserver/blob/branch-7-0/mapmssql2008.c#L1741

I'm not sure of the exact difference between the two encodings. The most detailed descriptions I could find were:

"UCS2LE is a direct byte encoding of the first plane in which the low byte comes first"
http://interscript.sourceforge.net/interscript/doc/en_iscr_0279.html

"UCS4LE is a four byte direct encodings of of ISO-10646. UCS4LE puts the low byte first. "
http://interscript.sourceforge.net/interscript/doc/en_iscr_0281.html
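
To make the difference concrete, here is a minimal sketch of how the two encodings lay out bytes, and of what happens when a buffer with 2 bytes per character is read as "UCS-4LE". The helper names (`encodeUcs2le`, `encodeUcs4le`, `readUcs4le`) are made up for this illustration and are not MapServer or iconv functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// UCS-4LE: every code point takes four bytes, least significant byte first.
std::vector<std::uint8_t> encodeUcs4le(std::uint32_t cp) {
  return {static_cast<std::uint8_t>(cp & 0xFF),
          static_cast<std::uint8_t>((cp >> 8) & 0xFF),
          static_cast<std::uint8_t>((cp >> 16) & 0xFF),
          static_cast<std::uint8_t>((cp >> 24) & 0xFF)};
}

// UCS-2LE: only the first plane (U+0000..U+FFFF), two bytes, low byte first.
std::vector<std::uint8_t> encodeUcs2le(std::uint16_t cp) {
  return {static_cast<std::uint8_t>(cp & 0xFF),
          static_cast<std::uint8_t>((cp >> 8) & 0xFF)};
}

// Interpret the four bytes starting at `offset` as one UCS-4LE code unit.
std::uint32_t readUcs4le(const std::vector<std::uint8_t>& buf,
                         std::size_t offset) {
  assert(offset + 4 <= buf.size());
  return static_cast<std::uint32_t>(buf[offset]) |
         (static_cast<std::uint32_t>(buf[offset + 1]) << 8) |
         (static_cast<std::uint32_t>(buf[offset + 2]) << 16) |
         (static_cast<std::uint32_t>(buf[offset + 3]) << 24);
}
```

For example, "AB" in UCS-2LE is the byte sequence 41 00 42 00. Reading those four bytes as a single UCS-4LE unit gives 0x00420041, which is above U+10FFFF and therefore not a valid code point. That is consistent with the junk output seen on Windows.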

rouault Oct 30, 2016

Contributor

Looking at https://www.gnu.org/software/libc/manual/html_node/iconv-Examples.html, I'd suggest trying "WCHAR_T", although I'm not sure. I found elsewhere that wchar_t is only 2 bytes wide on Windows, whereas it is 4 bytes wide on Unix systems, so "UCS-4LE" is indeed inappropriate for wchar_t on Windows.
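
The width difference can be checked at compile time. Below is a minimal sketch, assuming a little-endian platform. `wideEncodingFor` and `kWideEncoding` are hypothetical names, not MapServer or iconv identifiers, and this is an alternative to passing "WCHAR_T" rather than something taken from the patch.

```cpp
#include <cstddef>

// Hypothetical helper: map the size of wchar_t to an iconv encoding name.
// Assumes little-endian byte order and that wchar_t is either 2 or 4 bytes.
constexpr const char* wideEncodingFor(std::size_t wcharSize) {
  return wcharSize == 2 ? "UCS-2LE" : "UCS-4LE";
}

static_assert(sizeof(wchar_t) == 2 || sizeof(wchar_t) == 4,
              "unexpected wchar_t width");

// Resolved once per build: "UCS-2LE" where wchar_t is 2 bytes wide,
// "UCS-4LE" where it is 4 bytes wide.
constexpr const char* kWideEncoding = wideEncodingFor(sizeof(wchar_t));
```

Deriving the name from `sizeof(wchar_t)` avoids hard-coding a platform list, but it still relies on the byte-order assumption stated above.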


Update maputfgrid.cpp
Try "WCHAR_T" encoding
geographika Nov 2, 2016

Contributor

@rouault thanks for the comment. I have now tried the "WCHAR_T" encoding, because "UCS-2LE" broke the Linux tests on Travis.

A couple more links I found that might be relevant: http://stackoverflow.com/a/40150716/179520

UTF-32LE = UCS-4LE : UCS-4 in little endian flavour, without BOM

The following link says WCHAR_T doesn't work correctly on OS X, and that the best approach is to use macros:

http://www.firstobject.com/wchar_t-string-on-linux-osx-windows.htm

geographika Nov 9, 2016

Contributor

I found several examples in the MapServer codebase where different code paths are used for Windows and Linux. The final commit works fine on Windows. Apologies for the multiple commits in one pull request; I'm trying to resolve this now.
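
For reference, a platform-specific code path of the kind described could look like the sketch below. It is an illustration only and not necessarily what the final commit does; `utfgridWideEncoding` is a hypothetical helper name.

```cpp
// Hypothetical helper, not a MapServer function: choose the iconv encoding
// name for wchar_t strings with a preprocessor check.
const char* utfgridWideEncoding() {
#if defined(_WIN32)
  // wchar_t is 2 bytes wide on Windows.
  return "UCS-2LE";
#else
  // wchar_t is 4 bytes wide on Linux and other Unix systems
  // (little-endian byte order assumed).
  return "UCS-4LE";
#endif
}
```

With such a helper, the call from the description would become `utf8 = msConvertWideStringToUTF8(string, utfgridWideEncoding());`.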


tbonfort added a commit that referenced this pull request Dec 5, 2016

tbonfort Dec 5, 2016

Member

backported and applied to branch-7-0 in 2ab0dc0


tbonfort closed this Dec 5, 2016

geographika referenced this pull request Mar 8, 2017: Empty UTFGRID #5400 (Open)
