Assumed string encoding for .mhd differs on platforms #68

Open
codeling opened this issue Jan 10, 2019 · 7 comments · May be fixed by #113

Comments

@codeling
Contributor

codeling commented Jan 10, 2019

Looking at the .mhd documentation, I see no explicit mention of any assumed encoding (or a setting for it).

Yet when ElementDataFile points to a filename containing special characters, the character encoding determines whether MetaIO can find the actual file. In my experiments (using the MetaIO included in ITK), an encoding of cp1252 is assumed on Windows, while utf-8 is expected on Linux. This means that when the filename given under ElementDataFile contains special characters, separate .mhd files are required for Linux and Windows (and potentially more for other platforms I have not tested). To make this consistent (and .mhd files with special characters transferable between platforms), I think one of two options needs to be implemented:

  1. Require .mhd files to have a specific encoding (utf-8 seems to be the logical choice), or
  2. Add a separate entry specifying the encoding of the .mhd file

Or am I missing something here? Is there an encoding specification somewhere already?
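
To make the mismatch described above concrete, here is a minimal Python sketch (not MetaIO code) of how the same file name turns into different bytes under the two encodings, and what happens when a reader assumes the wrong one. The file name `Bönn.raw` is a hypothetical example; the encoding names are Python's standard codec names.

```python
# Hypothetical data file name "Bönn.raw"; \u00f6 is the letter o with diaeresis.
name = "B\u00f6nn.raw"

as_cp1252 = name.encode("cp1252")  # bytes written by an editor using cp1252
as_utf8 = name.encode("utf-8")     # bytes written by an editor using UTF-8

print(as_cp1252)  # b'B\xf6nn.raw'
print(as_utf8)    # b'B\xc3\xb6nn.raw'

# A reader that assumes cp1252 turns the two UTF-8 bytes into two characters,
# so the lookup is for a file name that does not exist on disk.
misread = as_utf8.decode("cp1252")
print(ascii(misread))  # 'B\xc3\xb6nn.raw'

# A reader that assumes UTF-8 cannot decode the cp1252 bytes at all.
try:
    as_cp1252.decode("utf-8")
    cp1252_is_valid_utf8 = True
except UnicodeDecodeError:
    cp1252_is_valid_utf8 = False
print(cp1252_is_valid_utf8)  # False
```

Pure ASCII file names encode to the same bytes in both cases, which is presumably why the problem only shows up with special characters.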

@dzenanz
Collaborator

dzenanz commented Jan 10, 2019

Option 1 (utf-8) seems more logical. It will require a bit of extra coding for Windows, see SO 30829364.

@thewtex
Member

thewtex commented Jan 10, 2019

+1 for Option 1 to keep things simple.

@codeling
Contributor Author

codeling commented Jan 10, 2019

Option 1 would be my preferred choice too. I'll probably look into it, might take a while though.
Note that for Windows users, this change will break backward compatibility: they'll have to convert any .mhd files with special characters created before...
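
A one-off conversion of such legacy headers could look roughly like the following Python sketch. It is not part of MetaIO; the function name `header_to_utf8` and the default legacy encoding are assumptions for illustration. It only makes sense for .mhd headers, which are plain text with the pixel data in a separate file.

```python
def header_to_utf8(raw: bytes, legacy_encoding: str = "cp1252") -> bytes:
    """Return the header bytes re-encoded as UTF-8 (hypothetical helper).

    Bytes that already decode as UTF-8 (including pure ASCII) are returned
    unchanged; anything else is assumed to be in legacy_encoding.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(legacy_encoding).encode("utf-8")


# Example with an in-memory header line; the file name is hypothetical.
legacy = b"ElementDataFile = B\xf6nn.raw\n"
converted = header_to_utf8(legacy)
print(converted)  # b'ElementDataFile = B\xc3\xb6nn.raw\n'
```

The check is a heuristic: a cp1252 header whose bytes happen to form valid UTF-8 would be left untouched, so converted files are worth a quick manual look.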

@thewtex
Member

thewtex commented Jan 10, 2019

@codeling awesome!

@jcfr
Member

jcfr commented Jan 10, 2019

I wonder if any function from https://gitlab.kitware.com/cmake/cmake/tree/master/Source/kwsys could be used?

@bradking Is there any code in CMake already doing this that could be factored out?

@bradking
Member

See KWSys Encoding.hxx and Encoding.h.

@todoooo

todoooo commented Feb 24, 2021

@codeling Please see how this is handled in VTK, where the Unicode policy is UTF-8 everywhere.

@codeling codeling linked a pull request Aug 10, 2022 that will close this issue