Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle UTF8 file names on Windows #26

Closed
ollira opened this issue Oct 17, 2019 · 10 comments
Closed

Handle UTF8 file names on Windows #26

ollira opened this issue Oct 17, 2019 · 10 comments

Comments

@ollira
Copy link
Contributor

ollira commented Oct 17, 2019

Hi,

To support UTF8-encoded file names on Windows, some code changes are needed in CheckedFile.cpp. This patch appears to work on Visual Studio 2017, and probably should also work on 2015.

diff --git a/src/CheckedFile.cpp b/src/CheckedFile.cpp
index 814afd3..66461a5 100644
--- a/src/CheckedFile.cpp
+++ b/src/CheckedFile.cpp
@@ -52,6 +52,7 @@
 #include <cmath>
 #include <cstring>
 #include <fcntl.h>
+#include <codecvt>
 
 #include "CRC.h"
 
@@ -112,10 +113,13 @@ CheckedFile::CheckedFile( const ustring &fileName, Mode mode, ReadChecksumPolicy
 
 int CheckedFile::open64( const ustring &fileName, int flags, int mode )
 {
-   //??? handle utf-8 file names?
+   // handle utf-8 file names
+   std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
+   std::wstring widePath = converter.from_bytes(fileName);
+
 #if defined(_MSC_VER)
    int handle;
-   int err = _sopen_s(&handle, fileName_.c_str(), flags, _SH_DENYNO, mode);
+   int err = _wsopen_s(&handle, widePath.c_str(), flags, _SH_DENYNO, mode);
    if (handle < 0)
    {
       throw E57_EXCEPTION2(E57_ERROR_OPEN_FAILED,
@asmaloney
Copy link
Owner

_wsopen_s is expecting a const wchar_t * - is that what .c_str() is returning?

What should the __GNUC__ case use to work with this widePath?

@ollira
Copy link
Contributor Author

ollira commented Mar 6, 2020

I believe, std::wstring c_str() returns a pointer to wchar_t buffer that is terminated by zero.

I have tested this change on VIsual Studio 2017, 2019 only, but it should work on other C++ compilers too.

@asmaloney
Copy link
Owner

I have tested this change on VIsual Studio 2017, 2019 only, but it should work on other C++ compilers too.

I was actually asking about modifications to the code in the __GNUC__ case right below your change. I think the answer is "no changes necessary".

I think I was confused because the way you set it up with the #include and conversion code, it seems like you meant your solution to be used cross-platform, but then you're converting to UTF-16 with std::codecvt_utf8_utf16 which is only required for Windows (AFAIK).

If you meant this to be a Windows-only solution then I'll move the conversion within the Windows-only block.

@ollira
Copy link
Contributor Author

ollira commented Mar 6, 2020

Yes, these changes are needed on Windows only (while I also tried to make portable character conversion code).

asmaloney added a commit that referenced this issue Mar 6, 2020
I'm not sure if this works since I don't compile on Windows. Please let me know if there are issues!

Based on code from #26
@asmaloney
Copy link
Owner

OK - I've made changes in 8ee8582.

Thanks Olli!

@jiawenquan
Copy link

Visual Studio 2019 error
int err = _wsopen_s(&handle, widePath.c_str(), flags, _SH_DENYNO, mode);

don't need the conversion

@asmaloney
Copy link
Owner

Can you please post the error you're getting?

wstring's c_str() should be returning const wchar_t *.

https://en.cppreference.com/w/cpp/string/basic_string/c_str

@jiawenquan
Copy link

jiawenquan commented Dec 4, 2020

I can't catch it here

   int result = open(fileName_.c_str(), flags, mode);
   if (result < 0)
   {
      throw E57_EXCEPTION2(E57_ERROR_OPEN_FAILED,
                           "result=" + toString(result)
                           + " fileName=" + fileName
                           + " flags=" + toString(flags)
                           + " mode=" + toString(mode));
   }
   return result;

The normal work

but

   // Handle UTF-8 file names - Windows requires conversion to UTF-16
   std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
   std::wstring widePath = converter.from_bytes( fileName );

   int handle;
   int err = _wsopen_s(&handle, widePath.c_str(), flags, _SH_DENYNO, mode);
   if (handle < 0)
   {
      throw E57_EXCEPTION2(E57_ERROR_OPEN_FAILED,
                           "err=" + toString(err)
                           + " fileName=" + fileName
                           + " flags=" + toString(flags)
                           + " mode=" + toString(mode));
   }
   return handle;

Don't work

You can test filenames ’测试点云.e57‘

(Edited by @asmaloney to format code - use ```cpp around code to make it more readable.)

@asmaloney
Copy link
Owner

I don't run Windows, so I can't test it. Maybe @ollira or another Windows user can try it and come up with an explanation/solution.

@ollira
Copy link
Contributor Author

ollira commented Dec 4, 2020

Hi,

In my environments the suggested filename (测试点云.e57) works just fine. Perhaps the fileName parameter from jiawenquan's application is not a UTF8-encoded string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants