support unicode characters in command line arguments for windows #96
Comments
squashfs-tools-ng/lib/compat/chdir.c Line 19 in ab98505
squashfs-tools-ng/lib/sqfs/win32/io_file.c Line 184 in ab98505
Your code seems to be mixing these two encodings?
It took a little longer than expected (I also got stuck on another bug during testing), but I finally got around to looking into this. I hope it will be fixed soon in a new release consisting primarily of Windows fixes. There are now commits on master and fixes-1.1.0 that try to address this issue, but I'm afraid it will require a little more research, review and testing. A wrapper for the main() function was added that obtains the actual UTF-16 command line and converts it to UTF-8 before running the real main() function. The [...] This was sufficient for me to use the command line tools to access files/archives with German and Chinese names in some quick tests. Input files (i.e. the [...]
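The conversion step of such a main() wrapper can be sketched in portable C. This is only an illustration, not the code from the commits mentioned above: the function name `utf16_to_utf8` is hypothetical, and replacing unpaired surrogates with U+FFFD is an assumption of this sketch. On Windows, the UTF-16 arguments could for instance be obtained with `GetCommandLineW` and `CommandLineToArgvW`, and the conversion itself could be delegated to `WideCharToMultiByte` with `CP_UTF8`; whether the project does it that way is not stated here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a NUL-terminated UTF-16 string to UTF-8.
 * Returns the number of bytes written (excluding the terminator), or
 * (size_t)-1 if the output buffer is too small.
 * Assumption of this sketch: unpaired surrogates become U+FFFD.
 */
size_t utf16_to_utf8(const uint16_t *in, char *out, size_t outsz)
{
    size_t n = 0;

    if (outsz == 0)
        return (size_t)-1;

    while (*in != 0) {
        uint32_t cp = *in++;
        size_t need;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: combine with a following low surrogate. */
            if (*in >= 0xDC00 && *in <= 0xDFFF)
                cp = 0x10000 + ((cp - 0xD800) << 10)
                   + (uint32_t)(*in++ - 0xDC00);
            else
                cp = 0xFFFD;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            /* Low surrogate without a preceding high surrogate. */
            cp = 0xFFFD;
        }

        /* Number of UTF-8 bytes for this code point. */
        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;

        /* Keep one byte in reserve for the NUL terminator. */
        if (n + need >= outsz)
            return (size_t)-1;

        switch (need) {
        case 1:
            out[n++] = (char)cp;
            break;
        case 2:
            out[n++] = (char)(0xC0 | (cp >> 6));
            out[n++] = (char)(0x80 | (cp & 0x3F));
            break;
        case 3:
            out[n++] = (char)(0xE0 | (cp >> 12));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
            break;
        default:
            out[n++] = (char)(0xF0 | (cp >> 18));
            out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
            break;
        }
    }

    out[n] = '\0';
    return n;
}
```

A wrapper would call this once per argument, collect the results into a new argv array, and pass that array to the real main() function.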
Thanks for your work. I can now create an archive file with a non-ASCII directory name. However, the tool prints garbled text if a file name contains non-ASCII characters. One possible solution is to call [...]
Hi, first of all, sorry for the long delay. While I was preoccupied with work and personal issues for much longer than I had initially hoped, I did occasionally find some time to look into this and test several approaches on a Windows 7 VM. Sadly, the suggested drop-in solution doesn't seem to work. Using [...] Trying to do [...] I tried another approach: using pre-processor magic to redirect the stdio functions to Windows-specific custom implementations (which generate a finished string for the printf ones), then converting that string to UTF-16 and using the wide-char versions. Strangely, this worked for German umlaut characters, but Chinese text magically disappeared. Also, had it worked, it would have produced UTF-16 files when redirecting the output to a file or a pipe. Particularly [...] I modified this approach and instead added a hacky check if the target stream is [...] I also alternatively tried to change the codepage to UTF-8, not convert the strings at all and use [...] The approach in 6447b19 has worked most reliably so far, but is still not perfect. In the Windows 7 VM, printing Chinese text causes a weird indentation to be added in front of every printed line (I guess this is caused by a different font being switched to?). Also, when manually setting the codepage to UTF-8 (by running [...]
On Windows, argv is encoded in the ANSI codepage. Your code seems to assume it is UTF-8 and converts it to wide characters when calling system functions.