opening files with non-ANSI filenames on Windows #78
Comments
Are you saying that |
I don't have a lot of experience with it, but I came across one example of a library handling this issue: zlib has an extra function |
FYI:
Unicode support

fopen supports Unicode file streams. To open a Unicode file, pass a ccs flag that specifies the desired encoding to fopen, as follows: FILE *fp = fopen("newfile.txt", "rt+, ccs=encoding"); Allowed values of encoding are UNICODE, UTF-8, and UTF-16LE.

When a file is opened in Unicode mode, input functions translate the data that's read from the file into UTF-16 data stored as type wchar_t. Functions that write to a file opened in Unicode mode expect buffers that contain UTF-16 data stored as type wchar_t. If the file is encoded as UTF-8, then UTF-16 data is translated into UTF-8 when it is written, and the file's UTF-8-encoded content is translated into UTF-16 when it is read. An attempt to read or write an odd number of bytes in Unicode mode causes a parameter validation error.

To read or write data that's stored in your program as UTF-8, use a text or binary file mode instead of a Unicode mode. You are responsible for any required encoding translation.

If the file already exists and is opened for reading or appending, the Byte Order Mark (BOM), if it is present in the file, determines the encoding. The BOM encoding takes precedence over the encoding that is specified by the ccs flag. The ccs encoding is only used when no BOM is present or the file is a new file.
Using This can currently be a The question is, (how) can we use Can we convert the wide characters provided to |
@skyrich62 Thanks for the link; at the moment we are just looking at the encoding of the file name, not the content, but we'll keep the possibility of an on-the-fly conversion in mind. |
As a work-around for this: Would it be sufficient to add a ctor taking the @wojdyr Would that be OK with you or does it cause additional problems? Note that our class takes ownership of the |
Yes, it would work fine for me. |
That makes things easier, |
Thanks! |
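For readers following the work-around discussed above, here is a minimal sketch of a reader that is constructed from an already-open stream and takes ownership of it. It assumes the constructor argument in question is a std::FILE*; the class name owned_file_reader and the member read_all are hypothetical and are not PEGTL's actual API.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <string>

// Hypothetical helper, not PEGTL's actual class: wraps an already-open
// FILE*, takes ownership of it, and closes it on destruction.
class owned_file_reader
{
private:
   struct closer
   {
      void operator()( std::FILE* f ) const noexcept
      {
         std::fclose( f );
      }
   };

   std::unique_ptr< std::FILE, closer > m_file;

public:
   // The caller opens the file however it likes and hands over the stream.
   explicit owned_file_reader( std::FILE* fp )
      : m_file( fp )
   {
      assert( fp != nullptr );  // precondition: the stream was opened successfully
   }

   // Reads from the current position to end-of-file into a string.
   std::string read_all()
   {
      std::string result;
      char buffer[ 4096 ];
      std::size_t n = 0;
      while( ( n = std::fread( buffer, 1, sizeof( buffer ), m_file.get() ) ) > 0 ) {
         result.append( buffer, n );
      }
      return result;
   }
};
```

With something like this, a Windows caller could open the file with _wfopen (or _wfopen_s) and a wide-character name, then pass the resulting stream to the constructor. Because the class owns the stream, the caller must not call fclose on it afterwards.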
Thank you for maintaining PEGTL.
My understanding is that file_input and read_input currently don't work with Unicode filenames on Windows, because internal::file_reader uses fopen(_s) rather than _wfopen(_s). Is that correct?
If that's true, this is a feature request to add support for Unicode filenames.
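To illustrate the request, here is a rough sketch of what opening a file by a UTF-8 encoded name could look like. The function name open_utf8 is hypothetical, and treating the incoming narrow string as UTF-8 is an assumption. On Windows the name is widened with MultiByteToWideChar and passed to _wfopen_s; on other platforms the sketch simply falls back to std::fopen.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

#if defined( _WIN32 )
#include <windows.h>
#endif

// Hypothetical helper: opens a file whose name is given as UTF-8.
// Returns nullptr on failure, like std::fopen.
std::FILE* open_utf8( const std::string& utf8_name, const char* mode )
{
   assert( mode != nullptr );
#if defined( _WIN32 )
   // Convert a UTF-8 string to UTF-16; yields an empty string on error.
   auto widen = []( const std::string& s ) {
      if( s.empty() ) {
         return std::wstring();
      }
      const int size = static_cast< int >( s.size() );
      const int len = MultiByteToWideChar( CP_UTF8, MB_ERR_INVALID_CHARS, s.data(), size, nullptr, 0 );
      if( len <= 0 ) {
         return std::wstring();
      }
      std::wstring w( static_cast< std::size_t >( len ), L'\0' );
      MultiByteToWideChar( CP_UTF8, MB_ERR_INVALID_CHARS, s.data(), size, &w[ 0 ], len );
      return w;
   };
   const std::wstring wname = widen( utf8_name );
   const std::wstring wmode = widen( mode );
   if( wname.empty() || wmode.empty() ) {
      return nullptr;
   }
   std::FILE* fp = nullptr;
   return ( _wfopen_s( &fp, wname.c_str(), wmode.c_str() ) == 0 ) ? fp : nullptr;
#else
   // Assumption: non-Windows systems accept the UTF-8 bytes as the file name.
   return std::fopen( utf8_name.c_str(), mode );
#endif
}
```

The design choice here is to keep a narrow-string interface and convert internally; an alternative would be an overload that accepts a wide-character name directly.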