command-line-arguments can't read umlauts with utf-8 encoding #81
I've tested this out on my Mac using the standard terminal, and it also fails here if the expression editor is enabled:
However, when I turn the expeditor off, it seems to work fine:
Additionally, echoing the instructions in from the terminal worked in both cases:
However, loading files named as command-line arguments also fails. So, for instance, if we have two files
Then Chez can load
However, Chez can load the
So, anyway, it seems like there are two problems here:
I think in both cases OS X is probably providing the characters in UTF-8, but I was a little surprised by the number of ? characters in the load error report.
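As a rough illustration of why a single umlaut can show up as more than one wrong character, here is a minimal Python sketch. It assumes the OS hands over UTF-8 bytes and that each byte is then turned into its own character; the file name `ü.ss` is made up for the example, and this is not Chez's actual code path.

```python
# Hypothetical file name, encoded the way a UTF-8 terminal would plausibly pass it.
raw = "ü.ss".encode("utf-8")

# Treating every byte as its own character (equivalent to a Latin-1 decode).
per_byte = "".join(chr(b) for b in raw)

# Decoding the same bytes as UTF-8 recovers the intended name.
decoded = raw.decode("utf-8")

print(len(raw), repr(per_byte), repr(decoded))
```

The one-character `ü` occupies two bytes, so a byte-per-character reading yields two unrelated characters in its place, and a name with several non-ASCII characters yields proportionally more.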
So, there are some workarounds (though not being able to use the expression editor is a pretty big bummer). Worth noting, though, is that file and console IO seem to do the right thing when the expression editor isn't involved.
I'll also try to take a look into this and see what I can figure out.
The inability to enter non-latin characters in the expression editor is #32.
This issue would be more accurately titled "Command line arguments always treated as bytes". The C spec (at least as of C99) says that
In any case, if you want to take a stab at making things better, looks like
Yes, I was just looking at that file and that Stack Overflow article.
The pertinent code for the expression editor is in
added a commit
Jul 21, 2016
The command-line arguments are converted to Scheme strings using
The command-line argument handling should account for the encoding used by the operating system. For unix-like systems, it is UTF-8. For Windows, it's UTF-16LE when the arguments are obtained from
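A minimal sketch of that idea in Python, for illustration only: `decode_argument` and its `windows` flag are hypothetical names, the sample file name is made up, and the real fix would live in Chez's C and Scheme startup code rather than look like this.

```python
def decode_argument(raw: bytes, windows: bool) -> str:
    """Hypothetical helper: decode one raw argument with the platform's encoding."""
    # Assumption taken from the comment above: UTF-8 on Unix-like systems,
    # UTF-16LE for wide-character arguments on Windows.
    encoding = "utf-16-le" if windows else "utf-8"
    return raw.decode(encoding)


# The same (made-up) file name as each platform would plausibly deliver it.
unix_arg = "größe.ss".encode("utf-8")
windows_arg = "größe.ss".encode("utf-16-le")

print(decode_argument(unix_arg, windows=False))
print(decode_argument(windows_arg, windows=True))
```

Both calls produce the same string, which is the point: the decoding step depends on the platform, while the Scheme string that results should not.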