lein 2 preview 4 has encoding issues in the REPL #586

Closed
wolverian opened this Issue May 15, 2012 · 13 comments

5 participants

@wolverian

Leiningen 2.0.0-preview4 on Java 1.6.0_31 Java HotSpot(TM) 64-Bit Server VM

user=> "ä"
"?"

This is on OS X 10.7, iTerm2, and my locale is:

$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

UTF-8 works in other programs.

@borkdude

I tested this on OS X 10.7.4 using iTerm2. In an in-project REPL with a dependency on Clojure 1.3, "ä" prints fine. In a REPL started outside a project, which uses Clojure 1.4, I see the same problem!

@technomancy
Owner

Is this related to the fact that OS X ignores your locale settings and uses MacRoman?

@wolverian

@technomancy What does that mean, exactly? That Apple's Java ignores the locale? Note that just starting the Clojure REPL directly does not exhibit this problem:

$ java -cp $CLASSPATH:/usr/local/Cellar/clojure/1.4.0/clojure-1.4.0.jar clojure.main --repl
Clojure 1.4.0
user=> "ä"
"ä"

@technomancy
Owner

I don't know the details, but I have heard a lot of bug reports caused by MacRoman on OS X. I believe the example you showed doesn't break because the input and output are both using the same (wrong) encoding.
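
One way to see why a consistent but wrong encoding can still round-trip is to push the UTF-8 bytes of "ä" through MacRoman and back. This is only a sketch of the byte-level effect, not of what the JVM or Leiningen does internally. It assumes an `iconv` that recognizes the charset name `MACINTOSH` (glibc and GNU libiconv do), and the helper name `roundtrip_hex` is made up for illustration.

```shell
# "ä" is the two bytes 0xC3 0xA4 in UTF-8.
# Hypothetical helper: decode those bytes as MacRoman, encode the result
# back as MacRoman, and print the final bytes as hex.
roundtrip_hex() {
  printf '\303\244' \
    | iconv -f MACINTOSH -t UTF-8 \
    | iconv -f UTF-8 -t MACINTOSH \
    | od -An -tx1 \
    | tr -d ' \n'
}

roundtrip_hex   # prints c3a4: the original bytes survive the round trip
echo
```

Decoded as MacRoman, those two bytes are the characters √ and §, so the text is wrong in the middle of the pipeline even though the bytes come out intact. A visible problem can appear only when some later stage handles that text with a different charset.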

@wolverian

Hmm. I downloaded Oracle's JDK 7, and behold:

Leiningen 2.0.0-preview4 on Java 1.7.0_04 Java HotSpot(TM) 64-Bit Server VM
user=> "ä"
"ä"

…this certainly indicates that Apple's Java is doing something weird with locales.

@wolverian

Indeed:

Java 7:

user=> (import java.nio.charset.Charset)
user=> (Charset/defaultCharset)
#<UTF_8 UTF-8>

Java 6 (Apple):

user=> (Charset/defaultCharset)
#<MacRoman MacRoman>

@wolverian

It would be extra-super-cool if Leiningen worked around this, if possible.

@technomancy
Owner

Definitely open to a patch if an OS X user wants to tackle this.

@michaelklishin
Collaborator

It should be sufficient to pass -Dfile.encoding=UTF-8 to Leiningen's JVM.

@michaelklishin
Collaborator

Yes, passing -Dfile.encoding=UTF-8 to Leiningen's JVM does help.

@michaelklishin added a commit that closed this issue May 17, 2012

Set default encoding for Lein's JVM to UTF-8, fixes #586
On OS X, JDK 6 uses MacRoman encoding otherwise and it messes
things up for REPL sessions that may have non-ASCII characters
efc8fb4

@michaelklishin
Collaborator

@wolverian fixed. If you don't use Lein from a checkout, you can either copy the change in efc8fb4 into your lein script or set LEIN_JVM_OPTS=-Dfile.encoding=UTF-8.
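
For the environment-variable route, a minimal sketch for a shell profile might look like the following. It assumes a POSIX-style shell and that your lein script hands LEIN_JVM_OPTS to the JVM in a way that allows several space-separated options. Check your copy of the script, since setting the variable may also replace any default options the script would otherwise apply.

```shell
# Append the flag to LEIN_JVM_OPTS, keeping any options that are already set.
export LEIN_JVM_OPTS="${LEIN_JVM_OPTS:+$LEIN_JVM_OPTS }-Dfile.encoding=UTF-8"

# Show the value that Leiningen will see.
echo "LEIN_JVM_OPTS=$LEIN_JVM_OPTS"
```

In a REPL started afterwards, (Charset/defaultCharset) from earlier in this thread should then report UTF-8 instead of MacRoman.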

@trptcolin
Collaborator

Nice, thanks guys. I'll put this in REPL-y's shell script as well.

@wolverian

Now if some poor fool actually uses MacRoman on purpose... Thanks, @michaelklishin!
