JRuby does not Handle UTF-8 Source Files #480

martinvahi opened this issue Jan 6, 2013 · 8 comments

@martinvahi martinvahi commented Jan 6, 2013


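# Assumed file names, inferred from the commands and output below.
FILE_A="./a.rb"
FILE_B="./b.rb"

S_NONASCII_SCRIPT="puts 'Tsitaatsõne'"

# Assumed reconstruction: a.rb holds an empty first line followed by the
# non-ASCII script, which matches the "a.rb:2" in the error output below.
echo ""                   >  "$FILE_A"
echo "$S_NONASCII_SCRIPT" >> "$FILE_A"

echo ""  >  "$FILE_B"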

echo "puts 'Start of test.'" >> $FILE_B
echo "require './a.rb'"      >> $FILE_B
echo "puts 'End of test.\n'" >> $FILE_B

echo ""
echo "Running ./a.rb with plain ruby:"
echo ""
`which ruby` -Ku ./a.rb
echo ""
echo "Running ./b.rb with plain ruby:"
echo ""
`which ruby` -Ku ./b.rb
echo ""
echo ""
echo "Running jruby with non-ASCII script directly from console:"
echo ""
`which jruby` -Ku --1.9 -e "$S_NONASCII_SCRIPT"
echo ""
echo ""
echo "Running ./a.rb with jruby:"
echo ""
`which jruby` -Ku --1.9 ./a.rb
echo ""
echo ""
echo "Running ./b.rb with jruby:"
echo ""
`which jruby` -Ku --1.9 ./b.rb
echo ""
echo ""

# The console output of this script, if the bash comments and 
# a single space were removed from the start of the lines:
# Running ./a.rb with plain ruby:
# Tsitaatsõne
# Running ./b.rb with plain ruby:
# Start of test.
# Tsitaatsõne
# End of test.\n
# Running jruby with non-ASCII script directly from console:
# Tsitaatsõne
# Running ./a.rb with jruby:
# SyntaxError: ./a.rb:2: invalid multibyte char (US-ASCII)
# Running ./b.rb with jruby:
# Start of test.
# SyntaxError: /home/ts2/tmp/xx5/test_case/./a.rb:2: invalid multibyte char (US-ASCII)
#   require at org/jruby/
#    (root) at /home/ts2/m_local/bin_p/Ruby/JRuby/paigaldatult/v_1_7_0/lib/ruby/shared/rubygems/custom_require.rb:1
#    (root) at ./b.rb:3

@the8472 the8472 commented Jan 7, 2013

Have you tried adding an encoding header to the file?
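
For context, the header in question is a magic comment such as `# encoding: utf-8` on the first line of the source file. The sketch below is only an illustration of why the parser complains, not JRuby's actual parsing code: it takes the same bytes as the reported script and checks them once as US-ASCII (the encoding named in the error message) and once as UTF-8. The variable names are made up for this example.

```ruby
# The script from the report. The \u escape makes this literal UTF-8
# regardless of the source encoding of the file holding this example.
source = "puts 'Tsitaats\u00F5ne'"

# Same bytes, two different encoding labels.
as_ascii = source.dup.force_encoding(Encoding::US_ASCII)
as_utf8  = source.dup.force_encoding(Encoding::UTF_8)

# false: the bytes of the non-ASCII letter are above 127, so they are not US-ASCII
puts "Valid as US-ASCII? #{as_ascii.valid_encoding?}"
# true: the same bytes form a well-formed UTF-8 sequence
puts "Valid as UTF-8?    #{as_utf8.valid_encoding?}"
```

A parser that assumes US-ASCII therefore sees an invalid multibyte sequence where a UTF-8 parser sees one ordinary letter.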

@BanzaiMan BanzaiMan commented Jan 7, 2013

This might be a dup of #343.

@martinvahi martinvahi commented Jan 7, 2013

Regarding the encoding header: thank you for the workaround suggestion. I'll keep it in mind as a measure of last resort.

I have already moved back to "vanilla Ruby". I find it easier to recompile the classic, tried-and-tested Ruby than to risk a considerable amount of work being wasted because the basics are missing. Besides, Java has 2-byte chars, and ironing out the Unicode support quirks will probably take some time even for Java, the JVM's main language.

@martinvahi martinvahi closed this Jan 7, 2013
@martinvahi martinvahi reopened this Jan 7, 2013

@martinvahi martinvahi commented Jan 7, 2013

I apologize for the needless closing and reopening of this issue. I did not know that the "Close" button applied to the whole issue, not just to the comment box.

@BanzaiMan BanzaiMan commented Jan 7, 2013

@martinvahi I'm convinced that this is a duplicate of #343. I'm closing this as such, but if you have evidence to show otherwise, feel free to reopen it.

@BanzaiMan BanzaiMan closed this Jan 7, 2013
@enebo enebo reopened this Dec 20, 2013

@enebo enebo commented Dec 20, 2013

I am re-opening this issue over #343 since this has a much simpler reproduction.

headius added a commit that referenced this issue Dec 20, 2013
Fixes #480.

@headius headius commented Dec 20, 2013

I fixed this for the cases you present. There are a few other paths that do not obey the -K encoding when parsing, but I'm not sure whether they should get the same treatment (mostly evals, some of which transcode to a Java String before parsing, while others do not and may be getting the wrong encoding as a result). If someone wants to explore those possibilities, it would be really excellent.

In any case, I did not come up with a test case for this because -K is deprecated (and warns you of that in verbose mode) and because command-line stuff is a pain to test. If someone here would like to contribute a test, perhaps in test/test_command_line_switches.rb, we'd be happy to incorporate it.
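
A rough sketch of what such a check could look like follows. It is not taken from the JRuby test suite: the helper name `run_script_with_switches` and the constant `SCRIPT` are hypothetical, and it assumes the interpreter under test is the one running the sketch (`RbConfig.ruby`) and that it still accepts the `-Ku` switch. It writes the reported script to a temporary file with no encoding header, runs it, and looks for the expected bytes in the output.

```ruby
require 'open3'
require 'rbconfig'
require 'tempfile'

# Hypothetical helper: writes `source` to a temporary .rb file without an
# encoding header, runs it under the current interpreter with the given
# switches, and returns the combined stdout/stderr as raw bytes.
def run_script_with_switches(source, *switches)
  Tempfile.create(['utf8_source', '.rb']) do |file|
    file.binmode
    file.write(source)
    file.flush
    output, _status = Open3.capture2e(RbConfig.ruby, *switches, file.path)
    output.force_encoding(Encoding::BINARY)
  end
end

# Empty first line, then the non-ASCII script on line 2, as in the report.
SCRIPT = "\nputs 'Tsitaats\u00F5ne'\n"
expected = "Tsitaats\u00F5ne".dup.force_encoding(Encoding::BINARY)

output = run_script_with_switches(SCRIPT, '-Ku')
if output.include?(expected)
  puts 'Script was parsed as UTF-8.'
else
  puts "Unexpected output: #{output.inspect}"
end
```

Comparing raw bytes keeps the check independent of the locale of the process that runs it. A real test would wrap the comparison in whatever assertion style the existing command-line tests use.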

Not merged to master yet, but I'll do that now.

@headius headius closed this Dec 20, 2013

@Alamoz Alamoz commented Dec 21, 2013

Thanks @enebo and @headius for re-opening and then fixing this issue, a great holiday gift! 😄 The fix is working for my app.

@enebo enebo modified the milestones: JRuby 1.7.10, JRuby 1.7.11 Feb 24, 2014
eregon added a commit that referenced this issue Oct 28, 2017
a6b8805 Fix the 2.5 example of Warning.warn to be independent of $VERBOSE
65f80f6 Make the Warning.warn specs work indepent of the external warning level
9daa40d Fix for loop for old before C99 compilers
b25e9ec Add a couple more expectations for post + kwargs
ea6c301 Fix style
f04a632 Improve warnings when there are leaked process before the Process.wait2 specs
e186683 Add an example for readlink with unicode characters
43adad3 Add spec for a binary Symbol
aa792e7 Add rb_str_new example with offset
e6b67cd Add spec verifying $VERBOSE allows truthy values.
61e7f32 Add rb_yield passing block with each spec
3e945e7 Recompile spec cexts if the ruby library changed
f398b2c Improve SafeStringValue by sharing the specs of StringValue
1bbcba9 Fix SafeStringValue, add spec
be895f7 Test rb_class_of() with an object with a singleton class
b2e08d2 Use stub! when there is no guarantee if and how many times it is called
4125eca Add specs for [Feature#13983] Rational and Complex should be frozen
6b6d33e Revert "Dir.glob with FNM_EXTGLOB is optimized [Feature#13873]"
6019e87 io.c: write a newline together
4c42133 Dir.glob with FNM_EXTGLOB is optimized [Feature#13873]
07576e1 Revert "ignore server side error"
00a6585 ignore server side error
ad5eb0a spec/ruby/optional/capi/io_spec.rb: speling fics
e79f3eb array.c: improve operations on small arrays
340826a Use #gets instead of recv(2014) for FTP specs
bc5521a Avoid warnings in if with multiple assignments
8916cca Fix ruby_bug version range and expectation
061ac41 Make it clear top-level return specs specify the ruby 2.4.2 behavior
9a26eda Clarify how to use ruby_bug
6078bba Add specs for Integer.sqrt [Feature #13219]
45e578a Add more specs for Date#next_month, Date#prev_month
fb5a97b Improve specs for Date#next_day, Date#prev_day
d9c1c83 Code review. Mark some test cases as bugs
63b3d78 Code review. Replace excessive `ruby_exe` calls with temp file loading
04fd5be Add specs for top-level return (tested with Ruby 2.4.2)
847d901 spec for regexp absent operator
6ea9b6b Rename file to match standard naming conventions
9cfced0 Fix a couple spec descriptions
53dc62f Add spec for REXML::Element#[]
81c9b4f Remove unnecessary top-level version guard in Module#refine specs
ab3765e Add specs for Module#using
7cf3b1d Update specs for main#using
37a2e86 Add specs for Module#refine for Ruby 2.4
dca2e4f Add specs for Module#refine
214743d Update specs for Module#refine
44a1368 Restore specs for main#using (45fe77547a063284cbc04faf6708a426ed68f712)
84c940e Restore specs for Module#refine (45fe77547a063284cbc04faf6708a426ed68f712)
58154a0 Fix misspell with
21f7385 Use multiple lines in StringIO chomp: true examples
b6ed5bf Changed incorrect carriage return usage to newline characters
74190bb Add chomp specs for StringIO#gets, StringIO#readline, StringIO#each_line, StringIO#readlines, StringIO#each and StringIO#lines
863f342 Changed incorrect carriage return usage to newline characters
e8f16ca Add chomp specs for IO#gets, IO#readline, IO#each_line, IO#readlines and IO.foreach
9457cc8 close logger
93a2219 test shift_period_suffix
92a38a8 Code review. Update case descriptions
c135a65 Add spec for CSV#readlines to test parsing illegal input
d98f29b Add spec for CSV.parse to test parsing illegal input
d3dc4b8 Add specs for CSV#liberal_parsing?
5be6e46 uses be_close to avoid day rollover
9c73504 Properly implement be_close on timing assertion
8258707 Be close enough on time
d691fbd Replace data shell out with Process#clock_gettime
44bd7e6 Better describes the Timezone preservation for
fa464db preserves local timezone
fc8dad7 checks against local date
0ac7b33 spec checks time
a2d4732 Expands coverage of DateTime#now
feef77e Adds DateTime#to_s
dae126d Adds DateTime#to_date
d62a273 Adds DateTime#to_datetime
f48b4a4 add spec for ConditionVariable#marshal_dump
3516c4a Update RuboCop to 0.51.0
92e2231 Remove spec, a better one is coming in PR 528
9dc9897 test instantiating logger with keyword arguments
441b94b Recommend to not use should_not raise_error
b006666 fix a couple typos
3ce41f7 Add spec for OptionParser#parse into argument
da9f1eb Add spec for OptionParser#order into argument
213041c Avoid printing the return value of Warning.warn/IO#puts
8d35f67 Move parser warnings to 2.4
da55149 Add spec for overriding Warning.warn
ebcdd4c Add a few more specs to Warning.warn
f7da018 Add spec for Warning.warn
4ca0dd0 Code review. Fix spec description
3b46c0c Add spec for Shelwords.shelwords bug with backslash escaping
76d9649 try to clarify when to use ruby_bug
0f1c23e Add spec for `IPAddr#==` bug
2602cd8 Add specs for
f6a1941 Refactoring spec for Date#wday
cb0fd00 Add spec for Date#friday?
d72d05d Add Date#saturday? spec
e84fba1 Add Date#friday? spec
0b6d255 Add Date#thursday? spec
0cda9f2 Add Date#wednesday? spec
134361c Add Date#tuesday? spec
eebb66f Add Date#monday? spec
36ebba9 Add Date#sunday? spec
0f78774 Add Date#wday spec
e78f0ba transform_values: add spec for partial modification
fa11325 transform_values: test for empty frozen hash
fb65f5a Add spec for Hash#transform_values
7e4033a remove extra proc, clean up desc
08d6e68 use scratchpad for mult. assign test
149faaf catch error to get other ruby versions passing
6e55b81 Add spec for multiple assignment conditional
5c705c1 Merge pull request #522 from 284km/fix-typo
75129db Fix a typo
559bc18 Enable Thread.report_on_exception = true by default in ruby/spec
5ea6848 Add specs for Thread{.,#}report_on_exception{=,}
66a6754 Mark behavior with the reported upstream bug
d33bc13 Code review. Check both "\n" and "\r\n"
0a1ac1a Add specs for `chomp` argument of `String#lines` and `String#each_line`
a8d352e Ensure that Ruby 2.4 supports rescue in method arguments #473
d4892bd Add specs for \X character class
8c1b3f5 Add a list of common guards
b54e8c6 Add a list of frequently-used matchers to get started
a0ae256 enable the leak checker on macOS since we get EBADF errors
14a951a Improve descriptions
4854304 add missing :
b59a9ca changed assertion language and used symbol literals
6f4f946 added missing do after assertion
d4e2148 Character manipulation coverage for symbol ruby 2.4
5a0f22a Code review. Check string content instead of not raising exception
e873562 Add simple test for `capacity` argument of `` method
77f23fa Core review. Add case with long buffer and use `equal`
35307dd Add specs for `Array#pack`'s `buffer` option
9e81f33 Add specs for TracePoint#callee_id
0ba04d7 Add Set#compare_by_identity & Set#compare_by_identity? specs
fa456b4 Add specs for Set#compare_by_identity
6345a38 Add a spec for deprecation warnings on Fixnum/Bignum
e3b13f1 Add missing version guard and improve description
9027d74 Code review. Replace `==` with `equal` and add spec for `Integer`
7626b75 Update Fixnum specs
b894db3 Update Bignum specs
25883f4 spec: covered String#upcase!
0e937c3 spec: covered String#swapcase!
7e97cb8 spec: covered String#downcase!
199a0b7 spec: covered String#capitalize!
ae5fc98 Fix style of a couple specs breaking lines when not needed
ba295cb Add missing should in Array#permutation spec
d245974 Check the return value of & to avoid warnings
2144b2a Merge pull request #500 from joshgelbard/more-missing-should
0d5a424 Add more missing .should calls
770b3bf Improve specs description
0f3e461 Add specs for `half` option in `Rational#round`
3e43ae5 Add missing ".should" and fix subsequent failure
38b085c Add a spec testing what calls are made for Array#sum
e3e3ffa Code review. Implement test case without refinements
232dcc7 Add specs for Array#sum
a68c5e9 Spec Constant ||= for all scoping possibilities.
f5146f1 Make sure the returned time is the same instance
c4cdc6b Moves the to_time call into the with_timezone to verify no timezone shift
50bf0a9 Version guard on timezone preservation of DateTime#to_time
d8bb344 Updates for clarity on to_time
dd6fbc0 Adds timezone comparison
04cbc22 Adds specs for DateTime#to_time
a538fa9 Time#to_time spec
f8a9f7c Fix some typos in test descriptions
1915741 Adds specs for Complex with infinity on the real part
43df951 Adds Complex#finite? and Complex#infinite?
e2a9b65 Add Numeric#finite? and Numeric#infinite?
0a2d315 Supplement specs for Enumerable#uniq
fa47230 Add spec for complex expression in ().
ef70ce7 Adds String#unpack1 (#488)
6d04a08 Skip failing spec on Travis CI due to the special Travis environment
de02faa Update to latest MRI releases
487a79d Enable the TrailingWhitespace cop
1af928f Update to RuboCop 0.50
ec7a116 Add spec for CGI::Cookie.parse about handling , separator
64c06f9 Fix integer rounding-related specs for the new behavior on 2.5
75b1a1f Add more specs for Integer#ceil and Integer#floor
100bec0 Add specs for Integer#truncate that takes optional digits
8d097ed Add specs for Integer#floor that takes optional digits
c0f73bd Add specs for Integer#ceil that takes optional digits
dea4eb7 Changes on Pathname#empty? specs per Peer-Review
d19fcad Adding Pathname#empty? specs
819a4e4 Clarifies Integer#digits radix mechanics
d59065a Adds specs for Integer#digits
04add34 Removes dependence on DirSpecs fixtures
9d8e136 Nitpick: it's Dir.empty?, not Dir#empty?
36a81ce Adds specs for Dir#empty?
bd9eda1 Add specs for symbol#casecmp? (#480)
7904658 Add spec for Integer#round(half:)
8501780 Symbol with invalid bytes are detected at parse time
3d5efd7 Symbol are unique so also test their identity with #equal?
1ff0e0a Improve String#to_sym spec to specify the encoding of the resulting Symbol
8f96e83 Use UTF-8 characters instead of escape sequences when possible
6f182f9 Move to utf-8 encoding in the Symbol#casecmp spec
c33c429 Improve String#casecmp? spec and use String literals when possible
f4ade8d spec for String#casecmp?
88036c3 Add spec for Net::FTP#status(pathname)
2429bba Add spec for Float#round(half:)
6bf1725 Add spec for File.empty?
2c43fec Adds specs for Float#ceil, Float#floor, and Float#truncate (#475)
c073cd3 Fix whitespace [ci-skip]
1153e30 Tweak spec so it doesn't assume that number literals are always the same object.
6e943e1 Fix whitespace [ci-skip]
f07c3de Spec that Array#min,max are defined and not just from Enumerable
a6c4f6e Fix of MRI Bug 12367 was backported to 2.3 and 2.2
24515bc add specs for 2.4+ MRI behavior on duping numerics,nil,false and true
e778d17 Create objects in before blocks in super specs
61db431 improved specs for super arguments

git-subtree-dir: spec/ruby
git-subtree-split: a6b8805fe3fe9cea81bbcebffb8a0fac5b09dbc2