Substring encoding #146

Merged
merged 1 commit into from

4 participants

@jvshahid

Add a spec to make sure that ASCII substrings that originate from binary strings can be UTF-8 encoded without an exception in JRuby.

@nurse

Ruby should not distinguish between a normal ASCII-only string and an ASCII-only string that originates from a binary string, so this spec should be unnecessary.
Or is this intended as a regression test?
CRuby actually had some bugs of this kind because of its internal encoding cache, called the code range.
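
For readers following along, here is a minimal sketch of the behaviour described in the comment above: an ASCII-only slice of a binary string should be usable like any ordinary ASCII string. The method name `ascii_slice_from_binary` is hypothetical and only serves the illustration; the `.dup` is there so the literal can be re-tagged even where string literals are frozen.

```ruby
# Hypothetical helper for illustration: builds a binary string whose first
# byte (0x82) is not ASCII, then returns the ASCII-only remainder.
def ascii_slice_from_binary
  binary = "\x82foo".dup.force_encoding("ascii-8bit")
  binary[1..-1] # the bytes "foo", still tagged as ASCII-8BIT
end

slice = ascii_slice_from_binary
plain = "foo" # an ordinary string literal

puts slice.encoding == Encoding::ASCII_8BIT # true: the slice keeps the binary tag
puts slice.ascii_only?                      # true: only 7-bit bytes remain
puts slice == plain                         # true: compares equal to a normal string
puts slice.encode("utf-8").encoding == Encoding::UTF_8 # true: transcoding succeeds
```

On an implementation without the bug, all four lines print `true`; the origin of the slice makes no observable difference.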

@jvshahid
@csw

This test case illustrates the bug JRUBY-6764. Pull request to fix it in JRuby is here.

@nurse

This is a current bug in JRuby that I wanted to write a test case for before pushing a patch.

This test case illustrates the bug JRUBY-6764. Pull request to fix it in JRuby is here.

Thanks, that's just what I wanted to know.

@jvshahid

Are there any updates on this pull request?

@nurse

I don't think rubyspec should include such a regression test,
but I won't object if Brian wants to merge.

core/string/encoding_spec.rb
@@ -67,6 +67,16 @@
"\u{4040}".encoding.should == Encoding::UTF_8
end
+ it "an ascii substring of a binary string should be encoded UTF-8 without raising an exception" do
@brixen Owner
brixen added a note

This would probably be better in #encode specs. It should also be significantly simplified. If I understand correctly, the spec states that encoding a substring of an ascii-8bit string as a utf-8 string (assuming the bytes encode correctly) should succeed. (If I'm mistaken, please correct me.) This suggests a two-line spec along the lines of:

str = "\x82foo".force_encoding("ascii-8bit")[1..-1].encode("utf-8")
str.should == encode("foo", "utf-8")

Other specs should already establish that #force_encoding, #encode, #[], etc. work as expected.
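
As a rough plain-Ruby illustration of what the suggested spec exercises (outside the spec framework), the sketch below contrasts encoding the whole binary string with encoding only its ASCII tail. The helper `try_utf8` is hypothetical and not part of rubyspec; it returns the exception class instead of raising so both cases can be printed side by side.

```ruby
# Hypothetical helper (not part of rubyspec): returns the transcoded string,
# or the exception class if the conversion to UTF-8 fails.
def try_utf8(str)
  str.encode("utf-8")
rescue EncodingError => e
  e.class
end

# .dup keeps the example working even if string literals are frozen.
binary = "\x82foo".dup.force_encoding("ascii-8bit")

whole = try_utf8(binary)        # includes the non-ASCII byte 0x82
tail  = try_utf8(binary[1..-1]) # only the ASCII bytes "foo"

p whole                            # Encoding::UndefinedConversionError
p tail                             # "foo"
p tail.encoding == Encoding::UTF_8 # true
```

The whole string cannot be transcoded because byte 0x82 has no defined mapping from ASCII-8BIT to UTF-8, while the ASCII-only tail should transcode cleanly. The bug under discussion is the case where the second conversion also raises.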

@jvshahid
jvshahid added a note

I'll try your simplified version to make sure it exploits the bug in jruby as intended. If that's the case, I'll make the change and push this weekend.

@brixen Owner
brixen added a note

Sweet, thanks for taking a look!

@jvshahid

I've simplified the test and moved it to encode_spec.rb. Do you want me to squash the commits?

@brixen
Owner

@jvshahid yes, would you please squash. Thanks!

John Shahid [encoding] [jruby] add a spec to make sure that ascii substrings of binary string can be utf-8 encoded without an exception in jruby
a72e36e
@jvshahid

Done.

@brixen brixen merged commit a72e36e into rubyspec:master
@qmx qmx referenced this pull request from a commit in qmx/jruby
John Shahid [encoding] ascii strings that originate from a binary string should be ascii/utf-8 encodable without throwing an exception.

A new rubyspec was added that demonstrates the issue (rubyspec/rubyspec#146).
454c444
Commits on Jun 30, 2013
  1. @jvshahid

    [encoding] [jruby] add a spec to make sure that ascii substrings of binary string can be utf-8 encoded without an exception in jruby

    John Shahid authored jvshahid committed
Showing with 6 additions and 0 deletions.
  1. +6 −0 core/string/encode_spec.rb
6 core/string/encode_spec.rb
@@ -39,6 +39,12 @@
 describe "String#encode" do
   it_behaves_like :encode_string, :encode
+  it "an ascii substring of a binary string should be encoded UTF-8 without raising an exception" do
+    str = "\x82foo".force_encoding("ascii-8bit")[1..-1].encode("utf-8")
+    str.should == encode("foo", "utf-8")
+    str.encoding.name.should == "UTF-8"
+  end
+
   it "returns a copy of self when called with only a target encoding" do
     str = "strung".force_encoding(Encoding::UTF_8)
     copy = str.encode('ascii')