New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix some failing rubyspecs for String#encode #684

Merged
merged 4 commits into from Apr 30, 2013

Conversation

Projects
None yet
2 participants
@tychobrailleur
Contributor

tychobrailleur commented Apr 30, 2013

This first pass at String#encode fixes 32 failures and 2 errors in the core/string/encode rubyspecs.

In particular, I have implemented my first stab at the :xml option; feedback welcome!

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Apr 30, 2013

Member

Getting the XML support in is supergreat. We figured we'd do an inefficient first version similar to yours, so I think it's probably fine.

I'll review the rest of the changes and get them merged in!

Member

headius commented Apr 30, 2013

Getting the XML support in is supergreat. We figured we'd do an inefficient first version similar to yours, so I think it's probably fine.

I'll review the rest of the changes and get them merged in!

headius added a commit that referenced this pull request Apr 30, 2013

Merge pull request #684 from tychobrailleur/string_encode
Fix some failing rubyspecs for String#encode

@headius headius merged commit 639752e into jruby:master Apr 30, 2013

1 check failed

default The Travis build failed
Details
@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Apr 30, 2013

Member

This seems to have introduced a failure; test/externals/ruby1.9/ruby/test_m17n.rb fails in test_split when it gets to the Emacs-Mule encoding. I'm looking into that, but if you have a change to fix I'd appreciate it. I already merged your changes.

Also a small style note: we only allow the no-braces form of "if" if it's a one-liner; you have a few cases in your patches where you use that form along with else, which we just consider confusing and error-prone. If you can't do everything in one line, use braces please. Thanks :-)

Member

headius commented Apr 30, 2013

This seems to have introduced a failure; test/externals/ruby1.9/ruby/test_m17n.rb fails in test_split when it gets to the Emacs-Mule encoding. I'm looking into that, but if you have a change to fix I'd appreciate it. I already merged your changes.

Also a small style note: we only allow the no-braces form of "if" if it's a one-liner; you have a few cases in your patches where you use that form along with else, which we just consider confusing and error-prone. If you can't do everything in one line, use braces please. Thanks :-)

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Apr 30, 2013

Member

Ok, I think I understand the failure. Because of the 7bit change, Emacs-MULE is now getting hit as an encoding. However, it's not an encoding we support transcoding for yet, and I think that triggers the issue.

I also have a change for your 7bit fix that uses an already-known bit in RubyString for coderange that indicates if it's all 7bit characters, rather than scanning the whole string every time.

Member

headius commented Apr 30, 2013

Ok, I think I understand the failure. Because of the 7bit change, Emacs-MULE is now getting hit as an encoding. However, it's not an encoding we support transcoding for yet, and I think that triggers the issue.

I also have a change for your 7bit fix that uses an already-known bit in RubyString for coderange that indicates if it's all 7bit characters, rather than scanning the whole string every time.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Apr 30, 2013

Member

It seems like there must be a bug in this encoding. The logic for split basically just walks the string's characters, shifting as many bytes as necessary for each character in turn. I think somewhere along the way it's shifting too far so we get the weird "c", "def" split in test_m17n#test_split.

When I print out "p" and "t" in the loop in RubyString#stringSplit19 I get the following for the last two encodings in test_m17n#test_split (CP949, which works, and Emacs-Mule, which doesn't):

#<Encoding:CP949>
[97, 98, 99, 44, 100, 101, 102]
p: 0 t: 3
["abc", "def"]
p: 0 t: 3
#<Encoding:Emacs-Mule>
[97, 98, 99, 44, 100, 101, 102]
p: 0 t: 1
p: 1 t: 2
p: 2 t: 3
["c", "def"]
p: 0 t: 1
p: 1 t: 2
p: 2 t: 3

Seems like something's borked in Emacs-Mule's rightAdjustCharHead.

Member

headius commented Apr 30, 2013

It seems like there must be a bug in this encoding. The logic for split basically just walks the string's characters, shifting as many bytes as necessary for each character in turn. I think somewhere along the way it's shifting too far so we get the weird "c", "def" split in test_m17n#test_split.

When I print out "p" and "t" in the loop in RubyString#stringSplit19 I get the following for the last two encodings in test_m17n#test_split (CP949, which works, and Emacs-Mule, which doesn't):

#<Encoding:CP949>
[97, 98, 99, 44, 100, 101, 102]
p: 0 t: 3
["abc", "def"]
p: 0 t: 3
#<Encoding:Emacs-Mule>
[97, 98, 99, 44, 100, 101, 102]
p: 0 t: 1
p: 1 t: 2
p: 2 t: 3
["c", "def"]
p: 0 t: 1
p: 1 t: 2
p: 2 t: 3

Seems like something's borked in Emacs-Mule's rightAdjustCharHead.

headius added a commit that referenced this pull request May 1, 2013

Fixes in response to PR #684.
* Use code range to determine ascii rather than scanning.
* Fix bug in EmacsMuleEncoding exposed by new 7-bit logic.

We will need a jcodings release for JRuby 1.7.4.
@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 1, 2013

Member

Found the bug in EmacsMuleEncoding. All good now.

Member

headius commented May 1, 2013

Found the bug in EmacsMuleEncoding. All good now.

@tychobrailleur

This comment has been minimized.

Show comment
Hide comment
@tychobrailleur

tychobrailleur May 1, 2013

Contributor

Darn, I missed that, sorry for sending you on a wild-goose chase... :-( Thanks a lot for your help on this!
Do you want me to send a separate pull request to address the style issues?

Contributor

tychobrailleur commented May 1, 2013

Darn, I missed that, sorry for sending you on a wild-goose chase... :-( Thanks a lot for your help on this!
Do you want me to send a separate pull request to address the style issues?

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 1, 2013

Member

No, it's ok, I took care of the style issues in place. Everything's merged and the EmacsMuleEncoding issue is fixed :-)

Member

headius commented May 1, 2013

No, it's ok, I took care of the style issues in place. Everything's merged and the EmacsMuleEncoding issue is fixed :-)

@tychobrailleur

This comment has been minimized.

Show comment
Hide comment
@tychobrailleur

tychobrailleur May 1, 2013

Contributor

Awesome, thanks again!

Contributor

tychobrailleur commented May 1, 2013

Awesome, thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment