-
-
Notifications
You must be signed in to change notification settings - Fork 923
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
String encoding differs between MRI and JRuby #4564
Comments
I was unable to run your tests on my Fedora 25 environment with JRuby master. Are you able to reproduce this with JRuby master? (builds at ci.jruby.org). |
Just tested using Sample stacktrace
|
Looks like the test order is important. Running tests using different seeds results in different tests failing (but it looks like using the same seed keeps failing the same tests). E.g. using seed "62230" results in 7 errors, but "27871" yields only 6 errors. |
Here's what I get when I try to run your tetsts:
I'm not sure what's wrong with my setup :-( |
For some reason you are running the Rails 5.0 tests while my setup is running the 4.2 ones by default. Pretty sure my setup is the broken one. Nonetheless you should be able to run the tests using appraisal: (just trying for myself I had to downgrade concurrent-ruby - |
Ahh ok. I just did "bundle install" and it went for the Rails 5 gems by default. I'm re-trying with 4.2 now. |
Well I struggled to get a reproduction and had little luck, but I was able to reproduce your erratic test results. Anything you can do to reduce this down to just a few tests would be helpful, I'm trying to cut pieces out as I go, while keeping failures, but I have not made a lot of progress. |
I finally managed to create a more minimal reproduction 🎉 First I extracted one of the failing tests: # frozen_string_literal: true
require 'ostruct'
require 'graphql'
QueryType = GraphQL::ObjectType.define do
name "Query"
field :unit, types.String
end
Schema = GraphQL::Schema.define do
query QueryType
resolve_type ->(obj, ctx) { QueryType }
end
module TestData
def unit
nil
end
end
query_string = '{ Query: __type(name: "Query") { fields { name }}}'
puts Schema.execute(query_string, root_value: TestData, variables: {}) But once I removed the (now useless) TestData module everything works. After that it was pretty straightforward to get graphql out of the picture: s_string = :unit.to_s
puts s_string.encoding # => US-ASCII
module Test
def unit
nil
end
end
puts s_string.encoding # => ASCII-8BIT Interestingly this only breaks if the method is defined within a module or class - top-level functions work without any problem. |
Oh excellent! Ok, this gives me enough to proceed. I have my suspicions. |
OMG I just realized the original string changes encoding. That's super wrong. It must be a String stored in the Symbol, returned directly for to_s, and then later updated by the symbol to have ASCII-8BIT encoding. |
Wow, this is a great find! We have long had issues with encoding of symbols. In MRI, all symbols are keyed off their raw bytes, but in the table they store the original encoding. This allows looking them up with a simple key (rather than bytes + encoding) but sometimes leads to a race in deciding the symbol encoding for bytes which are the same but which should be encoded differently. In our IR we addressed this by having our symbol logic always force newly-created symbols to have the encoding we wanted. Unfortunately if the symbol already existed -- perhaps with a different encoding -- we would mutate the original symbol to have the encoding. So that's bad. Making matters worse, the underlying "ByteList" is shared by the original symbol and any to_s representation of it. Updating the encoding in the symbol ended up updating it in any to_s strings throughout the system, which was the root cause of this gnarly little issue. Fixing that led to us realizing that we had many other paths incorrectly using java.lang.String as a representation of the symbol, so I did additional work to actually propagate the original ByteList through as-is. This should be much closer to matching MRI, and it fixes your issue along the way. Thanks so much for the extra work in reducing this! It was a simple matter to find the problem, and the fix is a great one to have. |
Great to see this fixed! And many thanks for the additional insights. |
Environment
JRuby 9.1.8.0 / 9.1.5.0 on OSX and Linux + graphql-ruby gem
Behavior
ATM i'm trying to get graphql-ruby (https://github.com/rmosolgo/graphql-ruby) to work on JRuby. The only real showstopper right now is that occasionally strings like "unit" or "limit" are binary encoded rather than UTF-8 or 7Bit ASCII which are properly encoding on MRI.
As can be seen on Travis (https://travis-ci.org/rmosolgo/graphql-ruby/jobs/223395132) this happens reproducible on every test run. Running the failing test cases isolated however works without any issues.
I'm working on a simpler repro case, but failed to create a truly minimal one.
The text was updated successfully, but these errors were encountered: