New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Float() parsing innacuracy #6480
Comments
Reduced this a little more to: Float("661e7086FROGGER") So at first glance it appears that we ignore extra data after finding a valid float value on the front of the string and MRI will raise an ArgumentError because there is invalid extra crud after the float value. I half wonder if Ruby ever did what we did or not? |
Some notes... We actually lean on JVM Double.parseDouble BUT we pre-process the string first to account for things Java does not (or at least did not up to a point) like using _. This processing happens to abort when seeing '6' of 7086 thinking as an exponent it will be too large a value. So this is how we get Inifinity. It never even makes it to the Java double parsing. If it did then we would end up with a NumberFormatException and match MRI. At some point we should just properly make a double processor and not perform this pre-processing. It would be quicker and ultimately it be a little less confusing. It is also a fairly large and icky job. For now, I might add some logic to look past the first three digits as part of tooLargeExponent() in ConvertDouble.java. If it sees a non-numeric value in the rest of the string it will just abort. In this case it should find a '-'. The original String value is quite entertaining because all the characters after the initial float string are actually all value characters that can occur within a float number. |
I actually am mistaken in my last comment this pre-processing does not have 0x float values so any non-[0-9] would be recognized as not valid. |
Targetting 9.2.14 which we plan on releasing soon since the fix I used is very very narrow so the risk is minimal. This new logic only occurs in a place where we think we have too many digits in an exponent. Worst case a break in logic would not cause Infinity to happen. There is not logical flaw here though Also ruby/spec does actually adequately test garbage after a float. Our failure is just a weird wrinkle in how we implement things so I am adding something to our regression specs. |
Thanks a lot for the very fast action! |
Environment Information
I'm running jruby
9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.252-b09 on 1.8.0_252-b09 +jit [linux-x86_64]
from logstash (although I don't think this is a logstash specific problem) on aLinux 3.10.0-1062.el7.x86_64 #1 SMP Wed Aug 7 18:08:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Here's a side-by-side comparison between ruby and jruby, I think the expected behavior should be the later:
ruby -e 'v="661e7086-33af-11eb-a4ef-fa163ec4aef9"; puts(Float(v) rescue v)'
returns661e7086-33af-11eb-a4ef-fa163ec4aef9
jruby -e 'v="661e7086-33af-11eb-a4ef-fa163ec4aef9"; puts(Float(v) rescue v)'
returnsInfinity
I think jruby is parsing
"661e7086"
from the original the string and interpreting as scientific notation. Although more robust, I don't think is a good behaviour in general (imho jruby should follow ruby).Thanks a lot for the attention!
The text was updated successfully, but these errors were encountered: