Error deserializing float to string #211
These strings are equivalent representations of the same underlying 32-bit floating-point value: for example, 508.41 and 508.410004 both round to the same float32 bit pattern. Therefore, I think your observation is due to a difference in behavior between the Opaque and Spark float-to-string cast expressions.
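This is easy to check in a Scala REPL (a minimal sketch using the two values from this thread):

```scala
// Both decimal literals round to the same nearest 32-bit float,
// so they are indistinguishable once stored as a float32.
val a = 508.41f
val b = 508.410004f

println(a == b)  // true
println(java.lang.Float.floatToIntBits(a) ==
        java.lang.Float.floatToIntBits(b))  // true: identical bit patterns
```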
It would take significant effort to match Spark's behavior exactly. One way would be to use a library like Ryu and modify it to match Java behavior. This can be tricky for scientific notation.
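For context on why such a library would be needed (a sketch, not Opaque code): the JVM prints the shortest decimal string that round-trips to the same float, whereas C-style `%f` formatting, of the kind a C++ implementation might use, pads to six decimal places:

```scala
val f = 508.41f
println(f.toString)      // "508.41"      (Java shortest round-trip formatting)
println("%f".format(f))  // "508.410004"  (C-style fixed six-decimal formatting)
```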
Also, note that Spark's …
Thanks for such a thorough response! This behavior makes sense now, though it is a shame it isn't easier to fix.
@ankurdave thanks for the detailed response :) @ryanleh perhaps we can add this behavior to the documentation, then close the issue?
To demonstrate the error, I can simply follow the instructions from "Using Opaque SQL" in the documentation but substitute the given integers for floats. If I encrypt the dataframe and then decrypt it, I get incorrect values.
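A minimal sketch of the reproduction, assuming the DataFrame setup shown in the "Using Opaque SQL" docs with the integers swapped for floats (the column names and float values here are illustrative):

```scala
// In a spark-shell session with the Opaque jar on the classpath.
import edu.berkeley.cs.rise.opaque.implicits._

edu.berkeley.cs.rise.opaque.Utils.initSQLContext(spark.sqlContext)

// Same shape as the docs example, with floats instead of integers.
val data = Seq(("foo", 508.41f), ("bar", 1.23f), ("baz", 5.67f))
val df = spark.createDataFrame(data).toDF("word", "value")

val dfEncrypted = df.encrypted

dfEncrypted.show()     // displayed floats are wrong, e.g. 508.410004
dfEncrypted.collect()  // returns the expected Float values, e.g. 508.41
```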
So it appears that there is an error in deserializing the floats to a string, since the displayed numbers are incorrect when using `show()` but not `collect()`.
One thing I noticed when debugging is that, if I set breakpoints in the various cases here, running `collect()` shows that both the `StringField` and `FloatField` cases are entered (as expected), but running `show()` prints that `StringField` is visited twice. So it would seem that the float value is being turned into a string (incorrectly) somewhere in the C++ code before it reaches Scala? But I am not sure.
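A hypothetical sketch (not Opaque's actual code, and the `Field` types here are stand-ins for the real FlatBuffers cases) of the asymmetry this would imply: under `show()` the value arrives already stringified with C-style formatting from the C++ side, while under `collect()` the raw 32-bit float comes back and the JVM formats it:

```scala
sealed trait Field
final case class FloatField(value: Float) extends Field
final case class StringField(value: String) extends Field

object ShowVsCollect extends App {
  val raw = 508.41f

  // collect() path: the FloatField case is entered, and the JVM later
  // formats the raw float with Java's shortest round-trip formatting.
  val collected: Field = FloatField(raw)

  // show() path: StringField is visited instead, because the value was
  // already stringified (here simulated with C-style %f formatting).
  val shown: Field = StringField("%f".format(raw))

  println(collected) // FloatField(508.41)
  println(shown)     // StringField(508.410004)
}
```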