Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adopt changes from JNI for casting from float to decimal #10917

Draft
wants to merge 3 commits into
base: branch-24.08
Choose a base branch
from

Conversation

ttnghia
Copy link
Collaborator

@ttnghia ttnghia commented May 28, 2024

Signed-off-by: Nghia Truong <nghiat@nvidia.com>
Signed-off-by: Nghia Truong <nghiat@nvidia.com>
Signed-off-by: Nghia Truong <nghiat@nvidia.com>
@ttnghia ttnghia added bug Something isn't working SQL part of the SQL/Dataframe plugin task Work required that improves the product but is not user facing labels May 28, 2024
@ttnghia ttnghia self-assigned this May 28, 2024
@thirtiseven
Copy link
Collaborator

thirtiseven commented May 28, 2024

The results look really close.

The @approximate_float in integration test can even be removed under my tests.

CastOpSuite can pass with eps of 1e-13.

Related Spark UT still fails, also close:

- casting to fixed-precision decimals *** FAILED ***
  Incorrect evaluation: cast(10.03 as decimal(38,18)), actual: 10.030000000000002048, expected: 10.03 (RapidsTestsTrait.scala:350)

Performance test to cast 5000000 floats to 10 kinds of decimal types (in ms):

Type CPU 24.08 #10909 This PR
Double 146,524 468.33 660.33 370.66
Float 82,691 412.33 480.67 275.33

@ttnghia
Copy link
Collaborator Author

ttnghia commented May 28, 2024

Should that case be acceptable as the relative error is just 1.7e-16?
For 10.3, it is stored like 9.30000000000000071054 and it is very difficult to remove the trailing bonus digits if we are going to take 18 decimal digits.

@ttnghia
Copy link
Collaborator Author

ttnghia commented May 29, 2024

I've just updated my JNI code in NVIDIA/spark-rapids-jni#2078, adding some utilities functions by @pmattione-nvidia and it seems to fix all of our failed tests. @thirtiseven please run tests again on your machine.

@thirtiseven
Copy link
Collaborator

thirtiseven commented May 29, 2024

It passed Spark UT, personally I think it is good enough for this issue.

If set eps = 0 in com.nvidia.spark.rapids.CastOpSuite#cast float/double to decimal, we can still see some mismatch:
Some cases in float test:

cpu: -6.963900469355742E15	gpu: -6.963900469355743E15
cpu: 4.005061702373036E19	gpu: 4.005061702373037E19
cpu: 7.1726783043136778E18	gpu: 7.1726783043136788E18
cpu: 4.9245108190671776E17	gpu: 4.9245108190671782E17
cpu: -6.6190078985307568E16	gpu: -6.6190078985307576E16
cpu: -8.286135559736182E15	gpu: -8.286135559736183E15
cpu: 9.999999933815812E18	gpu: 9.999999933815814E18
cpu: 9.2205076355788718E18	gpu: 9.2205076355788728E18
cpu: 1.00000004091847872E17	gpu: 1.00000004091847888E17

Only diffs in last few digits.

Some cases in double test:

cpu: -1.0E16	                gpu: -1.0000000000000002E16
cpu: -3.953408452257507E34	gpu: -3.9534084522575073E34
cpu: -1.0E16	                gpu: -1.0000000000000002E16
cpu: 7.1726781626632929E18	gpu: 7.172678162663294E18
cpu: 4.9245106955378758E17	gpu: 4.9245106955378765E17
cpu: 1.0E29                     gpu: 1.0000000000000001E29
cpu: -6.619008101723092E16	gpu: -6.6190081017230928E16
cpu: 9.999999999999999E29	gpu: 1.0E30
cpu: -6.200692612954865E31	gpu: -6.200692612954866E31
cpu: -8.286135303307476E15	gpu: -8.286135303307477E15
cpu: -5.75278800663036E23	gpu: -5.7527880066303604E23
cpu: -2.786658992657616E33	gpu: -2.7866589926576164E33
cpu: -4.857510460413498E34	gpu: -4.8575104604134985E34

There some cases like -1.0E16 vs -1.0000000000000002 and 9.999999999999999E29 vs 1.0E30, not sure if they can be fix easily.

New performance:

Type CPU 24.08 #10909 This PR old This PR new
Double 146,524 468.33 660.33 370.66 439
Float 82,691 412.33 480.67 275.33 378.33

@thirtiseven
Copy link
Collaborator

And again some thoughts about the float => string => decimal solution. In Spark/Java double is converted to string with as many, but only as many, more digits as are needed to uniquely distinguish the argument value from adjacent values of type double. doc. But the algorithm is complex and sometimes does not produce results only as many at low version jdk. So technically the best we can match now might be the only as many digits, which is implemented in jni ftos_converter.cuh. So we don't need to really convert it to string, and can get the digits in the string in int64. But it seems not to be easy to use in your approach.

@ttnghia
Copy link
Collaborator Author

ttnghia commented May 29, 2024

My solution in NVIDIA/spark-rapids-jni#2078 is adopted from the ongoing work of @pmattione-nvidia. Paul is working on even a better solution, so hopefully it will be able to fix the failures above. Let's wait for it.

@ttnghia ttnghia changed the base branch from branch-24.06 to branch-24.08 May 29, 2024 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working SQL part of the SQL/Dataframe plugin task Work required that improves the product but is not user facing
Projects
None yet
2 participants