-
Notifications
You must be signed in to change notification settings - Fork 16
fix: Fix precision loss in IntervalMonthDayNano nanosecond #321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Fix precision loss in IntervalMonthDayNano nanosecond #321
Conversation
…conversion Fix BigInt to Number conversion that was causing precision loss for large nanosecond values in IntervalMonthDayNano. The issue occurred when converting 64-bit nanosecond values to two 32-bit integers - values exceeding safe integer range lost precision during Number() conversion. Apply unsigned 32-bit shift (>>> 0) after Number() conversion to ensure values are properly treated as unsigned integers without precision loss. Fixes integration test failures where JS-produced IntervalMonthDayNano values differed from C++/Rust/nanoarrow by small amounts due to this precision error.
|
Wow! Is this the same problem as #15 ? |
|
Sure, I'll give it a shot. It should resolve it but I'll confirm. I was pulling my hair out with this issue, hopefully this solution extends. @kou |
|
Yes! After checking it against your comment, this PR does fix issue #15 it looks like. Both issues stem from the same root cause in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR improves the handling of nanosecond values in the toIntervalMonthDayNanoInt32Array function by ensuring proper unsigned 32-bit integer conversion. The changes prevent potential precision loss when converting large BigInt nanosecond values to the Int32Array format.
- Extracts the BigInt conversion to a separate variable for clarity
- Applies unsigned right shift (
>>> 0) to ensure the results are treated as unsigned 32-bit integers - Adds clarifying comment about the purpose of the conversion
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
This does not seem to fix the issue, see #15 (comment) Also, please add unit tests to exercise the fix, for example in https://github.com/apache/arrow-js/blob/main/test/unit/vector/interval-month-day-nano-tests.ts |
|
@GeorgeLeePatterson Could you take a look at the above comment? |
Sure, I can take a look tomorrow. Perhaps it only fixed the integration tests issues in the case of certain randomly chosen values, but wasn't in fact a blanket fix across the underlying problem. I'll see if I can figure this out more broadly. |
Rationale for this change
Integration tests were failing when JavaScript produced IntervalMonthDayNano data that was consumed by C++, Rust, or nanoarrow implementations. The nanosecond values differed by small amounts (e.g., 216 nanoseconds) due to precision loss during BigInt to Number conversion.
Example failure:
6684525287992311000ns6684525287992310784nsWhat changes are included in this PR?
Fixed the
toIntervalMonthDayNanoInt32Arrayfunction insrc/util/interval.tsto properly handle unsigned 32-bit integer conversion without precision loss. The issue was thatNumber(BigInt)conversion for large values (>2^53-1) loses precision. Applied unsigned right shift (>>> 0) to ensure correct unsigned 32-bit representation.Changed:
Fixes apache/arrow#46203
Closes #15.