Use Int256 to reduce BigInts in FD operations. #93
Merged
Conversation
We do not explicitly introduce support for FD{BitIntegers.Int256} here, though that should work out of the box both before and after this PR. Rather, this PR _uses_ a (U)Int256 under the hood to prevent allocations from Int128 widening to BigInt in FD operations.

Unfortunately, rem and mod on BitIntegers.Int256 still fall back to BigInt (see the note here), so this doesn't completely eliminate the BigInt allocations, but it does reduce them.

This is a pretty small PR, but it should have a big impact on users of FD{Int128}.

Before:

After:
I just realized this reimplements RelationalAI-oss#7, from 6 years ago (😳), which @TotalVerb had already reviewed. @TotalVerb, you may want to do one more pass over this.
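For illustration, here is a minimal sketch of the idea described above, assuming BitIntegers.jl is available. The helper name and the truncating rescale are inventions for this example, not the package's actual code; the real implementation also has to handle rounding and overflow properly.

```julia
# Sketch only: keep the widened product of two Int128 coefficients in a
# fixed-size Int256 rather than letting it promote to a heap-allocated BigInt.
using BitIntegers  # provides Int256/UInt256 as plain bitstypes

# Hypothetical helper: multiply two FD{Int128,f} coefficients and rescale by 10^f.
function mul_rescale_sketch(a::Int128, b::Int128, f::Int)
    wide = Int256(a) * Int256(b)            # exact 256-bit product, no BigInt needed here
    return Int128(div(wide, Int256(10)^f))  # truncating rescale back to an Int128 coefficient
end

mul_rescale_sketch(Int128(12345), Int128(67890), 2)  # 8381020
```

The division by the constant 10^f in the last step is the part that the follow-up commit below optimizes further.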
NHDaly added a commit that referenced this pull request on Jun 13, 2024
Finally implements the fast-multiplication optimization from #45, but this time for 128-bit FixedDecimals! :) This is a follow-up to #93, which introduces an Int256 type for widemul. However, the fldmod still required 2 BigInt allocations. Now, this PR uses a custom implementation of the LLVM div-by-const optimization for (U)Int256, which briefly widens to Int512 (😅) to perform the fldmod by the constant 10^f coefficient. This brings 128-bit FD multiply to the same performance as 64-bit. :)
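The "div-by-const" optimization mentioned here is the standard multiply-by-a-precomputed-inverse transformation. For machine-width integers, Julia's standard library already exposes the same trick through Base.MultiplicativeInverses; the sketch below illustrates the idea at 64 bits, whereas the commit hand-rolls the analogous routine for (U)Int256 by briefly widening to 512 bits. The divisor 10^8 is just an example constant, not taken from the commit.

```julia
# Illustration only: divide by a fixed constant without a hardware divide, by
# multiplying with a precomputed "magic" inverse (multiply-high plus shift).
using Base.MultiplicativeInverses: multiplicativeinverse

const INV_1E8 = multiplicativeinverse(Int64(10)^8)  # precompute once for the constant 10^8

x = Int64(123_456_789_012_345)
@assert div(x, INV_1E8) == div(x, Int64(10)^8)  # same quotient, no division instruction at run time
```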
Drvi approved these changes on Jul 19, 2024
LGTM! Thanks