Optimize fullword inner loop in nmod_poly_gcd_euclidean #2528
Merged
fredrik-johansson merged 1 commit into flintlib:main on Dec 16, 2025
Found two more time savers in `_nmod_poly_divrem_q1_preinv1_fullword`:

- `r = a + q0*b + q1*c` will have high limb larger than `n` and can even overflow two limbs, but in practice this rarely happens (and almost surely doesn't happen when `n` is just larger than `2^(FLINT_BITS-1)`). We can inspect the actual values `q0` and `q1` before entering the loop to verify that it is safe to do a regular addition instead of a modular addition for the high limb.
- `norm == 0`.

The asymptotic improvement is about 15%. This is quite nice since it affects the moduli used by `fmpz_poly_gcd_modular`.

Effect on `nmod_poly_gcd`: