Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed of division with remainder for cpp_int #537

Open
afabri opened this issue Feb 24, 2023 · 5 comments
Open

Speed of division with remainder for cpp_int #537

afabri opened this issue Feb 24, 2023 · 5 comments

Comments

@afabri
Copy link

afabri commented Feb 24, 2023

In a geometric code I observe a slowdown of a factor 4, when I switch from the gmp to the cpp_int backend. In vtune I see that more than 35% of the time are spent here. The comment at the signature leads me to the question if I am in the scenario where this function is not optimized. Is there a trivial fix? The comment seems to indicate that it is optimized for small numbers of limbs.

@afabri
Copy link
Author

afabri commented Feb 24, 2023

In my case the numerator has 78000 digits and the denominator 30000.

@ckormanyos
Copy link
Member

ckormanyos commented Feb 24, 2023

It might be possible to low-word the shift on the carry. But I am really not sure if that exact sequence is using full multiples of the number of bits in a limb for the shift.

As a general comment, it's really hard to beat or match the raw speed of GMP when using portable code. They squeeze the assembler for their well-deserved, legendary reputation for bare-metal speed.

@jzmaddock
Copy link
Collaborator

I'm not surprised by this. Division is hard, so many corner cases to handle correctly. There are better algorithms (in the big-O sense), but it's a significant piece of work to implement them correctly.

@ckormanyos
Copy link
Member

ckormanyos commented Feb 24, 2023

Division is hard, so many corner cases to handle correctly

I actually have a Knuth long division in an unrelated project. I've run hundreds of millions of tests on it and squeezed all the bugs out over the course of, my gosh, years. I use an opposite limb-ordering.

I feel that Knuth long division is the winner for limb-counts below 2,000 or so. But at that range of your test case, you're in the quadratic domain of classic algorithms. You'd probably be better off inverting-and-multiplying with conversion to bin-float, then Newton-Raphson-ing it. But to assemble the result, you'd be desperately driven to make sure you rorund correctly. Oh my gosh, I get exahusted just thingking about verifying and testing that code for hard-world-open use.

I wouldn't take on either of these tasks at the moment. But maybe note this as a potential place for optimization in the future...
As these are really daunting tasks.

@ckormanyos
Copy link
Member

ckormanyos commented Feb 24, 2023

actually have a Knuth long division

Since I've never spoken to, really anyone about this one and it's a gem, ...

Here is that Knuth long division - in it's full-on, loving, 300 lines of glory (albeit with the wrong limb-ordering) for future reference if anyone ever cares for it.

@afabri afabri changed the title Speed of division with reminder for cpp_int Speed of division with remainder for cpp_int Feb 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants