Skip to content

Conversation

@czurnieden
Copy link
Contributor

s_mp_sqr is significantly slower than s_mp_mul here (without comba). Changing the addition r + r to r<<1 made it already faster and changing the computation in the carry propagating loop to a manual multi-word doubled the speed-up already gained. The relation between s_mp_mul and s_mp_sqr follows the theoretical runtimes much closer now.
A test for s_mp_sqr against s_mp_mul has been added to etc/test.c.

Those advantages depend on compiler and/or architectur, of course, but the changes should not make it worse.

diff_original_vs_new_s_mp_sqr_vs_s_mp_mul

@czurnieden
Copy link
Contributor Author

I'm trying to clean up my PRs (quite a mess), so:
Is it useful or can it go?

@sjaeckel sjaeckel force-pushed the optimize_s_mp_sqr branch from 77d2920 to b7d9057 Compare April 5, 2023 07:55
Copy link
Member

@sjaeckel sjaeckel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reminding me of those PR's!

I rebased after merging #502 and fixed the merge conflict ... Does that merge make sense? Both tests test_s_mp_sqr() and test_s_mp_sqr_comba() do something slightly different, or should there only be one?

@czurnieden
Copy link
Contributor Author

I rebased after merging #502 and fixed the merge conflict ... Does that merge make sense?

As far as I can see: yes.

Both tests test_s_mp_sqr() and test_s_mp_sqr_comba() do something slightly different, or should there only be one?

The outer loops are different, that is intentional but test_s_mp_sqr_comba() only works if Comba is switched off at compile time. Add a round to the CI-tests?

And, no, I don't have the faintest idea anymore why I added that small inner loop in test_s_mp_sqr_comba(). The known edge case is the main point, one test at a random point for the unknown edge cases should suffice.

@sjaeckel
Copy link
Member

sjaeckel commented Apr 5, 2023

Add a round to the CI-tests?

And, no, I don't have the faintest idea anymore why I added that small inner loop in test_s_mp_sqr_comba(). The known edge case is the main point, one test at a random point for the unknown edge cases should suffice.

you wanna update these tests or should we merge as is?

@czurnieden
Copy link
Contributor Author

you wanna update these tests or should we merge as is?

Merge as it is, because I know me and if I start fiddling around I will find this…and that…and those…and at the end a week is gone ;-)

@sjaeckel sjaeckel merged commit 3f10a28 into libtom:develop Apr 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants