ReduceMod optimization #75

Merged
GelilaSeifu merged 18 commits into main from gseifu/opt-ReduceMod
Oct 9, 2021

Conversation

GelilaSeifu
Contributor

@GelilaSeifu GelilaSeifu commented Oct 8, 2021

Optimizes Barrett reduction for the cases output_mod_factor=2 and BitShift=52/64. Details: https://jiratest.idoc.intel.com/browse/GLADE-12
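
For readers unfamiliar with the terms, here is a minimal scalar sketch of 64-bit Barrett reduction. It is not the code from this PR: `BarrettFactor64` and `BarrettReduceSketch` are hypothetical names, and it assumes that an output mod factor of 2 means the result may stay in [0, 2q), so the final conditional subtraction can be skipped. The shift of 64 is assumed to correspond to the BitShift=64 case; the 52-bit (IFMA) variant is not shown.

```cpp
#include <cassert>
#include <cstdint>

// unsigned __int128 is a GCC/Clang extension; MSVC would need _umul128 instead.
__extension__ typedef unsigned __int128 uint128_t;

// Hypothetical helper: precomputes floor(2^64 / q). Requires 1 < q < 2^63.
inline uint64_t BarrettFactor64(uint64_t q) {
  assert(q > 1 && q < (uint64_t{1} << 63));
  return static_cast<uint64_t>((static_cast<uint128_t>(1) << 64) / q);
}

// Hypothetical helper: reduces x into [0, OutputModFactor * q).
// Assumption: OutputModFactor == 2 leaves the lazily reduced value in [0, 2q).
template <int OutputModFactor>
inline uint64_t BarrettReduceSketch(uint64_t x, uint64_t q, uint64_t q_barr) {
  static_assert(OutputModFactor == 1 || OutputModFactor == 2,
                "sketch supports output mod factors 1 and 2 only");
  // Quotient estimate: high 64 bits of x * floor(2^64 / q).
  // It is either floor(x / q) or one less, never more.
  uint64_t q_hat =
      static_cast<uint64_t>((static_cast<uint128_t>(x) * q_barr) >> 64);
  uint64_t r = x - q_hat * q;  // r is in [0, 2q)
  if (OutputModFactor == 1 && r >= q) {
    r -= q;  // final correction, needed only for a fully reduced result
  }
  return r;
}
```

With `OutputModFactor == 2` the comparison and subtraction disappear, which is where a saving would come from when the caller tolerates values in [0, 2q).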

Contributor

@fboemer fboemer left a comment

Whoops, just noticed this is a draft. But if the performance looks good, this looks like a nice solution.

Review thread on hexl/include/hexl/util/msvc.hpp (outdated, resolved)
Review thread on hexl/util/avx512-util.hpp (outdated, resolved)
Review thread on hexl/util/avx512-util.hpp (outdated, resolved)
@GelilaSeifu GelilaSeifu marked this pull request as ready for review October 8, 2021 22:39
@GelilaSeifu GelilaSeifu requested a review from a team as a code owner October 8, 2021 22:39
Contributor

@fboemer fboemer left a comment

Minor feedback; otherwise, LGTM.

Review thread on test/test-avx512-util.cpp (outdated, resolved)
@GelilaSeifu GelilaSeifu merged commit 38e1074 into main Oct 9, 2021
@GelilaSeifu GelilaSeifu deleted the gseifu/opt-ReduceMod branch October 9, 2021 01:52
fboemer pushed a commit that referenced this pull request Nov 8, 2021
* optimize modular reduction

* remove MinTime

* update condition

* add unit test

* test algorithm 2 barrett reduction

* use different optimization for reduce mod based on input size

* add more test and benchmark

* update include

* fix formatting

* fix formatting

* rename arg, better perf

* fix windows build

* fix IFMA52 unit-tests

* fix debug benchmark

* avoid unused error

* update modulo

* update test

* simplify test
3 participants