SIMD spectralnorm #77

Open
wants to merge 1 commit into
base: master

Conversation

llogiq
Contributor

@llogiq llogiq commented Sep 11, 2018

This replaces the autovectorized code with explicit SIMD instructions and inlines the former div_and_add function. On my machine the difference is negligible, but I suspect that Skylake has better branch prediction than the old Core 2, so here goes nothing...
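As a rough illustration of what "explicit SIMD with the divide-and-add step inlined" can look like, here is a minimal sketch. It is not the code from this PR: the names `denom`, `row_dot_scalar` and `row_dot_simd` are hypothetical, and it assumes the SSE2 intrinsics from `std::arch::x86_64`, with a scalar fallback on other architectures.

```rust
/// Denominator of the spectral-norm matrix entry for row `i`, column `j`
/// (same formula as the PR's `a`, for a single lane).
fn denom(i: usize, j: usize) -> f64 {
    ((i + j) * (i + j + 1) / 2 + i + 1) as f64
}

/// Scalar reference: sum over j of v[j] / denom(i, j).
fn row_dot_scalar(i: usize, v: &[f64]) -> f64 {
    v.iter().enumerate().map(|(j, &x)| x / denom(i, j)).sum::<f64>()
}

/// Hypothetical explicit-SIMD variant: two columns per iteration, with the
/// divide and the accumulate written inline instead of in a helper function.
#[cfg(target_arch = "x86_64")]
fn row_dot_simd(i: usize, v: &[f64]) -> f64 {
    use std::arch::x86_64::{_mm_add_pd, _mm_div_pd, _mm_set_pd, _mm_setzero_pd, _mm_storeu_pd};

    let mut lanes = [0.0f64; 2];
    // SAFETY: SSE2 is part of the x86_64 baseline, and the store writes
    // exactly two f64 values into `lanes`.
    unsafe {
        let mut acc = _mm_setzero_pd();
        for p in 0..v.len() / 2 {
            let j = 2 * p;
            let x = _mm_set_pd(v[j + 1], v[j]);
            let d = _mm_set_pd(denom(i, j + 1), denom(i, j));
            // Inlined "div and add": acc += x / d, lane-wise.
            acc = _mm_add_pd(acc, _mm_div_pd(x, d));
        }
        _mm_storeu_pd(lanes.as_mut_ptr(), acc);
    }
    let mut sum = lanes[0] + lanes[1];
    // Odd-length input: handle the last column without SIMD.
    if v.len() % 2 == 1 {
        let j = v.len() - 1;
        sum += v[j] / denom(i, j);
    }
    sum
}

/// Fallback so the sketch still compiles on non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
fn row_dot_simd(i: usize, v: &[f64]) -> f64 {
    row_dot_scalar(i, v)
}
```

The two variants add the terms in a different order, so their results should agree only up to floating-point rounding, not bit for bit.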

@@ -1,6 +1,6 @@
SOURCES = $(wildcard src/*.rs)
RUSTC ?= rustc
-RUSTC_FLAGS ?= -C opt-level=3 -C target-cpu=core2 -C lto
+RUSTC_FLAGS ?= -C panic=abort -C opt-level=3 -C target-cpu=core2 -C lto
Owner

Are you going to ask for this modification?

}
fn a(i: [usize; 2], j: [usize; 2]) -> F64x2 {
    F64x2::new(((i[0] + j[0]) * (i[0] + j[0] + 1) / 2 + i[0] + 1) as f64,
               ((i[1] + j[1]) * (i[1] + j[1] + 1) / 2 + i[1] + 1) as f64)
Owner

No SIMD here?

Owner

@TeXitoi TeXitoi left a comment

If you don't want to SIMDify fn a(), that's fine with me.

@llogiq
Contributor Author

llogiq commented Sep 11, 2018

Ah, I completely overlooked this.
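On the question of SIMDifying `a`: one possible direction, sketched here under assumptions and not taken from the PR, is to keep the indices as `f64` lanes so that every step is a lane-wise add or multiply. The names `denom_int` and `denom_lanes` are hypothetical, and a plain `[f64; 2]` stands in for whatever vector type the real code uses. The sketch assumes the indices stay far below 2^53, so all intermediate values are exactly representable.

```rust
/// Integer formula from the PR's `a`, for a single lane.
fn denom_int(i: usize, j: usize) -> f64 {
    ((i + j) * (i + j + 1) / 2 + i + 1) as f64
}

/// Hypothetical float-only variant for two lanes at once.
fn denom_lanes(i: [f64; 2], j: [f64; 2]) -> [f64; 2] {
    let mut out = [0.0f64; 2];
    for lane in 0..2 {
        let s = i[lane] + j[lane];
        // s * (s + 1) is always even, so multiplying by 0.5 gives the same
        // value as the integer division by 2 (for integer-valued inputs).
        out[lane] = s * (s + 1.0) * 0.5 + i[lane] + 1.0;
    }
    out
}
```

For small integer-valued inputs, `denom_lanes([i, i + 1], [j, j + 1])` should match `denom_int(i, j)` and `denom_int(i + 1, j + 1)` exactly. Whether this is faster than the integer version would have to be measured.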
