Improve spectralnorm #9

Closed
TeXitoi opened this Issue Aug 12, 2015 · 7 comments

@TeXitoi
Owner

TeXitoi commented Aug 12, 2015

The fn A() is not autovectorized, so SIMD is not used anywhere (as can be seen in the generated ASM). I think writing something like fn Ax2(i: u64x2, j: u64x2) -> f64x2 and using it wherever we use A() should do the trick.
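
(For reference, a hedged sketch of the scalar function under discussion, assuming the classic spectralnorm formulation where the matrix entry is 1 / ((i + j)(i + j + 1)/2 + i + 1) and A() returns the denominator; the actual name and signature in this repo may differ.)

fn a(i: u64, j: u64) -> f64 {
    // Denominator of the spectralnorm matrix entry at (i, j).
    ((i + j) * (i + j + 1) / 2 + i + 1) as f64
}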

@llogiq
Contributor

llogiq commented Sep 22, 2015

Could we use the simd crate? It's not tested on stable AFAIK, but it might work on beta, at least.

@TeXitoi
Owner

TeXitoi commented Sep 22, 2015

The simd crate is used in the rust repo. You can contribute here, but it will only be merged here once stable supports SIMD (or a couple of days before).

This issue is about writing the code so that LLVM optimizes it with SIMD without using the simd crate.

@llogiq
Contributor

llogiq commented Sep 22, 2015

Ah, I see. Do you mean something like the following?

// Two-lane integer and float vectors as plain tuple structs.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy)]
struct u64x2(u64, u64);
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy)]
struct f64x2(f64, f64);

impl std::ops::Add for u64x2 {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        u64x2(self.0 + rhs.0, self.1 + rhs.1)
    }
}
impl std::ops::Div for u64x2 {
    type Output = Self;
    fn div(self, rhs: Self) -> Self {
        u64x2(self.0 / rhs.0, self.1 / rhs.1)
    }
}

// Computes two denominators of the spectralnorm matrix entry at once.
#[allow(non_snake_case)]
fn Ax2(i: u64x2, j: u64x2) -> f64x2 {
    f64x2(((i.0 + j.0) * (i.0 + j.0 + 1) / 2 + i.0 + 1) as f64,
          ((i.1 + j.1) * (i.1 + j.1 + 1) / 2 + i.1 + 1) as f64)
}

I don't see any advantage to that in a minimal test setup (look at the assembly output). Maybe within a loop?
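
(As an illustration of "within a loop": a hedged sketch of a spectralnorm-style matrix-vector product using Ax2; mult_av2 and the slice-based signature are assumptions, not the repo's actual code.)

// Computes out = A * v, evaluating two entries of row `i` per iteration
// (assumes v.len() is even for brevity).
fn mult_av2(v: &[f64], out: &mut [f64]) {
    for (i, out_i) in out.iter_mut().enumerate() {
        let mut sum = 0.0;
        for jj in 0..v.len() / 2 {
            let j = 2 * jj;
            let d = Ax2(u64x2(i as u64, i as u64),
                        u64x2(j as u64, j as u64 + 1));
            sum += v[j] / d.0 + v[j + 1] / d.1;
        }
        *out_i = sum;
    }
}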


@llogiq
Contributor

llogiq commented Sep 22, 2015

Also, if I try to keep the operations within the x2 representations, the assembly stays the same: http://is.gd/65q7NN

@TeXitoi
Owner

TeXitoi commented Sep 22, 2015

It seems different in release mode.

@TeXitoi
Owner

TeXitoi commented Sep 22, 2015

And yes, I was thinking of

fn ax2(i: u64x2, j: u64x2) -> f64x2 {
    ((i + j) * (i + j + u64x2(1, 1)) / u64x2(2, 2) + i + u64x2(1, 1)).into()
}
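
(For this version to compile with the tuple-struct u64x2/f64x2 sketched earlier, it also needs lane-wise multiplication, a u64x2-to-f64x2 conversion for the .into(), and Copy on u64x2 so i and j can be reused within the expression; a hedged sketch:)

// Lane-wise multiplication for the (i + j) * (...) term.
impl std::ops::Mul for u64x2 {
    type Output = Self;
    fn mul(self, rhs: Self) -> Self {
        u64x2(self.0 * rhs.0, self.1 * rhs.1)
    }
}

// Conversion driving `.into()`: cast each integer lane to f64.
impl From<u64x2> for f64x2 {
    fn from(v: u64x2) -> Self {
        f64x2(v.0 as f64, v.1 as f64)
    }
}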

@llogiq
Contributor

llogiq commented Sep 22, 2015

Yeah, I just diffed the assembly versions. They are different, although I'm not sure if the vectorized version is actually faster. Guess there's only one way to find out... 😄
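
(One quick way to find out; a rough, self-contained timing sketch built on the definitions above, not a rigorous benchmark:)

use std::time::Instant;

// Sums 1/A over a block of indices with the two-lane Ax2 and prints the
// elapsed time; swapping in the scalar A() gives a baseline to compare.
fn main() {
    let n = 4_000u64;
    let start = Instant::now();
    let mut acc = 0.0;
    for i in 0..n {
        for jj in 0..n / 2 {
            let j = 2 * jj;
            let d = Ax2(u64x2(i, i), u64x2(j, j + 1));
            acc += 1.0 / d.0 + 1.0 / d.1;
        }
    }
    println!("sum = {}, elapsed = {:?}", acc, start.elapsed());
}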
