Incorrect code generation for nalgebra's Matrix::swap_rows() #54462

Closed
HadrienG2 opened this Issue Sep 22, 2018 · 4 comments

HadrienG2 commented Sep 22, 2018

The nalgebra linear algebra library has a swap_rows method which allows the user to swap two rows of a matrix. Unfortunately, I'm currently investigating a code generation heisenbug which causes this method to corrupt the matrix data in some circumstances.

Given the UB-like symptoms, and the fact that the implementation of swap_rows takes multiple (non-overlapping) &mut references to the target matrix, I wondered if this could be a violation of Rust's aliasing rules. However, @nagisa confirmed that this is not the case, and that the compiler is probably the culprit here. He identified the recent upgrade from LLVM 6 to LLVM 7 as a cause of this issue (but that later turned out to be incorrect).
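
To illustrate why non-overlapping &mut references are fine under the aliasing rules, here is a minimal sketch in safe Rust. It is not nalgebra's implementation; `swap_disjoint` is a hypothetical helper name introduced only for this example. It obtains two mutable references into the same buffer through `split_at_mut`, which guarantees that they do not overlap.

```rust
/// Hypothetical helper (NOT nalgebra's code): swaps two elements of a slice
/// through two `&mut` references that are alive at the same time.
fn swap_disjoint<T>(data: &mut [T], i: usize, j: usize) {
    if i == j {
        return; // nothing to swap, and we could not split around one index
    }
    let (lo, hi) = if i < j { (i, j) } else { (j, i) };
    // `split_at_mut` returns two slices that cannot overlap, so taking one
    // element from each yields two non-aliasing `&mut T`.
    let (head, tail) = data.split_at_mut(hi);
    std::mem::swap(&mut head[lo], &mut tail[0]);
}

fn main() {
    let mut v = [1.0f32, 2.0, 3.0, 4.0];
    swap_disjoint(&mut v, 0, 3);
    assert_eq!(v, [4.0, 2.0, 3.0, 1.0]);
}
```

Since this uses no unsafe code, the borrow checker itself verifies that the two references are disjoint.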

Here is a minimal reproducer of my problem:

extern crate nalgebra;

use nalgebra::Matrix3x4;

fn swappy() -> Matrix3x4<f32> {
    let mut mat = Matrix3x4::new(1., 2.,  3.,  4.,
                                 5., 6.,  7.,  8.,
                                 9., 10., 11., 12.);

    // NOTE: This printf makes the bug go away, suggesting UB or a codegen issue
    // println!("Result: {}", mat);

    for i in 0..2 {
        for j in i+1..3 {
            if mat[(j, 3)] > mat[(i, 3)] { mat.swap_rows(i, j); }
        }
    }

    mat
}

fn main() {
    let mat = swappy();
    println!("Result: {}", mat);
}

To reproduce the issue, you must build in release mode. The issue is also sensitive to the number of codegen units in use, so I strongly recommend building with codegen-units=1 as well.

I expect the following output:

  ┌             ┐
  │  9 10 11 12 │
  │  5  6  7  8 │
  │  1  2  3  4 │
  └             ┘

Instead, on my systems (nalgebra 0.16.2, rust 1.29, Ivy Bridge & Haswell CPUs) I get the following output:

  ┌             ┐
  │  9 10 11 12 │
  │  5  6  7  8 │
  │  1  6  7  4 │
  └             ┘

Contributor

nagisa commented Sep 22, 2018

The LLVM upgrade was misidentified as the cause (for some reason, an early 1.29 nightly did report correct results for me at some point).

The code works with -Zmutable-noalias=no. So does the 2018-05-15 nightly (which predates the landing of the noalias PR), but 2018-06-01 starts failing (without the noalias flag). The nightlies between these dates had miscellaneous type-system bugs that prevented nalgebra from building.

There may still be UB somewhere in the code, but given our bad experience with noalias before, I do not discount LLVM being at fault either.

Contributor

nagisa commented Sep 26, 2018

Minimised test case with no unsafe code (make sure to compile with 1 codegen unit!):
fn linidx(row: usize, col: usize) -> usize {
    row * 1 + col * 3
}

fn swappy() -> [f32; 12] {
    let mut mat = [1.0f32, 5.0, 9.0, 2.0, 6.0, 10.0, 3.0, 7.0, 11.0, 4.0, 8.0, 12.0];

    for i in 0..2 {
        for j in i+1..3 {
            if mat[linidx(j, 3)] > mat[linidx(i, 3)] {
                for k in 0..4 {
                    let (x, rest) = mat.split_at_mut(linidx(i, k) + 1);
                    let a = x.last_mut().unwrap();
                    let b = rest.get_mut(linidx(j, k) - linidx(i, k) - 1).unwrap();
                    ::std::mem::swap(a, b);
                }
            }
        }
    }

    mat
}

fn main() {
    let mat = swappy();
    assert_eq!([9.0, 5.0, 1.0, 10.0, 6.0, 2.0, 11.0, 7.0, 3.0, 12.0, 8.0, 4.0], mat);
}

Output

thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `[9.0, 5.0, 1.0, 10.0, 6.0, 2.0, 11.0, 7.0, 3.0, 12.0, 8.0, 4.0]`,
 right: `[9.0, 5.0, 1.0, 10.0, 6.0, 6.0, 11.0, 7.0, 7.0, 12.0, 8.0, 4.0]`', src/main.rs:43:5

To compile

rustc src/main.rs -Ccodegen-units=1 -O -Zmutable-noalias=yes

Setting -Zmutable-noalias=no makes it work fine.
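
For comparison, here is a sketch of the same row-sorting logic written with index-based `slice::swap`, so that this code never holds two &mut references into the array at once. The name `swappy_by_index` is hypothetical and not part of the thread. It is only meant to show the result the test case above should produce; whether it sidesteps the miscompilation was not tested here.

```rust
/// Column-major linear index for a matrix with 3 rows (the same mapping as
/// `linidx` in the test case above).
fn linidx(row: usize, col: usize) -> usize {
    row + col * 3
}

/// Hypothetical reference version: orders the rows by descending last column,
/// swapping elements by index instead of through two `&mut` references.
fn swappy_by_index() -> [f32; 12] {
    let mut mat = [1.0f32, 5.0, 9.0, 2.0, 6.0, 10.0, 3.0, 7.0, 11.0, 4.0, 8.0, 12.0];

    for i in 0..2 {
        for j in i + 1..3 {
            if mat[linidx(j, 3)] > mat[linidx(i, 3)] {
                // Swap rows i and j one column at a time.
                for k in 0..4 {
                    mat.swap(linidx(i, k), linidx(j, k));
                }
            }
        }
    }

    mat
}

fn main() {
    let expected = [9.0, 5.0, 1.0, 10.0, 6.0, 2.0, 11.0, 7.0, 3.0, 12.0, 8.0, 4.0];
    assert_eq!(expected, swappy_by_index());
}
```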

Contributor

nagisa commented Sep 26, 2018

cc @rust-lang/compiler this is probably a soundness issue. I’m not sure if it can be exploited to do bad things to memory, but I marked it as such to be safe.

Contributor

nagisa commented Sep 27, 2018

Discussed in T-compiler meeting. I will prepare a patch for at least master and beta changing the default to -Zmutable-noalias=no. Might also make a backport into stable depending on what T-core decides.

After that I’ll keep looking into the underlying issue to see if it can be easily fixed within LLVM.

nagisa self-assigned this Sep 27, 2018

nagisa removed the I-nominated label Sep 27, 2018

nagisa added a commit to nagisa/rust that referenced this issue Sep 28, 2018

Do not put noalias annotations by default
This will be re-enabled sooner or later depending on results of further
investigation.

Fixes #54462

kennytm added a commit to kennytm/rust that referenced this issue Sep 29, 2018

Rollup merge of #54639 - nagisa:lets-alias-for-now, r=eddyb
Do not put noalias annotations by default

This will be re-enabled sooner or later depending on results of further
investigation.

Fixes rust-lang#54462

Beta backport is: rust-lang#54640

r? @nikomatsakis

nagisa added a commit to nagisa/rust that referenced this issue Sep 29, 2018

Do not put noalias annotations by default
This will be re-enabled sooner or later depending on results of further
investigation.

Fixes #54462

bors added a commit that referenced this issue Sep 30, 2018

Auto merge of #54639 - nagisa:lets-alias-for-now, r=eddyb
Do not put noalias annotations by default

This will be re-enabled sooner or later depending on results of further
investigation.

Fixes #54462

Beta backport is: #54640

r? @nikomatsakis

bors closed this in #54639 Sep 30, 2018

Aaronepower added a commit to Aaronepower/rust that referenced this issue Sep 30, 2018

Do not put noalias annotations by default
This will be re-enabled sooner or later depending on results of further
investigation.

Fixes #54462

pietroalbini added a commit to pietroalbini/rust that referenced this issue Oct 4, 2018

Do not put noalias annotations by default
This will be re-enabled sooner or later depending on results of further
investigation.

Fixes #54462