Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed optimization: RVO isn't applied #116541

Open
gootorov opened this issue Oct 8, 2023 · 10 comments
Open

Missed optimization: RVO isn't applied #116541

gootorov opened this issue Oct 8, 2023 · 10 comments
Labels
A-codegen Area: Code generation A-mir-opt Area: MIR optimizations A-mir-opt-nrvo Fixed by NRVO C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@gootorov
Copy link
Contributor

gootorov commented Oct 8, 2023

Consider the code below. At the first glace, it looks like RVO/destination propagation isn't applied, resulting in a stack overflow.

File structure

tree .

.
├── bin
│   └── rvo.rs
├── Cargo.toml
├── src
│   └── lib.rs

Code

Cargo.toml

[package]
name = "rvo"
version = "0.1.0"
edition = "2021"

[[bin]]
name = "rvo"
path = "bin/rvo.rs"

[dependencies]

[profile.release]
lto = "fat"

bin/rvo.rs

use rvo::Outer;

fn main() {
    let _foo = Box::new(Outer::new());
}

src/lib.rs

#![allow(unused)]

const SIZE: usize = 1024 * 1024 * 1024;

pub struct Outer {
    a: LargeStruct,
}

impl Outer {
    #[inline(always)]
    pub fn new() -> Self {
        let a = LargeStruct::new();

        Self { a }
    }
}

pub struct LargeStruct {
    d0: Option<Box<u128>>,
    d1: Option<Box<u128>>,
    d2: Option<Box<u128>>,
    d3: Option<Box<u128>>,
    d4: Option<Box<u128>>,
    huge: [u8; SIZE],
}

impl LargeStruct {
    #[inline(always)]
    pub fn new() -> Self {
        Self {
            d0: None,
            d1: None,
            d2: None,
            d3: None,
            d4: None,
            huge: [0; SIZE],
        }
    }
}

Running cargo run --release results in:

    Finished release [optimized] target(s) in 0.01s
     Running `target/release/rvo`

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)

Compiler version:
rustc --version --verbose

rustc 1.73.0 (cc66ad468 2023-10-03)
binary: rustc
commit-hash: cc66ad468955717ab92600c770da8c1601a4ff33
commit-date: 2023-10-03
host: x86_64-unknown-linux-gnu
release: 1.73.0
LLVM version: 17.0.2
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Oct 8, 2023
@gootorov
Copy link
Contributor Author

gootorov commented Oct 8, 2023

Interestingly, the stack no longer overflows when just a few fields are removed:

diff --git a/src/lib.rs b/src/lib.rs
index d84519e..3d43db8 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -18,9 +18,6 @@ impl Outer {
 pub struct LargeStruct {
     d0: Option<Box<u128>>,
     d1: Option<Box<u128>>,
-    d2: Option<Box<u128>>,
-    d3: Option<Box<u128>>,
-    d4: Option<Box<u128>>,
     huge: [u8; SIZE],
 }
 
@@ -30,9 +27,6 @@ impl LargeStruct {
         Self {
             d0: None,
             d1: None,
-            d2: None,
-            d3: None,
-            d4: None,
             huge: [0; SIZE],
         }
     }

@saethlin
Copy link
Member

saethlin commented Oct 8, 2023

At the first glace, it looks like RVO/destination propagation isn't applied

Both these MIR optimizations are off by default, and the NRVO MIR opt is unsound.

@workingjubilee workingjubilee added A-mir-opt Area: MIR optimizations A-mir-opt-nrvo Fixed by NRVO A-codegen Area: Code generation T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-opsem Relevant to the opsem team C-enhancement Category: An issue proposing an enhancement or a PR with one. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Oct 8, 2023
@gootorov
Copy link
Contributor Author

gootorov commented Oct 8, 2023

Both these MIR optimizations are off by default, and the NRVO MIR opt is unsound.

Interesting, thank you. I was always under the impression that optimizations of the Box::new(returns_large_object()) type are RVO. Or is it (currently) just LLVM's RVO and not MIR?

Also, is there a way to turn on these MIR optimizations via a compiler flag/crate attribute?

@workingjubilee
Copy link
Member

Yes, currently LLVM performs the relevant RVO (or doesn't).
You can use -Zmir-enable-passes=+DestinationPropagation,+RenameReturnPlace.

Alternately, I believe most unsound opts are enabled at -Zmir-opt-level=4, but that can cause other hilarious bugs.

@workingjubilee workingjubilee added C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such and removed C-enhancement Category: An issue proposing an enhancement or a PR with one. labels Oct 8, 2023
@saethlin
Copy link
Member

saethlin commented Oct 8, 2023

Unsound optimizations must be enabled by their own flag, -Zunsound-mir-opts. Unfortunately perhaps, -Zmir-opt-level=4 is used in the ecosystem.

It's possible your request here will be fulfilled by #116531.

@gootorov
Copy link
Contributor Author

gootorov commented Oct 8, 2023

Unfortunately, if I did everything correctly, #116531 doesn't help in this case :(

I've checked-out #116531 and built stage2. Then ran the binary again:

RUSTFLAGS="-Zmir-opt-level=4 --emit=mir" cargo +stage2 run --release

Still overflows.

This is the emitted MIR:

mir
// WARNING: This output format is intended for human consumers only
// and is subject to change without notice. Knock yourself out.
const SIZE: usize = {
  let mut _0: usize;
  let mut _1: usize;
  let mut _2: (usize, bool);
  let mut _3: (usize, bool);

  bb0: {
      StorageLive(_1);
      _2 = CheckedMul(const 1024_usize, const 1024_usize);
      assert(!move (_2.1: bool), "attempt to compute `{} * {}`, which would overflow", const 1024_usize, const 1024_usize) -> [success: bb1, unwind continue];
  }

  bb1: {
      _1 = move (_2.0: usize);
      _3 = CheckedMul(_1, const 1024_usize);
      assert(!move (_3.1: bool), "attempt to compute `{} * {}`, which would overflow", move _1, const 1024_usize) -> [success: bb2, unwind continue];
  }

  bb2: {
      _0 = move (_3.0: usize);
      StorageDead(_1);
      return;
  }
}

fn <impl at src/lib.rs:9:1: 9:11>::new() -> Outer {
  let mut _0: Outer;
  let _1: LargeStruct;
  scope 1 {
      debug a => _1;
  }
  scope 2 (inlined LargeStruct::new) {
      let mut _2: [u8; 1073741824];
  }

  bb0: {
      StorageLive(_2);
      _2 = [const 0_u8; 1073741824];
      _1 = LargeStruct { d0: const Option::<Box<u128>>::None, d1: const Option::<Box<u128>>::None, d2: const Option::<Box<u128>>::None, d3: const Option::<Box<u128>>::None, d4: const Option::<Box<u128>>::None, huge: move _2 };
      StorageDead(_2);
      _0 = Outer { a: move _1 };
      return;
  }
}

LargeStruct::huge::{constant#0}: usize = {
  let mut _0: usize;

  bb0: {
      _0 = const _;
      return;
  }
}

fn <impl at src/lib.rs:27:1: 27:17>::new() -> LargeStruct {
  let mut _0: LargeStruct;
  let mut _1: [u8; 1073741824];

  bb0: {
      StorageLive(_1);
      _1 = [const 0_u8; 1073741824];
      _0 = LargeStruct { d0: const Option::<Box<u128>>::None, d1: const Option::<Box<u128>>::None, d2: const Option::<Box<u128>>::None, d3: const Option::<Box<u128>>::None, d4: const Option::<Box<u128>>::None, huge: move _1 };
      StorageDead(_1);
      return;
  }
}

<impl at src/lib.rs:27:1: 27:17>::new::{constant#0}: usize = {
  let mut _0: usize;

  bb0: {
      _0 = const _;
      return;
  }
}
// WARNING: This output format is intended for human consumers only
// and is subject to change without notice. Knock yourself out.
fn main() -> () {
  let mut _0: ();
  let _1: std::boxed::Box<rvo::Outer>;
  let mut _2: rvo::Outer;
  scope 1 {
      debug _foo => _1;
  }
  scope 2 (inlined Outer::new) {
      let _3: rvo::LargeStruct;
      scope 3 {
          debug a => _3;
      }
      scope 4 (inlined LargeStruct::new) {
          let mut _4: [u8; 1073741824];
      }
  }

  bb0: {
      StorageLive(_1);
      StorageLive(_2);
      StorageLive(_3);
      StorageLive(_4);
      _4 = [const 0_u8; 1073741824];
      _3 = LargeStruct { d0: const Option::<Box<u128>>::None, d1: const Option::<Box<u128>>::None, d2: const Option::<Box<u128>>::None, d3: const Option::<Box<u128>>::None, d4: const Option::<Box<u128>>::None, huge: move _4 };
      StorageDead(_4);
      _2 = Outer { a: move _3 };
      StorageDead(_3);
      _1 = Box::<Outer>::new(move _2) -> [return: bb1, unwind continue];
  }

  bb1: {
      StorageDead(_2);
      drop(_1) -> [return: bb2, unwind continue];
  }

  bb2: {
      StorageDead(_1);
      return;
  }
}

Also, just in case, I tried perhaps less aggressive

RUSTFLAGS="-Zmir-enable-passes=+DestinationPropagation,+RenameReturnPlace --emit=mir" cargo +stage2 run --release

and there were no difference in the emitted MIR

@workingjubilee
Copy link
Member

Did you use -Zunsound-mir-opts as well?

@gootorov
Copy link
Contributor Author

gootorov commented Oct 9, 2023

I thought that wasn't necessary judging by the diff in #116531, but to be sure, I've just tried these two variants:

RUSTFLAGS="-Zunsound-mir-opts -Zmir-enable-passes=+DestinationPropagation,+RenameReturnPlace --emit=mir" cargo +stage2 run --release
RUSTFLAGS="-Zunsound-mir-opts -Zmir-opt-level=4 --emit=mir" cargo +stage2 run --release

Unfortunately, same outcome, and the emitted MIR is also the same as before.

@gootorov
Copy link
Contributor Author

gootorov commented Oct 9, 2023

Actually, I think I'm doing something wrong. Just tried comparing this MIR to stable MIR (RUSTFLAGS="--emit=mir" cargo run --release) and there also isn't any difference

@saethlin
Copy link
Member

saethlin commented Oct 9, 2023

--release is about equivalent to -Zmir-opt-level=2. You're doing things fine, the MIR optimization as-written simply doesn't help for this specific case.

@RalfJung RalfJung removed the T-opsem Relevant to the opsem team label May 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-mir-opt Area: MIR optimizations A-mir-opt-nrvo Fixed by NRVO C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants