Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Special case repeating fieldless enum variants #104452

Closed
wants to merge 1 commit into from

Conversation

clubby789
Copy link
Contributor

Another approach to #104384

Add another field to Rvalue::Repeat specifying that the item being repeated is a fieldless enum variant so that it can be transformed to a memset.

r? @scottmcm

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 15, 2022
@rustbot
Copy link
Collaborator

rustbot commented Nov 15, 2022

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

This PR changes MIR

cc @oli-obk, @RalfJung, @JakobDegen, @davidtwco, @celinval, @vakaras

/// Corresponds to source code like `[x; 32]`.
///
/// [#74836]: https://github.com/rust-lang/rust/issues/74836
Repeat(Operand<'tcx>, ty::Const<'tcx>),
Repeat(Operand<'tcx>, ty::Const<'tcx>, bool),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels weird to need this specific property on RValue? It feels like it could be gotten from looking at the type of the operand when needed. It doesn't seem like it ought to be part of MIR syntax.

(Also, primitives in tuple-variants are often a bad idea, since there's no type hint for what it means.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put the property into the rvalue since the actual variant being written is lost at the point I use the information - we only have a value loaded from the generated temporary (i.e. (i8: %9 = zext i1 %5 to i8))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is known at MIR building but not later, one possibility might be to add a NonDivergingIntrinsic::MemSet that you emit instead of the Repeat sometimes.

(That's still an addition to the language that would need a spec, so it's still better to avoid if possible, but it's much lower impact than changing a top-level statement.)

@scottmcm
Copy link
Member

I'm not a compiler reviewer, so I'm going to reroll for someone with more context on the way things should be done.
r? compiler

@rustbot rustbot assigned fee1-dead and unassigned scottmcm Nov 15, 2022
@@ -1004,10 +1004,13 @@ pub enum Rvalue<'tcx> {
/// This is the cause of a bug in the case where the repetition count is zero because the value
/// is not dropped, see [#74836].
///
/// The third param is true if the value we're repeating is an enum variant
/// with no fields so we can emit a memset.
Copy link
Member

@RalfJung RalfJung Nov 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what is the behavior of Repeat if that thing is true but the value is not such an enum variant? Is it UB? And what about when the thing is false but it is such an enum variant?

I agree with @scottmcm, encoding this in the syntax of MIR seems wrong. MIR is a programming language and you wouldn't usually expect the syntax of a language to encode redundant information like this.

(Also unnamed booloeans are pretty bad, so if we stick with this design, this should be turned into a struct variant where the fields have names.)

@@ -207,7 +207,7 @@ impl<'mir, 'tcx: 'mir, M: Machine<'mir, 'tcx>> InterpCx<'mir, 'tcx, M> {
}
}

Repeat(ref operand, _) => {
Repeat(ref operand, _, _) => {
Copy link
Member

@RalfJung RalfJung Nov 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is possible UB due to that boolean being wrong, then the interpreter must be adjusted to be able to detect that UB.

There must never be a bit of valid MIR (in the sense of MIR that passes the MIR checker) where the interpreter and codegen behave differently.

@RalfJung
Copy link
Member

RalfJung commented Nov 15, 2022

You probably want to have a chat with @rust-lang/wg-mir-opt before changing MIR syntax. I seriously hope there is a better way to do this. I don't have the time to figure out a good design here, but MIR syntax extensions need to be done carefully, and with a proper update to the interpreter that describes all aspects of their semantics, including the UB.

@scottmcm scottmcm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 15, 2022
@JakobDegen
Copy link
Contributor

I don't think this is the right strategy - MIR has all the information that you need to do this right. [None; 64] should turn into an Rvalue::Repeat with two Operand::Consts (if it doesn't we need to fix const prop). Codegen can then do all the special casing of repeats with constant values that it wants, no need to touch MIR

@rust-log-analyzer
Copy link
Collaborator

The job mingw-check failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
    Checking rustc_codegen_cranelift v0.1.0 (/checkout/compiler/rustc_codegen_cranelift)
error[E0023]: this pattern has 2 fields, but the corresponding tuple variant has 3 fields
    --> src/base.rs:720:32
     |
720  |                 Rvalue::Repeat(ref operand, times) => {
     |
    ::: /checkout/compiler/rustc_middle/src/mir/syntax.rs:1013:12
     |
     |
1013 |     Repeat(Operand<'tcx>, ty::Const<'tcx>, bool),
     |
help: use `_` to explicitly ignore each field
     |
     |
720  |                 Rvalue::Repeat(ref operand, times, _) => {

For more information about this error, try `rustc --explain E0023`.
error: could not compile `rustc_codegen_cranelift` due to previous error
Build completed unsuccessfully in 0:03:45

@clubby789
Copy link
Contributor Author

clubby789 commented Nov 16, 2022

[None; 64] should turn into an Rvalue::Repeat with two Operand::Consts

Category::Constant | Category::Place | Category::Rvalue(..) => {
let operand = unpack!(block = this.as_temp(block, scope, expr, Mutability::Mut));
if this.local_decls[operand].local_info.is_none() {
this.local_decls[operand].local_info = local_info;
}
block.and(Operand::Move(Place::from(operand)))

None is an Rvalue at this point, so it gets made into a temporary and the Place containing the temporary is instead passed as an operand. In the case of

pub fn poc() -> [u8; 64] { [0; 64] }

this works as expected, and the element to repeat is a Const.

Both of these also cause the option to be a const

#![feature(inline_const)]
pub fn poc() -> [Option<u8>; 64] { [ const { None }; 64] }
const N: Option<u8> = None;
pub fn poc2() -> [Option<u8>; 64] { [ N; 64] }

EDIT: Some further testing with the ConstProp pass shows that dataless enums (or enums with ZST fields) are constified fine -

#[derive(Copy, Clone)]
enum T { A, B }
fn rep() -> [T; 64] {
    [T::A; 64]
}
  fn rep() -> [T; 64] {
      let mut _0: [T; 64];                 // return place in scope 0 at $DIR/enum_repeat.rs:+0:13: +0:20
      let mut _1: T;                       // in scope 0 at $DIR/enum_repeat.rs:+1:6: +1:10
  
      bb0: {
          StorageLive(_1);                 // scope 0 at $DIR/enum_repeat.rs:+1:6: +1:10
          Deinit(_1);                      // scope 0 at $DIR/enum_repeat.rs:+1:6: +1:10
          discriminant(_1) = 0;            // scope 0 at $DIR/enum_repeat.rs:+1:6: +1:10
-         _0 = [move _1; 64];              // scope 0 at $DIR/enum_repeat.rs:+1:5: +1:15
+         _0 = [const T::A; 64];           // scope 0 at $DIR/enum_repeat.rs:+1:5: +1:15
+                                          // mir::Constant
+                                          // + span: $DIR/enum_repeat.rs:10:5: 10:15
+                                          // + literal: Const { ty: T, val: Value(Scalar(0x00)) }
          StorageDead(_1);                 // scope 0 at $DIR/enum_repeat.rs:+1:14: +1:15
          return;                          // scope 0 at $DIR/enum_repeat.rs:+2:2: +2:2
      }
  }

Which then eliminates _1. However, when the enum has data (even unused)

#[derive(Copy, Clone)]
enum T { A, B(usize) }
fn rep() -> [T; 64] {
    [T::A; 64]
}

the pass does not kick in

@JakobDegen
Copy link
Contributor

Oh, yeah, this is due to an annoying limitation of current const prop. This can and should be fixed in the future (dataflow const prop would need support for non-scalars). It seems reasonable to wait for that though

@bors
Copy link
Contributor

bors commented Nov 17, 2022

☔ The latest upstream changes (presumably #103138) made this pull request unmergeable. Please resolve the merge conflicts.

@clubby789
Copy link
Contributor Author

I'll close this in that case, seems like this is not the right strategy

@clubby789 clubby789 closed this Nov 17, 2022
@clubby789 clubby789 deleted the another-enum-array branch February 11, 2023 14:45
@clubby789 clubby789 restored the another-enum-array branch February 11, 2023 14:45
@clubby789 clubby789 deleted the another-enum-array branch February 11, 2023 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants