diff --git a/compiler/rustc_mir_transform/README.md b/compiler/rustc_mir_transform/README.md new file mode 100644 index 0000000000000..c42936665e86c --- /dev/null +++ b/compiler/rustc_mir_transform/README.md @@ -0,0 +1,102 @@ +# MIR Transform Passes + +This crate implements optimization passes that transform MIR (Mid-level Intermediate Representation) to improve code quality and performance. + +## Key Optimization Passes + +### Destination Propagation (`dest_prop.rs`) +Eliminates redundant copy assignments like `dest = src` by unifying the storage of `dest` and `src`. + +- **When to modify**: Adding new assignment patterns, improving liveness analysis +- **Key challenges**: Ensuring soundness with address-taken locals, handling storage lifetimes +- **Performance notes**: Past implementations had O(l² * s) complexity issues; the current version uses conflict matrices + +### Global Value Numbering (`gvn.rs`) +Detects and eliminates redundant computations by identifying values with the same symbolic representation. + +- **When to modify**: Adding new value types, improving constant evaluation +- **Key challenges**: Handling non-deterministic constants, pointer provenance +- **Performance notes**: Largest pass (~2000 lines); be careful about evaluation costs + +### Dataflow Constant Propagation (`dataflow_const_prop.rs`) +Propagates scalar constants through the program using dataflow analysis. + +- **When to modify**: Extending to non-scalar types, improving evaluation precision +- **Key challenges**: Limiting the number of tracked places to avoid compile-time explosions + +- **Performance notes**: Has BLOCK_LIMIT (100) and PLACE_LIMIT (100) guards + +### Inlining (`inline.rs`, `inline/`) +Replaces function calls with the body of the called function. + +- **When to modify**: Tuning heuristics, handling new calling conventions +- **Key challenges**: Avoiding cycles, cost estimation, handling generics +- **Performance notes**: Uses thresholds (30-100 depending on context) + +## Adding a New Pass + +1. 
Create your pass file in `src/` +2. Implement the `MirPass<'tcx>` trait: + - `is_enabled`: When the pass should run + - `run_pass`: The transformation logic + - `is_required`: Whether the pass must run even when MIR optimizations are disabled +3. Register in `lib.rs` within the `mir_opts!` macro +4. Add to the appropriate phase in `run_optimization_passes` +5. Add tests in `tests/mir-opt/` + +## Testing + +Run specific MIR opt tests: +```bash +./x.py test tests/mir-opt/dest_prop.rs +./x.py test tests/mir-opt/gvn.rs +./x.py test tests/mir-opt/dataflow-const-prop +./x.py test tests/mir-opt/inline +``` + +## Pass Ordering + +Passes are organized into phases (see `lib.rs`): +- Early passes (cleanup, simplification) +- Analysis-driven optimizations (inlining, const prop, GVN) +- Late passes (final cleanup, code size reduction) + +Order matters! For example: +- `SimplifyCfg` before `GVN` (cleaner CFG) +- `GVN` before `DeadStoreElimination` (more values identified) +- `SimplifyLocals` after most passes (remove unused locals) + +## Known Limitations and Issues + +### ConstParamHasTy and Drop Shim Builder (#127030) +Drop glue generation for types with const parameters has a param-env construction issue: +- **Problem**: `build_drop_shim` (in `shim.rs`) constructs its typing environment using the `drop_in_place` intrinsic's DefId, not the dropped type's DefId +- **Impact**: The param-env lacks `ConstArgHasType` predicates, causing panics when MIR generation needs const param types +- **Workaround**: Inlining of drop glue is disabled for types containing const params until they're fully monomorphized (see `inline.rs:746`) +- **Proper fix**: Requires synthesizing a typing environment that merges predicates from both drop_in_place and the dropped type + +This affects types like `struct Foo<const N: usize> { ... }` that have `Drop` implementations.
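As a minimal, self-contained sketch of the kind of type affected (all names here are invented for illustration): a struct whose drop glue mentions a const parameter cannot have its drop shim built until that parameter is substituted with a concrete value.

```rust
// Hypothetical example of a type hit by the #127030 limitation: the drop
// glue of `Buffer<N>` refers to the const parameter `N`, so building its
// drop shim before `N` is substituted would need the `ConstArgHasType`
// predicates that the drop shim's param-env currently lacks.
struct Buffer<const N: usize> {
    data: [u8; N],
}

impl<const N: usize> Drop for Buffer<N> {
    fn drop(&mut self) {
        // The drop body uses `N`, so the shim cannot be constructed
        // until `N` has a concrete value.
        println!("dropping buffer of size {}", N);
    }
}

fn main() {
    let b = Buffer::<4> { data: [0; 4] };
    drop(b); // prints "dropping buffer of size 4"
}
```

With the current workaround, the drop glue for such a type is only considered for inlining once it has been monomorphized to a concrete instantiation such as `Buffer<4>`.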
+ +## Common Patterns + +### Visiting MIR +- Use `rustc_middle::mir::visit::Visitor` for read-only traversal +- Use `MutVisitor` for in-place modifications +- Call `visit_body_preserves_cfg` to keep the CFG structure + +### Creating new values +```rust +let ty = Ty::new_tuple(tcx, &[tcx.types.i32, tcx.types.bool]); +let rvalue = Rvalue::Aggregate(Box::new(AggregateKind::Tuple), IndexVec::from_raw(vec![op1, op2])); +``` + +### Cost checking +Use `CostChecker` (from `cost_checker.rs`) to estimate the cost of inlining or other transformations. + +## Performance Considerations + +- **Compile time vs runtime**: These passes increase compile time to reduce runtime +- **Limits**: Many passes have size/complexity limits to prevent exponential blowup +- **Profiling**: Use `-Ztime-passes` to see pass timings +- **Benchmarking**: Run `./x.py bench` with the rustc-perf suite + +## References + +- [MIR documentation](https://rustc-dev-guide.rust-lang.org/mir/) +- [Optimization passes](https://rustc-dev-guide.rust-lang.org/mir/optimizations.html) +- [Dataflow framework](https://rustc-dev-guide.rust-lang.org/mir/dataflow.html) diff --git a/compiler/rustc_mir_transform/src/dataflow_const_prop.rs b/compiler/rustc_mir_transform/src/dataflow_const_prop.rs index 8bcda77f4bc32..40b412b4ed698 100644 --- a/compiler/rustc_mir_transform/src/dataflow_const_prop.rs +++ b/compiler/rustc_mir_transform/src/dataflow_const_prop.rs @@ -245,8 +245,10 @@ impl<'a, 'tcx> ConstAnalysis<'a, 'tcx> { return self.handle_switch_int(discr, targets, state); } TerminatorKind::TailCall { .. } => { - // FIXME(explicit_tail_calls): determine if we need to do something here (probably - // not) + // Tail calls transfer control to the callee permanently, + // so there is no return value to propagate back to the caller + // and no state to flow into a successor block. The default + // terminator handling is all that is needed here. } TerminatorKind::Goto { ..
} | TerminatorKind::UnwindResume diff --git a/compiler/rustc_mir_transform/src/dest_prop.rs b/compiler/rustc_mir_transform/src/dest_prop.rs index a10e0b82467cc..60757271fe854 100644 --- a/compiler/rustc_mir_transform/src/dest_prop.rs +++ b/compiler/rustc_mir_transform/src/dest_prop.rs @@ -272,7 +272,15 @@ impl<'tcx> MutVisitor<'tcx> for Merger<'tcx> { fn visit_statement(&mut self, statement: &mut Statement<'tcx>, location: Location) { match &statement.kind { - // FIXME: Don't delete storage statements, but "merge" the storage ranges instead. + // Currently we delete storage statements for merged locals. + // FIXME: A better approach would be to merge the storage liveness ranges: + // - For StorageLive: use the earliest live point among merged locals + // - For StorageDead: use the latest dead point among merged locals + // This would improve stack usage by allowing the compiler to reuse stack slots. + // Implementation would require: + // 1. Computing liveness ranges for all locals before merging + // 2. Updating StorageLive/Dead statements to reflect merged ranges + // 3. Ensuring no overlapping lifetimes for non-merged locals StatementKind::StorageDead(local) | StatementKind::StorageLive(local) if self.merged_locals.contains(*local) => { @@ -410,11 +418,15 @@ impl<'tcx> Visitor<'tcx> for FindAssignments<'_, 'tcx> { } // As described at the top of this file, we do not touch locals which have - // different types. + // different types. Even if one type is a subtype of the other, merging the + // locals would change the type that the rest of the MIR observes for one of them. let src_ty = self.body.local_decls()[src].ty; let dest_ty = self.body.local_decls()[dest].ty; if src_ty != dest_ty { - // FIXME(#112651): This can be removed afterwards. Also update the module description. + // See issue #112651: Once the type system properly handles subtyping in MIR, + // we might be able to relax this constraint for certain safe subtyping cases.
+ // For now, we require exact type equality to ensure soundness. + // Also update the module description if this changes. trace!("skipped `{src:?} = {dest:?}` due to subtyping: {src_ty} != {dest_ty}"); return; } diff --git a/compiler/rustc_mir_transform/src/gvn.rs b/compiler/rustc_mir_transform/src/gvn.rs index 48663c4a52f61..a744edc75e72d 100644 --- a/compiler/rustc_mir_transform/src/gvn.rs +++ b/compiler/rustc_mir_transform/src/gvn.rs @@ -1715,9 +1715,15 @@ fn op_to_prop_const<'tcx>( && let Some(scalar) = ecx.read_scalar(op).discard_err() { if !scalar.try_to_scalar_int().is_ok() { - // Check that we do not leak a pointer. - // Those pointers may lose part of their identity in codegen. - // FIXME: remove this hack once https://github.com/rust-lang/rust/issues/79738 is fixed. + // Prevent propagation of pointer values as Scalar constants. + // Issue #79738: Pointers can lose provenance information during + // constant propagation, leading to miscompilation. Specifically: + // - Different allocations might get the same address in CTFE + // - Pointer identity is not preserved across const prop boundaries + // - Codegen may incorrectly assume pointer equality + // + // Until pointer identity is properly tracked through const prop, + // we avoid propagating any pointer-containing scalars. return None; } return Some(ConstValue::Scalar(scalar)); @@ -1728,9 +1734,17 @@ fn op_to_prop_const<'tcx>( if let Either::Left(mplace) = op.as_mplace_or_imm() { let (size, _align) = ecx.size_and_align_of_val(&mplace).discard_err()??; - // Do not try interning a value that contains provenance. - // Due to https://github.com/rust-lang/rust/issues/79738, doing so could lead to bugs. - // FIXME: remove this hack once that issue is fixed. + // Do not try interning a value that contains provenance (interior pointers). + // Issue #79738: Values with interior pointers can have pointer identity + // issues during constant propagation. 
When we intern and later retrieve + // such values, the pointer provenance may not be preserved correctly, + // causing: + // - Loss of allocation identity + // - Incorrect pointer comparisons in the generated code + // - Potential miscompilation when pointers are compared by address + // + // We must avoid propagating any allocation with interior provenance + // until the const prop system properly tracks pointer identity. let alloc_ref = ecx.get_ptr_alloc(mplace.ptr(), size).discard_err()??; if alloc_ref.has_provenance() { return None; } @@ -1741,9 +1755,20 @@ let alloc_id = prov.alloc_id(); intern_const_alloc_for_constprop(ecx, alloc_id).discard_err()?; - // `alloc_id` may point to a static. Codegen will choke on an `Indirect` with anything - // but `GlobalAlloc::Memory`, so do fall through to copying if needed. - // FIXME: find a way to treat this more uniformly (probably by fixing codegen) + // Check if we can directly use this allocation as an Indirect constant. + // Codegen currently only handles GlobalAlloc::Memory in Indirect constants; + // other allocation kinds (like GlobalAlloc::Static, GlobalAlloc::Function, + // GlobalAlloc::VTable) require different handling in codegen. + // + // FIXME: Ideally codegen should handle all GlobalAlloc kinds uniformly in + // ConstValue::Indirect, which would allow us to eliminate this check and + // avoid unnecessary allocation copies. This would require: + // 1. Teaching codegen to handle Indirect references to statics + // 2. Ensuring proper handling of vtable and function pointer indirection + // 3. Verifying that no optimization passes break with non-Memory Indirect values + // + // Until then, we fall back to copying non-Memory allocations into a new + // Memory allocation to ensure codegen compatibility. if let GlobalAlloc::Memory(alloc) = ecx.tcx.global_alloc(alloc_id) // Transmuting a constant is just an offset in the allocation.
If the alignment of the // allocation is not enough, fallback to copying into a properly aligned value. @@ -1758,9 +1783,14 @@ fn op_to_prop_const<'tcx>( ecx.intern_with_temp_alloc(op.layout, |ecx, dest| ecx.copy_op(op, dest)).discard_err()?; let value = ConstValue::Indirect { alloc_id, offset: Size::ZERO }; - // Check that we do not leak a pointer. - // Those pointers may lose part of their identity in codegen. - // FIXME: remove this hack once https://github.com/rust-lang/rust/issues/79738 is fixed. + // Final check: ensure the newly created allocation has no interior pointers. + // Issue #79738: Even after creating a fresh allocation, we must verify it + // doesn't contain pointers that could lose their identity. This can happen when: + // - The source value contained function pointers or vtable pointers + // - The copy operation preserved pointer values without proper provenance tracking + // - Interior pointers could be compared by address later + // + // Only propagate allocations that are pointer-free to avoid identity issues. if ecx.tcx.global_alloc(alloc_id).unwrap_memory().inner().provenance().ptrs().is_empty() { return Some(value); } @@ -1800,9 +1830,15 @@ impl<'tcx> VnState<'_, '_, 'tcx> { let value = op_to_prop_const(&mut self.ecx, op)?; - // Check that we do not leak a pointer. - // Those pointers may lose part of their identity in codegen. - // FIXME: remove this hack once https://github.com/rust-lang/rust/issues/79738 is fixed. + // Verify no pointer provenance leaked through into the constant. + // Issue #79738: This assertion catches cases where pointer-containing + // values slipped through our earlier checks. If this fires, it means: + // 1. A pointer made it into a ConstValue despite our guards + // 2. The pointer identity tracking is insufficient + // 3. 
Propagating this constant could cause miscompilation + // + // This is a safety net; the earlier checks in op_to_prop_const should + // have already prevented pointer-containing values from reaching here. assert!(!value.may_have_provenance(self.tcx, op.layout.size)); let const_ = Const::Val(value, op.layout.ty); @@ -1901,13 +1937,30 @@ impl<'tcx> MutVisitor<'tcx> for VnState<'_, '_, 'tcx> { if let Some(local) = lhs.as_local() && self.ssa.is_ssa(local) - && let rvalue_ty = rvalue.ty(self.local_decls, self.tcx) - // FIXME(#112651) `rvalue` may have a subtype to `local`. We can only mark - // `local` as reusable if we have an exact type match. - && self.local_decls[local].ty == rvalue_ty { - let value = value.unwrap_or_else(|| self.new_opaque(rvalue_ty)); - self.assign(local, value); + let local_ty = self.local_decls[local].ty; + let rvalue_ty = rvalue.ty(self.local_decls, self.tcx); + + // Check if the assignment is valid. We accept both exact type equality + // and valid subtyping relationships (handled by Subtype casts inserted + // by the `add_subtyping_projections` pass). The type checker uses + // `relate_types` with Covariant variance to accept subtyping during + // the optimization phase. + let types_compatible = if local_ty == rvalue_ty { + // Fast path: exact type match + true + } else { + // Check for valid subtyping relationship using the same logic + // as the validator. This allows GVN to work correctly with + // Subtype casts that may be present. 
+ use rustc_middle::ty::Variance; + crate::util::relate_types(self.tcx, self.typing_env(), Variance::Covariant, rvalue_ty, local_ty) + }; + + if types_compatible { + let value = value.unwrap_or_else(|| self.new_opaque(rvalue_ty)); + self.assign(local, value); + } } } diff --git a/compiler/rustc_mir_transform/src/inline.rs b/compiler/rustc_mir_transform/src/inline.rs index dc6088849bf5e..0083266b84cb4 100644 --- a/compiler/rustc_mir_transform/src/inline.rs +++ b/compiler/rustc_mir_transform/src/inline.rs @@ -388,9 +388,18 @@ impl<'tcx> Inliner<'tcx> for NormalInliner<'tcx> { if callee_body.basic_blocks.len() <= 3 { threshold += threshold / 4; } + + // Give an extra bonus to very tiny functions (a single block, likely trivial getters/setters). + // These are especially profitable to inline, as they often disappear entirely after optimization. + if callee_body.basic_blocks.len() == 1 { + threshold += threshold / 2; + } + debug!(" final inline threshold = {}", threshold); - // FIXME: Give a bonus to functions with only a single caller + // Note: Detecting single-caller functions would require inter-procedural analysis, + // which is not currently available at this stage. Such analysis would need to be + // performed earlier in the compilation pipeline and stored as metadata. let mut checker = CostChecker::new(tcx, self.typing_env(), Some(callsite.callee), callee_body); @@ -440,8 +449,13 @@ impl<'tcx> Inliner<'tcx> for NormalInliner<'tcx> { // if the no-attribute function ends up with the same instruction set anyway. return Err("cannot move inline-asm across instruction sets"); } else if let TerminatorKind::TailCall { .. } = term.kind { - // FIXME(explicit_tail_calls): figure out how exactly functions containing tail - // calls can be inlined (and if they even should) + // Functions containing tail calls are currently not inlined. + // Inlining them would require: + // 1. Converting the tail call to a regular call (losing tail-call optimization) + // 2. 
Or a complex transformation to preserve the tail-call property + // Since tail calls are often used for performance (avoiding stack growth), + // inlining them might defeat their purpose. More analysis is needed to determine + // when inlining a function with tail calls would be beneficial. return Err("can't inline functions with tail calls"); } else { work_list.extend(term.successors()) } @@ -549,7 +563,13 @@ fn resolve_callsite<'tcx, I: Inliner<'tcx>>( // Only consider direct calls to functions let terminator = bb_data.terminator(); - // FIXME(explicit_tail_calls): figure out if we can inline tail calls + // Note: Tail calls (TerminatorKind::TailCall) are not currently supported for inlining. + // Inlining a tail call would require special handling: + // 1. The tail call must become a direct transfer of control (no return) + // 2. The caller's cleanup/return must be eliminated + // 3. Argument passing must be handled carefully (potential ABI differences) + // These transformations are complex and may not always be beneficial. + // For now, we only inline regular Call terminators. if let TerminatorKind::Call { ref func, fn_span, .. } = terminator.kind { let func_ty = func.ty(caller_body, tcx); if let ty::FnDef(def_id, args) = *func_ty.kind() { @@ -725,9 +745,41 @@ fn check_mir_is_available<'tcx, I: Inliner<'tcx>>( // FIXME(#127030): `ConstParamHasTy` has bad interactions with // the drop shim builder, which does not evaluate predicates in - // the correct param-env for types being dropped. Stall resolving - // the MIR for this instance until all of its const params are - // substituted. + // the correct param-env for types being dropped. + // + // Root cause: `build_drop_shim` (in compiler/rustc_mir_transform/src/shim.rs) + // constructs the `TypingEnv` using `ty::TypingEnv::post_analysis(tcx, def_id)`, + // where `def_id` refers to the `drop_in_place` intrinsic, not the type being + // dropped.
This means the param-env lacks `ConstArgHasType` predicates that + // describe the types of const parameters. + // + // When MIR generation needs the type of a const param (e.g., for field access + // or const evaluation), it calls `ParamConst::find_const_ty_from_env`, which + // searches the param-env's caller bounds for matching `ConstArgHasType` clauses. + // Since the drop shim's param-env is based on `drop_in_place` rather than the + // dropped type's definition, these clauses are missing, causing a panic: + // "cannot find `ParamConst { ... }` in param-env". + // + // Workaround: Stall resolving MIR for drop glue of types with const params + // until all const params are substituted. This ensures we only build drop + // shims for monomorphized types with concrete const arguments. + // + // Potential fixes: + // 1. Modify `build_drop_shim` to construct a typing environment that includes + // the dropped type's predicates. This requires plumbing the type's `DefId` + // (if it's an ADT) and combining its param-env with drop_in_place's param-env. + // + // 2. Make `DropShimElaborator::typing_env()` return a synthesized typing env + // that merges predicates from both the drop_in_place item and the type being + // dropped. This would require changes to the `DropElaborator` trait. + // + // 3. Relax `ParamConst::find_const_ty_from_env` to handle missing predicates + // more gracefully, potentially by falling back to a generic inference or + // deferring const evaluation. (Note: core maintainers discourage this approach + // as it masks deeper param-env construction issues.) 
+ // + // See also: compiler/rustc_mir_transform/src/shim.rs:369 (DropShimElaborator init) + // compiler/rustc_middle/src/ty/sty.rs:351 (find_const_ty_from_env impl) InstanceKind::DropGlue(_, Some(ty)) if ty.has_type_flags(TypeFlags::HAS_CT_PARAM) => { debug!("still needs substitution"); return Err("implementation limitation -- HACK for dropping polymorphic type"); diff --git a/compiler/rustc_mir_transform/src/shim.rs b/compiler/rustc_mir_transform/src/shim.rs index 85e340c0a02ab..3a35bffab970a 100644 --- a/compiler/rustc_mir_transform/src/shim.rs +++ b/compiler/rustc_mir_transform/src/shim.rs @@ -365,6 +365,23 @@ fn build_drop_shim<'tcx>(tcx: TyCtxt<'tcx>, def_id: DefId, ty: Option>) if ty.is_some() { let patch = { + // FIXME(#127030): This typing_env is constructed using the `drop_in_place` + // intrinsic's `def_id`, not the type being dropped. This causes issues when + // the dropped type contains const parameters, because the param-env lacks + // `ConstArgHasType` predicates for those const params. + // + // When MIR generation needs const param types (via `ParamConst::find_const_ty_from_env`), + // it searches the param-env's caller bounds, which are empty for const params in + // this context. This leads to panics during MIR construction for drop glue of + // types with const params. + // + // Current workaround: We prevent inlining of drop glue with const params in + // `check_mir_is_available` (see compiler/rustc_mir_transform/src/inline.rs:746). + // + // Proper fix would involve constructing a typing_env that merges predicates from + // both the drop_in_place intrinsic and the type being dropped (if it's an ADT with + // a DefId). This requires significant refactoring to properly handle ADT field + // predicates and const param bounds. let typing_env = ty::TypingEnv::post_analysis(tcx, def_id); let mut elaborator = DropShimElaborator { body: &body,