Improve codegen of large Copy parameters under CopyProp to match DestinationPropagation #108068

Closed
saethlin opened this issue Feb 15, 2023 · 0 comments · Fixed by #113758
Labels
A-codegen Area: Code generation A-mir-opt Area: MIR optimizations C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@saethlin
Member

This will probably bitrot, but this is a working godbolt link at the moment: https://godbolt.org/z/v446e66jG

Currently this program:

type T = [u8; 256];
pub fn f(a: T, b: fn(_: T, _: T)) {
    b(a, a)
}

Compiles to this IR:

define void @f(ptr noalias nocapture noundef readonly dereferenceable(256) %a, ptr nocapture noundef nonnull readonly %b) unnamed_addr #0 !dbg !6 {
start:
  %_5 = alloca [256 x i8], align 1
  %_4 = alloca [256 x i8], align 1
  call void @llvm.lifetime.start.p0(i64 256, ptr nonnull %_4), !dbg !11
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 1 dereferenceable(256) %_4, ptr noundef nonnull align 1 dereferenceable(256) %a, i64 256, i1 false), !dbg !11
  call void @llvm.lifetime.start.p0(i64 256, ptr nonnull %_5), !dbg !12
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 1 dereferenceable(256) %_5, ptr noundef nonnull align 1 dereferenceable(256) %a, i64 256, i1 false), !dbg !12
  call void %b(ptr noalias nocapture noundef nonnull dereferenceable(256) %_4, ptr noalias nocapture noundef nonnull dereferenceable(256) %_5), !dbg !13
  call void @llvm.lifetime.end.p0(i64 256, ptr nonnull %_5), !dbg !14
  call void @llvm.lifetime.end.p0(i64 256, ptr nonnull %_4), !dbg !14
  ret void, !dbg !15
}

But if we pass -Zmir-enable-passes=+DestinationPropagation, we eliminate a memcpy:

define void @f(ptr noalias nocapture noundef dereferenceable(256) %a, ptr nocapture noundef nonnull readonly %b) unnamed_addr #0 !dbg !6 {
start:
  %_3 = alloca [256 x i8], align 1
  call void @llvm.lifetime.start.p0(i64 256, ptr nonnull %_3), !dbg !11
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 1 dereferenceable(256) %_3, ptr noundef nonnull align 1 dereferenceable(256) %a, i64 256, i1 false), !dbg !11
  call void %b(ptr noalias nocapture noundef nonnull dereferenceable(256) %_3, ptr noalias nocapture noundef nonnull dereferenceable(256) %a), !dbg !12
  call void @llvm.lifetime.end.p0(i64 256, ptr nonnull %_3), !dbg !13
  ret void, !dbg !14
}

With CopyProp enabled, however, we do not get this optimization, and both memcpys remain. This is probably a coordination problem between codegen, MIR optimizations, and MIR semantics: #105813 (comment)
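
Whichever pass ends up removing the copy, the two by-value arguments still have to behave as independent values from the callee's point of view, which is presumably why one memcpy survives in the optimized IR. A minimal sketch of a driver that checks this, assuming a hypothetical callee named `callee` and a hypothetical counter `MATCHES` (neither appears in the report):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

type T = [u8; 256];

// Same shape as the function in the report.
pub fn f(a: T, b: fn(_: T, _: T)) {
    b(a, a)
}

// Hypothetical counter, used only because a plain `fn` pointer cannot capture state.
static MATCHES: AtomicUsize = AtomicUsize::new(0);

// Hypothetical callee: mutates its first argument, then checks that the
// second argument was not affected by that write.
fn callee(mut x: T, y: T) {
    x[0] = x[0].wrapping_add(1);
    if x[0] == 8 && y[0] == 7 {
        MATCHES.fetch_add(1, Ordering::Relaxed);
    }
}

fn main() {
    f([7u8; 256], callee);
    // The check inside `callee` must hold regardless of which MIR passes ran.
    assert_eq!(MATCHES.load(Ordering::Relaxed), 1);
}
```

Building this with and without `-Zmir-enable-passes=+DestinationPropagation` (a nightly-only flag) should give the same observable result; only the number of copies in the emitted IR should differ.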

@jyn514 added the I-slow, C-enhancement, A-mir-opt, and A-codegen labels on Apr 10, 2023
@bors bors closed this as completed in 06a53dd Jul 20, 2023
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Jul 22, 2023
Turn copy into moves during DSE.

Dead store elimination computes whether removing a direct store to an unborrowed place is allowed.
Where removing a store is allowed, writing `uninit` is too.

This means that we can use this pass to transform `copy` operands into `move` operands. This is only interesting in call terminators, so we only handle those.

Special care is taken for the `use_both(_1, _1)` case:
- moving the second argument is ok, as `_1` is not live after the call;
- moving the first argument is not, as the second argument reads `_1`.
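
The rule for the `use_both(_1, _1)` case can be sketched as a backward walk over the call arguments. This is a toy model and not the rustc implementation: the `Operand` enum, the `copies_to_moves` function, and the representation of locals as plain `usize` indices are all hypothetical stand-ins, and borrows are ignored entirely.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a MIR call operand; not rustc's own type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operand {
    Copy(usize),
    Move(usize),
}

/// Turn `Copy` operands into `Move` operands where the local is dead.
/// `live_after_call` holds the locals that are still read after the call.
fn copies_to_moves(args: &[Operand], live_after_call: &HashSet<usize>) -> Vec<Operand> {
    let mut live = live_after_call.clone();
    let mut out = args.to_vec();
    // Walk the arguments from last to first, so that each argument sees
    // the reads performed by the arguments that come after it.
    for op in out.iter_mut().rev() {
        let local = match *op {
            Operand::Copy(l) | Operand::Move(l) => l,
        };
        if let Operand::Copy(l) = *op {
            // Nothing later reads this local, so it may be moved out of.
            if !live.contains(&l) {
                *op = Operand::Move(l);
            }
        }
        // This argument reads `local`, so it is live for earlier arguments.
        live.insert(local);
    }
    out
}

fn main() {
    // Models `use_both(_1, _1)` with `_1` dead after the call.
    let args = [Operand::Copy(1), Operand::Copy(1)];
    let result = copies_to_moves(&args, &HashSet::new());
    println!("{:?}", result); // prints "[Copy(1), Move(1)]"
}
```

Only the second argument becomes a move: when the walk reaches the first argument, `_1` is already marked live by the second argument's read.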

Fixes #75993
Fixes rust-lang/rust#108068

r? `@RalfJung`
cc `@JakobDegen`