Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add range metadata to slice lengths #116542

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

the8472
Copy link
Member

@the8472 the8472 commented Oct 8, 2023

This adds range information to the slice-len in fat pointers if we can conservatively determine that the pointee is not a ZST without having to normalize the pointee type.

I only intended to pass the !range to llvm but apparently this also lets the length in fat pointers be used for its niches 😅.

Ideally this would use the naive-layout computation from #113166 to calculate a better approximation of the pointee size, but that PR got reverted.

@rustbot
Copy link
Collaborator

rustbot commented Oct 8, 2023

r? @cjgillot

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 8, 2023
@rustbot
Copy link
Collaborator

rustbot commented Oct 8, 2023

Changes to the size of AST and/or HIR nodes.

cc @nnethercote

@rust-log-analyzer

This comment has been minimized.

@RalfJung
Copy link
Member

RalfJung commented Oct 8, 2023

Note that doing this for raw slice pointers would be unsound, since slice_from_raw_parts is safe.

So this PR must be introducing (for the first time) a difference in metadata validity for raw pointers vs references. Are we sure we want that?

| ty::RawPtr(..)
| ty::Char
| ty::Ref(..)
| ty::Closure(..) => true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Closure could be a ZST I think, if the environment is empty.

ty::Tuple(tys) => tys.iter().any(|ty| ty.is_trivially_non_zst(tcx)),
ty::Adt(def, args) => def.all_fields().any(|field| {
let ty = field.ty(tcx, args);
ty.is_trivially_non_zst(tcx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is non-obvious and deserves a comment. It relies on Result<(), (!, i32)> not removing the Err variant entirely (which plausibly it could do due to it being uninhabited -- but that would also cause issues in MIR).

return Err(error(cx, LayoutError::Unknown(pointee)));
};

if !ty.is_unsafe_ptr() {
match pointee.kind() {
ty::Slice(element) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So &[i32] is getting the optimization but &(bool, [i32]) is not? That seems odd?

ty::Slice(element) => {
let mut metadata = scalar_unit(Int(dl.ptr_sized_integer(), false));
if !ty.is_unsafe_ptr() && !element.is_trivially_non_zst(tcx) {
metadata.valid_range_mut().end = dl.ptr_sized_integer().signed_max() as u128;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have this logic duplicated here?

@the8472
Copy link
Member Author

the8472 commented Oct 8, 2023

I think the test failure indicates that I've already broken something because rustc_graphviz only depends on std and doesn't have any unsafe, so its behavior should be unchanged.

@@ -1,7 +1,7 @@
// check-pass
// compile-flags: -Zhir-stats
// only-x86_64

// ignore-stage1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is needed temporarily, can you add a comment saying that?

Copy link

@riking riking Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write this as a cfg(bootstrap) so that it gets picked up during the version bump

@the8472
Copy link
Member Author

the8472 commented Oct 8, 2023

My approach probably is too naive. Apparently layout_of gets called for generic &[T] (with unresolved T)? Having different layouts for &[T] and &[<concrete type>] where one allows niches and the other doesn't seems like it would cause issues.

Though I'm a bit surprised that all the stage 2 UI/codegen/std tests pass and it "merely" fails in rustc_graphviz.

@the8472 the8472 marked this pull request as draft October 8, 2023 22:50
@RalfJung
Copy link
Member

RalfJung commented Oct 9, 2023

Yes layout_of gets called on generic types, and if it returns a result then it must be the case that instantiating the generics must not change the resulting layout. This way we can know some layout facts even before monomorphization.

With your PR this means that &[T] must return TooGeneric if it doesn't know whether T is a ZST or not.

@cjgillot
Copy link
Contributor

Having &[T] look at the layout of T will definitely create cycles during layout computation. For instance: struct A<'a> { x: &'a [A<'a>] }. That is probably impossible.

Could we cheat by adding the annotation in codegen, on the code we generate for Len MIR statement? Just the proper range attribute. At that point we have all the layout info we need.

@the8472
Copy link
Member Author

the8472 commented Oct 11, 2023

For instance: struct A<'a> { x: &'a [A<'a>] }. That is probably impossible.

We don't necessarily need the full layout if we limit ourselves to adding an isize::MAX range annotation - computing more accurate, size-based ranges would require the layout - then we only need to know whether A is certainly ZST or non-ZST. since x is a reference A is known to be non-zero. That's similar to the naive layout computation in #113166, but simpler.

Just the proper range attribute.

Yeah, that's the fallback solution if I can't get this to work. It'll be simpler but lose the niches

@cjgillot cjgillot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 4, 2023
@bors
Copy link
Contributor

bors commented Dec 26, 2023

☔ The latest upstream changes (presumably #119258) made this pull request unmergeable. Please resolve the merge conflicts.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@the8472
Copy link
Member Author

the8472 commented Mar 24, 2024

Back to a broken rustc_graphviz . I'm not sure how I'm breaking it. Maybe something about enum discriminants or about &str....

@bors
Copy link
Contributor

bors commented Mar 26, 2024

☔ The latest upstream changes (presumably #122849) made this pull request unmergeable. Please resolve the merge conflicts.

@the8472
Copy link
Member Author

the8472 commented Mar 31, 2024

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 31, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 31, 2024
Add range metadata to slice lengths

This adds range information to the slice-len in fat pointers if we can conservatively determine that the pointee is not a ZST without having to normalize the pointee type.

I only intended to pass the `!range` to llvm but apparently this also lets the length in fat pointers be used for its niches 😅.

Ideally this would use the naive-layout computation from rust-lang#113166 to calculate a better approximation of the pointee size, but that PR got reverted.
@bors
Copy link
Contributor

bors commented Mar 31, 2024

⌛ Trying commit c054bbc with merge cbf560f...

@bors
Copy link
Contributor

bors commented Mar 31, 2024

☀️ Try build successful - checks-actions
Build commit: cbf560f (cbf560fca7433b3a9f62c9d6f32aeb29e5bf5b29)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (cbf560f): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.5% [0.1%, 1.0%] 28
Regressions ❌
(secondary)
0.9% [0.3%, 2.1%] 34
Improvements ✅
(primary)
-0.9% [-1.2%, -0.7%] 2
Improvements ✅
(secondary)
-2.9% [-6.9%, -0.5%] 3
All ❌✅ (primary) 0.4% [-1.2%, 1.0%] 30

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
5.8% [4.1%, 6.8%] 6
Improvements ✅
(primary)
-1.0% [-1.8%, -0.7%] 4
Improvements ✅
(secondary)
-2.7% [-3.5%, -1.2%] 3
All ❌✅ (primary) -1.0% [-1.8%, -0.7%] 4

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.8% [0.7%, 0.8%] 3
Regressions ❌
(secondary)
1.8% [0.8%, 3.5%] 4
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.0% [-2.9%, -1.0%] 2
All ❌✅ (primary) 0.8% [0.7%, 0.8%] 3

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.5% [0.0%, 2.0%] 31
Regressions ❌
(secondary)
1.3% [0.0%, 2.7%] 74
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.5% [0.0%, 2.0%] 31

Bootstrap: 666.819s -> 669.852s (0.45%)
Artifact size: 315.68 MiB -> 315.91 MiB (0.07%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Mar 31, 2024
@the8472
Copy link
Member Author

the8472 commented Mar 31, 2024

The layout code itself barely even shows up in cachegrind diffs. So I assume that the regressions are due to LLVM doing more work or worse codegen.

Perhaps it is better after all to give expanding the already-existing niche in references another try. That'll have to be a target-dependent optimization though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression Performance regression. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants