Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Anonymous allocations in statics get duplicated for multiple codegen units #79738

Closed
RalfJung opened this issue Dec 5, 2020 · 17 comments · Fixed by #121644
Closed

Anonymous allocations in statics get duplicated for multiple codegen units #79738

RalfJung opened this issue Dec 5, 2020 · 17 comments · Fixed by #121644
Assignees
Labels
A-codegen Area: Code generation A-const-eval Area: constant evaluation (mir interpretation) C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

RalfJung commented Dec 5, 2020

According to analysis performed by @bjorn3, this code actually leads to two allocations containing 42, i.e., FOO and BAR point to different things. The linker later merges the two, so the issue is currently not directly observable. However, it becomes observable when things are mutable.

I think this is a bug. When a static is defined as pub static BAR: &i32 = crate::a::FOO;, IMO we should guarantee that BAR and FOO have the same value. I see no lee-way here for duplicating the memory they both point to.

Cc @oli-obk

@bjorn3 bjorn3 added A-const-eval Area: constant evaluation (mir interpretation) C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 5, 2020
@oli-obk oli-obk added the A-codegen Area: Code generation label Apr 1, 2021
@RalfJung
Copy link
Member Author

IIUC #116564 is an attempt to fix this.

@RalfJung
Copy link
Member Author

IIUC #116564 is an attempt to fix this.

I was wrong.

But I also don't think I understand this example. This is all in the same crate. We could evaluate the static FOO once, which will create two allocations, and then BAR points to the inner of these two. Where does the duplicate AllocId come from...?

@oli-obk
Copy link
Contributor

oli-obk commented Feb 11, 2024

Oh neat you found a single crate repro.

The issue is just that llvm treats alloc ids without a name as codegen unit local and thus generates one per unit. We need to give them a name, but we only want to give those a name that come from a static.

@RalfJung
Copy link
Member Author

I didn't find it, you did. 😂

@RalfJung
Copy link
Member Author

The issue is just that llvm treats alloc ids without a name as codegen unit local and thus generates one per unit. We need to give them a name, but we only want to give those a name that come from a static.

That example fails even with -Ccodegen-units=1 so multiple codegen units can't be the issue I think?

@RalfJung
Copy link
Member Author

RalfJung commented Feb 11, 2024

I think this happens because the codegen backend does not cache allocations. There is a function that codegens an allocation, and if the same AllocId is requested multiple times that just generates multiple LLVM allocations.

@oli-obk
Copy link
Contributor

oli-obk commented Feb 11, 2024

Ah good point. That won't help us cross crate tho

@oli-obk
Copy link
Contributor

oli-obk commented Feb 11, 2024

The issue is just that llvm treats alloc ids without a name as codegen unit local and thus generates one per unit. We need to give them a name, but we only want to give those a name that come from a static.

That example fails even with -Ccodegen-units=1 so multiple codegen units can't be the issue I think?

Yea, the modules are not needed: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=a447edb36a144bd8863b11f743a37937

@RalfJung
Copy link
Member Author

Ah good point. That won't help us cross crate tho

Yeah, adding such a cache does not replace your planned "named inner allocations".

@RalfJung
Copy link
Member Author

Is there an example of this that doesn't need static mut? We want to get rid of static mut anyway so I hope this is useful even for static with interior mutability...

@oli-obk
Copy link
Contributor

oli-obk commented Feb 15, 2024

yea, the issues are the same with static with interior mutability. But I think then we need unleash mode for the test, but I'll check.

@oli-obk
Copy link
Contributor

oli-obk commented Feb 15, 2024

lol no, not even with unleash mode. Our interning-time checks are too solid: https://rust.godbolt.org/z/65o9PnvPq

@RalfJung
Copy link
Member Author

Ah right, static mut is the only one that allows inner mutable allocations. That's quite explicit in the new interner.

But as far as pointer identity goes, we don't need mutability to make this an example, do we? The cross-crate version of this should also trigger the problem?

pub static FOO: &[i32] = &[42];
pub static BAR: &[i32] = &*FOO;

fn main() {
    assert_eq!(FOO.as_ptr(), BAR.as_ptr());
}

bors added a commit to rust-lang-ci/rust that referenced this issue Feb 20, 2024
Always evaluate free constants and statics, even if previous errors occurred

work towards rust-lang#79738

We will need to evaluate static items before the `definitions.freeze()` below, as we will start creating new `DefId`s (for nested allocations) within the `eval_static_initializer` query.

But even without that motivation, this is a good change. Hard errors should always be reported and not silenced if other errors happened earlier.
flip1995 pushed a commit to flip1995/rust-clippy that referenced this issue Feb 22, 2024
Always evaluate free constants and statics, even if previous errors occurred

work towards rust-lang/rust#79738

We will need to evaluate static items before the `definitions.freeze()` below, as we will start creating new `DefId`s (for nested allocations) within the `eval_static_initializer` query.

But even without that motivation, this is a good change. Hard errors should always be reported and not silenced if other errors happened earlier.
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 26, 2024
Ensure nested allocations in statics do not get deduplicated

This PR generates new `DefId`s for nested allocations in static items and feeds all the right queries to make the compiler believe these are regular `static` items. I chose this design, because all other designs are fragile and make the compiler horribly complex for such a niche use case.

At present this wrecks incremental compilation performance *in case nested allocations exist* (because any query creating a `DefId` will be recomputed and never loaded from the cache). This will be resolved later in rust-lang#115613 . All other statics are unaffected by this change and will not have performance implications (heh, famous last words)

This PR contains various smaller refactorings that can be pulled out into separate PRs. It is best reviewed commit-by-commit. The last commit is where the actual magic happens.

r? `@RalfJung` on the const interner and engine changes

fixes rust-lang#79738
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 26, 2024
Ensure nested allocations in statics do not get deduplicated

This PR generates new `DefId`s for nested allocations in static items and feeds all the right queries to make the compiler believe these are regular `static` items. I chose this design, because all other designs are fragile and make the compiler horribly complex for such a niche use case.

At present this wrecks incremental compilation performance *in case nested allocations exist* (because any query creating a `DefId` will be recomputed and never loaded from the cache). This will be resolved later in rust-lang#115613 . All other statics are unaffected by this change and will not have performance regressions (heh, famous last words)

This PR contains various smaller refactorings that can be pulled out into separate PRs. It is best reviewed commit-by-commit. The last commit is where the actual magic happens.

r? `@RalfJung` on the const interner and engine changes

fixes rust-lang#79738
@bors bors closed this as completed in 3b85d2c Mar 12, 2024
@RalfJung
Copy link
Member Author

@cjgillot I seem to recall this issue coming up in the GVN pass... now that this is fixed, is there cleanup that could be done in GVN?

lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Apr 7, 2024
Always evaluate free constants and statics, even if previous errors occurred

work towards rust-lang/rust#79738

We will need to evaluate static items before the `definitions.freeze()` below, as we will start creating new `DefId`s (for nested allocations) within the `eval_static_initializer` query.

But even without that motivation, this is a good change. Hard errors should always be reported and not silenced if other errors happened earlier.
@cjgillot
Copy link
Contributor

cjgillot commented Apr 7, 2024

Function pointer AllocIds have the same duplication behaviour, and that's by design.
So cleaning-up GVN means distinguishing fn pointer AllocId from static/memory AllocId. I haven't thought where vtable AllocIds lie in this split.

@bjorn3
Copy link
Member

bjorn3 commented Apr 7, 2024

Vtables are duplicated too. They have to for the same reason functions are duplicated and statics can't be generic. cg_clif duplicates vtables for each cgu and I believe cg_ssa does too.

@RalfJung
Copy link
Member Author

RalfJung commented Apr 9, 2024

I am not convinced GVN needs to treat them nearly as carefully as it currently does, but I also don't fully understand the current situation. Let's discuss in a new issue: #123670.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-const-eval Area: constant evaluation (mir interpretation) C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants