Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 32-bit overflows in LLVM composite constants #122000

Merged
merged 1 commit into from Mar 12, 2024

Conversation

erer1243
Copy link
Contributor

@erer1243 erer1243 commented Mar 4, 2024

Inspired by #121868. Fixes unsoundness created when constructing constant arrays, strings, and structs with 2^32 or more elements on x86_64. This introduces copies of a few LLVM functions that have their signatures updated to use size_t in place of unsigned int. Alternatively we could just add overflow checks and just disallow huge composite constants. That introduces less code, but maybe a huge static block of memory is useful in embedded/no-os situations?

@rustbot
Copy link
Collaborator

rustbot commented Mar 4, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nnethercote (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 4, 2024
@nnethercote
Copy link
Contributor

Not my area of expertise, let's try r? @nikic

@rustbot rustbot assigned nikic and unassigned nnethercote Mar 5, 2024
@nikic
Copy link
Contributor

nikic commented Mar 5, 2024

Would you be interested in upstreaming the new APIs to LLVM? If you do that, then we can directly start using them by providing shims like this:

// Or whatever version it was added
#if LLVM_VERSION_LE(17, 0)
LLVMValueRef LLVMConstArray2(LLVMTypeRef ElementTy, LLVMValueRef *ConstantVals,
                             uint64_t Length) {
  ArrayRef<Constant *> V(unwrap<Constant>(ConstantVals, Length), Length);
  return wrap(ConstantArray::get(ArrayType::get(unwrap(ElementTy), Length), V));
}
#endif

That way we'll automatically switch to using the upstream APIs as soon as we can.

@erer1243
Copy link
Contributor Author

erer1243 commented Mar 5, 2024

@nikic Yeah, I was planning on doing that. I'm not sure how long it would take to get through, so I'm not sure which version we would put in the version macro.

@erer1243
Copy link
Contributor Author

erer1243 commented Mar 8, 2024

Having 64bit APIs for structs and vectors did not make sense, because LLVM stores 32bit lengths for them anyway. Now we just assert for overflows before calling those functions. I made a pr to LLVM for 64bit ConstantString constructors and used your idea @nikic to create shims to delete later. If the LLVM patch gets through quickly we can update the LLVM_VERSION_LT check to be 19.

DianQK pushed a commit to llvm/llvm-project that referenced this pull request Mar 9, 2024
…signatures (#84433)

Adds `LLVMConstStringInContext2` and `LLVMConstString2`, which are
identical to originals except that they use `size_t` for length. This is
a clone of
35276f1
and is needed for rust-lang/rust#122000.

As an aside, the issue of 32 bit overflow on constants is present in the
C++ APIs as well. A few classes, e.g. `ConstantDataArray` and
`ConstantAggregateZero`, can hold 64-bit ArrayTypes but their length
accessors return 32-bit values. This means the same issue from the
original Rust report is also present in LLVM itself. Would it be a
reasonable goal to update all of these length methods & types to be
uint64_t, or would that be too breaking? Alternatively, we could use
safe fallible casts instead of implicit ones inside the accessors (if an
overflow does happen, the solution would be to use
`MyValue->getType()->getArrayNumElements()` instead).
@erer1243
Copy link
Contributor Author

erer1243 commented Mar 9, 2024

Barring the LLVM patch being rescinded, this is now finished.

@rust-log-analyzer

This comment has been minimized.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now. Could you please squash the commits?

compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp Outdated Show resolved Hide resolved
@nikic
Copy link
Contributor

nikic commented Mar 11, 2024

@bors r+

@bors
Copy link
Contributor

bors commented Mar 11, 2024

📌 Commit 3af28f0 has been approved by nikic

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 11, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Mar 11, 2024
Fix 32-bit overflows in LLVM composite constants

Inspired by rust-lang#121868. Fixes unsoundness created when constructing constant arrays, strings, and structs with 2^32 or more elements on x86_64. This introduces copies of a few LLVM functions that have their signatures updated to use size_t in place of unsigned int. Alternatively we could just add overflow checks and just disallow huge composite constants. That introduces less code, but maybe a huge static block of memory is useful in embedded/no-os situations?
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 12, 2024
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#115141 (Update Windows platform support)
 - rust-lang#121865 (Add FileCheck annotations to MIR-opt unnamed-fields tests)
 - rust-lang#122000 (Fix 32-bit overflows in LLVM composite constants)
 - rust-lang#122194 (Enable creating backtraces via -Ztreat-err-as-bug when stashing errors)
 - rust-lang#122319 (Don't ICE when non-self part of trait goal is constrained in new solver)
 - rust-lang#122339 (Update books)
 - rust-lang#122342 (Update /NODEFAUTLIB comment for msvc)
 - rust-lang#122343 (Remove some unnecessary `allow(incomplete_features)` in the test suite)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 60f4b7a into rust-lang:master Mar 12, 2024
11 checks passed
@rustbot rustbot added this to the 1.78.0 milestone Mar 12, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 12, 2024
Rollup merge of rust-lang#122000 - erer1243:issue-121868, r=nikic

Fix 32-bit overflows in LLVM composite constants

Inspired by rust-lang#121868. Fixes unsoundness created when constructing constant arrays, strings, and structs with 2^32 or more elements on x86_64. This introduces copies of a few LLVM functions that have their signatures updated to use size_t in place of unsigned int. Alternatively we could just add overflow checks and just disallow huge composite constants. That introduces less code, but maybe a huge static block of memory is useful in embedded/no-os situations?
ilovepi pushed a commit to ilovepi/llvm-project that referenced this pull request Mar 15, 2024
…signatures (llvm#84433)

Adds `LLVMConstStringInContext2` and `LLVMConstString2`, which are
identical to originals except that they use `size_t` for length. This is
a clone of
llvm@35276f1
and is needed for rust-lang/rust#122000.

As an aside, the issue of 32 bit overflow on constants is present in the
C++ APIs as well. A few classes, e.g. `ConstantDataArray` and
`ConstantAggregateZero`, can hold 64-bit ArrayTypes but their length
accessors return 32-bit values. This means the same issue from the
original Rust report is also present in LLVM itself. Would it be a
reasonable goal to update all of these length methods & types to be
uint64_t, or would that be too breaking? Alternatively, we could use
safe fallible casts instead of implicit ones inside the accessors (if an
overflow does happen, the solution would be to use
`MyValue->getType()->getArrayNumElements()` instead).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants