Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for allowing zero-sized memory accesses and offsets #117945

Open
1 of 6 tasks
RalfJung opened this issue Nov 15, 2023 · 10 comments
Open
1 of 6 tasks

Tracking Issue for allowing zero-sized memory accesses and offsets #117945

RalfJung opened this issue Nov 15, 2023 · 10 comments
Labels
C-tracking-issue Category: A tracking issue for an RFC or an unstable feature.

Comments

@RalfJung
Copy link
Member

RalfJung commented Nov 15, 2023

This issue tracks implementing the t-opsem decision in rust-lang/unsafe-code-guidelines#472. This will require adjustments in many places (codegen, Miri, library docs, reference, ...). The intention is to track here what needs to be done until the transition is complete.

2024-04-17: The current status is that this is blocked on t-lang discussing #117329.

  • update LLVM codegen
  • update cranelift codegen (not needed)
  • update GCC codegen
  • update Miri
  • update library docs
  • update the reference

Implementation history

@RalfJung RalfJung added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Nov 15, 2023
@bjorn3
Copy link
Member

bjorn3 commented Nov 16, 2023

cg_clif accepts ZST memory accesses and pointer offsets already. Pointer offsets are implemented as integer addition which doesn't have UB and ZST memory accesses never get turned into loads and stores in cranelift ir as there is no instruction that does so.

@RalfJung
Copy link
Member Author

memory accesses never get turned into loads and stores in cranelift ir as there is no instruction that does so.

Besides direct accesses, the other concerns are the copy, write_bytes, compare_bytes intrinsics. Those must be implemented in a way that they are not UB when elem_count*elem_size is 0.

@bjorn3
Copy link
Member

bjorn3 commented Nov 16, 2023

They are implemented by calling the respective libc functions which LLVM already expects to accept 0-sized accesses, right?

@RalfJung
Copy link
Member Author

GCC codegen might also need updating, Cc @antoyo @GuillaumeGomez

@RalfJung
Copy link
Member Author

They are implemented by calling the respective libc functions which LLVM already expects to accept 0-sized accesses, right?

Well what LLVM assumes doesn't matter for the cranelift backend, does it? ;) But more importantly, Rust explicitly assumes this itself as documented here.

@GuillaumeGomez
Copy link
Member

No problem. Please ping us when we need to update our part and thanks for the ping!

@RalfJung
Copy link
Member Author

Well I'm asking you if you need to update anything. :) You need to make sure that the Offset MIR binop is compiled in a way that offset by 0 bytes is always Defined Behavior even if the pointer operand is null or dangling or out of bounds or whatever.

I think zero-sized memory accesses disappear in the SSA codegen infrastructure before your backend even sees them so they should be fine.

And finally the copy, copy_nonoverlapping, write_bytes, compare_bytes intrinsics need to be lowered in a way that they are Defined Behavior when the size is 0, even if the pointers are null or dangling or whatever.

@antoyo
Copy link
Contributor

antoyo commented Nov 16, 2023

These intrinsics are implemented by calling the GCC builtin functions: memcmp, memset, memcpy, memmove.
I'll double-check, but it seems fine to have a count of zero, but not NULL pointers.

@RalfJung
Copy link
Member Author

RalfJung commented Nov 16, 2023

Okay, something needs to change then in the backend because we'll allow null pointers for the Rust intrinsics.

@RalfJung
Copy link
Member Author

I updated #117329 to make it ready to land.

@rust-lang/opsem to address this concern, I made offset_from have library UB but not language UB on two pointers with the same address but provenance for different allocations. Please let me know what you think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: A tracking issue for an RFC or an unstable feature.
Projects
None yet
Development

No branches or pull requests

4 participants