Add more comprehensive crate level docs for bevy_ptr #12391

Merged
merged 11 commits into from Mar 12, 2024
106 changes: 102 additions & 4 deletions crates/bevy_ptr/README.md
@@ -6,8 +6,106 @@
[![Docs](https://docs.rs/bevy_ptr/badge.svg)](https://docs.rs/bevy_ptr/latest/bevy_ptr/)
[![Discord](https://img.shields.io/discord/691052431525675048.svg?label=&logo=discord&logoColor=ffffff&color=7389D8&labelColor=6A7EC2)](https://discord.gg/bevy)

The `bevy_ptr` crate provides low-level abstractions for working with pointers in a safer way than using Rust's raw pointers.
Pointers in computer programming, according to Wikipedia, are "objects in many programming languages that stores a memory address".
Member

Suggested change
Pointers in computer programming, according to Wikipedia, are "objects in many programming languages that stores a memory address".
A pointer, according to Wikipedia, is an "object in many programming languages that stores a memory address".

This fixes the plurality grammatical error where "stores" should be singular.

I would recommend rewriting this phrase, though. I don't think quoting Wikipedia is necessary here, especially with the "in many programming languages" part. It's implied that this is programming related, so the phrase is redundant.

Member Author

Agreed. Reworded this section.

They're a fundamental building block for constructing more complex data structures.

Rust has lifetimed and typed references (`&'a T`), unlifetimed and typed references (`*const T`), but no lifetimed but untyped references.
`bevy_ptr` adds them, called `Ptr<'a>`, `PtrMut<'a>` and `OwningPtr<'a>`.
These types are lifetime-checked so can never lead to problems like use-after-frees and must always point to valid data.
Pointers are also *the* definitive source of memory safety bugs: you can dereference a null pointer, access a pointer after the underlying
memory has been freed, or ignore type safety and misread or mutate the underlying memory improperly.

Rust is a programming language that heavily relies on its types to enforce correctness, and by proxy, memory safety. As a result,
Rust has an entire zoo of types for working with pointers, and a graph of safe and unsafe conversions among them to make working
with them safer.

`bevy_ptr` is a crate that attempts to bridge the gap between the full-blown unsafety of `*mut ()` and the safe `&'a T`, allowing users
to build progressively to choose what invariants to uphold.
Member

Suggested change
to build progressively to choose what invariants to uphold.
to choose which invariants to uphold.

Member Author

Reworded this to be a bit clearer about what I meant here.


## How to build a Borrow (from scratch)
Correctly and safely converting a pointer into a valid borrow is at the core of all `unsafe` code in Rust. According to the documentation for
[`(*const T)::as_ref`], a pointer must satisfy *all* of the following conditions:

* The pointer must be properly aligned.
* The pointer cannot be null, even for zero-sized types.
* The pointer must be within bounds of a valid allocated object (on the stack or the heap).
* The pointer must point to an initialized instance of `T`.
* The newly assigned lifetime should be valid for the value that the pointer is targeting.
* The code must enforce Rust's aliasing rules. Only one mutable borrow or arbitrarily many read-only borrows may exist to a value at any given moment
in time, and converting from `&T` to `&mut T` is never allowed.

Note that these rules aren't final and are still in flux as the Rust Project hashes out exactly what the pointer aliasing rules are, but the expectation is that the
final set of constraints is going to be a superset of this list, not a subset.
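
As a rough illustration of the conditions above, the sketch below checks the two that can be verified at runtime (null and alignment) before calling `as_ref`, and leaves the rest to the caller. The names `borrow_checked` and `borrow_demo` are hypothetical and are not part of `bevy_ptr` or the standard library.

```rust
use std::mem::align_of;

/// Converts a raw pointer into a borrow, rejecting null or misaligned pointers.
///
/// # Safety
/// The caller must still guarantee everything that cannot be checked here:
/// the pointer is in bounds of a live allocation, points to an initialized `T`,
/// the chosen lifetime `'a` does not outlive the value, and no `&mut T` to the
/// same value exists while the returned borrow is alive.
unsafe fn borrow_checked<'a, T>(ptr: *const T) -> Option<&'a T> {
    // Alignment can be checked from the address alone.
    if (ptr as usize) % align_of::<T>() != 0 {
        return None;
    }
    // SAFETY: `as_ref` returns `None` for null pointers; alignment was checked
    // above and the caller upholds the remaining conditions.
    unsafe { ptr.as_ref() }
}

fn borrow_demo() -> (Option<u32>, Option<u32>) {
    let value = 42u32;
    // SAFETY: the pointer comes from a live, initialized local that is not
    // mutated while the borrow is in use.
    let valid = unsafe { borrow_checked(&value as *const u32) }.copied();
    // SAFETY: a null pointer is rejected before any dereference happens.
    let null = unsafe { borrow_checked(std::ptr::null::<u32>()) }.copied();
    (valid, null)
}
```

Note how most of the conditions end up in the `# Safety` section rather than in code: they are obligations on the caller, which is exactly what makes them hard to uphold.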

This list is already non-trivial to satisfy in isolation. Thankfully, the Rust core/standard library provides a progressive list of pointer types that help
build these safety guarantees...

## Standard Pointers

|Pointer Type |Lifetime'ed|Mutable|Strongly Typed|Aligned|Not Null|Forbids Aliasing|Forbids Arithmetic|
|:------------------|:----------|:------|:-------------|:------|:-------|:---------------|:-----------------|
|`Box<T>` |Owned |Yes |Yes |Yes |Yes |Yes |Yes |
|`&'a mut T` |Yes |Yes |Yes |Yes |Yes |Yes |Yes |
|`&'a T` |Yes |No |Yes |Yes |Yes |No |Yes |
|`&'a UnsafeCell<T>`|Yes |Maybe |Yes |Yes |Yes |Yes |Yes |
|`NonNull<T>` |No |Yes |Yes |No |Yes |No |No |
|`*const T` |No |No |Yes |No |No |No |No |
|`*mut T` |No |Yes |Yes |No |No |No |No |
|`*const ()` |No |No |No |No |No |No |No |
|`*mut ()` |No |Yes |No |No |No |No |No |

`&T`, `&mut T`, and `Box<T>` are by far the most common pointer types that Rust developers will see. They're the only ones in this list that are entirely usable
without the use of `unsafe`.

`&UnsafeCell<T>` is the first step away from safety. `UnsafeCell` is the *only* way in the language to get a mutable borrow from an immutable one, so it's the
base primitive for all interior mutability: `Cell<T>`, `RefCell<T>`, `Mutex<T>`, `RwLock<T>`, etc. are all built on top of
`UnsafeCell<T>`. To safely convert `&UnsafeCell<T>` into a `&T` or `&mut T`, the caller must guarantee that all simultaneous accesses follow Rust's aliasing rules.
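
To make the interior mutability point concrete, here is a minimal sketch of a `Cell`-like wrapper built on `UnsafeCell<T>`. `MyCell` and `cell_demo` are hypothetical names for illustration; this is not how `std::cell::Cell` is actually written, only the same idea in miniature.

```rust
use std::cell::UnsafeCell;

/// A hypothetical single-threaded cell that allows mutation through `&self`.
struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        Self { value: UnsafeCell::new(value) }
    }

    fn get(&self) -> T {
        // SAFETY: `MyCell` is `!Sync` because `UnsafeCell` is, and it never
        // hands out references to the inner value, so no `&mut T` can be
        // alive during this read.
        unsafe { *self.value.get() }
    }

    fn set(&self, value: T) {
        // SAFETY: same reasoning as `get`; nothing else can observe the
        // value while it is being overwritten.
        unsafe { *self.value.get() = value; }
    }
}

fn cell_demo() -> i32 {
    let cell = MyCell::new(1);
    // Only a shared borrow is needed to mutate the contents.
    let shared = &cell;
    shared.set(shared.get() + 41);
    cell.get()
}
```

The aliasing guarantee here comes from the API shape (values are copied in and out, never borrowed), which is one way a caller can satisfy the requirement described above.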

`NonNull<T>` takes quite a step down from the aforementioned types. In addition to allowing aliasing, it's the first pointer type on this list to drop both
lifetimes and the alignment guarantees of borrows. The only guarantee the type itself enforces is that the pointer is not null; whether it points to a valid
instance of type `T` is left to the code that created it. If you've ever worked with C++, `NonNull<T>` is close to a C++ reference (`T&`) in that it can never be null.
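
A small sketch of what the non-null guarantee buys, using only `std::ptr::NonNull`. The function name `nonnull_demo` is made up for this example.

```rust
use std::ptr::NonNull;

fn nonnull_demo() -> (bool, u8) {
    let mut value = 7u8;

    // `NonNull::new` refuses null pointers by returning `None`.
    let null = NonNull::new(std::ptr::null_mut::<u8>());

    // Creating a `NonNull` from a reference is safe and cannot fail.
    let mut ptr = NonNull::from(&mut value);

    // Dereferencing is still `unsafe`: the type carries no lifetime.
    // SAFETY: `ptr` was just derived from a live `&mut u8`, and nothing else
    // accesses `value` while this temporary borrow is in use.
    unsafe { *ptr.as_mut() += 1; }

    (null.is_none(), value)
}
```

The null check happens once, at construction; everything about validity and lifetime remains the responsibility of the code that dereferences the pointer.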

`*const T` and `*mut T` are what most developers with a background in C or C++ would consider pointers.

`*const ()` is the bottom of this list. It's the Rust equivalent to C's `void*`. Note that Rust doesn't formally have a concept of a type that holds an arbitrary
untyped memory address. Pointing at the unit type (or some other zero-sized type) just happens to be the convention. The only way to reasonably use such a pointer is to
cast it back to a typed pointer. These pointers show up occasionally when dealing with FFI and on the rare occasion where dynamic dispatch is required, but a trait is too
constraining of an interface to work with. A great example of this is the [RawWaker] API, where a singular trait (or set of traits) may be insufficient to capture
all usage patterns. `*mut ()` should only be used to carry the mutability of the target, as there is no way to mutate a value of an unknown type.

[RawWaker]: https://doc.rust-lang.org/std/task/struct.RawWaker.html
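
The sketch below shows the "cast back to a typed pointer" pattern, loosely modeled on how `RawWaker` pairs a `*const ()` data pointer with functions that know the real type. `ErasedCallback`, `string_len`, and `erased_demo` are hypothetical names used only for this illustration.

```rust
/// A hypothetical type-erased callback: an untyped data pointer plus a
/// function that knows which type the pointer really refers to.
struct ErasedCallback {
    data: *const (),
    call: unsafe fn(*const ()) -> usize,
}

/// # Safety
/// `data` must have been created from a pointer to a live `String`.
unsafe fn string_len(data: *const ()) -> usize {
    // SAFETY: the caller guarantees that `data` points to a live `String`,
    // so casting back to the original type and borrowing it is sound.
    let text: &String = unsafe { &*data.cast::<String>() };
    text.len()
}

fn erased_demo() -> usize {
    let text = String::from("bevy");
    let callback = ErasedCallback {
        // Erase the type: all that remains is an address.
        data: (&text as *const String).cast::<()>(),
        call: string_len,
    };
    // SAFETY: `data` points to `text`, which outlives this call, and `call`
    // is the function that matches the erased type.
    unsafe { (callback.call)(callback.data) }
}
```

Nothing in the type system ties `data` to `call`; keeping the two in sync is an invariant the author has to maintain by hand.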

## Available in Nightly

|Pointer Type |Lifetime'ed|Mutable|Strongly Typed|Aligned|Not Null|Forbids Aliasing|Forbids Arithmetic|
|:------------------|:----------|:------|:-------------|:------|:-------|:---------------|:-----------------|
|`Unique<T>` |Owned |Yes |Yes |Yes |Yes |Yes |Yes |
|`Shared<T>` |Owned* |Yes |Yes |Yes |Yes |No |Yes |

`Unique<T>` is currently available in `core::ptr` on nightly Rust builds. It's a pointer type that acts like it owns the value it points to. It can be thought of
as a `Box<T>` that does not allocate on initialization or deallocate when it's dropped, and is in fact used to implement common types like `Box<T>`, `Vec<T>`,
etc.

`Shared<T>` is currently available in `core::ptr` on nightly Rust builds. It's the pointer that backs both `Rc<T>` and `Arc<T>`. Its semantics allow for
multiple instances to collectively own the data it points to, and as a result, forbids getting a mutable borrow.

`bevy_ptr` does not support these types right now, but may support [polyfills] for these pointer types upon request.

[polyfills]: https://en.wikipedia.org/wiki/Polyfill_(programming)

## Available in `bevy_ptr`

|Pointer Type |Lifetime'ed|Mutable|Strongly Typed|Aligned|Not Null|Forbids Aliasing|Forbids Arithmetic|
|:--------------------|:----------|:------|:-------------|:------|:-------|:---------------|:-----------------|
|`ConstNonNull<T>` |No |No |Yes |No |Yes |No |Yes |
|`ThinSlicePtr<'a, T>`|Yes |No |Yes |Yes |Yes |Yes |Yes |
|`OwningPtr<'a>` |Yes |Yes |No |Maybe |Yes |Yes |No |
|`Ptr<'a>` |Yes |No |No |Maybe |Yes |No |No |
|`PtrMut<'a>` |Yes |Yes |No |Maybe |Yes |Yes |No |

`ConstNonNull<T>` is like `NonNull<T>` but disallows safe conversions into types that allow mutable access to the value it points to. It's the `*const T` to
`NonNull<T>`'s `*mut T`.

`ThinSlicePtr<'a, T>` is a `&'a [T]` without the slice length. This makes it smaller on the stack, but bounds checking is impossible locally, so
accessing elements in the slice is `unsafe`. In debug builds, the length is included and will be checked.
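
The idea can be sketched with the standard library alone. The `ThinSlice` type below is a hypothetical stand-in written for this example, not `bevy_ptr`'s actual implementation; it only assumes the behavior described above (no length in release builds, a checked length in debug builds).

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

/// Hypothetical thin slice: a pointer to the first element, tied to `'a`.
struct ThinSlice<'a, T> {
    ptr: NonNull<T>,
    // The length only exists in debug builds, where it is used for checks.
    #[cfg(debug_assertions)]
    len: usize,
    _marker: PhantomData<&'a [T]>,
}

impl<'a, T> ThinSlice<'a, T> {
    fn new(slice: &'a [T]) -> Self {
        Self {
            // Drop the length by casting the slice pointer to an element pointer.
            ptr: NonNull::from(slice).cast::<T>(),
            #[cfg(debug_assertions)]
            len: slice.len(),
            _marker: PhantomData,
        }
    }

    /// # Safety
    /// `index` must be in bounds of the slice this was created from.
    unsafe fn get(&self, index: usize) -> &'a T {
        #[cfg(debug_assertions)]
        debug_assert!(index < self.len, "index out of bounds");
        // SAFETY: the caller guarantees `index` is in bounds, and the
        // original slice is borrowed for `'a`.
        unsafe { &*self.ptr.as_ptr().add(index) }
    }
}

fn thin_demo() -> i32 {
    let data = [10, 20, 30];
    let thin = ThinSlice::new(&data);
    // SAFETY: index 1 is in bounds of the three-element array.
    unsafe { *thin.get(1) }
}
```

The trade-off is visible in the signature of `get`: the bounds check that `&[T]` performs for free becomes a safety contract on the caller.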

`OwningPtr<'a>`, `Ptr<'a>`, and `PtrMut<'a>` act like `NonNull<()>`, but attempt to restore much of the safety guarantees of `Unique<T>`, `&T`, and `&mut T`.
They allow working with heterogeneous type-erased storage (e.g. ECS tables, typemaps) without the overhead of dynamic dispatch in a manner that progressively
translates back to safe borrows. These types also support optional alignment requirements at a type level, and will verify them on dereference in debug builds.
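
To show what "lifetimed but untyped" means in practice, here is a simplified sketch using only the standard library. `ErasedRef` and its methods are hypothetical names chosen for this example and should not be read as `bevy_ptr`'s API; the sketch only assumes the properties listed in the table above (lifetime-checked, non-null, type-erased, read-only).

```rust
use std::marker::PhantomData;
use std::mem::align_of;
use std::ptr::NonNull;

/// Hypothetical lifetimed, type-erased shared pointer.
#[derive(Clone, Copy)]
struct ErasedRef<'a> {
    ptr: NonNull<u8>,
    // Ties the pointer to the borrow it was created from.
    _marker: PhantomData<&'a u8>,
}

impl<'a> ErasedRef<'a> {
    /// Erasing the type is safe: the borrow checker still tracks `'a`.
    fn new<T>(value: &'a T) -> Self {
        Self { ptr: NonNull::from(value).cast::<u8>(), _marker: PhantomData }
    }

    /// # Safety
    /// `T` must be the type this pointer was created from.
    unsafe fn deref<T>(self) -> &'a T {
        let typed = self.ptr.cast::<T>();
        // Alignment is only verified in debug builds.
        debug_assert!(typed.as_ptr() as usize % align_of::<T>() == 0);
        // SAFETY: the caller guarantees the type matches, and `'a` keeps the
        // original value borrowed for as long as the result is used.
        unsafe { &*typed.as_ptr() }
    }
}

fn erased_ref_demo() -> (u32, f32) {
    let health = 100u32;
    let speed = 2.5f32;
    // Values of different types stored side by side without dynamic dispatch.
    let column = [ErasedRef::new(&health), ErasedRef::new(&speed)];
    // SAFETY: each entry is read back with the type it was created from.
    unsafe { (*column[0].deref::<u32>(), *column[1].deref::<f32>()) }
}
```

Only the type is left for the caller to get right; use-after-free is ruled out by the lifetime, which is the gap between `*const ()` and `&'a T` that these pointer types are meant to fill.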