Note design constraints on hypothetical `DynSized` #166

CAD97 · 2022-06-14T00:30:25Z

I've been meaning to write this up for a while (since the 2020-01-06 design meeting, honestly). New information brought to light in my recent irlo thread finally prompted me to collect this into one document.

This represents my best understanding of the implications and design constraints on the hypothetical DynSized trait. I've done my best to stay neutral and objective here.

cc @Ericson2314 (pushing for DynSized as an implementation detail)

CAD97 · 2022-06-14T00:35:09Z

One additional note w.r.t. Mutex that seemed a bit too nonobjective to include in the main text:

Mutex<T> could support T: ?Sized + ?MetaSized + ?DynSized if Mutex<T>: DynSized => T: MetaSized. Locking on size_of_val_raw is likely not possible with the constraint that size_of_val_raw should work on a dropped pointee. (However, if no Mutex uses an implementation that requires deinitialization of the actual raw mutex lock itself (e.g. a boxed OS mutex), then it might be possible.)

Ericson2314 · 2022-06-14T02:48:14Z

Glad this exists!

I think a

struct WithLayout<T: ?DynSized> {
    layout: Layout,
    data: T,
};

impl<T: ?DynSized> DynSized for WithLayout<T> { ... }

might also be possible to help odd-ball types use Ref<T>-like types.

The CString restrictions I am not too worried about because even without interior mutability if one modifies them one could end up deallocating too much or too little. An restricted form of them where one cannot write any null bytes or overwrite the the original null byte (So something like &mut RestrictedThinCStr -> &mut [NonZeroU8]) also can safe to use with UnsafeCell. I think we can therefore skip any MetaSized for UnsafeCell and do a MutationCannotChangeSize marker trait or even just document that restriction.

CAD97 · 2022-06-14T03:22:37Z

The problem I illustrated with ThinCStr exists for WithLayout as well, and even (<T as Pointee>::Metadata, T), though; it's not specific to CStr. (FWIW, I originally thought it might not be a problem since &mut CStr doesn't expose any functionality over &CStr.)

The problem is that the example fn noop_write is presumably sound for for<T: ?Sized + DynSized> &mut T. If you have &mut T, that gives you permission to nonatomically read and write back the unchanged value for the entire span of mem::size_of_val(it).

It is a fundamental assumption of UnsafeCell<T> that given &UnsafeCell<T>, you cannot safely read the bytes of T, and have to unsafely assert that you're upholding borrowing/thread-safety rules. If mem::size_of_val::<&UnsafeCell<T>> requires reading any bytes from T, this breaks the fundamental restriction that allows UnsafeCell to work.

CStr is just the worst case scenario, because it always requires reading the entire value to determine the size. For WithLayout, you could potentially do some waffling about requiring self.layout not to be written to; with CStr this would require all writes to be atomic to avoid racing with (now required atomic) the read to scan for the nul byte.

Note that it's not just sufficient to say "it only writes nonzero bytes," because a data race is UB. Naively you'd want to assume that a data race on byte-sized values would only result in reading some value that's previously been written (as byte sized writes theoretically should not be able to tear), but this is not what the memory model gives us¹; a write and another memory use on a different thread, at least one of which is not atomic and which are not externally synchronized is plain UB and your program no longer has any defined semantics, anything can happen, nasal demons eating your laundry, etc.

So even if reading in size_of_val is required to be atomic, it's still possible to write to the object's memory via noop_write and cause a data race. It is unsound to expose both &UnsafeCellLike<T> and &mut T where getting the size of T requires a reading any bytes from T.

(If there's a way I can better communicate this in not quite so many words for the document itself, I'd appreciate any edits, additions, or guidance.)

IIUC LLVM actually has an unordered atomic ordering which has this behavior of not causing UB on a data race, but just reading some bytes which were previously written (independently arbitrary for each byte). However, if the other side of the data race is unatomic and not just unordered, it's still a UB data race. ↩

Ericson2314 · 2022-06-14T19:21:06Z

I think if noop_write must require MetaSized, then UnsafeCell is safe to not MetaSized. mem::size_of_val::<UnsafeCell<T>> must merely not read any mutable bytes, but if we can prevent noop_write then some bytes are not subject to interior mutation I believe? That can save us.

Anyways, what this reveals to me is that ?MetaSized is the heart of fully general custom DSTs, while ?DynSized alone is nicely a more narrow thing. I would make the first version of a DynSized not allow for any ?MetaSized by forcing the information to come from the meta alone. Then ThinCString cannot be DynSized but I think that's OK: O(n) size_of_val is a footgun anyways so long as that function is implicitly called.

Later, if we are happy with DynSized (and perhaps beyond the taboo of more opt-out traits in general 🤞) we can return to the question of ?MetaSized and fully general custom DSTs. As long as we do that before DynSized becomes stable, there should be no issue.

CAD97 · 2022-06-14T19:34:08Z

It seems very subtle to allow size_of_val<T: ?Sized + ?MetaSized + DynSized> but require noop_write to be <T: ?Sized + MetaSized>. (I'm using fully explicit ?bounds here for clarity.) As today's ?Sized is (roughly¹) ?Sized + MetaSized, this is theoretically a possible resolution, but it seems an undesirably subtle one to me. I'll add it as a potential resolution into the doc, though.

you can accomplish metadata -> Layout by creating a dangling pointer and calling [size|align]_of_val_raw, but there is still a safety burden that the described layout size must fit in isize ↩

Ericson2314 · 2022-06-14T19:47:05Z

@CAD97 Yeah I am not supposed we get deep into that now.

Rather, other hand if we ignore ?MetaSized entire for now and allow noop_write, it's point out that if later we do add ?MetaSized we have option of noop_write requiring MetaSized (so saying the implicit bound cannot be relaxed) and that opens some doors.

The stuff is super trickle and subtle, which is why I don't want to make it part of the much simpler DynSized initial effort. But I also don't want anyone to think DynSized is a dead end because all this tricky custom DST stuff is impossible no matter what. It is possible just not for the squeemish :).

Bottom line is politically I feel like there is a tough issues to thread where:

On one hand, DynSized can't be seen as a slippery slope to custom DST endless complexity for those worried about that.
On the other hand, DynSized sized needs to seem a viable stepping stone to custom DST endless flexibility for those that want that.

See my choice of metaphors :). Stepping stones are used to cross bodies of water which are (locally) flat. The stepping stones are necessary to cross, but unlike the slippery slope they don't force one to finish crossing once one has begun stepping on them.

CAD97 · 2022-07-02T21:17:04Z

Backlinking this here:

rust-lang/rust#97052 proposes to add

// std::ptr
struct TypedMetadata<T: ?Sized>(pub <T as Pointee>::Metadata);
impl<T: ?Sized + Unsize<U>, U: ?Sized> CoerceUnsized<TypedMetadata<U>> for TypedMetadata<T> {}

This allows unsizing pointee metadata without access to an attached pointer. This would place another restriction on custom unsized types: that any unsizing must operate exclusively on the metadata without access to a pointer.

I can think of one possible feature that would be eliminated by this:

Consider a type Indyn<T> which is (T::Metadata, T), and &Indyn<T> is always thin. Essentially, the heap representation of ThinBox<T>. The above restriction prevents a coercion from &Indyn<dyn Trait> to &dyn Trait via CoerceUnsized; a conversion via auto[de]ref &*(_: &Indyn<dyn Trait>) is fine.

CAD97 · 2022-07-03T17:20:41Z

Crossposting again for visibility:

That coercion (&Indyn<dyn Trait> -> &dyn Trait) could work if Indyn<dyn Trait>: Trait. It still runs into the other problems with pointee-sized DSTs rather than meta-sized DSTs (see the above design notes), but coercing &Indyn<dyn Trait> -> &dyn Trait could work by using a vtable which is implemented as roughly method(ptr, args...) = (*(ptr as *const trait_vtable_t)).method(ptr.offset_bytes(sizeof::<trait_vtable_t>()), args...)`.

Writing that out: this coercion itself actually does only need the (Indyn) type and its pointee metadata (()), and is not verboten by TypedMetadata coercions. I return to the position that any coercions that are address/pointee-sensitive are custom coercions to custom DSTs (such as avoiding the recursive vtable lookup by dynamically creating a vtable).

CAD97 · 2022-07-13T18:01:30Z

Just discussed in passing on Zulip, custom pointer metadata will likely run into all the same issues as Box<T, A> does for making pointers no longer 100% controlled by the language. Technically this is not directly relevant, as these design notes are technically just on the opt-out default trait bound, and not on actually adding new pointee kinds beyond extern type.

nikomatsakis

This is a great summary, thank you @CAD97

src/design_notes/dynsized_constraints.md

nikomatsakis · 2022-10-04T20:27:13Z

@CAD97 can you clarify the two above points quickly? r=me in any case, so I'll merge once you do so (or if you don't by next week, I'll merge anyway...)

CAD97 · 2022-10-04T21:13:02Z

@nikomatsakis I agree that section was confusingly written and did my best to rewrite for clarity.

src/design_notes/dynsized_constraints.md

Co-authored-by: Christopher Durham <cad97@cad97.com>

Note design constraints on hypothetical DynSized

5cd1c6c

Fix T-lang quote

9a7a391

Update dynsized_constraints.md

5b2caa2

Update dynsized_constraints.md

17311ed

CAD97 mentioned this pull request Jun 28, 2022

Implement pointee metadata unsizing via a TypedMetadata<T> container rust-lang/rust#97052

Closed

CAD97 mentioned this pull request Jul 13, 2022

Tracking Issue for layout information behind pointers rust-lang/rust#69835

Open

2 tasks

CAD97 mentioned this pull request Jul 13, 2022

Replace feature(layout_for_ptr) with feature(layout_for_meta) rust-lang/libs-team#67

Open

nikomatsakis approved these changes Oct 4, 2022

View reviewed changes

src/design_notes/dynsized_constraints.md Outdated Show resolved Hide resolved

src/design_notes/dynsized_constraints.md Outdated Show resolved Hide resolved

Update dynsized_constraints.md

f24c4f2

CAD97 added 2 commits October 4, 2022 16:13

Update dynsized_constraints.md

5260de8

Fix footnote syntax

1aa06cc

CAD97 commented Oct 4, 2022

View reviewed changes

src/design_notes/dynsized_constraints.md Outdated Show resolved Hide resolved

Update src/design_notes/dynsized_constraints.md

9e68194

Co-authored-by: Christopher Durham <cad97@cad97.com>

nikomatsakis merged commit 9b61841 into rust-lang:master Oct 11, 2022

CAD97 deleted the patch-1 branch October 11, 2022 20:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Note design constraints on hypothetical `DynSized` #166

Note design constraints on hypothetical `DynSized` #166

CAD97 commented Jun 14, 2022 •

edited

Loading

CAD97 commented Jun 14, 2022

Ericson2314 commented Jun 14, 2022

CAD97 commented Jun 14, 2022

Ericson2314 commented Jun 14, 2022

CAD97 commented Jun 14, 2022

Ericson2314 commented Jun 14, 2022 •

edited

Loading

CAD97 commented Jul 2, 2022

CAD97 commented Jul 3, 2022 •

edited

Loading

CAD97 commented Jul 13, 2022 •

edited

Loading

nikomatsakis left a comment

nikomatsakis commented Oct 4, 2022

CAD97 commented Oct 4, 2022

Note design constraints on hypothetical DynSized #166

Note design constraints on hypothetical DynSized #166

Conversation

CAD97 commented Jun 14, 2022 • edited Loading

CAD97 commented Jun 14, 2022

Ericson2314 commented Jun 14, 2022

CAD97 commented Jun 14, 2022

Footnotes

Ericson2314 commented Jun 14, 2022

CAD97 commented Jun 14, 2022

Footnotes

Ericson2314 commented Jun 14, 2022 • edited Loading

CAD97 commented Jul 2, 2022

CAD97 commented Jul 3, 2022 • edited Loading

CAD97 commented Jul 13, 2022 • edited Loading

nikomatsakis left a comment

Choose a reason for hiding this comment

nikomatsakis commented Oct 4, 2022

CAD97 commented Oct 4, 2022

Note design constraints on hypothetical `DynSized` #166

Note design constraints on hypothetical `DynSized` #166

CAD97 commented Jun 14, 2022 •

edited

Loading

Ericson2314 commented Jun 14, 2022 •

edited

Loading

CAD97 commented Jul 3, 2022 •

edited

Loading

CAD97 commented Jul 13, 2022 •

edited

Loading