New RFC: Collection Transmute #2756

Open
wants to merge 3 commits into
base: master
from

Conversation

@llogiq
Contributor

commented Sep 3, 2019

rendered

@llogiq llogiq force-pushed the llogiq:collection-transmute branch from d11da3b to 4a44ee1 Sep 3, 2019

@llogiq llogiq force-pushed the llogiq:collection-transmute branch from 4a44ee1 to 15256d3 Sep 3, 2019

@Centril

Member

commented Sep 3, 2019

@BurntSushi

Member

commented Sep 3, 2019

Nice! Thanks for writing this up.

Though the code is not too large, there are a good number of things to get wrong – this solution was iteratively created by soundness-knowledgeable Rustaceans, with multiple wrong attempts.

Could you include text in the RFC that documents this progression? In particular, why is the naive approach unsound, and why is the final result correct?

In addition to that, I don't see alignment mentioned in this RFC, and I'd kind of expect that to be an issue somewhere. Or am I misunderstanding?

Additionally, are there any problems with the fact that you might create an allocation for Vec<T>, but wind up freeing that allocation as a Vec<U>? It seems to me like that is not universally okay. I think size_of::<T>() == size_of::<U>() and align_of::<T>() == align_of::<U>() both need to hold, where I think the latter is an additional restriction over transmute, which AFAIK does not require the alignments to be equivalent. Are there more conditions that need to hold?
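
A minimal sketch of the two conditions named above, using `std::mem::size_of` and `std::mem::align_of`. The helper name `same_size_and_align` is hypothetical and not part of the RFC; the alignment of `u32` is assumed to be larger than 1, as it is on common targets.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: true when `T` and `U` agree in both size and alignment.
fn same_size_and_align<T, U>() -> bool {
    size_of::<T>() == size_of::<U>() && align_of::<T>() == align_of::<U>()
}

fn main() {
    // Same size and alignment: only the signedness differs.
    assert!(same_size_and_align::<u32, i32>());
    // Same size (4 bytes), but `[u8; 4]` only needs 1-byte alignment.
    assert!(!same_size_and_align::<u32, [u8; 4]>());
    // Different size.
    assert!(!same_size_and_align::<u8, u64>());
}
```

Both checks involve only compile-time constants, so such a guard would be expected to fold away after monomorphisation.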

@rkruppe

Member

commented Sep 3, 2019

I'd rather just declare mem::transmute to be OK. A new method with the same name and purpose as mem::transmute but subtly different meaning makes it difficult to remember which one to use, and it doesn't help people who (in the past or in the future) write it "incorrectly".

As far as I know, transmuting a Vec (if the item types are sufficiently compatible) is only UB today because we're not committed to Vec<T> and Vec<U> having the same memory layout for all T and U, but we can easily guarantee that if we so choose:

  • The current implementation (e.g., Vec's source code, rustc's layout algorithm) poses no obstacles to such transmutes, it's just a matter of guaranteeing it will remain so
  • There's no reason to expect that we'll ever want to not make it true (e.g. it might make sense to randomize layout of user-defined structs or determine it by PGO, but for Vec neither point seems likely to be useful)
  • If needed (e.g. because we add struct layout randomization), we can always add something extra to Vec to make it true, e.g., a repr(C) attribute or the hypothetical repr(in_order) (if the FFI implications of repr(C) are undesirable)

One remaining benefit of a specialized transmute method would be that it could do more sanity checks, such as checking that the item types have the same size and alignment. This is significant, but I'm not so sure it outweighs the aforementioned drawbacks of having a separate but similar method.

@Centril Centril added the T-lang label Sep 3, 2019

@burdges


commented Sep 3, 2019

I agree with @BurntSushi about enforcing size_of::<T>() == size_of::<U>(). We must enforce this because mem::transmute specifies that it checks type sizes, which morally unsafe { mem::transmute::<_,Vec<u64>>(vec![0u8]) } violates.

I think however an inequality suffices for the alignment, probably align_of::<U>() < align_of::<T>() no? If we dislike this, then Vec::transmute could reallocate whenever it must fix alignment, but maybe transmute users would prefer the inequality.

@BurntSushi

Member

commented Sep 3, 2019

I got the equality (as opposed to inequality) from the docs on GlobalAlloc::dealloc. Specifically:

layout must be the same layout that was used to allocate that block of memory

Where layout specifies the alignment. A strict reading of that suggests that the alignment you allocate with must be the same as the alignment that you deallocate with. Whether this is actually required for the underlying memory allocator(s) used, I don't know.

@burdges


commented Sep 3, 2019

An inequality would prove useful, so maybe @RalfJung can say if it suffices.

@RalfJung

Member

commented Sep 3, 2019

You'll have to ask whoever wrote the allocator docs. I have no idea why alignment is even required to match.

(I'll only be able to read the full thing and respond properly to it when I am back from vacation.)

@sfackler

Member

commented Sep 3, 2019

The Windows system allocator implementation does rely on getting the same alignment when deallocating as allocating I believe.

@Diggsey

Contributor

commented Sep 3, 2019

This is not just a windows thing: any time that you need a greater alignment than the underlying allocator can provide, you have to allocate a slightly larger chunk of memory and then offset the returned pointer to the required alignment. When it comes to freeing that memory, you need some way to undo that offset.

@RalfJung

Member

commented Sep 3, 2019

Just the alignment is not enough information though to compute what the offset was.
When requesting an 8-byte-aligned allocation where we really need an alignment of 16, we need to know whether the offset-by-8 was necessary or not. Just knowing "oh, 16-aligned was requested" does not help.

@llogiq

Contributor Author

commented Sep 3, 2019

Thanks to all of you for this illuminating discussion! This convinces me that a) the current design isn't courageous enough, I should go with an unsafe Transmute trait instead, b) implementations should check what they can, preferably at compile-time, and c) we should discourage mem::transmute usage, leaving it only as a last resort.

Regarding the alignment question, we'd need the information from the allocator if reducing alignment is acceptable.

@Diggsey

Contributor

commented Sep 3, 2019

Just the alignment is not enough information though to compute what the offset was.
When requesting an 8-byte-aligned allocation where we really need an alignment of 16, we need to know whether the offset-by-8 was necessary or not. Just knowing "oh, 16-aligned was requested" does not help.

You need the alignment to determine whether an offset was applied or not though, and that's exactly what libstd does: https://github.com/rust-lang/rust/blob/master/src/libstd/sys/windows/alloc.rs#L47-L56

Without that, you end up having to add overhead to every single allocation, not just those with unusually large alignment requirements.
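
The scheme described here can be modelled with plain arithmetic. This is only a sketch and not the libstd code: `MIN_ALIGN`, `underlying_size` and `has_header` are hypothetical names, and the value 8 is an assumption about what the underlying allocator guarantees.

```rust
/// Alignment the underlying allocator is assumed to provide for every block
/// (a hypothetical value chosen for this sketch).
const MIN_ALIGN: usize = 8;

/// Number of bytes requested from the underlying allocator.
/// Over-aligned requests get `align` extra bytes so the pointer can be shifted
/// and the original pointer stored just in front of the returned block.
fn underlying_size(size: usize, align: usize) -> usize {
    if align <= MIN_ALIGN {
        size
    } else {
        size + align
    }
}

/// Whether deallocation must look for a stored original pointer in front of
/// the block. The decision is made from the alignment alone.
fn has_header(align: usize) -> bool {
    align > MIN_ALIGN
}

fn main() {
    // Allocated as 16-byte aligned: the block was shifted and carries a header.
    assert!(has_header(16));
    assert_eq!(underlying_size(64, 16), 80);
    // Freed as if 8-byte aligned: no header is looked up, so the shifted
    // pointer would be passed on to an allocator that never returned it.
    assert!(!has_header(8));
    assert_eq!(underlying_size(64, 8), 64);
}
```

In this model, only allocations above `MIN_ALIGN` pay for the extra bytes, which is why the deallocation side needs to be told the same alignment that was used for allocation.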

@clarfon

Contributor

commented Sep 3, 2019

Kind of unsure how this could be used properly in the case of BinaryHeap.

@burdges


commented Sep 3, 2019

How much do we know about when transmute gets used? I assume the most common usages are firstly to convert to/from a byte array, and secondly to violate another crate's visibility rules, like by adding or removing a private wrapper type. I think size_of::<T>() == size_of::<U>() suffices for the second, so the most restrictive form sounds useful.

@llogiq

Contributor Author

commented Sep 3, 2019

@clarfon it's an unsafe operation. One could pair it with a heapify operation to restore the heap property if so desired, in cases where the types' orderings differ.

@clarfon

Contributor

commented Sep 3, 2019

@llogiq Right, but BinaryHeap doesn't offer that in any public APIs, and at that rate, you're mostly just reimplementing what BinaryHeap has to offer anyway. Either the internals of the heap should be exposed (seems out of scope for libstd) or anyone wanting to transmute their heap should just implement one on top of Vec.

@comex


commented Sep 4, 2019

How exactly is the size/alignment check to be implemented? An extension of the special-case intrinsicck routine that currently checks std::mem::transmute? Is there any way to use const generics to integrate it properly with the trait system? (I tried, but it seems like the necessary functionality isn't implemented yet.)

@djc

Contributor

commented Sep 4, 2019

This may be obvious to most people involved here so far, but it might be useful to mention one or two use cases for having a transmute operation on these collections in the first place.

@ExpHP


commented Sep 4, 2019

A Transmute trait sounds useless. Let me explain:


There are some cases where it would be highly desirable for straight-up transmutation to be well-defined. In particular, when switching a generic type argument between #[repr(transparent)] types; I'm not sure that #[repr(transparent)] is specified in a way such that ArbitraryStruct<usize> and ArbitraryStruct<Node> are guaranteed to have the same representation; but it ought to be!

// Newtype index wrapper for a graph node.
#[repr(transparent)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
struct Node(usize);

// It is highly desirable for all of these to be valid operations
transmute::<Vec<Vec<usize>>, Vec<Vec<Node>>>(vecs);
transmute::<Vec<HashSet<usize>>, Vec<HashSet<Node>>>(sets);
transmute::<&HashSet<usize>, &HashSet<Node>>(set);

If we tried to solve the problem of UB Vec transmutes by introducing a Transmute trait, then

  1. either: all of these transmutes have the same tricky UB that we'd be discouraging against mem::transmute for, and so we merely brushed a tiny bit of dirt under the rug.
  2. or: the trait impls do not actually transmute. The first two transmutes would have to cost O(n), and the transmute behind references cannot be implemented.
@KrishnaSannasi


commented Sep 4, 2019

I'm not sure that #[repr(transparent)] is specified in a way such that ArbitraryStruct<usize> and ArbitraryStruct<Node> are guaranteed to have the same representation; but it ought to be!

No it should not!

trait Foo {
    type Bar;
}

struct Baz<T: Foo>(T, T::Bar);

Baz can have different representations even for transparent types.
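
A small sketch of how this can happen. The two `Foo` impls and the `Transparent` wrapper are assumptions made up for illustration; only `Foo` and `Baz` come from the snippet above.

```rust
use std::mem::size_of;

trait Foo {
    type Bar;
}

// Hypothetical transparent newtype around `i32`.
#[allow(dead_code)]
#[repr(transparent)]
struct Transparent(i32);

// Hypothetical impls: the associated type differs between the two.
impl Foo for i32 {
    type Bar = u8;
}

impl Foo for Transparent {
    type Bar = [u64; 4];
}

#[allow(dead_code)]
struct Baz<T: Foo>(T, T::Bar);

fn main() {
    // The wrapper itself has the same size as `i32`...
    assert_eq!(size_of::<Transparent>(), size_of::<i32>());
    // ...but `Baz` stores `T::Bar`, which the impls are free to choose differently.
    assert_ne!(size_of::<Baz<i32>>(), size_of::<Baz<Transparent>>());
}
```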

@rkruppe

Member

commented Sep 4, 2019

@KrishnaSannasi

Baz can have different representations even for transparent types.

That is not a transparent newtype, though. repr(transparent) does guarantee equal layout to the sole non-ZST field (which may not be the same as the type parameter, if any, but that is a minor point).

However, @ExpHP, transmuting transparent newtypes may still not be safe because the newtype may have additional invariants over the wrapped type. For example, NonZeroU32 is a repr(transparent) struct containing u32.
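
A short sketch of the invariant in question, using only the checked `NonZeroU32` constructor from the standard library. The helper `to_non_zero` is a hypothetical name introduced for this example.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical checked conversion: returns `None` if any element is zero.
fn to_non_zero(values: Vec<u32>) -> Option<Vec<NonZeroU32>> {
    values.into_iter().map(NonZeroU32::new).collect()
}

fn main() {
    // Same size as the wrapped integer...
    assert_eq!(size_of::<NonZeroU32>(), size_of::<u32>());
    // ...but not every `u32` is a valid `NonZeroU32`.
    assert!(NonZeroU32::new(0).is_none());
    assert!(to_non_zero(vec![1, 0, 3]).is_none());
    assert!(to_non_zero(vec![1, 2, 3]).is_some());
}
```

A plain transmute from `Vec<u32>` would skip exactly this check, which is why layout compatibility alone does not make the operation safe.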

Use unsafe block in example
Since the transmute method is still unsafe, there should be an unsafe block in the example.
@KrishnaSannasi


commented Sep 4, 2019

@rkruppe you misunderstand, I'm saying that you can't go from Baz<i32> to Baz<Transparent> in general. Also, we can make a slight modification to Baz to make it repr(transparent).

#[repr(transparent)]
struct Quax<T: Foo>(T::Bar, PhantomData<T>);

Edit: made Quax actually repr(transparent)

@llogiq

Contributor Author

commented Sep 4, 2019

@ExpHP Thinking more about it, the sole reason for a trait is that it would let us define and document the requirements for transmutation in one place. On the other hand, it would also allow generic usage, which is something I'm not so sure about. Monomorphising unsafe code doesn't sound like a good idea to me.

@AnthonyMikh


commented Sep 4, 2019

This may be obvious to most people involved here so far, but it might be useful to mention one or two use cases for having a transmute operation on these collections in the first place.

Okay, here is a use case: get the n largest elements from a sequence of orderable elements. This is pretty easy to do with a heap: put the first n elements into a heap, and then for each of the remaining elements, put that element into the heap and remove the least one.

However, there is an obstacle: this solution requires a min-heap, while the Rust standard library only offers a max-heap. Although there is possibly a min-heap implementation on crates.io, I don't want to pull in a dependency just to solve this one task, so I end up wrapping the elements in std::cmp::Reverse.

Now the question is "how to return these max values to the caller?". Ideally, I would call BinaryHeap::into_vec, somehow cast the result into Vec<T> (it should be safe since std::cmp::Reverse<T> and T have the same runtime representation and Vec<T> doesn't rely on ordering operations defined for T) and call it a day. Since such an operation doesn't exist today, I have to resort to vec.into_iter().map(|x| x.0).collect(). Needless to say, that is expensive, but so far it is the only option for me if I want to stay in safe Rust. And even if I decide to go unsafe, this would not be trivial to do right, as shown above.
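
For illustration, the safe version of this use case might look as follows. This is a sketch, assuming `std::cmp::Reverse` as the order-reversing wrapper; the function name `n_largest` is hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Returns the `n` largest items of `items`, in unspecified order.
fn n_largest<T: Ord>(items: impl IntoIterator<Item = T>, n: usize) -> Vec<T> {
    // `BinaryHeap` is a max-heap; wrapping items in `Reverse` puts the
    // smallest item on top instead.
    let mut heap = BinaryHeap::new();
    for item in items {
        heap.push(Reverse(item));
        if heap.len() > n {
            heap.pop(); // discard the current minimum
        }
    }
    // Safe unwrapping step that visits every element; this is the part a
    // collection transmute would replace.
    heap.into_vec().into_iter().map(|Reverse(x)| x).collect()
}

fn main() {
    let mut top = n_largest(vec![5, 1, 9, 3, 7, 8], 3);
    top.sort();
    assert_eq!(top, vec![7, 8, 9]);
}
```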

@AnthonyMikh


commented Sep 4, 2019

My two cents: I am a bit surprised that so far nobody has mentioned prior art developed in GHC Haskell.

@petertodd

Contributor

commented Sep 4, 2019

Thanks to all of you for this illuminating discussion! This convinces me that a) the current design isn't courageous enough, I should go with an unsafe Transmute trait instead, b) implementations should check what they can, preferably at compile-time, and c) we should discourage mem::transmute usage, leaving it only as a last resort.

Regarding the alignment question, we'd need the information from the allocator if reducing alignment is acceptable.

Re: a trait, I did a crate to do this: https://crates.io/crates/coercion

Basically, the design I came up with was to define an unsafe Coerce<T> trait, kind of similar to AsRef<T>, that could be implemented by any type with the same size and alignment as T, with a coerce() method for sized types and coerce_ptr()/coerce_mut_ptr() methods for unsized types to coerce the pointers. In my crate, coerce() is more of a check than something you should actually implement: if it can be implemented with mem::transmute(), you've proven your type can in fact implement Coerce.

Ideally, Coerce<T> would be implemented automatically by the compiler for the relevant combinations; note how the unsized pointer coercions also subsume the functionality of CoerceUnsized in a way. If implemented by the compiler, we could give mem::transmute an appropriate T: Transmute<U> bound in a backwards-compatible way, as getting transmute wrong is a compile-time error.

I also included an As<T> trait for cases where the coercion was safe.

The big advantage of this being a trait is that all container types can implement it, which was my rationale for doing so in the first place. Basically, I have a project with generic container types where being able to transmute between different forms in-place is useful.

@alecmocatta


commented Sep 4, 2019

I've wanted an in-place map method on containers for a while, that avoids re-allocation if it doesn't need to:

fn map<U,F>(self, f: F) -> Vec<U> where F: FnMut(T) -> U {
    // Like this, but avoids reallocation where possible:
    self.into_iter().map(f).collect()
}

If this method existed, could transmuting a collection be done soundly via vec.map(mem::transmute)?

@petertodd

Contributor

commented Sep 5, 2019

If this method existed, could transmuting a collection be done soundly via vec.map(mem::transmute)?

Actually that optimization could be implemented already without a Transmute trait:

fn map<U>(self, f: impl FnMut(T) -> U) -> Vec<U> {
    if Layout::new::<U>() == Layout::new::<T>() {
        // in-place path
       <tricky low level bits here>
    } else {
        // slow path
        self.into_iter().map(f).collect()
    }
}

Rust will definitely optimize out the path not taken. Since your proposed Vec::map() consumes self it wouldn't even be that hard to avoid leaking memory if f() panics.
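
One way the in-place path could be filled in is sketched below as a free function. This is an assumption-laden illustration, not the implementation anyone in this thread wrote: the name `map_in_place` is hypothetical, and if `f` panics the buffer and the remaining elements are leaked rather than cleaned up.

```rust
use std::alloc::Layout;
use std::mem::ManuallyDrop;
use std::ptr;

/// Hypothetical sketch: maps a `Vec<T>` to a `Vec<U>`, reusing the allocation
/// when `T` and `U` have the same size and alignment.
fn map_in_place<T, U>(v: Vec<T>, mut f: impl FnMut(T) -> U) -> Vec<U> {
    if Layout::new::<T>() != Layout::new::<U>() {
        // Slow path: layouts differ, so build a new vector.
        return v.into_iter().map(f).collect();
    }
    // Take over ownership of the buffer. If `f` panics below, nothing is
    // dropped twice; the buffer and the unprocessed elements are leaked.
    let mut v = ManuallyDrop::new(v);
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    for i in 0..len {
        // SAFETY: `i < len`, so the slot holds an initialized `T`, which is
        // moved out exactly once and replaced by a `U` of identical layout.
        unsafe {
            let slot = ptr.add(i);
            let value = ptr::read(slot);
            ptr::write(slot as *mut U, f(value));
        }
    }
    // SAFETY: the buffer came from a `Vec<T>` and now holds `len` initialized
    // `U` values; size and alignment of `T` and `U` are equal, so the
    // allocation layout for `cap` elements is unchanged.
    unsafe { Vec::from_raw_parts(ptr as *mut U, len, cap) }
}

fn main() {
    // Same layout (u32 -> i32): the allocation is reused.
    assert_eq!(map_in_place(vec![1u32, 2, 3], |x| -(x as i32)), vec![-1, -2, -3]);
    // Different layout (u8 -> u64): falls back to collecting.
    assert_eq!(map_in_place(vec![1u8, 2], |x| u64::from(x) + 1), vec![2u64, 3]);
}
```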

@kornelski

Contributor

commented Sep 5, 2019

I'm surprised that Vec doesn't have into_raw already, like Box does.

Would this be sound?

let (p, l, c) = Vec::into_raw_parts(oldvec);
let newvec = Vec::from_raw_parts(p as *mut OtherType, l, c);
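
On stable Rust, the same idea can be sketched with `ManuallyDrop` in place of `into_raw_parts`. The function `vec_cast` and the `Meters` newtype are hypothetical; the size and alignment assertions reflect the requirements raised earlier in this thread, and the validity of each element remains the caller's responsibility.

```rust
use std::mem::{align_of, size_of, ManuallyDrop};

// Hypothetical newtype used only for this example.
#[repr(transparent)]
#[derive(Debug, PartialEq)]
struct Meters(u32);

/// Hypothetical sketch of a checked Vec cast.
///
/// # Safety
/// Every valid value of `T` must also be a valid value of `U`.
unsafe fn vec_cast<T, U>(v: Vec<T>) -> Vec<U> {
    // Both checks compare compile-time constants.
    assert_eq!(size_of::<T>(), size_of::<U>());
    assert_eq!(align_of::<T>(), align_of::<U>());
    // Prevent the original vector from freeing the buffer.
    let mut v = ManuallyDrop::new(v);
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    // SAFETY: equal size and alignment keep the allocation layout unchanged;
    // element validity is guaranteed by the caller.
    unsafe { Vec::from_raw_parts(ptr as *mut U, len, cap) }
}

fn main() {
    let lengths: Vec<Meters> = unsafe { vec_cast(vec![1u32, 2, 3]) };
    assert_eq!(lengths, vec![Meters(1), Meters(2), Meters(3)]);
}
```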
@kornelski

Contributor

commented Sep 5, 2019

If it's Vec-specific, could it be extended to support change in element size by splitting or combining adjacent elements? So that i32 can be transmuted to 4 times longer vec of u8?

@sfackler

Member

commented Sep 5, 2019

If it's Vec-specific, could it be extended to support change in element size by splitting or combining adjacent elements? So that i32 can be transmuted to 4 times longer vec of u8?

No, as has already been mentioned above.

@KrishnaSannasi


commented Sep 5, 2019

Here's an implementation for the fast path of Vec::map

https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=474f3248fdd6ffde61f2846e62c6bb80

I posted it on internals earlier

@rkruppe

Member

commented Sep 5, 2019

@KrishnaSannasi

@rkruppe you misunderstand, I'm saying that you can't go from Baz<i32> to Baz<Transparent> in general.

Oh, I see. Yes, that is so resoundingly true (and I have had that discussion frequently enough) that I accidentally misread @ExpHP as a less hopeless proposal. Foo<T> and Foo<U> may differ not just because of associated types but also because of associated constants, const methods, and even without any trait bounds Foo could use parametricity-breaking tools. For example, type_name and str::len will likely be const soon, and then you can have a field of type e.g. [u8; type_name::<T>().len()]. In the future you can also do an arbitrary "type switch" with specialization.

And that's all just for actual layout, to say nothing of the aforementioned invariants which are just as much of a menace here. I stress these, @ExpHP, because they're just as much of an obstacle to safe transmutes as layout differences are. If you're fine with requiring the user to unsafely shoulder responsibility for asserting that no invariant is violated, it's not really a big step up from requiring them to also ensure layout compatibility.

So in summary, this:

Perhaps we need a new attribute—one that is stronger than #[repr(transparent)]—which provides this guarantee.

is not possible at all. Even if T and U are completely identical in every aspect that could reasonably be relevant for layout, ArbitraryStruct<_> may (without any "cooperation" from or the knowledge of T and U) do lots of things that result in different layouts for ArbitraryStruct<T> vs ArbitraryStruct<U>.


PS: @KrishnaSannasi

#[repr(transparent)]
struct Quax<T: Foo>(T::Bar, [T; 0]);

This isn't transparent (as you could easily find out on the playground) precisely because the [T; 0] may raise the alignment requirement and thus make layout of the newtype different from T::Bar.
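
The effect of a zero-length array on alignment can be checked directly. In this sketch `Plain` and `WithMarker` are hypothetical types, and `repr(C)` is used only so that the resulting layout is fully specified.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[repr(C)]
struct Plain(u8);

// Same data as `Plain`, plus a zero-length array of a more aligned type.
#[allow(dead_code)]
#[repr(C)]
struct WithMarker(u8, [u64; 0]);

fn main() {
    // The array occupies no bytes but keeps the alignment of its element type.
    assert_eq!(size_of::<[u64; 0]>(), 0);
    assert_eq!(align_of::<[u64; 0]>(), align_of::<u64>());
    // That alignment carries over to the containing struct and pads its size.
    assert_eq!(align_of::<Plain>(), 1);
    assert_eq!(align_of::<WithMarker>(), align_of::<u64>());
    assert!(size_of::<WithMarker>() > size_of::<Plain>());
}
```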

@canndrew

Contributor

commented Sep 5, 2019

Isn't the code in the RFC wrong? It should be converting [u32; 2] to [u8; 8], not [u8; 4] no?

@vorner


commented Sep 5, 2019

I agree it's bad when people try to do it and get it wrong. However, I'm a bit sceptical about putting this into std, for two reasons:

  • It seems to be a rare thing to do. And it could easily go into a stand-alone crate, so as not to pollute std. As prior art for that, safely initializing a fixed-size array is apparently a hard problem too, but array-init is a stand-alone crate.
  • To me, having to do transmutes is kind of a code/design smell or an antipattern. But placing another method on Vec into std kind of promotes/validates this antipattern on a psychological level.
@petertodd

Contributor

commented Sep 5, 2019

Here's an implementation for the fast path of Vec::map

https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=474f3248fdd6ffde61f2846e62c6bb80

I posted it on internals earlier

Ah cool! I wrote an implementation last night too.

If the mapping function is a no-op Rust seems to reliably optimize the entire map to a no-op. Though curiously I did find that transmuting a bool to anything else isn't actually a no-op, along with a whole host of other bool involving things, such as calling MaybeUninit::<bool>::assume_init(). But that seems to be an exception to the general rule, so adding map() to the Vec API, and perhaps other collections, seems potentially useful.

Edit: of course bool does this: all but the LSB are undefined, so the mask is necessary to put those bits into a defined state.

@ExpHP


commented Sep 5, 2019

For example, type_name and str::len will likely be const soon, and then you can have a field of type e.g. [u8; type_name::<T>().len()]. In the future you can also do an arbitrary "type switch" with specialization.

Yikes, those are some nasty examples. Clearly, then, to specify valid transmutes of type Type<T> -> Type<U>, there must also be further requirements on Type to ensure parametricity. (if that is even possible!)

This is starting to derail too far from the RFC, so I would like to move the discussion here:

https://internals.rust-lang.org/t/specifying-a-safe-set-of-transmutes-from-struct-t-to-struct-u/10917

@Shnatsel Shnatsel referenced this pull request Sep 6, 2019
@Shnatsel


commented Sep 6, 2019

The example in the RFC is actually unsound. 0u32 has the same size as [u8; 4] but not the same alignment. Relevant excerpt from Vec documentation:

ptr's T needs to have the same size and alignment as it was allocated with.

So unlike slices, you cannot actually transmute a Vec of a type with greater alignment to a Vec of a type with lesser alignment. This is even more complicated than we realized!
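
The mismatch can be made visible with `std::alloc::Layout`, which is what the allocator receives on deallocation. This is a sketch; `buffer_layout` is a hypothetical helper, and the alignment of `u32` is assumed to be larger than 1, as on common targets.

```rust
use std::alloc::Layout;

/// Hypothetical helper: the layout of a buffer holding `cap` elements of `T`.
fn buffer_layout<T>(cap: usize) -> Layout {
    Layout::array::<T>(cap).expect("capacity overflow")
}

fn main() {
    let as_u32 = buffer_layout::<u32>(2);
    let as_bytes = buffer_layout::<[u8; 4]>(2);
    // Both buffers are 8 bytes long...
    assert_eq!(as_u32.size(), as_bytes.size());
    // ...but the alignments differ, so the layouts are not the same.
    assert_ne!(as_u32.align(), as_bytes.align());
    assert_ne!(as_u32, as_bytes);
}
```

A buffer allocated with the first layout and freed with the second would violate the "same layout" requirement quoted from the `GlobalAlloc::dealloc` docs earlier in the thread.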

@KrishnaSannasi


commented Sep 6, 2019

For those who don't follow the internals thread I linked earlier, I updated Vec::map and added Vec::try_map. Now both don't panic on their own and will reallocate if layouts are not compatible. Vec::try_map allows you to early return by using the Try trait.

https://play.rust-lang.org/?version=nightly&mode=release&edition=2018&gist=f215b68e41601875d562d3655c5b0a4f


It looks like using Vec::map with transmute reliably optimizes down to a noop!

v.map(|x| unsafe { std::mem::transmute(x) })

turns into

playground::vec_transmute: # @playground::vec_transmute
# %bb.0:
	movq	%rdi, %rax
	movups	(%rsi), %xmm0
	movq	16(%rsi), %rcx
	movups	%xmm0, (%rdi)
	movq	%rcx, 16(%rdi)
	retq
                                        # -- End function

This happens on the playground for the many layout-compatible types that I tried. Note that this is the same code as a sound conversion of the Vec via Vec::from_raw_parts.

/// # Safety
///
/// Calling this function requires the target item type be compatible with
/// `Self::Item` (see [`mem::transmute`]).

@gnzlbg

gnzlbg Sep 6, 2019

Contributor

I think it would be better to just say something like:

Safe if and only if transmute::<T,I> is safe and align_of::<T>() == align_of::<I>().

instead of using words like "compatible" which might not have a clear meaning for everybody.

# Drawbacks
[drawbacks]: #drawbacks

Adding a new method to `std` increases code size and needs to be maintained.

@gnzlbg

gnzlbg Sep 6, 2019

Contributor

While this increases the amount of code in std::collections, these generic methods should not incur a code-size penalty if they are not used.

@Shnatsel


commented Sep 6, 2019

I've requested a Clippy lint to catch transmutes of Vec into different size and/or alignment: rust-lang/rust-clippy#4515

[unresolved-questions]: #unresolved-questions

- are there other types that would benefit from such a method? What about
`HashSet`, `HashMap` and `LinkedList`?

@gnzlbg

gnzlbg Sep 6, 2019

Contributor

What is the argument against doing this?

@KrishnaSannasi

KrishnaSannasi Sep 6, 2019

For HashMap and HashSet, you shouldn't be able to do this because it would likely break the hash and values would be in the wrong buckets. HashMap provides no api to fix this. LinkedList is more amenable to this change, as is VecDeque.

@gnzlbg

gnzlbg Sep 6, 2019

Contributor

@KrishnaSannasi

For HashMap and HashSet, you shouldn't be able to do this because it would likely break the hash and values would be in the wrong buckets.

We define transmute between two types to be semantically equivalent to a bitwise move.

While reading the RFC it was unclear to me whether the semantics of the proposed methods are a bitwise move of the collection types, or of the elements themselves.

For the types covered by the RFC, even if the semantics are to perform a bitwise move of the element types, we can optimize that and make the operations O(1) in time and space, so there isn't really a difference in practice.

Whether we can do this for HashMap and HashSet would depend on what the semantics are. For example, if the semantics are to perform a bitwise transmute of each element, then we can just implement these transmutes as a .into_iter().map(|x| transmute(x)).collect(). OTOH if these should be O(1), then we can't.
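
The element-wise reading can be sketched as follows. The helper `map_set` and the `Node` newtype are hypothetical; a safe wrapping function stands in for the per-element transmute.

```rust
use std::collections::HashSet;
use std::hash::Hash;

// Hypothetical newtype used only for this example.
#[derive(Debug, PartialEq, Eq, Hash)]
struct Node(usize);

/// Hypothetical helper: converts every element and rebuilds the set, so each
/// value is rehashed and placed in the bucket that matches its new type.
fn map_set<T, U: Eq + Hash>(set: HashSet<T>, f: impl FnMut(T) -> U) -> HashSet<U> {
    set.into_iter().map(f).collect()
}

fn main() {
    let ids: HashSet<usize> = vec![1, 2, 3].into_iter().collect();
    let nodes = map_set(ids, Node);
    assert_eq!(nodes.len(), 3);
    assert!(nodes.contains(&Node(2)));
}
```

This visits every element, so it is O(n) in time and builds a new table, in contrast to the O(1) operations proposed for `Vec`.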

@KrishnaSannasi

KrishnaSannasi Sep 6, 2019

I would expect these to be O(1). We could add other methods like HashMap::map, which would try to reuse the allocation but be semantically equivalent to into_iter().map(...).collect(), but that is a separate issue.

acceptable, to steer people away from the latter?
- this RFC does not deal with collection types in crates such as
[`SmallVec`](https://docs.rs/smallvec), though it is likely an implementation
in `std` might motivate the maintainers to include similar methods.

@gnzlbg

gnzlbg Sep 6, 2019

Contributor

FWIW I don't think this is an unresolved question. It is kind of up to those crates to keep up with the libstd collection API. It is normal that they will lag a bit behind, but for the APIs proposed here keeping up is only one PR away.

- are there other types that would benefit from such a method? What about
`HashSet`, `HashMap` and `LinkedList`?
- would it make sense to add implementations to types where `mem::transmute` is
acceptable, to steer people away from the latter?

@gnzlbg

gnzlbg Sep 6, 2019

Contributor

I'm not sure what you mean by this. Could you elaborate what you mean by "types where mem::transmute is acceptable" ?

@llogiq

llogiq Sep 7, 2019

Author Contributor

I mean types where it isn't insta-UB like Vec.

@danielhenrymantilla


commented Sep 6, 2019

@vorner using this transmute is not always a code smell, and could, on the contrary, encourage encoding more invariants at the type level.

There is a PR for having constant-time From<Vec<NonZeroU8>> for CString, which is only possible with this kind of transmutes.

Code here: https://github.com/rust-lang/rust/pull/64069/files

I encourage you all to look at the code, since it has (as I like to do) plenty of safety-related annotations.

Especially, such transmute function:

  • is obviously unsafe, but with quite explicit invariants:

    • the first [T; len] elements must be sound to transmute to [U; len]
    • [T; capacity] must have the same layout as [U; capacity]
    • both are verified for all len and capacity when, for instance, T = NonZeroU8 and U = u8

  • can be guarded by type-based runtime checks, since these checks are expected to be removed at monomorphisation (trivially true or trivially false).

    • the check should be improved to use Layout rather than equal size and equal alignment, as @KrishnaSannasi did, since it is more explicit w.r.t. dealing with an allocator.

@KrishnaSannasi


commented Sep 7, 2019

I have done it again! Now, I have added try_zip_with and zip_with. These take an additional vec and try to use the capacity of either one! I have also made the code more robust: now it will clean up its mess even if you panic in the drop.

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=8655604d47b5a128ab306b1a5c87589a

In the interest of not polluting this thread even more with these updates, I have published this as a minimal crate so that you can track its progress there and contribute if you would like to!

https://crates.io/crates/vec-utils
