RFC: truly unsized types #709

Closed
wants to merge 4 commits into from

@mzabaluev
Contributor

mzabaluev commented Jan 22, 2015

Further subdivide unsized types into dynamically-sized types, implementing
an intrinsic trait DynamicSize, and types of indeterminate size. References
for the latter kind will be thin, while allocating slots or copying values
of such types is not possible outside unsafe code.

Rendered

@mzabaluev

Contributor Author

mzabaluev commented Jan 22, 2015

@P1start

Contributor

P1start commented Jan 23, 2015

What exactly is the difference between ‘truly’ unsized types and DSTs? The only difference I can see is that DSTs use fat pointers, while ‘truly’ unsized types use thin pointers. I don’t think that’s a necessary distinction to make, because they both have the same restriction: they cannot be used without being behind a pointer. If we, for example, introduced fat pointers that have more than one word of extra data (a feature I’ve wanted a few times), they wouldn’t need extra traits for each separate size, so why should having zero bytes of extra data be any different?

@mzabaluev

Contributor Author

mzabaluev commented Jan 24, 2015

The only difference I can see is that DSTs use fat pointers, while ‘truly’ unsized types use thin pointers.

DSTs are dynamically sized types, meaning that the size of the value is known while the value exists. To reconstruct a reference to a DST from a raw pointer, one has to obtain the size. This RFC proposes to lift this restriction in cases when the size of the holistic value is not needed immediately or is unknown.
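
To make the point about reconstructing references concrete, here is a minimal sketch in today's Rust (not the 2015 compiler under discussion). The helper names `slice_from_thin` and `ref_from_thin` are made up for illustration; `std::slice::from_raw_parts` is the real standard-library call. The slice case needs the length supplied separately, while the sized case needs nothing beyond the address.

```rust
use std::slice;

/// Rebuild a `&[u8]` from a thin pointer; the length must be supplied separately.
/// Safety: `ptr` must point to `len` initialized bytes that outlive `'a`.
unsafe fn slice_from_thin<'a>(ptr: *const u8, len: usize) -> &'a [u8] {
    // SAFETY: guaranteed by the caller's contract above.
    unsafe { slice::from_raw_parts(ptr, len) }
}

/// Rebuild a `&u32` from a thin pointer; no extra metadata is needed.
/// Safety: `ptr` must point to a valid `u32` that outlives `'a`.
unsafe fn ref_from_thin<'a>(ptr: *const u32) -> &'a u32 {
    // SAFETY: guaranteed by the caller's contract above.
    unsafe { &*ptr }
}

fn main() {
    let bytes = [10u8, 20, 30, 40];
    let thin: *const u8 = bytes.as_ptr();
    // Without `bytes.len()` there is no way to form the slice reference.
    let rebuilt = unsafe { slice_from_thin(thin, bytes.len()) };
    assert_eq!(rebuilt, &bytes[..]);

    let n = 7u32;
    let r = unsafe { ref_from_thin(&n as *const u32) };
    assert_eq!(*r, 7);
}
```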

@Ericson2314

Contributor

Ericson2314 commented Jan 25, 2015

A huge area this would help is disambiguating function pointers and functions. Basically it would be cool if fn(A..) -> B were the type of functions themselves, and &'a fn(A..) -> B were the type of function pointers, 'static being the common case. While I prefer this out of elegance alone, it could also really help with safe dynamic linking, etc.

There was a thread about this, but I'm afraid I can't find it. IIRC @eddyb said it wouldn't work, exactly because of the unsized vs. dynamically sized problem.

@mzabaluev

Contributor Author

mzabaluev commented Jan 25, 2015

@Ericson2314: you might be referring to #661.

@Ericson2314

Contributor

Ericson2314 commented Jan 26, 2015

Hmm, that didn't have the conversation with eddyb that I remember, but it's definitely one example of people wanting safer dynamic linking. It might have been a conversation about JITing, where perhaps this is even more useful (I've never heard of unlinking dynamically linked libraries).

@P1start

Contributor

P1start commented Jan 26, 2015

@Ericson2314 Is this discuss post on fn lifetimes the discussion you were thinking about?

@Ericson2314

Contributor

Ericson2314 commented Jan 26, 2015

@P1start well, not exactly as I remembered, but that's probably it. Thanks!

@Diggsey

Contributor

Diggsey commented Jan 29, 2015

This is a bit of a crazy idea, but you could give DynamicSize an associated type (Ptr) which the compiler would automatically use for references to that type. This would eliminate the need for special handling of slices and trait objects within the compiler.

This type would fulfill the role of a raw pointer, *T, by implementing Deref. From it the compiler constructs an implicit reference type &T, which enforces the correct borrow rules but otherwise acts like the Ptr type. A &T then implicitly converts only to DynamicSize::Ptr, not to *T.

The Unsized bound is equivalent to DynamicSize<Ptr = *T>, i.e. a raw pointer.

@eddyb

Member

eddyb commented Jan 29, 2015

My understanding of "unsized" is that it is the result of "unsizing", which turns static type info into dynamic values. Not that I have a better name for the types that lack any size information whatsoever.

@Kimundi

Member

Kimundi commented Jan 29, 2015

I agree with @Diggsey that a generalization of DST metadata might be more worthwhile than adding special cases to the possible DST values. At least, if those special cases add additional syntax and semantics, as in this proposal.

@mzabaluev

Contributor Author

mzabaluev commented Jan 29, 2015

@Kimundi, what additional syntax are you referring to? The only visible changes this proposal adds are a new marker type and the DynamicSize trait, both pretty conventional as per the current syntax.

Semantics do change, but I think there are two different use cases currently lumped in with Sized:

  1. The Sized bound tells that the value has a statically known size, so it can be copied or moved.
  2. Lifting the Sized bound to accommodate DSTs, while keeping the assumption that the size of the value is known at runtime.

I'm not entirely sure that case 2 is a real concern, since an implementation of a generic trait parametric on an ?Sized type would need to specify the DST in contexts where the size is needed. So there may be little need to use the DynamicSize bound explicitly. However, I haven't gone over this in a formal way and I'm not familiar with the intricacies of the type system, so your help is appreciated with proving or disproving my assumption.
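
The two uses of the bound can be illustrated with a small sketch in today's Rust. The function names `describe` and `duplicate` are hypothetical. The first lifts the Sized bound and only ever touches the value through a reference; the second keeps the implicit Sized bound because it has to produce values by move.

```rust
use std::fmt::Debug;

/// Accepts sized and dynamically sized types alike, since the value is only
/// ever accessed behind a reference.
fn describe<T: ?Sized + Debug>(value: &T) -> String {
    format!("{:?}", value)
}

/// Returns copies by value, so `T` must have a statically known size
/// (the implicit `T: Sized` bound is left in place).
fn duplicate<T: Clone>(value: &T) -> (T, T) {
    (value.clone(), value.clone())
}

fn main() {
    // `T = str` and `T = [i32]` are only possible because of `?Sized`.
    assert_eq!(describe("hi"), "\"hi\"");
    assert_eq!(describe(&[1, 2][..]), "[1, 2]");
    assert_eq!(describe(&5u8), "5");
    assert_eq!(duplicate(&3i32), (3, 3));
}
```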

@Diggsey

Contributor

Diggsey commented Jan 29, 2015

@mzabaluev
My suggestion is in no way an objection to this RFC - I think this has a chance of being accepted for 1.0 which would be great, while mine obviously doesn't, and I think with this RFC in place, expanding on DynamicSize can be done in a backward compatible way.

With regard to case 2: there was a PR somewhere to allow "size_of_val" and friends to work on DSTs, so the distinction between DynamicSize and !Sized is definitely needed, although there's still a bit of a grey area between the cases of "raw pointer, but can figure out the size at runtime (not necessarily in an efficient way)" vs "raw pointer, can't figure out size/type has no concept of size".
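
For reference, a short sketch of what `size_of_val` on DSTs looks like in today's Rust, where `std::mem::size_of_val` accepts `?Sized` values. The helper `dst_sizes` is a made-up name. In each case the size comes from the metadata carried by the fat reference (a length or a vtable).

```rust
use std::fmt::Debug;
use std::mem::size_of_val;

/// Returns the runtime sizes of a slice, a string slice, and a trait object.
fn dst_sizes() -> (usize, usize, usize) {
    let slice: &[u16] = &[1, 2, 3];
    let text: &str = "hello";
    let object: &dyn Debug = &42u64;
    (size_of_val(slice), size_of_val(text), size_of_val(object))
}

fn main() {
    // 3 elements * 2 bytes, 5 UTF-8 bytes, and the 8 bytes of the erased u64.
    assert_eq!(dst_sizes(), (6, 5, 8));
}
```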

@Kimundi

Member

Kimundi commented Jan 29, 2015

@mzabaluev: I meant the addition of the additional marker types and traits as additional "syntax".

In my opinion, there is no difference between unsized and dynamically sized - in both cases being sized refers to knowing the size at compile time, and in both cases there is some runtime mechanism for finding it out.

Whether that runtime mechanism involves storing the size directly (slices), or in the form of a vtable (trait objects), or as part of the data structure pointed at (CStr) seems unrelated to that core distinction to me.

I'm not saying that a DST value with a thin pointer representation is not useful, I just don't think it needs to be its own "thing".

@Diggsey

Contributor

Diggsey commented Jan 29, 2015

@Kimundi I think there is a useful difference between thin/fat pointer DSTs, which is that thin pointers can be transmuted between each other, and passed to C/C++ code as a raw pointer. It's quite easy to come up with examples where the generic constraints should be "is a thin pointer" rather than "is not a DST" (for example, compatibility with void*).
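
As a rough illustration of the void* compatibility point, here is the opaque-handle pattern commonly used in today's Rust FFI code. It is a workaround rather than the thin unsized types this RFC proposes: `Opaque` is a hypothetical name, and the compiler still treats it as a zero-sized Sized type. The helpers `to_void` and `from_void` are likewise made up.

```rust
use std::mem::size_of;
use std::os::raw::c_void;

/// Hypothetical opaque handle; its real layout is known only to foreign code.
#[repr(C)]
pub struct Opaque {
    _private: [u8; 0],
}

/// A thin pointer converts to `void*` without losing any information.
fn to_void(p: *mut Opaque) -> *mut c_void {
    p.cast()
}

/// And back again, since there is no metadata to restore.
fn from_void(p: *mut c_void) -> *mut Opaque {
    p.cast()
}

fn main() {
    // Pointers and references to the opaque type are one word wide.
    assert_eq!(size_of::<*mut Opaque>(), size_of::<*mut c_void>());
    assert_eq!(size_of::<&Opaque>(), size_of::<usize>());

    // Stand-in for a handle that foreign code would normally hand out.
    let mut backing = 0u64;
    let handle = &mut backing as *mut u64 as *mut Opaque;
    assert_eq!(from_void(to_void(handle)), handle);
}
```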

@Ericson2314

Contributor

Ericson2314 commented Jan 29, 2015

In my opinion, there is no difference between unsized and dynamically sized - in both cases being sized refers to knowing the size at compile time, and in both cases there is some runtime mechanism for finding it out.

It was my understanding that this would support cases where you couldn't find out. This is needed for functions.

@mzabaluev

Contributor Author

mzabaluev commented Jan 29, 2015

@Kimundi even the types with a size that can be calculated from content may have sufficiently different performance characteristics for this operation (e.g. O(N) for C strings vs O(1) for DSTs) so genericity may not be desirable. Anyway, there doesn't seem to be a generic way to calculate the size of a DST value, so the only difference is whether a fat pointer is required to represent a reference.
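
The cost difference can be sketched as follows in today's Rust. The function `scan_len` is a hypothetical stand-in for C's strlen: with only a thin pointer to a NUL-terminated string, the length has to be found by walking the bytes, whereas a slice reference carries its length and answers in constant time.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Count the bytes before the terminating NUL by scanning: O(N) in the length.
/// Safety: `ptr` must point to a valid NUL-terminated string.
unsafe fn scan_len(ptr: *const c_char) -> usize {
    let mut len = 0;
    // SAFETY: the caller guarantees a NUL terminator is reachable.
    unsafe {
        while *ptr.add(len) != 0 {
            len += 1;
        }
    }
    len
}

fn main() {
    let c: &CStr = CStr::from_bytes_with_nul(b"hello\0").unwrap();
    let thin: *const c_char = c.as_ptr();
    // Thin pointer: the size is recovered from the content.
    assert_eq!(unsafe { scan_len(thin) }, 5);

    // Fat pointer: the size is read directly from the reference.
    let s: &[u8] = b"hello";
    assert_eq!(s.len(), 5);
}
```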

@mzabaluev

Contributor Author

mzabaluev commented Jan 30, 2015

@Diggsey If I understood your proposal about DynamicSize::Ptr correctly, the DynamicSize trait is meant for the DSTs in their current form only, and a thin-pointer requirement would still need a negative bound on that (or a positive bound on its complement provided by the compiler), right?

@Diggsey

Contributor

Diggsey commented Jan 30, 2015

@mzabaluev Originally, I was thinking of something like this:

  • All types become either Sized or DynamicSize, depending solely on whether their size is known at compile-time.
  • DynamicSize has an associated type, Ptr
  • What used to be "Unsized" is now just "DynamicSize<Ptr = *T>", i.e. uses a thin pointer. (edit: it's been pointed out that even *T is not always a thin pointer, see below for alternative)

However, it might make more sense like this:

  • The Unsized trait has an associated type, Ptr
  • All types whose size is known at compile-time, implement Sized, unless they opt-out
  • Unsized is implemented automatically for all T: Sized, with Ptr = *T
  • All other types must implement Unsized directly
  • Some !Sized types may implement RuntimeSized, which has a method to calculate the size of a value at runtime. This would include all current DSTs.
  • All Sized types implement RuntimeSized automatically.

So the useful bounds become:

  • Has thin pointer => size_of(::Ptr) == size_of(isize)
  • Has compile-time size => T: Sized
  • Has runtime size => T: RuntimeSized
  • And the complements of the above, using !

And for any type, its pointer type can be obtained via:

  • ::Ptr
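
One way to picture the "has runtime size" bound is as an ordinary user-level trait in today's Rust. This is only a sketch under assumptions: `RuntimeSized` here is a hand-written trait named after the idea above, not a compiler-provided one, the method name `runtime_size` and the function `report` are invented, and the impls are written out per type rather than derived automatically for all Sized types.

```rust
use std::mem;

/// Hypothetical trait modelled on the `RuntimeSized` idea; not part of std.
trait RuntimeSized {
    /// Size of this particular value in bytes, computed at runtime.
    fn runtime_size(&self) -> usize;
}

// A current DST: the size follows from the length stored in the fat pointer.
impl<T> RuntimeSized for [T] {
    fn runtime_size(&self) -> usize {
        self.len() * mem::size_of::<T>()
    }
}

// Another current DST: the length is the number of UTF-8 bytes.
impl RuntimeSized for str {
    fn runtime_size(&self) -> usize {
        self.len()
    }
}

// A sized type, implemented by hand here for illustration.
impl RuntimeSized for u32 {
    fn runtime_size(&self) -> usize {
        mem::size_of::<u32>()
    }
}

/// Generic code can ask for "has runtime size" without requiring `Sized`.
fn report<T: ?Sized + RuntimeSized>(value: &T) -> usize {
    value.runtime_size()
}

fn main() {
    let v = [1u16, 2, 3];
    assert_eq!(report(&v[..]), 6);
    assert_eq!(report("abc"), 3);
    assert_eq!(report(&7u32), 4);
}
```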

@SSheldon


SSheldon commented Jan 30, 2015

"DynamicSize<Ptr = *T>", ie. uses a thin pointer

@Diggsey, fyi *T is not guaranteed to be a thin pointer; for types with fat references, *T is also fat (like *str, *Trait, *[T])
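
This is easy to check in today's Rust, where raw pointers are spelled `*const T` / `*mut T` and trait objects `dyn Trait`. The helper `pointer_words` is a made-up name, and the word counts reflect how current compilers lay out these pointers.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Width of a pointer type `P`, measured in machine words.
fn pointer_words<P>() -> usize {
    size_of::<P>() / size_of::<usize>()
}

fn main() {
    // Raw pointer to a sized type: thin.
    assert_eq!(pointer_words::<*const u8>(), 1);
    // Raw pointers to DSTs carry the same metadata as references: fat.
    assert_eq!(pointer_words::<*const str>(), 2);
    assert_eq!(pointer_words::<*const [u8]>(), 2);
    assert_eq!(pointer_words::<*const dyn Debug>(), 2);
}
```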

@Diggsey

Contributor

Diggsey commented Jan 30, 2015

@SSheldon Ah, I didn't realise that - I've updated my previous post to reflect that.

@mzabaluev

Contributor Author

mzabaluev commented Jan 30, 2015

@Diggsey:

Has thin pointer => size_of(::Ptr) == size_of(isize)

That's quite a mouthful, and I don't think current Rust allows expressions as bounds. But if bounds like that could actually be used, a convenience trait could be provided to assert it.

@alexcrichton

Member

alexcrichton commented Feb 3, 2015

Given #738 it looks like we may end up removing all of the various marker structs, in which case adding a new NotSized may stick out a bit.

Perhaps we could beef up the compiler to consider types unsized such as:

struct CStr {
    data: c_char,
    marker: Phantom<[c_char]>,
}

In this case we'd basically be saying that, for type analysis, the compiler should consider CStr as containing [c_char], but for representation purposes it only has one c_char field.

(just a thought)

@pnkfelix

Member

pnkfelix commented Feb 5, 2015

postponing for post 1.0; cannot spend time thinking about this.

(Also, RFC is thin on details, at least for a change of this size.)
