Custom Dynamically Sized Types for Rust #1524
@dgrunwald Ah, well, this RFC does support that.
    struct CStr([libc::c_char]);
    impl Dst for CStr {
        type Meta = ();
    }
    impl Sizeable for CStr {
        fn size_of_val(&self) -> usize {
            libc::strlen(self.as_ptr())
        }
    }
    impl CStr {
        pub unsafe fn from_ptr(p: *const libc::c_char) -> &CStr {
            // either
            std::mem::transmute::<*const libc::c_char, &CStr>(p)
            // or
            std::ptr::make_fat_ptr(p as *const (), ())
        }
    }
I'm not sure about this. My main concern is around how actually useful this is compared to the complexity. There is a lot of code that assumes fat pointers are two words and two words only. This would completely change that. It's a very large increase in complexity. Another issue is how this interacts with generics and unsizing. Currently Right now, my thought is that it's a nice feature (like most that get proposed), but not worth the increased complexity.
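For readers following the "two words" point, here is a minimal sketch of how reference widths can be inspected in Rust today. The helper name `pointer_widths` is made up for illustration, and the widths of slice and trait-object references reflect the current implementation rather than anything this RFC specifies.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Returns the width, in machine words, of a thin reference, a slice
/// reference, and a trait-object reference, in that order.
/// (`pointer_widths` is a hypothetical helper, not part of any library.)
fn pointer_widths() -> (usize, usize, usize) {
    let word = size_of::<usize>();
    (
        size_of::<&u8>() / word,        // thin: address only
        size_of::<&[u8]>() / word,      // fat: address + element count
        size_of::<&dyn Debug>() / word, // fat: address + vtable pointer
    )
}
```

A custom DST whose metadata is `()` or `(usize, usize)` would make other widths possible, which is the assumption in existing code that the comment above is worried about.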
@Aatch We need something like this. Look at the documentation for
This is just in the standard library. Out there in real code, we have to deal with C interop, where stuff like
    typedef struct _TOKEN_GROUPS {
        DWORD GroupCount;
        SID_AND_ATTRIBUTES Groups[ANYSIZE_ARRAY];
    } TOKEN_GROUPS, *PTOKEN_GROUPS;
has to be translated into
    struct TokenGroups {
        group_count: u32,
        groups: [SID_AND_ATTRIBUTES; 0],
    }
And traits like
    struct 2DSlice<'a, T> {
        width: usize,
        height: usize,
        ptr: *const T,
        lifetime: PhantomData<&'a [T]>,
    }
    struct 2DSliceMut<'a, T> {
        width: usize,
        height: usize,
        ptr: *mut T,
        lifetime: PhantomData<&'a mut [T]>,
    }
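The `[T; 0]` translation above can be sketched in today's Rust roughly as follows. This is an illustration under stated assumptions, not the Windows bindings: `Groups`, `Backing`, `trailing_groups`, and `demo_groups` are hypothetical names, and `u32` stands in for `SID_AND_ATTRIBUTES`.

```rust
use std::ptr::addr_of;
use std::slice;

/// Stand-in for a C struct that ends in a flexible array member.
#[repr(C)]
struct Groups {
    group_count: u32,
    groups: [u32; 0],
}

/// A concrete allocation holding the header plus three trailing elements.
#[allow(dead_code)]
#[repr(C)]
struct Backing {
    header: Groups,
    tail: [u32; 3],
}

/// Views the elements stored directly after the header as a slice.
///
/// # Safety
/// `this` must point into an allocation that holds a valid `Groups` header
/// followed by at least `group_count` initialized `u32` values, and the
/// returned slice must not outlive that allocation.
unsafe fn trailing_groups<'a>(this: *const Groups) -> &'a [u32] {
    unsafe {
        let count = (*this).group_count as usize;
        // Take the address of the zero-length field without creating a reference.
        let first = addr_of!((*this).groups) as *const u32;
        slice::from_raw_parts(first, count)
    }
}

/// Builds a `Backing` value and reads its tail back through the header pointer.
fn demo_groups() -> Vec<u32> {
    let backing = Backing {
        header: Groups { group_count: 3, groups: [] },
        tail: [10, 20, 30],
    };
    // The pointer is derived from the whole `Backing`, so it may reach the tail.
    let header = &backing as *const Backing as *const Groups;
    unsafe { trailing_groups(header) }.to_vec()
}
```

The length lives inside the pointee and every access goes through `unsafe`, which is the bookkeeping a custom DST with `Meta = ()` is meant to take over.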
@Aatch Unsizing isn't touched in this RFC. We really need integer generics to start doing stuff with unsizing. However, once we do get integer generics, it'll be quite easy to add that to this and we can have unsizing for everybody :)
@Aatch A generic
reem commented Mar 3, 2016
@ubsan
@reem 1) That's not an unsized type, 2) that wouldn't actually be a problem because you know what Meta is (as long as
bluss commented Mar 3, 2016
@japaric Didn't you work on user-defined DSTs before? Do you have a link to your old effort?
solson commented on ca35a3e Mar 3, 2016
Bikeshed:
How would indexing the inner Idea: Add
@thepowersgang It's an unsafe operation. The inner
I originally proposed using
@Aatch Fat pointers being hardcoded to one pointer-sized piece of metadata is a mistake. There are important extensions to slice and traits we need to support and that hardcoding has to go (which is easier with the MIR).
I liked
Yes, I was working on it last year. I don't have time right now to analyze/comment on the design proposed in this RFC. So I'm just going to drop a few links here:
@thepowersgang actually brings up a good point, if you can get at the Either use a
Hmm, yeah, I do think [T; 0] is better now that it's been discussed.
Although, I'm thinking about how it would work with integer generics and Unsizing coercions... I think that
    struct PascalStr {
        len: usize,
        buf: [u8],
    }
    // &PascalStr == ptr -> { len, [buf; len] }
    // &PascalStr<N> == ptr -> { N, [buf; N] }
    // PascalStr<N> == { N, [buf; N] }
@ubsan eh, I'm not sure that's really worth it here. My main concern is that the compiler now has to be aware of a lot more context when handling field access. The main advantage of Unsizing is probably better handed via separate types. This is already the case,
@Aatch but they're not completely different types. You need them to be exactly the same type, in fact, behind the pointer, for unsizing coercions to work.
@ubsan no, they're the same representation, not the same type.
    struct PascalStrBuffer<B: ?Sized> {
        len: usize,
        buf: B
    }
    type PascalStrArray<const N: usize> = PascalStrBuffer<[u8; N]>;
    type PascalStr = PascalStrBuffer<[u8]>;
This might be an okay compromise of sorts. I've also considered having a single optional integer parameter, which denotes the minimal contained size and defaults to
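As a point of comparison, the generic-buffer shape above can be written in Rust as it exists now, where the built-in struct unsizing coercion turns the array-backed form into the slice-backed one. This is a sketch, not the RFC's mechanism: the `?Sized` bound, the `as_bytes` method, and the `hello_len_and_bytes` helper are additions for illustration.

```rust
/// Length-prefixed string whose buffer may be an array or an unsized slice.
struct PascalStrBuffer<B: ?Sized> {
    len: usize,
    buf: B,
}

type PascalStrArray<const N: usize> = PascalStrBuffer<[u8; N]>;
type PascalStr = PascalStrBuffer<[u8]>;

impl PascalStr {
    /// Returns only the bytes counted by the inline `len` field.
    fn as_bytes(&self) -> &[u8] {
        &self.buf[..self.len]
    }
}

/// Builds an array-backed value, coerces a reference to the unsized form,
/// and reports the buffer capacity alongside the used bytes.
fn hello_len_and_bytes() -> (usize, Vec<u8>) {
    let sized: PascalStrArray<8> = PascalStrBuffer { len: 5, buf: *b"hello\0\0\0" };
    // Unsizing coercion: &PascalStrBuffer<[u8; 8]> to &PascalStrBuffer<[u8]>.
    let unsized_ref: &PascalStr = &sized;
    (unsized_ref.buf.len(), unsized_ref.as_bytes().to_vec())
}
```

The coerced reference is still a two-word fat pointer that carries the buffer capacity, so the inline `len` is stored in addition to the pointer metadata rather than replacing it.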
@Aatch fine; you can argue whether they're the same type or not. They are the same representation, definitely, and if you had to write separate Sized versions of each type, imagine the really big types:
    #[repr(C)]
    struct CLAIM_SECURITY_ATTRIBUTE_RELATIVE_V1 {
        Name: DWORD,
        ValueType: WORD,
        Reserved: WORD,
        Flags: DWORD,
        ValueCount: DWORD,
        union Values {
            pInt64: [DWORD],
            pUint64: [DWORD],
            ppString: [DWORD],
            pFqbn: [DWORD],
            pOctetString: [DWORD],
        }
    }
Imagine writing two of these types, and now for each of the Windows flexible array structs. It gets ridiculous.
It would be almost impossible to ensure the same layout for two different definitions though.
In that case how about we restrict this to slice-like DSTs and use @thepowersgang's suggestion of having an
@Aatch that seems overly restrictive. For example, if you have a
    struct 2dSlice<T>([T]);
    impl<T> Dst for 2dSlice<T> {
        type Meta = (usize, usize);
        fn element_count(&self) -> usize {
            // (meta.0 * meta.1)
        }
    }
    impl<T> Sizeable for 2dSlice<T> {...}
    fn main() {
        let x = 2dSlice::<(), (std::usize::MAX, std::usize::MAX)>::new(); // not possible with `element_count`
    }
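The overflow at issue in the example above can be sketched with checked arithmetic. The functions below are hypothetical stand-ins that take the `(usize, usize)` metadata directly, since the `Dst` trait from this RFC does not exist in Rust.

```rust
use std::mem::size_of;

/// Number of elements described by `(width, height)` metadata,
/// or `None` when the product does not fit in a `usize`.
fn element_count(meta: (usize, usize)) -> Option<usize> {
    meta.0.checked_mul(meta.1)
}

/// Size in bytes of a 2-D view over `T` with the given metadata.
/// Zero-sized element types occupy no memory, so any dimensions are fine.
fn byte_size<T>(meta: (usize, usize)) -> Option<usize> {
    if size_of::<T>() == 0 {
        return Some(0);
    }
    element_count(meta)?.checked_mul(size_of::<T>())
}
```

With `T = ()` and both dimensions set to `usize::MAX`, `byte_size` returns `Some(0)` while `element_count` returns `None`, which is the case the `main` function above is pointing at.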
@ubsan that's not possible anyway! You can't have something that large. And it's not like the same reasoning doesn't apply to the
Oh, and something I just thought of: destructors/dropping. It's all well and good having something that only contains Copy types, but what if you have something non-Copy in your custom DST?
@Aatch yes you can :) look at the type inside! You only have a pointer to the values, and aren't the owner. You could
I can no longer keep this open in good faith. Someone else can take it from me, but ... personal issues.
ubsan closed this Feb 6, 2017
So, for some time now I've been wanting to write a followup to my previous comment sharing my thoughts about this work. Even though @ubsan has closed the RFC, I figured I should leave it anyway.
Hopefully it is clear from my previous comments that I think the ideas in here are good in a technical sense, and I'd like to do some experimentation with them. However, it's probably also clear that I feel somewhat hesitant about this. My concerns here are not technical, but rather all about prioritization and motivation.
Rust is an ambitious language. Making it successful is a big job and there is a lot of work ahead for us. When I ask myself "how would Custom DST help Rust grow over the next year", I don't feel like I hear a convincing answer. The main use case that I see is helping make matrix libraries more ergonomic. Now, admittedly, this is somewhat compelling, but otoh lack of custom dst isn't, I don't think, blocking work on Matrix-like things, it's just a way to make them nicer to use. (I'd be interested to hear otherwise; are there any applications where the presence of custom DST would dramatically change the design in some way that it's not even worth doing it otherwise?)
Earlier I had thoughts, which I mentioned briefly in my comment, about the idea of trying to land this work in a very provisional way, to enable hacking. I am still interested in establishing a process for this. I think there are a number of things that might benefit from having a more "experimentation friendly" process (e.g., naked fns, inline asm, and interrupt ABIs come to mind). But I haven't gone and publicly worked on defining such a process because I also have concerns: the fact is that any time there is significant churn in the codebase, it has the potential to take a lot of time for core developers who are trying to focus on other things. I think the work around
(On the other hand, there are some areas where we are adopting a looser process. We did land naked fns in a "provisional state", for example, and that has largely been a non-event. There is some experimentation around the embedded area, particularly with ABIs. But all of these are quite narrow and targeted compared to custom dst, which affects the code that handles every single reference.)
Before @ubsan closed this RFC, I was working up my courage to move to postpone (close) -- but I kept hesitating, because I think it's a good idea and I like it. Yet still I have the feeling that on balance it just isn't a good investment of resources. I regret that hesitating perhaps sends a more frustrating message than actually moving to close. So @ubsan, I'm sorry about that.
AtheMathmo commented Feb 17, 2017
@nikomatsakis Thank you for your comment. I agree with what you're saying - there are more pressing things to work on right now. I also wanted to comment on the following:
I think this is an accurate statement. There are a few things that I cannot do without custom DSTs, some of which don't even have ugly workarounds, but these are only mechanisms to tidy up the code and make things easier for users. Here are some example issues that we had to put off: AtheMathmo/rulinalg#149 (I think custom DSTs help this) It would be really nice to parallel the
@AtheMathmo interestingly, I was planning to come and post a comment here anyhow just to mention I myself ran into a use-case for this just yesterday when hacking on my NLL prototype, specifically the bitset code. Really it's just the "2-D matrix use case" in disguise, but in trying to play with it, I did appreciate that there are things that are hard to do "just with methods". Though maybe I didn't find the best answer. In particular, I have a In any case, I definitely think that this conversation -- to what extent is this "just sugar" vs unlocking some fundamental capability -- is the critical one to deciding how to prioritize this change.
Another thing that I was thinking about was this question: is this the sort of capability that, when added, would mean that the interface for existing libraries would want to be completely reworked to take advantage of it?
AtheMathmo commented Feb 21, 2017
I cannot speak for the others who are looking to make use of this feature, but for me I would be making significant changes. Right now we have the following types in rulinalg: Another particularly difficult area is with operator overloading. Right now we have an explosion of overloading implementations for all combinations of matrix types. I think that custom DSTs could help here too but I haven't explored the idea too much. (Hoping I'll try to spend some time thinking about whether there is anything else that I can add to this discussion. I've got quite used to working around the lack of custom DSTs and I don't think there are many features missing for the user. There are however quite a few ways we could make things nicer - not needing to import traits everywhere, being able to have
@AtheMathmo it seems like it would be a fruitful exercise to try and use e.g. the design from my comment and sketch out how the types in your library would work (in contrast to now) and throw it in a gist. (I'd like to see how you envision the operator overloading working, too.) Apologies if you've already done it and I missed it. (The design from the RFC seems mostly fine too, I don't recall there being any major differences.)
AtheMathmo commented Feb 22, 2017
@nikomatsakis that sounds like a good idea! I'm a little busy for the rest of this week but hopefully will be able to sketch something out early next week.
@AtheMathmo great, I'd love to see it. Since this RFC is closed, and it feels like we're still a bit in "design discussion" here, I'm going to open a thread on internals to carry on the conversation: https://internals.rust-lang.org/t/custom-dst-discussion/4842
mikeyhew referenced this pull request Jun 2, 2017: Make mem::size_of_val and mem::align_of_val take raw pointers instead of reference #2017 (Open)
For some more motivation, here's what I need in Crossbeam:
Pointers to such objects must be thin because atomically manipulating multiple words is a pain to do portably and performantly. Some support from the language for such use of DSTs would be great to have.
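To illustrate the thin-pointer requirement, here is a small sketch using `std::sync::atomic::AtomicPtr`, which only accepts sized pointees. It is not Crossbeam code: `Header` and `thin_atomic_demo` are hypothetical, and the trailing elements that a real node would store after the header are left out.

```rust
use std::mem::size_of;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Sized header that records the length inline. In a real design the
/// elements would follow it in the same allocation.
struct Header {
    len: usize,
}

/// Swaps one header for another through a single-word atomic and returns
/// (is the atomic one word wide, old length, current length).
fn thin_atomic_demo() -> (bool, usize, usize) {
    let slot = AtomicPtr::new(Box::into_raw(Box::new(Header { len: 3 })));
    let replacement = Box::into_raw(Box::new(Header { len: 7 }));
    let old = slot.swap(replacement, Ordering::AcqRel);
    // SAFETY: both pointers came from `Box::into_raw` and each is
    // reclaimed exactly once here.
    let (old, current) = unsafe {
        (Box::from_raw(old), Box::from_raw(slot.load(Ordering::Acquire)))
    };
    let thin = size_of::<AtomicPtr<Header>>() == size_of::<usize>();
    (thin, old.len, current.len)
}
```

Because the length sits behind the pointer rather than next to it, the whole reference can be exchanged with one word-sized atomic operation.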
Heap-allocating
plietar referenced this pull request Aug 10, 2017: Tracking issue for RFC 1861: Extern types #43467 (Open)
dtolnay referenced this pull request Nov 1, 2017: Remove `T: Sized` on pointer `as_ref()` and `as_mut()` #44932 (Merged)
rkruppe referenced this pull request Nov 30, 2017: Consider adding Box::uninitialized function #46406 (Open)
mikeyhew referenced this pull request Dec 20, 2017: Add DynSized trait (rebase of #44469) #46108 (Closed)
bergus commented Jun 14, 2018
@SimonSapin Looking for exactly that as well. Have you found a suitable workaround yet?
@bergus Today, the best you can do is manual memory layout computation, raw allocation, and https://github.com/rust-lang/rust/blob/1.26.2/src/liballoc/rc.rs#L714-L727
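The layout-computation step of that workaround might look like the sketch below, which uses the standard `std::alloc::Layout` API. The function name `header_and_slice_layout` is hypothetical, and the allocation and pointer construction that would follow are not shown.

```rust
use std::alloc::{Layout, LayoutError};

/// Computes the layout of a header `H` followed by `len` elements of `T`,
/// returning the combined layout and the byte offset of the first element.
/// Fails if the total size overflows.
fn header_and_slice_layout<H, T>(len: usize) -> Result<(Layout, usize), LayoutError> {
    let elements = Layout::array::<T>(len)?;
    // `extend` inserts any padding needed before the elements.
    let (layout, offset) = Layout::new::<H>().extend(elements)?;
    // Round the size up to the alignment, as a `#[repr(C)]` struct would.
    Ok((layout.pad_to_align(), offset))
}
```

For a `u8` header and three `u32` elements this yields a 16-byte layout with 4-byte alignment and the elements starting at offset 4. The result would then be passed to `std::alloc::alloc`, with the caller responsible for initializing the memory and deallocating with the same layout.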
There's also this pattern for a slice with a header, and
bergus commented Jun 15, 2018
@SimonSapin Thanks
Yes, and if you need mutable references, it prevents someone from (mis-)using
gnzlbg referenced this pull request Oct 11, 2018: Representation of Rust references (`&T`, `&mut T`) and raw pointers (`*const T`, `*mut T`) #16 (Closed)
velvia commented Jan 6, 2019
Hi folks, I'm new to the Rust community, but was really hoping to implement a cross-platform library for high performance compressed vectors using this DST feature. These vectors would have, for compatibility reasons, a u32 size header and other header fields followed by something like
ubsan commented Mar 2, 2016
I believe this fixes #813, and is a nicer, and far more powerful, solution than #709.