RFC: Raw Reform #365
Gankro force-pushed the Gankro:raw-reform branch from 8ec55dc to 0b42887 on Oct 7, 2014
aturon self-assigned this on Oct 7, 2014
glaebhoerl commented on active/0000-raw-reform.md in 0b42887 on Oct 7, 2014
I don't believe this works as written. The lifetime of the resulting slice has to come from somewhere. Two options I see: use (Same thing in several other places.) |
Given |
Yes, though at least for the first one, |
Do we have this for normal raw pointers? (i.e. |
arthurprs commented on Oct 7, 2014
Doesn't this require bounds checking, effectively defeating the advantage of raw access?
@glaebhoerl Good catch! I think I caught them all. |
@arthurprs Perhaps I miscommunicated the intended behaviour, why do you expect bounds checking is required? Perhaps I didn't sufficiently clarify that raw slices are the |
Gankro added some commits on Oct 8, 2014
arthurprs commented on Oct 8, 2014
@Gankro never mind, when I read the uint -> int change I had Python indexes in mind (so -1 is the last element, -2 the second-to-last, ...)
I've started implementing these changes in a branch as a sanity check, and to demonstrate the simplicity. One thing I've run into: Where previously you could call
Now you would need to do:
Hmm... perhaps some of these would be better off as free fns in |
Ack, and |
reem commented on Oct 8, 2014
Generally I think it's a bad idea for pointers to have inherent methods if they can avoid it, and that those should usually be free functions in |
More concise unsafe code means fewer bugs. |
insertion_sort could actually be done more cleanly as:

    /// Rotates a slice one element to the right,
    /// moving the last element to the first one and all other elements one place
    /// forward. (i.e., [0,1,2,3] -> [3,0,1,2])
    fn rotate_right<T>(s: &mut [T]) {
        let len = s.len();
        let s = s.as_raw_mut();
        if len == 0 { return; }
        unsafe {
            let first = s.read(len-1);
            s[1..].copy(s[..len-1]);
            s.write(0, first);
        }
    }

    fn insertion_sort<T>(v: &mut [T], compare: |&T, &T| -> Ordering) {
        let len = v.len();
        // 1 <= i < len;
        for i in range(1, len) {
            // j satisfies: 0 <= j <= i;
            let mut j = i;
            // `i` is in bounds.
            let read_ptr = unsafe { v.unsafe_get(i) };
            // find where to insert, we need to do strict <,
            // rather than <=, to maintain stability.
            // 0 <= j - 1 < len, so j - 1 is in bounds.
            while j > 0 && compare(read_ptr,
                                   unsafe { v.unsafe_get(j - 1) }) == Less {
                j -= 1;
            }
            // `i` and `j` are in bounds, so [j, i+1) = [j, i] is valid.
            rotate_right(unsafe { v.unsafe_slice_mut(j, i+1) });
        }
    }

we should support this style (although |
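For comparison, here is a minimal sketch of the same rotation in present-day Rust. It assumes the stable `std::ptr` free functions (`ptr::read`, `ptr::copy`, `ptr::write`) instead of the raw-slice methods proposed in this RFC, which never became part of the standard library in this form. The function body is an illustration, not code from the thread.

```rust
use std::ptr;

/// Rotates a slice one element to the right: [0, 1, 2, 3] -> [3, 0, 1, 2].
fn rotate_right<T>(s: &mut [T]) {
    let len = s.len();
    if len == 0 {
        return;
    }
    let p = s.as_mut_ptr();
    unsafe {
        // Move the last element out bitwise; its slot is now logically empty.
        let last = ptr::read(p.add(len - 1));
        // Shift elements 0..len-1 up by one place. The source and destination
        // ranges overlap, so this needs `ptr::copy` (memmove semantics).
        ptr::copy(p, p.add(1), len - 1);
        // Put the saved element into the vacated first slot.
        ptr::write(p, last);
    }
}
```

Nothing between the `read` and the `write` can panic, so no element is dropped twice. In practice the standard library's `<[T]>::rotate_right` covers this case without any `unsafe`.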
@reem When I drafted this up I was trying to apply the lessons I'd learned working on collections stuff. Namely, that it's generally easier and safer to work and reason with objects in "modes". With collections we have the "iterator" mode, and with maps we now have "entry" mode. Unfortunately this translates poorly to references and rawptrs because we don't (and probably shouldn't ever) have I still think it would be nicer if when you have a ptr I'm leaning towards having the methods and the free fns, so that people can just do whatever is most natural. Possibly moving most if not all of the free functions to |
@arielb1 I've definitely mulled over the possibility of offering safe Otherwise, I'm not clear what you're suggesting. Every single one of your slicing operations is unchecked, and consequently riddled with unsafe's. It seems much cleaner at that point to simply say "okay, this is all unsafe" and just use raw slices. It doesn't seem very helpful to say exactly this subexpression is unsafe. But maybe that's just me? |
I think the main reason not to have Generally, we freely coerce from |
@Gankro Ah, I think I understand now: you're saying that we don't (and shouldn't) automatically make the various methods of |
@arielb1 Also, you can just do this, if you really want:
Which is basically the same code-wise. Just more sigily. |
Given the above discussion, providing both free functions and methods probably makes sense (and it explains why these weren't just methods in the first place). I believe the motivation for having the |
I was discussing this with @kballard yesterday on IRC. They suggested that I'm not totally convinced by this argument, but I don't have particularly strong feelings. Removing it would reduce the API surface area with minimal ergonomic loss in my mind. It might also prevent novices from thinking Any thoughts? |
I think I agree with @kballard on this: I think the same is true for |
Free fns are mostly all back in the latest draft. Fleshed out some other stuff too. |
Wrapping the operations in individual unsafe-blocks is just a style thing – I put unsafe blocks around all places where safety invariants are temporarily being violated, so each occurrence of unchecked indexing (which does not violate any invariant) gets placed in its own unsafe block. The primary point of my example is that it mostly uses unchecked indexing, rather than playing with raw pointers (other than within rotate). Raw pointers lose the aliasing guarantees, which just adds unneeded unsafety. I just noticed your proposal didn't talk about unchecked indexing, which is what we want in this case. Having official rotations (and "rotate-through-carry"-es) would be quite nice (should I post an RFC?).
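To make the distinction concrete, here is a small sketch of unchecked indexing on an ordinary slice in present-day Rust. It assumes the stable `get_unchecked` method, which plays the role of the `unsafe_get` used in the example earlier in the thread. The function `max_index` is a hypothetical illustration. The slice stays a normal `&[i32]`, so the borrow checker's aliasing guarantees remain in force, and only the index expression itself sits in an `unsafe` block.

```rust
/// Returns the index of the first largest element, or `None` for an empty slice.
fn max_index(v: &[i32]) -> Option<usize> {
    if v.is_empty() {
        return None;
    }
    let mut best = 0;
    for i in 1..v.len() {
        // SAFETY: `i` is in 1..v.len() and `best` is always a previously
        // visited index, so both are in bounds.
        let (cur, top) = unsafe { (v.get_unchecked(i), v.get_unchecked(best)) };
        if cur > top {
            best = i;
        }
    }
    Some(best)
}
```

Each unchecked access carries its own justification, while the rest of the function is checked as usual.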
Gankro changed the title from "Raw Reform" to "RFC: Raw Reform" on Oct 9, 2014
@arielb1: You're totally right. I strongly alluded to the fact that these operations would be unchecked in the earlier sections, but completely forgot to state this in the detailed design. I've now fixed this. I've also added your lifetime concern as a drawback to the proposal, as it is a very legitimate one! You can post an RFC, or prototype it out on discuss if you aren't totally comfortable with your current design. |
@Gankro This is one reason that I think leaving |
lilyball reviewed on Oct 9, 2014
    trait RawSlice<T> {
        /// Gets the length of the rawslice
        fn len(self) -> uint;
lilyball
Oct 9, 2014
Contributor
Instead of putting len() in RawSlice<T>, you should just implement Collection on *const [T]/*mut [T].
Gankro
Oct 9, 2014
Author
Contributor
I don't think Collection is going to survive #235, although that's a bit ambiguous.
lilyball reviewed on Oct 9, 2014
* Deprecate `RawPtr::null` and `RawPtr::is_not_null` as awkward to use.

* Deprecate all of `slice::raw` as poorly motivated, especially with raw slices available.
lilyball
Oct 9, 2014
Contributor
Half of slice::raw is actually operations on std::raw::Slice, not unsafe operations on slices.
Gankro
Oct 9, 2014
Author
Contributor
And that half doesn't seem to actually be used anywhere, as far as I can tell. Searching github just reveals... various versions of slice.rs.
lilyball
Oct 9, 2014
Contributor
Looks like you're right, pop_ptr and shift_ptr are used inside rust only in libcore/slice.rs.
lilyball reviewed on Oct 9, 2014
a proper reference to the value. Therefore we offer both free functions and methods so that the
most ergonomic calling style can be used. In one hypothetical UFCS future, it would be possible to
import the RawPtr methods *as* free functions, in which case the free functions can reasonably
be deprecated.
lilyball
Oct 9, 2014
Contributor
If they're going to be duplicated, why not just leave them as free functions in the ptr module? Seems like ptr::read(foo) is more convenient than RawPtr::read(foo).
Edit: I thought you were saying RawPtr would have static functions on it, but reading below, you left the ptr free functions in place.
lilyball reviewed on Oct 9, 2014
* As written, `RawSlice<T>` theoretically prevents `RawPtr<T>` ever being implemented for
`T: Unsized`. This could be potentially worked around by a very specific exception to the blanket
impl for `T = [U]`.
lilyball
Oct 9, 2014
Contributor
RawPtr<T> can't ever be implemented on T: Unsized anyway, because the read() method returns a T, which is only valid if T is sized.
lilyball
Oct 9, 2014
Contributor
On that note, this means that raw pointers to unsized types besides [T] will implement neither RawSlice<T> nor RawPtr<T>. This seems potentially unfortunate. For example, a *const str will have this issue.
Perhaps RawPtr<T> should actually be split into two traits, RawPtr<T> which contains everything that doesn't conflict with RawSlice<T>, and then RawSizedPtr<T: Sized>.
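A rough sketch of what such a split could look like, written in present-day Rust syntax. The trait and method names (`RawPtrExt`, `RawSizedPtrExt`, `is_null_ptr`, `read_value`) are hypothetical and chosen only to avoid clashing with the inherent pointer methods that exist today. They are not the names from the RFC. The point is that anything returning `T` by value needs `T: Sized`, while a null check does not.

```rust
/// Operations that make sense for any pointee, sized or not (hypothetical).
trait RawPtrExt<T: ?Sized> {
    fn is_null_ptr(self) -> bool;
}

/// Operations that return a `T` by value and therefore need `T: Sized`
/// (hypothetical).
trait RawSizedPtrExt<T> {
    /// Safety: `self` must be valid for reads and point to an initialized `T`.
    unsafe fn read_value(self) -> T;
}

impl<T: ?Sized> RawPtrExt<T> for *const T {
    fn is_null_ptr(self) -> bool {
        self.is_null()
    }
}

impl<T> RawSizedPtrExt<T> for *const T {
    unsafe fn read_value(self) -> T {
        // Bitwise copy of the pointee; the caller upholds the contract above.
        unsafe { std::ptr::read(self) }
    }
}
```

With this arrangement a `*const str` gets `is_null_ptr` but not `read_value`, which matches the concern raised above.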
lilyball reviewed on Oct 9, 2014
* Are some of the proposed-to-be-deprecated functions worth saving?

* Should `ptr.offset` be deprecated in favour of only using pointer arithmetic? Being able to
method chain offsets is moderately convenient.
lilyball
Oct 9, 2014
Contributor
No. I'm actually somewhat inclined to say I don't agree with pointer arithmetic. Pointer arithmetic, as a + b, can be one of the most confusing aspects of dealing with pointers, partially because it looks like regular arithmetic, and partially because the actual value of the result depends not just on the value of b, but also on the size of the type of *a. For this reason, I'm leaning towards saying that we should just not have pointer arithmetic and continue to use the .offset() method.
Gankro
Oct 9, 2014
Author
Contributor
That's really good to know. @thestinger suggested to me on IRC that pointer arithmetic was highly desirable. If I understand the history correctly, it used to be there, but had to be removed because it had to be an Unsafe operation, and Add and Sub are safe. If that's true, then that implies the only reason it was removed was purely technical, and not philosophical.
Personally, I don't have a very strong opinion. offset is easier to chain, which I prefer. However arithmetic does make porting tricky C/C++ code easier.
lilyball
Oct 9, 2014
Contributor
> However arithmetic does make porting tricky C/C++ code easier.
Porting to offset, I would argue, makes sure you understand what the code is actually doing. Being able to copy a chunk of pointer arithmetic without modification is something I would be extremely nervous about recommending anyone do.
mahkoh
Oct 9, 2014
Contributor
> Being able to copy a chunk of pointer arithmetic without modification is something I would be extremely nervous about recommending anyone do.
If pointer arithmetic behaves the same in both languages then modifications are more likely to introduce new bugs. Not to mention that offset is harder to read.
lilyball
Oct 10, 2014
Contributor
> If pointer arithmetic behaves the same in both languages
That's a very big if. When porting code to Rust, you may modify types along the way, and that would potentially invalidate any pointer arithmetic. For example, if you took C code that used int and converted it to Rust code that used int, the size of the type is now different on 64-bit machines.
> offset is harder to read
I actually think it's slightly easier to read, because it highlights the fact that this is pointer arithmetic, which is especially helpful when doing pointer arithmetic and integer arithmetic at the same time.
If you're doing enough pointer arithmetic that .offset() becomes too unwieldy, then I think you might have a problem ;)
mahkoh
Oct 10, 2014
Contributor
> When porting code to Rust, you may modify types along the way, and that would potentially invalidate any pointer arithmetic. For example, if you took C code that used int and converted it to Rust code that used int, the size of the type is now different on 64-bit machines.
If your pointer arithmetic depends on the size of a certain type, then choosing a different size when porting is a bug unrelated to pointer arithmetic. In this case the issues and RFCs related to the name of Rust's int type are more relevant. Either way, if
    (x + 1) as uint == x as uint + size_of::<X>()
then a faithful port will behave the same way. Furthermore, pointer arithmetic will be used much more often for the inspection of arrays, and in this case only the number of elements matters, and not the contained type.
> I actually think it's slightly easier to read, because it highlights the fact that this is pointer arithmetic
This has nothing to do with readability but [something whose name I don't know [the transfer of knowledge]]. Given full knowledge of the behavior of the operations involved,
    x + y
is more readable than
    x.add_with_possible_overflow(y)
In this case, x and y are integers. And while integer arithmetic is more frequent than pointer arithmetic in Rust code, when you port code, you want to change as little as possible in order to not introduce new bugs.
Secondly, I do not believe that the behavior of pointer arithmetic is a source of many bugs (compared to ordinary unchecked array indexing.) Most of the time you either index an array or inspect an object with c_char pointers (also because of the stricter aliasing rules in C.) In the second case, pointer arithmetic behaves just like integer arithmetic.
Thirdly, pointer arithmetic only happens in unsafe blocks where increased attention is required anyway.
Lastly, raw pointers are not arbitrary memory locations but the addresses of properly aligned objects. Nobody would expect x + 1 to create a pointer whose address is one larger than the address of the previous pointer, because that would be undefined behavior if the required alignment for *x is larger than one.
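The scaling behaviour both sides are arguing about can be checked directly. The sketch below uses present-day Rust and assumes the stable `wrapping_offset` method, which is safe to call because it only computes an address. The helper names `stride_of` and `stride_matches_size` are hypothetical. It tests the identity quoted in the comment above: moving a `*const T` by one element moves its address by `size_of::<T>()` bytes.

```rust
use std::mem::size_of;

/// Byte distance between `p` and the pointer one element after it.
fn stride_of<T>(p: *const T) -> usize {
    // `wrapping_offset` never dereferences; it only does address arithmetic.
    // Reading through the result would still require `unsafe`.
    let next = p.wrapping_offset(1);
    (next as usize).wrapping_sub(p as usize)
}

/// Checks that one element step equals `size_of::<T>()` bytes.
fn stride_matches_size<T>(p: *const T) -> bool {
    stride_of(p) == size_of::<T>()
}
```

For a `*const u8` the stride is 1, so in that case pointer arithmetic and integer arithmetic on the address coincide, as noted for `c_char` pointers above.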
lilyball reviewed on Oct 9, 2014
* raw slices could also support `offset`, and consequently unsafe addition and subtraction. This
could be useful for shifting a window into a larger slice around. This would also bring raw slices
and pointers closer together in functionality. Unclear if this is desirable. May accidentally fall
out of just providing the functionality on raw ptrs, unless explicitly prevented.
lilyball
Oct 9, 2014
Contributor
I don't think raw slices should have offset(). It would be used far too rarely to justify its presence.
lilyball reviewed on Oct 9, 2014
* More slice methods can be ported to raw slices to provide more unchecked operations. It may
be worth considering this. This can be done in a back-compat way later, though.

* Checked or truncating versions of the copy methods on raw slices?
lilyball
Oct 9, 2014
Contributor
Truncating, and it should return a uint with the number of elements it actually copied.
Gankro
Oct 9, 2014
Author
Contributor
That's a C-ism, yeah? Maybe we could have something like Result<(), uint>?
lilyball
Oct 10, 2014
Contributor
A C-ism? I don't think so. It's a common idiom in many languages when doing a copy where the number of elements copied is calculated rather than passed as a parameter.
I don't see what use Result<(), uint> would be. All it does is let you check if the copy wasn't truncated without having to get the length, but it's unclear as to whether it returns Ok(()) in the case where src.len() == dest.len() or merely in the case where src.len() <= dest.len(). It also is intentionally throwing away information that might be potentially useful if you actually do care how many elements were copied regardless of whether there was truncation.
Gankro
Oct 10, 2014
Author
Contributor
Result<uint, uint> then? I would expect only truncation would be an Err, but I could see that that really depends on context. I can live with a single uint, it just feels a bit semantically poor. I obviously don't work with those kinds of APIs much, though.
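As an illustration of the single-count return style under discussion, here is a sketch on ordinary safe slices in present-day Rust. This is an assumption-laden analogue, not the raw-slice API from the RFC: the function name `copy_truncating` is hypothetical, and it relies on the stable `copy_from_slice` method.

```rust
/// Copies as many elements as fit into `dst` and returns how many were copied.
fn copy_truncating<T: Copy>(dst: &mut [T], src: &[T]) -> usize {
    // The copy length is computed, not passed in by the caller.
    let n = dst.len().min(src.len());
    dst[..n].copy_from_slice(&src[..n]);
    n
}
```

A caller that cares about truncation can compare the returned count with `src.len()`, and a caller that only wants the count gets it without any unwrapping.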
lilyball reviewed on Oct 9, 2014
* Checked or truncating versions of the copy methods on raw slices?

* Maybe deprecate `is_null()` in favour of `== ptr::null()`?
lilyball
Oct 9, 2014
Contributor
No, I think null is special enough and comparing against null is common enough that .is_null() is worth preserving.
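For reference, the two spellings being compared are equivalent for pointers to sized types. The sketch below is in present-day Rust, and the helper name `null_checks_agree` is hypothetical.

```rust
use std::ptr;

/// Returns true when the method form and the comparison form give the same answer.
fn null_checks_agree<T>(p: *const T) -> bool {
    // Method form versus explicit comparison against a null pointer.
    p.is_null() == (p == ptr::null())
}
```

The method form is shorter and does not rely on type inference to pick the right `ptr::null::<T>()`.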
lilyball reviewed on Oct 9, 2014
* Maybe deprecate `is_null()` in favour of `== ptr::null()`?

* swap and replace *aren't* provided on the current RawSlice API for no particular reason other
lilyball
Oct 9, 2014
Contributor
I assume you mean variants that would swap/replace a value at an index? As long as it's easy to get a pointer to one of the elements, then I don't think we should add it to RawSlice.
What about slice_unchecked and slice_mut_unchecked? |
@arielb1 Can you clarify? |
Just noticed that the unsafe methods on |
Egh, just got reminded of |
Discussed with @pcwalton the viability of adding any lang stuff for new unsafe operators and it looks like that's a total no-go for 1.0; not out of the question for later, though! If this is indeed the case, then we need a migration plan. I propose adding the desired operators as unsafe named methods marked |
After some discussion with @aturon we've concluded that the most interesting bits can largely be hacked out in cargo while we wait for Rust to flesh out its operator overloading story. There are some useful ideas in here for stabilization, but we can revisit that with a different RFC.
Gankro commented on Oct 7, 2014
*const [T] and *mut [T] that provide parts of the slice and ptr API to better bridge the gap between the two.
unchecked slice manipulation.
provide more ergonomic ptr manipulation.
ptr, and duplicate the rest as convenience methods on the RawPtr extension traits.
unsafe methods on slices and slice::raw in favour of raw slices.
them.