RFC: Make `*T` not nullable #10571
Comments
|
I feel a little weird having Rust types like |
|
We need to discuss this further, there are interesting issues on both sides of this. P-backcompat-lang. |
|
Some comments:
Upside of the change:
Downside of the change:
|
|
See some discussion in #9788 |
|
I like this a lot. (It's similar to what I tried proposing a few months ago.) I think the mentioned downside is an upside. It should be explicit whether nulls are possible or not. Why should it be different for If we do this then the |
jld referenced this issue on Nov 24, 2013: `Option<*T>` should be represented as a pointer #10570 (Closed)
|
My issue with this is that I consider the current enum-pointer optimization to be just that, an optimization, and relying on that behaviour doesn't sit right with me. Currently, I am firmly against anything that makes This also, for some reason, singles out 0 as a bad value. The whole point of the raw pointers is that Rust cannot determine if they point to any valid location. Why should we treat 0 as special in this case? What about other bogus pointers? What about tagged pointers? They aren't safe either and I don't see how dubiously preventing I can see this only producing misleading code and faulty assumptions. Canonicalising If |
|
I extremely strongly agree with @Aatch. Making |
|
I also agree with @Aatch. |
|
I approve of this change. It will ensure that writers of C bindings document their assumptions and that their code is consistent with their assumptions. In debug builds, Rust could automatically insert assertions to test these assumptions (e.g. that certain C functions never return null).
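To illustrate the debug-assertion idea, here is a minimal sketch in present-day Rust. It assumes a hypothetical C function, stood in for by `c_get_buffer`, that is documented never to return null; `NonNull` is a standard library type that postdates this thread, and the check is written by hand rather than inserted by the compiler.

```rust
use std::ptr::NonNull;

/// Stand-in for a C function documented never to return null.
/// (Hypothetical; a real binding would declare it in an `extern "C"` block.)
fn c_get_buffer() -> *mut u8 {
    // Hand out a heap allocation so the pointer stays valid for the demo.
    Box::into_raw(Box::new(42u8))
}

/// Wrapper that records the "never null" assumption in its return type.
fn get_buffer() -> NonNull<u8> {
    let raw = c_get_buffer();
    // Checked in debug builds only; compiled out in release builds.
    debug_assert!(!raw.is_null(), "c_get_buffer returned null");
    // SAFETY: relies on the documented guarantee of the C function.
    unsafe { NonNull::new_unchecked(raw) }
}

fn main() {
    let p = get_buffer();
    // SAFETY: the pointer came from `Box::into_raw` and is still live.
    let value = unsafe { *p.as_ptr() };
    println!("{}", value);
    // SAFETY: reclaim the allocation exactly once.
    unsafe { drop(Box::from_raw(p.as_ptr())) };
}
```

If the assumption is wrong, a debug build fails loudly at the binding boundary instead of at some later dereference.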
They are not pointers! They should be represented by corresponding Rust types depending on how they differ from pointers (both 0 and 1 are special; low bits are meant to be lopped off; etc). In the worst case, they can be represented in Rust as (newtyped) uints. |
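The "newtyped uint" representation might look roughly like the sketch below. The type name `TaggedWord`, the two-bit tag layout, and the example address are all assumptions made for illustration; nothing here comes from an existing library.

```rust
/// A machine word whose low bits carry a tag and whose remaining bits
/// carry an address. The two-bit tag width is an assumption of this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TaggedWord(usize);

const TAG_BITS: usize = 2;
const TAG_MASK: usize = (1 << TAG_BITS) - 1;

impl TaggedWord {
    /// Pack an aligned address and a small tag into one word.
    fn new(addr: usize, tag: usize) -> TaggedWord {
        assert!(addr & TAG_MASK == 0, "address must leave the tag bits free");
        assert!(tag <= TAG_MASK, "tag does not fit in the tag bits");
        TaggedWord(addr | tag)
    }

    /// The tag stored in the low bits.
    fn tag(self) -> usize {
        self.0 & TAG_MASK
    }

    /// The address with the tag bits lopped off.
    fn addr(self) -> usize {
        self.0 & !TAG_MASK
    }

    /// Only at this point does the value become a raw pointer.
    fn as_ptr<T>(self) -> *const T {
        self.addr() as *const T
    }
}

fn main() {
    let w = TaggedWord::new(0x1000, 0b11);
    let p: *const u32 = w.as_ptr();
    println!("addr={:#x} tag={} null={}", w.addr(), w.tag(), p.is_null());
}
```

Keeping the tagged value as an integer means the type never claims to be a dereferenceable pointer until the tag has been removed.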
|
This could be really annoying in kernel code where |
If this is referring to
How then would you distinguish I don't get the attachment to C semantics, either. We have no trouble learning from C's mistakes in other parts of the language. Where else in the semantics (not syntax) of the language do we say "we have to do it this way because C does it this way, period", even if another way might be better? Why here? Finally, do (plural) you feel the same way about function pointers, where the same things (non-nullable, need Things I agree with are that writing the null pointer optimization in stone and assuming One further concern I would raise from an ergonomics perspective, overlapping with but not quite the same as things above, is that if it looks like a C pointer, and it sure does, then people are going to expect that it behaves like a C pointer (to some degree this might already be apparent). They're going to think it's the same thing as in C and use it in their |
|
@glehel I don't like the idea of pushing systems concerns further to the side in a systems programming language. There needs to be some type that you can use to directly deal with the hardware without Rust sticking its nose in. We have pointers that don't follow C semantics. We do need one that does. This isn't something we can change, we have to deal with C code. As for other values being passed to extern code being made into illegal values, yes, you're right, but in those cases either something has gone terribly wrong or you are making a terrible mistake. My issue is that it gives the illusion of safety where it's blatantly false. It's basically saying "these pointers are never null, except when they are", because it's not like the compiler will catch the little things like forgetting to wrap some type in an Option. Lastly, what, in light of these criticisms, would this change gain you? It doesn't make raw pointers any more safe. It doesn't make writing FFI bindings any easier; it could help catch a class of errors, but only by introducing a new class. All it does, to me, is make things more complicated for no good reason. |
|
@Aatch Agreed. Furthermore, all current FFI bindings can be created pretty easily by looking purely at the types involved in the C declaration. This absolutely is not the case with a non-nullable |
|
@huonw Sure, but almost no code ever uses it. And note that it's an opt-in to being non-nullable, whereas the proposal here makes it an opt-out. |
|
@nikomatsakis you listed as one upside of the change: "More accurate types"; I think that would be more correctly stated as "more precise types"; all doing this can buy you (I think) is the ability to directly express in a Rust-ic fashion a distinction between a nonnullable and a nullable pointer. But as they said in my high school chemistry class, precision is not accuracy. |
|
This comment from @Aatch, "If There are potentially three or more options here, not two, and I do not know which niko intended from the original description.
From the debates on this ticket, it seems like a lot of people have been assuming that some variant of (3) is what is being proposed. (There are also subvariants of (2.), for example where we remove So, @nikomatsakis: can you clarify which (sub)variant were you proposing? |
|
@pnkfelix I was proposing option 3, though I had considered Option 2 for a while. I find the "hands off my The main motivator here is the fact that we permit casts from random integers to |
|
Oh, the other motivator is that I think it's genuinely surprising that |
|
And one parting thought -- I agree that in general enum repr should remain undefined, but I am ok with specifying the pointer optimization. I suspect it'll be a de facto standard whatever we do. |
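For reference, the pointer optimization under discussion can be observed directly in present-day Rust. This is a small sketch, not part of the original thread: it uses `NonNull`, `&T`, and `Box<T>`, for which current Rust documents that wrapping in `Option` adds no space. The final check, that `Option` around an ordinary nullable raw pointer is larger, reflects how current compilers lay the type out rather than a documented guarantee.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

fn main() {
    // Option around a non-null pointer type costs no extra space:
    // the null bit pattern is reused to represent `None`.
    assert_eq!(size_of::<Option<NonNull<u8>>>(), size_of::<*mut u8>());
    assert_eq!(size_of::<Option<&u8>>(), size_of::<*const u8>());
    assert_eq!(size_of::<Option<Box<u8>>>(), size_of::<*const u8>());

    // A nullable raw pointer has no spare bit pattern, so Option needs
    // a separate discriminant and the type grows.
    assert!(size_of::<Option<*mut u8>>() > size_of::<*mut u8>());

    println!("pointer-sized Option checks passed");
}
```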
|
Hmm, I was thinking more about the question of kernel code. My first thought was since one could still construct a |
|
On Wed, Nov 27, 2013 at 01:12:01AM -0800, Felix S Klock II wrote:
True. But then, ain't that always the case? But I guess it's |
|
My original instinct was to go with option 1: "Keep things as they are". I think the main reason why I'm considering alternatives is that I too find it genuinely surprising that @nikomatsakis what about an option 2 subvariant where we also remove (This is basically my way of trying to deal with my option 3 probably being a weak version of option 2.) |
|
By the way, in case it is unclear, I do find it distasteful (emphatically "not clever") that a consequence of option 2 as I described it (and I think in expected practice, option 3 as well) an application of |
Automation tools would/should stick to
Here too, if the story is that the syntax of nullable pointers is changing from There are two wrinkles. One is that The other wrinkle is that, as mentioned earlier, Another direction you could approach it from. What's going unmentioned is the other major use case for unsafe pointers: as building blocks for Rustic smart pointers, and in data structures with invariants the type system can't express. In these cases you almost never want implicit nullability, and in fact it's a hindrance because it means that @pnkfelix: I agree that I don't like Option 2 either. :-) |
|
On Wed, Nov 27, 2013 at 03:00:58AM -0800, Felix S Klock II wrote:
Yes, clearly this is surprising too. |
|
@nikomatsakis: On second thought, kernels wanting to use 0 pointers for something real will probably need a project-wide attribute like |
You don't even need a keyword, you could just use something like |
|
Keep in mind that when you say "*T can point anywhere at anything", you're invoking undefined behavior in LLVM. |
|
@cmr As far as I'm aware, it's only undefined if you dereference the |
ghost referenced this issue on Nov 28, 2013: Represent most nullable ptrs as `Option<*T>`s in the ffi #51 (Merged)
|
I agreed with this at first, but I don't anymore. I was working with some embedded code where 0 was a valid and used pointer. With this change, I wouldn't be able to use Rust for that project. What does this change actually win us, anyway? I don't find the upsides particularly compelling. |
|
@cmr: as @thestinger pointed out, this is just as relevant for other pointer types, i.e. you can't have |
|
I'm not worried about those other types. On Wed, Dec 4, 2013 at 2:35 PM, Gábor Lehel notifications@github.com wrote:
|
|
I'm wondering what the precise meaning of What I have bouncing around in my head is that maybe there should be two types. One that shares all of the same properties and type system invariants as the safe pointer types, including non-nullability, except that it's up to the programmer, rather than the compiler, to uphold them, potentially gets followed by the GC, and so forth. This would mainly be used for things like smart pointers and data structures. And one that's truly just a raw memory address "like in C" (or perhaps instead asm?), with nothing at all assumed about it, the programmer can dereference it if she wants to or she can not, and otherwise it's just as inert as an |
|
I would not expect the garbage collector to ever follow a |
|
In C, you're not allowed to do pointer arithmetic outside of the bounds of an object (with a special case allowing one-byte-past-the-end), you're not allowed to make arbitrary casts between pointers and you're definitely not allowed to dereference a null/dangling pointer. They're not just an address at all, and it's not possible to write something like an XOR-linked-list without hitting undefined behaviour due to the aliasing/derived pointer rules you must respect. LLVM inherits almost all of these semantics from C and |
|
@kballard what about @thestinger I know, which is why I said "or perhaps instead asm?". I don't personally care how C-like versus uint-like it is or isn't. |
|
@glehel: Hrm, I hadn't considered the fact that |
|
C++ programmers aren't going to be willing to make compromises for garbage collection, so it can't dictate the design of the language. If it's intended to be a fully optional feature, it's entirely a library/compiler issue and doesn't belong in language design. The standard library can use as many attributes as needed to support it. |
|
My assumption has been that when we add a proper Gc, we will probably have to also add some way for 3rd party libraries providing smart pointers and/or allocators to properly interoperate with it. (And if a library does not or cannot interoperate with it, then a task won't be able to compose that library with the Gc -- though I hope that attempts to perform erroneous compositions would at least be statically detected rather than dynamic failures.) The design is still quite fuzzy in my head, but this may involve any/all of:
These topics remain to be worked out. (I was about to say "I don't know what bearing the above has on the issue of making |
|
@pnkfelix: I expected it would be something like adding an attribute to fields with raw pointers the garbage collector should trace through. Anyway, adding lots of pain to low-level code is an incentive to maintain another library ecosystem. |
|
@thestinger hmm, I admit that my definition of "some other protocol" had not included that option. But I'm going to take the liberty of reinterpreting my own comment to now include that option (though I'm still not sure if it's what I would go with). |
|
I withdraw this suggestion. |
nikomatsakis commented Nov 19, 2013
I think the fact that `*T` is nullable is an anachronism (or will be once #10570 is fixed). We should just use `Option<*T>` for nullable pointers. Anybody else have an opinion?
Nominating.
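A rough sketch of what the proposal looks like in practice, written in present-day Rust rather than the 2013 syntax used in this thread. Today's raw pointers (`*const T`, `*mut T`) remain nullable, so the sketch uses the later standard library type `NonNull<T>` to play the role of a non-nullable `*T`; the helper names `from_c` and `to_c` are hypothetical.

```rust
use std::ptr::{self, NonNull};

/// Convert a C-style nullable pointer into the explicit form:
/// `None` stands for null, `Some` carries a pointer known to be non-null.
fn from_c(raw: *mut i32) -> Option<NonNull<i32>> {
    NonNull::new(raw)
}

/// Convert back to a C-style nullable pointer.
fn to_c(p: Option<NonNull<i32>>) -> *mut i32 {
    p.map_or(ptr::null_mut(), |nn| nn.as_ptr())
}

fn main() {
    let mut x = 7;
    let some = from_c(&mut x);
    let none = from_c(ptr::null_mut());

    assert!(some.is_some());
    assert!(none.is_none());

    // Round-tripping preserves both the null and the non-null case.
    assert!(to_c(none).is_null());
    assert_eq!(to_c(some), &mut x as *mut i32);
}
```

The caller is forced to handle the `None` case before obtaining a pointer, which is the explicitness the proposal is after; whether the pointer is otherwise valid remains the programmer's responsibility.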