
Unsafe fields #381

Open
pnkfelix opened this Issue Oct 9, 2014 · 67 comments

pnkfelix (Member) commented Oct 9, 2014

Multiple developers have registered a desire to allow public access to fields that expose abstraction-breaking impl details (rather than following the current convention of making all such fields private to the structure in question, thus containing the abstraction-breakage to the structure's module).

See the following postponed (or otherwise closed) RFCs:

pnkfelix added the postponed label Oct 9, 2014

RalfJung (Member) commented Jan 12, 2016

I think there is a use for such unsafe fields even when they are private, see https://internals.rust-lang.org/t/pre-rfc-unsafe-types/3073.

RalfJung (Member) commented Jan 12, 2016

Honestly speaking, none of these RFCs really convinced me we want unsafe public fields. The further away an access is from where the invariant "is", the more likely it is for that access to be done without enough consideration of the invariant. I'm already worried about modules growing big; I don't think there should ever be direct access to fields carrying invariants from outside the module. If something like this is needed, just provide a public, unsafe setter function. This provides a good place to add precise documentation about when the function is safe to use (thus also very explicitly making this part of the promised, stable API), and the name of the function may give callers some additional hints.

pnkfelix (Member) commented Jan 12, 2016

@RalfJung if I'm reading your two comments correctly, it sounds like you do want this feature, but you don't want it for public fields?

(It seems odd to add it solely for non-pub fields, in that it seems like a semi-arbitrary restriction ... don't get me wrong, I can understand the reasoning that "it's unsafe, so there's surely some invariant that should be maintained by the declaring module itself"; but nonetheless, that seems like an argument to lint and warn against unsafe pub fields, not a reason to outlaw them in the language itself.)

Update: Oh, whoops, I didn't read the linked pre-RFC or even look carefully at its title. After skimming the proposal there, I can see how that model means that pub and unsafe are no longer orthogonal, but rather quite intermingled.

RalfJung (Member) commented Jan 12, 2016

I'm not sure I want that feature (unsafe private fields); I am uncertain. But I thought it was worth bringing it up for discussion.

I do not want unsafe public fields. I think it is a good idea to keep the intuitive notion of the abstraction boundary and "how far out do I have to care about this invariant" aligned.

pnkfelix (Member) commented Jan 12, 2016

(Just for proper cross-referencing purposes: there is an excellent series of comments regarding unsafe fields that unfortunately occurred on an RFC issue other than this one.)

daniel-vainsencher commented Jan 12, 2016

In that RFC issue, glaebhoerl mentions that safe code can break unsafe code's assumptions in at least two ways:
A) Modify a field incorrectly (this is what declaring fields unsafe could prevent).
B) Unsafe code calls safe code and makes assumptions (say, that arithmetic is done correctly). If the called safe code is wrong, the unsafe code now does horrible things.

glaebhoerl, tsion and RalfJung seem about half convinced that the existence of B invalidates the value of unsafe fields, but I think B is much less terrible than A. In a nutshell, eliminating A is like limiting the use of global mutable state in software.

Unsafe fields would create a wonderful invariant: "every piece of memory-safety-critical code is an explicit dependency of some unsafe code". This makes inspecting code for safety much easier and more intuitive than the current situation, where, starting from unsafe code, we need to propagate through every variable it relies on to all code that may access that variable (a much more implicit rule).

The rule "only code in unsafe blocks can contribute to any memory unsafety" is not a good goal IMO; unsafe code X relying on the functional correctness of code Y should not mean we need to increase the proof burden around Y by turning off some compiler checks, right? I think the explicit-dependency invariant, if we can get it, is about as good as it gets.

pnkfelix (Member) commented Jan 12, 2016

Some quick musings in response to @daniel-vainsencher's comment:

A. Modify[ing] a field incorrectly [... in the context of safe code ...] is what declaring fields unsafe could prevent

In the absence of an actual RFC, it's hard to make verifiable absolute statements about what a given feature would provide.

Will it be legal for unsafe code to first borrow an unsafe field into a usual &-reference, and then pass that reference into safe code?

  • If so, who is at fault when the safe code does something unexpected in that scenario?
  • If not, then what happens when borrowing an unsafe field (within the context of an unsafe block, that is)? Would we add &unsafe-references as part of this hypothetical feature? Or would borrows of unsafe fields only go straight to unsafe *-pointers, never to &-references?

(I am just trying to say that there is a non-trivial burden of proof here, even when it comes to supporting seemingly obvious claims such as A above.)

Update: One might argue that safe code won't be able to do anything harmful with the &-reference; but of course there is always both &mut and also interior mutability to consider.

Update 2: And of course my argument above is a bit of a straw man, in the sense that buggy unsafe code could also pass a borrowed reference to private data to clients who should not have any access to it. But I'm not claiming that privacy prevents such things. I'm just wary of what claims are made about a proposed feature.

glaebhoerl (Contributor) commented Jan 12, 2016

@daniel-vainsencher Also how do (or don't) indirect calls factor into that scenario?

daniel-vainsencher commented Jan 12, 2016

To clarify (to myself first of all): I am referring to RFC #80, and believe that the proposed feature is that safe code by itself cannot access unsafe fields. Your scenario of unsafe code passing references to unsafe fields into safe code is allowed. This could require annotating the parameter in the safe function as unsafe (which makes some sense), but I am not assuming this (allowing more uses for the unsafe keyword is a bigger change).

Note that unsafe code may access both safe and unsafe fields, but should be written so that safety depends only on values of unsafe fields. I was not at all clear on this in my previous comment.

@pnkfelix:
I do not understand how you are using "at fault" in this context. My point of view is in terms of the guarantees that programmers can get while acting "naturally", and what it costs.

  • To guarantee safety, it seems obvious you have to carefully inspect all the unsafe code.
  • The feature above relies on the programmer marking as unsafe any fields carrying non-trivial invariants (ones that are not implied by the type, and that can affect safety). This is something we will need to inspect unsafe code for, but it feels "natural enough" to me (compared to inspecting the whole module for any access to any field unsafe code relies on).
  • Under the feature specified above, all access to unsafe fields is through unsafe code:
    • direct access, obviously
    • to my understanding of Rust, borrowed access implies that some unsafe context loaned it to you (@glaebhoerl: directly or otherwise).
  • Since safety of unsafe code already depends on functional correctness of safe code it calls, we are not increasing the burden by enabling unsafe fields to be loaned into safe code. On seeing an unsafe field passed into called functions (safe or unsafe), we should "naturally" worry a lot about what exactly the callee is doing to it.
daniel-vainsencher commented Jan 12, 2016

I am not advocating for making unsafe fields potentially public (or against it). I can more easily imagine uses for private unsafe fields (restricting access even more than now, hence reducing the inspection burden), but system-level abstractions may exist where public makes sense. Do we know of examples of such?

glaebhoerl (Contributor) commented Jan 12, 2016

@daniel-vainsencher What I was thinking of is that "every memory-safety-critical code is an explicit dependency of some unsafe code" seems to break down the moment that the unsafe code makes an indirect call -- suddenly the range of code which may be implicated becomes open-ended again.

It may be that this isn't relevant in practice, because if unsafe code relies on particular assumptions about the behavior of fns or trait objects passed into it for its safety, then it is violating the unsafe contract -- that it should be safe for all possible inputs? I haven't thought that all the way through...

FWIW, if unsafe fields really would let us so sharply delineate the code which may or may not be implicated in memory safety, the idea sounds appealing to me. Are struct fields really the only vector through which unsafe code can indirectly depend on the behavior of safe code?

daniel-vainsencher commented Jan 12, 2016

@glaebhoerl sorry, I thought by indirect you meant merely transitive callees; now I understand you mean calls to functions (or methods of traits) passed into the unsafe code as parameters.

I am still calling that an explicit dependency in terms of a human inspector, even if the call is more or less opaque to the compiler: when inspecting unsafe code with a function parameter, you are in general liable to find out what values (code) those parameters might have and reason about their functional correctness.

The cost incurred is proportional to the power being used (in contrast to the current situation, where accessing any field requires inspection of the whole module), so it seems ok to me.

daniel-vainsencher commented Jan 12, 2016

@glaebhoerl I am not sure it is very meaningful as a contract to require unsafe code to be "safe for all possible inputs" when, in the current situation, it is often not safe for every value of fields that can be accessed by safe code anywhere in the module. Feels a bit like requiring iron bars on the window when the door is missing... ;)

I may well be missing ways to screw things up; we'll see.

daniel-vainsencher commented Jan 25, 2016

Wondering what @Gankro thinks about this proposal, having just read his thesis.

Gankro (Contributor) commented Jan 26, 2016

I have frequently argued against unsafe fields and unsafe types. I essentially introduce the problem they're supposed to solve, and then reject them, in this section of the nomicon (https://doc.rust-lang.org/nightly/nomicon/working-with-unsafe.html).

I don't really have the energy to put in a detailed argument, but basically I don't believe it's well-motivated. unsafe-fn is, unsafe-trait is. Pub unsafe fields could be argued for as a useful ergonomic boon for exposing an unsafe piece of data (see: mutating statics, accessing repr(packed) fields, derefing raw ptrs). Priv-unsafe fields are basically pointless. I don't think I've ever seen an error in the wild that could have been prevented, and I've seen several that wouldn't have.

daniel-vainsencher commented Jan 26, 2016

I just read that section (all of chapter two, actually); unsafe fields are not mentioned nor argued against, except if you mean implicitly, by stating that privacy works perfectly?

The argument I have put above for unsafe private fields is twofold: the code one needs to inspect to ensure invariants is smaller (possibly much smaller, for a large module only small parts of which are unsafe-relevant), and explicitly enumerated, instead of "the whole module".

Isn't this the place to argue for/against this feature?

RalfJung (Member) commented Jan 26, 2016

    the code one needs to inspect to ensure invariants is smaller (possibly much smaller, for a large module only small parts of which are unsafe-relevant), and explicitly enumerated, instead of "the whole module".

This is not actually true. First of all, it relies on nobody forgetting to mark any fields unsafe. Secondly, the code that has to be inspected is still "any function that contains an unsafe block", since there will usually be local variables that we rely on to have certain invariants. Finally, I'd be surprised if, after properly marking all fields as unsafe, there would be a significant number of functions left that have no unsafe block.

    Pub unsafe fields could be argued for as a useful ergonomic boon for exposing an unsafe piece of data (see: mutating statics, accessing repr(packed) fields, derefing raw ptrs).

I disagree, and think "pub unsafe" is a contradiction in itself. Privacy is tied to abstraction, and safety is tied to hiding unsafe behind an abstraction boundary; I think it would just be confusing to loosen this connection.

reem commented Jan 26, 2016

I once opened an RFC for this, but after talking to @Gankro I am convinced this feature doesn't offer much. The number of lines of code in unsafe blocks is not at all a good proxy for the "unsafety" of the code.

A single line of unsafe code can do basically anything and affect any other code (usually within the privacy boundary), so you don't shorten review time by typing unsafe a smaller number of times.

arielb1 (Contributor) commented Jan 26, 2016

@RalfJung

pub unsafe makes perfect sense - it means that the fields have additional invariants that are not represented by the type system - for example, possibly the len field of a Vec. It's not like they add an additional proof obligation to every piece of unsafe code that is written, only to unsafe code that writes to the field.

RalfJung (Member) commented Jan 26, 2016

If fields have additional invariants, they should be private. If you make them public, you actually make your invariants part of the public API, and you can never ever change the invariants associated with that field again, because that could break in combination with client code that still assumes (and establishes!) the old invariants. This is just really bad API design, and IMHO confusing. If you really want this, go ahead and add a getter/setter; that doesn't even incur any overhead after inlining. But at least you had to make some conscious effort to decide that this should be part of your public API.

arielb1 (Contributor) commented Jan 26, 2016

The API concern isn't restricted to unsafe fields. Every time you define a pub field with an invariant, even if it is not a memory-safety-related invariant, you allow users of your struct to depend on it. That's how things work.

@RalfJung

Member

RalfJung commented Jan 26, 2016

Sure. And the answer to this problem is to make the field private, not to make it unsafe. There is a way to solve this problem in Rust right now, and it is superior to pub unsafe, we should not add a second way.

@arielb1

Contributor

arielb1 commented Jan 26, 2016

That was also my argument the first time this feature came around - I don't believe this feature pulls its weight.

OTOH, privacy is primarily intended for abstraction (preventing users from depending on incidental details), not for protection (ensuring that invariants always hold). The fact that it can be used for protection is basically a happy accident.

To clarify the difference, C strings have no abstraction whatever - they are a raw pointer to memory. However, they do have an invariant - they must point to a valid NUL-terminated string. Every place that constructs such a string must ensure it is valid, and every place that consumes it can rely on it. OTOH, a safe, say, buffered reader needs abstraction but doesn't need protection - it does not hold any critical invariant, but may want to change its internal representation.

On the other 2 corners, a tuple has no abstraction and no protection - it is just its contents, while an unsafe HashMap both wants to hide its internals and has complex invariants that need to be protected from careless (or theoretically, malicious) programmers.
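The C-string corner (protection without abstraction) can be sketched in current Rust. This is just an illustration — `c_strlen` is a hypothetical name, not a std API: the pointer type hides nothing, and the invariant is enforced purely by the `unsafe` contract on the function.

```rust
// A C string is just a raw pointer: no abstraction at all. Its
// invariant (points to valid, NUL-terminated memory) is enforced
// only by marking the consuming function `unsafe`.
unsafe fn c_strlen(mut p: *const u8) -> usize {
    let mut n = 0;
    // Relies entirely on the caller's promise that `p` points to a
    // readable, NUL-terminated buffer.
    while *p != 0 {
        p = p.add(1);
        n += 1;
    }
    n
}

fn main() {
    let buf = b"hello\0";
    // The caller takes responsibility for the invariant here.
    let n = unsafe { c_strlen(buf.as_ptr()) };
    assert_eq!(n, 5);
}
```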

@RalfJung

Member

RalfJung commented Jan 26, 2016

I disagree, that's not an accident. Abstraction is exactly what allows code to rely on local invariants to be maintained, no matter what well-typed code runs using our data structure. Besides type safety, establishing abstraction is the key feature of type systems.
In other words, a well-established way to make sure that your invariants always hold is to ensure that users cannot even tell you have invariants, because they can't see any of the implementation details. People are used to this kind of thinking and programming from all kinds of languages that have privacy. This even works formally: once you have proven "representation independence", you can use that proof to show that your invariants are maintained.

Now, Rust has this mechanism to pretty much embed untyped code within typed code (I know that unsafe still does enforce typing rules, but fundamentally, not much is lost when thinking of unsafe code as untyped). Of course untyped code can poke the abstraction. But that doesn't mean we should make it convenient for that code to do so.

@mahkoh

Contributor

mahkoh commented Jan 26, 2016

Of course untyped code can poke the abstraction. But that doesn't mean we should make it convenient for that code to do so.

Here we go again.

@pnkfelix

Member

pnkfelix commented Jan 26, 2016

@reem I am also not convinced that the feature would pull its weight.

But part of your response confused me:

you don't shorten review time by typing unsafe a smaller number of times.

I didn't think this feature leads to fewer unsafe blocks. If anything, there will be more.

I thought the reason some claim it shortens review times is that, in some cases, you get to focus your attention solely on the unsafe blocks. That's quite different, no?

@arielb1

Contributor

arielb1 commented Jan 26, 2016

Maybe "accidentally" wasn't the right word. Still, the existence of (public) unsafe functions demonstrates the usefulness of protection without abstraction. Unsafe fields are just a better interface in some cases.

@BurntSushi

Member

BurntSushi commented Jan 26, 2016

Moderator note: Just a gentle reminder to all to keep comments constructive. Thanks!

@RalfJung

Member

RalfJung commented Jan 28, 2016

I don't think it is flawed at all. Actually, I think I'd follow a similar strategy. Unsafety is all about assigning more meaning to types than they (syntactically) have. Carefully documenting what exactly it is that is additionally assumed is certainly the first step to a formal proof of safety (e.g., in the framework that I am trying to develop). In other words, if a module has been documented the way you suggested, that's already taking some burden from whoever ends up doing a full, formal proof of correctness. If then every unsafe block also has a comment explaining why this particular access maintains the invariant, that's already half the proof ;-) .

I think something like this has been in my mind when I suggested unsafe fields (well, I suggested unsafe types, but I was quickly convinced that doing this on the field level is better). I just wasn't able to articulate it so clearly.

@daniel-vainsencher

daniel-vainsencher commented Apr 26, 2016

In a discussion [1] on Reddit, /u/diwic proposed a solution within current Rust. I cleaned it up a bit and put it on a playpen at [2]. I think the best way forward is to use (a bikeshedded variant of) this module in some relevant use cases, and if the pattern is deemed important enough that the ugly syntax is unacceptable to incorporate into the language, pick up this issue again.

[1] https://www.reddit.com/r/rust/comments/4gdi4a/curious_about_state_in_safeunsafe_code/
[2] http://is.gd/9kGSvF

@Kimundi

Member

Kimundi commented Apr 28, 2016

That solution doesn't actually work though - you can still swap out a field with that wrapper type for another instance of that wrapper type that might contain an entirely different value, completely safely.
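The hole can be demonstrated in current Rust. The sketch below is a hypothetical reconstruction in the spirit of the linked playpen (`UnsafeMember` and `Buf` are made-up names): even with an `unsafe` constructor, entirely safe code can move a whole wrapper value into the field.

```rust
use std::mem;

// Wrapper whose value can only be created inside an `unsafe` block.
struct UnsafeMember<T>(T);

impl<T> UnsafeMember<T> {
    unsafe fn new(val: T) -> Self { UnsafeMember(val) }
    fn get(&self) -> &T { &self.0 }
}

struct Buf {
    // intended invariant: *len.get() <= 4
    pub len: UnsafeMember<usize>,
}

fn main() {
    let mut ok = Buf { len: unsafe { UnsafeMember::new(2) } };
    let mut evil = Buf { len: unsafe { UnsafeMember::new(1000) } };
    // No `unsafe` needed: safe code swaps in a wrapper that was
    // created elsewhere, smuggling in an invariant-breaking length.
    mem::swap(&mut ok.len, &mut evil.len);
    assert_eq!(*ok.len.get(), 1000);
}
```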

@ticki

Contributor

ticki commented Apr 28, 2016

I think of this as "public privacy" and "private unsafety". Privacy is a safety feature, allowing the programmer to make local reasoning about type invariants. Adding unsafe fields adds the ability to set public fields which hold some guarantee (e.g., vec.len <= vec.cap) without the function-call overhead of a setter method. Although such calls are inlined most of the time when optimizations are enabled, direct access will certainly improve performance in e.g. debug builds (for example, you can read a vector's length without a function call).

Even more interesting is the ability to express module-local invariants through the type system. In particular, you can ensure certain guarantees about your type in the defining module itself. Let's look at vec.rs from libcollections. This module is 1761 lines long (excluding tests), and has many invariants it relies on for safety.

Currently, Rust has a module-level safety model. You can shoot yourself in the foot by breaking invariants of structures in the defining module. Adding unsafe fields helps this situation a lot. It brings the responsibility for upholding the invariants down to the function itself, making reasoning about unsafety even easier.

The last thing to mention is better syntax. A lot of getters can be removed, arguably improving readability.

On the flip side, there is a lot of code that would need to be rewritten. And I imagine that, for consistency, certain fields in the standard library should be made pub unsafe. Note that such a change won't be breaking, since you can already have fields and methods with overlapping names.
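The module-level safety model can be illustrated in current Rust (all names below are hypothetical): privacy only protects across module boundaries, so safe code *inside* the defining module can still break the invariant that unsafe code relies on, with no `unsafe` marker anywhere near the bug.

```rust
mod myvec {
    pub struct MyVec {
        len: usize, // invariant: len <= cap
        cap: usize,
    }

    impl MyVec {
        pub fn new(cap: usize) -> MyVec {
            MyVec { len: 0, cap }
        }

        pub fn oops(&mut self) {
            // Compiles without `unsafe`, yet silently breaks the
            // invariant; an unsafe field would force an unsafe block
            // here, localizing the review burden.
            self.len = self.cap + 1;
        }

        pub fn invariant_holds(&self) -> bool {
            self.len <= self.cap
        }
    }
}

fn main() {
    let mut v = myvec::MyVec::new(4);
    assert!(v.invariant_holds());
    v.oops();
    assert!(!v.invariant_holds());
}
```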

@daniel-vainsencher

daniel-vainsencher commented Apr 28, 2016

@Kimundi: that version had a more general bug: initialization should also take invariants into account.

The fix is that new should be unsafe also. I have already applied this change (I am preparing a micro crate for unsafe_field).

This fix addresses your concern as well: http://is.gd/4JWZc2

@daniel-vainsencher

daniel-vainsencher commented Apr 28, 2016

@Kimundi also, thanks for having a poke at it!

@ticki

Contributor

ticki commented Apr 28, 2016

Is there any compelling usecase for reading to be unsafe?

@daniel-vainsencher

daniel-vainsencher commented Apr 28, 2016

@Kimundi: actually my fix does not quite address your concern: creating the new value is unsafe, so the common case of struc.a = UnsafeMember::<usize>::new(4); would fail to compile. But if we happen to have a value of that type lying around, we certainly could swap it in and break an invariant in safe code as you say :(

This means that the proposed library only adds a speed bump, not a guarantee. So it is not a reasonable replacement for the language feature, and I am not sure it is even worth using in a project: it will probably prevent some bugs, but also make maintainers over-confident. Blech. Great catch!

In short, yes, unsafe fields seem to require a language feature.

@golddranks

golddranks commented Apr 28, 2016

@ticki Unsynchronized reading during a write from another thread can be unsafe, because it may produce an invalid value. If a public field is declared as safely readable and there's some kind of unsafe (non-enforced) synchronization strategy, unsafe code can't guard against reading from safe code, and UB may result.

@ticki

Contributor

ticki commented Apr 28, 2016

No, that's not possible.

@golddranks

golddranks commented Apr 28, 2016

Why not? Am I missing something?

@daniel-vainsencher

daniel-vainsencher commented Apr 28, 2016

I am guessing that @ticki is relying on "mutable xor shared" and @golddranks is talking about a module that breaks it internally e.g., to provide an efficient concurrent data structure.

I think that my assumption that only writing is unsafe needs discarding. Unsafe fields should be unsafe to access, period. And I think that fields not subject to "mutable xor shared" should never be public, even if any unsafe fields are allowed to be (which I don't know about).

@ticki

Contributor

ticki commented Apr 28, 2016

If unsafe fields are unsafe to read, they don't have the same power. Everyone would prefer to write a getter function instead of using the field directly, sacrificing performance.

@daniel-vainsencher

daniel-vainsencher commented Apr 28, 2016

How important is the performance benefit of shaving off an accessor in debug builds?

I think the main argument for private unsafe fields is preserving memory safety at a small maintenance expense. In this case, accessing those fields through a small set of functions that ensure written and read values are indeed valid is probably a best practice anyway.

@ticki

Contributor

ticki commented Apr 28, 2016

How important is the performance benefit of shaving off an accessor in debug builds?

Well, considering how many getters there are, I think that some gain will be there.

@ticki

Contributor

ticki commented Apr 28, 2016

But the argument for local guarantees for invariants is much more compelling.

@Kimundi

Member

Kimundi commented Apr 30, 2016

My thoughts on this:

  • unsafe fields are not needed, since you can emulate them with abstraction around the struct.

  • They are a useful convenience, since they would allow expressing with one keyword what you'd otherwise need up to 4 unsafe accessor functions for:

    unsafe fn get_ref(&self) -> &T;
    unsafe fn get_mut(&mut self) -> &mut T;
    unsafe fn get_move(self) -> T;
    unsafe fn set(&mut self, t: T);
  • The feature would be useful both outside the abstraction, for reducing the need for said accessors, and inside, for giving a stricter self-check of the implementation.

  • They should have the semantic "any access is unsafe" for the same reason static mut has it - you can't know in what state the field is.

    • This semantic would have a hurdle with types that have destructors though: how do unsafe fields get dropped, seeing how drop is a safe operation?
    • Possible solution: Unsafe fields may only contain Copy data. In combination with a copyable ManualDrop type, this would still allow all types.
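The four-accessor emulation can be bundled into a small runnable sketch (`UnsafeField` is a hypothetical name, not an existing API): every way in or out of the value is an `unsafe fn`, so each access site needs an unsafe block.

```rust
pub struct UnsafeField<T>(T);

impl<T> UnsafeField<T> {
    /// Caller must establish the field's invariant.
    pub unsafe fn new(t: T) -> Self { UnsafeField(t) }
    pub unsafe fn get_ref(&self) -> &T { &self.0 }
    pub unsafe fn get_mut(&mut self) -> &mut T { &mut self.0 }
    pub unsafe fn get_move(self) -> T { self.0 }
    pub unsafe fn set(&mut self, t: T) { self.0 = t; }
}

fn main() {
    // Every access asserts, at the call site, that the invariant
    // (whatever it is) is being upheld.
    let mut len = unsafe { UnsafeField::new(0usize) };
    unsafe { len.set(3) };
    unsafe { *len.get_mut() += 1 };
    assert_eq!(unsafe { *len.get_ref() }, 4);
    assert_eq!(unsafe { len.get_move() }, 4);
}
```

Note this still has the swap/move hole discussed above: safe code can replace one `UnsafeField` value with another wholesale, which is exactly why a language feature is being argued for.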
@daniel-vainsencher

daniel-vainsencher commented May 3, 2016

@Kimundi About safety of dropping: To my understanding your concern is that any implementation of the Drop trait that accesses the unsafe fields has to include an unsafe section? Why is that a problem exactly? It seems to exactly encode the fact that these fields carry invariants, and any finalization done on them requires care.

Or are you concerned about memory safety when Drop is not implemented at all? IIUC, Rust does not guarantee that drops will occur, so we should not depend on them for memory safety purposes anyway. Right?

@ticki

Contributor

ticki commented May 3, 2016

But it is strongly wanted (and often assumed, though not for memory safety) for destructors to be called. Not doing so would be quite counterintuitive. If destructors are not to be called on unsafe fields, I would require a !Drop bound.

@daniel-vainsencher

daniel-vainsencher commented May 3, 2016

@ticki, @Kimundi, you seem to be referring to dropping the value in the field that is marked unsafe, right? I thought Kimundi was referring to a drop of the struct containing the field.

But unsafe code already should ensure (beforehand) that it is safe to drop any sensitive fields before they get dropped, right?

So the unsafe fields proposal merely helps formalize this obligation: any field that might be unsafe to drop at any time should be marked an unsafe field, and brought to a safe state before it gets dropped.

Hence, I think the compiler should drop values in unsafe fields exactly when it would drop regular ones. Am I missing something? This is also an important property if we want porting modules that use unsafe over to unsafe fields to be an easy documentation enhancement with no associated danger.

@Kimundi

Member

Kimundi commented May 4, 2016

@daniel-vainsencher: I'm referring to this case:

struct Foo {
    unsafe bar: SomeType
}

where SomeType has a destructor.

If Foo were a regular struct, then dropping an instance of Foo would call the destructor of SomeType with a &mut to the bar field.

If, however, bar is marked as an unsafe field, and thus unsafe to access, then it would be unsafe for a drop of Foo to also drop the bar field.

So either unsafe fields need to be defined as needing manual dropping, requiring Drop to be implemented explicitly; or they need to be defined as being Copy to ensure they don't need to be dropped at all.

A third option would be to silently not drop them, but that seems like a footgun.
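The "manual dropping" option can be approximated in today's Rust with std::mem::ManuallyDrop (Foo and bar below are hypothetical names mirroring the example above): dropping the field becomes an explicit unsafe step inside a hand-written Drop impl.

```rust
use std::mem::ManuallyDrop;

struct Foo {
    bar: ManuallyDrop<String>, // stand-in for `unsafe bar: String`
}

impl Drop for Foo {
    fn drop(&mut self) {
        // Dropping the field is an explicit unsafe operation,
        // mirroring "any access is unsafe"; forgetting this call
        // would leak the String rather than cause UB.
        unsafe { ManuallyDrop::drop(&mut self.bar) }
    }
}

fn main() {
    let f = Foo { bar: ManuallyDrop::new(String::from("payload")) };
    assert_eq!(&**f.bar, "payload");
    // `f` goes out of scope here; Foo::drop releases `bar` exactly once.
}
```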

@daniel-vainsencher

daniel-vainsencher commented May 4, 2016

@Kimundi Thank you for taking the time to write your view very explicitly. The term unsafe is being overloaded (maybe more specialized terms exist and I am not aware of them):

  1. Some actions are dangerous bugs, causing UB etc. We call those unsafe or memory unsafe.
  2. unsafe sections allow us to do some things in category (1), but have another desired use: The rust community seems to want to be able to trace all cases of (1) to cases of (2).
  3. For this purpose, we use for example the annotation of functions as unsafe to propagate the danger signal; marking a function unsafe does not add potential for bugs!

You seem to treat accessing unsafe fields as unsafe in sense (1): it can actually cause UB, like static mut variables. This could happen, for example, if we allowed the compiler more liberty in optimizing around such variables. In this case, allowing the regular drop behavior (so that when we drop Foo, we call drop on SomeType as well) can be a bug, and therefore the compiler should not do it. Do I understand correctly?

As far as I am concerned, unsafe fields are exactly like unsafe functions, a tool for (3). Marking a field unsafe is sufficient warning to any maintainer of the code that when the compiler does its usual thing around drops, heavy objects had better be put away. This is the same responsibility that writers of unsafe code face now, just made a bit more precise: with correctly annotated unsafe fields, the drop of other fields should never cause unsafety (in sense 1). I imagine that when it is non-trivial, this responsibility will often be discharged using an unsafe section inside the destructor of Foo. Does this make sense?

Therefore, in my opinion, unsafe fields should be dropped exactly when regular fields are; they should not require manual dropping, an implementation of Drop, or a Copy bound. All of those are tools an implementor might use to avoid unsafety in sense (1), but that is the implementor's choice.
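Today's Rust can already illustrate the responsibility described here, with the invariant-carrying field merely private rather than marked unsafe. In this sketch (the type and all names are hypothetical), the destructor trusts the invariant on `len`, and the unsafety is discharged inside an unsafe block in `Drop`, exactly as suggested above:

```rust
use std::mem::MaybeUninit;

// Hypothetical example type. Invariant carried by `len` (the field that
// would be marked `unsafe` under this proposal): the first `len` slots
// of `data` are initialized.
struct TinyBuf {
    data: [MaybeUninit<String>; 4],
    len: usize,
}

impl TinyBuf {
    fn new() -> Self {
        TinyBuf {
            data: std::array::from_fn(|_| MaybeUninit::uninit()),
            len: 0,
        }
    }

    fn push(&mut self, s: String) {
        assert!(self.len < 4, "TinyBuf is full");
        self.data[self.len].write(s);
        self.len += 1; // upholds the invariant: the slot was just initialized
    }

    fn get(&self, i: usize) -> Option<&str> {
        if i < self.len {
            // SAFETY: slots below `len` are initialized (the invariant).
            Some(unsafe { self.data[i].assume_init_ref() }.as_str())
        } else {
            None
        }
    }
}

impl Drop for TinyBuf {
    fn drop(&mut self) {
        for slot in &mut self.data[..self.len] {
            // SAFETY: relies on the `len` invariant; a bad write to `len`
            // anywhere in this module would make this unsound.
            unsafe { slot.assume_init_drop() };
        }
    }
}

fn main() {
    let mut buf = TinyBuf::new();
    buf.push("hello".to_string());
    buf.push("world".to_string());
    println!("{} {}", buf.get(0).unwrap(), buf.get(1).unwrap());
    // `buf` is dropped here; the two strings are freed, slots 2-3 untouched.
}
```

A maintainer who writes `self.len += 1` without initializing the slot introduces the bug; marking `len` unsafe would force that write into an unsafe block, flagging it for review.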

Member

Kimundi commented May 7, 2016

@daniel-vainsencher

I think you convinced me :) To put what you said in different words, would you agree that unsafe fields could be defined as this:

  • Marking a field of a struct as unsafe requires an unsafe block for any direct access of the field.
    • This also includes construction of the struct.
  • The implementation of the struct still needs to ensure that the field is always in a state where destroying the struct does not cause memory unsafety.
    • Specifically, if the field contains a type with a destructor, it needs to be in a safe-to-call-drop state after the drop of the containing struct.
    • This would be an invariant that needs to be kept in mind whenever the field value is initialized or modified.
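Under that definition, and using hypothetical syntax (unsafe fields are not part of the language, and no such proposal has been accepted), a Vec-like type might look like this:

```rust
// Hypothetical syntax -- does not compile today. `unsafe` on a field
// would require an unsafe block for every direct read or write of that
// field, and for construction of the struct.
pub struct RawVec<T> {
    unsafe ptr: *mut T,   // invariant: allocation valid for `cap` elements
    unsafe cap: usize,    // invariant: len <= cap
    unsafe len: usize,    // invariant: first `len` elements initialized
}

impl<T> RawVec<T> {
    pub unsafe fn set_len(&mut self, new_len: usize) {
        // The unsafe block marks the point where the author takes
        // responsibility for the invariant (here, delegated to the caller
        // via the `unsafe fn`).
        unsafe { self.len = new_len; }
    }
}
```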
daniel-vainsencher commented May 7, 2016

That is a concise and precise definition that I agree with.

Contributor

petrochenkov commented Sep 13, 2016

I think unions will eventually need to support a very similar capability, but with the opposite sign: making fields safe.
Unsafe blocks are a huge burden, especially for something that is statically known (to the user, but not to the language) to be safe. A good example is unions like LARGE_INTEGER (again).
cc rust-lang/rust#32836
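The burden being described can be seen in a LARGE_INTEGER-like union (field names here are illustrative, loosely modeled on the Windows type): every read of a union field requires an unsafe block, even when the layout guarantees the read can never be UB.

```rust
// All fields are plain-old-data, so every bit pattern is valid for every
// field -- yet the language still demands `unsafe` for each field read.
#[repr(C)]
#[derive(Clone, Copy)]
struct Parts {
    low: u32,
    high: i32,
}

#[repr(C)]
union LargeInteger {
    parts: Parts,
    quad: i64,
}

fn main() {
    let x = LargeInteger { quad: 0x1_0000_0002 };
    // SAFETY: any i64 bit pattern is a valid (u32, i32) pair, so this read
    // is statically known (to us) to be safe.
    let (low, high) = unsafe { (x.parts.low, x.parts.high) };
    println!("low={low} high={high}");
}
```

A "safe field" capability would let the union author declare such fields readable without `unsafe`, inverting the marker in the same way this issue proposes adding it.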

Member

RalfJung commented Apr 29, 2017

I feel like rust-lang/rust#41622 is another point where unsafe fields could have helped: if we make it so that types with unsafe fields do NOT get any automatic instances for unsafe OIBITs, that would have prevented accidentally implementing Sync for MutexGuard&lt;Cell&lt;i32&gt;&gt;.

Really, there is no way to predict how the invariant of a type with unsafe fields could interact with the assumptions provided by unsafe traits, so implementing them automatically is always dangerous. The author of a type with unsafe fields has to convince himself that the guarantees of an unsafe trait are satisfied. Not getting these instances automatically is an ergonomics hit, but the alternative is to err on the side of "what could possibly go wrong", which is not very Rust-y.
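The manual opt-in being argued for already exists today for types containing raw pointers, which suppress the automatic Sync instance. This sketch (a hypothetical guard type, loosely modeled on MutexGuard) shows what it looks like when the author is forced to write, and therefore justify, the bound:

```rust
use std::marker::PhantomData;

// Hypothetical guard handing out shared access to a protected `T`.
struct Guard<'a, T> {
    data: *const T, // raw pointer: suppresses the automatic Sync instance
    _marker: PhantomData<&'a T>,
}

// The author must opt back in explicitly, and choosing the bound is on
// them: `T: Sync` is the sound bound for shared access. The bug in
// rust-lang/rust#41622 was that the auto-derived instance effectively
// demanded only `T: Send`.
unsafe impl<'a, T: Sync> Sync for Guard<'a, T> {}

fn assert_sync<S: Sync>() {}

fn main() {
    assert_sync::<Guard<'static, i32>>(); // fine: i32 is Sync
    // assert_sync::<Guard<'static, std::cell::Cell<i32>>>();
    // ^ correctly rejected at compile time: Cell<i32> is not Sync
    println!("ok");
}
```

Suppressing auto instances for types with unsafe fields would extend this forced-justification pattern from raw pointers to every invariant-carrying field.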

Contributor

nikomatsakis commented May 4, 2017

@RalfJung good point. I am pretty strongly in favor of unsafe fields at this point. The only thing that holds me back is some desire to think a bit more about the "unsafe" model more generally.

@withoutboats withoutboats added the T-lang label May 14, 2017

Kixunil commented Jun 10, 2017

Bit OT: I wonder how hard it would be to implement a way to express invariants maintained by such a struct, and to insert lints (and possibly runtime checks in debug mode) about them. It probably wouldn't help with broken Send and Sync, but it might be interesting, e.g., in the case of Vec (len <= capacity).
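A lightweight form of the debug-mode checking suggested here is already expressible: encode the invariant as a method and assert it with debug_assert!, which compiles to nothing in release builds. The type and names below are illustrative.

```rust
// Illustrative Vec-like type with the `len <= capacity` invariant.
struct RawBuf {
    len: usize,
    capacity: usize,
}

impl RawBuf {
    fn invariant_holds(&self) -> bool {
        self.len <= self.capacity
    }

    fn set_len(&mut self, new_len: usize) {
        self.len = new_len;
        // Checked only in debug builds; release builds pay nothing.
        debug_assert!(self.invariant_holds(), "len must not exceed capacity");
    }
}

fn main() {
    let mut b = RawBuf { len: 0, capacity: 8 };
    b.set_len(5);
    println!("len={} cap={}", b.len, b.capacity);
}
```

What this cannot do, and what language support could add, is force every mutation site (including field assignments in other modules) to re-establish the invariant.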
