Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have tidy ensure that we document all `unsafe` blocks in libcore #63793

Open
wants to merge 2 commits into
base: master
from

Conversation

@oli-obk
Copy link
Contributor

commented Aug 21, 2019

cc @rust-lang/libs

I documented a few and added ignore flags on the other files. We can incrementally document the files, but won't regress any files this way.

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

commented Aug 21, 2019

r? @KodrAus

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

commented Aug 21, 2019

The job mingw-check of your PR failed (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
2019-08-21T18:12:19.8607065Z ##[command]git remote add origin https://github.com/rust-lang/rust
2019-08-21T18:12:19.8805627Z ##[command]git config gc.auto 0
2019-08-21T18:12:19.8889655Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2019-08-21T18:12:19.8936506Z ##[command]git config --get-all http.proxy
2019-08-21T18:12:19.9079955Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/63793/merge:refs/remotes/pull/63793/merge
---
2019-08-21T18:12:56.4261839Z do so (now or later) by using -b with the checkout command again. Example:
2019-08-21T18:12:56.4264361Z 
2019-08-21T18:12:56.4264801Z   git checkout -b <new-branch-name>
2019-08-21T18:12:56.4264836Z 
2019-08-21T18:12:56.4264915Z HEAD is now at 2dca2ba86 Merge 6fb59fa17c87baab781792ee5db4a9db834cb92f into 7b0085a613e69cb69fc9e4eb5d422fa4a39d5de1
2019-08-21T18:12:56.4450033Z ##[section]Starting: Collect CPU-usage statistics in the background
2019-08-21T18:12:56.4453709Z ==============================================================================
2019-08-21T18:12:56.4453796Z Task         : Bash
2019-08-21T18:12:56.4453848Z Description  : Run a Bash script on macOS, Linux, or Windows
---
2019-08-21T18:18:21.6178904Z     Checking core v0.0.0 (/checkout/src/libcore)
2019-08-21T18:18:21.7454577Z error: attributes are not yet allowed on `if` expressions
2019-08-21T18:18:21.7455006Z    --> src/libcore/iter/adapters/mod.rs:527:13
2019-08-21T18:18:21.7455359Z     |
2019-08-21T18:18:21.7455774Z 527 |             #[cfg(boostrap_stdarch_ignore_this)]
2019-08-21T18:18:21.7456216Z 
2019-08-21T18:18:21.7532903Z error: aborting due to previous error
2019-08-21T18:18:21.7532989Z 
2019-08-21T18:18:21.7553992Z error: Could not compile `core`.
2019-08-21T18:18:21.7553992Z error: Could not compile `core`.
2019-08-21T18:18:21.7554389Z warning: build failed, waiting for other jobs to finish...
2019-08-21T18:18:26.8122596Z error: build failed
2019-08-21T18:18:26.8178221Z command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "check" "--target" "x86_64-unknown-linux-gnu" "-Zbinary-dep-depinfo" "-j" "2" "--release" "--color" "always" "--features" "panic-unwind backtrace compiler-builtins-c" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "--message-format" "json-render-diagnostics"
2019-08-21T18:18:26.8185797Z failed to run: /checkout/obj/build/bootstrap/debug/bootstrap check
2019-08-21T18:18:26.8185904Z Build completed unsuccessfully in 0:02:34
2019-08-21T18:18:26.8235919Z == clock drift check ==
2019-08-21T18:18:26.8253643Z   local time: Wed Aug 21 18:18:26 UTC 2019
2019-08-21T18:18:26.8253643Z   local time: Wed Aug 21 18:18:26 UTC 2019
2019-08-21T18:18:26.9816313Z   network time: Wed, 21 Aug 2019 18:18:26 GMT
2019-08-21T18:18:26.9819441Z == end clock drift check ==
2019-08-21T18:18:40.0060021Z ##[error]Bash exited with code '1'.
2019-08-21T18:18:40.0097444Z ##[section]Starting: Checkout
2019-08-21T18:18:40.0099262Z ==============================================================================
2019-08-21T18:18:40.0099348Z Task         : Get sources
2019-08-21T18:18:40.0099401Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

src/libcore/benches/ascii.rs Outdated Show resolved Hide resolved
src/libcore/num/f32.rs Outdated Show resolved Hide resolved
src/libcore/num/f64.rs Outdated Show resolved Hide resolved
@fbstj

This comment has been minimized.

Copy link
Contributor

commented Aug 22, 2019

are the six now-safe intrinsics (size_of_val, min_align_of_val, discriminant_value, type_id, likely, unlikely) blocked on different PRs?

@oli-obk

This comment has been minimized.

Copy link
Contributor Author

commented Aug 22, 2019

are the six now-safe intrinsics (size_of_val, min_align_of_val, discriminant_value, type_id, likely, unlikely) blocked on different PRs?

No, I just did them here because otherwise I'd have had to add // SAFETY: this intrinsic is safe to call, it's just unsafe because noone made it safe

Apply suggestions from code review
Co-Authored-By: Joe ST <joe@fbstj.net>
@rust-highfive

This comment has been minimized.

Copy link
Collaborator

commented Aug 22, 2019

The job mingw-check of your PR failed (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
2019-08-22T14:50:56.1683604Z ##[command]git remote add origin https://github.com/rust-lang/rust
2019-08-22T14:50:56.1868101Z ##[command]git config gc.auto 0
2019-08-22T14:50:56.1938279Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2019-08-22T14:50:56.1999724Z ##[command]git config --get-all http.proxy
2019-08-22T14:50:56.2143890Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/63793/merge:refs/remotes/pull/63793/merge
2019-08-22T14:50:58.3345795Z remote:                                                                                         
---
2019-08-22T14:51:32.1438609Z do so (now or later) by using -b with the checkout command again. Example:
2019-08-22T14:51:32.1438913Z 
2019-08-22T14:51:32.1439431Z   git checkout -b <new-branch-name>
2019-08-22T14:51:32.1439741Z 
2019-08-22T14:51:32.1440477Z HEAD is now at b3f7bc1b4 Merge c2d541f186a249260f09a32a3aef31f190804cd0 into 201e52e5fe73ccf3dd22946b1216ad8d64f8c2ba
2019-08-22T14:51:32.1605694Z ##[section]Starting: Collect CPU-usage statistics in the background
2019-08-22T14:51:32.1609211Z ==============================================================================
2019-08-22T14:51:32.1609273Z Task         : Bash
2019-08-22T14:51:32.1609373Z Description  : Run a Bash script on macOS, Linux, or Windows
---
2019-08-22T14:56:48.4599632Z     Checking core v0.0.0 (/checkout/src/libcore)
2019-08-22T14:56:48.6125697Z error: attributes are not yet allowed on `if` expressions
2019-08-22T14:56:48.6126113Z    --> src/libcore/iter/adapters/mod.rs:527:13
2019-08-22T14:56:48.6126457Z     |
2019-08-22T14:56:48.6126798Z 527 |             #[cfg(boostrap_stdarch_ignore_this)]
2019-08-22T14:56:48.6127216Z 
2019-08-22T14:56:48.6225837Z error: aborting due to previous error
2019-08-22T14:56:48.6225948Z 
2019-08-22T14:56:48.6281427Z error: Could not compile `core`.
2019-08-22T14:56:48.6281427Z error: Could not compile `core`.
2019-08-22T14:56:48.6281804Z warning: build failed, waiting for other jobs to finish...
2019-08-22T14:56:53.4666642Z error: build failed
2019-08-22T14:56:53.4698646Z command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "check" "--target" "x86_64-unknown-linux-gnu" "-Zbinary-dep-depinfo" "-j" "2" "--release" "--color" "always" "--features" "panic-unwind backtrace compiler-builtins-c" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "--message-format" "json-render-diagnostics"
2019-08-22T14:56:53.4710292Z failed to run: /checkout/obj/build/bootstrap/debug/bootstrap check
2019-08-22T14:56:53.4710554Z Build completed unsuccessfully in 0:02:31
2019-08-22T14:56:53.4763277Z == clock drift check ==
2019-08-22T14:56:53.4778337Z   local time: Thu Aug 22 14:56:53 UTC 2019
2019-08-22T14:56:53.4778337Z   local time: Thu Aug 22 14:56:53 UTC 2019
2019-08-22T14:56:53.6399129Z   network time: Thu, 22 Aug 2019 14:56:53 GMT
2019-08-22T14:56:53.6402864Z == end clock drift check ==
2019-08-22T14:57:06.7146623Z ##[error]Bash exited with code '1'.
2019-08-22T14:57:06.7195018Z ##[section]Starting: Checkout
2019-08-22T14:57:06.7196811Z ==============================================================================
2019-08-22T14:57:06.7196870Z Task         : Get sources
2019-08-22T14:57:06.7196922Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@alexcrichton

This comment has been minimized.

Copy link
Member

commented Aug 26, 2019

I'm not personally a huge fan of style lints like this in the sense that it just forces unhelpful boilerplate documentation to be written in many circumstances. Most of the safety comments I'm seeing sort of just trivially reference the line above it so I'm not really sure how a comment actually bolsters understanding of the function because it's all being read anyway. There's also other comments like "SAFETY: transmuting a sequence of u8 to u64 is always fine" which I feel like taken in isolation isn't actually right, in theory that requires some form of alignment checks as well but in reality the function being called here is doing all that internally (it's just the comment that's a bit misleading).

Overall I think I would prefer to not have a lint that guarantees all unsafe blocks are documented, but I'm certainly not the only member of the libs team!

@JohnCSimon

This comment has been minimized.

Copy link

commented Sep 7, 2019

Ping from triage:
@oli-obk @fbstj @KodrAus
Hi! This has sat idle for over two weeks. What else needs to happen for this pr to move onwards?

Thank you!

@Centril

This comment has been minimized.

Copy link
Member

commented Sep 8, 2019

It's no secret I think this PR is sorely needed.

Most of the safety comments I'm seeing sort of just trivially reference the line above it so I'm not really sure how a comment actually bolsters understanding of the function because it's all being read anyway.

I don't think each and every safety comment needs to be super helpful and if there are trivially true cases then those can just say "trivially safe". Often however, it's not as trivially true as one experienced reviewer, e.g., @alexcrichton, who often looks at unsafe blocks might think. I think a better question is: are these actively harmful?

which I feel like taken in isolation isn't actually right, in theory that requires some form of alignment checks as well but in reality the function being called here is doing all that internally (it's just the comment that's a bit misleading).

This suggests that it's in fact not as trivial as one might think and that more elaboration is needed rather than not having a comment at all.

It's also not the trivial cases that we wish to catch with these comments but it's also not possible to distinguish them without far more complicated semantic analysis (proofs in e.g. Coq). I think the cost of a bit of boilerplate in trivial cases is well worth it if we get to have commentary on more complicated cases. One might say that this problem should be solved with good reviews. To me however, it is clear that reviews have not been satisfactory thus far.

Also, from a language team perspective, I find it really important to know what assumptions the standard library makes on our operational semantics.

@alexcrichton alexcrichton added the T-libs label Sep 10, 2019

@alexcrichton

This comment has been minimized.

Copy link
Member

commented Sep 10, 2019

@rfcbot fcp close

I'm going to propse that we close this. Other members of the libs team can of course disagree! I explained some reasoning above.

@Centril I personally think that annoying lints can be actively harmful because they encourage the wrong coding patterns and give a false sense of security or correctness. An unsafe block is no more correct if a comment is indicated as to why it's ok, nor is it even any more correct if the comment is quite long and tries to explain everything. Fundamentally unsafe blocks will be mistakes eventually and while we can try to head it off I see this as generating too much work for not enough gain.

No one writing these unsafe blocks has the full context of the language in their head and runs through a checklist of "does every single guarantee Rust provides stay upheld?" as they either write unsafe or write a comment. You say it's good to know what assumptions the libs team is making, and I can say from my perspective I don't even know what assumptions I'm making when writing unsafe. It's not like we're trying to nefariously write code behind the lang team's back that is a gotcha for whatever unsafe guidelines are proposed. In reality we're just trying to get things done and most of this code is years old and probably didn't have a better alternative at the time, but everyone will basically always welcome centralization of unsafe to have fewer and fewer unsafe blocks.

@rfcbot

This comment has been minimized.

Copy link

commented Sep 10, 2019

Team member @alexcrichton has proposed to close this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@Centril

This comment has been minimized.

Copy link
Member

commented Sep 10, 2019

@alexcrichton

@Centril I personally think that annoying lints can be actively harmful because they encourage the wrong coding patterns [...]

Do you think documenting why an unsafe { ... } block is correct is the wrong coding pattern? I think that's something great and which should be encouraged.

[...] and give a false sense of security or correctness.

This feels to me like "because we cannot have perfect security or correctness we should not try". I'd like to point to @RalfJung's comments 1 and 2 which feel relevant.

An unsafe block is no more correct if a comment is indicated as to why it's ok, nor is it even any more correct if the comment is quite long and tries to explain everything.

Comments are not intended as a proof of correctness. However, they can at least be read by humans during review and after the fact. I believe one can use good judgement in the length one gives to unsafe comments. No one has suggested that one should "explain everything".

Fundamentally unsafe blocks will be mistakes eventually and while we can try to head it off I see this as generating too much work for not enough gain.

I can understand that it might be a non-trivial amount for you in libstd's operating system related code. However, in key parts like libcore and liballoc I think it's well worth the time investment.

No one writing these unsafe blocks has the full context of the language in their head and runs through a checklist of "does every single guarantee Rust provides stay upheld?" as they either write unsafe or write a comment.

No one has suggested that you read through the entire nomicon each time you write unsafe { ... } but there are often some specific things to think through depending on what parts of the language you are using.

You say it's good to know what assumptions the libs team is making, [...]. It's not like we're trying to nefariously write code behind the lang team's back that is a gotcha for whatever unsafe guidelines are proposed.

I did not say "what assumptions the libs team is making". I think most of the code added to the standard library are in fact added by people outside the libs team. I referred to assumptions of the standard library as a whole. This is useful information when doing future language design or just for internal consistency of across the standard library.

[...], and I can say from my perspective I don't even know what assumptions I'm making when writing unsafe. [...]

Perhaps we interpret "assumptions" differently, but I would hope that you at least consider what invariants you must uphold in an unsafe { ... } block. If not, that does not inspire confidence.

In reality we're just trying to get things done and most of this code is years old and probably didn't have a better alternative at the time, but everyone will basically always welcome centralization of unsafe to have fewer and fewer unsafe blocks.

"Getting shit done"-ism is not an excuse nor in opposition to being deliberate and careful with unsafe { ... }. Nor is adding comments in opposition to reducing the amount of unsafe blocks through refactoring (who is against that?). I am not suggesting that we do everything over night, but we can improve and audit the years-old code over a longer period of time.

@sfackler

This comment has been minimized.

Copy link
Member

commented Sep 10, 2019

I think that's something great and which should be encouraged.

This does not encourage documentation of the safety of unsafe blocks, it mandates it.

@Centril

This comment has been minimized.

Copy link
Member

commented Sep 10, 2019

@sfackler

This does not encourage documentation of the safety of unsafe blocks, it mandates it.

Strictly speaking, tidy has a way to opt-out per file.

Overall it is true however that it mandates documentation, but I think that's a good idea as well. Otherwise, people will oftentimes not do it when it is necessary and I think that's a problem.

@alexcrichton

This comment has been minimized.

Copy link
Member

commented Sep 11, 2019

I don't think you'll find anyone on any Rust team who actively says unsafe blocks should not be documented, and it seems somewhat disingenuous to think that's where I'm coming from? I am specifically objecting to what @sfackler mentioned, the mandate and requirement to document all unsafe blocks.

The amount of work to document unsafe blocks in libcore/liballoc is going to be the same whether or not we have this lint, so if that's the goal of the work why not proceed with that without the lint? I've got what I believe are legitimate concerns about the lint actively hindering and causing problems in some areas, but it's not like I'm trying to say "remove all the documentation from unsafe blocks".

@Centril

This comment has been minimized.

Copy link
Member

commented Sep 11, 2019

@alexcrichton

I don't think you'll find anyone on any Rust team who actively says unsafe blocks should not be documented, [...]

I hope you are right but I think that's a low bar to set.

[...], and it seems somewhat disingenuous to think that's where I'm coming from?

Elaboration re. what you mean by "wrong coding patterns" would be helpful then as your argument largely felt like "this is security theater".

The amount of work to document unsafe blocks in libcore/liballoc is going to be the same whether or not we have this lint, so if that's the goal of the work why not proceed with that without the lint?

Because once we have done so, without lints, the documentation coverage would regress over time. If tidy yells at you, even if we have a local override like // ignore-tidy-safety on an unsafe { ... } block, at least the developer will be nudged in the direction of providing the comment. In my view, as unsafe { ... } is a dangerous thing to do (I mean... it says so on the tin...), it is well justified to lint in this case.

I've got what I believe are legitimate concerns about the lint actively hindering and causing problems in some areas, [...]

I think those concerns have not been fleshed out. I replied re. the specifics of alignment and whatnot and would appreciate fewer generalities in return. Let's be specific and concrete about what those areas are where problems and hindrances occur (like... are there specific file paths you'd not like to lint, like in the OS / FFI code?). Perhaps we can find a middle ground.

@sfackler

This comment has been minimized.

Copy link
Member

commented Sep 11, 2019

After looking through the diff here, 90+% of them look like "SAFETY: we performed the check that's one line above this unsafe block", or "SAFETY: it's okay to do this". These read to me like line noise - they add no real additional information to the context. It reminds me of Javadoc requirements in some of my college homework assignments where you end up with hundreds of lines of @param people The people.

Every PR is code reviewed by another person. It's their job to make sure that the code is sufficiently comprehensible. Sometimes there are tricky invariants or behavior that require a couple of lines of explanation in comments. Sometimes those happen to be oriented around unsafe blocks, but they often aren't.

If someone comes along and is confused by some piece of code, that may result in someone making a PR adding some extra commentary on it, but again, that doesn't have anything to do with what kind of expression that code is.

@alexcrichton

This comment has been minimized.

Copy link
Member

commented Sep 12, 2019

I hadn't articulated that before, but yeah this sort of feels like "safety theater" where we're trying to give ourselves a sense of security or a sense of "oh here's everything we have to account for", when in fact a strict lint such as this is largely only going to manufacture comments like @sfackler was mentioning. I, too, have seen endless instances of @param people The people and would prefer to keep that out of the standard library. I concretely mentioned this above.

I don't feel this is the time to get sort of more specific than that, I'm not trying to critize @oli-obk's work here. I'm criticizing the idea that we should have this lint. I think the appropriate way to go about this (if we were to go about this) is to start down the road of documenting unsafe blocks and reviewing those sorts of PRs. If 90% of unsafe blocks are trivial and don't need comments, this lint is probably over the top. If 90% of the comments are on the module instead of unsafe blocks, then literally this PR probably isn't what we want but maybe something similar to it. If only 5% of unsafe blocks end up not needing meaningful documentation, then this sounds like a good lint to have.

I also very much agree with @sfackler that the unsafe block itself is not always where documentation should live. For example std::sync::mpsc is riddled with unsafe but has a large explainer so readers can try to get a rough idea what's going on. I don't think it's really that helpful to have // SAFETY: see the module docs all over the place, nor is it really sufficient because you need to undertand all the safe code as well. This applies to abstractions like Vec as well, we ideally need // SAFETY whenever we modify self.len, but in general you just have to understand the whole module and its algorithms.

I hope you are right but I think that's a low bar to set.

FWIW this still feels very disingenuous, "I hope you are right" again seems like it's attempting to cast me, personally, in this ominous light of "oh look at this guy who refuses to document things and doesn't want anyone else to document anything". I get the impression you really think we should actively not document unsafe code, which gives me the impression that you're not really fully reading what I'm saying and trying to understand it.

@oli-obk

This comment has been minimized.

Copy link
Contributor Author

commented Sep 12, 2019

I wrote this lint with the impression that most unsafe blocks can have meaningful docs. Then I used it and as obvious, that is not the case. It can be partially resolved by expecting either the function docs or the module docs (of any of the surrounding modules) contain an explanatory safety comment. This would solve the "read the docs" comments that I added. Since this scheme allows incrementally expanding its scope, I was hoping to iterate on it to end up at a good scheme.

Since very few unsafe blocks are documented so far, I'd be fine with first starting to do this without a lint and then revisiting once we're nearing full documentation.

@dtolnay
Copy link
Member

left a comment

Thanks for working on this @oli-obk.

I read all the "SAFETY:" comments added in this PR and there aren't any that I would mind having in the codebase. I would be happy with merging this.


I find this comment from @Centril compelling:

It's also not the trivial cases that we wish to catch with these comments. [...] I think the cost of a bit of boilerplate in trivial cases is well worth it if we get to have commentary on more complicated cases. One might say that this problem should be solved with good reviews. To me however, it is clear that reviews have not been satisfactory thus far.


In response to @alexcrichton's and @sfackler's similar concerns,

Most of the safety comments I'm seeing sort of just trivially reference the line above it so I'm not really sure how a comment actually bolsters understanding of the function because it's all being read anyway.

After looking through the diff here, 90+% of them look like "SAFETY: we performed the check that's one line above this unsafe block", or "SAFETY: it's okay to do this". These read to me like line noise - they add no real additional information to the context.

I understand this reaction but it's meaningful to call out when the check on the previous line is the only consideration in an unsafe block being okay, as opposed to some other ambient invariant that comes in from elsewhere. We could reserve comments for only those unsafe blocks with a requirement more complicated than the previous line, but when touching nearby code it's a big help to be told when the safety reasoning is all just local.

@bors

This comment has been minimized.

Copy link
Contributor

commented Sep 17, 2019

☔️ The latest upstream changes (presumably #64535) made this pull request unmergeable. Please resolve the merge conflicts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.