Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uplift lints from clippy to rustc #53224

Open
oli-obk opened this Issue Aug 9, 2018 · 53 comments

Comments

Projects
None yet
@oli-obk
Copy link
Contributor

oli-obk commented Aug 9, 2018

cc @Manishearth @rust-lang/lang

As discussed in the Clippy 1.0 RFC, we would like to uplift some lints from clippy to rustc. Full rationale can be found in https://github.com/rust-lang/rfcs/blob/b9c8471887f308223c226642cad3a8290731b942/text/0000-clippy-uno.md#compiler-uplift

The list of correctness lints in clippy follows. I have checked the boxes of lints that I think should be uplifted

Renamings happening during uplift have been added via -> new_name annotations

@oli-obk

This comment has been minimized.

Copy link
Contributor Author

oli-obk commented Aug 9, 2018

@rfcbot fcp merge

please raise concerns about specific lints directly with rfcbot

@rfcbot

This comment has been minimized.

Copy link

rfcbot commented Aug 9, 2018

Team member @oli-obk has proposed to merge this. The next step is review by the rest of the tagged teams:

Concerns:

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@Mark-Simulacrum

This comment has been minimized.

Copy link
Member

Mark-Simulacrum commented Aug 9, 2018

https://rust-lang-nursery.github.io/rust-clippy/master/index.html#unused_io_amount seems like it should be replaced with a #[must_use] on those, but I'm not sure if we can do that for trait methods yet.

@nrc

This comment has been minimized.

Copy link
Member

nrc commented Aug 9, 2018

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Aug 10, 2018

@Mark-Simulacrum You should be able to do it without any special support on the method declaration in the trait, because that's what paths and method calls resolve to, but if you want to stick it on method definitions in impls, that's significantly harder.

@oli-obk

This comment has been minimized.

Copy link
Contributor Author

oli-obk commented Aug 10, 2018

The problem is not that the return value is unused, it is usually used via ?, because Result is already must_use. The problem is that the Ok value is unused, and I don't think that's solvable with must_use

@nikomatsakis

This comment was marked as resolved.

Copy link
Contributor

nikomatsakis commented Aug 10, 2018

for_loop_over_option: Checks for for loops over Option values.

I feel like classifying this as correctness is a bit of a stretch? It seems like type check ought to resolve most of the bugs that would occur here.

We may still want to uplift it, though, I'm not sure.

UPDATE: ok, my mistake, I didn't realize the meaning of the checked boxes. =)

@nagisa

This comment has been minimized.

Copy link
Contributor

nagisa commented Aug 10, 2018

Checks for transmutes that can't ever be correct on any architecture.

Related: #50842


Checks for creation of references to zeroed or uninitialized memory.

IMO description is wrong, as the example shows creation of a zero-reference, not a reference to zeroed memory.


@rfcbot concern tying compiler to stdlib implementation details

Many of these lints are libstd specific (mostly pertaining to iterators). What mnemonic does clippy use to figure out what libstd items are what and whether the actual libstd is used at all and not e.g. use my_funky_crate as std;? How would the compiler deal with this without making half of the libstd a language item?


@rfcbot concern the descriptions of lints are a little too concise for decision making.

For example, it is unclear from description of eq_op whether it does act on generic code or non-builtin-types, where all of the operators can be overridden to have arbitrary side effects.

@nagisa

This comment has been minimized.

Copy link
Contributor

nagisa commented Aug 10, 2018

A good way to test your questions is to use https://play.rust-lang.org/. On the right hand side, there’s a “Tools” dropdown which has Clippy in it.


@rfcbot concern some lints are opinionated on how the target platform ought to work

For example, the mut_from_ref lint prevents one from writing code like this, which is fairly common in my, admittedly C-centric, embedded programming experience:

// Gets a mutable reference to resource’s register bank
unsafe fn get_resource(x: &ResourceLock) -> &mut Resource {
    // THE_RESOURCE is linked by the linker into well known hardware operated range of memory.
    #[link_section="something_wellknown"]
    static mut THE_RESOURCE: Resource;
    take_resource_lock(x); // later released by the caller once they’re finished with the Resource
    &mut THE_RESOURCE
}

If it wasn’t for @japaric and WG-embedded, this is how I’d write my embedded code if I wasn’t really interested in exploring the design avenues that exact moment.


One interesting idea that came to mind was an ability for rustc to use different default lint levels depending on the target platform. The code above, is infinitely unlikely to occur on hosted targets, so the lint can be enabled there, but not on bare-metal targets where the code might be entirely valid and is quite likely to show up.

@Manishearth

This comment has been minimized.

Copy link
Member

Manishearth commented Aug 10, 2018

Many of these lints are libstd specific (mostly pertaining to iterators). What mnemonic does clippy use to figure out what libstd items are what and whether the actual libstd is used at all and not e.g. use my_funky_crate as std;?

Clippy doesn't try to solve this, anything found behind a std path is fair game. It tries to use core though as much as possible though (rustc does this too -- ranges and iteration goes through ::core)

The robust solution here is to add a #[diagnostic_item = "foo"] like #[lang_item] which you can check against.

The code above, is infinitely unlikely to occur on hosted targets, so the lint can be enabled there, but not on bare-metal targets where the code might be entirely valid and is quite likely to show up.

I feel like that's enough of a special case that you'll want to turn it off. Idk.

@petrochenkov

This comment has been minimized.

Copy link
Contributor

petrochenkov commented Aug 12, 2018

@rfcbot concern naming conventions

Lints in rustc follow some naming conventions.
A number of the lints proposed for uplifting doesn't follow the conventions.
Supposedly naming for the new lints needs to be harmonized with existing rustc lints.

@oli-obk

This comment has been minimized.

Copy link
Contributor Author

oli-obk commented Aug 13, 2018

I added renaming suggestions to all lints with a checkbox if the name didn't follow the rules

@oli-obk

This comment has been minimized.

Copy link
Contributor Author

oli-obk commented Aug 14, 2018

concern the descriptions of lints are a little too concise for decision making.

For example, it is unclear from description of eq_op whether it does act on generic code or non-builtin-types, where all of the operators can be overridden to have arbitrary side effects.

Unless explicitly stated, all lints are as generic as possible. So in the eq_op case, we don't care about the type at all. Generic parameters and custom types are fair game. I'm aware that you could be missing a Mul<i32> impl and thus use a + a instead of 2 * a. If this is too opinionated, we can adjust the lint during uplift and leave the rest to clippy or not uplift it at all.

The list is a suggestion by me, not reviewed by anyone else. I am absolutely fine with crossing items off the list if they are too opinionated.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Aug 14, 2018

I feel vaguely uneasy about this because this is a lot of lints, being uplifted all at once, without very clear (to me) explanation of what rules determine if a lint belongs in the compiler or not.

What I'd prefer to do is establish consensus on the guidelines and then entrust other people to make the call on whether individual lints fit those guidelines.

@oli-obk

This comment has been minimized.

Copy link
Contributor Author

oli-obk commented Aug 14, 2018

The general guidelines suggested by the RFC are

Some correctness lints should probably belong in the compiler, whereas style/complexity/etc lints should probably belong in Clippy. Lints may be incubated in Clippy, of course.

I don't think the compler will want all correctness lints here, however if the lint is about a common enough situation where it being not a bug is an exceedingly rare case (i.e. very low false positive frequency) it should probably belong in the compiler.

This of course depends on the rules for correctness lints:

Probable bugs

Yes this is a very short and general definition, but the space for correctness lints is just too big to create a guideline for them (as evidenced by the variance of the lints in the OP).

@Manishearth

This comment has been minimized.

Copy link
Member

Manishearth commented Aug 14, 2018

@varkor

This comment has been minimized.

Copy link
Member

varkor commented Aug 14, 2018

Unless explicitly stated, all lints are as generic as possible. So in the eq_op case, we don't care about the type at all. Generic parameters and custom types are fair game. I'm aware that you could be missing a Mul impl and thus use a + a instead of 2 * a. If this is too opinionated, we can adjust the lint during uplift and leave the rest to clippy or not uplift it at all.

I think eq_op specifically for types such as integers is fine, but if applied in the general case, it makes assumptions about the semantics of the operations that I don't think are reasonable, considering there's no requirement that the operations respect any algebraic laws.

(I'll take a better look at the others soon.)

@goodmanjonathan

This comment has been minimized.

Copy link
Contributor

goodmanjonathan commented Aug 15, 2018

If I may contribute to the bikeshed a bit, these are the renames I would like to see:

  • absurd_extreme_comparisons -> unused_comparisons (existing in rustc)
  • almost_swapped -> incorrect_swaps
  • bad_bit_mask -> unused_bitmasks
  • clone_double_ref -> cloning_references
  • cmp_nan -> unused_comparisons (existing in rustc)
  • deprecated_semver -> incorrect_semver_strings
  • drop_copy -> dropping_copy_types
  • drop_ref -> dropping_references
  • eq_op -> same_operands
  • fn_to_numeric_cast_with_truncation -> truncating_fn_ptr_as_int
  • forget_copy -> forgetting_copy_types
  • forget_ref -> forgetting_references
  • if_same_then_else -> unused_branching
  • ifs_same_cond -> unused_branching
  • infinite_iter -> infinite_iterators
  • inline_fn_without_body -> inline_fns_without_body
  • invalid_ref -> undefined_references
  • iter_next_loop -> for_val_in_iter_next
  • iterator_step_by_zero -> iter_step_by_zero
  • logic_bug -> unused_short_circuits
  • min_max -> incorrect_clamps
  • mut_from_ref -> returning_mut_ref_from_ref
  • nonsensical_open_options -> unused_file_options
  • not_unsafe_ptr_arg_deref -> deref_ptr_arg_in_safe_fns
  • reverse_range_loop -> incorrect_reversed_ranges
  • unit_cmp -> unit_comparisons
  • unused_io_amount -> unused_io_amounts
  • while_immutable_condition -> while_immutable_val
  • wrong_transmute -> undefined_transmutes
  • zero_width_space -> zero_width_spaces

The out_of_bounds_indexing lint should be deprecated now that const_err is warn by default. temporary_cstring_as_ptr seems fine as is.

Many of these names were originally proposed by @clarcharr (rust-lang/rust-clippy#2845, rust-lang/rust-clippy#2846). They came up with some naming rules there too.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Aug 15, 2018

Sorry again for the land grab, but...
...I think that uplifting lints to the compiler, while not being directly about the language specification (and other Rust compilers are not expected to have the same set of lints...), is not like diagnostics because it is highly normative about what Rust code should and should not look like.
...Also, the RFC policy - language design document states that this falls under T-lang's purview. (I know this hasn't been exercised for some time, but now it is, and I like sticking to rules...)
...And as @withoutboats's noted, this is a lot of lints
...So I'm adding T-lang to the party.
...Again, sorry about that... :)

EDIT: Apparently rfcbot wasn't appreciative; I'll have to fix the bot first ^,-

@joshtriplett

This comment has been minimized.

Copy link
Member

joshtriplett commented Sep 1, 2018

@rfcbot concern infinite_iter

I think the infinite_iter lint might be excessively opinionated about desirable style. Looping over an infinite iterator doesn't seem like an inherent problem. On the other hand, an infinite iterator fed to collect definitely is.

@Manishearth

This comment has been minimized.

Copy link
Member

Manishearth commented Sep 1, 2018

I guess you're talking about cases where you're iterating over a receiver? Yeah, that makes sense.

@clarcharr Yeah, this sounds great! I don't think there needs to be a rush, and we can rename lints in clippy later too in a backwards compat way.

@joshtriplett

This comment has been minimized.

Copy link
Member

joshtriplett commented Sep 2, 2018

I think possible_missing_comma should be uplifted as well.

@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Sep 2, 2018

Are some Clippy (or Rustc warnings) good to become errors (instead of warnings) in future Rust versions?

@clarfon

This comment has been minimized.

Copy link
Contributor

clarfon commented Sep 2, 2018

I should add that iter_step_by_zero will panic at runtime now, rather than looping infinitely. May not even be worth having a lint for something so specific.

@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Sep 2, 2018

Avoiding one panic caused by zero step sounds good.

@oli-obk

This comment has been minimized.

Copy link
Contributor Author

oli-obk commented Sep 4, 2018

@nagisa are your concerns resolved?

@varkor

This comment has been minimized.

Copy link
Member

varkor commented Sep 4, 2018

(Reposting a previous comment because rfcbot doesn't take edits into account.)

@rfcbot concern overly-aggressive stylistic lints

Some of the lints discourage patterns that seem reasonable in practice.

  • eq_op: sensible for integer and boolean types, but not generic types.
  • bad_bit_mask: same concern as for eq_op.
  • infinite_iter: while it's probably not good practice, I'm not sure it's something the compiler itself should warn about.
  • mut_from_ref: the description claims it's "trivially unsound", but it seems naïvely like such a signature could be plausible for some sort of allocator.

@rfcbot concern possible false positives

With the current descriptions, some of the lints seem like they could behave incorrectly in some cases. If these are correct, the descriptions should be updated.

  • ifs_same_cond: this ignores function calls in the condition, but there are other ways to cause effects within a condition. Does this lint definitely not have false positives?
  • logic_bug: could the comment about ignoring short-circuiting behaviour be elaborated? What expressions will it warn against that might be effectful?
  • absurd_extreme_comparisons: usize should probably be excluded?

Some other comments, but not blocking concerns.
not_unsafe_ptr_arg_deref: seems reasonable, though it could be helpful to address the limitations in "Known problems".

  • wrong_transmute: the description doesn't go into which transmutes are actually forbidden.
  • zero_width_space: where exactly is this caught? Is it caught inside strings for instance? Are there other characters it'd similarly make sense to forbid (such as other zero-width whitespace characters)?

All the other suggested lints seem sensible to me.

@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Sep 10, 2018

Some lints could become errors in Rustc, like: wrong_transmute,
fn_to_numeric_cast_with_truncation, out_of_bounds_indexing, inline_fn_without_body.

@nagisa

This comment has been minimized.

Copy link
Contributor

nagisa commented Sep 24, 2018

are your concerns resolved?

Assume that there are no additional checks. The lint addresses all code that matches the description.

I still think that this is not nearly enough. What I was really looking for is a well thought out list of corner cases (most of the lints’ descriptions neglect to mention even the obvious ones) & some sort of list if pros and cons we could amend as a part of decision making process. Most of the lints only specify the obvious case where they fire, but then there must exist samples of code which are not as clear-cut. Where exactly is the line drawn? I don’t want (and hardly have the attention span) to think about possible corner cases for 50 different lints at once.

Since making a decision for this many lints at once is infeasible. Perhaps an issue for each lint or a small number of lints (grouped according to some criteria) would be better, so that discussion does not mix this much?

I would be willing to mark my checkbox here provided that the purpose of this tracking issue is to select lints we want to consider for inclusion to rustc, each on their own merit.

platform opinions, we can silence some lints on non-host platforms, but your example makes me want to lint this even more. While it is sound, it is most certainly unidiomatic and seems fragile/easy to get wrong. Especially with the ownership based embedded work that is going on.

There are multiple different approaches we could take, sure. Just like I elaborated above, I think this tracking issue is not a great location to come to a consensus on that.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Sep 29, 2018

Did I miss a concise list somewhere of "here's why these but not the other are getting uplifted"?

Part of me feels that, even where I agree with the lints, that a bunch of these ought to be generalized if they'd go into rustc itself. For example, .step_by(0) feels like it should be some sort of "what inputs are sane" attribute on parameters that could be checked more generally. Or things like .unwrap_of(foo()) (rust-lang/rfcs#2536) is roughly the same problem as .resize(n, HashMap::new()) (rust-lang/rfcs#2551), so there could be a "did you want the lazy version of this?" attribute.

@JeanMertz

This comment has been minimized.

Copy link

JeanMertz commented Oct 2, 2018

If there is an agreement on naming conventions, and the rules that are used to determine if a lint should be uplifted, wouldn't it make more sense to split this issue into one single issue per proposed lint uplift?

That way, the discussion can stay focussed on that one lint, potential improvements can be discussed (and linked to PRs implementing that improvement), and when a specific lint is ready to be uplifted, it can be, without blocking on any discussion on other lints that are more debatable.

It would also make it easier to link back to decisions in the past for specific lints when talking about them in the future.

I'm a bystander here, but interested in this process, and having a hard time with all the different discussions crossing. I'm also concerned that it might leave some concerns unnoticed, because of the noise (although that's less of a concern with the rfcbot keeping track).

The obvious downside is having to comment on multiple issues when discussing multiple lints. So this would only make sense if there is an agreement on the general way to move forward with all uplift proposals (so naming convention, level of documentation, and a general set of rules to determine if a lint can be uplifted or not). If there is no such agreement yet, then I would propose to postpone this issue, and figure that out first, in a separate issue, without the noise of talking about specific lint implementations.

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Oct 2, 2018

I very much agree with @JeanMertz and @nagisa here; going through such a large list of lints at once is a daunting task and delays reaching consensus by a lot. I'd much rather we split things up and consider each lint one by one.

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Oct 2, 2018

@rfcbot concern guidelines

I agree that trying to filter through these lints in a single thread is unworkable, but I am also not thrilled by the idea of each lint being considered individually without establishing a clear consensus regarding guidelines; that would result in inconsistent decision making.

When I brought this up in August, @oli-obk pointed out this guidance from the clippy 1.0 RFC:

I don't think the compler will want all correctness lints here, however if the lint is about a common enough situation where it being not a bug is an exceedingly rare case (i.e. very low false positive frequency) it should probably belong in the compiler.

This is good guidance, but my interpretation of this guidance would put quite a few of these lints out of the question, especially several of those highlighted in @varkor's previous comment. I'm not certain if this was an oversight or if I am applying the "very low false positive" standard more stringently than other users.

My own experience with the compiler is that I essentially never have a false positive warning, and when I do I am doing something really exceptionally strange. For example, I have had to allow dead code for a field that is never accessed because I want its destructor to run when this object is dropped: this is about the only time I've ever had to permanently allow a lint. I would perceive a change to this situation as a real drawback: false positives make me feel (wrongly) less confident about my understanding of the code and whether or not it is correct.

On the other hand, I have some objections outside of just the "low false positive" standard. Some of these lints are also indirect, highlighting that your code is probably wrong but without highlighting the problematic code. Take mut_from_ref for example: based on the signature of this function, it determines that you probably have UB in the function. But it doesn't identify the point in the function where the UB was introduced. Is this actually a net benefit as a part of the ordinary user flow (as opposed to in some static analysis tool users opt into explicitly)? I'm not so certain, I think it would be better if on-by-default lints always pointed you at the specific code you need to change to appease them.

So I think overall the next step should be to determine what the guidelines for correctness lints in the compiler actually are. For me, the requirements I prefer are:

  1. The lint identifies a genuine bug; not bad code but a bug.
  2. The lint has essentially no false positives.
  3. The lint can identify exactly the code it considers to contain the bug.

Once we have consensus on the guidelines, people can go through individual lints and make a decision following those guidelines.

@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Oct 2, 2018

The lint is identifies a genuine bug;

If it's a bug, then the compiler should raise a compilation error. If we talk about compiler warnings they are allowed to be about things that aren't bugs.

@Manishearth

This comment has been minimized.

Copy link
Member

Manishearth commented Oct 5, 2018

Btw, clippy's relicense is going through, so uplifting lints into Rust will be easy now.

rust-lang/rust-clippy#3269

@lu-zero

This comment has been minimized.

Copy link
Contributor

lu-zero commented Oct 23, 2018

Would be possible to add #54905 as well? :)

@Centril

This comment has been minimized.

Copy link
Contributor

Centril commented Nov 20, 2018

@withoutboats

To get this unstuck... Perhaps we can "evolve" the guidelines through the first lints we deal with... i.e. let's establish policy through precedent and do it more "by example".

Some of these lints are also indirect, highlighting that your code is probably wrong but without highlighting the problematic code. Take mut_from_ref for example: based on the signature of this function, it determines that you probably have UB in the function. But it doesn't identify the point in the function where the UB was introduced. Is this actually a net benefit as a part of the ordinary user flow (as opposed to in some static analysis tool users opt into explicitly)? I'm not so certain, I think it would be better if on-by-default lints always pointed you at the specific code you need to change to appease them.

I do think that if the compiler lints and determines that you probably have UB in the function, even if it is indirect, it is better than nothing. While it might not point directly point at the problem and a possible solution, it will hopefully at least make the user raise some questions which then leads to them searching for answers from various resources. The lint can also offer a link which provides more context in the description.

So I think overall the next step should be to determine what the guidelines for correctness lints in the compiler actually are. For me, the requirements I prefer are:

  1. The lint identifies a genuine bug; not bad code but a bug.

  2. The lint has essentially no false positives.

  3. The lint can identify exactly the code it considers to contain the bug.

I agree with 2, but I'm not so sure about 1 (for reasons noted below) and 3 (for reasons aforementioned).

We already have lints that identifies bad code such as non_standard_style, unused_attribute, etc. It seems to me that it is OK to lint against "bad code" but the bar for such lints are higher while the bar for identifying true bugs are lower.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.