Initial implementation result_large_err #9373

Merged: 1 commit merged into rust-lang:master on Aug 30, 2022

lukaslueg
Contributor

@lukaslueg lukaslueg commented Aug 24, 2022

This is a shot at #6560, #4652, and #3884. The lint checks for Result being returned from functions/methods where the Err variant is larger than a configurable threshold (the default of which is 128 bytes). There has been some discussion around this, which I'll try to quickly summarize:

  • A large Err-variant may force an equally large Result if Err is actually bigger than Ok.
  • There is a cost involved in large Result, as LLVM may choose to memcpy them around above a certain size.
  • We usually expect the Err variant to be seldom used, but pay the cost every time.
  • Result returned from library code has a high chance of bubbling up the call stack, getting stuffed into MyLibError { IoError(std::io::Error), ParseError(parselib::Error), ...}, exacerbating the problem.
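To make the first bullets concrete, here is a minimal sketch that measures the effect with std::mem::size_of. The names LargeError and always_ok are made up for illustration and are not part of this PR.

```rust
use std::mem::size_of;

// Hypothetical error type that carries a large inline buffer.
#[allow(dead_code)]
pub struct LargeError {
    context: [u8; 512],
}

// Hypothetical function: the success value is `()`, yet the returned
// `Result` still has to reserve room for a whole `LargeError`.
pub fn always_ok() -> Result<(), LargeError> {
    Ok(())
}

fn main() {
    // A `Result` must be able to hold either variant, so it is at least
    // as large as the larger of the two.
    assert!(size_of::<Result<(), LargeError>>() >= size_of::<LargeError>());
    assert!(size_of::<LargeError>() >= 512);
    assert!(always_ok().is_ok());
    println!(
        "Result<(), LargeError> occupies {} bytes",
        size_of::<Result<(), LargeError>>()
    );
}
```

The exact number printed depends on the platform and compiler, which is also why the lint message avoids hard-coding it.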

This PR deliberately does not take into account comparing the Ok to the Err variant (e.g. a ratio, or one being larger than the other). Rather we choose an absolute threshold for Err's size, above which we warn. The reason for this is that Errs probably get map_err'ed further up the call stack, and we can't draw conclusions from the ratio at the point where the Result is returned. A relative threshold would also be less predictable, while not accounting for the cost of LLVM being forced to generate less efficient code if the Err-variant is large in absolute terms.

We lint private functions as well as public functions, as the perf-cost applies to in-crate code as well.

In order to account for type-parameters, I conjured up fn approx_ty_size. The function relies on LateContext::layout_of to compute the actual size, and in case of failure (e.g. due to generics) tries to come up with an "at least size". In the latter case, the size is obviously wrong, but the inspected size certainly can't be smaller than that. Please give the approach a heavy dose of review, as I'm not actually familiar with the type-system at all (read: I have no idea what I'm doing).

The approach does, however flimsy it is, allow us to successfully lint situations like

pub union UnionError<T: Copy> {
    _maybe: T,
    _or_perhaps_even: (T, [u8; 512]),
}

// We know `UnionError<T>` will be at least 512 bytes, no matter what `T` is
pub fn param_large_union<T: Copy>() -> Result<(), UnionError<T>> {
    Ok(())
}
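The "at least size" idea can be sketched on a toy model. This is not the code in this PR and does not use rustc's types: ToyTy, at_least_size and union_error_of_t are hypothetical names, and the model ignores padding and alignment (which can only make the real size larger, so the result stays a lower bound).

```rust
/// Toy stand-in for a type; NOT rustc's `Ty`.
pub enum ToyTy {
    /// A type whose layout is fully known, with its size in bytes.
    Known(u64),
    /// An unresolved type parameter such as `T`.
    Param,
    /// A tuple (or struct): all fields are present at once.
    Tuple(Vec<ToyTy>),
    /// An array: element type and length.
    Array(Box<ToyTy>, u64),
    /// A union: the fields overlap in memory.
    Union(Vec<ToyTy>),
}

/// Lower bound on the size of `ty` in bytes.
pub fn at_least_size(ty: &ToyTy) -> u64 {
    match ty {
        ToyTy::Known(bytes) => *bytes,
        // Nothing is known about a parameter, so it contributes 0.
        ToyTy::Param => 0,
        // Fields are laid out side by side: sum the lower bounds.
        ToyTy::Tuple(fields) => fields.iter().map(at_least_size).sum::<u64>(),
        ToyTy::Array(elem, len) => at_least_size(elem) * *len,
        // Overlapping fields: the largest one dominates.
        ToyTy::Union(fields) => fields.iter().map(at_least_size).max().unwrap_or(0),
    }
}

/// Toy encoding of `UnionError<T>` from the example above.
pub fn union_error_of_t() -> ToyTy {
    ToyTy::Union(vec![
        ToyTy::Param,
        ToyTy::Tuple(vec![
            ToyTy::Param,
            ToyTy::Array(Box::new(ToyTy::Known(1)), 512),
        ]),
    ])
}

fn main() {
    // `T` contributes 0, `[u8; 512]` contributes 512.
    assert_eq!(at_least_size(&union_error_of_t()), 512);
}
```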

I've refactored functions/result_unit_err.rs a bit to re-use some of its code. This is also the groundwork for #6409

The default threshold is 128 because of #4652 (comment)

lintcheck does not trigger this lint for a threshold of 128. It does warn for 64, though.

The suggestion currently is the following, which is just a placeholder for discussion to be had. I did have the computed size in a span_label. However, that might cause both ui-tests here and lints elsewhere to become flaky with respect to their output (as the size is platform dependent).

error: the `Err`-variant returned via this `Result` is very large
  --> $DIR/result_large_err.rs:36:34
   |
LL | pub fn param_large_error<R>() -> Result<(), (u128, R, FullyDefinedLargeError)> {
   |                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The `Err` variant is unusually large, at least 128 bytes

changelog: Add [result_large_err] lint

@rust-highfive

r? @Alexendoo

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Aug 24, 2022
@lukaslueg
Contributor Author

lukaslueg commented Aug 24, 2022

I'll deal with CI (seems trivial) after review. As far as suggestions go, maybe:

error: the Err-variant returned from this function is very large
^^^^^^^^^^^^^ The Err-variant is at least [actual size] bytes (is this a problem for automation?)
help: Try reducing the size of UnionError<T>, e.g. by boxing elements or replacing it with Box<UnionError<T>>
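For reference, a rough sketch of what following the "replace it with Box" hint could look like in user code. LargeError, parse and parse_boxed are hypothetical names, not taken from this PR.

```rust
use std::mem::size_of;

// Hypothetical large error type.
#[allow(dead_code)]
pub struct LargeError {
    context: [u8; 512],
}

// Returns the large error inline; this is the shape the lint warns about.
pub fn parse(input: &str) -> Result<u32, LargeError> {
    input
        .trim()
        .parse::<u32>()
        .map_err(|_| LargeError { context: [0; 512] })
}

// Same logic, but the error lives on the heap. `?` boxes it through the
// standard `From<T> for Box<T>` impl, so callers need no extra `map_err`.
pub fn parse_boxed(input: &str) -> Result<u32, Box<LargeError>> {
    Ok(parse(input)?)
}

fn main() {
    // A `Box` is a single pointer, so the boxed `Result` is much smaller.
    assert_eq!(size_of::<Box<LargeError>>(), size_of::<usize>());
    assert!(size_of::<Result<u32, Box<LargeError>>>() < size_of::<Result<u32, LargeError>>());
    assert_eq!(parse_boxed(" 42 ").ok(), Some(42));
    assert!(parse_boxed("not a number").is_err());
}
```

The trade-off is a heap allocation on the error path, which is usually acceptable if errors really are rare.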

Member

@Alexendoo Alexendoo left a comment

Looks good so far!

Might be worth grepping for layout_of and checking if any other usages of it could benefit from this approx_ty_size fn

6 review threads on clippy_lints/src/functions/result.rs (outdated, resolved)
@lukaslueg
Contributor Author

lukaslueg commented Aug 25, 2022

All addressed. Will squash if so requested.

I've moved approx_ty_size to clippy_utils::ty, where it might be useful for other things like large_enum_variant. I may grep and address that in a later PR. Added the array-case including substs and added tests for that. Added lint-description. Modified the suggestion as indicated above, happy for comments.

@lukaslueg lukaslueg changed the title from "[WIP] Initial implementation result_large_err" to "Initial implementation result_large_err" Aug 25, 2022
@lukaslueg lukaslueg changed the title Initial implementation result_large_err Initial implementation result_large_err Aug 25, 2022
Member

@Alexendoo Alexendoo left a comment

Sorry for the delay! Just a couple nits then if you squash the commits this should be ready to merge

1 review thread on clippy_lints/src/functions/mod.rs (outdated, resolved)
@lukaslueg
Contributor Author

Addressed the nit, squashed. One may decide to close #6560, #4652, and/or #3884 once merged.

@lukaslueg
Contributor Author

yells at CI for catching things cargo test doesn't

@Alexendoo
Member

Great, thanks for the lint!

I stuck a changelog: .. entry in the PR message

@bors r+

@bors
Collaborator

bors commented Aug 30, 2022

📌 Commit 66a9705 has been approved by Alexendoo

It is now in the queue for this repository.

@bors
Collaborator

bors commented Aug 30, 2022

⌛ Testing commit 66a9705 with merge 09e4659...

@lukaslueg
Contributor Author

@Alexendoo Thanks for the review

@bors
Collaborator

bors commented Aug 30, 2022

☀️ Test successful - checks-action_dev_test, checks-action_remark_test, checks-action_test
Approved by: Alexendoo
Pushing 09e4659 to master...

@bors bors merged commit 09e4659 into rust-lang:master Aug 30, 2022
@lukaslueg lukaslueg deleted the result_large_err branch August 30, 2022 18:35
bors added a commit that referenced this pull request Sep 2, 2022
Use `approx_ty_size` for `large_enum_variant`

This builds upon #9373 to use the approximate size of each variant for `large_enum_variant`. This allows us to lint in situations where an `enum` contains generics but is still guaranteed to have a large variant on an at-least basis, e.g. with `(T, [u8; 512])`.

* I've changed the wording from "is ... bytes" to "contains at least" because
  * the size is now an approximate lower bound (e.g. `512` in the example above). The actual size is larger due to `T`, including due to `T`'s memory layout.
  * the discriminant is not taken into account in the message. This comes up with variants like `A(T)`, which are "is at least 0 bytes" otherwise, which may be misleading.
* If the second-largest variant has no fields, there is a special case "carries no data" instead of "is at least 0 bytes".
* A variant like `A(T)` is "at least 0 bytes", which is technically true, yet we don't distinguish between "indeterminate" and truly "ZST".
* The generics-tests that were there before now lint while they didn't lint before. AFAICS this is correct.

I guess the above is correct-ish. However, I use the `SubstsRef` that I got via `cx.tcx.type_of(item.def_id)` to solve for generics in the variants. Is this even applicable, since we start from an `ItemKind`?
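A small sketch of the situation described here, using a hypothetical enum (Message and message_size are made-up names): whatever `T` turns out to be, the `Large` variant cannot be smaller than its `[u8; 512]` field.

```rust
use std::mem::size_of;

// Hypothetical generic enum with one variant that is large for every `T`.
#[allow(dead_code)]
pub enum Message<T> {
    Small(u8),
    Large((T, [u8; 512])),
}

// Size of `Message<T>` once `T` is known.
pub fn message_size<T>() -> usize {
    size_of::<Message<T>>()
}

fn main() {
    // 512 bytes is a lower bound for every instantiation.
    assert!(message_size::<()>() >= 512);
    assert!(message_size::<u8>() >= 512);
    // The real size grows with `T`, so 512 is only "at least".
    assert!(message_size::<u64>() >= 512 + size_of::<u64>());
}
```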
bors added a commit that referenced this pull request Sep 3, 2022
Use `approx_ty_size` for `large_enum_variant`

This builds upon #9373 to use the approximate size of each variant for `large_enum_variant`. This allows us to lint in situations where an `enum` contains generics but is still guaranteed to have a large variant on an at-least basis, e.g. with `(T, [u8; 512])`.

* I've changed the wording from "is ... bytes" to "contains at least" because
  * the size is now an approximate lower bound (e.g. `512` in the example above). The actual size is larger due to `T`, including due to `T`'s memory layout.
  * the discriminant is not taken into account in the message. This comes up with variants like `A(T)`, which are "is at least 0 bytes" otherwise, which may be misleading.
* If the second-largest variant has no fields, there is a special case "carries no data" instead of "is at least 0 bytes".
* A variant like `A(T)` is "at least 0 bytes", which is technically true, yet we don't distinguish between "indeterminate" and truly "ZST".
* The generics-tests that were there before now lint while they didn't lint before. AFAICS this is correct.

I guess the above is correct-ish. However, I use the `SubstsRef` that I got via `cx.tcx.type_of(item.def_id)` to solve for generics in the variants. Is this even applicable, since we start from an `ItemKind`?

changelog: none
bors added a commit that referenced this pull request Apr 19, 2023
Add size-parameter to unnecessary_box_returns

Fixes #10641

This adds a configuration-knob to the `unnecessary_box_returns`-lint which allows _not_ linting a `fn() -> Box<T>` if `T` is "large". The default byte size above which we no longer lint is 128 bytes (due to #4652 (comment), also used in #9373). The overall rationale is given in #10641.

---

changelog: Enhancement: [`unnecessary_box_returns`]: Added new lint configuration `unnecessary-box-size` to set the maximum size of `T` in `Box<T>` to be linted
[#10651](#10651)
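As an illustration of the size distinction this knob draws (boxed_small and boxed_large are hypothetical functions, and whether a given function is actually linted depends on the lint's other conditions):

```rust
use std::mem::size_of;

// `u32` is far below the 128-byte default, so boxing the return value
// buys little; this is the kind of signature the lint is aimed at.
pub fn boxed_small() -> Box<u32> {
    Box::new(7)
}

// `[u8; 256]` is above the 128-byte default, so returning it in a `Box`
// avoids moving 256 bytes around and is left alone under the default.
pub fn boxed_large() -> Box<[u8; 256]> {
    Box::new([0u8; 256])
}

fn main() {
    assert!(size_of::<u32>() <= 128);
    assert!(size_of::<[u8; 256]>() > 128);
    assert_eq!(*boxed_small(), 7);
    assert_eq!(boxed_large().len(), 256);
}
```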