Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: cmd/vet: unusedresult should have an option to work with a "blacklist" of functions to check that results are used #65984

Open
janpfeifer opened this issue Feb 28, 2024 · 7 comments
Labels
Milestone

Comments

@janpfeifer
Copy link

janpfeifer commented Feb 28, 2024

Proposal Details

Similar in spirit to proposal #20803 (for the go compiler), and related to #63689, #14972, and #20148.

I recently lost time on a subtle bug where I forgot to use the result of a function. I tried to simplify the issue to a minimal example:

func f(x float64) float64 {
	math.Sqrt(16.0)  // No warning!?
	return x + 1.0
}

Currently, go vet will only catch this if I list -unusedfuncs=math.Sqrt. But I would only add math.Sqrt if I already knew about the bug, which defies the go vet purpose. I would like to pass all my projects functions/methods and its dependencies (including standard library) to -unusedfuncs.

Instead, can we have an alternative flag -unusedignorefuncs (or something else) where one can pass a "blacklist", and every other function whose results are unused are reported back ? (maybe with some defaults, like fmt.Printf) ?

I understand a similar point is being made for the compiler (and extra syntax?) in proposal #20803.

Having a "blacklist" option to go vet could also be an intermediary solution for that proposal -- for those using go vet systematically.

@janpfeifer janpfeifer changed the title cmd/vet: unused should have an option to work with a "black list" of functions to check that results are used cmd/vet: unused should have an option to work with a "blacklist" of functions to check that results are used Feb 28, 2024
@janpfeifer janpfeifer changed the title cmd/vet: unused should have an option to work with a "blacklist" of functions to check that results are used cmd/vet: unusedresult should have an option to work with a "blacklist" of functions to check that results are used Feb 28, 2024
@timothy-king
Copy link
Contributor

IIUC this proposal is to add a flag that instructs the checker to treat all functions as [#mustuse] unless listed in -unusedignorefuncs. Because functions can be called for their side-effects, like fmt.Printf, it seems to me like this would be quite noisy until one built up a sufficient local list of functions to ignore in -unusedignorefuncs. (Maybe long enough to maybe warrant being in a file that is checked in for re-usability?) This is not really the user experience we aim for with cmd/vet. (Granted it would be off by default.)

It would be better to have some data to work with than to speculate though. Have you collected any data about the user experience of such an option? What does this list look like on large projects? This may be a good candidate for trying such a checker out in the wild without/before being integrated in vet itself.

@janpfeifer
Copy link
Author

Exactly.

Btw what is the [#mustuse] you mentioned ? Is it a flag / annotation that can be used today ?

Now, I would argue that if a function return a result, in most cases it is meant to be used -- otherwise why return a value. The fmt.Printf scenario are less common. More on the argumentation for this in #20803, but that is a more drastic solution.

Cost benefit analysis of the proposed features (not counting the cost of implementing it ... hopefully it's simple?):

  • Pros: Cost of not checking unused results:
    • hidden bugs, that can be hard to catch.
    • can be added to automatic pre-submit checks.
  • Cons: Cost of false positive checks (unused results being ok):
    • Need to add function to -unusedignorefuncs
    • Or, add an empty assignment _ = f()
    • Alternatively: since this proposal is disabled by default, one can simply not take the trade-off, or only use it occasionally.

Data analysis: a bias dataset, but easy for me to generate and hopefully it's a data-point useful enough for this proposal.

I took one of my larger current project, grepped for all standard library references (
so functions we all can understand/agree on the semantics),
manually filtered out constants and functions that don't return any results, and also manually labeled them as
benefit from unused return check (1) or not (0). The script I used is below -- it's just a quick rough approximation.
Also, method calls are not counted -- they are not easy to grep for.

The results for my project follow below. Basically, outside of fmt.Print* functions (
which I assume would be listed in -unusedignorefunc by default), I couldn't find one case where one wouldn't want
to have a general check that the results are used. Pls, double check, and let me know if you agree with the labeling:

Labels:

  • 0 unused return check would just generate noise; "0, listed" are presumably listed by default in -unusedignorefunc.
  • 1 benefit from unused return check; "1, error" means the return is an error, hence benefits from check.
Function # in code Label
atomic.AddInt64 4 1
binary.Write 2 1, error
bytes.NewBuffer 2 1
bytes.Replace 1 1
C.CString 4 1
C.GoString 1 1
C.malloc 2 1
csv.NewReader 1 1
errors.Cause 1 1
errors.Error 1 1
errors.Is 1 1
errors.New 30 1
filepath.Join 6 1
flag.Bool 33 1
flag.Float64 31 1
flag.Int 67 1
flag.Lookup 2 1
flag.Set 2 1, error
flag.String 36 1
fmt.Errorf 17 1
fmt.Fprint 1 0, listed
fmt.Print 3 0, listed
fmt.Printf 389 0, listed
fmt.Println 65 0, listed
fmt.Sprintf 168 1
gob.NewDecoder 9 1
gob.NewEncoder 9 1
gzip.NewReader 1 1
hex.Encode 1 0
hex.EncodeToString 1 0
image.NewNRGBA 3 1
image.NewRGBA 1 1
image.NRGBA 3 1
io.Copy 4 1, error
jpeg.Decode 1 1
json.Marshal 1 1
json.NewDecoder 2 1
json.NewEncoder 2 1
json.Unmarshal 1 1
math.Abs 2 1
math.Acos 2 1
math.Ceil 1 1
math.Cos 2 1
math.E 2 1
math.Exp 7 1
math.Expm1 1 1
math.Floor 2 1
math.Inf 2 1
math.IsInf 7 1
math.IsNaN 9 1
math.Log 3 1
math.Log1p 1 1
math.Max 1 1
math.Mod 1 1
math.Nextafter 1 1
math.Nextafter32 1 1
math.Pow 2 1
math.Round 7 1
math.Signbit 1 1
math.Sin 1 1
math.Sqrt 6 1
math.Sqrt2 1 1
math.Tanh 1 1
os.Chmod 2 1, error
os.Create 14 1
os.CreateTemp 4 1
os.Getenv 3 1
os.IsExist 3 1
os.IsNotExist 15 1
os.LookupEnv 6 1
os.Mkdir 4 1, error
os.MkdirAll 10 1, error
os.MkdirTemp 5 1
os.Open 22 1
os.OpenFile 1 1
os.ReadDir 6 1
os.ReadFile 5 1
os.Remove 4 1, error
os.RemoveAll 4 1, error
os.Rename 3 1, error
os.Setenv 4 1, error
os.Stat 9 1
os.Unsetenv 1 1, error
os.WriteFile 1 1, error
path.Base 1 1
path.Dir 4 1
path.IsAbs 8 1
path.Join 91 1
png.Encode 1 1
rand.Intn 6 1
rand.IntN 3 1
rand.New 12 1
rand.NewSource 12 1
reflect.Copy 4 0
reflect.DeepEqual 13 1
reflect.MakeFunc 1 1
reflect.MakeSlice 4 1
reflect.ValueOf 35 1
regexp.MustCompile 8 1
runtime.FuncForPC 2 1
runtime.NumCPU 5 1
sha256.New 1 1
sort.Ints 2 1
sort.Slice 3 1
sort.Sort 1 1
sort.Strings 3 1
strconv.Atoi 5 1
strconv.FormatInt 7 1
strconv.ParseFloat 3 1
strconv.ParseInt 1 1
strings.Compare 1 1
strings.Contains 1 1
strings.HasPrefix 12 1
strings.HasSuffix 8 1
strings.IndexRune 1 1
strings.Join 17 1
strings.NewReader 1 1
strings.Repeat 1 1
strings.Replace 1 1
strings.Split 10 1
strings.ToLower 3 1
strings.ToUpper 1 1
strings.TrimSpace 1 1
template.New 2 1
time.Now 19 1
time.Since 5 1
unsafe.Pointer 64 1
unsafe.Sizeof 1 1
unsafe.Slice 25 1
unsafe.SliceData 2 1
os/user.Current 1 1
os/user.Lookup 1 1

ps.: Script to generate the data, just so one can reproduce on other projects / or verify for correctness:

go install github.com/itchyny/gojq/cmd/gojq@latest
go install github.com/ezekg/xo@latest

function enumerateImports() { go list -json ./... | gojq ".Imports" | egrep '^  ' | sed 's/^\s*//; s/,$//; s/"//g;' ; }
function stdlibImports() { egrep -v '(\.org|\.com|\.io)' | sort | uniq ; }
function createExtractionRule() { pkgs="$(xo '/(\w+)\n/$1/i' | sort | uniq | paste -sd '|')" ; printf "/((%s)\.\w+)/\$1/" "$pkgs" ; }

extractRule="$(enumerateImports | stdlibImports | createExtractionRule)" ; echo $extractRule

function enumerateAndCountStdlib() { find . -name '*.go' | while read fileName ; do cat $fileName | xo "$extractRule" ; done | sort | uniq -c | xo '/(\d+)\s([\w.]+)\n/| $2 | $1 | |/'; }

enumerateAndCountStdlib

@timothy-king
Copy link
Contributor

Thank you for taking the time to collect this data. This is exactly the kind of information we need to consider to make such a decision about vet.

I think what you have given is mostly a claim that the set of functions in the standard library that should be warned on by default for unusedresult is much larger than the current set. I am not sure you will achieve agreement agreement on the Labels you have assigned, for example atomic.AddInt64. But I think there is a case to be made that vet should warn on many of these like bytes.NewBuffer. The preponderance of useless computations indicate correctness bugs, and addressing it always saves cycles. There is at least a debate that some of these are worth warning on in vet. There will be cases that more contentious. See #20803. Expanding this set is a different conversation than the proposed flag though. It should have its own proposal.

The data that is really needed for this proposal is what such a flag would actually end up reporting. This helps us know is both the concrete benefit of adding such a flag (whats reported), and where the potential objections might come from (what needed to be suppressed). This also involves going outside the standard library and including user code.

Here is a diff that will be good enough for experimentation:

% git diff go/analysis/passes/unusedresult/unusedresult.go
diff --git a/go/analysis/passes/unusedresult/unusedresult.go b/go/analysis/passes/unusedresult/unusedresult.go
index 76f42b052..4f167f8ce 100644
--- a/go/analysis/passes/unusedresult/unusedresult.go
+++ b/go/analysis/passes/unusedresult/unusedresult.go
@@ -111,6 +111,12 @@ func run(pass *analysis.Pass) (interface{}, error) {
                if !ok {
                        return // e.g. var or builtin
                }
+               if fn.Type().(*types.Signature).Results().Len() > 0 {
+                       pass.Reportf(call.Lparen, "unusedignorefuncs: result of %s.%s call not used",
+                               fn.Pkg().Path(), fn.Name())
+                       return
+               }
+
                if sig := fn.Type().(*types.Signature); sig.Recv() != nil {
                        // method (e.g. foo.String())
                        if types.Identical(sig, sigNoArgsStringResult) {

I attempted to apply this using:

% cat go/analysis/passes/unusedresult/main.go 
//go:build ignore
package main

import (
        "golang.org/x/tools/go/analysis/passes/unusedresult"
        "golang.org/x/tools/go/analysis/singlechecker"
)

func main() {
        singlechecker.Main(unusedresult.Analyzer)
}
% gotip build -o unused ./go/analysis/passes/unusedresult/main.go
% cd <gomlx>
% unused ./...

This had problems due to missing headers. I believe this is as expected given the Installation instructions for the project.

For the packages that do succeed, these are mostly fmt.Print and friends, as well as github.com/stretchr/testify/assert. The former is expected (the Label 0). The latter is evidence that 'may ignore result' annotations are required for such a checker to be helpful over time, and this list might not be amenable to a flag. File with annotations or checked into source seem like better matches.

Now, I would argue that if a function return a result, in most cases it is meant to be used -- otherwise why return a value.

Like fmt.Printf there may be other side-effects and the return results may be optional. I do not really disagree that most of the time results are intended to be used, but to be clear, "most cases" is the not the bar for vet. The README outlines the precision requirements https://github.com/golang/go/blob/master/src/cmd/vet/README#L25 .

Btw what is the [#mustuse] you mentioned ? Is it a flag / annotation that can be used today ?

It was a typo of #[must_use] of Rust. I chose this as it was the first listed in #20803 (comment) . It could equivalently be [[nodiscard]]. It is just some instance of an annotation that a return result must be used.

@janpfeifer
Copy link
Author

Thanks for the follow up @timothy-king !

I interpret the data I collected in a slightly different fashion: in a (typical??) project, most of the standard library" functions used would benefit from an "unused return values" checks, weighted by number of uses.

I'm hoping this factors in the calculation of the cost/benefit, and how often such flag would be beneficial.

The data that is really needed for this proposal is what such a flag would actually end up reporting...

I fear this data wouldn't be telling in committed projects: I already fixed my problem, it took me time. It's possible that most projects eventually fixed their unused returns caused bugs, it just took developers time for them fix it.

A better metric would be how much time such a lint option would save developers, by finding bugs quicker. Alas ... I don't have a crystal ball, nor can I think of a proxy metric that would approximate it.

But maybe we can borrow the wisdom of other programming languages (in g++ -Wunused-result is enabled by default) and assume this is a useful enough feature ?

I'm proposing it to be disabled by default. I wonder if it's possible to instrument it to collect usage statistics (??), such that one can observe over time how often it becomes used/enabled by developers ? That could actually be a good proxy metric to saving developers time...

If it turns out to be popular, and after a few iterations to minimize false positives, one can decide to make it default one day. Also the statistics collected with this feature could help better inform #20803, without committing to a language change.

Here is a diff that will be good enough for experimentation:

Thank you!!!
I suspected it would be simple, but I didn't know that it would be that easy :) I had already looked at golangci-lint.run and staticcheck and neither reported unused returns.

I'll give it a try and report back tomorrow (it's Sundar's day ...), I have the headers here. I can also run on another larger project here. But again, my point is that likely I won't find anything ... what I'm hoping is that this warning would help me fix future bugs quicker.

If it's so simple I feel hopeful :)

ps.: Indeed atomic.AddInt64 was mislabeled, thanks for the catch!

@janpfeifer
Copy link
Author

janpfeifer commented Mar 1, 2024

Very related proposal by @robpike , with similar motivation: #20148

This proposal is broader in the sense it doesn't limit itself only to errors -- so not special casing errors (my particular case was not related to an error).
But also narrower in that it is suggested to be made optional.

@timothy-king
Copy link
Contributor

I am not sure rule 3 in #20148 applies equally well to types that are not errors. A string/int/bool result being ignored in the next statement that reads a string/int/bool variable does not quite seem like a bug without more context. FWIW the fmt.Print family does not seem like they obvious work well with rule 3 either.

@janpfeifer
Copy link
Author

Agreed in part. Even with respect to errors there are some cases one may want to ignore -- e.g.: closing a file opened for reading. I agree that an error is less likely one wants to ignore than a string, int or bool. But my bug was pretty serious, and what I was ignoring was not an error (it was an object). So errors are not the only thing one would like checked in every case.

My point is only about giving the option for the user to decide what they want to be warned about, and what not. This power is already available as a "whitelist" flag. And this proposal is about improving on that control by providing an alternative "blacklist" flag instead.

My use case (and assume others) is that the "whitelist" control doesn't help me at all with the cases that matter, but a "blacklist" will cost some (I would have to list some functions that I don't care about the results), but will be very valuable for catching those functions I hadn't thought about.

Btw, with regard to #20148 my point is that the "blacklist" control here could be used to catch those same cases of unhandled errors, along others. But it is not the same.

@ianlancetaylor ianlancetaylor changed the title cmd/vet: unusedresult should have an option to work with a "blacklist" of functions to check that results are used proposal: cmd/vet: unusedresult should have an option to work with a "blacklist" of functions to check that results are used Mar 6, 2024
@gopherbot gopherbot added this to the Proposal milestone Mar 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

3 participants