Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/vet: warn about string(int) type conversions #32479

Closed
ianlancetaylor opened this issue Jun 7, 2019 · 61 comments
Closed

cmd/vet: warn about string(int) type conversions #32479

ianlancetaylor opened this issue Jun 7, 2019 · 61 comments

Comments

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Jun 7, 2019

#3939 proposes removing the explicit type conversion string(i) where i has an integer type other than rune. That is a backward incompatible change, so we cannot make it until we can count on having modules everywhere, with the support modules provide for determining the version of the language to use to compile code.

There is a step we can take today, which is to change go vet to report conversions of integer types other than rune to string. If we think that removing this feature from the language is a good idea, then a vet check for it is a good idea, and it is one that we can implement today.

@ianlancetaylor ianlancetaylor added this to the Go1.14 milestone Jun 7, 2019
@agnivade
Copy link
Contributor

@agnivade agnivade commented Jun 8, 2019

@networkimprov
Copy link

@networkimprov networkimprov commented Jun 8, 2019

Pls also permit string(b) where b is a byte or uint8.

@mvdan
Copy link
Member

@mvdan mvdan commented Jun 24, 2019

This seems fine to me. The warning should probably point the user in the right direction.

@alanfo
Copy link

@alanfo alanfo commented Jun 27, 2019

Pls also permit string(b) where b is a byte or uint8.

I think this is essential if this proposal (and eventually #3939) is to be adopted.

The string(byte) conversion, when you're dealing with ASCII characters only, is probably used at least as often as string(rune)in my experience.

@dolmen
Copy link

@dolmen dolmen commented Jul 2, 2019

@networkimprov @alanfo You will still be able to use string([]byte{b}).

@networkimprov
Copy link

@networkimprov networkimprov commented Jul 2, 2019

That would be a surprising requirement. And allocates a byte slice for a single statement?

@alanfo
Copy link

@alanfo alanfo commented Jul 2, 2019

@dolmen

You will still be able to use string([]byte{b}).

Well string(rune(b)) would be a better alternative because you wouldn't then have to allocate a byte slice but it would still be a nuisance to have to update all my code to use this workaround :(

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@ianlancetaylor ianlancetaylor modified the milestones: Backlog, Go1.15 Nov 5, 2019
@bradfitz bradfitz added the NeedsFix label Nov 5, 2019
@gopherbot gopherbot removed the NeedsDecision label Nov 5, 2019
@smasher164
Copy link
Member

@smasher164 smasher164 commented Nov 6, 2019

I will take a crack at adding an analysis pass and vet check for this.

@alandonovan
Copy link
Contributor

@alandonovan alandonovan commented Nov 6, 2019

string([]byte{b}) is not equivalent to string(b) for a byte value b: the former yields the string whose bytes are just b, the latter returns the string that UTF-8-encodes the single rune b. If b is greater than 127, then the former will be a single-byte string but the latter will be a two-byte encoding.

it would still be a nuisance to have to update all my code to use this workaround :(

Michael Matloob recently presented at a conference the "suggested fixes" feature of the analysis Diagnostics API. The diagnostic for the new analyzer could replace (or offer to replace) the deprecated expressions by the preferred version.

@networkimprov
Copy link

@networkimprov networkimprov commented Nov 6, 2019

string([]byte{b}) ... yields the string whose bytes are just b

That could be invalid UTF-8. So string(buf[x:y]) may be a bug.

To extract strings from buffers, should we generally use string(bytes.ToValidUTF8(buf, nil))?

@alandonovan
Copy link
Contributor

@alandonovan alandonovan commented Nov 6, 2019

That could be invalid UTF-8. So string(buf[x:y]) may be a bug.

It could be, but it very much depends on the situation.

To extract strings from buffers, should we generally use string(bytes.ToValidUTF8(buf, nil))?

I had never even heard of that function until now, and I'm more deliberate than most about string encodings. Slicing a string or byte slice may be perfectly fine, again depending on the situation.

@alanfo
Copy link

@alanfo alanfo commented Nov 7, 2019

The diagnostic for the new analyzer could replace (or offer to replace) the deprecated expressions by the preferred version.

Be that as it may, it still seems anomalous to allow string(rune) but deprecate string(byte).

After all, string([]rune] and string([]byte) are both allowed but not conversions from slices of other integer types. It may therefore seem strange to people that both corresponding 'scalar' conversions are not allowed as well.

Indeed, @ianlancetaylor made a similar point in #3939 itself.

@smasher164
Copy link
Member

@smasher164 smasher164 commented Dec 30, 2019

The diagnostic for the new analyzer could replace (or offer to replace) the deprecated expressions by the preferred version.

What would be the suggested fix for an integer that overflows the size of the rune? Although this shouldn't occur in practice, simply changing it to rune(i) would break pre-existing code.

package main

import "fmt"

func main() {
	i := int64(1 << 33)
	fmt.Printf("%q %q\n", string(i), string(rune(i))) // prints "�" "\x00"
}
@gopherbot
Copy link

@gopherbot gopherbot commented Dec 31, 2019

Change https://golang.org/cl/212919 mentions this issue: go/analysis: add pass to check string(int) conversions

@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented Dec 31, 2019

@smasher164 Do you have any examples of correct code that has this problem?

The correct fix is going to be to add a condition comparing the value to utf8.MaxRune, and when it is larger using utf8.RuneError.

@smasher164
Copy link
Member

@smasher164 smasher164 commented Dec 31, 2019

@ianlancetaylor Running the tool over my GOPATH revealed 16 cases where an int, uint, int64, or uint64 were converted to a string, and did not do bounds checking to ensure that the converted value doesn't overflow MaxRune. However, most of these cases depend on these functions' call sites, which could potentially be passing only valid input.

github.com/antlr/antlr4/runtime/Go/antlr/dfa_serializer.go:115:15: conversion from int to string
func (l *LexerDFASerializer) getEdgeLabel(i int) string {
    return "'" + string(i) + "'"
}
github.com/antlr/antlr4/runtime/Go/antlr/interval_set.go:235:34: conversion from int to string
github.com/antlr/antlr4/runtime/Go/antlr/interval_set.go:238:30: conversion from int to string
github.com/antlr/antlr4/runtime/Go/antlr/interval_set.go:238:53: conversion from int to string
type Interval struct {
    start int
    stop  int
}
type IntervalSet struct {
    intervals []*Interval
    readOnly  bool
}
for j := 0; j < len(i.intervals); j++ {
    v := i.intervals[j]
    if v.stop == v.start+1 {
        if v.start == TokenEOF {
            names = append(names, "<EOF>")
        } else {
            names = append(names, ("'" + string(v.start) + "'"))
        }
    } else {
        names = append(names, "'"+string(v.start)+"'..'"+string(v.stop-1)+"'")
    }
}
github.com/antlr/antlr4/runtime/Go/antlr/lexer_atn_simulator.go:633:15: conversion from int to string
func (l *LexerATNSimulator) GetTokenName(tt int) string {
    if tt == -1 {
        return "EOF"
    }

    return "'" + string(tt) + "'"
}
github.com/antlr/antlr4/runtime/Go/antlr/transition.go:239:15: conversion from int to string
github.com/antlr/antlr4/runtime/Go/antlr/transition.go:239:42: conversion from int to string
type RangeTransition struct {
    *BaseTransition

    start, stop int
}
func (t *RangeTransition) String() string {
    return "'" + string(t.start) + "'..'" + string(t.stop) + "'"
}
github.com/cznic/ql/etc.go:464:10: conversion from int to string
switch i := int(o); i {
case andand:
    return "&&"
case andnot:
    return "&^"
case lsh:
    return "<<"
case le:
    return "<="
case eq:
    return "=="
case ge:
    return ">="
case neq:
    return "!="
case oror:
    return "||"
case rsh:
    return ">>"
default:
    return string(i)
}
github.com/cznic/ql/etc.go:901:11: conversion from idealInt to string
github.com/cznic/ql/etc.go:905:11: conversion from idealUint to string
github.com/cznic/ql/etc.go:918:11: conversion from int64 to string
github.com/cznic/ql/etc.go:928:11: conversion from uint64 to string
case qString:
switch x := val.(type) {
//case nil:
//case idealComplex:
//case idealFloat:
case idealInt:
    return string(x), nil
case idealRune:
    return string(x), nil
case idealUint:
    return string(x), nil
//case bool:
//case complex64:
//case complex128:
//case float32:
//case float64:
case int8:
    return string(x), nil
case int16:
    return string(x), nil
case int32:
    return string(x), nil
case int64:
    return string(x), nil
case string:
    return x, nil
case uint8:
    return string(x), nil
case uint16:
    return string(x), nil
case uint32:
    return string(x), nil
case uint64:
    return string(x), nil
case []byte:
    return string(x), nil
case *big.Int:
    return x.String(), nil
case time.Time:
    return x.String(), nil
case time.Duration:
    return x.String(), nil
default:
    return invConv(val, typ)
}
github.com/cznic/ql/etc.go:1698:15: conversion from uint64 to string
rec[i] = string(y)
github.com/cznic/ql/ql.go:820:17: conversion from int to string
for _, f := range f {
    a = append(a, string(f.typ)+f.name)
}
github.com/go-delve/delve/pkg/proc/eval.go:535:9: conversion from int64 to string
b, _ := constant.Int64Val(argv.Value)
s := string(b)
github.com/unidoc/unidoc/pdf/internal/cmap/cmap.go:396:36: conversion from uint64 to string
target := hexToUint64(v)
i := uint64(0)
for sc := srcCodeFrom; sc <= srcCodeTo; sc++ {
    r := target + i
    cmap.codeMap[numBytes-1][sc] = string(r)
    i++
}
@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented Jan 1, 2020

Thanks for the list. The antlr and ql cases all look like incorrect uses of string(int), where they really meant to use strconv.Itoa(int). If I'm right, the warning is serving its purpose for those cases. The unicode case probably really does mean to use string(r), so that code would have to change. I'm not sure about the go-delve case.

@smasher164
Copy link
Member

@smasher164 smasher164 commented Jan 1, 2020

I'm not sure about the go-delve case.

The go-delve case is implementing eval for type conversion, so I would ignore that case.

FWIW, if the goal of backwards compatibility in the suggested fixes is to ensure that the fix results in the same string as string(int), then comparing against utf8.MaxRune isn't the only option:

func conv(v uint64) uint64 {
    if v > utf8.MaxRune {
        return utf8.RuneError
    }
    return v
}

When a rune is converted to a string, if bit 22 is set, then the string will print "�". Using this fact, the conversion can be made branchless:

func conv(v uint64) uint64 {
    return v | (1-((v>>22|v>>21&1)-1)>>63)<<21
}

This checks if any of the top 43 bits are nonzero, and uses that truth to set bit 22.

@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented Jan 1, 2020

The goal of the warning is to tell people when their code is not acting as they expect. As far as I can see the fix for cases like antlr is not to add a condition; it's to change the code to use strconv.Itoa instead of string(int), so that the right thing happens.

@smasher164
Copy link
Member

@smasher164 smasher164 commented Jan 1, 2020

That's fair. But we'd have to detect and distinguish those cases from cases where the conversion is correctly made and stays within the bounds of a rune. For instance:

/Users/akhil/work/src/golang.org/x/net/webdav/internal/xml/xml.go:1043:14: conversion from uint64 to string
if err == nil && n <= unicode.MaxRune {
    text = string(n)
    haveText = true
}
/Users/akhil/work/src/golang.org/x/net/webdav/internal/xml/xml.go:1066:15: conversion from int to string
var entity = map[string]int{
    "lt":   '<',
    "gt":   '>',
    "amp":  '&',
    "apos": '\'',
    "quot": '"',
}
if r, ok := entity[s]; ok {
    text = string(r)
    haveText = true
} else if d.Entity != nil {
    text, haveText = d.Entity[s]
}
/Users/akhil/work/src/golang.org/x/text/unicode/cldr/base.go:101:10: conversion from int64 to string
if s[1] == '#' {
    r, _ := strconv.ParseInt(s[3:len(s)-1], 16, 32)
    return string(r)
}
/Users/akhil/work/src/golang.org/x/text/message/fmt_test.go:1454:9: conversion from int to string
for i := 0; i < 128; i++ {
    if f.Flag(i) {
        s += string(i)
    }
}
/Users/akhil/work/src/golang.org/x/exp/errors/fmt/scan.go:607:50: conversion from int64 to string
r := int64(s.getRune())
n := uint(bitSize)
x := (r << (64 - n)) >> (64 - n)
if x != r {
    s.errorString("overflow on character value " + string(r))
}
/Users/akhil/work/src/github.com/docker/docker/vendor/github.com/golang/protobuf/proto/text_parser.go:315:10: conversion from uint64 to string
if i > utf8.MaxRune {
    return "", "", fmt.Errorf(`\%c%s is not a valid Unicode code point`, r, ss)
}
return string(i), s, nil
/Users/akhil/work/src/github.com/docker/docker/pkg/chrootarchive/archive_test.go:103:32: conversion from int to string
for i := 0; i < 65534; i++ {
    excludes[i] = strings.Repeat(string(i), 64)
}
/Users/akhil/work/src/github.com/rogpeppe/godef/vendor/9fans.net/go/plan9/dir.go:173:9: conversion from int to string
/Users/akhil/work/src/github.com/rogpeppe/godef/vendor/9fans.net/go/plan9/dir.go:177:10: conversion from int to string
var permChars = []permChar{
    permChar{DMDIR, 'd'},
    permChar{DMAPPEND, 'a'},
    permChar{DMAUTH, 'A'},
    permChar{DMDEVICE, 'D'},
    permChar{DMSOCKET, 'S'},
    permChar{DMNAMEDPIPE, 'P'},
    permChar{0, '-'},
    permChar{DMEXCL, 'l'},
    permChar{DMSYMLINK, 'L'},
    permChar{0, '-'},
    permChar{0400, 'r'},
    permChar{0, '-'},
    permChar{0200, 'w'},
    permChar{0, '-'},
    permChar{0100, 'x'},
    permChar{0, '-'},
    permChar{0040, 'r'},
    permChar{0, '-'},
    permChar{0020, 'w'},
    permChar{0, '-'},
    permChar{0010, 'x'},
    permChar{0, '-'},
    permChar{0004, 'r'},
    permChar{0, '-'},
    permChar{0002, 'w'},
    permChar{0, '-'},
    permChar{0001, 'x'},
    permChar{0, '-'},
}
for _, pc := range permChars {
    if p&pc.bit != 0 {
        did = true
        s += string(pc.c)
    }
    if pc.bit == 0 {
        if !did {
            s += string(pc.c)
        }
        did = false
    }
}
/Users/akhil/work/src/github.com/miekg/dns/msg_test.go:70:13: conversion from int64 to string
n, _ := strconv.ParseInt(escape[1:], 10, 8)
return string(n)
/Users/akhil/work/src/github.com/google/skylark/library.go:406:16: conversion from int to string
i, err := AsInt32(args[0])
if err != nil {
    return nil, fmt.Errorf("chr: got %s, want int", args[0].Type())
}
if i < 0 {
    return nil, fmt.Errorf("chr: Unicode code point %d out of range (<0)", i)
}
if i > unicode.MaxRune {
    return nil, fmt.Errorf("chr: Unicode code point U+%X out of range (>0x10FFFF)", i)
}
return String(string(i)), nil
/Users/akhil/work/src/github.com/cznic/ql/all_test.go:2949:9: conversion from int to string
for _, v := range rand.Perm(32) {
    s += string('0' + v)
}
/Users/akhil/work/src/upspin.io/client/clientutil/all_test.go:129:40: conversion from int to string
for i := 0; i < 10; i++ {
    loc := upspin.Location{
        Endpoint:  inProcess,
        Reference: upspin.Reference("ref" + string(i)),
    }
    tLocs = append(tLocs, loc)
}
@alandonovan
Copy link
Contributor

@alandonovan alandonovan commented May 7, 2020

I commented on the CL: use Sprintf("%c", rune). Performance is not important here.

gopherbot pushed a commit to golang/tools that referenced this issue May 13, 2020
Updates golang/go#32479

Change-Id: I0ede4f0da3f125df39b19a0687d858465cd45e6c
Reviewed-on: https://go-review.googlesource.com/c/tools/+/232718
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
gopherbot pushed a commit that referenced this issue May 13, 2020
As per discussion on the accepted proposals, enable these vet checks by
default in the go command. Update corresponding documentation as well.

Updates #32479.
Updates #4483.

Change-Id: Ie93471930c24dbb9bcbf7da5deaf63bc1a97a14f
Reviewed-on: https://go-review.googlesource.com/c/go/+/232660
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot
Copy link

@gopherbot gopherbot commented May 13, 2020

Change https://golang.org/cl/233899 mentions this issue: message, unicode/cldr: avoid string(int)

@gopherbot
Copy link

@gopherbot gopherbot commented May 13, 2020

Change https://golang.org/cl/233900 mentions this issue: icmp, webdav/internal/xml: avoid string(int)

@gopherbot
Copy link

@gopherbot gopherbot commented May 13, 2020

Change https://golang.org/cl/233902 mentions this issue: errors/fmt: avoid string(int)

gopherbot pushed a commit to golang/net that referenced this issue May 13, 2020
Updates golang/go#32479

Change-Id: Ife0c3a2f10afb676af5f2110a9681216122c8808
Reviewed-on: https://go-review.googlesource.com/c/net/+/233900
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
gopherbot pushed a commit to golang/text that referenced this issue May 13, 2020
Updates golang/go#32479

Change-Id: Idd10c8208aaa006b3c80b90644e540a8af3572c0
Reviewed-on: https://go-review.googlesource.com/c/text/+/233899
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
gopherbot pushed a commit to golang/exp that referenced this issue May 13, 2020
Updates golang/go#32479

Change-Id: I23e48e5c91cdb7ffa7bb82ed0f94927c452e9975
Reviewed-on: https://go-review.googlesource.com/c/exp/+/233902
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@toothrot
Copy link
Contributor

@toothrot toothrot commented May 14, 2020

@ianlancetaylor Do you know who should be assigned as the owner for this? It's labeled as a blocker for the Go 1.15 beta.

@smasher164
Copy link
Member

@smasher164 smasher164 commented May 14, 2020

@toothrot Since CL 232660 has been merged and all of the trybot issues are taken care of, this issue is resolved, and no longer a release blocker.

@cespare
Copy link
Contributor

@cespare cespare commented May 14, 2020

Can we improve the vet error message? I'm seeing this on tip:

x/x.go:9:14: conversion from int to string yields a string of one rune

This message states a fact without indicating a problem. If I weren't aware of this upcoming language change, I'd wonder why vet is spouting non sequiturs.

@alandonovan
Copy link
Contributor

@alandonovan alandonovan commented May 14, 2020

Would this be this better?

"conversion from int to string yields a string of one rune, not a string of digits (did you mean fmt.Sprintf("%d", x)?)"
@josharian
Copy link
Contributor

@josharian josharian commented May 14, 2020

I think even better would be to add a link to a short wiki page where we explain about the language change, and provide two replacements, one like Alan suggested and one use fmt.Sprintf %c if they really did mean it.

@smasher164
Copy link
Member

@smasher164 smasher164 commented May 14, 2020

Would this be this better?

But that doesn't cover the case when the user meant fmt.Sprintf("%c", x). I agree that the current error message isn't ideal, but the conversion is discouraged both because it doesn't mean a "string of digits" and because there is a more robust way of converting the codepoint to a string.

I think even better would be to add a link to a short wiki page

Indeed the package documentation for the vet check explains the alternatives in more detail. Perhaps this should be included in the wiki?

This checker flags conversions of the form string(x) where x is an integer
(but not byte or rune) type. Such conversions are discouraged because they
return the UTF-8 representation of the Unicode code point x, and not a decimal
string representation of x as one might expect. Furthermore, if x denotes an
invalid code point, the conversion cannot be statically rejected.

For conversions that intend on using the code point, consider replacing them
with string(rune(x)). Otherwise, strconv.Itoa and its equivalents return the
string representation of the value in the desired base.
@gopherbot
Copy link

@gopherbot gopherbot commented May 18, 2020

Change https://golang.org/cl/234517 mentions this issue: doc/go1.15: mention vet warning for string(x)

gopherbot pushed a commit that referenced this issue May 18, 2020
For #32479

Change-Id: I974709d471021d370aa9bdc5f24b02afa8bd9b54
Reviewed-on: https://go-review.googlesource.com/c/go/+/234517
Reviewed-by: Robert Griesemer <gri@golang.org>
@josharian
Copy link
Contributor

@josharian josharian commented May 18, 2020

I'm not sure that CL 234517 fixes this issue. It is good to have things spelled out in the release notes, but I think have more info, or a link to more info, in the vet error itself is what this issue is about.

@josharian josharian reopened this May 18, 2020
@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented May 19, 2020

This issue is about vet issuing a warning for string(int).

I'm fine with improving the vet warning, of course.

@josharian
Copy link
Contributor

@josharian josharian commented May 19, 2020

@ianlancetaylor Ack. Filed #39151.

@josharian josharian closed this May 19, 2020
@ianlancetaylor
Copy link
Contributor Author

@ianlancetaylor ianlancetaylor commented May 19, 2020

@josharian Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Proposals
Accepted
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
You can’t perform that action at this time.