proposal: strconv: add ParseFloatPrefix #53340

@benhoyt

The strconv.ParseFloat function parses a floating-point value from a string, but it requires the value to be the entire string; otherwise it returns an error. In many kinds of parsing you want to know whether a string starts with a valid floating-point value, and then continue parsing after it.

For example, you could imagine code that parses <float> <op> <float>. Currently you'd have to write a full scanner for floats and pass the result to ParseFloat, which means a lot of tricky code and doing work that ParseFloat will then do again.

There's an existing, unexported function called parseFloatPrefix, with the following signature (the returned int is the number of bytes of s parsed):

func parseFloatPrefix(s string, bitSize int) (float64, int, error) { ... }

If this were exported as ParseFloatPrefix, you could use it to parse a float and then skip over the number of bytes it returns. (In fact, parseFloatPrefix is used by ParseComplex, which is basically parsing <float>±<float>.)
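
To make the intended usage concrete, here is a rough sketch of parsing <float> <op> <float>. Since strconv does not export ParseFloatPrefix, the floatPrefix helper below is a hypothetical stand-in with the proposed signature: it brute-forces the longest prefix that strconv.ParseFloat accepts, which is exactly the kind of repeated work an exported function would avoid. parseBinary is likewise just an illustrative name.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// floatPrefix is a hypothetical stand-in for the proposed
// strconv.ParseFloatPrefix. It returns the value of the longest prefix of s
// that strconv.ParseFloat accepts without error, plus the number of bytes
// consumed. It re-parses repeatedly, so it is for illustration only.
func floatPrefix(s string, bitSize int) (float64, int, error) {
	for n := len(s); n > 0; n-- {
		if f, err := strconv.ParseFloat(s[:n], bitSize); err == nil {
			return f, n, nil
		}
	}
	return 0, 0, errors.New("no floating-point prefix")
}

// parseBinary parses "<float> <op> <float>", where op is one of + - * /.
func parseBinary(s string) (float64, byte, float64, error) {
	left, n, err := floatPrefix(s, 64)
	if err != nil {
		return 0, 0, 0, err
	}
	// Skip the bytes the float consumed, then any spaces before the operator.
	rest := strings.TrimLeft(s[n:], " ")
	if rest == "" || !strings.ContainsRune("+-*/", rune(rest[0])) {
		return 0, 0, 0, errors.New("expected operator")
	}
	op := rest[0]
	rest = strings.TrimLeft(rest[1:], " ")
	right, n, err := floatPrefix(rest, 64)
	if err != nil {
		return 0, 0, 0, err
	}
	if n != len(rest) {
		return 0, 0, 0, errors.New("unexpected trailing input")
	}
	return left, op, right, nil
}

func main() {
	left, op, right, err := parseBinary("1.5 * 2e3")
	fmt.Printf("%v %c %v (err: %v)\n", left, op, right, err)
}
```

With a real ParseFloatPrefix, parseBinary would stay the same and only the stand-in would go away.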

Real-world examples other than ParseComplex:

  • I would use ParseFloatPrefix in my GoAWK project, because when AWK implicitly converts a string to a number, the number only needs to be a prefix of the string: "1.5xyz" yields 1.5. I currently implement this with my own pre-scanner, and then pass the result to ParseFloat. (I believe I used to use text/scanner, but it was rather heavy and slow.) I ran into this because I wanted to add hexadecimal floating-point support, which is allowed by the POSIX AWK spec.
  • strconv.parseFloatPrefix is used by the Vitess database project in some expression-parsing code. They pull in the actual strconv.parseFloatPrefix using a //go:linkname hack.
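
For reference, the pre-scan-then-ParseFloat workaround from the first bullet might look roughly like the following. This is an illustrative, decimal-only sketch, not GoAWK's actual code; scanFloatLen and awkToNumber are hypothetical names. Note that the bytes are walked twice, and that extending the scanner to hexadecimal floats would mean duplicating more of what strconv already does internally.

```go
package main

import (
	"fmt"
	"strconv"
)

// scanFloatLen returns the length of the longest decimal floating-point
// prefix of s, or 0 if s does not start with one. Hexadecimal floats,
// "inf" and "nan" are deliberately not handled in this sketch.
func scanFloatLen(s string) int {
	i := 0
	if i < len(s) && (s[i] == '+' || s[i] == '-') {
		i++
	}
	digits := 0
	for i < len(s) && s[i] >= '0' && s[i] <= '9' {
		i++
		digits++
	}
	if i < len(s) && s[i] == '.' {
		i++
		for i < len(s) && s[i] >= '0' && s[i] <= '9' {
			i++
			digits++
		}
	}
	if digits == 0 {
		return 0 // a sign or "." alone is not a number
	}
	// Accept an exponent only if at least one digit follows it.
	if i < len(s) && (s[i] == 'e' || s[i] == 'E') {
		j := i + 1
		if j < len(s) && (s[j] == '+' || s[j] == '-') {
			j++
		}
		start := j
		for j < len(s) && s[j] >= '0' && s[j] <= '9' {
			j++
		}
		if j > start {
			i = j
		}
	}
	return i
}

// awkToNumber converts s to a number the way the bullet above describes:
// a leading numeric prefix is used, and a non-numeric string yields 0.
func awkToNumber(s string) float64 {
	n := scanFloatLen(s)
	// Second pass over the same bytes. ParseFloat returns 0 on a syntax
	// error and ±Inf on overflow, so the error is ignored here.
	f, _ := strconv.ParseFloat(s[:n], 64)
	return f
}

func main() {
	fmt.Println(awkToNumber("1.5xyz")) // 1.5
}
```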

In the golang-dev thread where I asked about this, @rsc mentioned that strconv.QuotedPrefix was added recently, and said "I think we could reasonably add FloatPrefix, although at that point perhaps we should also consider IntPrefix, UintPrefix, BoolPrefix, and ComplexPrefix". Two things from that:

  • I'd argue that we need [Parse]FloatPrefix because it's quite complex and error-prone to write, whereas IntPrefix and UintPrefix are relatively simple (at least with a fixed base), BoolPrefix is trivial, and ComplexPrefix becomes relatively simple once we have [Parse]FloatPrefix.
  • I'm not a fan of how QuotedPrefix doesn't return the unescaped string, because then (in most use cases) you have to call strconv.Unquote later, which will loop over the bytes a second time to unescape it -- rather inefficient! So I'd much rather see ParseFloatPrefix, or a similar signature which returns the parsed floating-point value as well. In my GoAWK use case, efficiency is important, because it's not used just for parsing source code, but may be used on each field in the data passed to the AWK script, so I don't want to scan the bytes twice.
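
To illustrate the double scan mentioned in the second bullet, here is a small sketch using the existing strconv.QuotedPrefix and strconv.Unquote functions. parseQuotedPrefix is a hypothetical helper name; it shows the kind of combined "value plus remainder" result that a ParseFloatPrefix-style signature would give you in one pass.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseQuotedPrefix is a hypothetical helper that returns the unescaped
// value of the quoted string at the start of s, plus the remaining input.
func parseQuotedPrefix(s string) (string, string, error) {
	// First pass: find where the quoted literal ends.
	quoted, err := strconv.QuotedPrefix(s)
	if err != nil {
		return "", "", err
	}
	// Second pass over the same bytes: unescape the literal.
	value, err := strconv.Unquote(quoted)
	if err != nil {
		return "", "", err
	}
	return value, s[len(quoted):], nil
}

func main() {
	value, rest, err := parseQuotedPrefix(`"a\tb" + x`)
	fmt.Printf("%q %q %v\n", value, rest, err)
}
```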
