Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leading plus for string to integer parsing #27580

Closed
Cigaes opened this Issue Aug 7, 2015 · 8 comments

Comments

Projects
None yet
5 participants
@Cigaes
Copy link

Cigaes commented Aug 7, 2015

The parse::<i32>() function (and all its cousins for other integer types) fail for strings starting with a + with ParseIntError { kind: InvalidDigit }. Failing snippet:

fn main () {
  assert_eq!("+42".parse::<i32>().unwrap(), 42);
}

The same error occurs with the symmetric function call i32::from_str("+42").

Test snippet on Rust Playground
Rust Forum discussion

@Gankro

This comment has been minimized.

Copy link
Contributor

Gankro commented Aug 7, 2015

Need to decide if this is intentional or incidental.

CC @arthurprs who wrote the current implementation.

@arthurprs

This comment has been minimized.

Copy link
Contributor

arthurprs commented Aug 7, 2015

Interesting. I'm 99% sure this was the behavior before my change though. https://github.com/rust-lang/rust/pull/27110/files#diff-01076f91a26400b2db49663d787c2576L1457

@Gankro

This comment has been minimized.

Copy link
Contributor

Gankro commented Aug 7, 2015

@arthurprs I agree, I just figured you might have an opinion.

@arthurprs

This comment has been minimized.

Copy link
Contributor

arthurprs commented Aug 7, 2015

@Gankro since all major languages accept the leading + we should probably do the same.

@rkruppe

This comment has been minimized.

Copy link
Member

rkruppe commented Oct 3, 2015

Bump: We should absolutely support leading +. It is super annoying to work around this when you're parsing something that permits leading +. I already encountered this twice in the last months, and I don't write a lot of Rust code to begin with.

@sstewartgallus

This comment has been minimized.

Copy link
Contributor

sstewartgallus commented Oct 7, 2015

If you add support for a bunch of additional features then people end up needing to write wrapper functions around the implementation to stop those features. For example, I often end up with code that is similar to the following because I don't want to accept nonconforming input:

static linted_error parse_int(char const *str, int *resultp)
{
    linted_error err = 0;

    char start = str[0U];

    if (isspace(start))
        return EINVAL;
    if ('+' == start)
        return EINVAL;
    if ('-' == start)
        return ERANGE;

    errno = 0;
    long yy;
    char *endptr;
    {
        char *xx;
        yy = strtol(str, &xx, 10);
        endptr = xx;
    }
    err = errno;
    if (err != 0)
        return err;

    if (*endptr != '\0')
        return EINVAL;
    if (yy > INT_MAX)
        return ERANGE;

    *resultp = yy;
    return 0;
}

This is not to mention the fact that strtol is locale specific in C anyway but Rust doesn't have that problem.

I bet in the future that a LOT of people will be sloppy and not take care of corner cases and then their parsers will end up accepting nonconforming input for whatever program they write and so gratuitous incompatibilities and extra features between programs will proliferate. I feel that is a net harm.

@Gankro

This comment has been minimized.

Copy link
Contributor

Gankro commented Oct 7, 2015

@sstewartgallus This is an interesting perspective, thank you!

However it has been noted elsewhere (#28826) that the float parsing algorithm already does this, which suggests we should probably bite the bullet and do this just to be consistent.

@rkruppe

This comment has been minimized.

Copy link
Member

rkruppe commented Oct 7, 2015

Just because current nightly and beta accept leading plus on floats doesn't mean a decision has been made. Like I said, this was probably unconscious, it slipped through the cracks so to speak. If desired, I can easily alter the code to reject leading plus. It would make me sad, because I think it ought to be accepted, but it would be be a one-minute fix and could be backported with little danger.

@sstewartgallus In broad strokes, you are right and this is a real concern. On the other hand, there are also formats where leading plus is legal (for example, the exponents in floating point: 1.5e+3, 1.0Fp+10). Working around that if str::parse does not accept leading plus is about as much work as the part of your example that rejects leading plus. Of course, this way around the incompatibility is a bit more likely to be noticed early, but it's still an edge case.

Additionally, it's not like all occurrences of integers in fixed data formats are floating around in a vacuum. Often some code has to identify the substrings containing them, then pass those substrings to str::parse — and this code is also complicit in accepting or rejecting leading plus.

In total, I don't think in this instance there is a significant hazard one way or another, it's simply a matter of consistency with other languages and doing the thing that most users find convenient.

bors added a commit that referenced this issue Oct 8, 2015

@bors bors closed this in #28826 Oct 8, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.