Parsing numbers starting with 0 is not supported #3

Closed
Shnatsel opened this issue Aug 28, 2023 · 9 comments
Labels
enhancement New feature or request

Comments

@Shnatsel
Contributor

Add the following test to the codebase:

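#[test]
fn test_parse_i64_fuzzed() {
    // Input found by the fuzzer: a valid i64 with a single leading zero.
    let input = b"02221222122210221223";
    let result = parse::<i64>(input).unwrap();
    // The same value, written without the leading zero.
    assert_eq!(result, 2221222122210221223);
}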

Run it with SIMD disabled (with the .cargo/config.toml file deleted) and you'll see that it fails. Meanwhile the standard library parses this number correctly.

This bug was discovered using cargo-fuzz. You can find a guide to it here and the fuzzing setup I used here. I will contribute the fuzzing setup to your repository eventually.

@Shnatsel
Contributor Author

Actually, it fails with SIMD enabled as well. So turning SIMD on/off doesn't make any difference.

@Shnatsel
Contributor Author

I am also getting a similar error when trying to parse 0000000000000000000000000000000000000001

The standard library implementation also handles this one correctly.
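For reference, the standard library behaviour described above can be checked with a minimal sketch. The helper name `parse_std_i64` is made up for illustration; only `str::parse` from the standard library is used.

```rust
// Hypothetical helper for illustration: parse with the standard library only.
fn parse_std_i64(input: &str) -> Option<i64> {
    input.parse::<i64>().ok()
}

fn main() {
    // The fuzzer-found input from the test above.
    assert_eq!(
        parse_std_i64("02221222122210221223"),
        Some(2_221_222_122_210_221_223)
    );
    // Far more characters than the 19 digits of i64::MAX, yet the value is 1.
    assert_eq!(
        parse_std_i64("0000000000000000000000000000000000000001"),
        Some(1)
    );
}
```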

@RoDmitry
Owner

Leading zeroes are not supported by design, and I am not sure support is needed, because it would make parsing slower.

@Shnatsel
Contributor Author

I see, thanks for the clarification.

Parsing numbers with leading zeroes is something people also need to do, and I think this library could do it faster than str::trim_start_matches('0'). There are ways to quickly count leading decimal zeroes with SIMD: KholdStare/qnd-integer-parsing-experiments#2
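A caller-side workaround along those lines might look like the sketch below. It is scalar, not SIMD, and `strip_leading_zeroes` is a hypothetical helper, not part of this library. The standard library parser stands in for the real one so the example is self-contained; in practice the stripped slice would be handed to the library's `parse`.

```rust
/// Hypothetical helper (not part of the library): drops leading ASCII zeroes,
/// but only while another digit follows, so "000" becomes "0" rather than an
/// empty slice, and malformed input such as "00-5" stays malformed.
fn strip_leading_zeroes(mut input: &[u8]) -> &[u8] {
    while let [b'0', rest @ ..] = input {
        match rest.first() {
            Some(b) if b.is_ascii_digit() => input = rest,
            _ => break,
        }
    }
    input
}

fn main() {
    let stripped = strip_leading_zeroes(b"0000000000000000000000000000000000000001");
    assert_eq!(stripped, &b"1"[..]);

    // Stand-in for the real parser: the stripped bytes are plain ASCII digits.
    let value: u64 = std::str::from_utf8(stripped).unwrap().parse().unwrap();
    assert_eq!(value, 1);
}
```

Signed input such as `-007` is left untouched by this sketch; handling it would mean skipping the sign before stripping.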

@RoDmitry
Owner

RoDmitry commented Aug 28, 2023

Hm, it might not be as slow as I thought. I will think about it, thanks.

@Shnatsel Shnatsel changed the title from "Fails to parse some valid i64 and u64 numbers using the fallback implementation" to "Parsing numbers starting with 0 is not supported" Aug 28, 2023
@RoDmitry RoDmitry added the enhancement New feature or request label Sep 11, 2023
@RoDmitry
Owner

RoDmitry commented Sep 23, 2023

It works, but only for inputs no longer than the type's maximum number of digits 2288368 (starting with 16)

@Shnatsel
Contributor Author

FWIW the standard library parser supports even longer sequences of leading zeroes, so that 000000000000000001 does parse correctly into a u16.
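A small sketch of that behaviour, using only the standard library (`parse_std_u16` is a made-up helper name): the input is much longer than the 5 digits of `u16::MAX`, and leading zeroes do not hide a genuine overflow.

```rust
// Hypothetical helper for illustration: parse with the standard library only.
fn parse_std_u16(input: &str) -> Option<u16> {
    input.parse::<u16>().ok()
}

fn main() {
    // Much longer than the 5 digits of u16::MAX (65535), but the value is 1.
    assert_eq!(parse_std_u16("000000000000000001"), Some(1));
    // Padding does not change the range check.
    assert_eq!(parse_std_u16("00065535"), Some(u16::MAX));
    assert_eq!(parse_std_u16("00065536"), None);
}
```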

It's up to you whether you choose to support this use case.

@RoDmitry
Owner

RoDmitry commented Sep 23, 2023

I just don't see any use cases. If anybody needs this, please describe the use case here; it would be interesting to know.

@RoDmitry RoDmitry closed this as not planned Sep 23, 2023
@RoDmitry
Owner

RoDmitry commented Oct 12, 2023

Done in a separate function atoi_simd::parse_skipped


2 participants