Potential DOS with exponent notation? #8

Closed
david-christiansen opened this issue Jun 9, 2022 · 3 comments · Fixed by #9

david-christiansen commented Jun 9, 2022

Thanks for writing a library to parse TOML 1.0!

The TOML spec allows exponents to be used in integers. When I try to decode a TOML.Value from this:

[thing]
oops = 1e1000000000000000000000000000000000000000000000000000000000000000

the parser takes a very long time. I presume that it's allocating a quite large integer, or doing some multiplications, but I haven't dug into the source code to see.

I think that Aeson uses an explicit scientific number representation for integers, which preserves exponent notation as-written, and gives client libraries a way to check whether it's in range. Is this something worth doing here?

@brandonchinn178
Owner

Thanks for the issue! A couple of things here: when you use scientific notation, it's a float, not an integer. Unlike JSON, TOML distinguishes between integers and floats. Also, TOML's spec explicitly says that floats should be represented as an IEEE double precision float, which makes me hesitant to use an arbitrary precision representation. (The Integer spec does specify "the implementation should support 64-bit signed integers", which I take to mean "at least", while the Float spec specifies it "should be implemented as a double precision float", which explicitly specifies the implementation.)

The design space is also a bit different here, because generally, TOML isn't used with untrusted input.

Also, does the parser itself take a long time? With Haskell's laziness, I would expect the parser to finish quickly and only take a long time when you explicitly inspect the value.
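
To illustrate the laziness point, here is a minimal sketch. The `Value` type and `parseLazy` function are hypothetical stand-ins, not this library's actual types: the `Double` field is stored as an unevaluated thunk, so matching on the constructor returns immediately, and the expensive `10 ^ ex` only runs if something inspects the number itself.

```haskell
-- Hypothetical stand-in for a parsed TOML value; not the library's real type.
-- The field has no strictness annotation, so it is stored lazily.
data Value = Float Double

-- Hypothetical "parser" result for a literal of the form 1e<ex>.
-- The conversion is only described here, not performed.
parseLazy :: Integer -> Value
parseLazy ex = Float (fromInteger (10 ^ ex))

-- Forces only the constructor, never the Double inside it.
isFloat :: Value -> Bool
isFloat (Float _) = True
```

Calling `isFloat (parseLazy (10 ^ 60))` only evaluates to the `Float` constructor, so it should return right away; printing or comparing the wrapped `Double` is what would trigger the slow computation.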

Related: toml-lang/toml#538

@brandonchinn178
Owner

Update: just tested it; the following code runs and exits immediately:

case parseTOML "" "a = 1e1000000000000000000000000000000000000000000000000000000000000000" of
  Right _ -> return ()
  Left e -> error $ show e

so the issue is converting the thing into a Double.

One sane thing I could do right now is check if the exponent is greater than some arbitrary value like 1000 (since Double's max value is around 10^308) and just return infinity immediately.
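
The check described above could look roughly like the following. This is only a sketch under assumptions: `toDoubleClamped` and `expLimit` are hypothetical names, the literal is assumed to have already been split into an integer mantissa and a base-10 exponent, and the cutoff of 1000 is the arbitrary value mentioned above. It also assumes mantissas of ordinary length; a mantissa with hundreds of digits could shift the true magnitude back into range, which this heuristic ignores.

```haskell
import Data.Ratio ((%))

-- Arbitrary cutoff (an assumption): well beyond Double's range of roughly 1e308.
expLimit :: Integer
expLimit = 1000

-- Hypothetical helper: convert mantissa * 10^ex to a Double without ever
-- building a huge Integer when the exponent is far out of range.
toDoubleClamped :: Integer -> Integer -> Double
toDoubleClamped mantissa ex
  | mantissa == 0        = 0
  -- Far too large: return signed infinity without computing 10 ^ ex.
  | ex > expLimit        = fromInteger (signum mantissa) * (1 / 0)
  -- Far too small: return signed zero without computing 10 ^ (-ex).
  | ex < negate expLimit = fromInteger (signum mantissa) * 0
  -- In the safe band, do the exact computation and let the conversion round.
  | ex >= 0              = fromInteger (mantissa * 10 ^ ex)
  | otherwise            = fromRational (mantissa % (10 ^ negate ex))
```

With this guard, the largest intermediate `Integer` has on the order of a thousand digits, so the conversion stays cheap regardless of what exponent appears in the input.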

@david-christiansen
Author

Sorry for the slow responses here!

I had assumed that 1e100 was an integer, and 1.0e100 a float. I suppose I read it wrong :-)

Thanks again!
