Stack overflow when parsing a deeply nested toml_edit::Document #206

5225225 · 2021-09-13T21:15:04Z

Basically the same error as seen as toml-rs/toml-rs#428

Reproduction code:

fn main() {
    let brackets = "[".repeat(20000);
    let input_string = format!("x={}", &brackets);
    input_string.parse::<toml_edit::Document>().unwrap();
}

epage · 2021-09-13T22:12:13Z

I assume this is also true for { (inline tables).

Likewise, it'd take longer but I assume we'd also hit OOM for many dotted keys, many tables, and many key-values.

For dotted keys, I'm actually unsure whether we would hit OOM (too many keys for memory) or stack overflow (recursively walking data structures) first.

ordian · 2021-09-14T08:03:10Z

The issue is about stackoverflow, not OOM.

A solution to this problem is to use an explicit stack (using a vec) instead of recursion.

epage · 2021-09-14T13:34:30Z

I brought up OOM because

Stackoverflow is a special case of OOM
Because we generally don't handle OOM errors in Rust, they lead to the same behavior (crash)
Changing recursion to explicit stack just converts the problem from stackoverflow to OOM, it will just take longer to crash without the stack frame boilerplate

The main reason I can think of to handle stackoverflows specially is to provide a good error on accidental infinite recursion so people can get out of that case, which is not applicable here.

We can easily change descend-path from recursion but parsing of inline tables and arrays will be more difficult.

ordian · 2021-09-14T14:26:40Z

Well, by OOM, people usually mean Out Of (heap) Memory.
And one of the differences between the stack and the heap is that the default stack size is fixed and relatively small (8MiB), whereas the heap size is usually limited by the amount of RAM a computer has (yeah, it's more complicated than that).

So I would say, ensuring no stackoverflow is almost always lib's responsibility, as opposed to ensure no OO(heap)M.

This is a part of toml-rs#206. This wasn't even my goal but to work out what was going on with lifetimes which led to the loosening of lifetimes on `*Table::entry_format`. Hopefully this will also help with performance.

This is a part of #206. This wasn't even my goal but to work out what was going on with lifetimes which led to the loosening of lifetimes on `*Table::entry_format`. Hopefully this will also help with performance.

ordian · 2022-07-31T15:20:26Z

We could use something like https://github.com/rust-lang/stacker for this.

manunio · 2022-09-19T10:24:30Z

Hi @ordian @epage While working on issue:#335 and running fuzzer locally for sometime, i found a similar stackoverflow crash. Following is the reproduction code.

#[test]
fn test_stackoverflow() {
    let testcase = "8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
{7x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[{8x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[
[{0x=[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[8x0={22={2[";
    testcase.parse::<toml_edit::Document>().unwrap();    
}

running 1 test

thread 'test_stackoverflow' has overflowed its stack
fatal runtime error: stack overflow
error: test failed, to rerun pass `--test test_stackoverflow`

Caused by:
  process didn't exit successfully: `/home/maxx/dev/security/oss-fuzz-projects/toml_edit/target/debug/deps/test_stackoverflow-e48339b99633f4c2` (signal: 6, SIGABRT: process abort signal)

Just thought of sharing this.

epage · 2022-09-19T13:23:28Z

Yes, anything that will cause too much recursion will cause an overflow.

Speaking of ways of solving this, I've been thinking of switching this to nom once some ergoomic issues are resolved. I suspect I could more easily write a heap-allocated recursive parser with it. We'll see.

5225225 · 2022-09-19T13:27:54Z

That being said, it might also be worth adding recursion limits by default, since if you have a deeply nested object, you might run into issues working with it (drop being recursive and blowing the stack without taking care, for example).

I would be surprised if any object nested more than, say, 100 deep is actually legitimate. And if someone really does have objects nested that deep, they can disable the limits and be careful themselves.

Without the macros of the old parser, it was much easier to add new state for this. I chose the limit of 128 as that is what serde_json does. Technically, this may break someone but the likelihood is extremely low, especially with how toml isn't exactly designed for arbitrary depth and with how recently this release was out. Fixes toml-rs#206

Without the macros of the old parser, it was much easier to add new state for this. I chose the limit of 128 as that is what serde_json does. I didn't both exposing more configuration for this than the `unbounded` feature. We can add more later if needed but I doubt that. Technically, this may break someone but the likelihood is extremely low, especially with how toml isn't exactly designed for arbitrary depth and with how recently this release was out. Fixes toml-rs#206

epage added the C-bug Category: Things not working as expected label Sep 13, 2021

epage mentioned this issue Sep 24, 2021

fix: Remove some recursion to avoid overflow #213

Merged

epage added the A-parse Area: Parsing TOML label Sep 23, 2022

epage mentioned this issue Dec 27, 2022

fix(parser): Limit recursion to block stackoverflow #404

Merged

epage closed this as completed in #404 Dec 27, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stack overflow when parsing a deeply nested toml_edit::Document #206

Stack overflow when parsing a deeply nested toml_edit::Document #206

5225225 commented Sep 13, 2021

epage commented Sep 13, 2021

ordian commented Sep 14, 2021

epage commented Sep 14, 2021

ordian commented Sep 14, 2021

ordian commented Jul 31, 2022

manunio commented Sep 19, 2022

epage commented Sep 19, 2022

5225225 commented Sep 19, 2022

Stack overflow when parsing a deeply nested toml_edit::Document #206

Stack overflow when parsing a deeply nested toml_edit::Document #206

Comments

5225225 commented Sep 13, 2021

epage commented Sep 13, 2021

ordian commented Sep 14, 2021

epage commented Sep 14, 2021

ordian commented Sep 14, 2021

ordian commented Jul 31, 2022

manunio commented Sep 19, 2022

epage commented Sep 19, 2022

5225225 commented Sep 19, 2022