nom consuming 100% cpu #27

filipegoncalves · 2015-04-09T13:06:38Z

I am exploring the possibility of switching to nom in a project I am working on. I am not fully familiar with nom yet, so please bear with me.

For starters, I was trying to come up with a parser that matches strings of the form [a-zA-Z][-a-zA-Z0-9_]*. I wrote this:

#[macro_use]
extern crate nom;

use std::str::from_utf8;

use nom::{alpha, alphanumeric};
use nom::{IResult, Needed};
use nom::IResult::*;

named!(identifier<&[u8], String>,
       chain!(
           h: map_res!(alpha, from_utf8) ~
           t: many0!(alt!(alphanumeric | tag!("-") | tag!("_"))),
           || {
               let  s = h.to_string();
               t.into_iter().fold(s, |mut accum, slice| {
                   accum.push_str(from_utf8(slice).unwrap()); accum })}));

And I tested it with:

    #[test]
    fn id_name() {
        let a_setting = &b"miles"[..];
        let res = setting_name(a_setting);
        assert_eq!(res, Done(&b""[..], "miles".to_string()));
    }

When I run cargo test my PC completely hangs. With top I can see that it starts consuming more and more CPU and memory until the entire system is completely unusable and I have to hard reset.

Am I doing something wrong? Is this the best way to make a parser to match this type of strings?

The text was updated successfully, but these errors were encountered:

filipegoncalves · 2015-04-09T14:17:28Z

Update:

This is harmless:

named!(setting_name<&[u8], String>,
       chain!(
           h: map_res!(alpha, from_utf8) ~
           t: map_res!(alt!(alphanumeric | tag!("-") | tag!("_")), from_utf8),
           || { let mut s = h.to_string(); s.push_str(t); s }));

Apparently, adding many0! on t is the issue here. When I do it, the problem is back. Weird.

Geal · 2015-04-09T15:26:04Z

Can I get back to you on Saturday about this issue? I am currently at a conference and do not have much time.

Usually, when there was a lot of CPU and memory consumption, it was caused by the compiler having a hard time resolving macros. Since nom uses a lot of macros, it happens sometimes.

You could try making a named function for the many0 line and call it from the chain. Please let me know if it does not work even with that fix, that will mean I have something to fix :)
Which version of nom are you using?

filipegoncalves · 2015-04-09T15:52:14Z

Sure, take your time.

I tried with latest nom version on crates.io (0.2.1) with rustc 1.0.0-nightly (6436e348e). I also forked the repo. and applied #25 - same thing happened (this time on rustc 1.0.0-beta).

I'll try to follow your suggestion and go with a function when I have the chance. I'll let you know when I have news.

Enjoy the conference!

thehydroimpulse · 2015-04-09T23:00:39Z

I've also experienced this with the many0 and many1 macros during runtime (tests just hang). Haven't dug into the issue yet.

Geal · 2015-04-09T23:36:04Z

Ok, can you share the code? I'll investigate.
Le 10 avr. 2015 01:00, "Daniel Fagnan" notifications@github.com a écrit :

I've also experienced this with the many0 and many1 macros during runtime
(tests just hang). Haven't dug into the issue yet.

—
Reply to this email directly or view it on GitHub
#27 (comment).

filipegoncalves · 2015-04-10T11:10:32Z

After some digging I have a minimal test case that reproduces the issue. It is triggered by combining many0!() with alphanumeric. The problem is that many0!() expands into a loop that only breaks when the underlying parser returns something that is not Done. However, alphanumeric always returns Done, thus the loop never terminates.

The same happens with alpha. In general, any parser that always returns Done will cause an infinite loop when wrapped inside a many0!.

This is enough to reproduce the bug:

#[macro_use]
extern crate nom;

use nom::alphanumeric;
use nom::IResult;
use nom::IResult::*;

named!(just_a_test<&[u8], Vec<&[u8]> >,
       many0!(alphanumeric));

fn main() {
    println!("Entered main()");

    let test = &b"1as32"[..];
    let res = just_a_test(test);

    println!("Parsed input");
    assert_eq!(res, Done(&b""[..], vec![&b"1"[..], &b"a"[..], &b"s"[..], &b"3"[..], &b"2"[..]]));
}

This will print Entered main() and then it hangs on just_a_test().

In your opinion, what would be the correct way to fix this? I thought about changing alphanumeric and alpha so as not to return Done, but I don't know if this makes sense from a design point of view.

Perhaps a better design would be for many0!() to terminate if the underlying parser returned Done but didn't consume any input (in addition to stopping if the parser doesn't return Done, as it is now). What do you think? I'd be glad to work on this bug.

Geal · 2015-04-11T15:36:30Z

Good catch. I think alpha, alphanumeric, is_a and others should be able to return Error if they do not find any input, since we have opt! for the cases where we could ignore some input.

I will also update many to ignore parsers that accept empty input. I just checked Parsec, there's a special case for that in their code as well.

filipegoncalves · 2015-04-11T19:29:57Z

Alright, good to know. I'll wait for your fix.

Keep up the good job, this library is really cool.

Fix for #27

Geal · 2015-04-11T22:13:56Z

Here is a fix, let me know if this works for you.

filipegoncalves · 2015-04-11T22:54:38Z

Works like a charm. Thanks!

Geal · 2015-04-12T13:34:37Z

Great!

NilsIrl · 2021-10-14T15:37:54Z

Failing when nothing is parsed is probably not a great idea either, it may be better if many just stopped parsing.

For example, I'm trying to parse comments (in the context of a programming language), and have the following code:

fn space0(input: &str) -> IResult<&str, &str> {
    recognize(many0(tuple((
        complete::multispace0,
        opt(comment),
        complete::multispace0,
    ))))(input)
}

Unfortunately it fails when there is no whitespace, what can I do?

Stargateur · 2021-10-14T20:24:12Z

@NilsIrl without detail of grammar of what you try to parse is gonna be hard to advice you.

Geal · 2021-10-14T21:39:11Z

it should be possibl to make a separate function for tuple((complete::multispace0, opt(comment), complete::multispace0)), and in that one check the first character. If it's not a space or the first char of a comment, return an error

NilsIrl · 2021-10-18T09:41:49Z

I was able to fix it like so:

fn space0(input: &str) -> IResult<&str, &str> {
    recognize(many0(verify(
        recognize(tuple((
            complete::multispace0,
            opt(comment),
            complete::multispace0,
        ))),
        |s: &str| !s.is_empty(),
    )))(input)
}

filipegoncalves changed the title ~~nom consuming 100% cpu~~ nom consuming 100% cpu Apr 9, 2015

Geal added a commit that referenced this issue Apr 11, 2015

Avoid infinite loops in many0! and many1!

e455ddb

Fix for #27

filipegoncalves closed this as completed Apr 11, 2015

minamorl mentioned this issue Aug 21, 2017

Many parser should not handle infinite loop jneen/parsimmon#198

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nom consuming 100% cpu #27

nom consuming 100% cpu #27

filipegoncalves commented Apr 9, 2015

filipegoncalves commented Apr 9, 2015

Geal commented Apr 9, 2015

filipegoncalves commented Apr 9, 2015

thehydroimpulse commented Apr 9, 2015

Geal commented Apr 9, 2015

filipegoncalves commented Apr 10, 2015

Geal commented Apr 11, 2015

filipegoncalves commented Apr 11, 2015

Geal commented Apr 11, 2015

filipegoncalves commented Apr 11, 2015

Geal commented Apr 12, 2015

NilsIrl commented Oct 14, 2021 •

edited

Stargateur commented Oct 14, 2021

Geal commented Oct 14, 2021

NilsIrl commented Oct 18, 2021

nom consuming 100% cpu #27

nom consuming 100% cpu #27

Comments

filipegoncalves commented Apr 9, 2015

filipegoncalves commented Apr 9, 2015

Geal commented Apr 9, 2015

filipegoncalves commented Apr 9, 2015

thehydroimpulse commented Apr 9, 2015

Geal commented Apr 9, 2015

filipegoncalves commented Apr 10, 2015

Geal commented Apr 11, 2015

filipegoncalves commented Apr 11, 2015

Geal commented Apr 11, 2015

filipegoncalves commented Apr 11, 2015

Geal commented Apr 12, 2015

NilsIrl commented Oct 14, 2021 • edited

Stargateur commented Oct 14, 2021

Geal commented Oct 14, 2021

NilsIrl commented Oct 18, 2021

NilsIrl commented Oct 14, 2021 •

edited