Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Question] Why is my npeg parser done, then starts backtracking and fails. #53

Closed
thatrandomperson5 opened this issue Dec 11, 2022 · 8 comments

Comments

@thatrandomperson5
Copy link

I have this line of my parser: tinput <- tblock * !1 after EOL it should check again for the !1 and end but from the trace below you can see it backs up and starts again then fails. Why is this happening and how can i fix it?

 75|  0| 94|                        |EOL            |chr "\r"                                |***
 79|  0| 94|                        |EOL            |chr "\n"                                |***
 82|  0| 94|                        |EOL            |chr "\r"                                |***
 85|  0| 94|                        |EOL            |any                                     |***
 89|  0| 94|                        |tcontent       |return                                  |***
 93|  0| 94|                        |tblock         |commit 91                               |***
 91|  0| 94|                        |tblock         |choice 94                               |**
 92|  0| 94|                        |tblock         | call tcontent:9                        |***
  9|  0| 94|                        |tcontent       |choice 60                               |***
 10|  0| 94|                        |Static         | capopen ckAction -> 94                 |****
 11|  0| 94|                        |Static         |  capopen ckVal -> 94                   |****
 12|  0| 94|                        |whitespace     |   span {'\t',' '}                      |****
 13|  0| 94|                        |Static         |  capclose ckVal -> 94                  |****
 14|  0| 94|                        |Static         | capclose ckAction -> 94                |****
 15|  0| 94|                        |Alpha          | set {'A'..'Z','a'..'z'}                |****
 95|  0| 94|                        |               |opFail (backtrack)                      |****
 60|  0| 94|                        |Static         |capopen ckAction -> 94                  |***
 61|  0| 94|                        |Static         | capopen ckVal -> 94                    |***
 62|  0| 94|                        |whitespace     |  span {'\t',' '}                       |***
 63|  0| 94|                        |Static         | capclose ckVal -> 94                   |***
 64|  0| 94|                        |Static         |capclose ckAction -> 94                 |***
 65|  0| 94|                        |Alnum          |set {'0'..'9','A'..'Z','a'..'z'}        |***
 95|  0| 94|                        |               |opFail (backtrack)                      |***
 94|  0| 94|                        |tblock         |return                                  |**
 48|  0| 94|                        |tcontent       | choice 59                              |**
 49|  0| 94|                        |tcontent       |  choice 57                             |***
 50|  0| 94|                        |Dedent         |   capopen ckAction -> 94               |****
 51|  0| 94|                        |Dedent         |    capopen ckVal -> 94                 |****
 52|  0| 94|                        |whitespace     |     span {'\t',' '}                    |****
 53|  0| 94|                        |Dedent         |    capclose ckVal -> 94                |****
 54|  0| 94|                        |Dedent         |   capclose ckAction -> 94              |****
 95|  0| 94|                        |               |opFail (backtrack)                      |****
 57|  0| 94|                        |tcontent       | commit 58                              |***
 58|  0| 94|                        |               |opFail (backtrack)                      |**
 60|  0| 51|Fooo:\n    Barr\n    Fo |Static         |capopen ckAction -> 51                  |*
 61|  0| 51|Fooo:\n    Barr\n    Fo |Static         | capopen ckVal -> 51                    |*
 62|  0| 51|Fooo:\n    Barr\n    Fo |whitespace     |  span {'\t',' '}                       |*
 63|  0| 51|Fooo:\n    Barr\n    Fo |Static         | capclose ckVal -> 51                   |*
 64|  0| 51|Fooo:\n    Barr\n    Fo |Static         |capclose ckAction -> 51                 |*
 65|  0| 51|Fooo:\n    Barr\n    Fo |Alnum          |set {'0'..'9','A'..'Z','a'..'z'}        |*
 66|  0| 52|ooo:\n    Barr\n    Foo |value          |capopen ckAction -> 52                  |*
 67|  0| 52|ooo:\n    Barr\n    Foo |value          | capopen ckVal -> 52                    |*
 68|  0| 52|ooo:\n    Barr\n    Foo |value          |  span {'0'..'9','A'..'Z','a'..'z'}     |*
 69|  0| 55|:\n    Barr\n    Foooo: |value          | capclose ckVal -> 55                   |*
 70|  0| 55|:\n    Barr\n    Foooo: |value          | chr ":"                                |*
 71|  0| 56|\n    Barr\n    Foooo:  |               | jump :73                               |*
 73|  0| 56|\n    Barr\n    Foooo:  |               |opFail (backtrack)                      |*
  4|  0| 51|Fooo:\n    Barr\n    Fo |tinput         |any                                     |
  5|  0| 52|ooo:\n    Barr\n    Foo |               |jump :7                                 |
  7|  0| 52|ooo:\n    Barr\n    Foo |               |opFail (error)                          |
@zevv
Copy link
Owner

zevv commented Dec 11, 2022

Hard to tell without seeing your grammar, any chance you could post a minimal self contained snippet of nim code + grammar that shows this behavior?

@thatrandomperson5
Copy link
Author

I have no idea where the issue lies so it is hard to make a minimal self contained snippet. I will try tho

@thatrandomperson5
Copy link
Author

thatrandomperson5 commented Dec 11, 2022

Is this good enough?

  let testParser = peg("tinput", o: seq[OC]):
    name <- >(+Alpha) * ":":
      o.add ("name", $1, indent())
    value <- >(+Alnum) * !":":
      o.add ("value", $1, indent())
    tcontent <-  (ib.Line(name) * ib.Block(tblock)) | ib.Line(value)
    tblock <- +tcontent
    tinput <- tblock * !1

It’s minimal but not self contained

@zevv
Copy link
Owner

zevv commented Dec 11, 2022

I'd like to help you debug this, but [lease try to come up with something I can run to inspect what is happening; I can see from the trace that NPeg is backtracking so your input does not match the grammar, but I can not tell what is wrong with your grammar without having your grammar and the subject string you are trying to parse.

@thatrandomperson5
Copy link
Author

This is self contained but not minimal:

  var indentStack = newSeq[int]()

  template top[T](s: seq[T]): T = 
    if s.high == -1:
      0
    else:
      s[s.high]
  
  grammar "ib":
    EOL <- "\r\n" | "\n" | "\r" | !1
    whitespace <- *Blank

    Indent <- >whitespace: 
        validate len($0) > indentStack.top
        indentStack.add len($0)
    Dedent <- >whitespace:
        let delStack = len($0)
        validate delStack < indentStack.top
        while delStack < indentStack.top:
            discard indentStack.pop
    Static <- >whitespace:
        validate len($0) == indentStack.top
    Line(content) <- ib.Static * content * ib.EOL
    Block(content) <- &ib.Indent * content * &ib.Dedent
    
  
  type OC = tuple[kind: string, s: string, indent: int]
  let testParser = peg("tinput", o: seq[OC]):
    name <- >(+Alpha) * ":":
      o.add ("name", $1, indentStack.len)
    value <- >(+Alnum) * !":":
      o.add ("value", $1, indentStack.len)
    tcontent <-  (ib.Line(name) * ib.Block(tblock)) | ib.Line(value)
    tblock <- +tcontent
    tinput <- tblock * !1

  var o = newSeq[OC]()
  const testStr = """Foo:
    FooBar:
        Bar
        Bar2
    Bar3
Fooo:
    Barr
    Foooo:
        Bar4
Bar5"""
  echo testStr
  let r = testParser.match(testStr, o)
  echo o
  doAssert r.ok

@thatrandomperson5
Copy link
Author

@zevv can you run the above successfully?

@zevv
Copy link
Owner

zevv commented Dec 12, 2022

The parser parses ok, but your Dedent rule fails because of validate delStack < indentStack.top, causing the whole shebang to backtrack to the start.

@thatrandomperson5
Copy link
Author

You were right, just needed to change to this: Block(content) <- &ib.Indent * content * (&ib.Dedent | !1). Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants