Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug? - surroundedBy with optional whitespace fails to parse #201

Closed
precision-software opened this issue Apr 13, 2021 · 5 comments
Closed

Comments

@precision-software
Copy link

precision-software commented Apr 13, 2021

`
/**
Demonstrates a possible bug with cats-parse.

Parses a parenthesized word list  (foo, bar, x, y),
 but fails to parse if a space precedes the final ')'.

cats-parse version 0.3.2
Scala version      3.0.0-RC2

*/

import cats.parse.Parser=>P

@main def main():Unit =

// Specific types of characters.
val whitespace = P.charIn( " \r\t\n")
val letter = P.charIn('a' to 'z')
val comma = P.char(',')
val lParen = P.char('(')
val rParen = P.char(')')

// For testing, a lowercase word.
val word = letter.rep.string

// Allow optional spaces around the list characters -  ( , )
val whitespaces0 = whitespace.rep0.void
val listStart = lParen.surroundedBy(whitespaces0).void
val listEnd = rParen.surroundedBy(whitespaces0).void
val listSeparator = comma.surroundedBy(whitespaces0).void

// Define a parenthesized list of words ... eg. (foo, bar, x, y)
val wordList = listStart ~ word.repSep0(listSeparator) ~ listEnd

// This wordlist parses fine.
val result1 = wordList.parseAll("(foo, bar, x, y)")
assert(result1.isRight)

// PROBLEM: If a space precedes the final ')', then it fails.
val result2 = wordList.parseAll("(foo, bar, x, y )")
assert(result2.isRight)

`

@johnynek
Copy link
Collaborator

See this comment:

#188 (comment)

surroundedBy with whitespace does not work great. The problem is lack of backtracking. Your design is assuming backtracking. As soon as you see whitespace, the first item will start to parse, if you don't hit the surrounded thing, you will get an error (again, an intentional goal is opt-in backtracking).

So, I think instead you want something like: whitespaces0.soft *> comma *> whitespaces0

The reason your final example fails is because with you hit y ) the space starts looking like a listSeparator, but then you don't see a , but the parser expects that.

Altenatively, you could make listSeparator backtrack by doing:

val listSeparator = comma.surroundedBy(whitespaces0).backtrack

@precision-software
Copy link
Author

Thanks for the explanation. It pointed me to a reasonable solution.

Rather than introducing backtracking just to eliminate whitespace, I'm removing whitespace after each token, rather than both before and after. I created a ".strip" method to skip over trailing whitespace.

extension[T](self:Parser[T]) def strip = self <* (whitespace.rep0.void)

I fell into my way of errors by following the README example. Given the pitfalls, it probably isn't good practice to use surroundedBy to remove whitespace.

@johnynek
Copy link
Collaborator

you are right. We need to improve the readme... it compiles, but it actually isn't great for the same reason.

@johnynek
Copy link
Collaborator

actually... .soft.surroundedBy(a) seems like a good idea to have, which would be a.soft *> p.soft <* a

What do you think of that API?

@johnynek
Copy link
Collaborator

done in #289 #290

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants