Help with parsing that "hangs" #52

Closed
sigvaldm opened this issue Apr 23, 2021 · 3 comments

@sigvaldm

Dear community, thanks for the nice library. I am not very familiar with parser combinator libraries yet, but I sometimes encounter expressions that get stuck, such as this one:

>>> from parsy import regex
>>> spaces = regex(r'[ \t]*')
>>> word = regex(r'[a-zA-Z0-9\-._:%]*')
>>> words = word.sep_by(spaces)
>>> words.parse('ak kjd l  lksdjf')  # never returns

Any guidance on what I'm doing wrong here? Thank you.

@bugaevc
Member

bugaevc commented Apr 23, 2021

Your example works if you replace the * with a + inside the regexes.

When you use a *, you're saying that an empty string is both a valid spaces and a valid word, so word.sep_by(spaces) can match an infinite number of empty words separated by empty spaces as a prefix. Each of those matches succeeds without consuming any input, so the parser never makes progress.
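The zero-width match described above can be seen with the standard library alone. This is a minimal sketch using `re` directly rather than parsy; the names `star_word`, `plus_word`, `empty_match` and `no_match` are made up for illustration.

```python
import re

# Same character class as in the question, once with `*` and once with `+`.
star_word = re.compile(r'[a-zA-Z0-9\-._:%]*')
plus_word = re.compile(r'[a-zA-Z0-9\-._:%]+')

text = 'ak kjd'
end = len(text)

# With `*`, matching at the end of the input still succeeds: it matches the
# empty string and the position does not advance.
empty_match = star_word.match(text, end)

# With `+`, at least one character is required, so the match fails and a
# repetition built on top of it has a reason to stop.
no_match = plus_word.match(text, end)

print(repr(empty_match.group()), empty_match.end() == end, no_match)
```

A repetition that keeps asking for "one more word" will keep getting the empty match from the `*` version, which is consistent with the hang reported here.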

@sigvaldm
Author

My goodness! I had not thought about that. Thank you. Is there any caveat to keeping the * in spaces but not in word (i.e. to allow zero or more spaces between words)?

@sigvaldm
Author

Never mind, it would mean that a long word could be parsed as one word per letter, I suppose. In any case, thank you.
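For anyone wanting to try the "zero or more spaces, one or more word characters" variant, here is a rough standard-library sketch. The `sep_by` function below is a hypothetical toy stand-in written for this example, not parsy's implementation, so parsy's behaviour may differ in details. In this sketch the greedy `+` takes the longest run of word characters, so a long word comes out as one item rather than one item per letter.

```python
import re

def sep_by(word, sep, text):
    """Toy sep_by: match word, then (sep, word) pairs until one of them fails."""
    items = []
    pos = 0
    first = word.match(text, pos)
    if first is None:
        return items, pos
    items.append(first.group())
    pos = first.end()
    while True:
        s = sep.match(text, pos)
        if s is None:
            break
        w = word.match(text, s.end())
        if w is None:
            # The separator may be empty, but the word may not, so the loop
            # ends once no word characters are left.
            break
        items.append(w.group())
        pos = w.end()
    return items, pos

spaces = re.compile(r'[ \t]*')               # zero or more separators
word = re.compile(r'[a-zA-Z0-9\-._:%]+')     # at least one word character

result, end = sep_by(word, spaces, 'ak kjd l  lksdjf')
print(result, end)
```

The loop terminates because every iteration that continues consumes at least one word character.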
