Help with parsing that "hangs" #52

sigvaldm · 2021-04-23T13:23:22Z

Dear community, thanks for the nice library. I am not very familiar with parser combinator libraries yet, but I sometimes encounter expressions that get stuck, such as this one:

>>> from parsy import regex
>>> spaces = regex(r'[ \t]*')
>>> word = regex(r'[a-zA-Z0-9\-._:%]*')
>>> words = word.sep_by(spaces)
>>> words.parse('ak kjd l  lksdjf')  # never returns

Any guidance on what I'm doing wrong here? Thank you.

bugaevc · 2021-04-23T13:29:17Z

Your example works if you replace the * with a + inside the regexes.

When you use a *, you're saying that an empty string is a valid match for spaces and also for word, so word.sep_by(spaces) can match an infinite number of empty words separated by empty spaces without ever consuming any input.
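
To make the lack of progress visible, here is a minimal sketch that uses only the standard `re` module. It is not parsy's implementation: `sep_by_trace` and its `max_rounds` cap are hypothetical names introduced for illustration. It alternates "match a word, then match a separator" the way a separated-list parser would, and stops after a fixed number of rounds so it cannot hang.

```python
import re


def sep_by_trace(word_pattern, sep_pattern, text, max_rounds=8):
    """Hypothetical helper (not part of parsy): alternate word and
    separator matches, giving up after max_rounds iterations."""
    word = re.compile(word_pattern)
    sep = re.compile(sep_pattern)
    pos = 0
    words = []
    for _ in range(max_rounds):
        m = word.match(text, pos)
        if m is None:
            break  # no further word: the list is finished
        words.append(m.group())
        pos = m.end()
        s = sep.match(text, pos)
        if s is None:
            break  # no separator: the list is finished
        pos = s.end()
    return words, pos


text = 'ak kjd l  lksdjf'

# With *, both patterns match the empty string at the end of the input,
# so only the max_rounds cap stops the loop.
stuck_words, stuck_pos = sep_by_trace(r'[a-zA-Z0-9\-._:%]*', r'[ \t]*', text)

# With +, the separator fails at the end of the input and the loop stops.
fixed_words, fixed_pos = sep_by_trace(r'[a-zA-Z0-9\-._:%]+', r'[ \t]+', text)

print(stuck_words)
print(fixed_words)
```

In the `*` version the position stays at the end of the input while empty words keep being appended, which is the loop that never ends when no cap is in place. In this sketch, keeping `*` for the separator and using `+` only for the word also terminates, because the word pattern can then no longer match an empty string.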

sigvaldm · 2021-04-23T13:32:34Z

My goodness! I had not thought about that. Thank you. Is there any caveat in leaving * in spaces but removing it from word (i.e. to allow zero or more spaces between them)?

sigvaldm · 2021-04-23T13:33:46Z

Never mind, it would mean that a long word could be parsed as one word per letter, I suppose. In any case, thank you.

sigvaldm closed this as completed Apr 23, 2021
