Token parser consumes newlines #24

jkarni · 2015-01-29T14:50:29Z

As described in this SO question, many of the token parser functions consume newlines, which is often not the desired behaviour. It's also not easy, without copying most of the code, to change that. It'd be nice if this behaviour were configurable!
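A minimal reproduction of the behaviour described above, assuming a token parser built from `emptyDef` (the names `lexer`, `line` and `parseLine` are illustrative and not taken from the SO question). The intent is to read one line of identifiers terminated by a newline, but the lexeme parser has already skipped the newline by the time `char '\n'` runs:

```haskell
import Text.Parsec (ParseError, char, many1, parse)
import Text.Parsec.Language (emptyDef)
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as Tok

-- A token parser with the default (empty) language definition.
lexer :: Tok.TokenParser ()
lexer = Tok.makeTokenParser emptyDef

-- Lexeme parser: reads an identifier, then skips trailing white space,
-- which includes newlines.
identifier :: Parser String
identifier = Tok.identifier lexer

-- Intended meaning: the identifiers on one line, then the line break.
line :: Parser [String]
line = many1 identifier <* char '\n'

parseLine :: String -> Either ParseError [String]
parseLine = parse line "<input>"

main :: IO ()
main = do
  -- The identifiers of both lines are collected, across the line break.
  print (parse (many1 identifier) "<input>" "foo bar\nbaz\n")
  -- The line parser fails: no newline is left for `char '\n'` to match.
  putStrLn (either (const "line: parse failed") show (parseLine "foo bar\nbaz\n"))
```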

doppioandante · 2015-05-07T17:45:15Z

It would be nice indeed

albertnetymk · 2015-07-03T22:57:34Z

Looking at the source, we can see that isSpace is used to decide what the token parsers skip. I think we could add one more field to the LanguageDef record, like treatNewLineAsSpace :: Bool, which controls how isSpace from Data.Char is applied. I could create a PR if this proposal sounds sensible.
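A rough sketch of how such a flag could behave, written as a standalone lexer rather than as a patch to Parsec. `SpaceConfig` is a hypothetical stand-in for the proposed `LanguageDef` field, and `isSkippable`, `whiteSpace`, `lexeme`, `word`, `line` and `parseLines` are illustrative names; treating `'\r'` like `'\n'` is also an assumption:

```haskell
import Data.Char (isSpace)
import Text.Parsec (ParseError, Parsec, char, eof, letter, many1, parse, satisfy, skipMany)

type Parser = Parsec String ()

-- Hypothetical stand-in for the proposed LanguageDef field.
newtype SpaceConfig = SpaceConfig { treatNewLineAsSpace :: Bool }

-- Characters a lexeme may skip: line breaks only when the flag is set,
-- everything else is decided by isSpace as before.
isSkippable :: SpaceConfig -> Char -> Bool
isSkippable cfg c
  | c == '\n' || c == '\r' = treatNewLineAsSpace cfg
  | otherwise              = isSpace c

whiteSpace :: SpaceConfig -> Parser ()
whiteSpace cfg = skipMany (satisfy (isSkippable cfg))

-- Run a parser, then skip the configured white space.
lexeme :: SpaceConfig -> Parser a -> Parser a
lexeme cfg p = p <* whiteSpace cfg

word :: SpaceConfig -> Parser String
word cfg = lexeme cfg (many1 letter)

-- One line of words, terminated by an explicit newline.
line :: SpaceConfig -> Parser [String]
line cfg = many1 (word cfg) <* char '\n'

parseLines :: SpaceConfig -> String -> Either ParseError [[String]]
parseLines cfg = parse (many1 (line cfg) <* eof) "<input>"

main :: IO ()
main = do
  -- Flag off: newlines survive the lexemes, so the input splits into lines.
  print (parseLines (SpaceConfig False) "a b\nc d\n")
  -- Flag on: the lexemes swallow the newlines and the line parser fails.
  putStrLn (either (const "parse failed") show (parseLines (SpaceConfig True) "a b\nc d\n"))
```

With the flag set to `True` the sketch reproduces the current behaviour, so a default of `True` would keep existing grammars working.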

mrkkrp · 2015-07-06T05:37:54Z

> I could create a PR if this proposal sounds sensible.

This would be cool. I'm also interested in hearing @aslatter's opinion about the improvement.

doppioandante · 2015-07-06T19:16:10Z

@albertnetymk What if someone wants to treat other space characters, like tab, separately? I have never needed this, but I'm wondering whether a more general solution can be achieved.

albertnetymk · 2015-07-06T19:21:35Z

@doppioandante One simple solution could be to add one more field, like treatTabAsSpace :: Bool. By that point, it would probably be worth coming up with a more general solution. For now, I think treating newlines specially is enough and avoids over-engineering.
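One possible shape for the more general solution mentioned above is to pass a character predicate instead of adding one Bool per character. This is only a sketch under that assumption, not something the thread settled on; `lexemeWith`, `spaceOnly`, `field`, `row` and `parseRow` are hypothetical names:

```haskell
import Text.Parsec (ParseError, Parsec, char, letter, many1, parse, satisfy, sepBy1, skipMany)

type Parser = Parsec String ()

-- Hypothetical generalisation: the caller decides which characters a
-- lexeme skips, instead of toggling one Bool per character.
lexemeWith :: (Char -> Bool) -> Parser a -> Parser a
lexemeWith isSkippable p = p <* skipMany (satisfy isSkippable)

-- Skip plain spaces only; tabs and newlines stay significant.
spaceOnly :: Char -> Bool
spaceOnly = (== ' ')

-- A field is a run of space-separated words.
field :: Parser [String]
field = many1 (lexemeWith spaceOnly (many1 letter))

-- Fields are separated by tabs; a row ends at the newline.
row :: Parser [[String]]
row = field `sepBy1` char '\t' <* char '\n'

parseRow :: String -> Either ParseError [[String]]
parseRow = parse row "<input>"

main :: IO ()
main = print (parseRow "a b\tc\n")
```

Passing `isSpace` from Data.Char as the predicate would give back the behaviour the issue complains about, so both proposed flags become special cases of one argument.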

mrkkrp · 2015-09-02T14:12:09Z

@jkarni, @doppioandante Work on this issue has started in Megaparsec, see branch new-lexer. While it may not be entirely usable yet, you could try it and give your feedback. I've posted a comment in the dedicated issue thread: mrkkrp/megaparsec#5.

doppioandante · 2015-09-02T15:01:46Z

@mrkkrp Cool, I had solved it by forking parsec and introducing a dirty hack.
My project has a really dumb parser (basically some fake asm) but I'll certainly try to switch to megaparsec and see if it clicks.

mrkkrp · 2015-09-02T15:04:28Z

@doppioandante, Thanks! If it doesn't click, please describe your experience, so we can correct our shortcomings. Megaparsec is almost ready for its first release; the lexer ("token parser" in Parsec's terminology) is the only thing that's left.

minad mentioned this issue Jul 14, 2015

Text.Parsec.Language.LanguageDef - Add spaceChar #41

Closed

mrkkrp mentioned this issue Jul 30, 2015

Add option to avoid automatic consumption of new-lines by lexemes mrkkrp/megaparsec#5

Closed

Chobbes mentioned this issue Aug 21, 2015

Option to change whitespace in token parsing. #44

Open
