Token parser consumes newlines #24

Open
jkarni opened this issue Jan 29, 2015 · 8 comments

Comments

@jkarni

jkarni commented Jan 29, 2015

As described in this SO question, many of the token parser functions consume newlines, which is often not the desired behaviour. It's also not easy, without copying most of the code, to change that. It'd be nice if this behaviour were configurable!
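A minimal sketch of the behaviour, assuming a token parser built from `emptyDef` (the name `restAfterIdentifier` is made up for this illustration): after `identifier` succeeds, the newline that followed it is no longer in the remaining input.

```haskell
import Text.Parsec (getInput, parse)
import Text.Parsec.Language (emptyDef)
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as Tok

-- A token parser with Parsec's default (empty) language definition.
lexer :: Tok.TokenParser ()
lexer = Tok.makeTokenParser emptyDef

-- Hypothetical helper: parse one identifier, then return the unconsumed input.
restAfterIdentifier :: Parser String
restAfterIdentifier = Tok.identifier lexer *> getInput

main :: IO ()
main = print (parse restAfterIdentifier "" "foo\nbar")
-- prints: Right "bar"   (the '\n' was skipped as trailing whitespace)
```

A line-oriented grammar that expects to see that newline as a separator therefore never gets the chance to match it.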

@doppioandante

It would be nice indeed

@albertnetymk

Looking at the source, we can see that isSpace is used. I think we could add one more field to the LanguageDef record, e.g. treatNewLineAsSpace :: Bool, which controls whether newlines are skipped along with the other characters that isSpace from Data.Char accepts. I could create a PR if this proposal sounds sensible.
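A rough sketch of what such a flag could mean, written as standalone combinators rather than as a change to Parsec itself. `treatNewLineAsSpace` does not exist in `LanguageDef`; here it is only a function argument, and `spaceChar`, `whiteSpace'`, `lexeme'`, `line` and `file` are hypothetical names for this example.

```haskell
import Data.Char (isSpace)
import Text.Parsec (letter, many1, newline, parse, satisfy, sepEndBy, skipMany)
import Text.Parsec.String (Parser)

-- Which characters count as skippable whitespace, depending on the
-- (hypothetical) treatNewLineAsSpace flag.
spaceChar :: Bool -> Char -> Bool
spaceChar treatNewLineAsSpace c
  | treatNewLineAsSpace = isSpace c
  | otherwise           = isSpace c && c /= '\n'

-- Skip zero or more whitespace characters.
whiteSpace' :: Bool -> Parser ()
whiteSpace' flag = skipMany (satisfy (spaceChar flag))

-- Run a parser, then skip trailing whitespace.
lexeme' :: Bool -> Parser a -> Parser a
lexeme' flag p = p <* whiteSpace' flag

-- One line: one or more words.
line :: Bool -> Parser [String]
line flag = many1 (lexeme' flag (many1 letter))

-- A file: lines separated (and optionally terminated) by newlines.
file :: Bool -> Parser [[String]]
file flag = line flag `sepEndBy` newline

main :: IO ()
main = do
  print (parse (file False) "" "a b\nc d\n")
  -- prints: Right [["a","b"],["c","d"]]
  print (parse (file True) "" "a b\nc d\n")
  -- prints: Right [["a","b","c","d"]]
```

With the flag off, newlines survive token parsing and can act as line separators; with it on, the line structure is lost, which is the behaviour reported in this issue.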

@mrkkrp
Contributor

mrkkrp commented Jul 6, 2015

> I could create a PR if this proposal sounds sensible.

This would be cool. I'm also interested in hearing @aslatter's opinion on the improvement.

@doppioandante

@albertnetymk What if someone wants to treat other whitespace characters, such as tab, separately? I have never needed this, but I'm wondering whether a more general solution can be achieved.

@albertnetymk

@doppioandante One simple solution could be to add one more field, like treatTabAsSpace :: Bool. At that point it would probably be worth coming up with a more general solution. For now, I think treating newlines specially is enough and avoids over-engineering.
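One possible shape for the more general solution, sketched under the assumption that the caller supplies a predicate for skippable characters instead of one Bool per character. `lexemeBy` and `plainSpace` are hypothetical names, not part of Parsec.

```haskell
import Data.Char (isSpace)
import Text.Parsec (digit, getInput, many1, parse, satisfy, skipMany)
import Text.Parsec.String (Parser)

-- Hypothetical combinator: run a parser, then skip every following
-- character for which the given predicate holds.
lexemeBy :: (Char -> Bool) -> Parser a -> Parser a
lexemeBy skippable p = p <* skipMany (satisfy skippable)

-- Whitespace as isSpace defines it, except newlines and tabs.
plainSpace :: Char -> Bool
plainSpace c = isSpace c && c `notElem` "\n\t"

-- A number token that leaves tabs and newlines in the input.
number :: Parser String
number = lexemeBy plainSpace (many1 digit)

main :: IO ()
main = print (parse (number *> getInput) "" "42  \t\nrest")
-- prints: Right "\t\nrest"
```

A predicate covers the newline and tab cases at once, at the cost of a slightly larger API surface than a single Bool field.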

@mrkkrp
Contributor

mrkkrp commented Sep 2, 2015

@jkarni, @doppioandante Work has started on this issue in Megaparsec, see the new-lexer branch. While it may not be entirely usable yet, you could try it and give your feedback. I've posted a comment in the dedicated issue thread: mrkkrp/megaparsec#5.

@doppioandante

@mrkkrp Cool, I had solved it by forking parsec and introducing a dirty hack.
My project has a really dumb parser (basically some fake asm), but I'll certainly try switching to megaparsec and see if it clicks.

@mrkkrp
Contributor

mrkkrp commented Sep 2, 2015

@doppioandante, Thanks! If it doesn't click, please describe your experience so we can correct our shortcomings. Megaparsec is almost ready for its first release; the lexer ("token parser" in Parsec's terminology) is the only thing that's left.

4 participants