
Why sourcePos has line and column, but no token number? #17

Closed
gibiansky opened this issue Jan 3, 2015 · 3 comments

Comments

@gibiansky

The SourcePos data structure seems to expose line number and column number, which is great for error messages, but it'd be really nice to also have the character number in the file. This would enable things like the combinator discussed here.

Is there a reason this doesn't exist? Would it be hard to add? Is there a better way to write that combinator?

@aslatter
Collaborator

aslatter commented Jan 3, 2015

Reading your request literally:

Currently the SourcePos data structure is only used for error reporting. It is the responsibility of the running parser to update it in whatever way makes sense for it, and the "API" for doing so is public, so adding fields could break parsers built on top of Parsec that use a custom token type.

For example, the built-in character parsers are all built on top of Text.Parsec.Char.satisfy, which is a wrapper around Text.Parsec.Prim.tokenPrim; satisfy tells tokenPrim to use the function updatePosChar to update the SourcePos.
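To make that layering concrete, here is a minimal sketch of a satisfy-style parser written directly against tokenPrim. The names `satisfy'` and `digitsThenPos` are made up for illustration; only `tokenPrim`, `updatePosChar`, `getPosition`, `many` and `runParser` come from Parsec itself.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Data.Char (isDigit)
import Text.Parsec.Error (ParseError)
import Text.Parsec.Pos (sourceColumn, sourceLine, updatePosChar)
import Text.Parsec.Prim (ParsecT, Stream, getPosition, many, runParser, tokenPrim)

-- Hypothetical re-implementation of a satisfy-style parser, for illustration.
satisfy' :: Stream s m Char => (Char -> Bool) -> ParsecT s u m Char
satisfy' f =
  tokenPrim
    (\c -> show [c])                         -- how to show the token in error messages
    (\pos c _rest -> updatePosChar pos c)    -- how to advance the SourcePos
    (\c -> if f c then Just c else Nothing)  -- which tokens to accept

-- Parse leading digits, then report the (line, column) reached.
digitsThenPos :: String -> Either ParseError (String, (Int, Int))
digitsThenPos = runParser p () "<input>"
  where
    p = do
      ds  <- many (satisfy' isDigit)
      pos <- getPosition
      return (ds, (sourceLine pos, sourceColumn pos))
```

The position update is just an argument to tokenPrim, so a parser over a different token type can pass whatever update function suits it.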

Digging a bit deeper:

It looks like you actually want to track the tokens consumed, not necessarily the characters consumed.

This would differ from the reported source position when we're parsing a stream of some other token type that embeds location information in the token itself. The expectation there is that the primitive parsers for the token type lift the token's source position into the SourcePos type as they parse (for the Char token type this lifting is pretty simple).
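As a rough illustration of that lifting, assume a hypothetical lexer output type `Tok` that carries its own SourcePos. The type `Tok`, the parser `word` and the sample data are all invented here; `token`, `newPos`, `getPosition` and `runParser` are Parsec's own.

```haskell
import Text.Parsec.Error (ParseError)
import Text.Parsec.Pos (SourcePos, newPos, sourceColumn, sourceLine)
import Text.Parsec.Prim (Parsec, getPosition, runParser, token)

-- Hypothetical token type produced by a separate lexer; each token
-- remembers where it started in the original source text.
data Tok = Tok { tokPos :: SourcePos, tokText :: String }

-- Accept exactly one token with the given text. 'token' takes the
-- position from the tokens themselves instead of counting characters.
word :: String -> Parsec [Tok] u String
word s = token tokText tokPos (\t -> if tokText t == s then Just s else Nothing)

-- Two tokens, as if lexed from the text "let x".
sample :: [Tok]
sample = [Tok (newPos "demo" 1 1) "let", Tok (newPos "demo" 1 5) "x"]

-- After consuming "let", the parser position is the position of the next token.
posAfterLet :: Either ParseError (Int, Int)
posAfterLet = runParser p () "demo" sample
  where
    p = do
      _   <- word "let"
      pos <- getPosition
      return (sourceLine pos, sourceColumn pos)
```

Here one token was consumed but the column moved from 1 to 5, which is why a token count and a source position are separate pieces of information.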

A token count would live in a new field in Text.Parsec.Prim.State. You'd need to figure out how to update the tokens, token and tokenPrim functions in Text.Parsec.Prim to maintain it. I'd also want to know how this affects the run-time and memory usage of Parsec-based parsers.
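Short of changing Parsec, something similar can be approximated in user code by keeping the count in the user state. The sketch below is not the State-field design described above; it assumes the user state is a plain Int and that every primitive parser goes through a counting wrapper. `countedSatisfy`, `withTokenCount` and `consumedBy` are hypothetical names; `tokenPrimEx` and `getState` are existing Parsec functions.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Data.Char (isAlpha)
import Text.Parsec.Error (ParseError)
import Text.Parsec.Pos (updatePosChar)
import Text.Parsec.Prim (ParsecT, Stream, getState, many, runParser, tokenPrimEx)

-- Like satisfy, but also bumps an Int counter kept in the user state
-- each time a token is accepted (assumption: user state is just the count).
countedSatisfy :: Stream s m Char => (Char -> Bool) -> ParsecT s Int m Char
countedSatisfy f =
  tokenPrimEx
    (\c -> show [c])                         -- show the token in error messages
    (\pos c _rest -> updatePosChar pos c)    -- usual SourcePos update
    (Just (\_pos _c _rest n -> n + 1))       -- user-state update: one more token
    (\c -> if f c then Just c else Nothing)  -- which tokens to accept

-- Run a parser and also return how many tokens it consumed.
withTokenCount :: Monad m => ParsecT s Int m a -> ParsecT s Int m (a, Int)
withTokenCount p = do
  before <- getState
  x      <- p
  after  <- getState
  return (x, after - before)

-- Example: count the letters consumed from the front of a string.
consumedBy :: String -> Either ParseError (String, Int)
consumedBy = runParser (withTokenCount (many (countedSatisfy isAlpha))) 0 "<input>"
```

Because the user state is part of the parser state, backtracking restores the counter along with the input. The obvious limitation is that parsers built on the stock satisfy or string bypass the counter, which is part of why a real implementation would have to live inside Text.Parsec.Prim.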

@gibiansky
Author

Yes, indeed, your deeper reading is what I was going for: a count of the number of tokens consumed. Perhaps SourcePos is not the right place for that; you are right. I was also unsure whether this already existed and I had simply missed it.

So, it sounds like there is no fundamental reason this doesn't exist; it's more a lack of need, or a lack of effort to do it. Thanks! I do not think that I will be implementing this, as I have not personally needed it and do not think it is a good idea to implement features "just in case" someone needs them someday. So you can probably go ahead and close the issue. Thanks for the clarification.

@aslatter
Collaborator

aslatter commented Jan 5, 2015

Okay!

I'm not opposed to this sort of thing, but I'll close the issue until someone wants to push for it.

@aslatter aslatter closed this as completed Jan 5, 2015

2 participants