Why does SourcePos have line and column, but no token number? #17

gibiansky · 2015-01-03T04:07:14Z

The SourcePos data structure seems to expose line number and column number, which is great for error messages, but it'd be really nice to also have the character number in the file. This would enable things like the combinator discussed here.

Is there a reason this doesn't exist? Would it be hard to add? Is there a better way to write that combinator?

aslatter · 2015-01-03T05:19:38Z

Reading your request literally:

Currently the SourcePos data structure is used only for error reporting. It is the responsibility of the running parser to update it in whatever way makes sense for that parser, and the "API" for doing so is public, so adding fields could break parsers built on top of Parsec that use a custom token type.

For example, the built-in character parsers are all built on top of Text.Parsec.Char.satisfy, which is a wrapper around Text.Parsec.Prim.tokenPrim; satisfy tells tokenPrim to use the function updatePosChar to update the SourcePos.
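
To make that layering concrete, here is a minimal sketch of a satisfy-style parser written directly against tokenPrim. The names mySatisfy and lowerWord are made up for illustration and are not part of Parsec; the real satisfy may differ in detail.

```haskell
import Text.Parsec (Parsec, ParseError, parse, many1)
import Text.Parsec.Prim (tokenPrim)
import Text.Parsec.Pos (updatePosChar)

-- Hypothetical re-creation of a satisfy-style parser (not Parsec's own code).
-- tokenPrim takes: how to show a token, how to advance the SourcePos,
-- and a test that decides whether the token is accepted.
mySatisfy :: (Char -> Bool) -> Parsec String u Char
mySatisfy p = tokenPrim showTok nextPos test
  where
    showTok c       = show [c]
    -- The position update is entirely the caller's choice; for Char
    -- streams updatePosChar handles newlines and tabs.
    nextPos pos c _ = updatePosChar pos c
    test c          = if p c then Just c else Nothing

-- Small usage helper: parse one or more lowercase ASCII letters.
lowerWord :: String -> Either ParseError String
lowerWord = parse (many1 (mySatisfy (`elem` ['a' .. 'z']))) "<input>"
```

The point is that the position-update function is an argument supplied by the primitive parser, which is why changing the shape of SourcePos would ripple out to every custom token parser.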

Digging a bit deeper:

It looks like you actually want to track the tokens consumed, not necessarily the characters consumed.

This would be different from the reported source position when we're parsing a stream of some other token type that has location information embedded in the token itself. There, the expectation is that the primitive parsers for the token type lift the token's source position into the SourcePos type as they parse (for the Char token type this lifting is pretty simple).
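
As a rough illustration of that lifting, the sketch below parses a stream of pre-lexed tokens that carry their own positions, using Text.Parsec.Prim.token. The Tok type, anyTok, positionAfterOne and sampleToks are all hypothetical names invented for this example.

```haskell
import Text.Parsec (Parsec, ParseError, SourcePos, parse, getPosition,
                    sourceLine, sourceColumn)
import Text.Parsec.Prim (token)
import Text.Parsec.Pos (newPos)

-- Hypothetical token type: each token remembers where the lexer found it.
data Tok = Tok { tokPos :: SourcePos, tokText :: String }

-- Accept any token; `token` is told how to show a token and how to
-- read the SourcePos embedded in it.
anyTok :: Parsec [Tok] u String
anyTok = token tokText tokPos (Just . tokText)

-- Consume exactly one token, then report the parser's (line, column).
positionAfterOne :: [Tok] -> Either ParseError (Int, Int)
positionAfterOne = parse (anyTok *> fmap lineCol getPosition) "example"
  where
    lineCol p = (sourceLine p, sourceColumn p)

-- Two tokens that are far apart in the (imaginary) source file.
sampleToks :: [Tok]
sampleToks =
  [ Tok (newPos "example" 1 1) "let"
  , Tok (newPos "example" 3 5) "x"
  ]
```

With sampleToks, only one token has been consumed, yet the reported position is taken from the embedded location of the following token rather than from any running count, so a token count and a SourcePos are genuinely separate pieces of information.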

A token count would live in a new field in Text.Parsec.Prim.State. You'd need to figure out how to update the functions tokens, token and tokenPrim in Text.Parsec.Prim so that they maintain it. I'd also want to know how this affects the run time and memory usage of Parsec-based parsers.
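
Short of changing Parsec itself, something similar can be approximated in user code by keeping the count in the user state. The sketch below is one possible workaround, not the State change described above: it assumes the user state is a plain Int and uses tokenPrimEx, which accepts an optional user-state update alongside the position update. countingSatisfy and consumedCount are hypothetical names.

```haskell
import Text.Parsec (Parsec, ParseError, runParser, getState, many)
import Text.Parsec.Prim (tokenPrimEx)
import Text.Parsec.Pos (updatePosChar)

-- Hypothetical satisfy variant whose user state (an Int) counts
-- every token it consumes.
countingSatisfy :: (Char -> Bool) -> Parsec String Int Char
countingSatisfy p =
  tokenPrimEx
    (\c -> show [c])                       -- how to show a token
    (\pos c _ -> updatePosChar pos c)      -- usual line/column update
    (Just (\_ _ _ n -> n + 1))             -- bump the count in user state
    (\c -> if p c then Just c else Nothing)

-- Consume everything up to (not including) the first ';' and return
-- how many characters were consumed.
consumedCount :: String -> Either ParseError Int
consumedCount = runParser (many (countingSatisfy (/= ';')) *> getState) 0 "<input>"
```

This only counts tokens consumed through countingSatisfy, so every primitive parser in the grammar would have to be built on it; that limitation is part of why a field in State, maintained by the library's own primitives, would be the more robust design.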

gibiansky · 2015-01-03T05:40:40Z

Yes, indeed, your deeper reading is what I was going for: a count of the number of tokens consumed. You are right that SourcePos is perhaps not the right location for that. I was also unsure whether this already existed and I had simply missed it.

So, it sounds like there is no fundamental reason this doesn't exist; it is more a lack of need, or a lack of effort to do it. Thanks! I do not think that I will be implementing this, as I have not personally needed it and do not think it is a good idea to implement features "just in case" someone needs them someday. So you can probably go ahead and close the issue. Thanks for the clarification.

aslatter · 2015-01-05T20:27:06Z

Okay!

I'm not opposed to this sort of thing, but I'll close the issue until someone wants to push for it.

aslatter closed this as completed Jan 5, 2015