New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide ability to reference underlying source from a returned JParsec object #5
Comments
Definitely reasonable! Will have a look at it tomorrow (2013-06-17) and assess complexity of needed devs. |
Any update?
|
So once you have a Parser that returns the fully parsed object, parser.source() is a parser that returns the original source string. Is that useful? |
Yes, sounds like it could be useful. Is it easy to to implement? |
I wonder whether you can retrieve both an Object and the source of the object using the same Parser object? I will try to build an example using the proposed technique and push it to the repo. |
looking forward to seeing the results. |
source() is available in v2.0: http://jparsec.codehaus.org/jparsec2/api/org/codehaus/jparsec/Parser.html#source() Though to get both the object and the underlying source, maybe try token(): http://jparsec.codehaus.org/jparsec2/api/org/codehaus/jparsec/Parser.html#token() Token has the value, and the index and length in the original source. |
Actually, I really don't see how this could work as is. Method But working on it I had a simpler idea than I initially thought: Provide a method WDYT ? |
* Locatable interface is called by successful parser when implemented by result
hmm, does that mean adding the same interface to every parser object, as an On Thu, Jun 20, 2013 at 9:15 PM, Arnaud Bailly notifications@github.comwrote:
|
So token() may feel a bit crude but it seems to work if you already have access to the source at parser creation time: final String source = ...;
parser.token().map(token-> {
return new AnnotatedFoo((Foo) token.getValue(), source.substring(token.getIndex(), token.length()));
}); |
Some more alternatives that're supported in v2:
Mapper.curry(LocationAnnotated.class).sequence(Parsers.INDEX, parser, Parsers.INDEX);
public class LocationAnnotated<T> {
public LocationAnnotated(int begin, T value, int end) {...}
}
Parser<SourceAnnotated<T>> annotateWithSource(Parser<T> parser) {
return Mapper.curry(SourceAnnotated.class).sequence(parser.source().peek(), parser);
}
public class SourceAnnotated<T> {
public SourceAnnotated(String source, T value) {...}
} |
@jasons2000 Yes, you would need to add the interface to each node you want to make Locatable. @fluentfuture Quite a few different options, indeed! Looks like I still have a lot to learn about the capabilities of jparsec :-) I have committed a small piece of code implementing the Locatable idiom so that you may try it. Let us know which solution suits you best. I will add a small section on the wiki to document this feature which might interest other people. |
thanks for you help guys, I'm going to take a look at the weekend |
Arnaud, I have concerns over the Locatable approach. It's a side-effect based solution while jparsec has been immutable and functional so far. I'm afraid introducing side-effect could cause surprising or confusing behavior or even bugs. Here's one use case I was thinking: class Foo implements Locatable {...}
Parser<Foo> fooParser = (...).peek(); The 'fooParser' will return foo, but its source will be empty, counter-intuitively. sequence(INDEX, parser.peek(), INDEX) has the same effect. But at least it's obvious to code reader that the INDEX is obtained after peek(). I'm also unsure whether Locatable will make it more difficult to implement incremental parser. WDYT? |
Ben, I hear and understand your concerns. I initially was not very confident with this solution and now I see it is probably because it somehow feels clunky with respect to the rest of jparsec. However, I was guided by the fact it does not clutter much the parser's grammar and decouples the expression of the language we are parsing from the technical details of how it is constructed. Using What about something like:
Instead of relying on a side-effect, return a wrapped object from which you can retrieve both the location information and the successful parser? Locatable would be a functor then, providing common mapping operations to allow composition inside a locatable while preserving the structure of the parse tree. Another possibility would be to augment the Both solutions would better fit an incremental parser. Anyhow, I do not take it as personal offense to be contradicted by the creator of the library so feel free to (positively) criticize what I do, this is ok with me. I will revert the current Locatable and move it to a branch, which I should have done in the first place anyway. |
Thanks for continuing the brainstorming! On Jun 25, 2013, at 12:09 PM, Arnaud Bailly notifications@github.com wrote:
I suppose by "intrusive", you meant "less convenient" because intrusive can also mean that the user code has to implement a framework specified interface, in which sense, implementing Locatable is more intrusive because user domain code would then need to implement a jparsec-specific interface, as opposed to being pojo. Using INDEX isn't very convenient since if user wants to track index for all of his tree nodes, he'll have to have a lot of Parser<LocationAnnotated> and every of his parser would need to be wrapped in annotateWithLocation(). So yeah, I definitely agree that using INDEX or an annotateWithLocation() helper leaves much to be desired. I was just not sure how strong the demand is and was hoping to learn from actual use cases before trying to think about the API. I'm very curious about what jsons2000 find out in his use case, since it's the first time we hear about it.
locate() is almost similar to the annotateWithLocation() method, with 2 main differences:
As I said above, I'm still not very sure how much this is needed to justify a core API addition vs. a small helper in user's code. It's not clear to me either whether user code really needs the source string in addition to the indices. Jason's original question was: "either as a "image" ie String or through pointers to the original source, perhaps using two Location objects,begin and end". So it sounded like he's fine without the actual "image". Creating a lot of substring objects may not be the most efficient if most clients end up only using the indices anyway. So, I guess what I'm saying is: I'd like to learn a bit more about the requirement/use-case.
Again, thank you!
|
You are right, "less convenient" suits better what I wanted to say than "more intrusive": Forcing implementing an interface is actually quite intrusive! About the "alternate" And finally yes, let's wait about @jasons2000's feedback. |
HI Guys, I didn't get the chance to look at this at the weekend as planned, I've been following the debate keenly, and agree that the original proposal I'd expect to be able to walk the parse tree and print of the tokens as I Should be able to give you good feedback by the weekend. Jason You are right, "less convenient" suits better what I wanted to say than And finally yes, let's wait about @jasons2000https://github.com/jasons2000's — |
* Locatable interface is called by successful parser when implemented by result
@jasons2000 Did you manage to test one or another solution? Otherwise, I think we can close the issue for now and move on to as there seems to exist quite a few solutions to retrieve this information using current framework's implementation. I will update the documentation to reflect proposed solution that do not impact existing codebase. |
haven't had a chance t look at this in detail, but I will be getting to it Thanks for your help again On 12 August 2013 13:21, Arnaud Bailly notifications@github.com wrote:
|
I would like to produce an annotated version of an input source file. To be able to do this one needs the ability to reference the original source, either as a "image" ie String or through pointers to the original source, perhaps using two Location objects,begin and end. It should include any white spaces too.
Does this sound reasonable?
The text was updated successfully, but these errors were encountered: