Overload operators as parser combinators #32

Closed
Marwes opened this issue May 28, 2015 · 10 comments

Comments

@Marwes
Owner

Marwes commented May 28, 2015

Since all parsers are currently functions or methods, I find that large parsers often become a bit of a word soup. Large chains of parsers can become rather hard to read. I have some ideas for what might be useful to implement, which I document below.

Tuples

Tuples could allow parsers which should be applied in sequence to be written as a single tuple:

string("if").with(expr()).skip(string("then")).and(expr()).skip(string("else")).and(expr())
    .map(|(b, (t, f))| Expr::IfElse(Box::new(b), Box::new(t), Some(Box::new(f))))

// With tuples
(string("if"), expr(), string("then"), expr(), string("else"), expr())
    .map(|(_, b, _, t, _, f)| Expr::IfElse(Box::new(b), Box::new(t), Some(Box::new(f))))
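To make the tuple idea concrete, here is a minimal sketch of how tuples could act as sequencing parsers. It uses a toy `Parser` trait over `&str` and a hypothetical `Str` parser standing in for `string(...)`; neither is combine's actual API, and the `Option`-based result type is an assumption made to keep the example short.

```rust
// Toy parser trait for illustration only; this is NOT combine's real `Parser` trait.
pub trait Parser {
    type Output;
    // On success, returns the parsed value together with the remaining input.
    fn parse<'a>(&mut self, input: &'a str) -> Option<(Self::Output, &'a str)>;
}

// Hypothetical stand-in for `string("...")`: matches a fixed prefix.
pub struct Str(pub &'static str);

impl Parser for Str {
    type Output = &'static str;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(Self::Output, &'a str)> {
        input.strip_prefix(self.0).map(|rest| (self.0, rest))
    }
}

// A 2-tuple of parsers runs its elements in order and yields a tuple of results.
impl<A: Parser, B: Parser> Parser for (A, B) {
    type Output = (A::Output, B::Output);
    fn parse<'a>(&mut self, input: &'a str) -> Option<(Self::Output, &'a str)> {
        let (a, rest) = self.0.parse(input)?;
        let (b, rest) = self.1.parse(rest)?;
        Some(((a, b), rest))
    }
}

// The same pattern for 3-tuples; larger arities would typically be generated by a macro.
impl<A: Parser, B: Parser, C: Parser> Parser for (A, B, C) {
    type Output = (A::Output, B::Output, C::Output);
    fn parse<'a>(&mut self, input: &'a str) -> Option<(Self::Output, &'a str)> {
        let (a, rest) = self.0.parse(input)?;
        let (b, rest) = self.1.parse(rest)?;
        let (c, rest) = self.2.parse(rest)?;
        Some(((a, b, c), rest))
    }
}
```

Because the output is a flat tuple, a closure passed to a `map`-style combinator can destructure it directly, as in `|(_, b, _, t, _, f)|`, instead of unpacking nested pairs.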

Strings and character literals

Strings and character literals could implement Parser directly, allowing them to be written without string and char.

("if", expr(), "then", expr(), "else", expr())
    .map(|(_, b, _, t, _, f)| Expr::IfElse(Box::new(b), Box::new(t), Some(Box::new(f))))

Use std::ops::* traits

The most likely candidate here is overloading | to work the same as the or parser. Unfortunately this won't work directly without changing the library rather radically since it is not possible to implement it as below.

impl<A: Parser, B: Parser> BitOr<B> for A {
    type Output = Or<A, B>;
    fn bitor(self, other: B) -> Or<A, B> { self.or(other) }
}

This will not work due to coherence: the impl targets the bare type parameter A, whereas A and B would have to appear inside some local type. The same applies to any other operator.

Since all of these could be seen as being too clever with the syntax, it would be nice to have some feedback on which of these (if any) would be good to implement.

@hawkw
Contributor

hawkw commented May 28, 2015

This is definitely something I'm in favour of, especially implementing parser for strings and character literals. Anything that makes the syntax easier to write and understand would be excellent, in my book.

Overloaded operators would be cool, but if it requires a great deal of changes internally, it certainly doesn't have to be a priority. I really like the use of operator overloading in the Scala parser combinators library, but making it work in Rust could be a bit of a struggle.

@Marwes
Owner Author

Marwes commented May 28, 2015

@hawkw I might actually have been a bit too optimistic about allowing strings and character literals as parsers. It would work, but they would need to be specialized to work on only a single type of input stream (probably &str). It may still be a bit of useful sugar, but it does feel like something that would be easy to trip up new users on.

The change I was thinking of to make operator overloading possible is to introduce a newtype which all parsers return, so instead of many(p: P) -> Many<P> it would be many(p: ParserT<P>) -> ParserT<Many<P>>. Unfortunately this makes it impossible to pass a parser by &mut reference (many(&mut parser)), which I find useful quite frequently. It may need some more thought to make it work smoothly.
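A minimal sketch of the newtype approach described above, assuming a toy `Parser` trait over `&str`. `ParserT` and `Or` follow the names used in this thread, but their definitions here, along with `Str` and the `string` helper, are illustrative assumptions and not combine's actual types.

```rust
use std::ops::BitOr;

// Toy parser trait for illustration only; NOT combine's real `Parser` trait.
pub trait Parser {
    type Output;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(Self::Output, &'a str)>;
}

// Hypothetical prefix-matching parser.
pub struct Str(pub &'static str);

impl Parser for Str {
    type Output = &'static str;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(Self::Output, &'a str)> {
        input.strip_prefix(self.0).map(|rest| (self.0, rest))
    }
}

// Tries the first parser and falls back to the second on failure.
pub struct Or<A, B>(A, B);

impl<A, B> Parser for Or<A, B>
where
    A: Parser,
    B: Parser<Output = A::Output>,
{
    type Output = A::Output;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(A::Output, &'a str)> {
        match self.0.parse(input) {
            Some(ok) => Some(ok),
            None => self.1.parse(input),
        }
    }
}

// The newtype that every parser-returning function would have to use.
pub struct ParserT<P>(pub P);

impl<P: Parser> Parser for ParserT<P> {
    type Output = P::Output;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(P::Output, &'a str)> {
        self.0.parse(input)
    }
}

// Accepted by the coherence rules: the impl is for the local type `ParserT<A>`,
// not for a bare type parameter `A`.
impl<A, B> BitOr<ParserT<B>> for ParserT<A>
where
    A: Parser,
    B: Parser<Output = A::Output>,
{
    type Output = ParserT<Or<A, B>>;
    fn bitor(self, other: ParserT<B>) -> ParserT<Or<A, B>> {
        ParserT(Or(self.0, other.0))
    }
}

// Hypothetical constructor that wraps its result in the newtype.
pub fn string(s: &'static str) -> ParserT<Str> {
    ParserT(Str(s))
}
```

With these definitions, `string("let") | string("fn") | string("if")` builds a `ParserT<Or<Or<Str, Str>, Str>>`. The cost is that every constructor and combinator has to wrap and unwrap `ParserT`, which is the invasive part of the change.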

@hawkw
Copy link
Contributor

hawkw commented May 28, 2015

@Marwes: Ah well. I haven't ever had to pass a &mut reference to a parser, but if you find that it's important, that should probably remain possible - my particular use case may not be typical. I'm already specializing to &str because of #21, so I for one would love being able to use literal parsers, but the current syntax should definitely remain available for unspecialized parsers.

@Marwes
Owner Author

Marwes commented May 29, 2015

Passing parsers by &mut is pretty common for combinators since they can't move the parsers they own (see And).

It would be possible to just add a method on ParserT which converts it like ParserT<P> -> ParserT<&mut P> but it would also need a method which converts ParserT<P> -> ParserT<&mut Parser> (trait object) and for Box etc. I have yet to think of a better way though.
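The following sketch illustrates both points under a toy `Parser` trait over `&str`: a blanket impl that makes `&mut P` a parser, and a conversion from `ParserT<P>` to `ParserT<&mut P>`. The method name `by_ref`, the `Str` parser, and the definitions of `Many` and `ParserT` are assumptions for illustration, not combine's API.

```rust
// Toy parser trait for illustration only; NOT combine's real `Parser` trait.
pub trait Parser {
    type Output;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(Self::Output, &'a str)>;
}

// Hypothetical prefix-matching parser.
pub struct Str(pub &'static str);

impl Parser for Str {
    type Output = &'static str;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(Self::Output, &'a str)> {
        input.strip_prefix(self.0).map(|rest| (self.0, rest))
    }
}

// A mutable reference to a parser is itself a parser, so a combinator can
// borrow a parser it does not own instead of moving it.
impl<'p, P: Parser> Parser for &'p mut P {
    type Output = P::Output;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(P::Output, &'a str)> {
        P::parse(&mut **self, input)
    }
}

// Applies the inner parser repeatedly and collects the results.
pub struct Many<P>(pub P);

impl<P: Parser> Parser for Many<P> {
    type Output = Vec<P::Output>;
    fn parse<'a>(&mut self, mut input: &'a str) -> Option<(Vec<P::Output>, &'a str)> {
        let mut items = Vec::new();
        while let Some((item, rest)) = self.0.parse(input) {
            if rest.len() == input.len() {
                break; // no progress was made; stop instead of looping forever
            }
            items.push(item);
            input = rest;
        }
        Some((items, input))
    }
}

// Newtype wrapper as proposed in the thread.
pub struct ParserT<P>(pub P);

impl<P: Parser> Parser for ParserT<P> {
    type Output = P::Output;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(P::Output, &'a str)> {
        self.0.parse(input)
    }
}

impl<P> ParserT<P> {
    // Hypothetical conversion `ParserT<P> -> ParserT<&mut P>`.
    pub fn by_ref(&mut self) -> ParserT<&mut P> {
        ParserT(&mut self.0)
    }
}

// With the newtype, `many` can only accept wrapped parsers.
pub fn many<P: Parser>(p: ParserT<P>) -> ParserT<Many<P>> {
    ParserT(Many(p.0))
}
```

Here `many(a.by_ref())` borrows `a` only for the duration of the call, so `a` can still be used afterwards. Trait objects and `Box` would each need a similar conversion, which is the awkward part noted above.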

@Marwes
Owner Author

Marwes commented Aug 2, 2015

Tuples as a sequencing parser are implemented in f8c15a2 (0.5.0).

With RangeStreams (#42) I am a bit more keen on string literals being overloaded as parsers, as the &str Stream feels a bit more important, which could warrant the extra sugar for it.

Operators won't happen for 1.0 as it would break everything and require a lot of rewriting.

@Marwes Marwes added this to the 1.1 milestone Aug 2, 2015
@Marwes Marwes removed this from the 1.1 milestone Dec 23, 2015
@Yamakaky

For reference, I like how nom handles it.
str and char literals would be cool too.

@Marwes
Owner Author

Marwes commented Oct 24, 2016

> For reference, I like how nom handles it.

Tuples are the equivalent constructs in combine. Were you thinking of some extension to that?

> str and char literals would be cool too.

Still torn on allowing the use of literals directly, as they can only work with a single stream type, which may make those streams (&str and &[u8]) seem more important than they should be. At least in the case of &str one would probably use State<&str> instead and only use a &str stream directly if more precise control over the position is required.

@Yamakaky

Yamakaky commented Oct 24, 2016

In nom, you can name the fields you want, while with the tuples you have to use .0 or pattern matching.

@Marwes
Owner Author

Marwes commented Oct 24, 2016

True. The idea is to use pattern matching to name the fields you want. I suppose it may be clearer in code to put the names as close to the actual parser as possible though (as nom does). It is pretty low priority for me though. I'd be happy to accept a PR for such a macro if you feel inclined to add it.

@Marwes
Owner Author

Marwes commented Aug 16, 2017

@Yamakaky struct_parser! was added, which makes it possible to name the fields of a sequence.

I don't see it as worthwhile making &str a parser, since Input is an associated type, which would force &str to only work with a single input type. This could be solved by making Input a type parameter of the Parser trait, but that comes with problems in type inference instead, which would offset any benefit of adding it.
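The trade-off can be sketched with two toy traits. Both `AssocParser` and `ParamParser` are hypothetical names invented for this example, and each simply returns the remaining input on success; they are not combine's `Parser` trait.

```rust
// Variant 1: the input is an associated type, so each parser type gets exactly one input type.
pub trait AssocParser {
    type Input;
    // Returns the remaining input on success.
    fn run_assoc(&mut self, input: Self::Input) -> Option<Self::Input>;
}

impl<'a> AssocParser for &'a str {
    type Input = &'a str;
    fn run_assoc(&mut self, input: &'a str) -> Option<&'a str> {
        input.strip_prefix(*self)
    }
}
// A second `impl<'a> AssocParser for &'a str` with `type Input = &'a [u8]`
// would be rejected as a conflicting implementation.

// Variant 2: the input is a type parameter, so one literal can parse several input types.
pub trait ParamParser<I> {
    fn run(&mut self, input: I) -> Option<I>;
}

impl<'a> ParamParser<&'a str> for &'static str {
    fn run(&mut self, input: &'a str) -> Option<&'a str> {
        input.strip_prefix(*self)
    }
}

impl<'a> ParamParser<&'a [u8]> for &'static str {
    fn run(&mut self, input: &'a [u8]) -> Option<&'a [u8]> {
        input.strip_prefix(self.as_bytes())
    }
}
```

With the second variant the input type is no longer determined by the parser type alone. A call such as `ParamParser::<&[u8]>::run(&mut lit, bytes)` spells it out explicitly, and code that builds a parser before any input is supplied may need similar annotations, which is the inference cost mentioned above.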

As for overloading operators, this would be an extremely invasive change, and in the end all it would give is the ability to write p1 | p2 | p3 .... Since this can now be written as choice((p1, p2, p3, ...)) instead of an or chain, I believe the complexity is not worth the little syntactic sugar which operators would give. https://docs.rs/combine/3.0.0-alpha.2/combine/fn.choice.html
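For comparison with an operator chain, here is a minimal sketch of how a `choice` function over a tuple could work. It is written against a toy `Parser` trait over `&str`; the `ChoiceParser` trait, the `Choice` wrapper, and `Str` are assumptions for illustration and are not taken from combine's implementation.

```rust
// Toy parser trait for illustration only; NOT combine's real `Parser` trait.
pub trait Parser {
    type Output;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(Self::Output, &'a str)>;
}

// Hypothetical prefix-matching parser.
pub struct Str(pub &'static str);

impl Parser for Str {
    type Output = &'static str;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(Self::Output, &'a str)> {
        input.strip_prefix(self.0).map(|rest| (self.0, rest))
    }
}

// Implemented for tuples whose elements are parsers sharing one output type.
pub trait ChoiceParser {
    type Output;
    fn parse_choice<'a>(&mut self, input: &'a str) -> Option<(Self::Output, &'a str)>;
}

// Only the 3-tuple case is shown; other arities would follow the same pattern.
impl<A, B, C> ChoiceParser for (A, B, C)
where
    A: Parser,
    B: Parser<Output = A::Output>,
    C: Parser<Output = A::Output>,
{
    type Output = A::Output;
    fn parse_choice<'a>(&mut self, input: &'a str) -> Option<(A::Output, &'a str)> {
        // Try each alternative in order and return the first success.
        if let Some(ok) = self.0.parse(input) {
            return Some(ok);
        }
        if let Some(ok) = self.1.parse(input) {
            return Some(ok);
        }
        self.2.parse(input)
    }
}

// Wrapper that turns a tuple of alternatives into an ordinary parser.
pub struct Choice<T>(T);

impl<T: ChoiceParser> Parser for Choice<T> {
    type Output = T::Output;
    fn parse<'a>(&mut self, input: &'a str) -> Option<(T::Output, &'a str)> {
        self.0.parse_choice(input)
    }
}

pub fn choice<T: ChoiceParser>(alternatives: T) -> Choice<T> {
    Choice(alternatives)
}
```

In this sketch `choice((Str("if"), Str("then"), Str("else")))` plays the role of `p1 | p2 | p3` without any operator impls or a wrapping newtype.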

So, as a conclusion, I will close this issue.

@Marwes Marwes closed this as completed Aug 16, 2017