Add tx{-parser} crates; start parsing transactions. #164
I don't see these errors with
Will try with stable.
Builds for me locally with stable.
Oh, no it doesn't.
Fixed branch: https://github.com/mozilla/mentat/tree/rnewman/parse-tx
This'll need rebasing on top of keywords (please, so I don't have to 'rebase' this when I land!), but otherwise it looks excellent.
use combine::{any, eof, many, optional, parser, satisfy_map, token, Parser, ParseResult, Stream};
use combine::combinator::{Expected, FnParser};
// TODO: understand why this is self::edn rather than just edn.
I fixed this in my branch.
}

fn integer_(input: I) -> ParseResult<i64, I> {
    return satisfy_map(|x: Value| if let Value::Integer(y) = x {
I've read combinator.rs, and do not claim to understand it. But this is a nice abstraction for allowing failure to arbitrary depth.
`combinator.rs` is tricky; it took me three days of digging to really understand what is happening :)
fn entid_(input: I) -> ParseResult<EntId, I> {
    let p = Tx::<I>::integer()
        .map(|x| EntId::EntId(x))
        .or(Tx::<I>::keyword().map(|x| EntId::Ident(x)))
Interesting — you're unwrapping the keyword, right?
This will need to change a little on top of #163, but in a good way: you'll still be unwrapping, but you'll get, e.g., a `NamespacedKeyword` here.
Yes, I am unwrapping, although there's no real reason for this.
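For readers following along without combine at hand, the integer-or-keyword alternation in `entid_` can be approximated with the standard library alone. This is only a sketch under assumptions: `Value` below is a hypothetical two-variant stand-in for `edn::types::Value`, `parse_entid` is an illustrative name, and `Option::or_else` stands in for combine's `.or`.

```rust
// Hypothetical stand-in for edn's Value type; only two variants are modeled.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Integer(i64),
    Keyword(String),
}

// Mirrors the EntId enum from the diff under review.
#[derive(Clone, Debug, PartialEq)]
pub enum EntId {
    EntId(i64),
    Ident(String),
}

// Unwraps the integer variant, in the spirit of `integer_` in the diff.
fn integer(v: &Value) -> Option<i64> {
    match v {
        Value::Integer(i) => Some(*i),
        _ => None,
    }
}

// Unwraps the keyword variant, in the spirit of `keyword_` in the diff.
fn keyword(v: &Value) -> Option<String> {
    match v {
        Value::Keyword(k) => Some(k.clone()),
        _ => None,
    }
}

// Try the integer branch first, then fall back to the keyword branch.
pub fn parse_entid(v: &Value) -> Option<EntId> {
    integer(v)
        .map(EntId::EntId)
        .or_else(|| keyword(v).map(EntId::Ident))
}
```

Calling `parse_entid` on an integer value yields the `EntId::EntId` variant, and on a keyword value yields `EntId::Ident` with the keyword's string unwrapped.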
}

fn keyword_(input: I) -> ParseResult<String, I> {
    return satisfy_map(|x: Value| if let Value::Keyword(y) = x {
Really makes you want `??`, doesn't it?

fn lookup_ref_(input: I) -> ParseResult<LookupRef, I> {
    return satisfy_map(|x: Value| if let Value::Vector(y) = x {
        let mut p = (Tx::<&[Value]>::entid(), any(), eof())
Does the parser library memoize the construction of these parsers? Should I be worried about multiple invocations of `fn_parser`?
No, and there's a subtle thing happening with consuming `Parser` instances that I don't yet understand. You'll note all these parsers are mutable; that's because they're consumed by `parse` and friends. So you can't avoid the invocations and allocations. It's unclear how this impacts performance overall.
Because `FnParser`s (and probably other parsers, too) are implicitly stateful, since they wrap a function and there's no concept in Rust of a pure function?
(Indeed, with 'try' for lookahead, one expects LL(n) parser implementations to be stateful.)
> Does the parser library memoize the construction of these parsers? Should I be worried about multiple invocations of `fn_parser`?
Author of `combine` here. Constructing parsers should be free or very cheap, since `Parser`s follow the same model as `Iterator`: it's all stack allocations, and most parsers are either zero-sized or just a few bytes for the function or parameters they take to construct them. As long as LLVM inlines properly there should be zero overhead.
> (Indeed, with 'try' for lookahead, one expects LL(n) parser implementations to be stateful.)
That is almost true, but the `try` parser has no state itself; that is all contained in `Parser::Input` (the only thing `try` contains is the parser it wraps).
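The "same model as `Iterator`" point can be checked with a toy example. The sketch below is not combine's real API: `MiniParser`, `Any`, `Map`, `doubled`, and `doubled_size` are hypothetical names invented here, and the input is fixed to `&[i64]` for brevity. It only illustrates that a combinator built from a data-free parser and a non-capturing closure occupies no memory, so rebuilding it on every call allocates nothing.

```rust
use std::mem::size_of_val;

// A toy parser trait shaped like Iterator: combinators are plain structs
// that wrap other parsers by value. This is NOT combine's real trait.
pub trait MiniParser {
    type Output;

    // Returns the parsed value and the remaining input, or None on failure.
    fn parse<'a>(&mut self, input: &'a [i64]) -> Option<(Self::Output, &'a [i64])>;

    // Adapter in the style of Iterator::map; it only moves `self` and `f`.
    fn map<F, B>(self, f: F) -> Map<Self, F>
    where
        Self: Sized,
        F: FnMut(Self::Output) -> B,
    {
        Map { parser: self, f: f }
    }
}

// Consumes one element; carries no data, so it is zero-sized.
pub struct Any;

impl MiniParser for Any {
    type Output = i64;

    fn parse<'a>(&mut self, input: &'a [i64]) -> Option<(i64, &'a [i64])> {
        input.split_first().map(|(first, rest)| (*first, rest))
    }
}

// Wraps a parser and a function; its size is the size of its two fields.
pub struct Map<P, F> {
    parser: P,
    f: F,
}

impl<P, F, B> MiniParser for Map<P, F>
where
    P: MiniParser,
    F: FnMut(P::Output) -> B,
{
    type Output = B;

    fn parse<'a>(&mut self, input: &'a [i64]) -> Option<(B, &'a [i64])> {
        match self.parser.parse(input) {
            Some((value, rest)) => Some(((self.f)(value), rest)),
            None => None,
        }
    }
}

// Builds a fresh parser on every call; nothing is heap-allocated.
pub fn doubled() -> impl MiniParser<Output = i64> {
    Any.map(|x| x * 2)
}

// Size in bytes of the composed parser value.
pub fn doubled_size() -> usize {
    size_of_val(&doubled())
}
```

Here `doubled_size()` reports zero bytes, because both `Any` and the non-capturing closure are zero-sized. A closure that captured a variable would make `Map` as large as the captured data, which matches the "just a few bytes for the function or parameters" remark.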
use self::edn::types::Value;

#[derive(Clone, Debug, PartialEq)]
pub enum EntId {
Could you steal/clean up the `EntId` stuff in the top-level repo?

extern crate edn;

// TODO: understand why this is self::edn rather than just edn.
See earlier note.
Also try `::edn::types::Value` and see how that differs…
#[derive(Clone, Debug, PartialEq)]
pub struct LookupRef {
    pub a: EntId,
    // TODO: consider boxing to allow recursive lookup refs.
You crazy.
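For context on what the TODO is contemplating, here is a minimal sketch of a lookup ref whose value may itself be a lookup ref. It rests on assumptions: the diff excerpt only shows the `a` field, so the `v` field, the shape of `ValueOrLookupRef`, and the `depth` helper are all hypothetical. The `Box` is what makes the recursive type legal, since a directly self-containing struct would have no finite size.

```rust
// Mirrors the EntId enum from the diff under review.
#[derive(Clone, Debug, PartialEq)]
pub enum EntId {
    EntId(i64),
    Ident(String),
}

// Hypothetical shape: a value slot is either a plain entid or another
// lookup ref. The Box gives the recursive variant a fixed size.
#[derive(Clone, Debug, PartialEq)]
pub enum ValueOrLookupRef {
    EntId(EntId),
    LookupRef(Box<LookupRef>),
}

#[derive(Clone, Debug, PartialEq)]
pub struct LookupRef {
    pub a: EntId,
    // Assumed field; the diff excerpt only shows `a`.
    pub v: ValueOrLookupRef,
}

// Counts how many lookup refs are nested inside one another.
pub fn depth(r: &LookupRef) -> usize {
    match r.v {
        ValueOrLookupRef::EntId(_) => 1,
        ValueOrLookupRef::LookupRef(ref inner) => 1 + depth(inner),
    }
}
```

A lookup ref whose `v` is a plain entid has depth 1; wrapping it as the `v` of another lookup ref gives depth 2. Whether recursive lookup refs are desirable at all is a separate question that the thread leaves open.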
#[derive(Clone, Debug, PartialEq)]
pub enum EntId {
    EntId(i64),
    Ident(String),
This'll eventually be `Ident(NamespacedOrPlainKeyword)`, I expect.
Might just ban `PlainKeyword`, since it's bad form in general.
Yeah, fair. That's one reason I split, actually: the 'syntax' of the query parser is the main place I expect `Keyword`s (`:find`, `:in`), with all of the 'data' being namespaced.
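The plain-versus-namespaced split discussed here can be sketched as a small classifier. This is a simplification under assumptions: `KeywordKind` and `classify` are hypothetical names, the field layout is invented for illustration, and the function only looks for a leading colon and a single `/` separator rather than validating the full EDN symbol grammar.

```rust
// Hypothetical type separating 'syntax' keywords from namespaced 'data' keywords.
#[derive(Clone, Debug, PartialEq)]
pub enum KeywordKind {
    // e.g. :find or :in
    Plain(String),
    // e.g. :db/ident
    Namespaced { namespace: String, name: String },
}

// Classifies keyword text; returns None if it is not keyword-shaped.
pub fn classify(text: &str) -> Option<KeywordKind> {
    // Keywords start with a colon and must have a non-empty body.
    if !text.starts_with(':') {
        return None;
    }
    let body = &text[1..];
    if body.is_empty() {
        return None;
    }
    match body.find('/') {
        // No separator: a plain keyword.
        None => Some(KeywordKind::Plain(body.to_string())),
        // Separator found: both sides must be non-empty.
        Some(i) => {
            let namespace = &body[..i];
            let name = &body[i + 1..];
            if namespace.is_empty() || name.is_empty() {
                None
            } else {
                Some(KeywordKind::Namespaced {
                    namespace: namespace.to_string(),
                    name: name.to_string(),
                })
            }
        }
    }
}
```

With two distinct variants (or two distinct types), a transaction parser could accept only the namespaced form for data while the query parser still accepts plain keywords for its syntax.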
Add {
    e: EntIdOrLookupRef,
    a: EntId,
    v: ValueOrLookupRef,
Strongly typed cat approves of your strongly typed format.
We need a strongly typed cat meme generator, stat!
Yes, I eventually found it. Sadly, it requires a type annotation in this case, which I couldn't quite make work. (There are TODOs later in the commit.) If you can work this out, please do! Thanks!
Worth calling out that the |
This depends on edn and uses the combine parser combinator library.
@rnewman final stamp before I fold? I'm going to punt on the entid simplification until we have a motivation to fix it. There are some decisions about crate organization to be made that don't block this landing.

// TODO: abstract the "match Vector, parse internal stream" pattern to remove this boilerplate.
fn add_(input: I) -> ParseResult<Entity, I> {
    return satisfy_map(|x: Value| -> Option<Entity> {
If I may, I believe you can change this parser into this to preserve errors from parsing the inner value.
satisfy_map(|x: Value| -> Option<Vec<Value>> {
    if let Value::Vector(y) = x { Some(y) } else { None }
}).flat_map(|y| {
    let mut p = ...;
    p.parse(&y[..]).map(|t| t.0)
})
https://docs.rs/combine/2.1.1/combine/trait.Parser.html#method.flat_map
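The difference the suggestion is after, keeping the inner parse error instead of collapsing it to "no match", can be shown without combine. The sketch below is an analogy only: `Value` is a hypothetical two-variant stand-in, and `parse_pair`, `pair_lossy`, and `pair_strict` are illustrative names. `pair_lossy` plays the role of doing everything inside an `Option`-returning `satisfy_map` closure; `pair_strict` plays the role of unwrapping the vector first and then running the inner parser in a step that returns a `Result`.

```rust
// Hypothetical stand-in for edn's Value type.
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Integer(i64),
    Vector(Vec<Value>),
}

// Inner parser: expects exactly two integers and reports what it saw otherwise.
fn parse_pair(items: &[Value]) -> Result<(i64, i64), String> {
    match items {
        [Value::Integer(a), Value::Integer(b)] => Ok((*a, *b)),
        _ => Err(format!("expected exactly two integers, got {:?}", items)),
    }
}

// Option-based: any inner failure is flattened to None, so the reason is lost.
pub fn pair_lossy(v: &Value) -> Option<(i64, i64)> {
    match v {
        Value::Vector(items) => parse_pair(items).ok(),
        _ => None,
    }
}

// Two-step: unwrap the vector, then run the inner parser and keep its error.
pub fn pair_strict(v: &Value) -> Result<(i64, i64), String> {
    let items = match v {
        Value::Vector(items) => items,
        _ => return Err("expected a vector".to_string()),
    };
    parse_pair(items)
}
```

Given a vector holding a single integer, `pair_lossy` returns `None`, while `pair_strict` returns an `Err` whose message describes what the inner parser expected.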
If I try this, I get errors like:
error[E0271]: type mismatch resolving `<parse::combine::combinator::SatisfyMap<&[error::edn::Value], [closure@query-parser/src/parse.rs:95:20: 95:94]> as parse::combine::Parser>::Input == I`
--> query-parser/src/parse.rs:101:9
|
101 | .parse_stream(input)
| ^^^^^^^^^^^^ expected reference, found type parameter
|
= note: expected type `&[error::edn::Value]`
= note: found type `I`
= note: required because of the requirements on the impl of `parse::combine::Parser` for `parse::combine::combinator::FlatMap<parse::combine::combinator::SatisfyMap<&[error::edn::Value], [closure@query-parser/src/parse.rs:95:20: 95:94]>, [closure@query-parser/src/parse.rs:96:22: 100:9]>`
so this isn't a trivial change.
Yes. In order to do this, I needed to either use `SliceStream` or work with the `::Range` type, so that the lifetime was preserved. It can be done, but it's awkward. See also the discussion in Marwes/combine#74 (comment).
/me gives up, runs git checkout -- .
I think it is, but it's not possible! You can only construct enums and structs, and the |
@rnewman |
We are blessed with the `lazy_struct` crate…
@rnewman, you can see my progress here. There's an abstraction around parsing the internals of a vector to be worked out, but it's actually quite nice to work with (once you've absorbed some `combine` magic). I think some of the constructor mappings might be able to be shortened, but it's not worth waiting for.