New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse nodrop #49
Parse nodrop #49
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to add an integration test.
[Russell, editing @tstack's review comment] See integration.rs
& the structured_test!
macro
src/typecheck.rs
Outdated
} => { | ||
let regex = pattern.to_regex(); | ||
|
||
let options: HashSet<lang::ParseOption> = parse_options.iter().cloned().collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed if you're only checking if the vector contains the flag or not?
src/typecheck.rs
Outdated
@@ -81,6 +85,9 @@ impl lang::InlineOperator { | |||
regex, | |||
fields, | |||
input_column.map(|e| e.into()), | |||
operator::ParseOptions { | |||
drop_nonmatching: options.contains(&lang::ParseOption::NoDrop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this condition backwards? drop_nonmatching will be true if there's a "nodrop" option.
src/operator.rs
Outdated
@@ -600,7 +612,16 @@ impl UnaryPreAggFunction for Parse { | |||
fn process(&self, rec: Record) -> Result<Option<Record>, EvalError> { | |||
let matches = self.matches(&rec)?; | |||
match matches { | |||
None => Ok(None), | |||
None => match self.options.drop_nonmatching { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the rust style is to use an if-expression instead of matching on a boolean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably use tuple matching and do something like
match (self.options,drop_nonmatching, matches) {
None, true => None,
None, false => addFields(rec)
Some(rec), _ => rec
}
src/operator.rs
Outdated
let rec = self | ||
.fields | ||
.iter() | ||
.fold(rec, |rec, field| rec.put(field, data::Value::None)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe make 'rec' mutable and use for_each() ? Not sure what the style is...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is OK as is. This is probably a little more readable though: (as @tstack mentioned, you need mut
here)
for field <- fields {
rec = rec.put(field, data::Value::None)
}
Another totally fine option is adding a put_all
method on rec
to handle this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! A few small tweaks to simplify but the overall direction is good. I'm going to land @tstack's #44 land first because these definitely conflict, but it will much easier to resolve conflicts in yours than in their's.
As a side note, it looks like the CI job failed because you need to run cargo fmt
. I'll add an improved error for that for new contributors.
src/lang.rs
Outdated
@@ -93,6 +93,7 @@ pub enum InlineOperator { | |||
pattern: Keyword, | |||
fields: Vec<String>, | |||
input_column: Option<Expr>, | |||
parse_options: Vec<ParseOption>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets just have this be a boolean no_drop
-- I appreciate the forward thinking but might as well just keep it simple for now. It will simplify the downstream code as well
src/operator.rs
Outdated
@@ -600,7 +612,16 @@ impl UnaryPreAggFunction for Parse { | |||
fn process(&self, rec: Record) -> Result<Option<Record>, EvalError> { | |||
let matches = self.matches(&rec)?; | |||
match matches { | |||
None => Ok(None), | |||
None => match self.options.drop_nonmatching { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably use tuple matching and do something like
match (self.options,drop_nonmatching, matches) {
None, true => None,
None, false => addFields(rec)
Some(rec), _ => rec
}
src/operator.rs
Outdated
let rec = self | ||
.fields | ||
.iter() | ||
.fold(rec, |rec, field| rec.put(field, data::Value::None)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is OK as is. This is probably a little more readable though: (as @tstack mentioned, you need mut
here)
for field <- fields {
rec = rec.put(field, data::Value::None)
}
Another totally fine option is adding a put_all
method on rec
to handle this case
@@ -931,6 +952,9 @@ mod tests { | |||
"length".to_string(), | |||
], | |||
None, | |||
ParseOptions { | |||
drop_nonmatching: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as evidenced by getting it backwards, this name is kind of confusing. Lets just call it no_drop
to be parallel with lang
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as evidenced by getting it backwards, this name is kind of confusing. Lets just call it
no_drop
to be parallel withlang
I was actually going for the "don't have negated variable/settings names" rule - let me know if you still want to change it after the other changes, it's a bit clearer now.
I might add unit tests in typecheck.ts as well, as that would have caught this brainfart before the integration test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love the "don't have negated variable names" rule!
(None, false) => { | ||
let mut rec = rec; | ||
for field in &self.fields { | ||
rec = rec.put(field, data::Value::None); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thinking a bit more, we should only do this if the field doesn't already exist in the record
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, see one inline comment. Rebase from latest master to pick up the parser changes (happy to help if you get sticky conflicts)
Did the merge in the github online view, we'll see if I got it right 😅 |
haha you're game. I can rebase, was just waiting until i got to a non-windows machine to do it on. |
9d7b78c
to
96cff48
Compare
96cff48
to
172cfcc
Compare
Implement
nodrop
option forparse
, which leaves non-matching lines in the logs with the parsed fields set to None.Code comments welcome however trivial, I'm quite new to Rust.