Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse nodrop #49

Merged
merged 5 commits into from Jan 11, 2019
Merged

Parse nodrop #49

merged 5 commits into from Jan 11, 2019

Conversation

akdor1154
Copy link
Contributor

Implement nodrop option for parse, which leaves non-matching lines in the logs with the parsed fields set to None.

Code comments welcome however trivial, I'm quite new to Rust.

Copy link
Contributor

@tstack tstack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to add an integration test.

[Russell, editing @tstack's review comment] See integration.rs & the structured_test! macro

src/typecheck.rs Outdated
} => {
let regex = pattern.to_regex();

let options: HashSet<lang::ParseOption> = parse_options.iter().cloned().collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed if you're only checking if the vector contains the flag or not?

src/typecheck.rs Outdated
@@ -81,6 +85,9 @@ impl lang::InlineOperator {
regex,
fields,
input_column.map(|e| e.into()),
operator::ParseOptions {
drop_nonmatching: options.contains(&lang::ParseOption::NoDrop)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this condition backwards? drop_nonmatching will be true if there's a "nodrop" option.

src/operator.rs Outdated
@@ -600,7 +612,16 @@ impl UnaryPreAggFunction for Parse {
fn process(&self, rec: Record) -> Result<Option<Record>, EvalError> {
let matches = self.matches(&rec)?;
match matches {
None => Ok(None),
None => match self.options.drop_nonmatching {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the rust style is to use an if-expression instead of matching on a boolean.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably use tuple matching and do something like

match (self.options,drop_nonmatching, matches) {
  None, true => None,
  None, false => addFields(rec)
  Some(rec), _ => rec
}

src/operator.rs Outdated
let rec = self
.fields
.iter()
.fold(rec, |rec, field| rec.put(field, data::Value::None));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make 'rec' mutable and use for_each() ? Not sure what the style is...

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is OK as is. This is probably a little more readable though: (as @tstack mentioned, you need mut here)

for field <- fields {
  rec = rec.put(field, data::Value::None)
}

Another totally fine option is adding a put_all method on rec to handle this case

Copy link
Owner

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! A few small tweaks to simplify but the overall direction is good. I'm going to land @tstack's #44 land first because these definitely conflict, but it will much easier to resolve conflicts in yours than in their's.

As a side note, it looks like the CI job failed because you need to run cargo fmt. I'll add an improved error for that for new contributors.

src/lang.rs Outdated
@@ -93,6 +93,7 @@ pub enum InlineOperator {
pattern: Keyword,
fields: Vec<String>,
input_column: Option<Expr>,
parse_options: Vec<ParseOption>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets just have this be a boolean no_drop -- I appreciate the forward thinking but might as well just keep it simple for now. It will simplify the downstream code as well

src/operator.rs Outdated
@@ -600,7 +612,16 @@ impl UnaryPreAggFunction for Parse {
fn process(&self, rec: Record) -> Result<Option<Record>, EvalError> {
let matches = self.matches(&rec)?;
match matches {
None => Ok(None),
None => match self.options.drop_nonmatching {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably use tuple matching and do something like

match (self.options,drop_nonmatching, matches) {
  None, true => None,
  None, false => addFields(rec)
  Some(rec), _ => rec
}

src/operator.rs Outdated
let rec = self
.fields
.iter()
.fold(rec, |rec, field| rec.put(field, data::Value::None));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is OK as is. This is probably a little more readable though: (as @tstack mentioned, you need mut here)

for field <- fields {
  rec = rec.put(field, data::Value::None)
}

Another totally fine option is adding a put_all method on rec to handle this case

@@ -931,6 +952,9 @@ mod tests {
"length".to_string(),
],
None,
ParseOptions {
drop_nonmatching: true,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as evidenced by getting it backwards, this name is kind of confusing. Lets just call it no_drop to be parallel with lang

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as evidenced by getting it backwards, this name is kind of confusing. Lets just call it no_drop to be parallel with lang

I was actually going for the "don't have negated variable/settings names" rule - let me know if you still want to change it after the other changes, it's a bit clearer now.

I might add unit tests in typecheck.ts as well, as that would have caught this brainfart before the integration test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the "don't have negated variable names" rule!

(None, false) => {
let mut rec = rec;
for field in &self.fields {
rec = rec.put(field, data::Value::None);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking a bit more, we should only do this if the field doesn't already exist in the record

Copy link
Owner

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, see one inline comment. Rebase from latest master to pick up the parser changes (happy to help if you get sticky conflicts)

@rcoh
Copy link
Owner

rcoh commented Jan 11, 2019

Did the merge in the github online view, we'll see if I got it right 😅

@akdor1154
Copy link
Contributor Author

haha you're game. I can rebase, was just waiting until i got to a non-windows machine to do it on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants