Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make quoted capture less greedy when we have unambiguous separator #62

Conversation

yaauie
Copy link
Contributor

@yaauie yaauie commented May 3, 2018

This PR contains a couple separate fixes, that rely on each other; each fix is
documented in its own separate commit, but the commits must be applied in
order
.


Greediness of Quoted Values

In [logstash-plugins/logstash-filter-kv#60][], GitHub user @robcowart reports
that in some scenarios we can fail to split on match of `field_split_pattern`.

The example given is a partially-quoted value, with additional bytes between
it and the unambiguous separator, followed by another key and quoted value.

Since the quoted value isn't immediately followed by a field splitter, our
semi-greedy quoted-value capture was too greedy, continuing to consume until
it found a close-quote that would be followed by either a field-split or EOF.

Here, we become much less greedy, capturing any sequence of characters that
is _not_ the close-quote character.

[logstash-plugins/logstash-filter-kv#60]: https://github.com/logstash-plugins/logstash-filter-kv/issues/60

Greediness of trim_key and trim_value operations

In logstash-plugins/logstash-filter-kv#59, GitHub user @lennardk reports that
the `trim_key` and `trim_value` patterns are only trimming a single character
from the beginning and end of the keys and values respectively.

By making the patterns match one-or-more characters anchored to the head or
tail of the string, we can better mimic the expected trim behaviour.

Resolves: logstash-plugins/logstash-filter-kv#59

In [logstash-plugins#60][], GitHub user @robcowart reports
that in some scenarios we can fail to split on match of `field_split_pattern`.

The example given is a partially-quoted value, with additional bytes between
it and the unambiguous separator, followed by another key and quoted value.

Since the quoted value isn't immediately followed by a field splitter, our
semi-greedy quoted-value capture was too greedy, continuing to consume until
it found a close-quote that would be followed by either a field-split or EOF.

Here, we become much less greedy, capturing any sequence of characters that
is _not_ the close-quote character.

[logstash-plugins#60]: logstash-plugins#60
@yaauie
Copy link
Contributor Author

yaauie commented May 5, 2018

CI failing because of elastic/logstash-devutils#73

@jsvd jsvd self-requested a review May 8, 2018 14:41
Copy link
Contributor

@jsvd jsvd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a tiny minor spec structure suggestion
I tested a few cases manually and confirmed lib changes solve the new specs
LGTM

}
}

context '' do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since there's no context here, can we just move the it block one level back?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d'oh, will fix this and do the merge on Monday.

@yaauie yaauie force-pushed the partially-quoted-values-with-unambiguous-separator branch from 9646eb5 to 21eee81 Compare May 14, 2018 16:17
@karmi

This comment has been minimized.

1 similar comment
@karmi

This comment has been minimized.

@yaauie yaauie force-pushed the partially-quoted-values-with-unambiguous-separator branch from 21eee81 to 5be7b60 Compare May 14, 2018 16:19
expect(event.get("foo")).to eq('bar')
expect(event.get("baz")).to eq('"quoted stuff" and more unquoted')
expect(event.get("msg")).to eq('fully-quoted with a part! of the separator')
expect(event.get("blip")).to eq('this!!!!!is it')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@webmat
Copy link

webmat commented May 16, 2018

This looks good. :shipit: :-)

@elasticsearch-bot
Copy link

Ry Biesemeyer merged this into the following branches!

Branch Commits
master f063820, 6054996, 9236016

elasticsearch-bot pushed a commit that referenced this pull request May 16, 2018
elasticsearch-bot pushed a commit that referenced this pull request May 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants