Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable value replacement on match failure for RegexExtractionFn #2075

Merged
merged 1 commit into from
Dec 15, 2015

Conversation

jon-wei
Copy link
Contributor

@jon-wei jon-wei commented Dec 9, 2015

Addresses issue #2064

This adds 3 new parameters to the RegexExtractionFn:

replaceMissingValues: Boolean, if a regex match fails, replace the dimval with a user specified value instead of returning the orignal dimval, default is false

replaceMissingValuesWith: String, replace dimval with this value when regex match fails if replaceMissingValues == true

injective: Boolean, currently unused, but same usage as "injective" property in the JavaScriptExtractionFn and LookupExtractionFn

@jon-wei
Copy link
Contributor Author

jon-wei commented Dec 9, 2015

@b-slim Reopened PR, referring to your earlier comment, what property did you think was unneeded?

@b-slim
Copy link
Contributor

b-slim commented Dec 9, 2015

@jon-wei injective property is only used by classes extending FunctionalExtraction.

Assert.assertTrue(extracted.contains("foobar"));

byte[] cacheKey = extractionFn.getCacheKey();
Assert.assertEquals(17, cacheKey.length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use arrays to compare bytes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@b-slim added byte array value check for cache key

@jon-wei
Copy link
Contributor Author

jon-wei commented Dec 9, 2015

@b-slim I think "injective" is fine here, a regex->match extraction can be thought of as a a function.

JavascriptExtractionFn also supports the injective property and it doesn't extend FunctionalExtraction.

A hypothetical "injective" use case might be a dimension value with various formats that all contain some unique ID, where the regex is used to strip out parts of the dim val that don't provide any unique information beyond the ID substring.

@jon-wei jon-wei force-pushed the regex_extract branch 2 times, most recently from bb71a89 to 4e2a3ca Compare December 10, 2015 00:21
@jon-wei jon-wei closed this Dec 10, 2015
@jon-wei jon-wei reopened this Dec 10, 2015
@b-slim
Copy link
Contributor

b-slim commented Dec 10, 2015

@jon-wei ok i got it now, thanks !

@vogievetsky
Copy link
Contributor

Very excited about this PR!

@@ -24,6 +24,7 @@
import com.google.common.base.Preconditions;
import com.google.common.base.Strings;
import com.metamx.common.StringUtils;
import org.jboss.netty.util.internal.StringUtil;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extraneous import

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gianm removed unused import

@JsonProperty("replaceMissingValues") Boolean replaceMissingValues,
@JsonProperty("replaceMissingValuesWith") String replaceMissingValuesWith,
@JsonProperty("injective") Boolean injective
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code style

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gianm fixed code style

@JsonProperty("expr") String expr,
@JsonProperty("replaceMissingValues") Boolean replaceMissingValues,
@JsonProperty("replaceMissingValuesWith") String replaceMissingValuesWith,
@JsonProperty("injective") Boolean injective
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Injective" doesn't seem to be used for anything right now- is it necessary? What should it be used for?

@jon-wei
Copy link
Contributor Author

jon-wei commented Dec 15, 2015

@b-slim @gianm Based on prevailing opinion, I'll remove the injective property from this PR; isInjective() isn't really being used by LookupExtractionFn or JavascriptExtractionFn either, presently.

@gianm
Copy link
Contributor

gianm commented Dec 15, 2015

👍 after travis

@fjy
Copy link
Contributor

fjy commented Dec 15, 2015

👍

fjy added a commit that referenced this pull request Dec 15, 2015
Configurable value replacement on match failure for RegexExtractionFn
@fjy fjy merged commit e7f06cf into apache:master Dec 15, 2015
@fjy fjy modified the milestone: 0.9.0 Feb 4, 2016
@fjy fjy mentioned this pull request Feb 5, 2016
@jon-wei jon-wei deleted the regex_extract branch October 6, 2017 22:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants