SelectorDimFilter optimize() does not work correctly with LookupExtractionFn missing value handling #2775

Closed
jon-wei opened this issue Apr 1, 2016 · 1 comment

jon-wei commented Apr 1, 2016

If optimize() is called on an ExtractionDimFilter/SelectorDimFilter with a LookupExtractionFn, an incorrect filter will be returned in some cases.

Suppose we have a single dimension, dimA, with rows:

{dimA = "a"}
{dimA = "b"}
{dimA = "c"}
{dimA = "d"}

Suppose we define a LookupExtractionFn with the following underlying map, with retainMissingValues set to true:

{"a" -> "d"}

If we define a selector/extraction filter that matches on "d" using the LookupExtractionFn above and call optimize() on the filter, the unapply() reverse lookup will only pick up the value "a". The optimize() step has no knowledge of the untransformed value "d", which also matches because retainMissingValues passes it through unchanged. As a result, the InFilter that optimize() produces matches the row with "a" but misses the row with "d".
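The mismatch can be reproduced with a small plain-Java model. This is only a sketch of the behavior described above, not Druid code: the class `RetainMissingSketch` and its methods `apply`, `unapply`, `matchUnoptimized`, and `matchOptimized` are hypothetical stand-ins for the lookup, the reverse lookup, the original filter, and the optimized InFilter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the scenario; none of these names are Druid APIs.
public class RetainMissingSketch {

    // Forward lookup with "retain missing values" semantics:
    // keys absent from the map pass through unchanged.
    static String apply(Map<String, String> lookup, String value) {
        return lookup.getOrDefault(value, value);
    }

    // Reverse lookup: every map key whose mapped value equals the target.
    static Set<String> unapply(Map<String, String> lookup, String target) {
        Set<String> keys = new LinkedHashSet<>();
        for (Map.Entry<String, String> e : lookup.entrySet()) {
            if (e.getValue().equals(target)) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    // Unoptimized filter: transform each row value, then compare to the target.
    static List<String> matchUnoptimized(List<String> rows, Map<String, String> lookup, String target) {
        List<String> matched = new ArrayList<>();
        for (String row : rows) {
            if (apply(lookup, row).equals(target)) {
                matched.add(row);
            }
        }
        return matched;
    }

    // Naive optimization: replace the filter with an IN filter over unapply(target).
    static List<String> matchOptimized(List<String> rows, Map<String, String> lookup, String target) {
        Set<String> inValues = unapply(lookup, target);
        List<String> matched = new ArrayList<>();
        for (String row : rows) {
            if (inValues.contains(row)) {
                matched.add(row);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c", "d");
        Map<String, String> lookup = Collections.singletonMap("a", "d");

        System.out.println(matchUnoptimized(rows, lookup, "d")); // prints [a, d]
        System.out.println(matchOptimized(rows, lookup, "d"));   // prints [a]
    }
}
```

The two result lists differ because the reverse lookup only sees map entries, while the pass-through of "d" happens outside the map.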


Similarly, if retainMissingValues is false and replaceMissingValuesWith is set to the same value as the selector value, optimize() will not be aware of all the row values that it needs to match.

Using the same example rows, suppose we have the following selector filter and lookup extraction:

selector:
- value: "b"
lookup:
- map: {"a" -> "b"}
- replaceMissingValuesWith: "b"

This should match all rows, since "a" will be mapped to "b", while "b", "c", and "d" are not in the lookup map and will therefore be replaced with "b".

If we call optimize() on this filter, the resulting filter built from reverse-lookup will only select for value "a". It is not aware of all the other values that will be transformed to "b" via the replaceMissingValuesWith property.
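The second case can be modeled the same way. Again this is a plain-Java sketch under assumptions, not Druid code: `ReplaceMissingSketch` and its methods are hypothetical, and the `replaceWith` parameter stands in for the replaceMissingValuesWith property.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the scenario; none of these names are Druid APIs.
public class ReplaceMissingSketch {

    // Forward lookup with "replace missing values" semantics:
    // keys absent from the map are replaced with replaceWith.
    static String apply(Map<String, String> lookup, String replaceWith, String value) {
        return lookup.getOrDefault(value, replaceWith);
    }

    // Reverse lookup: every map key whose mapped value equals the target.
    // It cannot enumerate the unmapped values that become replaceWith.
    static Set<String> unapply(Map<String, String> lookup, String target) {
        Set<String> keys = new LinkedHashSet<>();
        for (Map.Entry<String, String> e : lookup.entrySet()) {
            if (e.getValue().equals(target)) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    // Unoptimized filter: transform each row value, then compare to the target.
    static List<String> matchUnoptimized(List<String> rows, Map<String, String> lookup,
                                         String replaceWith, String target) {
        List<String> matched = new ArrayList<>();
        for (String row : rows) {
            if (apply(lookup, replaceWith, row).equals(target)) {
                matched.add(row);
            }
        }
        return matched;
    }

    // Naive optimization: replace the filter with an IN filter over unapply(target).
    static List<String> matchOptimized(List<String> rows, Map<String, String> lookup, String target) {
        Set<String> inValues = unapply(lookup, target);
        List<String> matched = new ArrayList<>();
        for (String row : rows) {
            if (inValues.contains(row)) {
                matched.add(row);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c", "d");
        Map<String, String> lookup = Collections.singletonMap("a", "b");

        System.out.println(matchUnoptimized(rows, lookup, "b", "b")); // prints [a, b, c, d]
        System.out.println(matchOptimized(rows, lookup, "b"));        // prints [a]
    }
}
```

In this model the set of values that map to the target is every value outside the lookup's key set plus "a", which a finite reverse lookup over map entries cannot express.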

@jon-wei jon-wei added the Bug label Apr 1, 2016
@jon-wei jon-wei added this to the 0.9.1 milestone Apr 1, 2016
@jon-wei jon-wei changed the title SelectorDimFilter optimize() does not work correctly with missing value handling on LookupExtractionFn SelectorDimFilter optimize() does not work correctly with LookupExtractionFn missing value handling Apr 1, 2016

jon-wei commented Apr 11, 2016

Fixed in #2690

@jon-wei jon-wei closed this as completed Apr 11, 2016