jon-wei changed the title from "SelectorDimFilter optimize() does not work correctly with missing value handling on LookupExtractionFn" to "SelectorDimFilter optimize() does not work correctly with LookupExtractionFn missing value handling" on Apr 1, 2016.
If optimize() is called on an ExtractionDimFilter/SelectorDimFilter with a LookupExtractionFn, an incorrect filter will be returned in some cases.

Suppose we have a single dimension, `dimA`, with rows "a", "b", "c", and "d".

Suppose we define a LookupExtractionFn with the following underlying map, with `retainMissingValues` set to `true`:

{"a" -> "d"}

If we define a selector/extraction filter that matches on "d" using the LookupExtractionFn above and call optimize() on the filter, the unapply() reverse-lookup will only pick up value "a". The optimize() step has no knowledge of the untransformed value "d", which passes through the lookup unchanged, so the resulting InFilter will not match all the rows it needs to.
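The mismatch in this first case can be reproduced with a small standalone sketch. It does not use Druid's classes: `RetainMissingSketch`, `apply`, `unapply`, and `rowsMatchingByApply` are hypothetical stand-ins, and the row values "a" through "d" are assumed from the example. It compares the rows a filter on "d" should match with the values a rewrite based only on reverse lookup would select.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class RetainMissingSketch {
    // Hypothetical stand-in for a lookup with retainMissingValues = true:
    // mapped keys are translated, unmapped values pass through unchanged.
    static String apply(Map<String, String> lookup, String value) {
        return lookup.getOrDefault(value, value);
    }

    // Reverse lookup: every key whose mapped value equals the target.
    static Set<String> unapply(Map<String, String> lookup, String target) {
        Set<String> keys = new TreeSet<>();
        for (Map.Entry<String, String> e : lookup.entrySet()) {
            if (e.getValue().equals(target)) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    // Ground truth: transform each row value and compare with the target.
    static Set<String> rowsMatchingByApply(
            Map<String, String> lookup, List<String> rows, String target) {
        Set<String> matched = new TreeSet<>();
        for (String row : rows) {
            if (apply(lookup, row).equals(target)) {
                matched.add(row);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = Map.of("a", "d");
        List<String> rows = List.of("a", "b", "c", "d"); // assumed example rows

        // Rows a selector on "d" should match: prints [a, d]
        System.out.println(rowsMatchingByApply(lookup, rows, "d"));
        // Values found by reverse lookup alone: prints [a]
        System.out.println(unapply(lookup, "d"));
    }
}
```

In this sketch the gap would close if the rewritten value set also included the selector value itself whenever missing values are retained. Whether that is the right fix for optimize() is a separate question.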
Similarly, if `retainMissingValues` is false and `replaceMissingValuesWith` has the same value as the selector value, optimize() will not be aware of all the row values that it needs.

Using the same example rows, suppose we have a selector filter that matches on "b" and a lookup extraction that maps "a" to "b", with `replaceMissingValuesWith` set to "b".

This should match all rows, since "a" will be mapped to "b", and "b", "c", "d" are not in the lookup map and will be transformed to "b".

If we call optimize() on this filter, the resulting filter built from reverse-lookup will only select for value "a". It is not aware of all the other values that will be transformed to "b" via the `replaceMissingValuesWith` property.
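The second case can be sketched the same way. Again, this is not Druid code: `ReplaceMissingSketch` and its methods are hypothetical, and the map {"a" -> "b"}, the replacement value "b", and the rows "a" through "d" are taken from the description above.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReplaceMissingSketch {
    // Hypothetical stand-in for a lookup with retainMissingValues = false:
    // any value absent from the map becomes the replacement value.
    static String apply(Map<String, String> lookup, String replacement, String value) {
        return lookup.getOrDefault(value, replacement);
    }

    // Reverse lookup: only keys present in the map can ever be returned.
    static Set<String> unapply(Map<String, String> lookup, String target) {
        Set<String> keys = new TreeSet<>();
        for (Map.Entry<String, String> e : lookup.entrySet()) {
            if (e.getValue().equals(target)) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    // Ground truth: transform each row value and compare with the target.
    static Set<String> rowsMatchingByApply(
            Map<String, String> lookup, String replacement,
            List<String> rows, String target) {
        Set<String> matched = new TreeSet<>();
        for (String row : rows) {
            if (apply(lookup, replacement, row).equals(target)) {
                matched.add(row);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = Map.of("a", "b");
        List<String> rows = List.of("a", "b", "c", "d"); // assumed example rows

        // Rows a selector on "b" should match: prints [a, b, c, d]
        System.out.println(rowsMatchingByApply(lookup, "b", rows, "b"));
        // Values found by reverse lookup alone: prints [a]
        System.out.println(unapply(lookup, "b"));
    }
}
```

Here the set of matching values is every value outside the map's keys plus "a", which cannot be enumerated from the map alone. That suggests a reverse-lookup rewrite is only safe when the selector value differs from the replacement value.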