Harmonization and bug-fixing for selector and filter behavior on unknown types. #9484

gianm · 2020-03-09T16:24:03Z

- Migrate ValueMatcherColumnSelectorStrategy to newer ColumnProcessorFactory
  system, and set defaultType COMPLEX so unknown types can be dynamically matched.
- Remove ValueGetters in favor of ColumnComparisonFilter doing its own thing.
- Switch various methods to use convertObjectToX when casting to numbers, rather
  than ad-hoc and inconsistent logic.
- Fix bug in RowBasedExpressionColumnValueSelector: isBindingArray should return
  true even for 0- or 1-element arrays.
- Adjust various javadocs.

gianm · 2020-03-09T16:26:25Z

Note to reviewers — some of the bugs fixed here aren't tested by existing tests, but I plan to add tests for them in a future patch that also adds a RowBasedStorageAdapter. That's because the simplest & best way to test them is to add a row-based cursor to BaseFilterTest, which won't exist until the future patch.

clintropolis

🤘

clintropolis · 2020-03-09T17:26:14Z

...ng/src/main/java/org/apache/druid/segment/virtual/RowBasedExpressionColumnValueSelector.java

@@ -95,7 +95,7 @@ private boolean isBindingArray(String x)
  {
    Object binding = bindings.get(x);
    if (binding != null) {
-      if (binding instanceof String[] && ((String[]) binding).length > 1) {
+      if (binding instanceof String[]) {
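For context on the hunk above, here is a minimal, self-contained sketch of why the `length > 1` condition matters. `BindingArraySketch` and its `bind` helper are hypothetical stand-ins, not the actual Druid class, and only the `String[]` case visible in the diff is modeled.

```java
import java.util.HashMap;
import java.util.Map;

public class BindingArraySketch
{
  // Hypothetical stand-in for the selector's expression bindings.
  private final Map<String, Object> bindings = new HashMap<>();

  public void bind(String name, Object value)
  {
    bindings.put(name, value);
  }

  // Old check from the removed line: a String[] with 0 or 1 elements is not reported as an array.
  public boolean isBindingArrayOld(String x)
  {
    Object binding = bindings.get(x);
    return binding instanceof String[] && ((String[]) binding).length > 1;
  }

  // New check from the added line: any String[] is reported as an array, whatever its length.
  public boolean isBindingArrayNew(String x)
  {
    Object binding = bindings.get(x);
    return binding instanceof String[];
  }
}
```

With the old check, a binding such as `new String[]{"x"}` or `new String[0]` would presumably be handled on the scalar path even though its value is an array; the new check classifies by type only.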


clintropolis · 2020-03-09T17:28:46Z

processing/src/main/java/org/apache/druid/segment/filter/ValueMatchers.java

+    };
+  }
+
+  public static ValueMatcher makeLongValueMatcher(final BaseLongColumnValueSelector selector, final String value)


super nit: missing javadoc (since almost all the others have it)

clintropolis · 2020-03-09T17:29:00Z

processing/src/main/java/org/apache/druid/segment/filter/ValueMatchers.java

+    };
+  }
+
+  public static ValueMatcher makeLongValueMatcher(


same nit re javadoc

gianm · 2020-03-09T18:48:36Z

Pushed some updates to address test failures in InputRowSerdeTest. I had to add a throwParseExceptions option to the RowBasedColumnSelectorFactory, since it turns out some users want that behavior and some don't.
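To illustrate the idea of a `throwParseExceptions` switch, here is a rough sketch of a lenient-or-strict object-to-number conversion. It is not the real `Rows.objectToNumber`: the class name, the `fail` helper, the use of `IllegalArgumentException`, and returning `null` in lenient mode are all assumptions made for the example.

```java
public class ObjectToNumberSketch
{
  // Converts an untyped value to a Number. Unparseable input either throws
  // (strict mode) or yields null (lenient mode), depending on the flag.
  public static Number objectToNumber(String name, Object value, boolean throwParseExceptions)
  {
    if (value == null) {
      return null;
    }
    if (value instanceof Number) {
      return (Number) value;
    }
    if (value instanceof String) {
      String s = ((String) value).trim();
      try {
        return Long.valueOf(s);
      }
      catch (NumberFormatException ignored) {
        // Not a long; try parsing as a double next.
      }
      try {
        return Double.valueOf(s);
      }
      catch (NumberFormatException e) {
        return fail(name, value, throwParseExceptions);
      }
    }
    return fail(name, value, throwParseExceptions);
  }

  // Hypothetical helper: centralizes the strict vs. lenient decision.
  private static Number fail(String name, Object value, boolean throwParseExceptions)
  {
    if (throwParseExceptions) {
      throw new IllegalArgumentException("Unable to parse value[" + value + "] for field[" + name + "]");
    }
    return null;
  }
}
```

A caller that wants bad input surfaced passes `true`; a caller that prefers to treat bad input as missing passes `false`.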

clintropolis · 2020-03-10T12:11:16Z

Tagged release notes because this PR changes the behavior of complex metric aggregation at ingestion time when SQL-compatible null handling is disabled (the default mode): it now aggregates the default 0 values for rows instead of skipping them. This change is for the better imo, since it makes things symmetrical with ingesting the raw data and building the sketch at query time, but it is different, so it's worth calling out. You can see the effects in some of the test changes in this PR.

gianm · 2020-03-10T14:14:46Z

> Tagged release notes because this PR changes the behavior of complex metric aggregation at ingestion time when SQL-compatible null handling is disabled (the default mode): it now aggregates the default 0 values for rows instead of skipping them. This change is for the better imo, since it makes things symmetrical with ingesting the raw data and building the sketch at query time, but it is different, so it's worth calling out. You can see the effects in some of the test changes in this PR.

Thanks for pointing that out. Yes, I agree, it is for the better since it makes the ingest-time behavior and query-time behavior the same. This is part of the promise of Druid rollup in the first place (you can move aggregations to ingest time if you want).

Btw, this patch also ends up making ingest-time transforms and filters behave more consistently with query-time ones.

The reason is that all this ingest-time stuff runs in unknown-type mode, which until now had various inconsistencies with known-type mode (which is used at query time).
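As a toy illustration of the behavior change discussed above (aggregating the default 0 for missing values instead of skipping the row), here is a small sketch. A `HashSet` stands in for a real sketch aggregator, and the class and method names are hypothetical; none of this is Druid code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DefaultValueAggregationSketch
{
  // Aggregates values into a set (a stand-in for a complex metric such as a sketch).
  // skipNulls = true models the old ingest-time behavior described in the thread;
  // skipNulls = false models replacing a missing value with the default 0 and aggregating it.
  public static Set<Double> aggregate(List<Double> values, boolean skipNulls)
  {
    Set<Double> seen = new HashSet<>();
    for (Double v : values) {
      if (v == null) {
        if (skipNulls) {
          continue;
        }
        v = 0.0;
      }
      seen.add(v);
    }
    return seen;
  }
}
```

For the input `1.0, null, 2.0`, the skipping variant aggregates two values, while the default-value variant also includes `0.0`. Under the assumption that query-time aggregation over raw data sees the default 0, only the second variant matches it.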

gianm added the Area - Querying label Mar 9, 2020

clintropolis approved these changes Mar 9, 2020

Add throwParseExceptions option to Rows.objectToNumber, switch back to that.

7b70930

gianm added 2 commits March 9, 2020 15:42

Update tests.

9fca43f

Adjust moment sketch tests.

b6f703b

clintropolis added the Release Notes label Mar 10, 2020

gianm merged commit c6c2282 into apache:master Mar 10, 2020

gianm deleted the harmonize-untyped branch March 10, 2020 14:16

jihoonson added this to the 0.18.0 milestone Mar 26, 2020

jihoonson mentioned this pull request Apr 9, 2020

[Draft] 0.18.0 release notes #9652

Closed

jihoonson mentioned this pull request May 3, 2020

Fix filtering on boolean values in transformation #9812

Merged

9 tasks

morokosi mentioned this pull request Sep 9, 2020

Filtered Aggregator at ingestion time don't work #10293

Open
