Affected Version
V27.0.0
Impact
This issue appears to be reliably reproduced by executing a single-dimension, single-filter Native Druid query on any string dimension in a kinesis ingestion task that is derived from a Schema Auto-Discovery spec, as long as the data has not been handed off. The issue resolves after hand-off to Historicals.
Expected Result
GroupBy and Timeseries Queries against actively ingested single dimension values are consistently filtered without regard to data residency (realtime vs fully persisted segment).
Actual Result
GroupBy and Timeseries Queries against actively ingested single dimension values temporarily ignore or mis-apply filters until data segments are persisted, at which point filters are correctly applied.
Description
My team operates multiple large-scale Druid clusters with roughly identical base configurations. Pertinent details are as follows:
- Ingestion Method: kinesis
- Segment size: 1 hour
- Lookback period: 3 hours (a small portion of our data is late-arriving)
- Relevant Middle Manager architecture: ARM processors, statically defined hardware, dedicated to kinesis ingestion tasks
- Other Middle Manager tasks, such as compaction, are delegated to a separate Middle Manager tier
As part of Schema Auto-discovery migration, we migrated one of our regions to a new schema in which we only define a few legacy lists (to retain them as MVDs) and aggregations - the rest of our fields are ingested via discovery. In total, we produce records with ~100-150 fields, and the dataTypes do appear to align correctly post-migration.
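For context, the relevant portion of such an ingestion spec might look like the following sketch (the field names are hypothetical; `useSchemaDiscovery` is the Druid dimensionsSpec option that enables auto-discovery, while explicitly listed string dimensions retain their classic, MVD-capable handling):

```json
"dimensionsSpec": {
  "useSchemaDiscovery": true,
  "dimensions": [
    { "type": "string", "name": "legacy_list_a" },
    { "type": "string", "name": "legacy_list_b" }
  ]
}
```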
In the process of migrating, we stumbled across a perplexing issue with GroupBy and Timeseries queries. Whenever we perform a single-dimension query that overlaps/involves data on the Middle Managers (in our case, queries that touch the most recent 3 hours), the results received are nonsensical - the filter appears to be either inconsistently applied or not applied at all, resulting in other dimension values 'leaking' into the results despite being ruled out by the filter. This behavior is almost reminiscent of some sort of MVD edge case, but the fields experiencing this issue are strictly singular string values (and, as mentioned further down, the behavior changes between different points of the segment's lifecycle, indicating some sort of discrepancy based on query path / segment state).
Consider the following minimal reproducible query, a GroupBy that groups and filters by an example_field dimension:
{
  "queryType": "groupBy",
  "dataSource": "Example_Records",
  "granularity": "all",
  "filter": {
    "type": "selector",
    "dimension": "example_field",
    "value": "expected_value"
  },
  "dimensions": ["example_field"],
  "intervals": [
    "2023-10-17T00:00:00+0000/2023-10-17T20:55:00+0000"
  ]
}
Assuming example_field is guaranteed to be a simple string value, this query should return at most one row - the value expected_value. However, that is not what happens.
- When executed on a data range that still resides on Middle Managers, this query returns between 20-40 different rows with miscellaneous values for example_field.
- When executed on a data range that has been successfully handed off to Historicals, this query returns the correct / expected value of only expected_value.
- When the same query is executed twice with a 3-hour delay between runs, it will first return the nonsensical result - and then later return the expected result - indicating a behavior change between the comparable Middle Manager and Historical queries.
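For reference, once the interval has been handed off, the single expected groupBy result row comes back shaped roughly like this (timestamp illustrative, per the native groupBy result format):

```json
[
  {
    "version": "v1",
    "timestamp": "2023-10-17T00:00:00.000Z",
    "event": { "example_field": "expected_value" }
  }
]
```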
Oddly enough, a modification to the original query appears to fix it. If an additional dimension - even one that doesn't exist - is added to the query (ordering does not matter), it returns the expected result 100% of the time:
{
  "queryType": "groupBy",
  "dataSource": "Example_Records",
  "granularity": "all",
  "filter": {
    "type": "selector",
    "dimension": "example_field",
    "value": "expected_value"
  },
  "dimensions": ["example_field", "oof"],
  "intervals": [
    "2023-10-17T00:00:00+0000/2023-10-17T20:55:00+0000"
  ]
}
The above query will always return one row with an example_field value of expected_value and an oof value of null, somehow avoiding the nonsensical condition of the first query.
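As a stopgap, the dummy-dimension workaround can be applied mechanically before dispatching any query that touches the realtime window, and leaked rows can be detected after the fact. A minimal sketch in Python (the helper names and the "oof" dummy dimension are illustrative, not part of Druid):

```python
import json

def add_dummy_dimension(query: dict, dummy: str = "oof") -> dict:
    """Return a copy of a native groupBy query with an extra,
    nonexistent dimension appended - the workaround described above."""
    patched = json.loads(json.dumps(query))  # cheap deep copy
    patched["dimensions"] = list(patched.get("dimensions", [])) + [dummy]
    return patched

def rows_violating_filter(rows: list, dimension: str, expected: str) -> list:
    """Return the groupBy result rows whose dimension value 'leaked'
    past a selector filter (the symptom observed on Middle Manager data)."""
    return [r for r in rows if r["event"].get(dimension) != expected]

original = {
    "queryType": "groupBy",
    "dataSource": "Example_Records",
    "granularity": "all",
    "filter": {"type": "selector", "dimension": "example_field",
               "value": "expected_value"},
    "dimensions": ["example_field"],
    "intervals": ["2023-10-17T00:00:00+0000/2023-10-17T20:55:00+0000"],
}

patched = add_dummy_dimension(original)
# The original query object is left untouched; only the copy gains "oof".
```

Any rows reported by rows_violating_filter on a realtime-window response would confirm the filter leak described in this issue.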