SQL queries don't return rows with null values when null and non-null predicates over a dimension are specified in the where clause. For example, the following query only returns rows with X = 0:
SELECT X, __time
FROM my_data
WHERE (X IS NULL OR X = 0)
However, if I remove the X = 0 predicate, Druid does return rows with null as expected. Combining null and non-null predicates over the same dimension seems to be the issue.
Further looking into the query plan, I see this:
{
"type": "in",
"dimension": "X",
"values": [
null,
"0"
]
}
I think null values aren't compatible with in and should be treated differently?
FWIW, Null handling is enabled with druid.generic.useDefaultValueForNull=false and X is a long datatype.
This issue is found in Druid 0.20.1.
/cc: @clintropolis