Describe the bug
Statistics::with_fetch unconditionally returns Precision::Exact(0) when nr <= skip, even when the input num_rows is Precision::Inexact. This precision promotion causes AggregateStatistics to falsely fold COUNT(*) to literal 0 on multi-way joins.
DataFusion version: 53.1.0
The problem
In stats.rs:454-464, both Exact(nr) and Inexact(nr) fall into the same match arm:
Statistics {
    num_rows: Precision::Exact(nr),
    ..
}
| Statistics {
    num_rows: Precision::Inexact(nr),
    ..
} => {
    if nr <= skip {
        Precision::Exact(0) // ← promotes Inexact(0) to Exact(0)
    }
When nr = 0 and skip = 0, the condition 0 <= 0 is true and the function returns Exact(0) regardless of whether the input was Exact or Inexact.
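The collapse of the two variants into one arm can be modelled in isolation. The following is a minimal sketch, not the DataFusion source: the `Precision` enum is a simplified local stand-in, `num_rows_after_skip` is a hypothetical name, and only the `nr <= skip` branch is modelled (all other inputs pass through unchanged, which the real function does not do).

```rust
/// Simplified local stand-in for DataFusion's `Precision<usize>`
/// (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Models only the `nr <= skip` branch described above.
/// Every other input is passed through unchanged in this sketch.
fn num_rows_after_skip(num_rows: Precision, skip: usize) -> Precision {
    match num_rows {
        // Both variants share one arm, so the input precision is forgotten here.
        Precision::Exact(nr) | Precision::Inexact(nr) => {
            if nr <= skip {
                // Returned for Exact and Inexact inputs alike.
                Precision::Exact(0)
            } else {
                num_rows
            }
        }
        Precision::Absent => Precision::Absent,
    }
}
```

In this model, `num_rows_after_skip(Precision::Inexact(0), 0)` evaluates to `Precision::Exact(0)`, which is the promotion reported in this issue.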
How this causes incorrect query results
On multi-way joins (4+ tables), estimate_disjoint_inputs can produce a false disjoint detection when per-partition column min/max ranges appear non-overlapping (this depends on the physical partition layout and varies between runs). The chain:
1. estimate_disjoint_inputs detects false disjoint ranges on join key columns.
2. estimate_join_statistics wraps the cardinality as Inexact(0) (line 452, always Inexact).
3. HashJoinExec::partition_statistics calls stats.with_fetch(self.fetch, 0, 1). With nr=0, skip=0, with_fetch promotes Inexact(0) to Exact(0).
4. Count::value_from_stats (count.rs:371) requires Precision::Exact on num_rows to fold COUNT(*). The false Exact(0) matches, and AggregateStatistics replaces the aggregate with PlaceholderRowExec, returning 0.
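The last step of the chain hinges on an exactness gate. A minimal sketch of such a gate follows, assuming a simplified local `Precision` enum; `fold_count_star` is a hypothetical name and is not the body of Count::value_from_stats.

```rust
/// Simplified local stand-in for DataFusion's `Precision<usize>`
/// (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Returns a constant for COUNT(*) only when the row count is exact.
/// Estimates and missing statistics must not be folded.
fn fold_count_star(num_rows: Precision) -> Option<usize> {
    match num_rows {
        Precision::Exact(n) => Some(n),
        Precision::Inexact(_) | Precision::Absent => None,
    }
}
```

With this gate, an estimate of `Inexact(0)` leaves the aggregate in the plan, while a (falsely) promoted `Exact(0)` folds it to the constant 0.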
The bug is flaky because it depends on the partition layout producing non-overlapping min/max ranges for join keys in the merged partition statistics. Different data distributions, partition counts, or memory constraints change the layout and may or may not trigger the false disjoint detection.
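For intuition, a min/max disjointness test reduces to an interval comparison. The sketch below is a simplified model and not the code of estimate_disjoint_inputs: `KeyRange` and `ranges_look_disjoint` are hypothetical names, and join keys are assumed to be `i64`.

```rust
/// Closed interval of values reported by statistics for a join key
/// (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy)]
struct KeyRange {
    min: i64,
    max: i64,
}

/// Two ranges are treated as disjoint when one ends before the other begins.
fn ranges_look_disjoint(left: KeyRange, right: KeyRange) -> bool {
    left.max < right.min || right.max < left.min
}
```

If the ranges fed into a check like this are narrower than the data that actually flows through the join, it can report disjoint inputs even though matching rows exist, which is why the resulting row estimate of 0 has to stay Inexact.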
Example: TPC-H 4-table join
The following query over TPC-H data can return cnt = 0 instead of the correct result (~2.4M rows at SF150) when partition statistics happen to produce disjoint min/max ranges on the join keys:
SELECT COUNT(*) AS cnt
FROM part, partsupp, supplier, nation
WHERE p_partkey = ps_partkey
  AND s_suppkey = ps_suppkey
  AND s_nationkey = n_nationkey
  AND p_size = 15;
The conditions that trigger the bug:
- 4+ tables joined via inner joins (deeper join trees propagate column stats through more layers).
- Leaf tables with Exact column min/max stats (common with Parquet file metadata).
- A selective filter (p_size = 15) that narrows the apparent min/max range of join keys in intermediate join outputs.
- Multiple partitions whose merged statistics can appear disjoint.
This is the join skeleton of TPC-H Q2. The flakiness means it may not reproduce on every run; reducing the memory pool size (forcing different partition layouts) or increasing the number of partitions increases the probability.
To Reproduce
The simplest deterministic reproduction is a unit test against Statistics::with_fetch:
// Assumes Statistics and Precision are in scope, e.g. via `use super::*;`
// in the tests module of stats.rs.
#[test]
fn test_with_fetch_preserves_inexact_precision() {
    let stats = Statistics {
        num_rows: Precision::Inexact(0),
        total_byte_size: Precision::Absent,
        column_statistics: vec![],
    };
    // fetch=None, skip=0, n_partitions=1
    let result = stats.with_fetch(None, 0, 1).unwrap();
    // Should preserve Inexact, not promote to Exact
    assert_eq!(result.num_rows, Precision::Inexact(0));
}
This test currently fails:
assertion `left == right` failed
  left: Exact(0)
 right: Inexact(0)
Expected behavior
When nr <= skip and the input num_rows is Inexact, with_fetch should return Inexact(0) (preserving the precision level), not Exact(0).
Suggested fix
Preserve the precision when nr <= skip:
if nr <= skip {
    if self.num_rows.is_exact().unwrap_or(false) {
        Precision::Exact(0)
    } else {
        Precision::Inexact(0)
    }
}
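The intended behaviour can be checked in isolation with a small model. This is a sketch under assumptions, not a patch: the `Precision` enum is a simplified local stand-in, `num_rows_after_skip_fixed` is a hypothetical name, and only the `nr <= skip` branch is modelled (other inputs pass through unchanged, unlike the real function).

```rust
/// Simplified local stand-in for DataFusion's `Precision<usize>`
/// (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Models the `nr <= skip` branch with the precision level preserved.
/// Inputs outside that branch are passed through unchanged in this sketch.
fn num_rows_after_skip_fixed(num_rows: Precision, skip: usize) -> Precision {
    match num_rows {
        // An exact count that is fully skipped is exactly zero.
        Precision::Exact(nr) if nr <= skip => Precision::Exact(0),
        // An estimate that is fully skipped stays an estimate.
        Precision::Inexact(nr) if nr <= skip => Precision::Inexact(0),
        other => other,
    }
}
```

Splitting the arm by variant keeps `Inexact(0)` from ever being reported as `Exact(0)`, so a downstream consumer that requires exact statistics no longer acts on an estimate.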
Additional context
- FilterExec::collect_new_statistics converts Absent column stats to Exact(NULL), corrupting join cardinality estimation #20388 (FilterExec converts Absent stats to Exact(NULL), another false-precision-promotion bug in the same area).
- COUNT(*) = 0 regression on a 4-table inner join over TPC-H SF150 data with DataFusion 53.1.0.