[common] Fix O(2^n) complexity in FileIndexPredicate.getRequiredNames#7332

Open
dubin555 wants to merge 1 commit into apache:master from
dubin555:oss-scout/verify-fix-fileindex-predicate-exponential-complexity

dubin555 commented Mar 2, 2026

Purpose

Linked issue: close #7230

FileIndexPredicate.getRequiredNames() calls child.visit(this) twice per child in its CompoundPredicate visitor — once discarding the result, then again to collect it. Since PredicateBuilder.or() produces right-nested binary trees via reduce(), this doubles work at each tree level, resulting in O(2^n) time complexity.

For an IN clause with 20 values (which produces a nested OR tree of depth 19), this means on the order of 2^20 (about one million) leaf visits instead of 20. In production, queries with moderately sized IN clauses hang indefinitely.

The fix removes the redundant child.visit(this) call (line 130), matching the correct pattern already used in PredicateVisitor.FieldNameCollector.
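The effect of the redundant call can be reproduced on a toy model. The sketch below is illustrative only: `Node`, `Leaf`, `Compound`, `Collector`, and `orChain` are hypothetical stand-ins rather than Paimon's classes, and the `visitTwice` flag merely switches between the pattern described above (visit, discard, visit again) and the single-visit pattern. The absolute counts from this model need not match the figures reported for the real code; the point is the growth rate.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: these types are hypothetical stand-ins, not Paimon's classes. */
public class DoubleVisitSketch {

    interface Node {
        Set<String> accept(Collector collector);
    }

    static final class Leaf implements Node {
        final String fieldName;

        Leaf(String fieldName) {
            this.fieldName = fieldName;
        }

        public Set<String> accept(Collector collector) {
            return collector.visitLeaf(this);
        }
    }

    static final class Compound implements Node {
        final List<Node> children;

        Compound(Node... children) {
            this.children = Arrays.asList(children);
        }

        public Set<String> accept(Collector collector) {
            return collector.visitCompound(this);
        }
    }

    /** Collects field names and counts how often a leaf is reached. */
    static final class Collector {
        final boolean visitTwice;
        long leafVisits = 0;

        Collector(boolean visitTwice) {
            this.visitTwice = visitTwice;
        }

        Set<String> visitLeaf(Leaf leaf) {
            leafVisits++;
            Set<String> names = new HashSet<>();
            names.add(leaf.fieldName);
            return names;
        }

        Set<String> visitCompound(Compound compound) {
            Set<String> names = new HashSet<>();
            for (Node child : compound.children) {
                if (visitTwice) {
                    child.accept(this); // redundant: the result is thrown away
                }
                names.addAll(child.accept(this)); // the only visit that is needed
            }
            return names;
        }
    }

    /** Builds f1 OR (f2 OR (... OR fn)) as a nested binary chain. */
    static Node orChain(int n) {
        Node tree = new Leaf("f" + n);
        for (int i = n - 1; i >= 1; i--) {
            tree = new Compound(new Leaf("f" + i), tree);
        }
        return tree;
    }

    static long leafVisits(int n, boolean visitTwice) {
        Collector collector = new Collector(visitTwice);
        orChain(n).accept(collector);
        return collector.leafVisits;
    }

    public static void main(String[] args) {
        for (int n : new int[] {5, 10, 20}) {
            System.out.println(n + " leaves: double-visit=" + leafVisits(n, true)
                    + ", single-visit=" + leafVisits(n, false));
        }
    }
}
```

In this model the single-visit count equals the number of leaves, while the double-visit count roughly doubles with every additional leaf in the chain.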

The bug was introduced in ebdfa02bd ("[hotfix] Correct visitors for TransformPredicate"), which refactored the visitor to handle TransformPredicate and accidentally left the duplicate call.

Tests

  • FileIndexPredicateTest.testGetRequiredNamesLinearComplexity() — builds a 20-element OR chain, counts leaf visits via AtomicInteger. Asserts exactly 20 visits (linear). Before fix: 1,048,575 visits (exponential).
  • FileIndexPredicateTest.testGetRequiredNamesPerformance() — builds a 20-element OR chain, asserts completion within 100ms.
  • FileIndexPredicateTest.testGetRequiredNamesBasic() — verifies correctness: all field names are collected from a compound predicate.
  • FileIndexPredicateTest.testGetRequiredNamesSinglePredicate() — verifies single leaf predicate returns the correct field name.
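The visit-counting idea behind the linear-complexity test can be sketched without any test framework. This is a minimal sketch under stated assumptions, not the PR's test code: `Pred`, `requiredNames`, `orChain`, and `countLeafVisits` are hypothetical names, the traversal is a plain single-visit recursion, and the chain is built with `Stream.reduce` only to mimic the nested binary shape described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a visit-counting test; not the PR's actual test class. */
public class LinearVisitCheck {

    /** Either a leaf carrying a field name, or an OR node with two children. */
    static final class Pred {
        final String field; // non-null for leaves
        final Pred left;    // non-null for OR nodes
        final Pred right;   // non-null for OR nodes

        Pred(String field) {
            this.field = field;
            this.left = null;
            this.right = null;
        }

        Pred(Pred left, Pred right) {
            this.field = null;
            this.left = left;
            this.right = right;
        }
    }

    /** Single-visit traversal: every child is descended into exactly once. */
    static Set<String> requiredNames(Pred pred, AtomicInteger leafVisits) {
        Set<String> names = new LinkedHashSet<>();
        if (pred.field != null) {
            leafVisits.incrementAndGet();
            names.add(pred.field);
        } else {
            names.addAll(requiredNames(pred.left, leafVisits));
            names.addAll(requiredNames(pred.right, leafVisits));
        }
        return names;
    }

    /** Folds the leaves pairwise into a nested binary OR chain. */
    static Pred orChain(List<String> fields) {
        return fields.stream()
                .map(f -> new Pred(f))
                .reduce((a, b) -> new Pred(a, b))
                .get();
    }

    /** Builds an n-leaf chain, checks the collected names, and returns the leaf-visit count. */
    static int countLeafVisits(int n) {
        List<String> fields = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            fields.add("f" + i);
        }
        AtomicInteger visits = new AtomicInteger();
        Set<String> names = requiredNames(orChain(fields), visits);
        if (names.size() != n) {
            throw new AssertionError("expected " + n + " names, got " + names.size());
        }
        return visits.get();
    }

    public static void main(String[] args) {
        int visits = countLeafVisits(20);
        if (visits != 20) {
            throw new AssertionError("expected 20 leaf visits, got " + visits);
        }
        System.out.println("leaf visits: " + visits);
    }
}
```

Asserting on an exact visit count makes the check deterministic, which is why it complements a wall-clock bound such as the 100ms limit used in the performance test.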

API and Format

No.

Documentation

No.

Generative AI tooling

Generated-by: Claude Code 1.0.33

Remove redundant child.visit(this) call in getRequiredNames() that caused
exponential time complexity for deeply nested OR predicates (e.g. IN clauses).
The visitor called child.visit(this) twice per child — once discarding the
result, then again using it — doubling work at each tree level.

For IN clauses with <= 20 values, which produce right-nested OR trees over N
leaves, this caused O(2^N) leaf visits instead of O(N) and hung queries in
production.

Closes apache#7230
Development

Successfully merging this pull request may close these issues.

[Bug] FileIndexPredicate getRequiredNames() redundant child.visit() causing Exponential algorithmic complexity
