Skip to content

Add solve fuzzer and fix the soundness bugs it surfaced#9105

Draft
alexreinking wants to merge 2 commits intomainfrom
alexreinking/solve-fuzzer
Draft

Add solve fuzzer and fix the soundness bugs it surfaced#9105
alexreinking wants to merge 2 commits intomainfrom
alexreinking/solve-fuzzer

Conversation

@alexreinking
Copy link
Copy Markdown
Member

@alexreinking alexreinking commented Apr 23, 2026

Summary

  • New libfuzzer harness at test/fuzz/solve.cpp for solve_expression, solve_for_{inner,outer}_interval, and and_condition_over_domain, running over random expression trees up to depth 6 across Int(8..64), UInt(1..64), Float(32), Float(64), with broadcasts and ramps enabled.
  • Fixes every soundness bug the fuzzer surfaced along the way — five in Solve.cpp, four in Bounds.cpp, and one in Simplify_Cast.cpp — each with a companion regression test in test/correctness/solve.cpp that reproduces the failure without the fix.
  • Clean at 250k+ fuzz iterations on the full type set after the fixes.

Solve.cpp

  • visit_cmp a + b @ c -> a @ c - b flipped ordering under modular wrap; restrict to no_overflow_int, keep EQ/NE.
  • visit_cmp Mul rewrites introduced div-by-zero UB on non-constant multipliers; move EQ/NE cases under the positive-const guard and gate on no_overflow_int.
  • visit(Add) a + a -> a * 2 aborted on UInt(1); switch to Mul::make(a, make_const(a.type(), 2)).
  • visit(Add)/visit(Sub) f(x)/a + g(x) -> (f(x) + g(x)*a)/a wrapped on narrow types; gate on no_overflow_int.
  • Canonicalize two inlined copies of is_int() && bits() >= 32 to no_overflow_int.

Bounds.cpp

  • visit(Cast) shortcut silently wrapped unsigned narrowings to signed (UInt(64) -> Int(32) carried a wrapped-negative as max); restrict to non-unsigned sources.
  • Bounded signed-int sources could still wrap and invert the interval (e.g. int64 [59, 4294967287] -> int32 [59, -9]); further restrict the shortcut to unbounded signed-int sources and let bounded ones fall through to the existing can_prove(cast(from, cast(to, x)) == x) fit check.
  • Float sources hit the same flavor of bug (two float endpoints wrap to opposite sides of zero, inverting the int interval); drop the float shortcut entirely — the bounded-a path already does a sound to.can_represent(*fmin) fit check.
  • visit(Mod) claimed [0, ...] even for floats; fmod takes the sign of the dividend and is NaN on zero divisor, so return bounds_of_type(t) for float types.

Simplify_Cast.cpp

  • Nested-cast rule stripped the inner cast on the "sign/zero extend then truncate" theory, but that only holds when the inner source is int-or-uint. For a float source (e.g. int32(uint64(float64(-21)))) stripping changes the value. Add the is_int_or_uint() guard.

Test plan

  • ctest -R 'correctness_(solve|bounds|simplify)$' passes
  • test/fuzz/fuzz_solve -runs=10000 across multiple seeds with floats enabled runs clean
  • Reverting each fix individually reproduces the corresponding regression test failure

Fixes #9101

🤖 Generated with Claude Code

Adds a libfuzzer harness at test/fuzz/solve.cpp that exercises
solve_expression (semantic equivalence under random substitutions),
solve_for_inner_interval / solve_for_outer_interval (inside-inner
implies true; outside-outer implies false), and
and_condition_over_domain (weakened condition implies input) on
random expression trees up to depth 6 across Int(8..64), UInt(1..64),
Float(32), Float(64), with broadcasts and ramps enabled. The
random_expr_generator now skips bitwise ops on float types, which
would otherwise abort the fuzzer on construction.

Alongside the fuzzer, this fixes every soundness bug the fuzzer
surfaced while building it up to the wide type set:

Solve.cpp (src/Solve.cpp):
  - visit_cmp rewrote `a + b @ c` to `a @ c - b` for every
    comparison, but the subtraction wrap flips ordering (LT/LE/GT/GE)
    under modular arithmetic. Restrict ordering rewrites to
    no_overflow_int; EQ/NE stay (modular equivalence preserves
    equality).
  - visit_cmp's Mul rewrites `a*c @ b <=> a @ b/c && b%c == 0`
    computed b/c and b%c for any shape of c, introducing div-by-zero
    UB when c was a non-constant that evaluated to zero. The code
    comment claimed it only fired on constant positive multipliers;
    move the EQ/NE cases inside that guard and also require
    no_overflow_int. Narrow signed types (where overflow is still UB)
    go through the same gate.
  - visit(Add)'s `a + a -> a * 2` used `Expr * int`, which aborts on
    UInt(1) ("Integer constant 2 ... coerced to type uint1"). Switch
    to Mul::make(a, make_const(a.type(), 2)); make_const truncates
    modulo width, which is the correct modular value.
  - visit(Add)/visit(Sub) `f(x)/a + g(x) -> (f(x) + g(x)*a)/a` only
    holds under non-wrapping arithmetic. Gate on no_overflow_int.
  - Canonicalize two inlined `a.type().is_int() && bits() >= 32`
    predicates to no_overflow_int.

Bounds.cpp (src/Bounds.cpp):
  - visit(Cast) took a blanket "no overflow possible" shortcut for
    signed-int/at-least-int32 destinations regardless of the source,
    which silently wrapped UInt(32)/UInt(64) -> Int(32) narrowings
    and carried a wrapped-negative as "max". Restrict the shortcut
    to non-unsigned sources.
  - After the first restriction, the fuzzer also found that *bounded*
    signed-int sources can exceed the destination and wrap (e.g.
    int64 bounds [59, 4294967287] becoming int32 [59, -9], an
    inverted interval that downstream treats as "always true").
    Further restrict the shortcut to unbounded signed-int sources;
    let bounded ones fall through to the existing
    can_prove(cast(from, cast(to, x)) == x) fit check.
  - Later, the fuzzer (with floats enabled) caught the same flavor
    of bug for float sources: the comment claimed "float-to-int
    casts truncate in place and preserve interval orientation", but
    IntImm::make wraps by sign-extending low bits, and two float
    endpoints can wrap to opposite sides of zero (inverting the
    interval). Drop the float shortcut entirely; the bounded-a path
    already does a sound to.can_represent(*fmin) fit check with a
    fallback to bounds_of_type(to).
  - visit(Mod) unconditionally claimed the result is in [0, ...],
    which matches integer mod but not fmod (sign of the dividend;
    NaN on zero divisor). For float types, return
    bounds_of_type(t) (unbounded).

Simplify_Cast.cpp (src/Simplify_Cast.cpp):
  - A nested-cast simplification stripped the inner cast when the
    outer was narrower, on the grounds that the inner is always a
    sign or zero extend. That assumption only holds when the inner
    cast's source is itself int-or-uint; for a float source, the
    inner is an fp-to-int conversion and stripping it changes the
    value. Concretely, `int32(uint64(float64(-21)))` evaluates to 0
    (fp-to-unsigned of negative is implementation-defined, often 0),
    but the stripped `int32(float64(-21))` evaluates to -21. Add
    the `cast->value.type().is_int_or_uint()` guard, mirroring the
    sibling rule above it.

Each of the solver and bounds fixes has a companion regression test
in test/correctness/solve.cpp that, prior to the fix, reproduces the
specific failure mode (verified by reverting each fix individually).

With all fixes in place, the fuzzer runs clean for 250k+ iterations
across the full type set; the correctness suite (solve, bounds,
simplify, cast) still passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@alexreinking alexreinking marked this pull request as draft April 23, 2026 16:15
The previous commit dropped the float-source shortcut from visit(Cast)
to fix the fuzzer-found unsoundness where constant float bounds outside
int range wrapped via IntImm::make and inverted the resulting int
interval. That was too broad: bilateral_grid (and similar pipelines)
produce SYMBOLIC float bounds like
  select(r_sigma > 0, 0/r_sigma, 1/r_sigma) + 0.5
at bounds-inference time, which can't be proven to fit in int range but
are load-bearing for downstream BoundsInference to resolve Func call
extents -- without them, "histogram calls histogram in an unbounded way
in dimension 2" and the build fails.

Restore the float-source shortcut for bounded-a, but reject it only
when an endpoint is a float *constant* we can prove exceeds the int
range. Symbolic bounds pass through (matching Halide's convention that
out-of-range float-to-int is the programmer's problem); constant bounds
that would wrap do not.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread src/Bounds.cpp
could_overflow = false;
} else if (from.is_float() && to.is_int() && to.bits() >= 32) {
// Float source to int32+: out-of-range float-to-signed-int
// casts are implementation-defined, so Halide's convention
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly misleading. Float to signed int casts in Halide saturate for bits <= 16.

Comment thread src/Bounds.cpp

// Mod is always positive
if (t.is_float()) {
// fmod's result takes the sign of the dividend (so it can be
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mod in Halide is supposed to be positive. For integers we have the Euclidean identity:

(a / b) * b + (a % b) == a

The division rounds up when b is negative, so the LHS of the mul is larger than the true answer, which means, when multiplied by the negative b, the LHS of the add is smaller than the true answer, which means mod must be positive.

I think the bug might be in lowering of floating point mod. It's currently lowered as:

a - floor(a / b) * b

but it should be

a - floor((a/b) * b)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this more, my rationale is this: Floating point mod should coincide with integer mod if the floats happen to be integers. This means if 7 % -2 == 1 in the integers 7.f % -2.f should be 1.f in floats.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Floating point mod should coincide with integer mod if the floats happen to be integers.

I would be wary of deviating from what the hardware does (fmod) for this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've already deviated for integer div. We should be consistent.

Comment thread src/Simplify_Cast.cpp
// a float source makes `cast` an fp-to-int conversion, whose low
// bits are not the same as an fp-to-int conversion of a narrower
// type. For example, int32(uint64(float64(-21))) evaluates to 0
// (undefined-behavior territory for negative float-to-unsigned)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think casting a negative float to uint is UB in Halide. It's supposed to saturate.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though again, my understanding may not match the current lowering!

Comment thread src/Bounds.cpp
// for the fit check below to prove, and the destination is
// wide enough that any concretization will fit.
//
// Unsigned sources are excluded unconditionally: unsigned
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is true either. A uint to int cast is assumed to not overflow.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #7814 for what happened when I tried to claim they wrap.

Comment thread src/Solve.cpp
// Several of these rewrites rearrange the comparison by adding
// or subtracting on both sides. That's only sound under an
// assumption of no integer overflow -- for types that may wrap
// (unsigned, or narrow signed whose overflow is UB and could
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"narrow signed whose overflow is UB" Not entirely sure what narrow means here but our narrow signed ints don't have UB on overflow.

Comment thread src/Solve.cpp
// uint8 with c = 3, b = 7: the rewrite misses the
// solutions that arise from wrap).
//
// Don't use operator/ and operator % to sneak past the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the original comment was stale. Division and mod by zero is now fine (returns zero). For the rewrite I think this means:

a * 0 == b <=> a == b / 0

which is wrong, so it's just the comment that needs correcting.

I'm worried that the earlier comment says we would only use this for positive or negative constants, but there's no guard to check that. It might mean that we don't statically know at this point that it's a non-zero constant, but the usage site knows that for some other reason. That's sort of horrifying so I hope it isn't the case.

Comment thread src/Solve.cpp
@@ -644,16 +672,14 @@ class SolveExpression : public IRMutator {
} else if (is_ne) {
// f(x) * c != b -> f(x) != b/c || b%c != 0
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see the other rules were guarded by is_positive_const, just not this one, so it's not the horrifying situation. Let's check

f(x) * 0 == b <=> f(x) == b/0 || b%0 == 0

Simplifying both sides

b == 0 <=> true

Nope. The NE case is also busted.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(meaning the code change here is correct)

@@ -0,0 +1,404 @@
#include "Halide.h"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The solver tests are currently in Solve.cpp at the bottom. It would be good to move all of them here, instead of just making a new collection.

Comment thread test/fuzz/solve.cpp
// Returns true if the expression, under the given substitution, contains a
// division or modulo whose divisor simplifies to zero. Solve is allowed to
// rearrange such expressions in ways that cause the simplifier to fold the UB
// inconsistently between the original and rewritten form -- so samples that
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Division by zero is not UB!

Comment thread test/fuzz/solve.cpp
val = random_int_val(reg.fuzz, -32, 32);
}

// Skip samples that invoke div/mod-by-zero UB in the input: the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, not UB

Comment thread src/Bounds.cpp
// extends the low bits and can invert the resulting
// interval, which is worse than just using bounds_of_type.
Interval s = a;
if (s.has_lower_bound()) s.min = simplify(s.min);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a method elsewhere in this file that simplifies whole intervals. It contains a critical difference to this: It only calls the simplifier once if max.same_as(min). This guarantees the two don't become two different but equal Expr objects, which would cause is_single_point to fail.

Comment thread src/Bounds.cpp
double lo = -std::ldexp(1.0, to.bits() - 1);
double hi_exclusive = std::ldexp(1.0, to.bits() - 1);
bool const_oob = false;
if (s.has_lower_bound()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This path may break pipelines that rely on Bounds inference "knowing" that cast<int>(abs(unbounded_float)) is non-negative. I really think we need to just throw up our hands here if the cast is to an int32 and assume it's not going to overflow. People use all sorts of types cast to int to index luts, and they assume the bounds on the original are still true through the cast.

@abadams
Copy link
Copy Markdown
Member

abadams commented Apr 23, 2026

I think, unfortunately, like other fuzzers this one is going to have to carefully avoid getting itself into a situation where an Int(32) Expr overflows.

Comment thread src/Bounds.cpp
if (t.is_float()) {
// fmod's result takes the sign of the dividend (so it can be
// negative), its magnitude is bounded by |b| but can reach it,
// and fmod(x, 0) = NaN which is not in any real interval. Don't
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're assuming no-nans here. Otherwise it would be a hypothetical "strict_mod" intrinsic.

Comment thread src/Simplify_Cast.cpp
// eliminated. The inner cast is either a sign extend
// or a zero extend, and the outer cast truncates the extended bits
// or a zero extend, and the outer cast truncates the extended bits.
// The requirement that cast->value is itself int-or-uint is crucial:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments should not spend this much space arguing why a hypothetical alternative version of this would be buggy. After the first sentence it's enough to just say that everything involved needs to be an int.

Comment thread src/Bounds.cpp
Interval s = a;
if (s.has_lower_bound()) s.min = simplify(s.min);
if (s.has_upper_bound()) s.max = simplify(s.max);
double lo = -std::ldexp(1.0, to.bits() - 1);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: I think we should consider allowing one-line if statements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

We need a Solve.cpp fuzzer

2 participants