Skip to content

Scipy-aligned error tolerance, vector-error control, and adaptive robustness

Choose a tag to compare

@khegazy khegazy released this 09 Jun 20:39
· 5 commits to main since this release

Error tolerance (breaking change for fine tolerances)

The accept/reject threshold for the norm family of error reductions has changed from atol + rtol * |I| (additive) to max(atol, rtol * |I|) (maximum), aligning with scipy.integrate's convention. In practice this tightens the tolerance when both terms are comparable — most visible at fine tolerances (rtol ≪ atol) — and shifts the adaptive mesh accordingly. Snapshot values have been regenerated to match.

Supporting changes:

  • atol and rtol are now stored as dtype-aware tensors in SolverBase and re-cast in _set_dtype, so the max-based comparison is numerically consistent across float32 and float64.
  • y0, mesh_init, and mesh_final defaults now honor the solver's dtype rather than hard-coding float64.
  • Bug fix: the integral loss was incorrectly computed as sum(step²) instead of (integral)²; corrected.
  • Bug fix: a misplaced parenthesis in the tolerance expression has been fixed.

The failure_fraction error scheme retains the additive atol + rtol * |I| form; only the norm family ("2", "max", "rms", callable) was updated.

Pluggable vector-error norm (error_norm)

Vector-valued integrands (D > 1) previously had their per-step error collapsed to a single RMS scalar, bounding only the aggregate error. A new error_norm argument (constructor and per-call override in integrate()) selects the reduction strategy, modeled on scipy.integrate.quad_vec's norm:

  • Norm family ("2" default, "max", "rms", or a callable) — reduce-then-compare: collapse the error vector with the named norm, compare to atol + rtol * norm(integral).
  • "failure_fraction" — per-component: each output element is compared against its own tolerance; a panel is accepted when the fraction of failing components is ≤ mesh_failure_tolerance (default 0.0, requiring every element to pass).

A scipy-inspired rounding floor (max(tol, 50 * eps * |step|)) is applied so refinement halts at the working-dtype precision wall instead of refining indefinitely when the requested tolerance is below the dtype's resolution.

The legacy error_calc_idx argument has been removed (superseded by a callable error_norm).

error_integral_reference argument on integrate()

In absolute-error mode the rtol denominator is the running (incomplete) integral, which over-refines early panels when the integrand's mass is concentrated at later times. The new error_integral_reference argument accepts a Python float or tensor close to the true total, using it as the denominator from the first batch onward. Has no effect in cumulative mode.

Sub-panel max_batch support (take_gradient=False)

A per-batch node budget smaller than one panel's quadrature-point count (max_batch < C) previously crashed. The take_gradient=False split path now handles this correctly:

  • max_batch == 0: the dispatcher rescues it to 1 (with a warning); a direct unit call to _evaluate_f_on_split_nodes raises AssertionError.
  • 0 < max_batch < C: the split path processes one panel per iteration, evaluating its C nodes across several max_batch-sized sub-batches and carrying remainders forward. The final integral is bit-identical to a max_batch ≥ C run with the same seeded mesh.
  • take_gradient=True below C continues to raise an error (unchanged).

NaN/Inf integrand handling

Non-finite error ratios (from singular integrands) previously fell into neither the keep nor remove mask in _adaptively_increase_mesh, causing an infinite refinement loop that the max_adaptive_splits cap could not rescue. Now:

  • Non-finite ratios are routed into the accept mask (splitting cannot reduce a singularity).
  • A new error_on_nonfinite flag (default True) raises a ValueError naming the offending t when f returns NaN/Inf, making silent wrong results impossible.

GPU device handling

850 tests that passed on CPU failed on CUDA machines due to device-placement inconsistencies. Fixes applied:

  • User-provided meshes are coerced to self.device before use in the integration loop.
  • VariableAdaptiveQuadrature now calls method.to_device(self.device) at construction (the uniform solver already did this; the variable solver left method tensors on CPU).
  • Tensor elements of f_args are coerced to self.device in integrate().
  • Internal sanity asserts replaced np.allclose with torch.allclose so they evaluate on the input's native device.

OOM recovery and batch-size caching

  • Out-of-memory errors during integration and memory benchmarking are caught; max_batch is lowered and the evaluation retried rather than crashing.
  • max_batch is now cached between calls when f is the same function, avoiding a full re-benchmark on every integrate() call.