Scipy-aligned error tolerance, vector-error control, and adaptive robustness
Error tolerance (breaking change for fine tolerances)
The accept/reject threshold for the norm family of error reductions has changed from atol + rtol * |I| (additive) to max(atol, rtol * |I|) (maximum), aligning with scipy.integrate's convention. In practice this tightens the tolerance when both terms are comparable — most visible at fine tolerances (rtol ≪ atol) — and shifts the adaptive mesh accordingly. Snapshot values have been regenerated to match.
Supporting changes:
atolandrtolare now stored as dtype-aware tensors inSolverBaseand re-cast in_set_dtype, so the max-based comparison is numerically consistent across float32 and float64.y0,mesh_init, andmesh_finaldefaults now honor the solver's dtype rather than hard-coding float64.- Bug fix: the integral loss was incorrectly computed as
sum(step²)instead of(integral)²; corrected. - Bug fix: a misplaced parenthesis in the tolerance expression has been fixed.
The
failure_fractionerror scheme retains the additiveatol + rtol * |I|form; only the norm family ("2","max","rms", callable) was updated.
Pluggable vector-error norm (error_norm)
Vector-valued integrands (D > 1) previously had their per-step error collapsed to a single RMS scalar, bounding only the aggregate error. A new error_norm argument (constructor and per-call override in integrate()) selects the reduction strategy, modeled on scipy.integrate.quad_vec's norm:
- Norm family (
"2"default,"max","rms", or a callable) — reduce-then-compare: collapse the error vector with the named norm, compare toatol + rtol * norm(integral). "failure_fraction"— per-component: each output element is compared against its own tolerance; a panel is accepted when the fraction of failing components is ≤mesh_failure_tolerance(default0.0, requiring every element to pass).
A scipy-inspired rounding floor (max(tol, 50 * eps * |step|)) is applied so refinement halts at the working-dtype precision wall instead of refining indefinitely when the requested tolerance is below the dtype's resolution.
The legacy error_calc_idx argument has been removed (superseded by a callable error_norm).
error_integral_reference argument on integrate()
In absolute-error mode the rtol denominator is the running (incomplete) integral, which over-refines early panels when the integrand's mass is concentrated at later times. The new error_integral_reference argument accepts a Python float or tensor close to the true total, using it as the denominator from the first batch onward. Has no effect in cumulative mode.
Sub-panel max_batch support (take_gradient=False)
A per-batch node budget smaller than one panel's quadrature-point count (max_batch < C) previously crashed. The take_gradient=False split path now handles this correctly:
max_batch == 0: the dispatcher rescues it to1(with a warning); a direct unit call to_evaluate_f_on_split_nodesraisesAssertionError.0 < max_batch < C: the split path processes one panel per iteration, evaluating itsCnodes across severalmax_batch-sized sub-batches and carrying remainders forward. The final integral is bit-identical to amax_batch ≥ Crun with the same seeded mesh.take_gradient=TruebelowCcontinues to raise an error (unchanged).
NaN/Inf integrand handling
Non-finite error ratios (from singular integrands) previously fell into neither the keep nor remove mask in _adaptively_increase_mesh, causing an infinite refinement loop that the max_adaptive_splits cap could not rescue. Now:
- Non-finite ratios are routed into the accept mask (splitting cannot reduce a singularity).
- A new
error_on_nonfiniteflag (defaultTrue) raises aValueErrornaming the offendingtwhenfreturns NaN/Inf, making silent wrong results impossible.
GPU device handling
850 tests that passed on CPU failed on CUDA machines due to device-placement inconsistencies. Fixes applied:
- User-provided meshes are coerced to
self.devicebefore use in the integration loop. VariableAdaptiveQuadraturenow callsmethod.to_device(self.device)at construction (the uniform solver already did this; the variable solver left method tensors on CPU).- Tensor elements of
f_argsare coerced toself.deviceinintegrate(). - Internal sanity asserts replaced
np.allclosewithtorch.allcloseso they evaluate on the input's native device.
OOM recovery and batch-size caching
- Out-of-memory errors during integration and memory benchmarking are caught;
max_batchis lowered and the evaluation retried rather than crashing. max_batchis now cached between calls whenfis the same function, avoiding a full re-benchmark on everyintegrate()call.