FEA Make LedoitWolf estimator array API compatible #33573
betatim wants to merge 18 commits into scikit-learn:main from
Conversation
The tests were written as part of a TDD approach during development; they cover things that the common tests also cover, so I am removing them to reduce the amount of duplication.
In some cases it returns float64 for float32 input. There are users who rely on this behaviour.
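For context, here is a minimal check of the NumPy behaviour being referred to, assuming the helper ultimately delegates to np.cov (as later review comments suggest): np.cov computes in double precision by default.

```python
import numpy as np

X = np.random.default_rng(0).standard_normal((100, 5)).astype(np.float32)

# np.cov works in double precision by default, so a float32 input
# comes back as a float64 covariance matrix.
print(np.cov(X, rowvar=False).dtype)  # float64
```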
Hey @betatim, sorry for dropping into the PR, but what you are working on is something super nice! At Pyriemann, with @agramfort and @qbarthelemy, we are using many of the covariance matrix implementations from scikit-learn and are also moving towards array API compatibility. I would just like to offer some assistance with the array API compatibility of the covariance part, if necessary.
I tried to find a macro issue to offer help on, but I couldn't find a place that seemed appropriate. Please feel free to delete my comment, given that it's off-topic for your PR.
No worries and welcome! It is fantastic to hear from someone using scikit-learn that says "this looks useful". I don't think there is a macro/mega issue related to this area of scikit-learn + array API support. I picked it more or less at random, as an exercise to see how much infrastructure would be missing.
If you are interested in contributing to converting estimators/tools from https://scikit-learn.org/stable/api/sklearn.covariance.html, go for it! It could be that the first step towards that is helping review this PR, or picking another estimator from the list and converting it. I'd have to think about it; it probably depends on how much interdependency there is. Another thing that could be worth doing is creating an example for the gallery that shows off the array API support in the covariance module. I will create a mega-issue for this.
bruAristimunha
left a comment
If you do this, it should also unlock the empirical covariance.
Co-authored-by: Bru <b.aristimunha@gmail.com>
Convert _oas() and OAS.fit() to use array API compatible operations, following the same pattern as LedoitWolf from PR scikit-learn#33573.
If we flag it as supporting array API, all the classes that inherit from it also get marked as supporting array API. That doesn't work yet, so we need to do this at the end, once all estimators support array API.
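For readers unfamiliar with how the flag is carried: a minimal sketch, assuming the dataclass-based tag system of recent scikit-learn versions, of marking array API support on a concrete estimator rather than on the shared base class (illustrative only, not necessarily the exact code in this PR).

```python
from sklearn.covariance import EmpiricalCovariance

class MyLedoitWolf(EmpiricalCovariance):
    def __sklearn_tags__(self):
        # Flag support on the concrete estimator only, so sibling
        # subclasses of EmpiricalCovariance are not marked by accident.
        tags = super().__sklearn_tags__()
        tags.array_api_support = True
        return tags
```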
I undid the changes to
I can create a separate PR for this, if you want, @betatim. I fixed it in my draft PR; small tricks are necessary for empirical covariance.
Thanks for the feedback, Omar. I implemented your suggestions.
OmarManzoor
left a comment
LGTM. Thank you, @betatim.
CC: @ogrisel
    covariance = X.T @ X / X.shape[0]
else:
    X_centered = X - xp.mean(X, axis=0)
    covariance = X_centered.T @ X_centered / X.shape[0]
If n_samples >> n_features, it might be much more memory efficient to do post-hoc centering of the empirical covariance matrix, as we do in the PCA solver:
scikit-learn/sklearn/decomposition/_pca.py
Lines 604 to 610 in 94af1df
Shall we factorize this code in sklearn.utils._array_api._cov maybe?
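As a rough illustration of the post-hoc centering idea (a sketch with a hypothetical helper name, not the actual _cov refactor): the biased covariance can be written as the second moment minus the outer product of the mean, so no centered copy of X has to be allocated.

```python
import numpy as np

def cov_posthoc_centering(X):
    # E[x x^T] - mean mean^T, without materialising X - mean.
    n_samples = X.shape[0]
    mean = X.mean(axis=0)
    return X.T @ X / n_samples - np.outer(mean, mean)

X = np.random.default_rng(0).standard_normal((1000, 5))
assert np.allclose(cov_posthoc_centering(X), np.cov(X, rowvar=False, bias=True))
```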
Let me know what you think of the refactor into _cov
else:
    # Non-blocked computation: GPU-friendly (single large matmul)
    beta_ = float(xp.sum(X2.T @ X2))
    delta_ = float(xp.sum((X.T @ X) ** 2)) / n_samples**2
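For readers following the chunking discussion below: a rough sketch (with made-up block splitting, not the exact scikit-learn implementation) of how the same beta_ sum can be accumulated block by block over the feature axis instead of in one large matmul.

```python
import numpy as np

def beta_single(X2):
    # One large matmul: fast on GPUs, but allocates an
    # (n_features, n_features) temporary.
    return float(np.sum(X2.T @ X2))

def beta_blocked(X2, block_size=1000):
    # Same sum, accumulated over pairs of feature blocks so that each
    # temporary is at most (block_size, block_size).
    n_features = X2.shape[1]
    edges = list(range(0, n_features, block_size)) + [n_features]
    total = 0.0
    for i in range(len(edges) - 1):
        left = X2[:, edges[i]:edges[i + 1]]
        for j in range(len(edges) - 1):
            right = X2[:, edges[j]:edges[j + 1]]
            total += float(np.sum(left.T @ right))
    return total

X2 = np.random.default_rng(0).standard_normal((50, 2500)) ** 2
assert np.isclose(beta_single(X2), beta_blocked(X2))
```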
Shouldn't we introduce a chunk_size argument / hyperparameter to let the user control the chunking, maybe with a chunk_size="auto" that would use larger chunk values on non-CPU devices?
Note that we can leave this as a TODO for a follow-up PR.
I vote for doing that in a new PR
Ok but let's create an issue and link to it from a TODO inline comment.
The pairwise_distances_chunked function called internally by pairwise_distances_argmin and similar has a related problem (the default value for the working_memory parameter might only be good for CPUs).
Investigate how much more memory is used for this "single block" approach
After investigating and benchmarking a bit, I am not sure what we should do. For "tall and skinny" datasets blocking doesn't matter, because with block_size=1000 such data fits in a single block anyway. For wide datasets it limits how much additional memory is used for temporaries. With n_features=20_000 you'd use something like 1.6GB of extra memory without blocking. However, the speedup (MPS over numpy) when using blocking is much smaller than if the MPS implementation runs without blocking. It can be 5x - 10x without blocking and only something like 2x faster if you use block_size=1000 for MPS as well.
Numpy is also faster if you increase the block size. For example, block_size=2000 seems to speed things up by roughly a factor of two and still only uses a small amount of memory.
I tried to find out how the current block size was chosen, but couldn't.
So I think we could increase the block size, even for Numpy. But I am not sure how we should decide the trade-off between speed-up and memory. We could use a block size like 3000 for all namespaces/devices to get most of the speed-up and avoid catastrophic "out of memory" exceptions; the user could then tune the block size if they know they have enough memory. But it feels like I just made up 3000; it could equally be 4000 or 2000, and I couldn't argue which of these is more correct/better.
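For reference, the 1.6GB figure above is consistent with a single (n_features, n_features) float32 temporary (the dtype is an assumption here; it would be roughly twice that for float64):

```python
n_features = 20_000
bytes_per_float32 = 4
temporary_gb = n_features**2 * bytes_per_float32 / 1e9
print(temporary_gb)  # 1.6
```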
I suspect that for CPU devices (e.g. NumPy), the optimal block size should be set so that the combined size of the intermediate arrays stays in the Level 3 CPU cache (typically a few megabytes on most modern CPUs).
It should be possible to tell if a given device is CPU or not using something like getattr(array, "__dlpack_device__", lambda: (1, 0))()[0] in (1, 3).
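A slightly more readable version of that check (illustrative helper name; device type 1 is kDLCPU and 3 is kDLCUDAHost, i.e. pinned host memory, in the DLPack device-type enum):

```python
import numpy as np

def is_cpu_accessible(array):
    # Arrays without __dlpack_device__ are assumed to live on the CPU.
    device_type, _ = getattr(array, "__dlpack_device__", lambda: (1, 0))()
    return device_type in (1, 3)

print(is_cpu_accessible(np.zeros(3)))  # True
```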
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
This avoids shifting X and lets us share the implementation with PCA.
Inspired by scikit-learn/scikit-learn#33573:
- _empirical_covariance: post-hoc centering via cov = (X.T @ X - n * outer(mean, mean)) / n instead of allocating a centered copy of X. Avoids one (n_samples, n_features) allocation. OAS and ledoit_wolf benefit too since they call this helper. Cancellation error is negligible for bandpass-filtered signals where mean is near zero.
- _ledoit_wolf_shrinkage: replace sum(X2.T @ X2) with sum(sum(X2, axis=1) ** 2). The identity sum_{i,j} (X2.T @ X2)_{i,j} = sum_k (sum_i X2[k,i]) ** 2 is strictly less work (O(n*f) vs O(n*f^2)) and avoids the (n_features, n_features) intermediate matrix, with no tradeoff.
Verified against sklearn reference outputs to 1e-8 tolerance across four shapes and non-zero means. Full test suite still 3697 passed.
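A quick NumPy check of the identity used in that commit message (the sum of all entries of X2.T @ X2 equals the sum of squared row sums of X2):

```python
import numpy as np

X2 = np.random.default_rng(0).standard_normal((200, 30)) ** 2

lhs = np.sum(X2.T @ X2)                # builds an (n_features, n_features) temporary
rhs = np.sum(np.sum(X2, axis=1) ** 2)  # only per-row sums, no matrix temporary
assert np.isclose(lhs, rhs)
```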
Co-authored-by: Quentin Barthélemy <q.barthelemy@gmail.com>
ping @ogrisel
ogrisel
left a comment
I am not 100% satisfied with the fact that the NumPy code path operates at a higher floating point precision level than other namespaces on float32 inputs.
I think we should open an issue to track this discrepancy and attempt to "fix" it in a subsequent PR.
Here are other comments:
    return np.reshape(dist, (len(X),)) ** 2
else:
    X_centered = X - self.location_
    return xp.sum((X_centered @ precision) * X_centered, axis=1)
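For context on the branch shown above: the row-wise expression computes the squared Mahalanobis distance of each sample to the location, as this small standalone check illustrates (plain variables, not the estimator's attributes):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))
location = X.mean(axis=0)
precision = np.linalg.inv(np.cov(X, rowvar=False))

X_centered = X - location
vectorized = np.sum((X_centered @ precision) * X_centered, axis=1)

# Reference: the explicit quadratic form d^T P d, one sample at a time.
reference = np.array([d @ precision @ d for d in X_centered])
assert np.allclose(vectorized, reference)
```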
I think we need a tracking issue to make sure that pairwise_distances can always be called with array API inputs, at least when n_jobs=1.
Also: I am pretty sure that compilers such as torch.compile or jax.jit might be able to automatically fuse the centering operation, the matrix-sandwich product operation and the subsequent reduction. However, we don't (yet?) have a library-agnostic way to leverage array API namespace compilers (data-apis/array-api-extra#523).
else:
    # Non-blocked computation: GPU-friendly (single large matmul)
    beta_ = float(xp.sum(X2.T @ X2))
    delta_ = float(xp.sum((X.T @ X) ** 2)) / n_samples**2
Ok but let's create an issue and link to it from a TODO inline comment.
    covariance = covariance / (n_samples - ddof)
    return covariance
Suggested change:
    covariance /= (n_samples - ddof)
    return covariance
Estimators
----------

- :class:`covariance.LedoitWolf`
We need to document the numerical precision discrepancy.
    else:
        covariance = X.T @ X / X.shape[0]
elif _is_numpy_namespace(xp):
    # Preserve numpy path, because `np.cov` always returns float64
Investigate why the other users of this function fail, and check that the numpy behaviour really is what is described here. It would be nice to have consistent behaviour and not use float64 when it is not needed (less memory usage and improved computational performance).
After doing a bit of digging and educating myself (via AI), I realised that xpx.cov exists! So I think we should use that instead of doing our own thing. It mostly dispatches to the namespace of the array; the most popular array libraries have a cov().
We need the bias= keyword argument though, which I added in data-apis/array-api-extra#691.
I've pushed some changes that make the switch, but I think the order of work is: get xpx PR merged, update the vendored version in a separate scikit-learn PR, then update this PR.
I would stick to our version to avoid making a memory allocation of the size of X to subtract the mean, and instead do the post-hoc centering as I did in PCA, at least when n_samples >> n_features.
This trick could be contributed upstream to array-api-extra and numpy (maybe via a new kwarg).
Hey @ogrisel (cc. @agramfort and @qbarthelemy),
I begin by apologizing for intruding on your review and discussion on this point.
I understand your suggestion to refactor the covariance module to use the post-hoc centering trick from PCA. I tested it, and it indeed reduces memory consumption.
However, my understanding is that while this improvement might fit in this PR, from a code maintenance perspective it would be better to follow @betatim's suggestion and address this optimization in another PR/issue. I would be happy to implement this optimization later, upstream or even here.
Delegating to array-api-extra will give us equivalence to what we already have here (we are not losing any runtime performance compared to what scikit-learn already provides) and will unblock the other covariance PRs that use the array API with torch.
I think this will be the first of 5 PRs to come, because of how all the functions depend on the empirical covariance.
This PR also unlocks the PRs from @dherrera1911 at #33738 and #33600.
I created the issue at numpy here: numpy/numpy#31292
    )

    return np.reshape(dist, (len(X),)) ** 2
else:
Suggested change:
else:
    # XXX: pairwise_distances does not support array API with
    # metric="mahalanobis" at the time of writing.
Convert _oas() and OAS.fit() to use array API compatible operations, following the same pattern as LedoitWolf from PR scikit-learn#33573.
Reference Issues/PRs
Came up as part of pondering #33564
Towards #33584
What does this implement/fix? Explain your changes.
This makes LedoitWolf array API compatible. After this it might be easier to convert other covariance estimators, as it also updates some of the basic infra shared by them.
Todo
ledoit_wolf (the function) in this PR as well
AI usage disclosure
I used AI assistance for:
Any other comments?