ENH: expose correction and weights parameters in cov #690

Open
bruAristimunha wants to merge 1 commit into data-apis:main from bruAristimunha:cov_parameters

@bruAristimunha

Resolves #688.

Summary

  • Adds axis, correction, frequency_weights, and weights parameters to xpx.cov, unlocking the degrees-of-freedom and weighted variants that numpy.cov and torch.cov already support.
  • Naming follows array-api conventions (axis, correction) used elsewhere in this library rather than numpy's (rowvar, bias, ddof). The docstring includes a one-to-one mapping for users migrating from numpy.cov.
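For readers coming from numpy.cov, the snippet below illustrates what the numpy-side parameters do, using only NumPy itself. The docstring mapping is not reproduced in this description, so the correspondence noted in the comments (rowvar=False as axis=0, bias=True as correction=0) is an assumption based on the naming described above.

```python
import numpy as np

rng = np.random.default_rng(0)
m = rng.normal(size=(3, 10))  # 3 variables (rows), 10 observations (columns)

# rowvar=False treats columns as variables, i.e. observations run along axis 0.
# Assumed equivalent under the new naming: axis=0.
by_columns = np.cov(m.T, rowvar=False)
by_rows = np.cov(m)  # default rowvar=True: observations along the last axis

# bias=True is shorthand for ddof=0; the default corresponds to ddof=1.
# Assumed equivalent under the new naming: correction=0.
biased = np.cov(m, bias=True)
ddof0 = np.cov(m, ddof=0)

same_layout = np.allclose(by_columns, by_rows)
same_normalisation = np.allclose(biased, ddof0)
```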

Design

The delegation moves observations to the last axis via xp.moveaxis, which collapses rowvar out of backend dispatch entirely — only ddof (numpy/cupy/dask/jax) vs correction (torch) differs between branches.
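A minimal sketch of that idea, using NumPy only. The helper name `cov_via_moveaxis` is hypothetical and this is not the code in the PR; it only shows how moving observations to the last axis makes rowvar unnecessary.

```python
import numpy as np


def cov_via_moveaxis(m, axis=-1, correction=1):
    """Hypothetical helper: normalise the layout, then delegate to np.cov."""
    m = np.asarray(m)
    # After this call observations always sit on the last axis, which is
    # what np.cov expects with its default rowvar=True.
    m = np.moveaxis(m, axis, -1)
    # numpy spells the degrees-of-freedom correction as ddof.
    return np.cov(m, ddof=correction)


rng = np.random.default_rng(0)
obs_first = rng.normal(size=(10, 3))  # 10 observations of 3 variables

result = cov_via_moveaxis(obs_first, axis=0)
reference = np.cov(obs_first, rowvar=False)
```

Both `result` and `reference` are 3x3 covariance matrices and should agree to floating-point precision.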

The delegation falls back to the generic implementation (_funcs.cov) in three cases:

  • m.ndim > 2 (batched input, which no native backend implementation supports).
  • Non-integer correction (rejected by numpy.cov's ddof).
  • Dask with weights — dask.array.cov forces .compute() on a lazy 0-D scalar via its internal if fact <= 0 check. The generic path stays fully lazy because its weighted branch doesn't compare fact to zero (noted in docstring).
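The three conditions above can be read as a single predicate. The function below is a hypothetical illustration in plain Python, not the dispatch code from the PR; in particular, treating a float such as 1.0 as an integer correction is an assumption.

```python
def needs_generic_path(ndim, correction, is_dask, has_weights):
    """Hypothetical predicate mirroring the three fallback conditions."""
    if ndim > 2:
        # Batched input: no native cov handles it.
        return True
    if not float(correction).is_integer():
        # Non-integer correction: numpy.cov rejects it as ddof.
        # Assumption: integral floats such as 1.0 count as integers here.
        return True
    if is_dask and has_weights:
        # dask.array.cov would trigger a compute on the weighted path.
        return True
    return False
```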

Weighted formula in _funcs.cov matches numpy's (algebraically): c = (m_c · w) @ m_c.T / (v1 - correction · v2 / v1).
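The formula can be checked against numpy.cov directly. The sketch below assumes v1 and v2 carry the same meaning as in numpy's implementation (v1 is the sum of the combined weights w, v2 is the sum of w times the observation weights); the function name `weighted_cov` is hypothetical and the code is not taken from _funcs.cov.

```python
import numpy as np


def weighted_cov(m, frequency_weights=None, weights=None, correction=1):
    """Hypothetical 2-D reference; observations are on the last axis."""
    m = np.asarray(m, dtype=float)
    n = m.shape[-1]
    fw = np.ones(n) if frequency_weights is None else np.asarray(frequency_weights, dtype=float)
    aw = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    w = fw * aw                      # combined per-observation weight
    v1 = w.sum()                     # assumed definition, as in numpy.cov
    v2 = (w * aw).sum()              # assumed definition, as in numpy.cov
    avg = (m * w).sum(axis=-1, keepdims=True) / v1
    m_c = m - avg                    # weighted-mean-centred data
    return (m_c * w) @ m_c.T / (v1 - correction * v2 / v1)


rng = np.random.default_rng(0)
m = rng.normal(size=(3, 6))
fw = np.array([1, 2, 1, 3, 1, 2])        # integer repeat counts
aw = rng.uniform(0.5, 1.5, size=6)       # observation weights

ours = weighted_cov(m, frequency_weights=fw, weights=aw)
ref = np.cov(m, fweights=fw, aweights=aw)
```

With both weight vectors left as None the denominator reduces to n - correction, which is the unweighted case.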

Tests

New TestCov cases validate against np.cov as reference:

  • test_correction (integer ddof)
  • test_correction_float (generic-path-only, hand-computed reference)
  • test_axis / test_axis_with_weights / test_axis_out_of_bounds
  • test_frequency_weights / test_weights / test_both_weights
  • test_batch_with_weights
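To illustrate why test_correction_float needs a hand-computed reference, the sketch below shows numpy.cov rejecting a non-integer ddof and builds the expected value manually. It is an illustration only, not the test from the PR; the data and the value 0.5 are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
m = rng.normal(size=(3, 8))  # 3 variables, 8 observations
n = m.shape[-1]
correction = 0.5

# Hand-computed reference: centre each variable, then divide by n - correction.
m_c = m - m.mean(axis=-1, keepdims=True)
expected = m_c @ m_c.T / (n - correction)

# numpy.cov only accepts integer-valued ddof, so it cannot serve as reference.
try:
    np.cov(m, ddof=correction)
    numpy_rejects_float = False
except ValueError:
    numpy_rejects_float = True
```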

Test plan

  • pytest tests/test_funcs.py::TestCov — 126 passed across numpy, torch, jax, dask, array-api-strict
  • pytest tests/test_funcs.py full — 4263 passed, 0 failed
  • lefthook run pre-commit — ruff, numpydoc, mypy, pyright, typos all green
  • Dask laziness verified — lazy_xp_function(cov) asserts 0 .compute() calls, holds for weighted path via the fallback

Resolves data-apis#688. Adds `axis`, `correction`, `frequency_weights`, and
`weights` to `cov`, giving users control over the degrees-of-freedom
correction and the observation-axis / weighted variants that
`numpy.cov` and `torch.cov` already support.

Naming follows array-api conventions (`axis`, `correction`) rather
than numpy's (`rowvar`, `bias`, `ddof`); the docstring includes a
one-to-one mapping. The delegation moves observations to the last
axis via `xp.moveaxis`, collapsing `rowvar` out of the backend
dispatch — only `ddof` vs `correction` differs between branches.

Dask's native `cov` forces `.compute()` on a lazy scalar when any
weights are given, so weighted dask inputs fall through to the
generic implementation, which is fully lazy.

Development

Successfully merging this pull request may close these issues.

EHN: make covariance more flexible