Support per-sample weights in progressive validation #1795
Merged
MaxHalford merged 6 commits into main on Apr 9, 2026
Conversation
…sive_val_score

Allow per-sample weights via two complementary APIs:

- weights callable: `progressive_val_score(..., weights=lambda x, y: float)`
- dataset triples: the dataset yields `(x, y, w)` where `w` is a per-sample float

Tuple weights take precedence over the callable so mixed datasets behave predictably. The weight is forwarded to `learn_one` for models that accept a `w` parameter (e.g. `linear_model.LogisticRegression`). Models without a `w` parameter are called without it to preserve backward compatibility.

Implementation:

- `_needs_weights` guard: weight infrastructure (`weight_queue`, `_iter_dataset`) is only created when the model accepts `w` or a `weights` callable is given, keeping the default path free of any overhead
- `weight_queue` (`collections.deque`) bounds memory to the delay window, not the full dataset
- kwargs `w`-key collision guard strips `w` from `simulate_qa` metadata before `learn_one` to prevent a `TypeError` when stream metadata includes a `w` key

Closes #1502
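The weight-resolution rule this commit describes (tuple weight wins over the callable) can be sketched as follows. This is a minimal stand-in, not river's actual implementation; `iter_weighted` is a hypothetical helper name.

```python
# Sketch of the weight-resolution precedence described above: an explicit
# (x, y, w) triple beats the weights callable, which beats the default 1.0.
# Hypothetical helper; illustrative only.
def iter_weighted(dataset, weights=None):
    """Yield (x, y, w) with w resolved from the tuple or the callable."""
    for item in dataset:
        if len(item) == 3:  # (x, y, w) triple: explicit weight wins
            x, y, w = item
        else:  # (x, y) pair: fall back to the callable, then to 1.0
            x, y = item
            w = weights(x, y) if weights is not None else 1.0
        yield x, y, w

# Mixed dataset: one weighted triple, one plain pair.
dataset = [({"f": 1}, True, 2.0), ({"f": 2}, False)]
resolved = list(iter_weighted(dataset, weights=lambda x, y: 0.5))
# First sample keeps its explicit 2.0; second falls back to the callable's 0.5
```

This mirrors why mixed datasets "behave predictably": the per-sample value attached to the data always overrides the global callable.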
river's EstimatorMeta.__instancecheck__ accesses instance._last_step to unwrap pipelines. On a plain MagicMock every attribute access returns another MagicMock, so isinstance(mock, AnomalyFilter) recurses infinitely and hits Python's recursion limit. Setting _last_step=None on the test mock stops the chain: hasattr(None, '_last_step') is False so __instancecheck__ returns False immediately.
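The recursion and its fix can be reproduced with a minimal stand-in for river's `EstimatorMeta` (the metaclass below is illustrative, not the real one):

```python
from unittest.mock import MagicMock

# Simplified stand-in for river's EstimatorMeta: __instancecheck__ follows
# _last_step to unwrap pipeline-like objects, as described above.
class UnwrappingMeta(type):
    def __instancecheck__(cls, obj):
        if hasattr(obj, "_last_step"):
            # On a plain MagicMock, obj._last_step is yet another MagicMock,
            # so this recursion would never terminate.
            return isinstance(obj._last_step, cls)
        return super().__instancecheck__(obj)

class AnomalyFilter(metaclass=UnwrappingMeta):
    pass

mock = MagicMock()
mock._last_step = None  # the fix: hasattr(None, "_last_step") is False
assert not isinstance(mock, AnomalyFilter)      # terminates immediately
assert isinstance(AnomalyFilter(), AnomalyFilter)  # real instances still match
```

Without the `_last_step = None` assignment, the `isinstance` call on the mock recurses until it hits Python's recursion limit, exactly as described.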
AnomalyFilter.learn_one(*args, **learn_kwargs) was being treated as accepting w because of the VAR_KEYWORD check, causing it to receive w=1.0 and silently forward it to OneClassSVM.learn_one, which has no w parameter, raising a TypeError. Fix _model_accepts_w to check only for an explicit w parameter in the signature, not **kwargs; models that want per-sample weights should declare w explicitly. Also fix _make_model in the weight tests to use create_autospec instead of manually assigning __signature__ on a MagicMock, which causes a RecursionError in Python 3.13. Update the no-w model test to use (x, y) pairs, since on the non-weight code path (x, y, w_float) triples fail: the third element is expected to be a kwargs dict.
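The fixed check can be sketched with `inspect.signature` (the helper name mirrors the PR's `_model_accepts_w`, but this is an illustrative reconstruction):

```python
import inspect

# Sketch of the fix described above: only a parameter literally named `w`
# counts; a catch-all **kwargs no longer does.
def model_accepts_w(model) -> bool:
    params = inspect.signature(model.learn_one).parameters
    return "w" in params and params["w"].kind is not inspect.Parameter.VAR_KEYWORD

class WithW:
    def learn_one(self, x, y, w=1.0): ...

class WithKwargs:
    # Shaped like AnomalyFilter.learn_one: **kwargs would previously have
    # been (wrongly) treated as accepting w.
    def learn_one(self, *args, **learn_kwargs): ...

assert model_accepts_w(WithW())
assert not model_accepts_w(WithKwargs())
```

With this check, a wrapper that merely forwards `**kwargs` no longer receives `w=1.0`, so it cannot pass it on to an inner model that lacks the parameter.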
mypy infers preds as dict[int, tuple[Any, Any|bool, float]] from the 3-tuple assignment and then flags the 2-tuple assignment and 2-variable unpack as type errors. Annotate with typing.Any to allow both shapes.
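The annotation fix amounts to widening the dict's value type (variable names here are illustrative):

```python
from typing import Any

# Without the annotation, mypy infers the value type from the first
# assignment (a 3-tuple) and then rejects the 2-tuple shape and the
# 2-variable unpack. `Any` allows both shapes.
preds: dict[int, Any] = {}
preds[0] = ("yes", True, 0.9)  # 3-tuple
preds[1] = ("no", False)       # 2-tuple, now also accepted
y_pred, y = preds[1]           # 2-variable unpack type-checks
```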
Replace the weights callable, weight_queue, and duplicated fast path with a simpler approach: extract "w" from the existing kwargs dict in dataset tuples and forward it to learn_one. Also fix Pipeline.learn_one to forward **params to the final supervised step so weights reach the underlying model. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
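The simplified approach can be sketched end to end: the weight rides along in the dataset tuple's kwargs dict, is popped out, and is only forwarded when the model declares `w`. All names below are illustrative, not river's internals.

```python
import inspect

# Toy model that declares an explicit per-sample weight.
class WeightedModel:
    def __init__(self):
        self.total = 0.0

    def learn_one(self, x, y, w=1.0):
        self.total += w  # accumulate weights so the test can observe them

# Sketch of the simplified loop described above (illustrative helper).
def learn(model, dataset):
    accepts_w = "w" in inspect.signature(model.learn_one).parameters
    for x, y, kwargs in dataset:
        kwargs = dict(kwargs)
        w = kwargs.pop("w", 1.0)  # strip "w" so it never collides downstream
        if accepts_w:
            model.learn_one(x, y, w=w, **kwargs)
        else:
            model.learn_one(x, y, **kwargs)

model = WeightedModel()
learn(model, [({"f": 1}, True, {"w": 2.0}), ({"f": 2}, False, {})])
# model.total == 3.0  (explicit 2.0 plus the default 1.0)
```

No weights callable, no queue, no duplicated path: the existing kwargs mechanism carries the weight, and models without `w` never see it.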
Summary
- Support per-sample weights in `progressive_val_score`/`iter_progressive_val_score` via `(x, y, {"w": 2.0})` dataset tuples
- Fix `Pipeline.learn_one` to forward `**params` to the final supervised step so weights reach the underlying model

Details

Weights are extracted from the existing kwargs dict mechanism (the `"w"` key is popped and forwarded to `learn_one` separately). No new function parameters, no weight queue, no duplicated code paths. Models that don't accept `w` gracefully ignore it.

Test plan

- Models without a `w` param ignore weights gracefully

🤖 Generated with Claude Code