[ENH] Sync skpro and sktime probabilistic metrics modules: sample_weight and output consistency #674
Summary
This PR brings the skpro probabilistic metrics module into alignment with sktime, as discussed in issue #367. It adds sample_weight support across metrics, ensures consistent output types, and fixes weighted multioutput behavior.
Changes
1. Sample Weight Support
Added a sample_weight parameter to:
- BaseProbaMetric.evaluate
- BaseDistrMetric.evaluate
Updated input validation methods to correctly propagate sample_weight:
- _check_consistent_input
- _check_ys
Implemented proper weighted averaging using np.average in:
- _evaluate (Proba metrics)
- evaluate (Distribution metrics)
2. Output Consistency Improvements
BaseDistrMetric.evaluate now returns a pd.Series when multioutput="raw_values". This matches:
- BaseProbaMetric behavior
- sktime conventions
Fixes the previous inconsistency where a 1-row DataFrame was returned.
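For reviewers, the intended behavior of the two changes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: weighted_raw_values is a hypothetical helper, and it assumes the per-sample losses are already available as a DataFrame with one column per output variable.

```python
import numpy as np
import pandas as pd


def weighted_raw_values(losses, sample_weight=None):
    """Hypothetical helper, not part of skpro or sktime.

    Averages per-sample losses over rows, separately for each column,
    and returns the per-column scores as a pd.Series (the shape
    expected for multioutput="raw_values").
    """
    # np.average falls back to a plain mean when weights is None
    values = np.average(
        losses.to_numpy(dtype=float), axis=0, weights=sample_weight
    )
    return pd.Series(values, index=losses.columns)


# toy per-sample losses for two output columns (made-up numbers)
losses = pd.DataFrame({"y0": [1.0, 3.0], "y1": [2.0, 6.0]})

unweighted = weighted_raw_values(losses)
weighted = weighted_raw_values(losses, sample_weight=[1.0, 3.0])
```

With these toy numbers, the second row counts three times as much as the first in the weighted call, and both calls return a Series indexed by column name rather than a 1-row DataFrame.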
3. Weighted Multioutput Fixes
BaseDistrMetric.evaluate
Tests Added / Updated
- test_sample_weight_pinball (in test_probabilistic_metrics.py): verifies sample_weight support for quantile and interval metrics.
- test_sample_weight_logloss (in test_distr_metrics.py): verifies sample_weight support for distribution-based metrics.
- test_multioutput_weights_logloss (in test_distr_metrics.py): ensures weighted multioutput aggregation behaves correctly.
- Updated test_distr_evaluate to assert that raw-value outputs are returned as pd.Series.
Fixes #367