Skip to content

fix: handle both 1D and 2D input shapes in XGBoost custom objectives#866

Merged
egordm merged 1 commit into
OpenSTEF:release/v4.0.0from
Soutehkeshan:feature-xgboost-3-2-compat
May 5, 2026
Merged

fix: handle both 1D and 2D input shapes in XGBoost custom objectives#866
egordm merged 1 commit into
OpenSTEF:release/v4.0.0from
Soutehkeshan:feature-xgboost-3-2-compat

Conversation

@Soutehkeshan
Copy link
Copy Markdown

Summary

Fixes compatibility with XGBoost 3.2, which changed the shape of arrays passed to custom objectives via the sklearn interface. With multi_strategy="one_output_per_tree", y_true and y_pred are now passed as 2D (n_samples, n_quantiles) arrays instead of 1D flattened arrays of length n_samples * n_quantiles.

Closes #865

Root Cause

The old reshape logic in both objective functions assumed 1D input:

n_items = len(y_true)
n_quantiles = len(quantiles)
n_rows = n_items // n_quantiles  # ← breaks with 2D input
y_pred = np.reshape(y_pred, (n_rows, n_quantiles))
y_true = np.reshape(y_true, (n_rows, n_quantiles))
sample_weight = np.reshape(sample_weight, (n_rows, -1)) if sample_weight is not None else None

With 2D input, len(y_true) returns n_samples (not n_samples * n_quantiles), so n_rows = n_samples // n_quantiles is wrong, causing the reshape to fail or produce incorrect results.

The below image shows running a minimal script using XGBoost 3.1:
Screenshot 2026-05-04 at 21 19 23

The below image shows running the same minimal script using XGBoost 3.2:
Screenshot 2026-05-04 at 21 22 12

Fix

Replaced with shape-agnostic normalisation in arctan_loss_multi_objective and pinball_loss_multi_objective:

n_quantiles = len(quantiles)
y_pred = np.reshape(y_pred, (-1, n_quantiles))
y_true = np.reshape(y_true, (-1, n_quantiles))
sample_weight = sample_weight[:, np.newaxis] if sample_weight is not None else None

np.reshape(arr, (-1, n_quantiles)) is a no-op when the array is already (n_samples, n_quantiles) and correctly reshapes a 1D flat array, maintaining backward compatibility with XGBoost <3.2.

Investigation Notes

  • mean_pinball_loss: already used np.reshape(y_pred, [-1, n_quantiles]) which handles both shapes correctly — no change needed
  • metrics_probabilistic.py: no changes needed, sample_weight_eval_set verified working correctly with XGBoost 3.2
  • xgboost_forecaster.py: no changes needed
  • pyproject.toml: already at >=3,<4 on this branch — no change needed

@Soutehkeshan Soutehkeshan requested a review from a team May 4, 2026 20:24
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 4, 2026

Copy link
Copy Markdown
Collaborator

@egordm egordm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, both affected objective functions are correctly fixed and the pattern is consistent. Approving.

@egordm
Copy link
Copy Markdown
Collaborator

egordm commented May 5, 2026

Hi @Soutehkeshan, thanks so much for picking this up and submitting a fix so quickly! Really appreciated.

The code change looks correct. Both arctan_loss_multi_objective and pinball_loss_multi_objective are now handled, and the existing tests pass. The only thing missing is test coverage for the new 2D input shape so we have a regression guard going forward. To avoid blocking your PR, I've added those tests in a separate follow-up (PR #867) that we'll merge right after this one.

On the DCO check failing: if you haven't already, you can either add a sign-off by amending your commit (git commit --amend --signoff) or we can handle it with a squash-merge on our side. No worries either way, we'll sort it out.

Looking forward to seeing more contributions from you!

@egordm egordm merged commit 460093e into OpenSTEF:release/v4.0.0 May 5, 2026
5 checks passed
egordm added a commit that referenced this pull request May 5, 2026
Adds regression tests verifying both arctan_loss_multi_objective and
pinball_loss_multi_objective accept 2D (n_samples, n_quantiles) inputs
as passed by XGBoost 3.2, producing identical results to the flat 1D
arrays used in older versions.

Tests intentionally FAIL on the unfixed code and PASS after PR #866
is merged into release/v4.0.0.

Closes #865 (once #866 is merged and these pass)

Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
egordm added a commit that referenced this pull request May 5, 2026
Adds regression tests verifying both arctan_loss_multi_objective and
pinball_loss_multi_objective accept 2D (n_samples, n_quantiles) inputs
as passed by XGBoost 3.2, producing identical results to the flat 1D
arrays used in older versions.

Tests intentionally FAIL on the unfixed code and PASS after PR #866
is merged into release/v4.0.0.

Closes #865 (once #866 is merged and these pass)

Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
egordm added a commit that referenced this pull request May 5, 2026
Adds regression tests verifying both arctan_loss_multi_objective and
pinball_loss_multi_objective accept 2D (n_samples, n_quantiles) inputs
as passed by XGBoost 3.2, producing identical results to the flat 1D
arrays used in older versions.

Tests intentionally FAIL on the unfixed code and PASS after PR #866
is merged into release/v4.0.0.

Closes #865 (once #866 is merged and these pass)

Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
egordm added a commit that referenced this pull request May 5, 2026
Adds regression tests verifying both arctan_loss_multi_objective and
pinball_loss_multi_objective accept 2D (n_samples, n_quantiles) inputs
as passed by XGBoost 3.2, producing identical results to the flat 1D
arrays used in older versions.

Tests intentionally FAIL on the unfixed code and PASS after PR #866
is merged into release/v4.0.0.

Closes #865 (once #866 is merged and these pass)

Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
egordm added a commit that referenced this pull request May 5, 2026
## Summary

Adds regression tests verifying that `arctan_loss_multi_objective` and
`pinball_loss_multi_objective` accept **2D `(n_samples, n_quantiles)`
input arrays** as passed by XGBoost 3.2+, and produce results identical
to the flat 1D arrays used in older versions.

## Context

XGBoost 3.2 changed the calling convention for custom objectives with
multi-output trees: arrays are now passed as 2D `(n_samples, n_outputs)`
instead of the flat 1D `(n_samples * n_outputs,)` used previously. This
was tracked in #865 and fixed in #866.

## Relationship to #866

- These tests **intentionally FAIL** on the unfixed code (this branch's
base, `release/v4.0.0` before #866).
- They will **pass** once #866 is merged.
- This PR should be merged **after** #866 to ensure CI stays green.

## Tests added

- `test_loss_fn__2d_input_matches_1d_input[pinball]` — pinball objective
with 2D input
- `test_loss_fn__2d_input_matches_1d_input[arctan]` — arctan objective
with 2D input
- `test_loss_fn__2d_input_with_sample_weights_matches_1d[pinball]` —
pinball with 2D input + sample weights
- `test_loss_fn__2d_input_with_sample_weights_matches_1d[arctan]` —
arctan with 2D input + sample weights

Closes #865 (when merged after #866)

---------

Signed-off-by: Egor Dmitriev <egor.dmitriev@alliander.com>
@Soutehkeshan Soutehkeshan deleted the feature-xgboost-3-2-compat branch May 5, 2026 20:27
@Soutehkeshan
Copy link
Copy Markdown
Author

Hi @egordm,

Thanks for your feedback. I'm not sure if sign-off is still needed considering that the PR has already been merged. If so, could you please handle it with a squash-merge on your side?

Looking forward to contributing more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants