Prevent zero-variance instability in BaseProbaRegressor.predict_proba #956

Open
kindler-king wants to merge 1 commit into sktime:main from kindler-king:bugfix-zero-variance-predict-proba

@kindler-king
Contributor

Reference Issues/PRs
Fixes #955

What does this implement/fix?

This PR fixes a numerical instability in BaseProbaRegressor.predict_proba.

When predict_var returns 0, the fallback Normal distribution is constructed with sigma=0, which leads to divide-by-zero warnings and NaN values when evaluating pdf or log_pdf.
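
The failure mode can be reproduced without skpro. The sketch below writes the Normal log-density out by hand (the helper `normal_log_pdf` is a hypothetical stand-in, not the library's implementation) and evaluates it with one zero-variance row:

```python
import warnings

import numpy as np


def normal_log_pdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2), written out by hand for illustration."""
    z = (x - mu) / sigma
    return -0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)


mu = np.array([1.0, 2.0])
sigma = np.sqrt(np.array([0.0, 0.25]))  # first row has zero variance

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # At the mean, the first row computes 0 / 0 and log(0).
    at_mean = normal_log_pdf(mu, mu, sigma)
    # Away from the mean, the first row computes 1 / 0, then -inf + inf.
    off_mean = normal_log_pdf(mu + 1.0, mu, sigma)

print(at_mean, off_mean)
print([str(w.message) for w in caught])
```

In both evaluations the zero-variance row comes out as NaN and NumPy emits runtime warnings, while the row with variance 0.25 stays finite.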

To prevent this, the predicted variance is clipped to machine epsilon before computing the standard deviation:

pred_var = np.clip(pred_var, np.finfo(float).eps, None)

This ensures the resulting Normal distribution always has a strictly positive scale, while leaving predictions with non-degenerate variance effectively unchanged.
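
As a quick illustration of what the clip does (the input values are made up for this example), any variance below machine epsilon is raised to epsilon and everything else passes through untouched:

```python
import numpy as np

eps = np.finfo(float).eps  # machine epsilon for float64

# Illustrative variances: exactly zero, positive but below eps, and ordinary.
pred_var = np.array([0.0, 1e-30, 0.25])

clipped = np.clip(pred_var, eps, None)  # lower bound only, no upper bound
pred_std = np.sqrt(clipped)

print(clipped)
print(pred_std)
```

Note that the smallest possible standard deviation after clipping is `sqrt(eps)`, not `eps`, because the bound is applied to the variance.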

Does your contribution introduce a new dependency?
No.

What should a reviewer concentrate their feedback on?

  • Whether clipping variance at machine epsilon is the appropriate safeguard.
  • Consistency with the existing probabilistic regression design.

Did you add any tests for the change?
Yes.

A regression test was added that uses a mock regressor returning zero variance and verifies that:

  • predict_proba().pdf() and log_pdf() remain finite
  • no numerical warnings are raised
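
The shape of such a test might look like the sketch below. It is not the test from this PR: `ZeroVarianceMock` and `fallback_log_pdf` are hypothetical stand-ins that mimic the clipped Normal fallback with NumPy alone, so the example runs without skpro installed.

```python
import warnings

import numpy as np


class ZeroVarianceMock:
    """Hypothetical stand-in for a regressor whose predict_var is always 0."""

    def predict(self, X):
        return np.zeros(len(X))

    def predict_var(self, X):
        return np.zeros(len(X))


def fallback_log_pdf(reg, X, y):
    """Hypothetical helper: Normal log-density from mean and clipped variance."""
    mu = reg.predict(X)
    var = np.clip(reg.predict_var(X), np.finfo(float).eps, None)
    sigma = np.sqrt(var)
    z = (y - mu) / sigma
    return -0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)


def test_zero_variance_is_finite():
    X = np.arange(5.0).reshape(-1, 1)
    y = np.zeros(5)
    with warnings.catch_warnings():
        warnings.simplefilter("error")  # any warning is raised as an exception
        log_pdf = fallback_log_pdf(ZeroVarianceMock(), X, y)
        pdf = np.exp(log_pdf)
    assert np.all(np.isfinite(log_pdf))
    assert np.all(np.isfinite(pdf))
    return log_pdf


result = test_zero_variance_is_finite()
```

Turning warnings into errors inside the `with` block covers the second check: the test fails if any divide-by-zero or invalid-value warning is emitted.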

@fkiraly added the labels bug, module:probability&simulation (probability distributions and simulators), and module:regression (probabilistic regression module), and removed the label module:probability&simulation (probability distributions and simulators) on Mar 21, 2026
Collaborator

@fkiraly left a comment

I would say this is a hack. Instead of clipping the variance, I would return a Delta distribution if it is below machine epsilon (possibly times a factor).

Also, the code formatting tests are failing. Please see the dev guide and run pre-commit.

@kindler-king force-pushed the bugfix-zero-variance-predict-proba branch from 8fc8178 to 8c79ee4 on March 24, 2026 13:31
@kindler-king
Contributor Author

Hello @fkiraly, thanks for the suggestion. I’ve updated the implementation.

  • All-zero variance now returns a Delta distribution.
  • Non-zero variance continues to return Normal.

For the mixed case, I explored returning per-row heterogeneous distributions, but according to my understanding, skpro doesn’t support that (no concat in BaseDistribution, and Mixture uses global weights).

So for now, I fall back to Normal with eps clamping for zero-variance entries to maintain numerical stability.
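
The three cases above can be sketched as a small dispatch function. This is an illustration only, not the code in the PR: `choose_fallback` and its `factor` parameter are hypothetical, the threshold `factor * eps` follows the reviewer's suggestion rather than a confirmed implementation detail, and the result is a plain `(kind, params)` pair instead of actual skpro distribution objects.

```python
import numpy as np

EPS = np.finfo(float).eps


def choose_fallback(pred_mean, pred_var, factor=1.0):
    """Decide which fallback distribution to build (hypothetical helper).

    Returns a (kind, params) pair; `factor` scales the degeneracy threshold.
    """
    pred_mean = np.asarray(pred_mean, dtype=float)
    pred_var = np.asarray(pred_var, dtype=float)
    degenerate = pred_var < factor * EPS  # rows treated as zero variance

    if degenerate.all():
        # Every row is degenerate: a point mass at the predicted mean.
        return "delta", {"c": pred_mean}
    if not degenerate.any():
        # No row is degenerate: the usual Normal fallback.
        return "normal", {"mu": pred_mean, "sigma": np.sqrt(pred_var)}
    # Mixed case: stay with Normal, but clamp variances so sigma > 0.
    clamped = np.clip(pred_var, EPS, None)
    return "normal_clamped", {"mu": pred_mean, "sigma": np.sqrt(clamped)}


all_zero = choose_fallback([1.0, 2.0], [0.0, 0.0])
regular = choose_fallback([1.0, 2.0], [0.5, 2.0])
mixed = choose_fallback([1.0, 2.0], [0.0, 2.0])
print(all_zero[0], regular[0], mixed[0])
```

The mixed branch is the compromise described above: rows with zero variance are approximated by a very narrow Normal rather than an exact point mass.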

Also fixed formatting issues using pre-commit.

Would love your feedback on this approach.

Labels

bug module:regression probabilistic regression module

Development

Successfully merging this pull request may close these issues.

[BUG] Zero predicted variance causes instability in BaseProbaRegressor.predict_proba

2 participants