Skip to content

[BUG] Handle Hurdle values less than zero properly#652

Merged
fkiraly merged 8 commits into
sktime:mainfrom
tingiskhan:hotfix/distributions/hurdle/bug-less-than-zero
Dec 4, 2025
Merged

[BUG] Handle Hurdle values less than zero properly#652
fkiraly merged 8 commits into
sktime:mainfrom
tingiskhan:hotfix/distributions/hurdle/bug-less-than-zero

Conversation

@tingiskhan
Copy link
Copy Markdown
Contributor

Reference Issues/PRs

Fixes #651

What does this implement/fix? Explain your changes.

Adds a wrapper for the methods s.t. values outside of support are set to 0 and -np.infty.

Does your contribution introduce a new dependency? If yes, which one?

No.

What should a reviewer concentrate their feedback on?

Did you add any tests for the change?

Yes.

Any other comments?

No,

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the skpro root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR).maintenance - CI, test framework, release.
    See here for full badge reference
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.

Copy link
Copy Markdown
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Could we avoid decorators to ensure the extension contract has a fixed signature?
That would also mean having a "three casess" distinction in _cdf rather than two cases only.

We should also add a small regression test.

@tingiskhan
Copy link
Copy Markdown
Contributor Author

Thanks!

Could we avoid decorators to ensure the extension contract has a fixed signature? That would also mean having a "three casess" distinction in _cdf rather than two cases only.

We should also add a small regression test.

Absolutely, no worries. Would you be fine with just removing the decorator and re-using the inner-function? I.e.

    def _set_to_constant_where_negative(x: np.ndarray, const: float) -> np.ndarray:
        return np.where(x < 0, const, result)

    def _log_pmf(self, x):
        log_prob_zero = np.log(1.0 - self.p)
        log_prob_hurdle = np.log(self.p)

        is_zero = x == 0
        result = np.where(is_zero, log_prob_zero, log_prob_positive)

        return _set_to_constant_where_negative(result, const=-np.inf)

@fkiraly
Copy link
Copy Markdown
Collaborator

fkiraly commented Nov 27, 2025

Would you be fine with just removing the decorator and re-using the inner-function? I.e.

Yes, sure - as long as it is vectorized (which it is)

@tingiskhan
Copy link
Copy Markdown
Contributor Author

tingiskhan commented Dec 2, 2025

@fkiraly, this is something I must have missed when first working on skpro, but are vectorized operations for univariate zero-dimensional distributions not supported? I can e.g. not run the below snippet of code without getting an exception

import numpy as np
import pandas as pd
from skpro.distributions import Normal

dist = Normal(0, 1)
xs = -1 + np.arange(0, 10)

pdf_vals = dist.pdf(xs)
log_pdf_vals = dist.log_pdf(xs)
cdf_vals = dist.cdf(xs)

And wondering if this is to be expected, or something wrong with my environment.

@fkiraly
Copy link
Copy Markdown
Collaborator

fkiraly commented Dec 3, 2025

this is expected, currently only broadcasting arguments up is supported, not broadcasting the distributions up.

Feels like a great idea to add, would you like to open an issue? Alghough I feel this might leed to too many complications if the broadcasting logic is not reworked cleanly to rely on numpy.

Copy link
Copy Markdown
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now!

@tingiskhan
Copy link
Copy Markdown
Contributor Author

this is expected, currently only broadcasting arguments up is supported, not broadcasting the distributions up.

Feels like a great idea to add, would you like to open an issue? Alghough I feel this might leed to too many complications if the broadcasting logic is not reworked cleanly to rely on numpy.

Sure thing!

Yes, it might becoma a bit nasty tbh. I could open an issue and then we could discuss how a possible solution would look. This question sort of goes hand-in-hand with the one relating to distribution support if we go the same route as pytorch/pyro with event and batch shapes.

@fkiraly
Copy link
Copy Markdown
Collaborator

fkiraly commented Dec 4, 2025

yes, issue would be much appreciated, @tingiskhan! I will merge this now then.

@fkiraly fkiraly merged commit 5abb3ac into sktime:main Dec 4, 2025
38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug module:probability&simulation probability distributions and simulators

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Hurdle CDF is wrong for negative values

2 participants