[ENH] adapter from scipy rv_discrete to skpro Empirical #155

Merged
merged 3 commits into from
Dec 28, 2023

Conversation

fkiraly
Collaborator

@fkiraly fkiraly commented Dec 27, 2023

This PR adds an adapter from lists of scipy rv_discrete to skpro Empirical.

This adapter will be useful for interfacing statsmodels models that return lists of rv_discrete distributions to represent tabular distribution objects.

Also makes minor corrections to the Empirical docstring, and removes a stray line.
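
As a rough illustration of what such an adapter has to do, the sketch below flattens a list of discrete distributions into one long table of weighted support points, one block per row instance. This is not the code from this PR: `to_weighted_samples` and `DiscreteDist` are hypothetical names, and the sketch assumes each distribution exposes its support and probabilities as `xk` and `pk` attributes, as objects built with `scipy.stats.rv_discrete(values=(xk, pk))` do. A stand-in class is used so the example runs without scipy.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DiscreteDist:
    """Stand-in for a discrete distribution with support xk and probabilities pk."""

    xk: Sequence[float]
    pk: Sequence[float]


def to_weighted_samples(dists, index=None):
    """Flatten distributions into rows (sample_no, instance, value, weight).

    Hypothetical helper: each support point becomes one weighted "sample"
    for the instance (row) that its distribution belongs to.
    """
    if index is None:
        index = range(len(dists))
    rows = []
    for instance, dist in zip(index, dists):
        for sample_no, (x, p) in enumerate(zip(dist.xk, dist.pk)):
            rows.append((sample_no, instance, float(x), float(p)))
    return rows


dists = [
    DiscreteDist([0, 1], [0.25, 0.75]),
    DiscreteDist([0, 1, 2], [0.5, 0.25, 0.25]),
]
rows = to_weighted_samples(dists, index=["a", "b"])
```

Each instance ends up with a weighted sample whose weights sum to one; a real adapter would hand these values and weights to `Empirical` rather than return plain tuples.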

@fkiraly fkiraly added enhancement module:probability&simulation probability distributions and simulators labels Dec 27, 2023

codecov bot commented Dec 27, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (5c2c4c3) 64.09% compared to head (e7bfb7e) 64.62%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #155      +/-   ##
==========================================
+ Coverage   64.09%   64.62%   +0.52%     
==========================================
  Files         104      107       +3     
  Lines        5587     5636      +49     
  Branches     1047     1054       +7     
==========================================
+ Hits         3581     3642      +61     
+ Misses       1720     1712       -8     
+ Partials      286      282       -4     


@fkiraly fkiraly merged commit 216a768 into main Dec 28, 2023
36 checks passed
fkiraly added a commit that referenced this pull request Jan 30, 2024
…PH model (#157)

This PR implements framework support for survival (aka time-to-event or
failure time) prediction, adds tests, and adds an interface to `statsmodels`
Cox proportional hazards models as a test case.

Depends on #155 and
#159, which should be merged first.

### Design

Survival prediction models use the current `BaseProbaRegressor` base
class, which has `fit` extended to take a third argument `C`, a
dataframe-like with a censoring indicator.

Regressors capable of making use of the third argument `C` are
identified via the `capability:survival` tag (being `True`). Regressors
without this tag also take `C` but ignore it, corresponding to the
"ignore censoring" reduction strategy.

This way, all existing regressors can be used for survival prediction
and vice versa.
The interface is also fully backwards compatible: for users, `C`
defaults to `None`; for extenders, estimators without the tag do
not assume a `C` present in fit, as in this case only `X_inner`,
`y_inner` are passed in `fit`.
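
The tag-based dispatch described above can be sketched as follows. This is a minimal illustration, not the framework's actual implementation: the class names and the `_tags` dictionary are hypothetical, and the encoding of `C` (1 meaning censored) is an assumption made only for this example.

```python
class SketchBaseRegressor:
    """Hypothetical base class illustrating dispatch on capability:survival."""

    _tags = {"capability:survival": False}

    def fit(self, X, y, C=None):
        # C defaults to None, so existing fit(X, y) calls keep working
        if self._tags.get("capability:survival", False):
            self._fit(X, y, C)
        else:
            # "ignore censoring" reduction: C is accepted but not passed on
            self._fit(X, y)
        return self


class PlainRegressor(SketchBaseRegressor):
    """Regressor without the tag; its _fit never sees C."""

    def _fit(self, X, y):
        self.n_samples_ = len(y)


class SurvivalRegressor(SketchBaseRegressor):
    """Regressor with the tag; its _fit receives C."""

    _tags = {"capability:survival": True}

    def _fit(self, X, y, C=None):
        self.n_samples_ = len(y)
        # assumption for this sketch: C is a list of 0/1 flags, 1 = censored
        self.n_censored_ = 0 if C is None else sum(C)


X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]
C = [0, 1, 0]

plain = PlainRegressor().fit(X, y, C)      # C silently ignored
surv = SurvivalRegressor().fit(X, y, C)    # C used
surv_no_c = SurvivalRegressor().fit(X, y)  # C omitted
```

Because the base `fit` decides whether to forward `C`, an extender writing `PlainRegressor` never has to mention censoring at all.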

As the `predict` and `predict_proba` interfaces remain unchanged,
metrics do not need to be adapted; they work directly.

To avoid cluttering the docs for users who are interested primarily in
probabilistic regression without censoring, models with the
`capability:survival` tag have a more detailed `fit` docstring. The
difference is mediated via a base class `BaseSurvReg`, which is the same
as `BaseProbaRegressor` with docstring overrides.
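
One way to picture a base class that differs only in its docstrings is sketched below. The class names `SketchProbaRegressor` and `SketchSurvReg` and the docstring wording are hypothetical and only illustrate the pattern; they are not the classes added in this PR.

```python
class SketchProbaRegressor:
    """Hypothetical general-purpose base class with a short fit docstring."""

    def fit(self, X, y, C=None):
        """Fit regressor to training data X, y."""
        self.is_fitted_ = True
        return self


class SketchSurvReg(SketchProbaRegressor):
    """Hypothetical survival base class: same behaviour, longer docstring."""

    def fit(self, X, y, C=None):
        """Fit survival model to training data X, y.

        Parameters
        ----------
        X : feature data
        y : observed event or censoring times
        C : optional censoring indicator, same length as y
        """
        # no behavioural change: delegate straight to the parent
        return super().fit(X, y, C)


model = SketchSurvReg().fit([[1.0]], [2.0], C=[0])
```

Users browsing the general regression docs see the short docstring, while survival models surface the version that documents `C`.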

### Testing

As time-to-event models inherit from `BaseProbaRegressor`, the existing
`TestAllRegressors` test suite runs on all survival prediction models.

A scenario with a non-trivial `C` is added.

As regressors and time-to-event models have an interchangeable interface
(see above), both are tested with non-trivial `C`, with `C=None`, and
without a `C` being passed.
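
The three call patterns listed above can be enumerated as in the sketch below. This is only an illustration of the scenario design: `fit_call_variants` and `ToyRegressor` are hypothetical names, not part of the actual test suite.

```python
def fit_call_variants(X, y, C):
    """Return (args, kwargs) for the three ways fit is exercised."""
    return [
        ((X, y), {"C": C}),     # non-trivial C
        ((X, y), {"C": None}),  # C passed explicitly as None
        ((X, y), {}),           # C not passed at all
    ]


class ToyRegressor:
    """Hypothetical estimator that records whether it saw censoring info."""

    def fit(self, X, y, C=None):
        self.saw_censoring_ = C is not None
        return self


X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]
C = [0, 1, 0]

results = [
    ToyRegressor().fit(*args, **kwargs).saw_censoring_
    for args, kwargs in fit_call_variants(X, y, C)
]
```

Running every estimator through all three variants checks that plain regressors tolerate `C` and that survival models tolerate its absence.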

### Further contents

* an interface to `statsmodels` proportional hazards models,
`skpro.survival.coxph.CoxPH`, to showcase and test the interface
* `Pipeline` is updated to accommodate survival models; for this, the tag
needs to be carried and `C` passed through in `_fit`
* update to the API reference - new page for survival prediction
* survival prediction extension template