This issue was moved to a discussion.

skpro integration #69

Closed
fkiraly opened this issue Jan 26, 2024 · 1 comment

Comments

@fkiraly

fkiraly commented Jan 26, 2024

It would be great to integrate this package - and adjacent ones like LightGBMLSS - with skpro, which in turn directly integrates with sktime for time series forecasting.
(Both, of course, integrate seamlessly with sklearn.)

Issue opened here: sktime/skpro#184

This is very similar to @joshdunnlime's suggestion of an sklearn interface. skpro already provides interface specifications and stringent tests for probabilistic tabular regressors, so there is no need to write new ones.

What would be needed is, as far as I see it:

  • predict_proba interface
  • distributions from XGBoostLSS implemented as skpro tabular distributions
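
To make the first point concrete, here is a minimal sketch of what a `predict_proba`-style interface looks like in spirit. The class `ToyProbaRegressor` is hypothetical and uses only the standard library's `statistics.NormalDist`; it is not skpro or XGBoostLSS code, and returning a plain list of distributions is a simplification rather than skpro's actual return type.

```python
from statistics import NormalDist


class ToyProbaRegressor:
    """Hypothetical stand-in for a distributional regressor (illustration only)."""

    def fit(self, X, y):
        # Toy "model": ignore the features and fit one Normal to the targets.
        self._dist = NormalDist.from_samples(y)
        return self

    def predict_proba(self, X):
        # One predictive distribution per row of X (the same Normal for every row here).
        return [self._dist for _ in X]

    def predict(self, X):
        # Point prediction derived from the predictive distribution.
        return [d.mean for d in self.predict_proba(X)]


X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [1.0, 2.0, 3.0, 4.0]

model = ToyProbaRegressor().fit(X_train, y_train)
dists = model.predict_proba([[0.5], [1.5]])

# Quantiles, intervals, etc. can all be read off the distribution objects.
medians = [d.inv_cdf(0.5) for d in dists]
```

The point of the design is that `predict_proba` is the single source of truth, and point predictions or quantiles are derived from the returned distribution objects.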

Architecturally, there are two options:

  • small changes in XGBoostLSS, and work done in skpro in interfacing
  • or, import check_estimator from skpro (it works on distribution objects as well as on estimators), and use that to create fully skpro-conformant interfaces within XGBoostLSS. Then have a light import wrapper in skpro.
    • It is perhaps worth noting that skpro already has an adapter to tensorflow for distributions.

Personally, I would think option 1 is preferable, at least for the distributions: the different distribution types are of general use, including for StatMixedML's other packages, so it would avoid duplication of distribution objects or interfaces.

@joshdunnlime

Hello!
I can't profess to be anything like an expert on prob regression or forecasting, or on StatMixedML's LSS-verse (nice umbrella term btw). However, I'm sure you've seen the open PR here to add a sklearn-API to the XGBLSS package, hence my involvement. Once this PR is merged I would then like to turn my attention to LGBMLSS.

My main focus was to follow the XGB SK-API as closely as possible so that XGBLSS can be used as a drop-in replacement for XGB, should users wish to compare familiar XGB models to XGBLSS. My understanding is that the XGB SK-API attempts to follow the HistGradientBoostingRegressor as closely as possible. Of course, there are clear gaps in this API w.r.t. prob regression, hence my eagerness for some kind of guidance on an API from the sklearn team. I also wanted to respect the XGBLSS API as much as possible.

At the time I wasn't aware of skpro. It seems to me that adopting this well-established API makes sense. In the future, I'm sure sklearn and skpro will converge, or at least remain very well aligned. This does mean breaking from the XGBLSS API of .predict(pred_type="quantiles"). Should @StatMixedML agree with this direction, I would be very happy to implement it.

Additionally, there should be very little overhead if we wish to support both APIs.
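
As a rough illustration of why supporting both APIs could be cheap, the sketch below routes a `.predict(pred_type="quantiles")` call through the same distribution objects that `predict_proba` returns. Everything here is hypothetical: the class `DualApiRegressor`, the `predict_quantiles` helper, the `quantiles` argument, and the `pred_type=None` default are assumptions made for the example, and the "fitted model" is just a fixed `statistics.NormalDist`.

```python
from statistics import NormalDist


class DualApiRegressor:
    """Hypothetical wrapper exposing both call styles (illustration only)."""

    def __init__(self, mu=0.0, sigma=1.0):
        # Stand-in for fitted distribution parameters.
        self._dist = NormalDist(mu, sigma)

    def predict_proba(self, X):
        # Distribution-first style: one distribution object per row.
        return [self._dist for _ in X]

    def predict_quantiles(self, X, alpha):
        # Quantiles are derived from the predictive distributions.
        return [[d.inv_cdf(a) for a in alpha] for d in self.predict_proba(X)]

    def predict(self, X, pred_type=None, quantiles=(0.1, 0.5, 0.9)):
        # pred_type-style call: a thin dispatch onto the methods above.
        if pred_type is None:
            return [d.mean for d in self.predict_proba(X)]
        if pred_type == "quantiles":
            return self.predict_quantiles(X, alpha=quantiles)
        raise ValueError(f"unsupported pred_type: {pred_type!r}")


m = DualApiRegressor(mu=10.0, sigma=2.0)
q = m.predict([[0], [1]], pred_type="quantiles", quantiles=(0.5,))
point = m.predict([[0]])
```

Under these assumptions the legacy-style call is only a few lines of dispatch, so keeping it alongside the distribution-based methods adds little maintenance burden.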

Repository owner locked and limited conversation to collaborators Jan 27, 2024
@StatMixedML StatMixedML converted this issue into discussion #71 Jan 27, 2024

