(wish)list of probabilistic regressors to implement or to interface #32

Closed
2 of 34 tasks
fkiraly opened this issue Oct 31, 2019 · 9 comments

Comments

@fkiraly
Contributor

fkiraly commented Oct 31, 2019

A wishlist for probabilistic regression methods to implement or interface.
Number of stars at the end is estimated difficulty or time investment.

GLM

  • generalized linear model(s) with regression link, e.g., Gaussian *
  • generalized linear model(s) with count link, e.g., Poisson *
  • heteroscedastic linear regression ***
  • Bayesian GLM where conjugate priors are available, e.g., GLM with Gaussian link ***
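The first one-star item can be sketched in a few lines: an ordinary least-squares fit whose output is a predictive Gaussian (mean and plug-in noise variance) rather than a point forecast. A minimal numpy sketch under assumed synthetic data; all names are illustrative, not mlr3proba API:

```python
import numpy as np

# Illustrative data: linear signal plus homoscedastic Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
beta_true = np.array([1.5, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=200)

# Fit by least squares; the plug-in residual variance gives the
# homoscedastic predictive Gaussian N(x' beta_hat, sigma2_hat).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (len(y) - X.shape[1])

x_new = np.array([1.0, 0.0, -1.0])
pred_mean = x_new @ beta_hat
pred_var = sigma2_hat  # ignores parameter uncertainty, for simplicity
```

The heteroscedastic and Bayesian variants differ only in how `pred_var` is produced (a second model, or a posterior).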

KRR aka Gaussian process regression

  • vanilla kernel ridge regression with fixed kernel parameters and variance *
  • kernel ridge regression with MLE for kernel parameters and regularization parameter **
  • heteroscedastic KRR or Gaussian processes ***
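The vanilla one-star KRR/GP item amounts to the standard Gaussian process predictive equations with fixed kernel parameters and noise variance. A numpy sketch, assuming an RBF kernel and synthetic data (illustrative names throughout):

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    # Squared-exponential kernel matrix between row sets A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=50)

noise = 0.1**2  # fixed, not estimated (the one-star setting)
K = rbf(X, X) + noise * np.eye(len(y))
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))

# Predictive Gaussian at a test point: mean k*' alpha,
# variance k** - k*' K^{-1} k* (add `noise` for a noisy observation).
Xs = np.array([[0.5]])
Ks = rbf(Xs, X)
mean = (Ks @ alpha)[0]
v = np.linalg.solve(L, Ks.T)
var = rbf(Xs, Xs)[0, 0] - (v * v).sum()
```

The two-star MLE variant would optimize `lengthscale` and `noise` against the marginal likelihood instead of fixing them.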

CDE

  • variants of conditional density estimation (Nadaraya-Watson type) **
  • reduction to density estimation by binning of input variables, then applying unconditional density estimation **
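A Nadaraya-Watson type conditional density estimate weights a kernel density in y by kernel proximity in x: p(y|x) ≈ Σᵢ K_x(x − xᵢ) K_y(y − yᵢ) / Σᵢ K_x(x − xᵢ). A minimal sketch with Gaussian kernels and fixed bandwidths (all names and data illustrative):

```python
import numpy as np

def gauss(u, h):
    # Gaussian kernel with bandwidth h.
    return np.exp(-0.5 * (u / h) ** 2) / (h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(4)
x = rng.uniform(0, 1, 500)
y = x + rng.normal(scale=0.1, size=500)

def cond_density(y0, x0, hx=0.1, hy=0.1):
    # Weighted kernel density of y, weights from proximity in x.
    wx = gauss(x0 - x, hx)
    return float((wx * gauss(y0 - y, hy)).sum() / wx.sum())
```

Bandwidth selection (cross-validation or plug-in rules) is where most of the two-star effort would go.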

Tree-based

  • probabilistic regression trees **

Neural networks

  • interface tensorflow probability ***

Bayesian toolboxes

  • generic Stan interface ****
  • generic JAGS interface ****
  • generic BUGS interface ****
  • generic Bayesian interface - prior-valued hyperparameters *****

Pipeline elements for target transformation

  • distr fixed target transformation **
  • distr predictive target calibration **

Composite techniques, reduction to deterministic regression

  • stick mean and sd from a deterministic regressor which already returns these into some location/scale distr family (Gaussian, Laplace) *
  • use model 1 for the mean, model 2 fit to residuals (squared, absolute, or log), put this in some location/scale distr family (Gaussian, Laplace) **
  • upper/lower thresholder for a regression prediction, to use as a pipeline element for a forced lower variance bound **
  • generic parameter prediction by elicitation, output being plugged into parameters of a distr object not necessarily scale/location ****
  • reduction via bootstrapped sampling of a deterministic regressor **
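The model-1/model-2 residual item above can be sketched concretely: fit any deterministic regressor for the mean, fit a second one to the absolute residuals, and plug both into a location/scale family. Least squares stands in for both component models here; all names and data are illustrative:

```python
import numpy as np

# Illustrative heteroscedastic data: noise sd grows with x.
rng = np.random.default_rng(2)
x = rng.uniform(0, 2, size=300)
y = 2 * x + rng.normal(scale=0.2 + 0.5 * x)

X = np.column_stack([np.ones_like(x), x])
beta1, *_ = np.linalg.lstsq(X, y, rcond=None)          # model 1: mean
abs_resid = np.abs(y - X @ beta1)
beta2, *_ = np.linalg.lstsq(X, abs_resid, rcond=None)  # model 2: scale

def predict_gaussian(x_new):
    """Return (mean, sd) of the plug-in predictive Gaussian."""
    f = np.array([1.0, x_new])
    # E|Z| = sd * sqrt(2/pi) for Gaussian Z, so rescale model 2's output.
    return f @ beta1, max(f @ beta2, 1e-6) * np.sqrt(np.pi / 2)
```

Squared or log residuals in model 2 correspond to the other variants listed, with the matching back-transformation.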

Ensembling type pipeline elements and compositors

  • simple bagging, averaging of pdf/cdf **
  • probabilistic boosting ***
  • probabilistic stacking ***
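Simple bagging of probabilistic predictions is just cdf averaging: fit the same learner on bootstrap resamples and return the mixture. A sketch where each ensemble member is the featureless Gaussian fit to a resample (illustrative names and data):

```python
import numpy as np
from math import erf

def norm_cdf(x, mu, sd):
    # Gaussian cdf via the error function.
    return 0.5 * (1 + erf((x - mu) / (sd * np.sqrt(2))))

rng = np.random.default_rng(5)
y = rng.normal(3.0, 1.0, 200)

members = []
for _ in range(25):
    b = rng.choice(y, size=len(y), replace=True)  # bootstrap resample
    members.append((b.mean(), b.std(ddof=1)))

def bagged_cdf(x):
    # The bagged predictive distribution: mean of the member cdfs.
    return float(np.mean([norm_cdf(x, m, s) for m, s in members]))
```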

Baselines

  • always predict a Gaussian with mean = training mean, var = training var *
  • IMPORTANT as featureless baseline: reduction to distr/density estimation to produce an unconditional probabilistic regressor **
  • IMPORTANT as deterministic style baseline: reduction to deterministic regression, mean = prediction by det.regressor, var = training sample var, distr type = Gaussian (or Laplace) **
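The first baseline above fits in a dozen lines: ignore the features entirely and always predict N(training mean, training variance). A minimal sketch; the class name and fit/predict shape are illustrative, not mlr3proba's interface:

```python
import numpy as np

class FeaturelessGaussian:
    """Always predict N(training mean, training variance)."""

    def fit(self, X, y):
        self.mu = float(np.mean(y))
        self.sigma = float(np.std(y, ddof=1))
        return self

    def predict(self, X):
        # One identical (mean, sd) pair per test row.
        n = len(X)
        return np.full(n, self.mu), np.full(n, self.sigma)

rng = np.random.default_rng(3)
y_train = rng.normal(loc=5.0, scale=2.0, size=1000)
model = FeaturelessGaussian().fit(None, y_train)
means, sds = model.predict(np.zeros((4, 2)))
```

The deterministic-style baseline differs only in taking `mu` from a fitted point regressor instead of the training mean.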

Other reduction from/to probabilistic regression

  • reducing deterministic regression to probabilistic regression - take mean, median or mode **
  • reduction(s) to quantile regression, use predictive quantiles to make a distr ***
  • reducing deterministic (quantile) regression to probabilistic regression - take quantile(s) **
  • reducing interval regression to probabilistic regression - take mean/sd, or take quantile(s) **
  • reduction to survival, as the sub-case of no censoring **
  • reduction to classification, by binning ***
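The quantile-regression reduction above can be sketched as interpolation: given predicted quantiles at a few levels, build an approximate predictive cdf through the (quantile, level) pairs. The quantile values below are stand-ins for the output of any quantile regressor:

```python
import numpy as np

# Illustrative predicted quantiles (here those of N(0,1)).
levels = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
quantiles = np.array([-1.28, -0.67, 0.0, 0.67, 1.28])

def approx_cdf(x):
    """Piecewise-linear cdf through the predicted (quantile, level) pairs."""
    return float(np.interp(x, quantiles, levels, left=0.0, right=1.0))
```

A smoother distr object would fit a parametric family to the same pairs, or extend the tails beyond the outermost quantiles.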
@RaphaelS1
Collaborator

To note:

Anyone can implement these; however, the high-level interface for composition/reduction needs to be discussed, as does the interfacing of Bayesian toolboxes.

@fkiraly
Contributor Author

fkiraly commented Oct 31, 2019

yes, indeed:

  • priors ought to be hyper-parameters, and we haven't agreed on a representation, especially in the context of the sets6 discussion
  • composition/reduction should be compatible with mlr3pipelines, though we agreed today with @mllg that the best way to integrate these is to see components as incoming arrows, and have compositors as special pipeline network nodes

@RaphaelS1
Collaborator

Oh and can I suggest adding some baselines? e.g. Gaussian with mean = sample mean, variance = sample var?

@fkiraly
Contributor Author

fkiraly commented Oct 31, 2019

Regarding Bayes, perhaps it's premature to look at this at all, without thinking carefully about a Bayesian mlr interface - since the issue with priors is potentially also of relevance in Bayesian classifiers, or Bayesian [any method].

@fkiraly
Contributor Author

fkiraly commented Oct 31, 2019

and, obviously, any suggestions for the wishlist are welcome too

@fkiraly
Contributor Author

fkiraly commented Nov 2, 2019

> Oh and can I suggest adding some baselines? e.g. Gaussian with mean = sample mean, variance = sample var?

That's a special case of two methods already there:

  • the model1/model2 residual fitter, where both components are the featureless regressor
  • the reduction-to-distr-estimation featureless baseline, where the density estimator is just Gaussian MLE

Though I agree it probably should be a "special" baseline with its own name, perhaps "the" baseline.

I made a special "baseline" section.

@fkiraly
Contributor Author

fkiraly commented Dec 16, 2019

In line with "one feature, one issue" principle (which @RaphaelS1 mentioned in communication elsewhere) - should this be split in individual issues, and the list moved to wiki?
Issues can be collected in projects.

@RaphaelS1
Collaborator

If we split this into "one feature, one issue" now, it will bloat the issue tracker. Let's split it once we actually finish the design and start implementing learners.

@fkiraly
Contributor Author

fkiraly commented Dec 16, 2019

ok, let me know when. Just trying to comply with local best practice conventions.

@RaphaelS1 closed this as not planned on Dec 8, 2023.