# Introduction to `skbase`

Contents of this tutorial:

1. introduction to the unified `sklearn` / `sktime`-like interface supported by `skbase`
2. `skbase` usage patterns
3. package building with `skbase`

## 1 - Introducing the ``sklearn`` / `sktime` interface

- it is recommended you have worked through either an ``sklearn`` or ``sktime`` tutorial
- for ``sktime``, check out a previous [pydata tutorial](https://www.youtube.com/watch?v=ODspi8-uWgo) of ours, and of course visit [our website](https://www.sktime.net/en/latest/index.html)! 
- ``skbase`` is currently maintained by the ``sktime`` project.
  - We *love* new contributors, even if you are new to open source software development!
  - Check out the ``sktime`` [new contributors guide](https://www.sktime.net/en/latest/get_involved/contributing.html).


### 1.1 ``skbase``, ``sklearn``, ``sktime`` in a nutshell

- `skbase` is a workbench package for developers who create "`sklearn`-likes"
  - reusable base class factory with `get_params`, config, nested composition interface, etc
  - templated base classes compatible with `sklearn` / `sktime`
  - lookup and search utilities
  - factory templates for test frameworks 

- `sklearn` / `sktime` interface:
  - unified interface for objects/estimators
  - modular design, strategy pattern
  - composable, composites are interface homogeneous
  - simple specification language and parameter interface
  - visually informative pretty printing

- `sktime` base class design is an evolution on `sklearn`:
  - separation of `BaseObject` (non-fittable) and `BaseEstimator` (fittable)
  - `get_fitted_params` interface for fittable objects, similar to `get_params`
  - unified tag and config manager, dynamic tags
  - improved state handling - `clone`, `reset`
  - test case generation, e.g., `create_test_instances_and_names`
  - test framework with scenario and conditional fixture handling

### 1.2 sklearn unified interface - the strategy pattern

`sklearn` provides a unified interface to multiple learning tasks, including classification and regression.

any (supervised) estimator has the following interface points:

1. **Instantiate** your model of choice, with parameter settings
2. **Fit** the instance of your model
3. Use that fitted instance to **predict** new data!

![](./img/estimator-conceptual-model.jpg)

In [1]:
# get data to use the model on
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)

In [2]:
from sklearn.svm import SVC

# 1. Instantiate SVC with parameters gamma, C
clf = SVC(gamma=0.001, C=100.)

# 2. Fit clf to training data
clf.fit(X_train, y_train)

# 3. Predict labels on test data
y_test_pred = clf.predict(X_test)

y_test_pred

array([0, 0, 1, 2, 1, 2, 1, 1, 0, 0, 2, 0, 0, 2, 0, 0, 2, 2, 2, 1, 0, 0,
       2, 1, 1, 2, 2, 2, 2, 0, 2, 0, 1, 0, 0, 2, 0, 1])

IMPORTANT: to use another classifier, only the specification line (step 1) changes!

`SVC` could have been `RandomForestClassifier`; steps 2 and 3 remain the same - unified interface:

In [3]:
from sklearn.ensemble import RandomForestClassifier

# 1. Instantiate RandomForestClassifier with parameter n_estimators
clf = RandomForestClassifier(n_estimators=100)

# 2. Fit clf to training data
clf.fit(X_train, y_train)

# 3. Predict labels on test data
y_test_pred = clf.predict(X_test)

y_test_pred

array([0, 0, 1, 2, 1, 2, 1, 1, 0, 0, 2, 0, 0, 2, 0, 0, 2, 2, 2, 1, 0, 0,
       2, 1, 1, 1, 2, 2, 2, 0, 2, 0, 1, 0, 0, 2, 0, 1])

in object-oriented design terminology, this is called the **"strategy pattern"**

= different estimators can be switched out without change to the interface

= like a power plug adapter, it's plug&play if it conforms with the interface

Pictorial summary:
![](./img/sklearn-unified-interface.jpg)
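
The plug & play idea can be sketched in a few lines: because every classifier exposes the same `fit` / `predict` / `score` methods, one helper can evaluate any of them. The helper name `fit_and_score`, the choice of candidate classifiers, and `random_state=0` are illustrative assumptions, not part of the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)


def fit_and_score(estimator):
    """Fit any sklearn-style classifier and return its test accuracy."""
    estimator.fit(X_train, y_train)
    return estimator.score(X_test, y_test)


# only the specifications differ; the calling code is identical for all
candidates = [
    SVC(gamma=0.001, C=100.0),
    RandomForestClassifier(n_estimators=100, random_state=0),
    KNeighborsClassifier(n_neighbors=5),
]

scores = {type(est).__name__: fit_and_score(est) for est in candidates}
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```

The helper never inspects which estimator it receives, which is the point of the strategy pattern.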

parameters can be accessed and set via `get_params`, `set_params`:

In [4]:
clf.get_params()

{'bootstrap': True,
 'ccp_alpha': 0.0,
 'class_weight': None,
 'criterion': 'gini',
 'max_depth': None,
 'max_features': 'sqrt',
 'max_leaf_nodes': None,
 'max_samples': None,
 'min_impurity_decrease': 0.0,
 'min_samples_leaf': 1,
 'min_samples_split': 2,
 'min_weight_fraction_leaf': 0.0,
 'n_estimators': 100,
 'n_jobs': None,
 'oob_score': False,
 'random_state': None,
 'verbose': 0,
 'warm_start': False}

In [5]:
clf.set_params(n_estimators=42)
clf

fitted parameters end in an underscore:

In [6]:
from sklearn.linear_model import LogisticRegression
clf = LogisticRegression()
clf.fit(X_train, y_train)
clf.coef_

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


array([[-0.4187126 ,  0.87719211, -2.36276128, -1.00734237],
       [ 0.34128422, -0.32007984, -0.13769943, -0.84751076],
       [ 0.07742838, -0.55711227,  2.50046071,  1.85485313]])
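
A small sketch of the underscore convention: fitted attributes such as `coef_` do not exist before `fit` and appear afterwards. Setting `max_iter=1000` is an assumption made here to follow the remedy suggested by the convergence warning above; it is not a setting used in the original cell.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

clf = LogisticRegression(max_iter=1000)

# before fitting, the fitted attribute is absent
has_coef_before = hasattr(clf, "coef_")

clf.fit(X, y)

# after fitting, it is present, with shape (n_classes, n_features)
has_coef_after = hasattr(clf, "coef_")

print(has_coef_before, has_coef_after)
print(clf.coef_.shape)
```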

### 1.3 sklearn - composition patterns

`sklearn`'s unified interface extends to composition such as:

* tuning such as grid search
* ensembling such as bagging
* pipelining such as chaining pre-processing with a classifier

in that the composite also adheres to the unified interface!

This makes `sklearn` particularly powerful as a specification language,\
as compositors can be combined in any number of ways.

example compositions - tuning or ensembling:

![](./img/sklearn-composition-interface.png)
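
As a hedged sketch of tuning and ensembling, the snippet below wraps `SVC` in `GridSearchCV` and in `BaggingClassifier`, then drives both composites with the same `fit` / `predict` calls used for plain estimators. The parameter grid, `n_estimators=10`, and `random_state=0` are illustrative choices, not values from the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# tuning: grid search over SVC parameters, itself a classifier
tuned = GridSearchCV(SVC(), param_grid={"C": [1, 10, 100], "gamma": [0.001, 0.01]})

# ensembling: bagging of SVC instances, itself a classifier
bagged = BaggingClassifier(SVC(gamma=0.01), n_estimators=10, random_state=0)

predictions = {}
for composite in [tuned, bagged]:
    composite.fit(X_train, y_train)
    predictions[type(composite).__name__] = composite.predict(X_test)

print(tuned.best_params_)
```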

example - classification pipeline. Has the same interface!

In [7]:
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# 1. Instantiate the estimator
pipe = make_pipeline(StandardScaler(), SVC(gamma=0.01))

# 2. Fit pipe to training data
pipe.fit(X_train, y_train)

# 3. Predict labels on test data
y_test_pred = pipe.predict(X_test)

y_test_pred


array([0, 0, 1, 2, 1, 2, 2, 1, 0, 0, 2, 0, 0, 2, 0, 0, 2, 1, 1, 1, 0, 0,
       2, 1, 2, 1, 2, 2, 2, 0, 2, 0, 1, 0, 0, 2, 0, 1])

In [8]:
# nice pretty printing
# that makes the specification easy to read
pipe

parameters of the composite are addressed by

`[componentname]__[paramname]` (separated by double-underscore)

this can be nested indefinitely, in multiply nested compositions!

In [9]:
pipe.get_params()

{'memory': None,
 'steps': [('standardscaler', StandardScaler()), ('svc', SVC(gamma=0.01))],
 'verbose': False,
 'standardscaler': StandardScaler(),
 'svc': SVC(gamma=0.01),
 'standardscaler__copy': True,
 'standardscaler__with_mean': True,
 'standardscaler__with_std': True,
 'svc__C': 1.0,
 'svc__break_ties': False,
 'svc__cache_size': 200,
 'svc__class_weight': None,
 'svc__coef0': 0.0,
 'svc__decision_function_shape': 'ovr',
 'svc__degree': 3,
 'svc__gamma': 0.01,
 'svc__kernel': 'rbf',
 'svc__max_iter': -1,
 'svc__probability': False,
 'svc__random_state': None,
 'svc__shrinking': True,
 'svc__tol': 0.001,
 'svc__verbose': False}
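
The same double-underscore keys work for writing as well as reading. The sketch below sets component parameters through the pipeline, then wraps the pipeline in `GridSearchCV` to show one further level of nesting. The concrete values (`10.0`, `with_mean=False`) and the grid are illustrative assumptions.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipe = make_pipeline(StandardScaler(), SVC(gamma=0.01))

# set parameters of components via [componentname]__[paramname]
pipe.set_params(svc__C=10.0, standardscaler__with_mean=False)

print(pipe.get_params()["svc__C"])
print(pipe.named_steps["svc"].C)

# one more level of nesting: the pipeline is the "estimator" of the search
search = GridSearchCV(pipe, param_grid={"svc__C": [1.0, 10.0]})
print(search.get_params()["estimator__svc__C"])
```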

### 1.4 `skbase` / `sktime` is an evolution upon the `sklearn` base interface

`sktime` - and `skbase` which follows `sktime` - evolve the `sklearn` base interface in a number of ways, including:

- `get_fitted_params` interface for fittable objects, similar to `get_params`
- unified tag and config manager, dynamic tags
- improved state handling - `clone`, `reset`

`sktime` follows general `sklearn` interface patterns:

In [10]:
from sktime.datasets import load_airline
from sktime.forecasting.arima import ARIMA

y = load_airline()

fcst = ARIMA()

fcst.fit(y, fh=[1, 2, 3])

fcst.predict()

1961-01    426.544850
1961-02    421.282983
1961-03    416.207550
Freq: M, Name: Number of airline passengers, dtype: float64

### 1.4.1 unified tag and config system

each `skbase` / `sktime` estimator has tags and configs.

* tags are "properties" of the estimator, for developers and for search
* configs are "instructions" to the estimators, for users to set
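
To make the distinction concrete, here is a toy sketch in plain Python. It is not the `skbase` implementation: the class names `TaggedObject` and `IntervalForecaster`, and the private attributes `_tags_dynamic` and `_config_dynamic`, are hypothetical. It only illustrates the idea of class-level tags set by developers, with per-instance overrides, next to configs set by users.

```python
class TaggedObject:
    """Toy base class: class-level default tags plus per-instance overrides."""

    # tags: properties of the estimator, declared by the developer
    _tags = {"capability:pred_int": False, "handles-missing-data": False}
    # configs: instructions to the estimator, changed by the user
    _config = {"display": "diagram"}

    def __init__(self):
        self._tags_dynamic = {}
        self._config_dynamic = {}

    def get_tags(self):
        # collect tags through the class hierarchy, parents first
        tags = {}
        for cls in reversed(type(self).__mro__):
            tags.update(vars(cls).get("_tags", {}))
        # instance-level (dynamic) tags take precedence
        tags.update(self._tags_dynamic)
        return tags

    def set_tags(self, **tags):
        self._tags_dynamic.update(tags)
        return self

    def get_config(self):
        return {**self._config, **self._config_dynamic}

    def set_config(self, **config):
        self._config_dynamic.update(config)
        return self


class IntervalForecaster(TaggedObject):
    # the subclass overrides only the tag that differs from the parent
    _tags = {"capability:pred_int": True}


est = IntervalForecaster()
print(est.get_tags())

est.set_config(display="text")
print(est.get_config())
```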

#### `skbase` estimator tag system

In [11]:
fcst.get_tags()

{'scitype:y': 'univariate',
 'ignores-exogeneous-X': False,
 'capability:insample': True,
 'capability:pred_int': True,
 'capability:pred_int:insample': True,
 'handles-missing-data': True,
 'y_inner_mtype': 'pd.Series',
 'X_inner_mtype': 'pd.DataFrame',
 'requires-fh-in-fit': False,
 'X-y-must-have-same-index': True,
 'enforce_index_type': None,
 'fit_is_empty': False,
 'python_version': None,
 'python_dependencies': 'pmdarima'}

tags in `sktime` (and `skbase` templated packages) are listed and explained in the tag registry:

In [12]:
from sktime.registry import all_tags

all_tags("forecaster", as_dataframe=True)

| | name | scitype | type | description |
|---|---|---|---|---|
| 0 | X-y-must-have-same-index | [forecaster, regressor] | bool | do X/y in fit/update and X/fh in predict have ... |
| 1 | X_inner_mtype | [forecaster, transformer, transformer-pairwise... | (list, [pd.Series, pd.DataFrame, np.array, nes... | which machine type(s) is the internal _fit/_pr... |
| 2 | capability:insample | forecaster | bool | can the forecaster make in-sample predictions? |
| 3 | capability:pred_int | forecaster | bool | does the forecaster implement predict_interval... |
| 4 | capability:pred_int:insample | forecaster | bool | can the forecaster make in-sample predictions ... |
| 5 | capability:pred_var | forecaster | bool | does the forecaster implement predict_variance? |
| 6 | enforce_index_type | [forecaster, regressor] | type | passed to input checks, input conversion index... |
| 7 | ignores-exogeneous-X | forecaster | bool | does forecaster ignore exogeneous data (X)? |
| 8 | remember_data | [forecaster, transformer] | bool | whether estimator remembers all data seen as s... |
| 9 | requires-fh-in-fit | forecaster | bool | does forecaster require fh passed already in f... |


this can be used to search, e.g., for forecasters that can produce prediction intervals

In [13]:
from sktime.registry import all_estimators

all_estimators(
    "forecaster", filter_tags={"capability:pred_int": True}, as_dataframe=True
)

| | name | estimator |
|---|---|---|
| 0 | ARIMA | `<class 'sktime.forecasting.arima.ARIMA'>` |
| 1 | AutoARIMA | `<class 'sktime.forecasting.arima.AutoARIMA'>` |
| 2 | AutoETS | `<class 'sktime.forecasting.ets.AutoETS'>` |
| 3 | BATS | `<class 'sktime.forecasting.bats.BATS'>` |
| 4 | BaggingForecaster | `<class 'sktime.forecasting.compose._bagging.Ba...` |
| 5 | ColumnEnsembleForecaster | `<class 'sktime.forecasting.compose._column_ens...` |
| 6 | ConformalIntervals | `<class 'sktime.forecasting.conformal.Conformal...` |
| 7 | DynamicFactor | `<class 'sktime.forecasting.dynamic_factor.Dyna...` |
| 8 | ForecastX | `<class 'sktime.forecasting.compose._pipeline.F...` |
| 9 | ForecastingGridSearchCV | `<class 'sktime.forecasting.model_selection._tu...` |


#### `skbase` estimator config system

In [14]:
fcst.get_config()

{'display': 'diagram', 'print_changed_only': True}

using config to change display mode:

In [15]:
fcst

In [16]:
fcst.set_config(display="text")
fcst

ARIMA()

### 1.4.2 `get_fitted_params` - unified access to fitted parameters

every fittable `skbase` / `sktime` estimator has a unified `get_fitted_params` interface point

this retrieves fitted parameters as a `str` keyed `dict`,

in complete analogy to `sklearn`'s `get_params`

the default retrieves attributes `[attrname]_` at the key `"[attrname]"`

In [17]:
fcst.get_fitted_params()

{'intercept': 9.851123166191401,
 'ar.L1': 0.9645688130558445,
 'sigma2': 1118.471558740029,
 'aic': 1428.179380503289,
 'aicc': 1428.3508090747175,
 'bic': 1437.0888204020168,
 'hqic': 1431.7996741468405}
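
The default behaviour described above (attributes `[attrname]_` returned at key `"[attrname]"`) can be sketched with a toy class. `ToyMeanEstimator` is hypothetical and written in plain Python; it mimics the naming convention only and is not how `skbase` or `sktime` implement the method.

```python
class ToyMeanEstimator:
    """Hypothetical estimator, used only to illustrate the naming convention."""

    def fit(self, values):
        self.mean_ = sum(values) / len(values)
        self.n_samples_ = len(values)
        self._cache = list(values)  # private attribute, not a fitted parameter
        return self

    def get_fitted_params(self):
        # keep public attributes ending in "_", and strip the trailing underscore
        return {
            name[:-1]: value
            for name, value in vars(self).items()
            if name.endswith("_") and not name.startswith("_")
        }


est = ToyMeanEstimator().fit([1.0, 2.0, 3.0])
print(est.get_fitted_params())
```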

fitted parameter retrieval also works with nested composites, e.g., pipelines

behaviour is like `get_params` from `sklearn`

In [18]:
from sktime.datasets import load_airline
from sktime.forecasting.arima import ARIMA
from sktime.transformations.series.detrend import Deseasonalizer
from sktime.pipeline import make_pipeline

y = load_airline()

fcst = make_pipeline(Deseasonalizer(sp=12), ARIMA())

fcst.fit(y, fh=[1, 2, 3])

fcst.predict()

1961-01    434.548331
1961-02    421.796437
1961-03    454.441206
Freq: M, Name: Number of airline passengers, dtype: float64

In [19]:
fcst.get_fitted_params()

{'forecaster': ARIMA(),
 'steps': [('Deseasonalizer', Deseasonalizer(sp=12)), ('ARIMA', ARIMA())],
 'transformers_post': [],
 'transformers_pre': [('Deseasonalizer', Deseasonalizer(sp=12))],
 'Deseasonalizer': Deseasonalizer(sp=12),
 'ARIMA': ARIMA(),
 'Deseasonalizer__seasonal': Period
 1949-01   -24.748737
 1949-02   -36.188131
 1949-03    -2.241162
 1949-04    -8.036616
 1949-05    -4.506313
 1949-06    35.402778
 1949-07    63.830808
 1949-08    62.823232
 1949-09    16.520202
 1949-10   -20.642677
 1949-11   -53.593434
 1949-12   -28.619949
 Freq: M, Name: seasonal, dtype: float64,
 'ARIMA__intercept': 2.291647699095138,
 'ARIMA__ar.L1': 0.992152904734254,
 'ARIMA__sigma2': 299.71685461374733,
 'ARIMA__aic': 1240.0248111025328,
 'ARIMA__aicc': 1240.1962396739614,
 'ARIMA__bic': 1248.9342510012607,
 'ARIMA__hqic': 1243.6451047460844}

### 1.4.4 state handling via `clone`, `reset`

`skbase` / `sktime` estimators have `clone` and `reset` interface points:

* `clone` creates a blank, newly constructed copy of any object
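* `reset` resets the object to its state directly after construction

both return an object with the same specification, in its post-construction state!

* `clone` returns a new object (a copy), and does not mutate the original
* `reset` returns the same object (identical), and mutates it in place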

The equality dunder in `skbase` / `sktime` compares the *specification* for equality (not python object identity).

In [20]:
fcst = ARIMA(order=(1, 1, 0))
fcst.fit(y, fh=[1, 2, 3])
fcst.is_fitted

True

#### `clone` - create true copy of the specification

In [21]:
fcst_clone = fcst.clone()
fcst_clone.is_fitted

False

In [22]:
fcst_clone is fcst

False

In [23]:
fcst_clone == fcst

True

why is `clone` useful, as a method?

* allows handling case-specific logic in estimators and intermediate base classes
* no extensive coupling with a "loose method" `clone`
* design consistent with `reset`

#### `reset` - reset an estimator as if freshly constructed

In [24]:
fcst_reset = fcst.reset()
fcst_reset.is_fitted

False

In [25]:
fcst_reset is fcst

True

In [26]:
# this is of course also equal
fcst_reset == fcst

True

why is `reset` useful?

* internally, it actually runs `__init__`!
* it is called in `set_params`
* it can be called at the start of `fit` (and is in `sktime`)

so, preparation and parameter checking logic can happen in `__init__` (unlike in `sklearn`)
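
A toy sketch of the `clone` / `reset` semantics in plain Python, under the assumption that all constructor arguments are stored as attributes of the same name. `ToyObject` is hypothetical and much simpler than the `skbase` base classes; it only shows `reset` re-running `__init__`, `clone` constructing a fresh object, and equality comparing the specification.

```python
import inspect


class ToyObject:
    """Hypothetical stand-in for a base class; not the skbase implementation."""

    def __init__(self, alpha=1.0, beta="a"):
        self.alpha = alpha
        self.beta = beta

    def get_params(self):
        # parameter names are read off the constructor signature, skipping self
        names = list(inspect.signature(type(self).__init__).parameters)[1:]
        return {name: getattr(self, name) for name in names}

    def clone(self):
        # new object with the same specification, original untouched
        return type(self)(**self.get_params())

    def reset(self):
        # wipe all state, then re-run __init__ with the current parameters
        params = self.get_params()
        self.__dict__.clear()
        self.__init__(**params)
        return self

    def fit(self):
        self.fitted_ = True
        return self

    def __eq__(self, other):
        # equality compares the specification, not object identity
        return type(self) is type(other) and self.get_params() == other.get_params()


obj = ToyObject(alpha=2.0).fit()

obj_clone = obj.clone()
print(obj_clone is obj, obj_clone == obj, hasattr(obj_clone, "fitted_"))

obj_reset = obj.reset()
print(obj_reset is obj, hasattr(obj, "fitted_"), obj.alpha)
```

Because `reset` goes through `__init__`, any checks placed in the constructor run again on every reset in this sketch.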

### 1.4.5 integrated test case generation via `get_test_params`, `create_test_instances_and_names`

`skbase` / `sktime` estimators have test instance generation points:

* `get_test_params`, which returns a list of parameter dicts that can be passed to the constructor or `set_params`
* `create_test_instances_and_names`, which returns the instances as name-estimator tuples

This can be used with the `skbase` test framework for systematic testing in type specific scenarios.
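
A toy sketch of how test parameter sets can drive generic tests. `ToyScaler` and the helper `make_named_instances` are hypothetical names in plain Python; the helper stands in for the idea behind `create_test_instances_and_names` and does not reproduce its actual signature or return format.

```python
class ToyScaler:
    """Hypothetical object with a get_test_params hook."""

    def __init__(self, factor=1.0, shift=0.0):
        self.factor = factor
        self.shift = shift

    @classmethod
    def get_test_params(cls):
        # each dict is one parameter setting that tests should cover
        return [{"factor": 2.0}, {"factor": 0.5, "shift": 1.0}]

    def transform(self, values):
        return [self.factor * v + self.shift for v in values]


def make_named_instances(cls):
    """Hypothetical helper: build one named instance per test parameter set."""
    return [
        (f"{cls.__name__}-{i}", cls(**params))
        for i, params in enumerate(cls.get_test_params())
    ]


# a generic check that runs unchanged for every class with get_test_params
for name, instance in make_named_instances(ToyScaler):
    output = instance.transform([1.0, 2.0])
    assert len(output) == 2, f"{name} changed the number of values"
    print(name, output)
```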

### 1.5 `sktime` evolves the `sklearn` extension interface for power users!

`sklearn` has pioneered the easily extensible estimator interface:

it is easy to write your own estimators and maintain them in third party code bases!

`sktime` expands on this by introducing the **template pattern** on the extender interface side:

* outer/inner methods, e.g., `fit`/`_fit`, with opportunity for boilerplate on `fit`, e.g., input checks, estimator `reset`
* tags that control boilerplate and functionality without need to write it, e.g., preferred data type, check logic
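
The outer/inner split can be sketched in plain Python. The classes `ToyBaseEstimator` and `ToyMeanPredictor`, and the tag name `requires-numeric-input`, are hypothetical; the sketch shows the template pattern itself, with boilerplate in the public `fit` and only the core logic in `_fit`, rather than any actual `sktime` base class.

```python
class ToyBaseEstimator:
    """Hypothetical template base class: public methods hold the boilerplate."""

    # hypothetical tag that switches an input check on or off
    _tags = {"requires-numeric-input": True}

    def fit(self, values):
        # boilerplate: tag-controlled input check and state handling
        if self._tags["requires-numeric-input"]:
            if not all(isinstance(v, (int, float)) for v in values):
                raise TypeError("values must be numeric")
        self._is_fitted = False
        # delegate the core logic to the extender-implemented inner method
        self._fit(list(values))
        self._is_fitted = True
        return self

    def predict(self):
        # boilerplate: fitted-state check
        if not getattr(self, "_is_fitted", False):
            raise RuntimeError("call fit before predict")
        return self._predict()

    def _fit(self, values):
        raise NotImplementedError

    def _predict(self):
        raise NotImplementedError


class ToyMeanPredictor(ToyBaseEstimator):
    """Extender code: only the inner methods are written."""

    def _fit(self, values):
        self.mean_ = sum(values) / len(values)

    def _predict(self):
        return self.mean_


print(ToyMeanPredictor().fit([1, 2, 3]).predict())
```

The extender writes two short methods and inherits the checks, which is the division of labour the template pattern aims for.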

`skbase` does not *require* but *facilitates* these design patterns (combined strategy & template), more in notebook 3.

### 1.6 Summary/What is next!

- `sklearn` has a seminal interface design: unified interface (strategy pattern), modular, composition stable, easy specification language
- `sktime` evolves and consolidates the `sklearn` API: parameter, tag, config, state handling, advanced extender interface support (template pattern)
- `skbase` is a convenient developer workbench to construct packages that are conformant and compatible with the above!
- next: core usage patterns of `skbase` & playing with `skbase` `BaseObject`, `BaseEstimator`
- then: recipes for building packages with `skbase`, with examples

---
### Credits: notebook 1 - Sktime intro, toolbox features, Forecasting

notebook creation: fkiraly

some vignettes based on existing `sktime` tutorials, credit: fkiraly, miraep8, mloning and danbartl

slides (png/jpg): from fkiraly's postgraduate course at UCL, Principles and Patterns in Data Scientific Software Engineering

General credit also to `sklearn` and `sktime` contributors