# Automatic Piecewise Linear Regression (APLR)

Links to API References: [APLRRegressor](./python/api/APLRRegressor.ipynb), [APLRClassifier](./python/api/APLRClassifier.ipynb)

*See the backing repository for APLR [here](https://github.com/ottenbreit-data-science/aplr).*

<h2>Summary</h2>

APLR produces inherently interpretable models. The relationship between the response and its explanatory variables is modeled with piecewise linear basis functions. The algorithm automatically handles variable selection, non-linear relationships and interactions. Empirical tests show that APLR is often able to compete with tree-based methods on predictiveness. APLR can be used for regression and classification tasks, including multiclass classification. The implementation is a light wrapper around the `aplr` package that adds the `explain_global` and `explain_local` methods, so that APLR models can be interpreted in the same framework as, for example, EBMs.

<h2>How it Works</h2>

A brief introduction to APLR can be found [here](https://github.com/ottenbreit-data-science/aplr/tree/main/documentation). An article that describes APLR in detail and compares its predictiveness against other algorithms can be found [here](https://rdcu.be/dz7bF). 

For implementation-specific details on linear models, scikit-learn's user guide [[2](pedregosa2011scikit_lr)] is a solid reference and can be found [here](https://scikit-learn.org/stable/modules/linear_model.html).
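
To make the idea of piecewise linear basis functions from the summary more concrete, here is a minimal sketch in plain NumPy. It is not the APLR algorithm: it assumes hinge-shaped terms of the form `max(x - t, 0)`, which are one common way to build piecewise linear functions, and it fits them with ordinary least squares. The helper `hinge_basis`, the toy data and the hand-picked knots are all hypothetical; APLR selects its own terms automatically, and the exact form of those terms is described in the article linked above.

```python
import numpy as np


def hinge_basis(x, knots):
    """Return a design matrix with an intercept, x, and max(x - t, 0) per knot t.

    Hypothetical helper for illustration only; not part of `aplr` or `interpret`.
    """
    x = np.asarray(x, dtype=float)
    columns = [np.ones_like(x), x]
    columns += [np.maximum(x - t, 0.0) for t in knots]
    return np.column_stack(columns)


# Toy, noise-free data: the slope changes from 1 to 3 at x = 2.
x = np.linspace(0.0, 4.0, 41)
y = x + 2.0 * np.maximum(x - 2.0, 0.0)

# Hand-picked candidate knots (an assumption made for this sketch).
knots = [1.0, 2.0, 3.0]
B = hinge_basis(x, knots)

# Ordinary least squares over the basis functions.
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
y_hat = B @ coef

print("coefficients:", np.round(coef, 3))
```

In a fit like this, each hinge coefficient is the change in slope at its knot, so the shape of the fitted function can be read directly from the coefficients. That is the sense in which a model built from such terms stays interpretable.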

<h2>Code Example</h2>

The following code trains a logistic regression model on the breast cancer dataset using `LogisticRegression` from `interpret.glassbox`, then displays both global and local explanations.

In [1]:
from interpret import set_visualize_provider
from interpret.provider import InlineProvider
set_visualize_provider(InlineProvider())

In [None]:
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

from interpret.glassbox import LogisticRegression
from interpret import show

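# Fix the seed so the train/test split and the fit are reproducible.
seed = 42
np.random.seed(seed)

# Load the data as a DataFrame and hold out 20% of the rows for testing.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=seed)

lr = LogisticRegression(max_iter=3000, random_state=seed)
lr.fit(X_train, y_train)

# Evaluate on the held-out set using the predicted probability of the positive class.
auc = roc_auc_score(y_test, lr.predict_proba(X_test)[:, 1])
print("AUC: {:.3f}".format(auc))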

In [None]:
show(lr.explain_global())

In [None]:
show(lr.explain_local(X_test[:5], y_test[:5]), 0)

<h2>Further Resources</h2>

- [Wikipedia on Linear Models](https://en.wikipedia.org/wiki/Linear_model)
- [scikit-learn on their Linear Models module](https://scikit-learn.org/stable/modules/linear_model.html)

<h2>Bibliography</h2>

(molnar2020interpretable_lr)=
[1] Christoph Molnar. Interpretable Machine Learning. Lulu.com, 2020.

(pedregosa2011scikit_lr)=
[2] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, and others. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.