# AutoML with LazyPredict

[GitHub repository](https://github.com/shankarpandala/lazypredict)

In [None]:
from lazypredict.Supervised import LazyClassifier, LazyRegressor
from sklearn.model_selection import train_test_split
from sklearn import datasets

## Classification

In [None]:
# load data
data = datasets.load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=42)
# fit all models
clf = LazyClassifier(predictions=True)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

One detail that is easy to miss is that the library automatically preprocesses the dataset before fitting any model.

First, it imputes missing values with `SimpleImputer`, using the mean for numeric features and the constant value `'missing'` for categorical ones. Then it scales the numeric features with `StandardScaler` and encodes the categorical features with either `OneHotEncoder` or `OrdinalEncoder`, depending on their cardinality (the number of unique values).

While this is handy and ensures that the models will actually run, some users might prefer different approaches to preprocessing the dataset.
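For readers who want to control these steps themselves, the preprocessing described above can be approximated with plain scikit-learn. The following is a minimal sketch, not the library's actual code: the helper name `build_preprocessor`, the `cardinality_threshold` parameter and its default of 10, and the choice to one-hot encode the low-cardinality columns are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler


def build_preprocessor(df: pd.DataFrame, cardinality_threshold: int = 10) -> ColumnTransformer:
    """Build an impute/scale/encode transformer (hypothetical helper, assumed threshold)."""
    # Treat every non-numeric column as categorical.
    numeric_cols = df.select_dtypes(include="number").columns.tolist()
    categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

    # Split categorical columns by number of unique values (assumed rule).
    low_card = [c for c in categorical_cols if df[c].nunique() <= cardinality_threshold]
    high_card = [c for c in categorical_cols if c not in low_card]

    numeric = Pipeline([
        ("imputer", SimpleImputer(strategy="mean")),
        ("scaler", StandardScaler()),
    ])
    low = Pipeline([
        ("imputer", SimpleImputer(strategy="constant", fill_value="missing")),
        ("encoder", OneHotEncoder(handle_unknown="ignore")),
    ])
    high = Pipeline([
        ("imputer", SimpleImputer(strategy="constant", fill_value="missing")),
        ("encoder", OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)),
    ])

    # sparse_threshold=0 makes the combined output a dense NumPy array.
    return ColumnTransformer(
        [
            ("numeric", numeric, numeric_cols),
            ("categorical_low", low, low_card),
            ("categorical_high", high, high_card),
        ],
        sparse_threshold=0,
    )
```

Calling `build_preprocessor(df).fit_transform(df)` on a pandas DataFrame returns the transformed feature matrix. Swapping a step, for example using median imputation instead of the mean, only requires changing the corresponding pipeline entry.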

The `models` DataFrame contains the performance summary of all the classifiers, while `predictions` (returned because `predictions=True` was set) contains each model's predictions on the test set.

In [None]:
models

In [None]:
predictions
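Because `predictions` holds the output of every model, it can be used to compute metrics that are not part of the summary table. The sketch below assumes the frame has one column per model and one row per test sample. The helper `score_predictions` and the small stand-in frame are hypothetical and only illustrate the idea.

```python
import pandas as pd
from sklearn.metrics import accuracy_score


def score_predictions(predictions: pd.DataFrame, y_true) -> pd.Series:
    """Return the accuracy of each column (assumed: one column per model), best first."""
    scores = {
        name: accuracy_score(y_true, predictions[name])
        for name in predictions.columns
    }
    return pd.Series(scores, name="accuracy").sort_values(ascending=False)


# Hypothetical stand-in for the `predictions` frame and the test labels.
y_true = [0, 1, 1, 0]
stand_in = pd.DataFrame({"ModelA": [0, 1, 1, 0], "ModelB": [0, 0, 1, 0]})
ranking = score_predictions(stand_in, y_true)
```

With the real objects from the cells above, the call would be `score_predictions(predictions, y_test)`. Any other scikit-learn metric with the same `(y_true, y_pred)` signature can replace `accuracy_score`.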

## Regression

In [None]:
# load data (the diabetes dataset ships with scikit-learn; `load_boston` was removed in version 1.2)
data = datasets.load_diabetes()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=42)
# fit all models
reg = LazyRegressor(predictions=True)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

In [None]:
models

In [None]:
predictions