# Lazy Predict

- When building machine learning models, we rarely know in advance which algorithm will work well on a given dataset, so we end up trying many models and iterating until the accuracy is acceptable. What if you could train all the basic algorithms at once and compare their performance?

- LazyPredict is a module built for this purpose. It fits the basic machine learning algorithms to your dataset and reports how each one performs. Along with the accuracy score, it reports further evaluation metrics and the time taken by each model.

### What is Lazy Predict?

- LazyPredict is an open-source Python library that automates the model training pipeline and speeds up the workflow. LazyPredict trains around 30 classification models for a classification dataset and trains around 40 regression models for a regression dataset.

- LazyPredict trains the models and returns a table of their performance metrics, with very little code to write. You can compare the metrics of each model and then tune the best one to improve performance further.
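To make the idea concrete, here is a minimal sketch of the kind of loop such a library automates, written with plain scikit-learn. It is not LazyPredict's actual implementation: the three-model shortlist, the scaling step, and the variable names are assumptions chosen for illustration.

```python
import time

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=123
)

# Hypothetical shortlist; LazyPredict tries far more estimators than this.
candidates = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=42),
}

results = []
for name, estimator in candidates.items():
    start = time.perf_counter()
    # Scale the features, then fit the estimator with its default settings.
    model = make_pipeline(StandardScaler(), estimator)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    elapsed = time.perf_counter() - start
    results.append((name, accuracy, elapsed))

# Best accuracy first, like a leaderboard.
results.sort(key=lambda row: row[1], reverse=True)
for name, accuracy, elapsed in results:
    print(f"{name:<24} accuracy={accuracy:.2f} time={elapsed:.2f}s")
```

Every model is fitted with default hyperparameters, so the ranking is a starting point for comparison rather than a final verdict.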

#### Parameters used in LazyRegressor()

        verbose – by default 0
        ignore_warnings – by default True, which suppresses warning messages raised while the models are being built
        custom_metric – by default None, can be set to a user-defined metric function
        predictions – by default False, if set to True the predictions of each model are returned as well
        random_state – by default 42
        Note that all of these parameters are optional. Any parameter left undefined takes its default value.
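As an illustration of `custom_metric`, the sketch below defines a metric function that could be passed in. It assumes the callable receives the true labels and the predicted labels, in that order. The `specificity` function itself is a hypothetical example, not part of LazyPredict.

```python
import numpy as np


def specificity(y_true, y_pred):
    """True-negative rate for binary labels, where 0 is the negative class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    negatives = y_true == 0
    if not negatives.any():
        return float("nan")  # undefined when there are no negative samples
    # Fraction of actual negatives that were predicted as negative.
    return float(np.mean(y_pred[negatives] == 0))


print(specificity([0, 0, 1, 1], [0, 1, 1, 1]))  # 0.5
```

Under that assumption, passing `custom_metric=specificity` when constructing the estimator would add one more column to the results table.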

In [1]:
# ! pip install --user lazypredict

## Classification

In [2]:
# Import libraries
import warnings
warnings.simplefilter('ignore')

from lazypredict.Supervised import LazyClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

In [3]:
# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

In [4]:
# Data split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=123)

In [5]:
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((284, 30), (285, 30), (284,), (285,))

In [6]:
# Define the LazyClassifier, then fit and evaluate every model
clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

100%|██████████████████████████████████████████████████████████████████████████████████| 29/29 [00:02<00:00, 11.85it/s]


In [7]:
# Prints the model performance
models

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|---|---|---|---|---|---|
| LinearSVC | 0.99 | 0.99 | 0.99 | 0.99 | 0.03 |
| Perceptron | 0.99 | 0.98 | 0.98 | 0.99 | 0.02 |
| LogisticRegression | 0.99 | 0.98 | 0.98 | 0.99 | 0.03 |
| SVC | 0.98 | 0.98 | 0.98 | 0.98 | 0.03 |
| XGBClassifier | 0.98 | 0.98 | 0.98 | 0.98 | 0.57 |
| LabelPropagation | 0.98 | 0.97 | 0.97 | 0.98 | 0.03 |
| LabelSpreading | 0.98 | 0.97 | 0.97 | 0.98 | 0.03 |
| BaggingClassifier | 0.97 | 0.97 | 0.97 | 0.97 | 0.11 |
| PassiveAggressiveClassifier | 0.98 | 0.97 | 0.97 | 0.98 | 0.03 |
| SGDClassifier | 0.98 | 0.97 | 0.97 | 0.98 | 0.03 |
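The printed output suggests the results are a pandas DataFrame indexed by model name, so the leaderboard can be filtered like any other table. The sketch below rebuilds a few rows by hand (values copied from the results above) and shortlists the accurate models, fastest first. The 0.98 threshold is an arbitrary choice for illustration.

```python
import pandas as pd

# A few rows copied from the classification results above.
scores = pd.DataFrame(
    {
        "Model": ["LinearSVC", "Perceptron", "XGBClassifier", "BaggingClassifier"],
        "Accuracy": [0.99, 0.99, 0.98, 0.97],
        "Time Taken": [0.03, 0.02, 0.57, 0.11],
    }
).set_index("Model")

# Keep models at or above the accuracy threshold, then prefer the fastest.
shortlist = scores[scores["Accuracy"] >= 0.98].sort_values("Time Taken")
print(shortlist)
```

When several models score almost the same, training time is a reasonable tie-breaker before investing in tuning.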


## Regression

In [8]:
from lazypredict.Supervised import LazyRegressor
from sklearn import datasets
from sklearn.utils import shuffle
import numpy as np

In [9]:
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older version.
boston = datasets.load_boston()
X, y = shuffle(boston.data, boston.target, random_state=13)
X = X.astype(np.float32)

In [10]:
offset = int(X.shape[0] * 0.9)

X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]

In [11]:
reg = LazyRegressor(verbose=0, ignore_warnings=True, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

print(models)

100%|██████████████████████████████████████████████████████████████████████████████████| 42/42 [00:04<00:00,  9.56it/s]

                               Adjusted R-Squared  R-Squared  RMSE  Time Taken
Model                                                                         
SVR                                          0.83       0.88  2.62        0.03
BaggingRegressor                             0.83       0.88  2.63        0.07
NuSVR                                        0.82       0.86  2.76        0.02
RandomForestRegressor                        0.81       0.86  2.78        0.55
XGBRegressor                                 0.81       0.86  2.79        0.12
GradientBoostingRegressor                    0.81       0.86  2.84        0.24
ExtraTreesRegressor                          0.79       0.84  2.98        0.34
AdaBoostRegressor                            0.78       0.83  3.04        0.18
HistGradientBoostingRegressor                0.77       0.83  3.06        0.64
PoissonRegressor                             0.77       0.83  3.11        0.02
LGBMRegressor                                0.77   
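The Adjusted R-Squared column penalizes R-Squared for the number of features. The sketch below uses the standard formula, 1 - (1 - R²)(n - 1)/(n - p - 1), and assumes it is evaluated on the test set: n = 51 samples (the last 10% of the 506 rows) and p = 13 features. Whether LazyPredict computes it in exactly this way is an assumption.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R-squared for a model with n_features predictors."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("need more samples than features + 1")
    return 1 - (1 - r2) * (n_samples - 1) / dof


# R-Squared of the top model in the table, with the test-set dimensions.
value = adjusted_r2(0.88, n_samples=51, n_features=13)
print(f"{value:.3f}")  # 0.838
```

This is close to the 0.83 shown for SVR. The small gap is expected, because the R-Squared in the table is itself rounded to two decimals.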




#### Conclusion

LazyPredict is very handy for picking the most accurate model for a dataset from a wide variety of candidates, together with their evaluation metrics, within a few seconds. The best model can then be tuned by searching over its hyperparameters. The library is easy to use because it performs the preprocessing itself.
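As a follow-up step, the top classifier from the breast cancer results, LinearSVC, can be tuned with a grid search in scikit-learn. This is a minimal sketch: the grid of `C` values, the scaling step, and the five-fold cross-validation are assumptions chosen for illustration, not settings taken from LazyPredict.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=123
)

# Scale the features, then fit a linear SVM; make_pipeline names the step "linearsvc".
pipeline = make_pipeline(StandardScaler(), LinearSVC(max_iter=10000))

# Hypothetical grid over the regularization strength C.
param_grid = {"linearsvc__C": [0.01, 0.1, 1, 10]}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print(f"Test accuracy: {test_accuracy:.2f}")
```

The search uses only the training split, so the held-out test accuracy remains an honest estimate of the tuned model.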