# Lazy Predict

Lazy Predict helps you build many basic models with very little code and shows which models perform better without any parameter tuning.

Using lazypredict is easy and intuitive for anyone familiar with scikit-learn. We first create an instance of the estimator, LazyClassifier in this case, and then fit it to the data using the fit method. By specifying predictions=True while creating the instance of LazyClassifier, we also receive the predictions of every model for each observation, in case we want to use them for something else later on. Additionally, we can use the custom_metric argument to pass a custom metric for evaluating the models’ performance.
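A custom metric is just a callable that takes the true labels and the predicted labels and returns a number. As a minimal sketch, the hypothetical `misclassification_cost` function below (not part of lazypredict, and with assumed cost values) penalises false negatives more heavily than false positives. The commented lines show how it might be passed to LazyClassifier together with `predictions=True`.

```python
def misclassification_cost(y_true, y_pred, fn_cost=5.0, fp_cost=1.0):
    """Average cost per observation for a binary problem.

    Hypothetical metric for illustration; the cost values are assumptions.
    """
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 0:    # false negative
            total += fn_cost
        elif truth == 0 and pred == 1:  # false positive
            total += fp_cost
    return total / len(y_true)


# Assumed usage with lazypredict (requires the package to be installed):
# from lazypredict.Supervised import LazyClassifier
# clf = LazyClassifier(predictions=True, custom_metric=misclassification_cost)
# models, predictions = clf.fit(X_train, X_test, y_train, y_test)

# One false positive and one false negative over four observations
example_cost = misclassification_cost([1, 0, 1, 0], [1, 1, 0, 0])
print(example_cost)  # (1.0 + 5.0) / 4 = 1.5
```

Because the metric only needs the two label sequences, it can be tested on a handful of hand-written labels before it is handed to the estimator.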

LazyPredict fits all the basic machine learning algorithms to your dataset and reports how each one performs. Along with the accuracy score, it provides several other evaluation metrics and the time taken by each model.

Lazypredict is an open-source Python package created by Shankar Rao Pandala. It is still under development and open to contributions.

### Properties of LazyPredict:

- Currently covers supervised learning algorithms only (regression and classification).
- Compatible with Python 3.6 and above.
- Can be run from the command-line interface (CLI).
- Quick to use, because the performance of all the basic models on the dataset is reported in a single run.
- Has a built-in pipeline that scales and transforms the data, handles missing values, and converts categorical data to numeric.
- Provides evaluation metrics for each individual model.
- Shows the time each model takes to build.

In this article, I’ll discuss how to implement LazyPredict for regression and classification models with just a few lines of code.
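To make the preprocessing in the list above concrete, here is a plain-Python illustration of the three steps it names: filling missing values, scaling numeric data, and turning categories into numbers. This is a sketch of the general idea only, not lazypredict's actual pipeline. The helper names and the toy columns are hypothetical.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Scale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Turn a categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]


# Toy columns (assumed data): one numeric with a missing value, one categorical
age = [22.0, None, 26.0, 35.0, 35.0]
sex = ["male", "female", "female", "female", "male"]

age_scaled = standardize(impute_mean(age))
categories, sex_encoded = one_hot(sex)

print(categories)      # ['female', 'male']
print(sex_encoded[0])  # [0, 1]
```

In practice these steps are usually expressed with a preprocessing library rather than hand-written helpers. The point here is only what each step does to a column.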

### Installing packages

In [2]:
pip install lazypredict

Collecting lazypredict
  Using cached lazypredict-0.2.9-py2.py3-none-any.whl (12 kB)
Collecting pandas==1.0.5
  Using cached pandas-1.0.5-cp38-cp38-win_amd64.whl (8.9 MB)
Collecting joblib==1.0.0
  Using cached joblib-1.0.0-py3-none-any.whl (302 kB)
Collecting tqdm==4.56.0
  Using cached tqdm-4.56.0-py2.py3-none-any.whl (72 kB)
Collecting lightgbm==2.3.1
  Using cached lightgbm-2.3.1-py2.py3-none-win_amd64.whl (544 kB)
Collecting scipy==1.5.4
  Using cached scipy-1.5.4-cp38-cp38-win_amd64.whl (31.4 MB)
Collecting xgboost==1.1.1
  Using cached xgboost-1.1.1-py3-none-win_amd64.whl (54.4 MB)
Collecting scikit-learn==0.23.1
  Using cached scikit_learn-0.23.1-cp38-cp38-win_amd64.whl (6.8 MB)
Collecting pytest==5.4.3
  Using cached pytest-5.4.3-py3-none-any.whl (248 kB)
Collecting numpy==1.19.1
  Using cached numpy-1.19.1-cp38-cp38-win_amd64.whl (13.0 MB)
Installing collected packages: numpy, scipy, joblib, scikit-learn, xgboost, tqdm, pytest, pandas, lightgbm, lazypredict
  Attempting unins

ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied: 'C:\\Users\\piyush.pathak\\Anaconda3\\Lib\\site-packages\\~cipy\\linalg\\cython_blas.cp38-win_amd64.pyd'
Consider using the `--user` option or check the permissions.



## Importing packages and data for classification and checking the performance of different models

In [None]:
from lazypredict.Supervised import LazyClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load the breast cancer dataset as a feature matrix X and a target vector y
data = load_breast_cancer()
X = data.data
y = data.target

# Hold out 30% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=123)

# Fit the available classifiers and collect one row of metrics per model
clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

print(models)

In [None]:
models

## Importing packages and data for regression and checking the performance of different models

In [None]:
from lazypredict.Supervised import LazyRegressor
from sklearn import datasets
from sklearn.utils import shuffle

# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release
boston = datasets.load_boston()
X, y = shuffle(boston.data, boston.target, random_state=13)

# Use the first 90% of the shuffled rows for training and the rest for testing
offset = int(X.shape[0] * 0.9)

X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]

# Fit the available regressors and collect one row of metrics per model
reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

print(models)

100%|██████████| 42/42 [00:02<00:00, 14.55it/s]

                               Adjusted R-Squared  R-Squared  RMSE  Time Taken
Model                                                                         
SVR                                          0.83       0.88  2.62        0.03
BaggingRegressor                             0.83       0.88  2.63        0.06
NuSVR                                        0.82       0.86  2.76        0.04
RandomForestRegressor                        0.81       0.86  2.79        0.37
XGBRegressor                                 0.81       0.86  2.79        0.09
GradientBoostingRegressor                    0.81       0.86  2.84        0.17
ExtraTreesRegressor                          0.79       0.84  2.98        0.23
HistGradientBoostingRegressor                0.77       0.83  3.06        0.27
AdaBoostRegressor                            0.77       0.83  3.06        0.12
PoissonRegressor                             0.77       0.83  3.11        0.02
LGBMRegressor                                0.77   




In [None]:
models

| Model | Adjusted R-Squared | R-Squared | RMSE | Time Taken |
|---|---|---|---|---|
| SVR | 0.83 | 0.88 | 2.62 | 0.03 |
| BaggingRegressor | 0.83 | 0.88 | 2.63 | 0.06 |
| NuSVR | 0.82 | 0.86 | 2.76 | 0.04 |
| RandomForestRegressor | 0.81 | 0.86 | 2.79 | 0.37 |
| XGBRegressor | 0.81 | 0.86 | 2.79 | 0.09 |
| GradientBoostingRegressor | 0.81 | 0.86 | 2.84 | 0.17 |
| ExtraTreesRegressor | 0.79 | 0.84 | 2.98 | 0.23 |
| HistGradientBoostingRegressor | 0.77 | 0.83 | 3.06 | 0.27 |
| AdaBoostRegressor | 0.77 | 0.83 | 3.06 | 0.12 |
| PoissonRegressor | 0.77 | 0.83 | 3.11 | 0.02 |
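The Adjusted R-Squared column is lower than R-Squared because it penalises the number of features relative to the number of evaluated rows. The sketch below uses the standard formula and assumes it is evaluated on the test split, which here has 51 rows (10% of Boston's 506) and 13 features. `adjusted_r_squared` is a hypothetical helper name.

```python
def adjusted_r_squared(r2, n_samples, n_features):
    """Standard adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Rounded R-squared of the top row, 51 test rows, 13 features
value = adjusted_r_squared(0.88, n_samples=51, n_features=13)
print(round(value, 3))  # 0.838
```

The result is close to, but not exactly, the 0.83 shown in the table, since the input R-Squared used here is already rounded to two decimals.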


## Load new data and apply Lazypredict

In [None]:
import seaborn as sns

In [None]:
df=sns.load_dataset('titanic')

In [None]:
df.head()

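| | survived | pclass | sex | age | sibsp | parch | fare | embarked | class | who | adult_male | deck | embark_town | alive | alone |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3 | male | 22.0 | 1 | 0 | 7.25 | S | Third | man | True | NaN | Southampton | no | False |
| 1 | 1 | 1 | female | 38.0 | 1 | 0 | 71.28 | C | First | woman | False | C | Cherbourg | yes | False |
| 2 | 1 | 3 | female | 26.0 | 0 | 0 | 7.92 | S | Third | woman | False | NaN | Southampton | yes | True |
| 3 | 1 | 1 | female | 35.0 | 1 | 0 | 53.1 | S | First | woman | False | C | Southampton | yes | False |
| 4 | 0 | 3 | male | 35.0 | 0 | 0 | 8.05 | S | Third | man | True | NaN | Southampton | no | True |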


In [None]:
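# Note: X still contains `alive`, a yes/no copy of `survived`, so the models can read the answer directly
X = df.drop('survived', axis=1)
y = df['survived']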

In [None]:
X.head()

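| | pclass | sex | age | sibsp | parch | fare | embarked | class | who | adult_male | deck | embark_town | alive | alone |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3 | male | 22.0 | 1 | 0 | 7.25 | S | Third | man | True | NaN | Southampton | no | False |
| 1 | 1 | female | 38.0 | 1 | 0 | 71.28 | C | First | woman | False | C | Cherbourg | yes | False |
| 2 | 3 | female | 26.0 | 0 | 0 | 7.92 | S | Third | woman | False | NaN | Southampton | yes | True |
| 3 | 1 | female | 35.0 | 1 | 0 | 53.1 | S | First | woman | False | C | Southampton | yes | False |
| 4 | 3 | male | 35.0 | 0 | 0 | 8.05 | S | Third | man | True | NaN | Southampton | no | True |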


In [None]:
y

0      0
1      1
2      1
3      1
4      0
      ..
886    0
887    1
888    0
889    1
890    0
Name: survived, Length: 891, dtype: int64

In [None]:
from sklearn.model_selection import train_test_split

# Hold out 25% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=123)


In [None]:
clf = LazyClassifier(verbose=0, ignore_warnings=False, custom_metric=None)
models,predictions = clf.fit(X_train, X_test, y_train, y_test)

 14%|█▍        | 4/29 [00:00<00:01, 20.90it/s]

CategoricalNB model failed to execute
Negative values in data passed to CategoricalNB (input X)


100%|██████████| 29/29 [00:01<00:00, 19.68it/s]

StackingClassifier model failed to execute
__init__() missing 1 required positional argument: 'estimators'





In [None]:
print(models)

                               Accuracy  ...  Time Taken
Model                                    ...            
AdaBoostClassifier                 1.00  ...        0.03
BaggingClassifier                  1.00  ...        0.04
XGBClassifier                      1.00  ...        0.06
SVC                                1.00  ...        0.06
SGDClassifier                      1.00  ...        0.04
RidgeClassifierCV                  1.00  ...        0.04
RidgeClassifier                    1.00  ...        0.04
RandomForestClassifier             1.00  ...        0.20
Perceptron                         1.00  ...        0.02
PassiveAggressiveClassifier        1.00  ...        0.03
LogisticRegression                 1.00  ...        0.05
LinearSVC                          1.00  ...        0.07
GaussianNB                         1.00  ...        0.04
ExtraTreesClassifier               1.00  ...        0.15
ExtraTreeClassifier                1.00  ...        0.02
DecisionTreeClassifier         

In [None]:
models

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|---|---|---|---|---|---|
| AdaBoostClassifier | 1.0 | 1.0 | 1.0 | 1.0 | 0.03 |
| BaggingClassifier | 1.0 | 1.0 | 1.0 | 1.0 | 0.04 |
| XGBClassifier | 1.0 | 1.0 | 1.0 | 1.0 | 0.06 |
| SVC | 1.0 | 1.0 | 1.0 | 1.0 | 0.06 |
| SGDClassifier | 1.0 | 1.0 | 1.0 | 1.0 | 0.04 |
| RidgeClassifierCV | 1.0 | 1.0 | 1.0 | 1.0 | 0.04 |
| RidgeClassifier | 1.0 | 1.0 | 1.0 | 1.0 | 0.04 |
| RandomForestClassifier | 1.0 | 1.0 | 1.0 | 1.0 | 0.2 |
| Perceptron | 1.0 | 1.0 | 1.0 | 1.0 | 0.02 |
| PassiveAggressiveClassifier | 1.0 | 1.0 | 1.0 | 1.0 | 0.03 |
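
A perfect score from every model usually points to target leakage rather than an easy problem. Here the feature matrix still contains `alive`, which is `survived` written as yes/no. The sketch below shows one simple way such a column might be spotted, using only the five rows printed by `df.head()`. `determines_target` is a hypothetical helper, not a lazypredict or pandas function.

```python
def determines_target(feature, target):
    """Return True if every feature value always maps to one target value."""
    mapping = {}
    for value, label in zip(feature, target):
        # setdefault stores the first label seen for this value and returns it
        if mapping.setdefault(value, label) != label:
            return False
    return True


# The five rows shown by df.head()
survived = [0, 1, 1, 1, 0]
alive = ["no", "yes", "yes", "yes", "no"]
pclass = [3, 1, 3, 1, 3]

print(determines_target(alive, survived))   # True
print(determines_target(pclass, survived))  # False
```

On only five rows other columns can pass this check by coincidence, so it is more meaningful when run on the full columns. Dropping `alive` from `X` before the split would give a more realistic comparison of the models.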
