![](https://i1.wp.com/pycaret.org/wp-content/uploads/2020/04/thumbnail.png?fit=1166%2C656&ssl=1)

### ⚙️ Install PyCaret & Import Libraries
Kaggle notebooks do not ship with PyCaret by default, so install it with the following command:

> !pip install pycaret

In [None]:
!pip install pycaret

In [None]:
import numpy as np 
import pandas as pd 
import os
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

In [None]:
# Import the data
train = pd.read_csv("../input/tabular-playground-series-jun-2021/train.csv")
test = pd.read_csv("../input/tabular-playground-series-jun-2021/test.csv")
sample_submission = pd.read_csv("../input/tabular-playground-series-jun-2021/sample_submission.csv")

In [None]:
train.shape, test.shape

In [None]:
train.head()

* The dataset looks the same as in the previous TPS (Tabular Playground Series) competitions.
* The id column is only a row identifier, so I will drop it before modeling.

In [None]:
import plotly.graph_objects as go
# Use `hole` to create a donut-like pie chart
fig = go.Figure(data=[go.Pie(labels=train.target, hole=.3)])
fig.show()

In [None]:
train.drop('id',axis=1,inplace=True)
test.drop('id',axis=1,inplace=True)

### Setting up Environment in PyCaret
The setup() function initializes the PyCaret environment and creates the transformation pipeline that prepares the data for modeling and deployment. setup() must be called before any other PyCaret function. It takes two mandatory parameters: a pandas DataFrame and the name of the target column. All other parameters are optional and customize the pre-processing pipeline.

In [None]:
from pycaret.classification import *
# silent=True skips the interactive data-type confirmation prompt (a PyCaret 2.x parameter)
exp_mclf = setup(data=train, target='target', fold=9, session_id=2021, silent=True)

#### Using add_metric to track log loss and submit with predict_proba

In [None]:
from sklearn.metrics import log_loss
add_metric('logloss', 'LogLoss', log_loss, greater_is_better=False, target="pred_proba")
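
For reference, multi-class log loss is the mean negative log of the probability assigned to the true class. The block below is a minimal NumPy sketch of that formula, not the scikit-learn or PyCaret implementation; the function name `multiclass_log_loss` and the clipping constant `eps` are assumptions made for illustration.

```python
import numpy as np

def multiclass_log_loss(y_true, proba, eps=1e-15):
    """Mean negative log-probability of the true class.

    y_true: integer class indices, shape (n_samples,)
    proba:  predicted probabilities, shape (n_samples, n_classes)
    """
    y_true = np.asarray(y_true)
    # Clip to avoid log(0), then renormalize so each row sums to 1 again
    proba = np.clip(np.asarray(proba, dtype=float), eps, 1 - eps)
    proba = proba / proba.sum(axis=1, keepdims=True)
    # Pick the probability assigned to the true class of each row
    true_class_proba = proba[np.arange(len(y_true)), y_true]
    return float(-np.mean(np.log(true_class_proba)))

# Toy data (illustrative only): three samples, three classes
y_true = [0, 2, 1]
proba = [[0.7, 0.2, 0.1],
         [0.1, 0.3, 0.6],
         [0.2, 0.5, 0.3]]
loss = multiclass_log_loss(y_true, proba)
```

Lower is better, which is why the metric is registered with greater_is_better=False.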

### Models in PyCaret
* PyCaret's model library includes 18 classifiers. To list them all, check the docstring or call the models() function.

In [None]:
models()

* I chose **lightgbm** and **xgboost** for prediction.

### Creating Models

* create_model is the most granular function in PyCaret and is the foundation of most other PyCaret functionality. As the name suggests, it trains and evaluates a model using cross-validation, with the number of folds set by the fold parameter. The output is a score grid showing Accuracy, AUC, Recall, Precision, F1, Kappa, MCC, and LogLoss by fold. I will work with the models below.
  * LGBMClassifier ('lightgbm')
  * XGBClassifier ('xgboost')

In [None]:
lgb = create_model('lightgbm', learning_rate=0.0321)

> xgb = create_model('xgboost',max_depth= 8)
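
To make the cross-validation step concrete, here is a minimal sketch of how a dataset can be split into 9 folds, each serving once as the validation set. It is a plain shuffled k-fold written with NumPy; PyCaret's own fold strategy may differ (for example by stratifying on the target), and the helper name `k_fold_indices` is hypothetical.

```python
import numpy as np

def k_fold_indices(n_samples, n_folds, seed=2021):
    """Yield (train_idx, valid_idx) pairs for shuffled k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    # array_split tolerates n_samples that is not divisible by n_folds
    folds = np.array_split(shuffled, n_folds)
    for i, valid_idx in enumerate(folds):
        # Every fold except the i-th one is used for training
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, valid_idx

# Toy example: 20 rows split into 9 folds, matching fold=9 in setup()
splits = list(k_fold_indices(n_samples=20, n_folds=9))
fold_sizes = [len(valid) for _, valid in splits]
```

Each row of the score grid corresponds to one such split: the model is fit on the training indices and scored on the validation indices.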

### Tune a Model
* A model created with create_model() is trained with the default hyperparameters. To tune them, use the tune_model() function, which runs a random grid search over a pre-defined search space. The output is a score grid showing Accuracy, AUC, Recall, Precision, F1, Kappa, and MCC by fold for the best model. To search a custom space instead, pass the custom_grid parameter to tune_model.

> tuned_lgb = tune_model(lgb, optimize='LogLoss')
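
The idea of a random search can be sketched in a few lines of plain Python: sample parameter combinations from a grid, score each one, and keep the best. This is an illustration of the technique only, not PyCaret's tuner; the helper `random_search`, the grid values, and the toy objective `toy_loss` (standing in for a cross-validated log loss) are all hypothetical.

```python
import random

def random_search(score_fn, grid, n_iter=10, seed=2021):
    """Sample n_iter parameter combinations and return the lowest-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_iter):
        # Draw one value per hyperparameter, independently and uniformly
        params = {name: rng.choice(values) for name, values in grid.items()}
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical search space, in the shape a custom grid usually takes
grid = {
    "learning_rate": [0.01, 0.03, 0.1, 0.3],
    "num_leaves": [15, 31, 63, 127],
}

def toy_loss(params):
    # Toy objective with its minimum at learning_rate=0.03, num_leaves=31
    return (params["learning_rate"] - 0.03) ** 2 + (params["num_leaves"] - 31) ** 2 / 1e4

best_params, best_score = random_search(toy_loss, grid, n_iter=10)
```

Because only n_iter combinations are tried, the result is the best of the sampled candidates, not necessarily the best point in the grid.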

### Blend Models
* Blending is an ensembling method that uses consensus among estimators to generate the final predictions. It combines different machine learning algorithms and, for classification, predicts the final outcome by majority vote or by averaging the predicted probabilities. In PyCaret, blending is as simple as calling blend_models.

> blended = blend_models(estimator_list = [lgb, xgb], optimize = 'logloss')
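
The "average predicted probabilities" variant (soft voting) can be sketched with NumPy as follows. The helper name `soft_vote` and the two toy probability matrices are assumptions for illustration; in the notebook they would correspond to the predict_proba outputs of the two models.

```python
import numpy as np

def soft_vote(proba_list, weights=None):
    """Average class-probability matrices from several models."""
    stacked = np.stack([np.asarray(p, dtype=float) for p in proba_list])
    # With weights=None, np.average is a plain mean over the model axis
    return np.average(stacked, axis=0, weights=weights)

# Toy probabilities from two hypothetical models: 2 samples, 3 classes
proba_a = np.array([[0.6, 0.3, 0.1],
                    [0.2, 0.2, 0.6]])
proba_b = np.array([[0.4, 0.5, 0.1],
                    [0.1, 0.3, 0.6]])

blended_proba = soft_vote([proba_a, proba_b])
predicted_class = blended_proba.argmax(axis=1)
```

Since each input row sums to 1, the averaged rows do too, so the blended output can be submitted directly as class probabilities.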

### Stack Models
* Stacking is an ensembling method that uses meta-learning. It builds a meta-model that generates the final prediction from the predictions of multiple base estimators. In PyCaret, stacking is as simple as calling stack_models.

> stacked = stack_models(estimator_list = [lgb, xgb], optimize = 'logloss', method = 'predict_proba')
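
As a rough illustration of the same idea outside PyCaret, the sketch below uses scikit-learn's StackingClassifier on synthetic data. The base models (a decision tree and k-nearest neighbours) and the logistic-regression meta-model are stand-ins chosen to keep the example small; they are not the lightgbm and xgboost models used in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic 3-class dataset (illustrative only)
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=2021
)

stack = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=4, random_state=2021)),
        ("knn", KNeighborsClassifier()),
    ],
    # The meta-model is trained on the base models' out-of-fold predictions
    final_estimator=LogisticRegression(max_iter=1000),
    # Feed class probabilities, not hard labels, to the meta-model
    stack_method="predict_proba",
    cv=3,
)
stack.fit(X, y)
stacked_proba = stack.predict_proba(X[:5])
```

Passing probabilities to the meta-model mirrors the method = 'predict_proba' argument in the stack_models call above.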

### Plot a Model
Before finalizing a model, the plot_model() function can be used to analyze its performance from different angles, such as the AUC, the confusion matrix, and the decision boundary. This function takes a trained model object and returns a plot based on the test / hold-out set.

There are 15 different plots available; see the plot_model() docstring for the full list.

#### AUC Plot

In [None]:
plot_model(lgb,plot = 'auc')

> plot_model(xgb,plot = 'auc')
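
For intuition about the number behind an AUC plot in a multi-class setting, one common approach is to compute a binary AUC per class, treating that class as positive and all others as negative (one-vs-rest). The sketch below does this with NumPy; it is an illustration under that assumption, not a description of how plot_model computes its curves, and the helper names are hypothetical.

```python
import numpy as np

def binary_auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    # Compare every positive score with every negative score
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((wins + 0.5 * ties) / (len(pos) * len(neg)))

def one_vs_rest_auc(y_true, proba):
    """One AUC per class, using that class's probability column as the score."""
    y_true = np.asarray(y_true)
    proba = np.asarray(proba, dtype=float)
    return [binary_auc(y_true == k, proba[:, k]) for k in range(proba.shape[1])]

# Toy binary example: three of the four positive/negative pairs are ranked correctly
auc_binary = binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Toy 3-class example in which every class is perfectly separated
y_true = [0, 0, 1, 1, 2, 2]
proba = [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1],
         [0.2, 0.7, 0.1], [0.3, 0.5, 0.2],
         [0.1, 0.2, 0.7], [0.2, 0.2, 0.6]]
auc_per_class = one_vs_rest_auc(y_true, proba)
```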

### Predict on test

In [None]:
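# Fetch the fitted preprocessing pipeline and append the trained model as its final step
prep_pipe = get_config("prep_pipe")
prep_pipe.steps.append(['trained_model', lgb])
# Class probabilities for the test set, one column per class
predictions = prep_pipe.predict_proba(test)
predictions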

In [None]:
#pred_lgb = pred(lgb)
#pred_xgb = pred(xgb)
#pred_blend = pred(blended)
#pred_stacked = pred(stacked)

In [None]:
# Probability columns expected in the submission file
class_cols = ['Class_1', 'Class_2', 'Class_3', 'Class_4', 'Class_5', 'Class_6', 'Class_7', 'Class_8', 'Class_9']
sample_submission[class_cols] = predictions
sample_submission.to_csv('lgb.csv', index=False)

#sample_submission[class_cols] = pred_xgb
#sample_submission.to_csv('xgb.csv', index=False)


#### If you like this notebook, please upvote :)
#### Thank you