<a href="https://colab.research.google.com/github/DataWithAaditya/Max-Life-Health-Insurance-Cross-Sell-Prediction/blob/main/Max_Life_Health_Insurance_Cross_Sell_Prediction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Project Name**    - Max Life Health Insurance Cross Sell Prediction



##### **Project Type**    - EDA/Classification
##### **Contribution**    - Individual

# **Project Summary -**

Insurance companies aim to maximize customer engagement by identifying potential buyers for additional policies. This project, Max Life Health Insurance Cross-Sell Prediction, uses machine learning to predict whether an existing customer is likely to purchase health insurance. By analyzing customer demographics, past insurance history, and vehicle-related details, we can build a predictive model that improves marketing efficiency and boosts sales.

Traditional marketing strategies often struggle to identify the right customers, leading to wasted efforts and higher costs. Some customers may receive unnecessary offers, while others who might be interested are overlooked. Our goal is to create an effective model that predicts customer interest in health insurance, allowing Max Life to focus on high-potential buyers. This will not only improve conversion rates but also enhance the overall customer experience by offering personalized recommendations.

The dataset consists of 381,109 records with 12 features, including customer demographics, vehicle details, and past insurance history. Key features include age, gender, driving license status, insurance history, vehicle age, vehicle damage history, annual premium, and policy sales channels. The target variable, Response, indicates whether a customer is interested in purchasing a health insurance policy. By studying patterns in this data, we can build a machine learning model that predicts the likelihood of a customer buying insurance.

The project follows a structured machine learning pipeline. First, we conduct Exploratory Data Analysis (EDA) to understand data distributions, detect missing values, and identify key patterns. For example, we analyze how age groups influence insurance purchase behavior, whether vehicle damage history affects interest, and how different sales channels impact customer decisions. Identifying these patterns helps in selecting the right features for the model.

Next, we preprocess the data by handling missing values, encoding categorical variables, and scaling numerical features. We also remove unnecessary columns that do not contribute to predictions. Data preprocessing ensures that the machine learning model receives clean and structured input, which leads to better performance.
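The encoding and scaling steps described above could look roughly like the sketch below. The column names follow the dataset description, but the toy rows, the category labels (the CSV may spell them differently), the `vehicle_age_order` mapping, and the choice of min-max scaling are illustrative assumptions, not the notebook's final preprocessing.

```python
import pandas as pd

# Toy rows that mimic a few of the dataset's columns (values are made up).
toy = pd.DataFrame({
    "Gender": ["Male", "Female", "Male"],
    "Vehicle_Age": ["<1 Year", "1-2 Years", ">2 Years"],
    "Vehicle_Damage": ["Yes", "No", "Yes"],
    "Annual_Premium": [20000.0, 30000.0, 40000.0],
})

# Binary categories become 0/1 integers.
toy["Gender"] = toy["Gender"].map({"Female": 0, "Male": 1})
toy["Vehicle_Damage"] = toy["Vehicle_Damage"].map({"No": 0, "Yes": 1})

# Vehicle_Age has a natural order, so an ordinal mapping keeps that order.
vehicle_age_order = {"<1 Year": 0, "1-2 Years": 1, ">2 Years": 2}  # assumed labels
toy["Vehicle_Age"] = toy["Vehicle_Age"].map(vehicle_age_order)

# Min-max scaling squeezes Annual_Premium into the [0, 1] range.
premium = toy["Annual_Premium"]
toy["Annual_Premium_Scaled"] = (premium - premium.min()) / (premium.max() - premium.min())

print(toy)
```

An ordinal mapping is used for `Vehicle_Age` because its categories are ordered; one-hot encoding would be a reasonable alternative for unordered columns.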

Feature engineering is an essential step where we create new features that can improve model accuracy. For example, we derive an Insurance Score based on previous insurance history, vehicle damage, and annual premium. Additionally, a Customer Loyalty Index is computed using the duration of association with the company and regional factors. These new features help in making better predictions and improving model accuracy.

For model training, we experiment with various machine learning algorithms, including Logistic Regression, Random Forest, Gradient Boosting models like XGBoost and LightGBM, and Neural Networks if deep learning is necessary. Each model is evaluated based on accuracy, precision, recall, F1-score, and AUC-ROC metrics to determine the best-performing model. We compare results to find the most effective approach for predicting customer interest in insurance policies.
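To make the evaluation metrics concrete, here is a minimal sketch of how accuracy, precision, recall, and F1 follow from the confusion counts. The helper name `classification_metrics` and the ten example labels are hypothetical and serve only as an illustration.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = interested)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Ten made-up customers: 2 are truly interested, the model flags 3.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case accuracy is 0.7 even though only one of the three flagged customers is truly interested, which is why accuracy alone is not enough when one class is much rarer than the other.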

Once we select the best model, we fine-tune it using hyperparameter optimization techniques such as Grid Search and Random Search. Fine-tuning helps in improving model performance and reducing errors, ensuring that predictions are as accurate as possible. Additionally, we check for overfitting and adjust the model accordingly.
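As a rough illustration of the tuning step, the sketch below runs a grid search with scikit-learn. It assumes scikit-learn is installed and uses synthetic data from `make_classification` as a stand-in for the customer dataset; the model, the parameter grid, and the ROC AUC scoring are illustrative choices, not the notebook's final configuration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for the real customer data.
X, y = make_classification(n_samples=500, n_features=8, weights=[0.85, 0.15], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Try a few regularization strengths with 3-fold cross-validation, scored by ROC AUC.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=3, scoring="roc_auc")
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Cross-validated ROC AUC:", round(search.best_score_, 3))
print("Held-out ROC AUC:", round(search.score(X_test, y_test), 3))
```

Comparing the cross-validated score with the held-out score is one simple overfitting check: a large gap suggests the tuned model does not generalize well.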

If required, the final model can be deployed using Streamlit for a simple web interface or integrated into company systems via Flask or Django API. This allows Max Life Insurance to use the predictive model in real-time to assist marketing teams in targeting the right customers. Deployment ensures that the model is not just a theoretical solution but a practical tool that can be used to make real business decisions.

By the end of the project, we aim to develop an accurate model that helps Max Life Insurance improve its cross-selling strategy, reduce marketing costs, and enhance customer satisfaction through personalized recommendations. With better targeting, the company can optimize its resources, improve conversion rates, and offer customers more relevant insurance plans.

In conclusion, this project demonstrates the power of machine learning in optimizing business decisions. By predicting customer interest in health insurance, Max Life can improve its marketing strategy, boost sales, and create a more customer-centric approach. The insights gained from this analysis will not only enhance targeting strategies but also refine product offerings to meet customer needs more effectively. This project showcases the importance of data-driven decision-making and highlights how machine learning can transform the way businesses interact with their customers.



# **GitHub Link -**

GitHub Link: [GitHub Link click here!](https://github.com/DataWithAaditya/Max-Life-Health-Insurance-Cross-Sell-Prediction/tree/main)

# **Problem Statement**


Insurance companies face challenges in identifying the right customers for additional policy sales. Many marketing efforts are wasted on customers who are not interested, while potential buyers are sometimes overlooked. Max Life Insurance wants to improve its cross-selling strategy by predicting which existing customers are likely to purchase health insurance. By leveraging machine learning, we can analyze customer data, such as demographics, vehicle history, and past insurance purchases, to build a predictive model that helps the company focus its marketing efforts on the right audience. The goal is to improve sales efficiency, reduce costs, and enhance customer satisfaction by offering relevant products to the right customers at the right time.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure that each chart answers the following questions.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 15 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following the "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





6. You may add more ML algorithms for model creation. Make sure that each algorithm answers the following points.


*   Explain the ML Model used and its performance using Evaluation metric Score Chart.


*   Cross-Validation & Hyperparameter Tuning

*   Have you seen any improvement? Note down the improvement with an updated Evaluation metric Score Chart.

*   Explain each evaluation metric's indication towards business and the business impact of the ML model used.




















# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

### Dataset Loading

In [None]:
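# Mount Google Drive when running on Google Colab; skip the step elsewhere
try:
    from google.colab import drive
    drive.mount('/content/drive')
except ImportError:
    print("Not running on Google Colab; skipping drive mount.")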

In [None]:
# Load Dataset
df = pd.read_csv('/content/drive/My Drive/Colab Notebooks/Max Life Health Insurance Cross Sell Prediction/TRAIN-HEALTH INSURANCE CROSS SELL PREDICTION.csv')

### Dataset First View

In [None]:
# Dataset First Look
df.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
print("Dataset Size")
print("Rows: ", df.shape[0])
print("Columns: ", df.shape[1])

### Dataset Information

In [None]:
# Dataset Info
df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
duplicate_values = df.duplicated().sum()
duplicate_values

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
missing_values = df.isnull().sum()
missing_values

In [None]:
# Visualizing missing values using a bar plot
plt.figure(figsize=(8, 6))
missing_values.plot(kind="bar", color="skyblue")
plt.title("Missing Values in Each Column")
plt.xlabel("Columns")
plt.ylabel("Count of Missing Values")
plt.xticks(rotation=45)
plt.show()

This confirms that there are no missing values in the dataset.

### What did you know about your dataset?

After loading the dataset and performing initial checks, I learned the following:

- The dataset contains 381,109 rows and 12 columns.
- It includes numerical and categorical features related to customers, their insurance history, and vehicle details.
- The dataset has no missing values, which means it's complete and doesn’t require imputation.
- Data types are appropriate for analysis, with numerical and categorical values correctly assigned.
- There are no duplicate records, ensuring data integrity.

The dataset is clean and well-structured, with no missing or duplicate values. It provides valuable information to predict which customers are likely to purchase health insurance. Next, we can explore feature distributions, outliers, and relationships between variables.
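The checks above can be bundled into one small helper. The sketch below is illustrative only: `quality_report` is a hypothetical name, and the demo frame is made up so that it contains one duplicated row and one missing value.

```python
import pandas as pd


def quality_report(frame: pd.DataFrame) -> dict:
    """Collect the basic checks from this section into one dictionary."""
    return {
        "rows": frame.shape[0],
        "columns": frame.shape[1],
        "duplicate_rows": int(frame.duplicated().sum()),
        "missing_cells": int(frame.isnull().sum().sum()),
    }


# Tiny made-up frame with one duplicated row and one missing value.
demo = pd.DataFrame({
    "Age": [25, 40, 40, 31],
    "Annual_Premium": [30000.0, 28000.0, 28000.0, None],
})
report = quality_report(demo)
print(report)
```

Calling `quality_report(df)` on the loaded dataset should report zero duplicate rows and zero missing cells if the checks in this section hold.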

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
print("Dataset Columns:\n", df.columns)

In [None]:
# Dataset Describe

df.describe()

### Variables Description

The dataset has 381,109 records and 12 columns covering customer details, past insurance history, and vehicle-related information. The columns are:

- id: Unique identifier for each customer
- Gender: Male or Female
- Age: Customer's age in years
- Driving_License: If the customer has a driving license (1 = Yes, 0 = No)
- Region_Code: Location of the customer
- Previously_Insured: If the customer already has an insurance policy (1 = Yes, 0 = No)
- Vehicle_Age: How old the vehicle is (<1 Year, 1-2 Years, >2 Years)
- Vehicle_Damage: If the vehicle was damaged before (Yes/No)
- Annual_Premium: Insurance cost paid by the customer
- Policy_Sales_Channel: How the policy was sold
- Vintage: How long the customer has been with the company
- Response (Target Variable): If the customer is interested in buying health insurance (1 = Yes, 0 = No)

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
unique_values = df.nunique()
unique_values

### Detect Outliers

In [None]:
# Select only numerical columns
num_cols = df.select_dtypes(include=np.number).columns

# Function to visualize outliers using Boxplots
def plot_boxplots(data, columns):
    plt.figure(figsize=(15, 8))
    for i, col in enumerate(columns, 1):
        plt.subplot((len(columns) // 4) + 1, 4, i)  # Adjust layout dynamically
        sns.boxplot(y=data[col], color='skyblue')
        plt.title(f'Boxplot of {col}')
    plt.tight_layout()
    plt.show()

# Call the function to plot boxplots
plot_boxplots(df, num_cols)

In [None]:
# Detecting outliers using IQR method
def detect_outliers_iqr(data, column):
    Q1 = data[column].quantile(0.25)  # First quartile (25th percentile)
    Q3 = data[column].quantile(0.75)  # Third quartile (75th percentile)
    IQR = Q3 - Q1  # Interquartile range

    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR

    outliers = data[(data[column] < lower_bound) | (data[column] > upper_bound)]

    print(f"{column}: {len(outliers)} outliers detected")
    return outliers

# Define numerical_features
numerical_features = df.select_dtypes(include=np.number).columns

# Check for outliers in all numerical features
outlier_counts = {}
for col in numerical_features:
    outliers = detect_outliers_iqr(df, col)
    outlier_counts[col] = len(outliers)

# Display outlier summary
print("\nOutlier Summary:")
print(outlier_counts)

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write ready code for data analysis

# Handling Outliers for Annual_Premium using Winsorization (Capping)
def cap_outliers(data, column):
    Q1 = data[column].quantile(0.25)  # 25th percentile
    Q3 = data[column].quantile(0.75)  # 75th percentile
    IQR = Q3 - Q1  # Interquartile range

    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR

    # Cap values outside the IQR fences at the nearest fence
    data[column] = data[column].clip(lower=lower_bound, upper=upper_bound)

    print(f"Outliers in {column} capped between {lower_bound:.2f} and {upper_bound:.2f}")

# Apply capping to 'Annual_Premium'
cap_outliers(df, 'Annual_Premium')

# Print duplicate values
print("Duplicate Values: ", duplicate_values)

# Print missing values
print("Missing Values:\n ", missing_values)

# Print unique value counts
print("Unique Values in Each Column:\n ", unique_values)

In [None]:
# After handling outliers

# Re-run the IQR check with detect_outliers_iqr and numerical_features,
# both defined in the outlier detection cell above

# Check for outliers in all numerical features
outlier_counts = {}
for col in numerical_features:
    outliers = detect_outliers_iqr(df, col)
    outlier_counts[col] = len(outliers)

# Display outlier summary
print("\nOutlier Summary:")
print(outlier_counts)

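***Driving_License & Response:***
These are binary variables (0 or 1). The IQR method flags their less frequent class only because the interquartile range of such a column is zero, so the flagged rows are not true outliers.
No need to remove or modify them.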
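A small worked example shows why the IQR rule is not meaningful for binary columns. The values below are made up; the point is only that when most rows share one value, Q1 and Q3 coincide, the IQR is zero, and every row of the other class falls outside the bounds.

```python
import pandas as pd

# Made-up binary column: 9 zeros and 1 one, similar in spirit to Response.
flag = pd.Series([0, 0, 0, 0, 0, 0, 0, 0, 0, 1])

q1 = flag.quantile(0.25)
q3 = flag.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# With IQR == 0 both bounds collapse onto the majority value.
flagged = flag[(flag < lower) | (flag > upper)]
print(f"Q1={q1}, Q3={q3}, IQR={iqr}, bounds=[{lower}, {upper}]")
print("Flagged rows:", len(flagged), "| rows equal to 1:", int((flag == 1).sum()))
```

By the same logic, the 46,710 flagged Response rows reported in this notebook are simply the customers with Response = 1, roughly 12% of the 381,109 records.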

### What all manipulations have you done and insights you found?

We performed several preprocessing steps to clean and prepare the data:

1. Handling Missing Values & Duplicates

- What we did:

  - Checked for missing values in each column.
  - Checked for duplicate rows in the dataset.

- Why we did it:

  - Missing values can cause bias or errors in model predictions.
  - Duplicate rows would give some customers extra weight, so confirming there are none ensures that each data point is unique.

2. Detecting & Handling Outliers

- What we did:

  - Used the Interquartile Range (IQR) Method to detect extreme values.
  - The IQR check flagged rows in three columns:
    - Driving_License (812 rows)
    - Annual_Premium (10,320 rows)
    - Response (46,710 rows)
  - Applied capping (Winsorization) on Annual_Premium to limit extreme values.
  - Left Driving_License and Response unchanged: they are binary (0/1), and the flagged rows are simply their less frequent class, not true outliers.

- Why we did it:

  - Outliers in numerical data can distort the model's learning process.
  - Capping ensures extreme values don’t mislead the predictions.
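The capping step can be illustrated on a handful of made-up premiums. This is a minimal sketch of IQR-based capping with `Series.clip`; the numbers are invented and are not taken from the dataset.

```python
import pandas as pd

# Made-up premiums with one extreme value at the end.
premiums = pd.Series([2000.0, 2500.0, 3000.0, 3500.0, 4000.0, 4500.0, 5000.0, 50000.0])

q1 = premiums.quantile(0.25)
q3 = premiums.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# clip() replaces anything outside [lower, upper] with the nearest bound.
capped = premiums.clip(lower=lower, upper=upper)

print(f"Bounds: [{lower}, {upper}]")
print("Max before:", premiums.max(), "| max after:", capped.max())
```

Capping keeps every row in the dataset and only pulls the extreme value back to the upper bound, which is why it is often preferred over dropping rows when the extreme values are genuine customers.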


**Insights Gained:**
- Missing Values Analysis:
  - No column had missing values, indicating a well-maintained dataset that needs no imputation.

- Duplicate Data:
  - No duplicate records were found, so no rows had to be removed.

- Outlier Analysis:
  - Annual_Premium had extreme values, meaning some customers were paying very high premiums compared to others.
  - Insight: There might be a difference in premium rates based on customer segments.
  - Driving_License and Response are binary, so outlier treatment does not apply to them.

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1: Distribution of Gender

In [None]:
# Set figure size
plt.figure(figsize=(6, 4))

# Count plot for gender
sns.countplot(x=df['Gender'], palette='coolwarm')

# Add title and labels
plt.title('Distribution of Gender', fontsize=14)
plt.xlabel('Gender', fontsize=12)
plt.ylabel('Count', fontsize=12)

# Display the chart
plt.show()

##### 1. Why did you pick the specific chart?

**Answer:** A count plot shows how many male and female customers are in the dataset. This matters because gender-based preferences may affect insurance purchase behavior.

##### 2. What is/are the insight(s) found from the chart?

**Answer:**
- The chart shows only the gender mix of the customer base. On its own it does not show which gender is more likely to purchase health insurance; that requires comparing Response rates by gender.
- If there is a significant gender gap, we might need targeted marketing strategies for the underrepresented group.

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Answer:**
- Positive Impact: If we find that one gender dominates, we can tailor promotional campaigns to the other gender to improve cross-selling.

- Negative Impact: If gender imbalance exists and isn't addressed, we may miss out on potential customers.
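A count of customers per gender says nothing about who actually responds. One way to check that, sketched below on made-up rows, is to compute the response rate per gender with a `groupby`; in the notebook the same call could be applied to `df[["Gender", "Response"]]`.

```python
import pandas as pd

# Made-up rows; the real analysis would use the Gender and Response columns of df.
sample = pd.DataFrame({
    "Gender": ["Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female"],
    "Response": [1, 0, 0, 1, 0, 0, 1, 0],
})

# Count customers and average the 0/1 Response flag to get a rate per gender.
summary = sample.groupby("Gender")["Response"].agg(customers="count", response_rate="mean")
print(summary)
```

Because Response is coded 0/1, its mean within each group is the share of interested customers in that group.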

#### Chart - 2

In [None]:
# Chart - 2 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 3

In [None]:
# Chart - 3 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 4

In [None]:
# Chart - 4 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 5

In [None]:
# Chart - 5 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 6

In [None]:
# Chart - 6 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 7

In [None]:
# Chart - 7 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 8

In [None]:
# Chart - 8 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 9

In [None]:
# Chart - 9 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 10

In [None]:
# Chart - 10 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 11

In [None]:
# Chart - 11 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 12

In [None]:
# Chart - 12 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 13

In [None]:
# Chart - 13 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help create a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

## ***5. Hypothesis Testing***

### Based on your chart experiments, define three hypothetical statements from the dataset. In the next three questions, perform hypothesis testing to obtain final conclusion about the statements through your code and statistical testing.

Answer Here.

### Hypothetical Statement - 1

#### 1. State Your research hypothesis as a null hypothesis and alternate hypothesis.

Answer Here.

#### 2. Perform an appropriate statistical test.

In [None]:
# Perform Statistical Test to obtain P-Value

##### Which statistical test have you done to obtain P-Value?

Answer Here.

##### Why did you choose the specific statistical test?

Answer Here.

### Hypothetical Statement - 2

#### 1. State Your research hypothesis as a null hypothesis and alternate hypothesis.

Answer Here.

#### 2. Perform an appropriate statistical test.

In [None]:
# Perform Statistical Test to obtain P-Value

##### Which statistical test have you done to obtain P-Value?

Answer Here.

##### Why did you choose the specific statistical test?

Answer Here.

### Hypothetical Statement - 3

#### 1. State Your research hypothesis as a null hypothesis and alternate hypothesis.

Answer Here.

#### 2. Perform an appropriate statistical test.

In [None]:
# Perform Statistical Test to obtain P-Value

##### Which statistical test have you done to obtain P-Value?

Answer Here.

##### Why did you choose the specific statistical test?

Answer Here.

## ***6. Feature Engineering & Data Pre-processing***

### 1. Handling Missing Values

In [None]:
# Handling Missing Values & Missing Value Imputation

#### What all missing value imputation techniques have you used and why did you use those techniques?

Answer Here.

### 2. Handling Outliers

In [None]:
# Handling Outliers & Outlier treatments

##### What all outlier treatment techniques have you used and why did you use those techniques?

Answer Here.

### 3. Categorical Encoding

In [None]:
# Encode your categorical columns

#### What all categorical encoding techniques have you used & why did you use those techniques?

Answer Here.

### 4. Textual Data Preprocessing
(It's mandatory for textual dataset i.e., NLP, Sentiment Analysis, Text Clustering etc.)

#### 1. Expand Contraction

In [None]:
# Expand Contraction

#### 2. Lower Casing

In [None]:
# Lower Casing

#### 3. Removing Punctuations

In [None]:
# Remove Punctuations

#### 4. Removing URLs & Removing Words Containing Digits

In [None]:
# Remove URLs & remove words containing digits

#### 5. Removing Stopwords & Removing White spaces

In [None]:
# Remove Stopwords

In [None]:
# Remove White spaces

#### 6. Rephrase Text

In [None]:
# Rephrase Text

#### 7. Tokenization

In [None]:
# Tokenization

#### 8. Text Normalization

In [None]:
# Normalizing Text (i.e., Stemming, Lemmatization etc.)

##### Which text normalization technique have you used and why?

Answer Here.

#### 9. Part of speech tagging

In [None]:
# POS Tagging

#### 10. Text Vectorization

In [None]:
# Vectorizing Text

##### Which text vectorization technique have you used and why?

Answer Here.

### 4. Feature Manipulation & Selection

#### 1. Feature Manipulation

In [None]:
# Manipulate Features to minimize feature correlation and create new features

#### 2. Feature Selection

In [None]:
# Select your features wisely to avoid overfitting

##### Which feature selection methods have you used and why?

Answer Here.

##### Which all features you found important and why?

Answer Here.

### 5. Data Transformation

#### Do you think that your data needs to be transformed? If yes, which transformation have you used? Explain why.

In [None]:
# Transform Your data

### 6. Data Scaling

In [None]:
# Scaling your data

##### Which method have you used to scale your data and why?

### 7. Dimensionality Reduction

##### Do you think that dimensionality reduction is needed? Explain Why?

Answer Here.

In [None]:
# Dimensionality Reduction (If needed)

##### Which dimensionality reduction technique have you used and why? (If dimensionality reduction done on dataset.)

Answer Here.

### 8. Data Splitting

In [None]:
# Split your data to train and test. Choose Splitting ratio wisely.

##### What data splitting ratio have you used and why?

Answer Here.

### 9. Handling Imbalanced Dataset

##### Do you think the dataset is imbalanced? Explain Why.

Answer Here.

In [None]:
# Handling Imbalanced Dataset (If needed)

##### What technique did you use to handle the imbalanced dataset and why? (If it needed to be balanced)

Answer Here.

## ***7. ML Model Implementation***

### ML Model - 1

In [None]:
# ML Model - 1 Implementation

# Fit the Algorithm

# Predict on the model

#### 1. Explain the ML Model used and its performance using Evaluation metric Score Chart.

In [None]:
# Visualizing evaluation Metric Score chart

#### 2. Cross-Validation & Hyperparameter Tuning

In [None]:
# ML Model - 1 Implementation with hyperparameter optimization techniques (i.e., GridSearch CV, RandomSearch CV, Bayesian Optimization etc.)

# Fit the Algorithm

# Predict on the model

##### Which hyperparameter optimization technique have you used and why?

Answer Here.

##### Have you seen any improvement? Note down the improvement with an updated Evaluation metric Score Chart.

Answer Here.

### ML Model - 2

#### 1. Explain the ML Model used and its performance using Evaluation metric Score Chart.

In [None]:
# Visualizing evaluation Metric Score chart

#### 2. Cross-Validation & Hyperparameter Tuning

In [None]:
# ML Model - 2 Implementation with hyperparameter optimization techniques (i.e., GridSearch CV, RandomSearch CV, Bayesian Optimization etc.)

# Fit the Algorithm

# Predict on the model

##### Which hyperparameter optimization technique have you used and why?

Answer Here.

##### Have you seen any improvement? Note down the improvement with an updated Evaluation metric Score Chart.

Answer Here.

#### 3. Explain each evaluation metric's indication towards business and the business impact of the ML model used.

Answer Here.

### ML Model - 3

In [None]:
# ML Model - 3 Implementation

# Fit the Algorithm

# Predict on the model

#### 1. Explain the ML Model used and its performance using Evaluation metric Score Chart.

In [None]:
# Visualizing evaluation Metric Score chart

#### 2. Cross-Validation & Hyperparameter Tuning

In [None]:
# ML Model - 3 Implementation with hyperparameter optimization techniques (i.e., GridSearch CV, RandomSearch CV, Bayesian Optimization etc.)

# Fit the Algorithm

# Predict on the model

##### Which hyperparameter optimization technique have you used and why?

Answer Here.

##### Have you seen any improvement? Note down the improvement with an updated Evaluation metric Score Chart.

Answer Here.

### 1. Which Evaluation metrics did you consider for a positive business impact and why?

Answer Here.

### 2. Which ML model did you choose from the above created models as your final prediction model and why?

Answer Here.

### 3. Explain the model which you have used and the feature importance using any model explainability tool?

Answer Here.

## ***8.*** ***Future Work (Optional)***

### 1. Save the best performing ml model in a pickle file or joblib file format for deployment process.


In [None]:
# Save the File

### 2. Again Load the saved model file and try to predict unseen data for a sanity check.


In [None]:
# Load the File and predict unseen data.
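The save-and-reload round trip described in the two steps above can be sketched with the standard-library `pickle` module. The `model` dictionary is only a stand-in for a fitted estimator, and the file name `cross_sell_model.pkl` is a hypothetical choice; a real trained model object would be saved and loaded with the same calls.

```python
import os
import pickle
import tempfile

# Stand-in for a fitted model; any picklable estimator object works the same way.
model = {"name": "placeholder_model", "coefficients": [0.4, -1.2, 0.7]}

# Hypothetical file name, written to the system's temporary directory.
model_path = os.path.join(tempfile.gettempdir(), "cross_sell_model.pkl")

# Save the model to disk.
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Load it back and confirm the round trip preserved the object.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

print("Round trip OK:", restored == model)

# Clean up the temporary file.
os.remove(model_path)
```

With a real estimator, the sanity check would call `restored.predict(...)` on a few unseen rows and compare the output with the original model's predictions. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.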

### ***Congrats! Your model is successfully created and ready for deployment on a live server for a real user interaction !!!***

# **Conclusion**

Write the conclusion here.

### ***Hurrah! You have successfully completed your Machine Learning Capstone Project !!!***