#### Problem Statement 
##### Enabling More Targeted Promotions and Lower Customer Acquisition Costs using Data Engineering and AI

One of Incedo Inc.'s banking and financial services (BFSI) customers wanted to improve its marketing campaigns in order to boost conversion rates while lowering customer acquisition costs. To better target clients and promote the most appropriate products and services, the bank sought to identify the most effective channels, offers, and approaches.

The business objectives are:
- To reduce customer acquisition cost by targeting the prospects who are likely to buy
- To improve the response rate, i.e., the fraction of prospects who respond to the campaign
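
The two objectives are linked by simple arithmetic, sketched below. All figures are hypothetical and chosen only to make the calculation easy to follow; the function names `response_rate` and `cost_per_acquisition` are illustrative and do not come from the dataset.

```python
# Hypothetical campaign figures, used only to illustrate the arithmetic.

def response_rate(n_responders, n_contacted):
    """Fraction of contacted prospects who responded to the campaign."""
    return n_responders / n_contacted


def cost_per_acquisition(n_contacted, cost_per_contact, n_responders):
    """Total contact cost divided by the number of customers acquired."""
    return (n_contacted * cost_per_contact) / n_responders


# Untargeted campaign: contact all 10,000 prospects at 1 unit of cost each.
base_rate = response_rate(1_000, 10_000)
base_cpa = cost_per_acquisition(10_000, 1.0, 1_000)

# Targeted campaign: contact only the 3,000 highest-scoring prospects,
# assuming (hypothetically) that 600 of the responders are among them.
targeted_rate = response_rate(600, 3_000)
targeted_cpa = cost_per_acquisition(3_000, 1.0, 600)

print(f"untargeted: rate={base_rate:.2f}, cost per acquisition={base_cpa:.2f}")
print(f"targeted:   rate={targeted_rate:.2f}, cost per acquisition={targeted_cpa:.2f}")
```

In this made-up scenario the targeted campaign doubles the response rate and halves the cost per acquisition, at the price of reaching fewer responders in total.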

We will follow the steps below:

- Read and understand the data
- Exploratory data analysis
- Prepare the data for modelling
- Model evaluation
- Create gain and lift charts, and estimate the financial benefit to the bank of using the model for customer acquisition
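
The gain and lift figures named in the last step can be computed roughly as follows. This is a minimal sketch on synthetic scores, not the model built in this notebook: `gain_lift_table`, `y_true` and `y_score` are hypothetical names, and the random data only stands in for real model output.

```python
import numpy as np
import pandas as pd


def gain_lift_table(y_true, y_score, n_bins=10):
    """Build a decile table of cumulative gain and lift from binary outcomes and scores."""
    df = pd.DataFrame({"actual": np.asarray(y_true), "score": np.asarray(y_score)})
    # Rank prospects from highest to lowest predicted score.
    df = df.sort_values("score", ascending=False).reset_index(drop=True)
    # Decile 1 holds the top-scored tenth of prospects, decile 10 the bottom tenth.
    df["decile"] = np.arange(len(df)) * n_bins // len(df) + 1
    table = df.groupby("decile")["actual"].agg(prospects="count", responders="sum")
    table["cum_responders"] = table["responders"].cumsum()
    # Gain: share of all responders captured up to and including this decile.
    table["gain_pct"] = 100 * table["cum_responders"] / table["responders"].sum()
    table["cum_prospects_pct"] = 100 * table["prospects"].cumsum() / table["prospects"].sum()
    # Lift: gain relative to contacting the same share of prospects at random.
    table["lift"] = table["gain_pct"] / table["cum_prospects_pct"]
    return table


# Synthetic stand-in for model output: higher scores respond more often.
rng = np.random.default_rng(0)
y_score = rng.random(1000)
y_true = (rng.random(1000) < 0.3 * y_score).astype(int)

table = gain_lift_table(y_true, y_score)
print(table)
```

By construction the cumulative gain reaches 100% and the lift falls back to 1 in the last decile, because by then every prospect has been contacted.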

In [1]:
# Importing libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
pd.set_option('display.max_columns',100)

In [2]:
import warnings
warnings.filterwarnings('ignore')

## Read and understand the data

In [3]:
data = pd.read_csv('../input/incedo-bfsi-data/Customer purchase and transaction history.csv')
data.head()

In [4]:
data.shape

In [5]:
data.info()

In [6]:
data.describe()

#### Treating Missing Values in columns

In [7]:
# Checking the percentage of missing values in each column
(data.isnull().sum()/len(data.index))*100

We can see that there are no missing values in any column.
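
For reference, the same percentage can be obtained with `isnull().mean()`, since the mean of a boolean column is the fraction of `True` values. The sketch below uses a small hypothetical frame (`toy`) with deliberately missing entries, because the real data has none.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing value per column.
toy = pd.DataFrame({
    "incomeInThousands": [40.0, np.nan, 75.0, 52.0],
    "maritalStatus": ["single", "married", None, "married"],
})

# Percentage of missing values per column, computed two equivalent ways.
missing_pct = toy.isnull().mean() * 100
missing_pct_long = (toy.isnull().sum() / len(toy.index)) * 100

print(missing_pct)
```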

### Checking response variable

In [9]:
data['cardLast4Digits'].value_counts()

We can see that there is an imbalance in the response rate.

## Exploratory Data Analysis

In [10]:
data = pd.read_csv('../input/incedo-bfsi-data/Customer profile data.csv')
data.head()

First, we will look at the client data.

- Age
- Job : type of job
- Marital : marital status
- Education
- Default: has credit in default?
- Housing: has housing loan?
- Loan: has personal loan?

#### IncomeInThousands

In [14]:
sns.boxplot(x='incomeInThousands', data=data)

We can see that the youngest and eldest prospects have a higher response rate than the others, keeping in mind that the number of prospects in the youngest group is very small.
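
The points that the `incomeInThousands` boxplot draws beyond its whiskers can also be listed numerically. The sketch below assumes the usual 1.5 × IQR whisker rule (seaborn's default) and uses a hypothetical income series rather than the real column.

```python
import pandas as pd

# Hypothetical incomes (in thousands) with one extreme value.
income = pd.Series([20, 35, 40, 45, 50, 55, 60, 65, 70, 250], name="incomeInThousands")

# Quartiles and interquartile range.
q1, q3 = income.quantile([0.25, 0.75])
iqr = q3 - q1

# Whisker limits under the 1.5 * IQR convention.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Values outside the limits are the points a boxplot marks as outliers.
outliers = income[(income < lower) | (income > upper)]
print(outliers)
```

Applied to the real data, `income` would be replaced by `data['incomeInThousands']`.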

#### Job: Employment Status
* self-employed    
* salaried         
* employed         
* unemployed

In [16]:
data['employmentStatus'].value_counts()

In [21]:
plt.figure(figsize=(15,7))
sns.barplot(x='employmentStatus', y='incomeInThousands', data=data)
plt.show()

We can draw similarities with the incomeInThousands analysis, where we found that the youngest and eldest were the most likely to respond positively. This is reiterated by the analysis above, where we notice that students and retirees have the highest response rates.

#### MaritalStatus
* single       
* married    
* separated    


In [22]:
data['maritalStatus'].value_counts()

In [25]:
sns.barplot(x='maritalStatus', y='incomeInThousands', data=data)
plt.show()

We can see that single, separated and married customers have the same average incomeInThousands.
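
Because `sns.barplot` shows the mean of `y` for each category by default, the comparison can be checked numerically with a group-by. The sketch below runs on a hypothetical toy frame; on the real data, `toy` would be replaced by `data`.

```python
import pandas as pd

# Hypothetical toy data using the same column names as the profile data.
toy = pd.DataFrame({
    "maritalStatus": ["single", "married", "separated", "single", "married", "separated"],
    "incomeInThousands": [40, 60, 50, 60, 40, 50],
})

# Mean, median and group size of income per marital status.
summary = toy.groupby("maritalStatus")["incomeInThousands"].agg(["mean", "median", "count"])
print(summary)
```

Looking at the median and the count alongside the mean helps judge whether equal bar heights reflect similar distributions or are just a coincidence of averages.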

#### City

In [26]:
data['City'].value_counts()

In [27]:
plt.figure(figsize=(15,7))
sns.barplot(x='City', y='incomeInThousands', data=data)
plt.show()