# **INTRODUCTION**

This notebook analyzes customer data from marketing campaigns. We build reports and analyze them to extract actionable insights for decision making.

To guide the work, we frame a few questions that help us understand the data through univariate and multivariate analysis.

**Core questions to answer in this notebook:**
1. What is the marketing acceptance rate in each country?
2. How should we approach marketing in the future?

**Derived questions that lead to the answers:**
1. What does the customer profile look like in the data?
2. Which products are the bestsellers?
3. How can we define customer behavior using the features available in the data?


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
df = pd.read_csv('/kaggle/input/marketing-data/marketing_data.csv')
df.head()

In [None]:
df.columns

In [None]:
df.info()

Here we clean the data and engineer the features needed for the analysis.
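
As a small illustration of the income cleanup, here is a plain-Python sketch of the same idea applied to a single value. The helper name `parse_income` and the sample string are assumptions made for this example; the notebook itself works on the whole column with pandas string methods.

```python
def parse_income(raw):
    """Turn a currency string such as '$84,835.00 ' into a float.

    Returns None for missing or empty values so they can be imputed later.
    parse_income is a hypothetical helper, not part of the notebook.
    """
    if raw is None:
        return None
    # Drop surrounding whitespace, the dollar sign, and thousands separators.
    cleaned = raw.strip().replace('$', '').replace(',', '').replace(' ', '')
    return float(cleaned) if cleaned else None


print(parse_income('$84,835.00 '))
```

Stripping whitespace before removing the dollar sign keeps the helper working even if a value has a leading space.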

In [None]:
#clean the income column: fix the padded column name, then strip '$', ',' and whitespace
df.rename(columns = {' Income ': 'Income'}, inplace = True)
df['Income'] = df['Income'].str.replace(r'[$,\s]', '', regex = True)
df['Income'] = df['Income'].astype(float)
#impute missing income with the median, which is robust to the extreme values
df['Income'] = df['Income'].fillna(df['Income'].median())

#convert datatypes from object to integer
df['Year_Birth'] = df['Year_Birth'].astype(int)

#convert datatypes from object to date
df['Dt_Customer'] = pd.to_datetime(df['Dt_Customer'])

#deleting absurd categories in marital status
delete_absurd = df[df['Marital_Status'] == 'Absurd'].index
df.drop(delete_absurd, inplace = True)

df['Education'] = df['Education'].str.replace('2n Cycle', 'Master')

df['Marital_Status'] = df['Marital_Status'].str.replace('YOLO', 'Single').str.replace('Alone', 
                                                                                      'Single').str.replace('Widow', 'Divorced')
#feature engineering
df['Age'] = 2021 - df['Year_Birth']
df['First_Cust'] = 2021 - df['Dt_Customer'].dt.year
df['Dependent'] = df['Kidhome'] + df['Teenhome']
spending_col = [col for col in df.columns if 'Mnt' in col]
df['Total_Spending'] = df[spending_col].sum(axis = 1)
platform_col = [col for col in df.columns if 'Purchases' in col]
df['Total_Purchase'] = df[platform_col].sum(axis = 1)
campaigns_cols = [col for col in df.columns if 'Cmp' in col] + ['Response'] 
df['Total_Campaign_Scc'] = df[campaigns_cols].sum(axis = 1)

df['Generation'] = df['Age'].\
apply(lambda x: 'Silent' if x > 76 else 'Boomers' if x > 56 else 'Gen X' if x > 41 else 'Milenial' if x > 27 else 'Gen Z')

#copy the data for visualization
df2 = df.copy()
df2.drop(['ID', 'Year_Birth', 'Dt_Customer', 'Kidhome', 'Teenhome', 'Recency'], axis = 1, inplace = True)
df2.head()
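
The nested conditional that assigns `Generation` can be hard to audit at a glance. A minimal sketch of the same thresholds as a standalone function is shown below; the function name `generation_label` is an assumption for illustration, and the labels (including the notebook's spelling `Milenial`) and cut-offs are copied from the cell above.

```python
def generation_label(age):
    """Map an age (computed as of 2021 in the notebook) to a generation label.

    generation_label is a hypothetical helper; thresholds mirror the lambda above.
    """
    # Each pair is (exclusive lower bound on age, label), checked from oldest down.
    thresholds = [(76, 'Silent'), (56, 'Boomers'), (41, 'Gen X'), (27, 'Milenial')]
    for lower_bound, label in thresholds:
        if age > lower_bound:
            return label
    return 'Gen Z'


for sample_age in (80, 60, 45, 30, 25):
    print(sample_age, generation_label(sample_age))
```

It could be applied with `df['Age'].apply(generation_label)` if a named function is preferred over the inline lambda.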

# **Univariate Analysis**

Here we set the palette used for the visualizations: ['#e2d810', '#d9138a', '#12a4d9', '#322e2f', 'gray']

In [None]:
pal_list = ['#e2d810', '#d9138a', '#12a4d9', '#322e2f', 'gray']

import matplotlib.ticker as mtick

fig, ax = plt.subplots(2, 2, figsize = (18, 14))
fig.patch.set_facecolor('white')

ax0 = ax[0, 0]
ax1 = ax[0, 1]
ax2 = ax[1, 0]
ax3 = ax[1, 1]

#iterate over the axes directly instead of looking them up by name through locals()
for axis in ax.flat:
    for s in ['top', 'right', 'left']:
        axis.spines[s].set_visible(False)
    axis.set_facecolor('white')
        
#Country
    #data
#name the count column explicitly so the code does not depend on the pandas version
country_data = df2['Country'].value_counts().head(7).to_frame('count')
country_data['percentage'] = country_data['count'] / country_data['count'].sum() * 100
    #viz
x_country = np.arange(len(country_data.index))
ax0.barh(country_data.index, width = country_data['percentage'], zorder = 3, color = '#e2d810', height = 0.6)
ax0.xaxis.set_major_formatter(mtick.PercentFormatter())
ax0.xaxis.set_major_locator(mtick.MultipleLocator(10))
ax0.text(-0.5, -0.5, 'Country Origin', fontsize = 15, fontweight = 'bold', fontfamily = 'serif', color = 'black')
ax0.invert_yaxis()

#Education
    #ascending index
ed_asc = ['Basic', 'Graduation', 'Master', 'PhD']
    #data
#reindex the rows so the bars follow the ascending education order
ed_data = df2['Education'].value_counts().reindex(ed_asc).to_frame('count')
ed_data['percentage'] = ed_data['count'] / ed_data['count'].sum() * 100
    #viz
x_ed = np.arange(len(ed_data.index))
ax1.bar(x_ed, height = ed_data['percentage'], zorder = 3, width = 0.4, color = '#e2d810')
ax1.set_xticks([0, 1, 2, 3])
ax1.set_xticklabels(list(ed_data.index))
ax1.yaxis.set_major_formatter(mtick.PercentFormatter())
ax1.text(-0.5, 52.5, 'Educational Level', fontsize = 15, fontweight = 'bold', fontfamily = 'serif', color = 'black')

#Marital Status
    #ascending index
marital_asc = ['Single', 'Divorced', 'Together', 'Married']
    #data
marital_data = df2['Marital_Status'].value_counts().head(4).to_frame('count')
marital_data['percentage'] = (marital_data['count'] / marital_data['count'].sum() * 100).round()
    #viz
explode = [0.1, 0, 0, 0]
x_marital = np.arange(len(marital_data.index))
ax2.pie(marital_data['percentage'], labels = marital_data.index, shadow = True, colors = pal_list, autopct = '%1.1f%%',
       explode = explode, counterclock = False, radius = 1.1)
ax2.text(-1, 1.4, 'Relationship Status', fontsize = 15, fontweight = 'bold', fontfamily = 'serif', color = 'black',
        ha = 'center')

#Generation
gen_order = ['Silent', 'Boomers', 'Gen X', 'Milenial', 'Gen Z']
gen_data = df2['Generation'].value_counts()
explode1 = [0.1, 0, 0, 1, 2.5]
ax3.pie(gen_data, labels = gen_data.index, colors = pal_list, explode = explode1, autopct = '%1.1f%%', counterclock = False,
       shadow = True, radius = 1.1)
ax3.text(-1, 1.4, 'Generation', fontsize = 15, fontweight = 'bold', fontfamily = 'serif', color = 'black', 
         ha = 'center')

fig.text(0, 1, 'CUSTOMER PROFILE DISTRIBUTION', fontsize = 18, fontweight = 'bold', fontfamily = 'serif', color = 'black',
        bbox={'facecolor': 'white', 'alpha': 0.5, 'pad': 10}, style = 'italic')

fig.tight_layout(pad = 3.0)

For the categorical features, we find that most customers come from Spain, hold a Graduation-level education, are married, and belong to Gen X.
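
The percentage shares behind these charts can also be computed in one step. The sketch below uses a small made-up series in place of `df2['Country']`, so the country codes and counts are illustrative assumptions only.

```python
import pandas as pd

# Toy stand-in for df2['Country']; the values are invented for this example.
countries = pd.Series(['SP', 'SP', 'SP', 'SA', 'CA', 'SP', 'SA', 'ME'])

# normalize=True returns fractions that sum to 1; multiply by 100 for percentages.
share_pct = countries.value_counts(normalize=True).mul(100).round(1)
print(share_pct)
```

Using `normalize=True` avoids dividing by a manually computed total and keeps the result indexed by category.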

In [None]:
col_dist = ['Income', 'Age', 'Total_Spending', 'Total_Purchase']
fig, ax = plt.subplots(2, 2, figsize=(13, 8))
ax = ax.flatten()

for ind, col in enumerate(col_dist):
    sns.boxplot(x = df2[col], ax = ax[ind], color = '#e2d810')
    for s in ['top', 'right', 'left']:
        ax[ind].spines[s].set_visible(False)

bbox_args = dict(boxstyle="round", fc="#e2d810", color = '#e2d810')
ax[0].annotate('Distant Outlier',
            xy=(666666, 0),
            xycoords='data',
            xytext=(-40,50),
            textcoords='offset points',
            arrowprops=dict(color='#363d46', arrowstyle = '->'),
            fontsize=12, bbox=bbox_args)
ax[1].annotate('This seems suspicious',
            xy=(121, 0),
            xycoords='data',
            xytext=(-40,50),
            textcoords='offset points',
            arrowprops=dict(color='#363d46', arrowstyle = '->'),
            fontsize=12, bbox=bbox_args)
        
fig.text(0.05, 1.1, 'NUMERICAL DISTRIBUTION', fontsize = 18, fontweight = 'bold', fontfamily = 'serif', color = 'black',
        bbox={'facecolor': 'white', 'alpha': 0.5, 'pad': 10}, style = 'italic')
fig.text(0.05, 0.88, '''
- Income has distant outliers; we exclude them so the visual patterns stay readable

- Age looks suspicious: some recorded ages exceed 120 years, which is implausible,
  so we drop these observations
  
- Total Spending and Total Purchase look acceptable''',
        fontsize = 15, fontweight = 'light', fontfamily = 'serif', color = 'black')

plt.show()

In [None]:
#dropping outlier
outlier_data = df2[(df2['Age'] > 120) | (df2['Income'] > 150000)].index

df2.drop(index = outlier_data, inplace = True)
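
The cut-offs above (Age > 120, Income > 150000) are hand-picked from the boxplots. As a point of comparison, a common rule of thumb is Tukey's 1.5 × IQR fence. The sketch below applies it to made-up income values; it is an alternative technique shown under that assumption, not the method used in this notebook.

```python
import pandas as pd

# Invented incomes, with one extreme value similar to the distant outlier above.
incomes = pd.Series([30000, 45000, 52000, 61000, 75000, 666666])

q1, q3 = incomes.quantile(0.25), incomes.quantile(0.75)
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr  # values above this are flagged as outliers

outliers = incomes[incomes > upper_fence]
print(upper_fence, outliers.tolist())
```

A data-driven fence adapts to the distribution, while a fixed threshold is easier to explain; which one fits better depends on the analysis.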

# **Multivariate Analysis**

In [None]:
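df_products = df2[spending_col].sum().sort_values(ascending = False)
#map column names to display labels explicitly so the labels always follow the sorted order
df_products = df_products.rename(index = {'MntWines': 'Wines', 'MntMeatProducts': 'Meat', 'MntGoldProds': 'Gold',
                                          'MntFishProducts': 'Fish', 'MntSweetProducts': 'Sweet', 'MntFruits': 'Fruits'})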

high_perform = '#e2d810'
color_map = ['gray' for i in range(6)]
color_map[0] = color_map[1] =  high_perform

fig, ax = plt.subplots(1, 1, figsize = (12, 8))
ax.barh(df_products.index, df_products, color = 'gray', alpha = 0.5)
ax.barh(df_products.index, df_products, color = color_map)

#label every bar with its total in thousands
for i in df_products.index:
    ax.annotate(f"{int(df_products[i] / 1000)}k", xy = (df_products[i] + 1, i), 
                va = 'center', ha='right', fontsize=15, fontweight='bold', fontfamily='serif',color='white')
    
for s in ['top', 'right', 'left']:
    ax.spines[s].set_visible(False)
    
ax.text(0, -0.9, 'PRODUCT TOTAL SALES', fontsize = 20, fontfamily = 'serif', fontweight = 'bold', 
        color = 'black', bbox={'facecolor': 'white', 'alpha': 0.5, 'pad': 10}, style = 'italic')
ax.text(0, -0.6, '', fontsize = 15, fontfamily = 'serif', 
        fontweight = 'light', color = '#323232')

ax.invert_yaxis()

plt.show()

In [None]:
wine_sales = df2.groupby(['Generation'])[['MntWines']].sum().\
reset_index()
wine_sales['percentage'] = wine_sales['MntWines'].apply(lambda x: round(x / np.sum(wine_sales['MntWines']) * 100, 2))


fig, ax = plt.subplots(figsize = (15, 8))

recipe = wine_sales['Generation']

data = wine_sales['MntWines']

wedges, texts = ax.pie(data, wedgeprops=dict(width=0.5), startangle=-40, colors = pal_list)
bbox_props = dict(boxstyle="square,pad=0.3", fc="w", ec="k", lw=0.72)
kw = dict(arrowprops=dict(arrowstyle="-"),
          bbox=bbox_props, zorder=0,va="center")

for i, p in enumerate(wedges):
    ang = (p.theta2 - p.theta1)/2. + p.theta1
    y = np.sin(np.deg2rad(ang))
    x = np.cos(np.deg2rad(ang))
    horizontalalignment = {-1: "right", 1: "left"}[int(np.sign(x))]
    connectionstyle = "angle,angleA=0,angleB={}".format(ang)
    kw["arrowprops"].update({"connectionstyle": connectionstyle})
    ax.annotate(recipe[i], xy=(x, y), xytext=(1.35*np.sign(x), 1.4*y),
                horizontalalignment=horizontalalignment, **kw)
    
ax.text(-1.6, 1.5, "WINE SALES BY GENERATION", fontsize = 20, fontfamily = 'serif', fontweight = 'bold', 
        color = 'black', bbox={'facecolor': 'white', 'alpha': 0.5, 'pad': 10}, style = 'italic')
ax.text(-1.6, 1.23, '''
Boomers clearly account for a large share of wine sales, partly because of their share 
of the customer base and partly because of highly motivated behavior (socializing, pairing with food, relaxing, etc.)''',
       fontweight = 'light', fontfamily = 'serif', fontsize = 14)

In [None]:
import matplotlib

df3 = df2.copy()
df3.drop(['MntWines', 'MntFruits', 'MntMeatProducts', 'MntFishProducts', 'MntSweetProducts', 
                     'MntGoldProds','AcceptedCmp3', 'AcceptedCmp4', 'AcceptedCmp5', 'AcceptedCmp1', 'AcceptedCmp2', 
                     'Response'], axis = 1, inplace = True)
df3.rename(columns = {'NumDealsPurchases': 'Deals_Pch', 'NumWebPurchases': 'Web_Pch', 'NumCatalogPurchases': 'Catalog_Pch',
                     'NumStorePurchases': 'Store_Pch', 'NumWebVisitsMonth': 'Web_Visit'}, inplace = True)


fig, ax = plt.subplots(2, 1, figsize = (10, 18))

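#correlate numeric columns only; text columns such as Education cannot be correlated
corr1 = df3.select_dtypes(include = 'number').corr()

#np.bool was removed from NumPy, so use the built-in bool
mask = np.zeros_like(corr1, dtype = bool)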
mask[np.triu_indices_from(mask)] = True

cmap = matplotlib.colors.LinearSegmentedColormap.from_list('', pal_list)

sns.heatmap(corr1, square = True, cmap = cmap, mask = mask, linewidth = 2.5, vmax = 0.4, vmin = -0.4, cbar = False, ax = ax[0],
           annot = True)
ax[0].set_xticklabels(['Income', 'Deals_Pch', 'Web_Pch', 'Catalog_Pch', 'Store_Pch',
       'Web_Visit', 'Complain', 'Age', 'First_Cust', 'Dependent',
       'Total_Spending', 'Total_Purchase', ''], rotation = 90)
ax[0].set_yticklabels(['', 'Deals_Pch', 'Web_Pch', 'Catalog_Pch', 'Store_Pch',
       'Web_Visit', 'Complain', 'Age', 'First_Cust', 'Dependent', 'Total_Spending', 'Total_Purchase', 'Total_Campaign_Scc'])

#sort the correlations with Total_Campaign_Scc and plot each value against its own label
tot_corr = corr1['Total_Campaign_Scc'].drop(index = 'Total_Campaign_Scc').sort_values()
sns.barplot(x = tot_corr.index, y = tot_corr.values, palette = pal_list, ax = ax[1])
ax[1].tick_params(axis = 'x', rotation = 90)

ax[0].text(0, 0, "CORRELATION MATRIX", fontsize = 20, fontfamily = 'serif', fontweight = 'bold', 
        color = 'black', bbox={'facecolor': 'white', 'alpha': 0.5, 'pad': 10}, style = 'italic')

for s in ['top', 'right', 'left']:
    ax[1].spines[s].set_visible(False)
    ax[1].set_yticklabels('')
    ax[1].set_yticks([])

fig.tight_layout()

In [None]:
gen_ed_inc = df2.groupby(['Generation', 'Education'])['Income'].mean().reset_index()

#catplot is a figure-level function and creates its own figure, so no plt.figure() call is needed

sns.catplot(x = 'Generation', y = 'Income', col = 'Education', data = gen_ed_inc, saturation=1, kind="bar", aspect=0.7,
           palette = pal_list, order = ['Silent', 'Boomers', 'Gen X', 'Milenial', 'Gen Z'])

Income has a positive relationship with age/generation: as age increases, income tends to increase as well. However, the chart shows a suspicious pattern in which Gen Z has a higher average income than older generations with higher education. This happens because Gen Z accounts for only a small share of the observations, so its average is unreliable. We therefore assume that Gen Z actually ranks lowest in terms of income.
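
One way to make this small-sample caveat visible is to report the group size next to the group mean. The sketch below does so on a made-up table; the column names follow the notebook, but the rows are invented for illustration.

```python
import pandas as pd

# Invented rows: four Gen X customers and a single Gen Z customer.
toy = pd.DataFrame({
    'Generation': ['Gen X', 'Gen X', 'Gen X', 'Gen X', 'Gen Z'],
    'Income': [50000, 55000, 60000, 65000, 90000],
})

# Show the mean together with the number of observations behind it.
summary = toy.groupby('Generation')['Income'].agg(['mean', 'count'])
print(summary)
```

A mean backed by a single observation, as for Gen Z in this toy table, says little about the group as a whole.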

In [None]:
fig, ax = plt.subplots(1, 1, figsize = (9, 10))
sns.boxplot(x = 'Dependent', y = 'Total_Spending', hue = 'Marital_Status', data = df2, 
            ax = ax, palette = pal_list)

#reuse the labels seaborn generated so each legend entry matches its box color
legend_handles, legend_labels = ax.get_legend_handles_labels()
ax.legend(legend_handles, legend_labels, ncol = 4, bbox_to_anchor = (1,1))

for s in ['right', 'left']:
   ax.spines[s].set_visible(False)

ax.set_ylabel('')

fig.text(0, 0.9, "Total Spending by Dependent and Marital Status", fontfamily = 'serif', fontweight = 'bold',
        fontsize = 20, bbox={'facecolor': 'white', 'alpha': 0.5, 'pad': 10}, style = 'italic')

plt.show()

We see here that our customers spend more when they do not have any dependents. This suggests that our products sell better to them. It is also interesting that there are total spending outliers, or perhaps a distinct pattern, for customers with 1, 2, or 3 dependents. They may have more income to cover that spending, or a higher education level. On the other hand, marital status does not appear to change the pattern of total spending.

Next, we will see how the total spending feature relates to campaign performance.

In [None]:
platform_list = ['NumWebPurchases', 'NumCatalogPurchases', 'NumStorePurchases', 'NumDealsPurchases']
gen_asc = ['Silent', 'Boomers', 'Gen X', 'Milenial', 'Gen Z']

age_platform = df2.groupby('Generation')[platform_list].sum()
age_platform['Sum'] = age_platform.sum(axis = 1)
age_platform_ratio = (age_platform.T / age_platform['Sum']).round(2).T
age_platform_ratio.drop('Sum', axis = 1, inplace = True)
age_platform_ratio.rename(columns = {'NumWebPurchases': 'Web',
                                      'NumCatalogPurchases': 'Catalogue', 'NumStorePurchases': 'Stores',
                                    'NumDealsPurchases': 'Deals'}, inplace = True)

fig, ax = plt.subplots(1, 1, figsize = (12, 6))

ax.barh(age_platform_ratio.index, age_platform_ratio['Web'], color = '#e2d810', alpha = 0.9, label = 'Web')
ax.barh(age_platform_ratio.index, age_platform_ratio['Catalogue'], 
        left = age_platform_ratio['Web'], color = '#d9138a', alpha = 0.9, label = 'Catalogue')
ax.barh(age_platform_ratio.index, age_platform_ratio['Stores'], 
        left = age_platform_ratio['Web'] + age_platform_ratio['Catalogue'], 
        color = '#12a4d9', alpha = 0.9, label = 'Stores')
ax.barh(age_platform_ratio.index, age_platform_ratio['Deals'], 
        left = age_platform_ratio['Web'] + age_platform_ratio['Catalogue'] + age_platform_ratio['Stores'], 
        color = '#322e2f', alpha = 0.9, label = 'Deals')


for i in age_platform_ratio.index:
    #format with .0% so a ratio such as 0.29 is rounded to 29% instead of truncated to 28%
    web, cat, store, deal = age_platform_ratio.loc[i, ['Web', 'Catalogue', 'Stores', 'Deals']]
    ax.annotate(f"{web:.0%}", xy = (web / 2, i), 
                va = 'center', ha='center', fontsize=10, fontweight='light', fontfamily='serif',color='black')
    ax.annotate(f"{cat:.0%}", xy = (web + cat / 2, i), 
                va = 'center', ha='center', fontsize=10, fontweight='light', fontfamily='serif',color='black')
    ax.annotate(f"{store:.0%}", xy = (web + cat + store / 2, i), 
                va = 'center', ha='center', fontsize=10, fontweight='light', fontfamily='serif',color='black')
    ax.annotate(f"{deal:.0%}", xy = (web + cat + store + deal / 2, i), 
                va = 'center', ha='center', fontsize=10, fontweight='light', fontfamily='serif',color='black')
    
    
ax.text(0, 5, 'Platform Distribution by Generation', fontsize = 20, fontfamily = 'serif', fontweight = 'bold', 
        color = '#323232', bbox={'facecolor': 'white', 'alpha': 0.5, 'pad': 10}, style = 'italic')
ax.text(0, 4.5, 'Interestingly, Stores draw the most attention across all generations', fontsize = 15, fontfamily = 'serif', 
        fontweight = 'light')
    
legend_labels, _ = ax.get_legend_handles_labels()
ax.legend(legend_labels, ['Web', 'Catalogue', 'Stores', 'Deals'], ncol = 4, bbox_to_anchor = (0.5, -0.1), loc = 'lower center')
ax.set_xticks([])

for s in ['top', 'bottom', 'left', 'right']:
    ax.spines[s].set_visible(False)
    
plt.show()
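
The ratio table in this cell is built by transposing, dividing, and transposing back. A minimal sketch of the same row-wise normalization with `DataFrame.div` is shown below; the purchase counts are made up for illustration.

```python
import pandas as pd

# Invented purchase counts per channel for two generations.
counts = pd.DataFrame(
    {'Web': [20, 10], 'Catalogue': [10, 5], 'Stores': [50, 25], 'Deals': [20, 10]},
    index=['Boomers', 'Gen X'],
)

# Divide each row by its own total (axis=0 aligns the divisor on the row index).
ratio = counts.div(counts.sum(axis=1), axis=0)
print(ratio.round(2))
```

Each row of `ratio` sums to 1, so the stacked bars for every generation span the same total width.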

In [None]:
pal_list = ['#e2d810', '#d9138a', '#12a4d9', '#322e2f', 'gray']
campaign_country = df2.groupby('Country')[['AcceptedCmp3', 'AcceptedCmp4', 'AcceptedCmp5', 'AcceptedCmp1',
       'AcceptedCmp2', 'Response', 'Total_Campaign_Scc']].mean().sort_values(by = 'Total_Campaign_Scc', 
                                                                           ascending = False).reset_index()
fig, ax = plt.subplots(figsize = (15, 8))

ax.bar(campaign_country['Country'], campaign_country['AcceptedCmp1'], color = '#e2d810')
ax.bar(campaign_country['Country'], campaign_country['AcceptedCmp2'], bottom = campaign_country['AcceptedCmp1'], color = '#d9138a')
ax.bar(campaign_country['Country'], campaign_country['AcceptedCmp3'], 
       bottom = campaign_country['AcceptedCmp1'] + campaign_country['AcceptedCmp2'], color = '#12a4d9')
ax.bar(campaign_country['Country'], campaign_country['AcceptedCmp4'], 
       bottom = campaign_country['AcceptedCmp1'] + campaign_country['AcceptedCmp2'] + campaign_country['AcceptedCmp3'],
      color = '#322e2f')
ax.bar(campaign_country['Country'], campaign_country['AcceptedCmp5'], 
       bottom = campaign_country['AcceptedCmp1'] + campaign_country['AcceptedCmp2'] + campaign_country['AcceptedCmp3'] + campaign_country['AcceptedCmp4'],
      color = 'gray')
ax.bar(campaign_country['Country'], campaign_country['Response'], 
       bottom = campaign_country['AcceptedCmp1'] + campaign_country['AcceptedCmp2'] + campaign_country['AcceptedCmp3'] + campaign_country['AcceptedCmp4'] + campaign_country['AcceptedCmp5'])

ax.legend(['Cmp 1', 'Cmp 2', 'Cmp 3', 'Cmp 4', 'Cmp 5', 'Cmp 6'], ncol = 3, 
          bbox_to_anchor = (0.5, 0.7), loc = 'upper center')

for s in ['top', 'right']:
    ax.spines[s].set_visible(False)
    
ax.text(-0.5, 1.1, "CAMPAIGN ACCEPTED BY COUNTRY", fontsize = 20, fontfamily = 'serif', fontweight = 'bold', 
        color = 'black', bbox={'facecolor': 'white', 'alpha': 0.5, 'pad': 10}, style = 'italic')

plt.show()

The same issue appears here: Mexico has the highest acceptance rate by country simply because it accounts for only a small share of the total observations. So again we assume that Mexico actually ranks last in terms of acceptance rate.
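
To quantify how uncertain a rate from a small group is, one option is a Wilson score interval. This technique is not part of the original notebook; the function name `wilson_interval` and the counts below are assumptions chosen only to contrast a small sample with a large one at the same observed rate.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96).

    wilson_interval is a hypothetical helper written for this sketch.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Same observed rate (30%), very different sample sizes.
small = wilson_interval(3, 10)
large = wilson_interval(300, 1000)
print(small, large)
```

The interval for the small group is much wider, which is the reason a high rate from a handful of customers should be read with caution.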

In [None]:
pal_list = ['#e2d810', '#d9138a', '#12a4d9', '#322e2f', 'gray']

fig, ax = plt.subplots(2, 3, figsize = (15, 10))
fig.patch.set_facecolor('white')

ax0 = ax[0, 0]
ax1 = ax[0, 1]
ax2 = ax[0, 2]
ax3 = ax[1, 0]
ax4 = ax[1, 1]
ax5 = ax[1, 2]

for axis in ax.flat:
    axis.grid(which='major', axis='x', zorder=0, color='black', linestyle=':', dashes=(1,5))

#Campaigns 1-5 plus the last campaign (Response), one panel each
campaign_cols = ['AcceptedCmp1', 'AcceptedCmp2', 'AcceptedCmp3', 'AcceptedCmp4', 'AcceptedCmp5', 'Response']
for col, axis in zip(campaign_cols, ax.flat):
    accepted = df2[df2[col] == 1]
    not_accepted = df2[df2[col] == 0]
    #fill replaces the deprecated shade argument (requires seaborn 0.11 or newer)
    sns.kdeplot(accepted['Total_Spending'], ax = axis, color = "#e2d810", alpha = 0.9, fill = True)
    sns.kdeplot(not_accepted['Total_Spending'], ax = axis, color = "#d9138a", alpha = 0.9, fill = True)

for axis in ax.flat:
    for s in ["top","right","left"]:
        axis.spines[s].set_visible(False)
    axis.set_ylabel('')
    axis.set_xlabel('')
    axis.set(yticks = [])

ax0.text(-0.56, 0.0015, 'Campaign 1', fontsize = 15, fontweight = 'bold', fontfamily = 'serif', color = 'black')
ax1.text(-0.56, 0.0014, 'Campaign 2', fontsize = 15, fontweight = 'bold', fontfamily = 'serif', color = 'black')
ax2.text(-0.56, 0.0014, 'Campaign 3', fontsize = 15, fontweight = 'bold', fontfamily = 'serif', color = 'black')
ax3.text(-0.56, 0.0015, 'Campaign 4', fontsize = 15, fontweight = 'bold', fontfamily = 'serif', color = 'black')
ax4.text(-0.56, 0.0015, 'Campaign 5', fontsize = 15, fontweight = 'bold', fontfamily = 'serif', color = 'black')
ax5.text(-0.56, 0.0015, "Campaign 6 (Response)", fontsize = 15, fontweight = 'bold', fontfamily = 'serif', color = 'black')

fig.text(0.1, 1, "TOTAL SPENDING AND CAMPAIGN ACCEPTED RELATIONSHIP: \nDEEP UNDERSTANDING", fontsize = 18, 
         fontfamily = 'serif', fontweight = 'bold', color = 'black', 
         bbox={'facecolor': 'white', 'alpha': 0.5, 'pad': 12}, style = 'italic', va = 'center')
fig.text(0.1, 0.93, "It's clear that acceptance of several campaigns (Campaign 1, Campaign 2, Campaign 5) is related to Total Spending.",
        fontsize = 13, fontfamily = 'serif', fontweight = 'light', color = 'black')

ax0.legend(['Accepted', 'Not Accepted'], ncol = 2, bbox_to_anchor = (3.2, 1.35), loc = 'upper center')

plt.show()

# **Conclusion**

Action Suggestions:
1. Spain has the largest existing customer base. Campaigns there should therefore focus on retention (encouraging repeat purchases from existing customers) at any level of marketing budget.
2. Since the campaign acceptance rate is influenced by Total Spending (which is in turn influenced by Generation and Education), campaigns 1, 2, and 5 deserve to be continued.
3. If any marketing budget remains, customer acquisition should be an important strategy for growing the customer base, taking Income, Generation, and Dependent into account.