# Direct Target Marketing

Mass marketing treats all customers as a single group. One-to-one marketing focuses on one customer at a time. Direct target marketing lies between the two: it directs marketing activities at those customers who are most likely to buy the product.

Direct target marketing implies selection. Some customers are identified as more valuable than others and these more valued customers are given special attention. 

By using direct target marketing properly, companies can improve their profitability by increasing revenues and decreasing costs.

Models for direct marketing: Bank Casework

- What do we mean by a direct marketing model? Classification into likely buyers vs. non-buyers 

- Shall we have one model or many (one for each segment or one for each mailing)?

- Define a meaningful response and a set of legitimate explanatory variables (no explanatory variable may be confounded with the response)

- Training/test approach to model testing: Do exploratory analysis and data preparation on the training set alone. Develop the targeting model (model specification and subset selection) on the training set alone.

- Evaluate the model on the test set using statistical and financial evaluation criteria

- Develop specific recommendations for management - which customers should be sent mailings (likely buyers predicted by the model)?

- What's the profit contribution of the model? Without the model, we mail to everyone. With the model, we mail only to predicted likely buyers. Using the test set, predict costs and sales with and without the model for targeting.
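The last point can be made concrete with a small calculation. The sketch below is illustrative only: the cost per mailing, the revenue per sale, the toy test-set responses, and the helper `campaign_profit` are all assumptions introduced here, not values or functions from the bank case.

```python
# Hypothetical economics (assumed for illustration, not taken from the case)
COST_PER_MAILING = 15.0    # cost of contacting one customer
REVENUE_PER_SALE = 100.0   # revenue from one accepted offer


def campaign_profit(actual, mailed):
    """Profit when only the customers flagged in `mailed` are contacted.

    actual: 1 if the customer responds when contacted, else 0
    mailed: 1 if the customer is sent a mailing, else 0
    """
    n_mailed = sum(mailed)
    # A sale can only come from a customer who was actually mailed
    n_sales = sum(a for a, m in zip(actual, mailed) if m == 1)
    return n_sales * REVENUE_PER_SALE - n_mailed * COST_PER_MAILING


# Toy test set (made up): true responses and a model's predicted likely buyers
actual_response = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
predicted_buyer = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]

# Without the model, every customer in the test set is mailed
mail_everyone = [1] * len(actual_response)

profit_without_model = campaign_profit(actual_response, mail_everyone)
profit_with_model = campaign_profit(actual_response, predicted_buyer)
model_contribution = profit_with_model - profit_without_model

print("Profit mailing everyone:       ", profit_without_model)
print("Profit mailing predicted buyers:", profit_with_model)
print("Profit contribution of model:   ", model_contribution)
```

Whether targeting pays off depends on the assumed economics: when mailings are cheap relative to the revenue per sale, mailing everyone can remain the better policy even with a reasonable model.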

In [1]:
# Python >= 3.7
import sys

# Scikit-Learn >= 0.22
import sklearn

# Common imports for data processing and machine learning
import numpy as np # arrays and numerical processing
import pandas as pd # DataFrame structure and operations
import os

# Making the notebook's output stable across runs
np.random.seed(31)

# To plot figures
%matplotlib inline
import matplotlib as mp
import matplotlib.pyplot as plt # 2D plotting
mp.rc('axes', labelsize=14)
mp.rc('xtick', labelsize=12)
mp.rc('ytick', labelsize=12)

# Where to save the figures
PROJECT_DIR = "."
SECTION_ID = "targeting_customers"
IMAGES_PATH = os.path.join(PROJECT_DIR, "images", SECTION_ID)
os.makedirs(IMAGES_PATH, exist_ok=True)

def save_img(fig_id, tight_layout=True, fig_extension="png", resolution=300):
    path = os.path.join(IMAGES_PATH, fig_id + "." + fig_extension)
    print("Saving image", fig_id)
    if tight_layout:
        plt.tight_layout()
    plt.savefig(path, format=fig_extension, dpi=resolution)
    
# Ignore useless warnings
import warnings
warnings.filterwarnings(action="ignore", message="^internal gelsd")

### Get the data

In [2]:
bankData = pd.read_csv('datasets/bank.csv', sep=';')

In [3]:
print(bankData.head())

   age          job  marital  education default  balance housing loan  \
0   30   unemployed  married    primary      no     1787      no   no   
1   33     services  married  secondary      no     4789     yes  yes   
2   35   management   single   tertiary      no     1350     yes   no   
3   30   management  married   tertiary      no     1476     yes  yes   
4   59  blue-collar  married  secondary      no        0     yes   no   

    contact  day month  duration  campaign  pdays  previous poutcome response  
0  cellular   19   oct        79         1     -1         0  unknown       no  
1  cellular   11   may       220         1    339         4  failure       no  
2  cellular   16   apr       185         1    330         1  failure       no  
3   unknown    3   jun       199         4     -1         0  unknown       no  
4   unknown    5   may       226         1     -1         0  unknown       no  


### Define variables

In [4]:
jobtitle = {'admin.': 'White Collar', 'entrepreneur': 'White Collar', 'management': 'White Collar',
          'self-employed': 'White Collar', 'blue-collar': 'Blue Collar', 'services': 'Blue Collar',
          'technician': 'Blue Collar'}

In [5]:
marital_status = {'divorced': 'Divorced', 'married': 'Married', 'single': 'Single'}

In [6]:
education_level = {'primary': 'Primary', 'secondary': 'Secondary', 'tertiary': 'Tertiary'}

In [7]:
yesno = {'yes': 1, 'no': 0}

In [8]:
bankData['jobtype'] = bankData['job'].map(jobtitle)
bankData['jobtype'] = bankData['jobtype'].fillna('Other')

bankData['marital'] = bankData['marital'].map(marital_status)
bankData['marital'] = bankData['marital'].fillna('Unknown')

bankData['education'] = bankData['education'].map(education_level)
bankData['education'] = bankData['education'].fillna('Unknown')

# yes/no fields become 1/0; any unmapped value is filled with 0 so the column stays numeric
bankData['default'] = bankData['default'].map(yesno)
bankData['default'] = bankData['default'].fillna(0)

bankData['housing'] = bankData['housing'].map(yesno)
bankData['housing'] = bankData['housing'].fillna(0)

bankData['loan'] = bankData['loan'].map(yesno)
bankData['loan'] = bankData['loan'].fillna(0)

bankData['response'] = bankData['response'].map(yesno)
bankData['response'] = bankData['response'].fillna(0)

In [9]:
# select first-time customers (pdays == -1 marks customers with no previous contact)
first_time_customers = bankData['pdays'] == -1

first_time_bankData = bankData.loc[first_time_customers, [
    'response', 'age', 'jobtype', 'education', 'marital', 'default', 'balance', 'housing', 'loan']].copy()

In [10]:
print(first_time_bankData.head())

   response  age       jobtype  education  marital  default  balance  housing  \
0         0   30         Other    Primary  Married        0     1787        0   
3         0   30  White Collar   Tertiary  Married        0     1476        1   
4         0   59   Blue Collar  Secondary  Married        0        0        1   
7         0   39   Blue Collar  Secondary  Married        0      147        1   
8         0   41  White Collar   Tertiary  Married        0      221        1   

   loan  
0     0  
3     1  
4     0  
7     0  
8     0  


In [11]:
print(first_time_bankData.info())

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3705 entries, 0 to 4518
Data columns (total 9 columns):
response     3705 non-null int64
age          3705 non-null int64
jobtype      3705 non-null object
education    3705 non-null object
marital      3705 non-null object
default      3705 non-null int64
balance      3705 non-null int64
housing      3705 non-null int64
loan         3705 non-null int64
dtypes: int64(6), object(3)
memory usage: 289.5+ KB
None


In [12]:
print(first_time_bankData.describe())

          response          age      default       balance      housing  \
count  3705.000000  3705.000000  3705.000000   3705.000000  3705.000000   
mean      0.090958    41.083671     0.019163   1374.862078     0.551417   
std       0.287588    10.373818     0.137117   3008.524207     0.497416   
min       0.000000    19.000000     0.000000  -3313.000000     0.000000   
25%       0.000000    33.000000     0.000000     60.000000     0.000000   
50%       0.000000    39.000000     0.000000    415.000000     1.000000   
75%       0.000000    49.000000     0.000000   1412.000000     1.000000   
max       1.000000    87.000000     1.000000  71188.000000     1.000000   

              loan  
count  3705.000000  
mean      0.159784  
std       0.366455  
min       0.000000  
25%       0.000000  
50%       0.000000  
75%       0.000000  
max       1.000000  


In [13]:
# baseline response rate
accepted_offer = first_time_bankData['response'].map(lambda x: x == 1)
baseline = len(first_time_bankData[accepted_offer]) / len(first_time_bankData)
print('\nProportion of bank customers responding to offer: ', round(baseline, 5), '\n')


Proportion of bank customers responding to offer:  0.09096 



In [14]:
# average of continuous variables by response
print(first_time_bankData.pivot_table(['age'], index = ['response']))
print(first_time_bankData.pivot_table(['balance'], index = ['response']))

# response rate by level of each categorical or binary variable
print(first_time_bankData.pivot_table(['response'], index = ['jobtype']))
print(first_time_bankData.pivot_table(['response'], index = ['education']))
print(first_time_bankData.pivot_table(['response'], index = ['marital']))
print(first_time_bankData.pivot_table(['response'], index = ['default']))
print(first_time_bankData.pivot_table(['response'], index = ['housing']))
print(first_time_bankData.pivot_table(['response'], index = ['loan']))

                age
response           
0         40.983076
1         42.089021
              balance
response             
0         1359.654097
1         1526.851632
              response
jobtype               
Blue Collar   0.072072
Other         0.144958
White Collar  0.096352
           response
education          
Primary    0.082759
Secondary  0.082496
Tertiary   0.112546
Unknown    0.073333
          response
marital           
Divorced  0.126411
Married   0.073753
Single    0.115987
         response
default          
0        0.090534
1        0.112676
         response
housing          
0        0.116727
1        0.069995
      response
loan          
0     0.098619
1     0.050676
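One way to read these tables is to compare each segment's response rate with the overall baseline. The sketch below does this for a few segments, using rates copied from the output above. The ratio it computes is often called lift; that term and the names `segment_rates` and `segment_lift` are introduced here for illustration and are not part of the original analysis.

```python
# Response rates copied from the tables above (first-time customers only)
baseline_rate = 0.090958  # overall mean of `response`
segment_rates = {
    ("housing", 0): 0.116727,
    ("housing", 1): 0.069995,
    ("loan", 0): 0.098619,
    ("loan", 1): 0.050676,
    ("marital", "Divorced"): 0.126411,
    ("marital", "Married"): 0.073753,
    ("marital", "Single"): 0.115987,
}

# Lift above 1 means the segment responds more often than the average customer
segment_lift = {seg: rate / baseline_rate for seg, rate in segment_rates.items()}

# List segments from most to least responsive relative to the baseline
for (variable, level), lift in sorted(
        segment_lift.items(), key=lambda item: item[1], reverse=True):
    print(f"{variable}={level}: lift {lift:.2f}")
```

These are one-variable comparisons only. They suggest which variables may be useful in a targeting model, but they do not account for relationships among the variables.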
