# Direct marketing optimization
## 1. Preparation of analytical dataset
The data is split into a training set (<a href="http://localhost:8888/edit/training_data.xlsx">training_data.xlsx</a>), which contains the 60% of clients with realized sales information, and a target set (<a href="http://localhost:8888/edit/target_data.xlsx">target_data.xlsx</a>), which contains all clients. The data is prepared in <a href="http://localhost:8888/notebooks/PrepareData.ipynb">PrepareData.ipynb</a>.
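
The split described above can be sketched as follows. This is a minimal illustration, not the contents of PrepareData.ipynb: the two input tables, the `Sale_CC` column and all values are hypothetical, and only the idea (training set = clients with sales information, target set = all clients) comes from the text.

```python
import pandas as pd

# Hypothetical client table: one row per client (illustrative values only).
clients = pd.DataFrame({
    "Client": [1, 2, 3, 4, 5],
    "Age": [33, 41, 43, 57, 24],
    "Tenure": [61, 11, 111, 33, 151],
})

# Hypothetical sales table: only 3 of the 5 clients (60%) have realized sales information.
sales = pd.DataFrame({
    "Client": [1, 3, 4],
    "Sale_CC": [1, 0, 1],
})

# Training set: clients that have sales information (inner join on Client).
training_data = clients.merge(sales, on="Client", how="inner")

# Target set: all clients, without the sales columns.
target_data = clients.copy()

print(len(training_data), len(target_data))
```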

## 2. Classification model selection
<ul>
    <li>
To decide whether a client is likely to buy a product, a classification algorithm is used. The appropriate algorithm for each product is selected in <a href="http://localhost:8888/notebooks/CCModelSelector.ipynb">CCModelSelector.ipynb</a>, <a href="http://localhost:8888/notebooks/CLModelSelector.ipynb">CLModelSelector.ipynb</a> and <a href="http://localhost:8888/notebooks/MFModelSelector.ipynb">MFModelSelector.ipynb</a>.
    </li>
    <li>
The analysed algorithms are:
        <ul>
            <li>
        DecisionTreeClassifier (sklearn.tree.DecisionTreeClassifier), 
            </li>
            <li>
                RandomForestClassifier (sklearn.ensemble.RandomForestClassifier), 
            </li>
            <li>
                KNeighborsClassifier (sklearn.neighbors.KNeighborsClassifier),
            </li>
            <li>
                MLPClassifier (sklearn.neural_network.MLPClassifier), 
            </li>
            <li>
                Support Vector Classification (sklearn.svm.SVC).
            </li>
        </ul>
    </li>
    <li>
        The classification algorithms are compared using the accuracy metric (sklearn.metrics.accuracy_score).
    </li>
    <li>
        The best performing model for all three products is RandomForestClassifier.
    </li>
      <li>
        The models created with this algorithm are: <a href="http://localhost:8888/edit/clmodel.pkl">clmodel.pkl</a>, <a href="http://localhost:8888/edit/ccmodel.pkl">ccmodel.pkl</a>, <a href="http://localhost:8888/edit/mfmodel.pkl">mfmodel.pkl</a>.
    </li>
    </ul>
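
The selection procedure listed above can be sketched roughly as follows. This is not the code of the ModelSelector notebooks: the data is synthetic (`make_classification`), the held-out split and all hyperparameters are assumptions, and only the five candidate classifiers and the use of `accuracy_score` come from the text.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the training data (assumption: the real features differ).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

candidates = {
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "MLPClassifier": MLPClassifier(max_iter=1000, random_state=0),
    "SVC": SVC(),
}

# Fit each candidate and record its accuracy on the held-out split.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Keep the candidate with the highest accuracy.
best_name = max(scores, key=scores.get)
best_model = candidates[best_name]

# Serialize the selected model; the notebooks store files such as ccmodel.pkl.
model_bytes = pickle.dumps(best_model)
print(best_name)
```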
    

## 3. Which clients have higher propensity to buy a consumer loan?
<br/>
<b>
People who are most likely to buy a consumer loan are:
    <br/>
<ul>
    <li>
       Young people, who already have tenure in the bank,
    </li>
     <li>
        with current and savings accounts,
    </li>
    <li>
        and high debit and credit card turnover.
    </li>
    </ul>
</b>
The following feature analysis is based on the consumer loan model.


![title](download.png)
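
One common way to obtain a feature analysis like this from a random forest is its `feature_importances_` attribute. Whether the notebooks used exactly this attribute is an assumption. In the sketch below the feature names are borrowed from the prediction tables in this document, while the data is synthetic.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Feature names borrowed from the prediction tables; the values are synthetic.
feature_names = ["Sex", "Age", "Tenure", "Count_CA", "Count_SA", "Count_CC"]
X, y = make_classification(
    n_samples=300, n_features=len(feature_names), n_informative=3, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances: one non-negative value per feature, summing to 1.
importances = pd.Series(
    model.feature_importances_, index=feature_names
).sort_values(ascending=False)
print(importances)
```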


## 4. Which clients have higher propensity to buy a credit card?
<br/>

<b>
People who are most likely to buy a credit card are: 
    <br/>
    <ul>
    <li>
       Mostly middle aged people,
    </li>
     <li>
        with current and savings accounts,
    </li>
    <li>
        high debit and credit card turnover
    </li>
     <li>
       and an overdraft.
    </li>
    </ul>
    </b>
The following feature analysis is based on the credit card model.


![title](download_cc.png)


## 5. Which clients have higher propensity to buy a mutual fund?
<br/>

<b>
People who are most likely to buy a mutual fund are:
    <br/>
<ul>
    <li>
       Mostly men,
    </li>
     <li>
        with current and savings accounts,
    </li>
    <li>
        high debit turnover
    </li>
     <li>
       and previously bought mutual funds.
    </li>
    </ul>
    </b>
    
The following feature analysis is based on the mutual fund model.

![title](download_mf.png)


## 6. Which clients are to be targeted with which offer?
A propensity calculation was made for each product (<a href="http://localhost:8888/notebooks/CcPropensity.ipynb">CcPropensity.ipynb</a>, <a href="http://localhost:8888/notebooks/ClPropensity.ipynb">ClPropensity.ipynb</a>, <a href="http://localhost:8888/notebooks/MfPropensity.ipynb">MfPropensity.ipynb</a>). <br/>
The clients most likely to buy each product, sorted by their probability of buying it, are stored in <a href="http://localhost:8888/edit/cc_sales_prediction.xlsx">cc_sales_prediction.xlsx</a>, <a href="http://localhost:8888/edit/cl_sales_prediction.xlsx">cl_sales_prediction.xlsx</a> and
<a href="http://localhost:8888/edit/mf_sales_prediction.xlsx">mf_sales_prediction.xlsx</a>.
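
A propensity calculation of this kind can be sketched with `predict_proba`. The snippet is an illustration under assumptions, not the Propensity notebooks themselves: the data is synthetic, the model is trained inline instead of being loaded from a `.pkl` file, and the generic column name `sales_prediction` stands in for `cc_sales_prediction` and its siblings.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: the first 300 rows play the training set, the rest the target set.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, y_train = X[:300], y[:300]
X_target = X[300:]

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

target = pd.DataFrame({"Client": range(1, len(X_target) + 1)})
# predict_proba returns one column per class; column 1 is the probability of class 1 ("buys").
target["probability"] = model.predict_proba(X_target)[:, 1]
target["sales_prediction"] = model.predict(X_target)

# Keep the predicted buyers and sort them by probability, highest first.
likely_buyers = (
    target[target["sales_prediction"] == 1]
    .sort_values("probability", ascending=False)
    .reset_index(drop=True)
)
print(likely_buyers.head())
```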


In [17]:
from pandas import read_excel

# Load the predicted buyers for each product (credit card, consumer loan, mutual fund).
cc_sales = read_excel('cc_sales_prediction.xlsx')
cl_sales = read_excel('cl_sales_prediction.xlsx')
mf_sales = read_excel('mf_sales_prediction.xlsx')

In [18]:
cc_sales

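| Unnamed: 0 | Client | Sex | Age | Tenure | Count_CA | Count_SA | Count_MF | Count_OVD | Count_CC | Count_CL | ... | VolumeDebCash_Card | VolumeDebCashless_Card | VolumeDeb_PaymentOrder | TransactionsDeb | TransactionsDeb_CA | TransactionsDebCash_Card | TransactionsDebCashless_Card | TransactionsDeb_PaymentOrder | probability | cc_sales_prediction |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 529 | 1 | 33 | 61 | 1 | 1 | 0 | 0 | 0 | 0 | ... | 0.000000 | 0.000000 | 329.071429 | 4 | 4 | 0 | 0 | 3 | 0.93 | 1 |
| 1 | 478 | 0 | 41 | 11 | 1 | 0 | 0 | 1 | 0 | 0 | ... | 0.000000 | 0.000000 | 247.500000 | 5 | 5 | 0 | 0 | 5 | 0.93 | 1 |
| 2 | 83 | 1 | 43 | 111 | 1 | 3 | 7 | 0 | 0 | 0 | ... | 0.000000 | 0.000000 | 224.392857 | 4 | 4 | 0 | 0 | 3 | 0.92 | 1 |
| 3 | 322 | 1 | 1 | 92 | 1 | 1 | 0 | 0 | 0 | 0 | ... | 0.000000 | 0.000000 | 0.000000 | 2 | 2 | 0 | 0 | 0 | 0.92 | 1 |
| 4 | 938 | 0 | 57 | 33 | 1 | 1 | 2 | 0 | 0 | 0 | ... | 85.714286 | 221.528571 | 163.285714 | 17 | 17 | 2 | 9 | 6 | 0.92 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 237 | 1399 | 0 | 24 | 151 | 1 | 0 | 1 | 0 | 1 | 1 | ... | 57.142857 | 136.563929 | 931.290357 | 58 | 21 | 3 | 27 | 8 | 0.61 | 1 |
| 238 | 741 | 0 | 67 | 77 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 0.000000 | 0.000000 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0.61 | 1 |
| 239 | 395 | 1 | 49 | 150 | 1 | 0 | 0 | 1 | 0 | 0 | ... | 275.000000 | 113.010714 | 0.000000 | 16 | 13 | 4 | 5 | 0 | 0.60 | 1 |
| 240 | 412 | 1 | 51 | 116 | 1 | 1 | 0 | 0 | 0 | 0 | ... | 0.000000 | 28.617857 | 909.596429 | 11 | 11 | 0 | 4 | 6 | 0.57 | 1 |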


In [19]:
cl_sales

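| Unnamed: 0 | Client | Sex | Age | Tenure | Count_CA | Count_SA | Count_MF | Count_OVD | Count_CC | Count_CL | ... | VolumeDebCash_Card | VolumeDebCashless_Card | VolumeDeb_PaymentOrder | TransactionsDeb | TransactionsDeb_CA | TransactionsDebCash_Card | TransactionsDebCashless_Card | TransactionsDeb_PaymentOrder | probability | cl_sales_prediction |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1056 | 1 | 10 | 159 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 0.000000 | 0.000000 | 0.000000 | 1 | 1 | 0 | 0 | 0 | 0.92 | 1 |
| 1 | 1431 | 1 | 19 | 231 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 0.000000 | 0.000000 | 535.714286 | 5 | 5 | 0 | 0 | 1 | 0.91 | 1 |
| 2 | 1169 | 1 | 6 | 172 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 500.000000 | 189.889286 | 0.000000 | 9 | 9 | 1 | 7 | 0 | 0.90 | 1 |
| 3 | 1231 | 1 | 6 | 232 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 2250.000000 | 0.000000 | 0.000000 | 9 | 9 | 6 | 0 | 0 | 0.90 | 1 |
| 4 | 632 | 0 | 26 | 173 | 2 | 0 | 0 | 1 | 0 | 0 | ... | 142.857143 | 0.000000 | 957.142857 | 26 | 15 | 2 | 0 | 3 | 0.89 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 285 | 83 | 1 | 43 | 111 | 1 | 3 | 7 | 0 | 0 | 0 | ... | 0.000000 | 0.000000 | 224.392857 | 4 | 4 | 0 | 0 | 3 | 0.61 | 1 |
| 286 | 45 | 0 | 63 | 52 | 1 | 1 | 0 | 0 | 0 | 0 | ... | 0.000000 | 0.000000 | 0.000000 | 1 | 1 | 0 | 0 | 0 | 0.61 | 1 |
| 287 | 582 | 0 | 57 | 65 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 0.000000 | 0.000000 | 0.005000 | 1 | 1 | 0 | 0 | 1 | 0.60 | 1 |
| 288 | 269 | 1 | 39 | 61 | 1 | 1 | 0 | 1 | 0 | 0 | ... | 107.142857 | 0.000000 | 268.714286 | 10 | 10 | 1 | 0 | 7 | 0.59 | 1 |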


In [20]:
mf_sales

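| Unnamed: 0 | Client | Sex | Age | Tenure | Count_CA | Count_SA | Count_MF | Count_OVD | Count_CC | Count_CL | ... | VolumeDebCash_Card | VolumeDebCashless_Card | VolumeDeb_PaymentOrder | TransactionsDeb | TransactionsDeb_CA | TransactionsDebCash_Card | TransactionsDebCashless_Card | TransactionsDeb_PaymentOrder | probability | mf_sales_prediction |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1564 | 1 | 29 | 175 | 1 | 0 | 0 | 1 | 0 | 0 | ... | 139.285714 | 122.081786 | 83.928571 | 41 | 27 | 6 | 14 | 1 | 0.92 | 1 |
| 1 | 556 | 0 | 39 | 160 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 35.714286 | 584.410714 | 732.428571 | 38 | 38 | 1 | 29 | 7 | 0.91 | 1 |
| 2 | 544 | 0 | 5 | 113 | 1 | 0 | 64 | 0 | 0 | 0 | ... | 278.571429 | 451.818571 | 72.732857 | 24 | 24 | 1 | 15 | 6 | 0.91 | 1 |
| 3 | 390 | 0 | 50 | 67 | 1 | 1 | 79 | 0 | 0 | 0 | ... | 89.285714 | 213.951786 | 397.142857 | 38 | 38 | 2 | 29 | 6 | 0.90 | 1 |
| 4 | 1087 | 0 | 29 | 34 | 1 | 1 | 18 | 0 | 0 | 0 | ... | 0.000000 | 80.214286 | 0.000000 | 2 | 1 | 0 | 1 | 0 | 0.90 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 187 | 203 | 1 | 57 | 181 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 553.571429 | 0.000000 | 0.000000 | 11 | 11 | 6 | 0 | 0 | 0.60 | 1 |
| 188 | 770 | 0 | 37 | 80 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 196.428571 | 0.000000 | 209.714286 | 3 | 3 | 1 | 0 | 1 | 0.57 | 1 |
| 189 | 977 | 0 | 46 | 59 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 75.000000 | 1.067857 | 0.035714 | 4 | 4 | 2 | 1 | 1 | 0.57 | 1 |
| 190 | 132 | 0 | 50 | 53 | 1 | 1 | 0 | 0 | 0 | 0 | ... | 0.000000 | 0.000000 | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0.56 | 1 |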


## 7. What would be the expected revenue based on your strategy?

<ul>
    <li>
        To calculate the revenue, a regression method was selected in <a href="http://localhost:8888/notebooks/MFRegressionModelSelector.ipynb">MFRegressionModelSelector.ipynb</a>, <a href="http://localhost:8888/notebooks/CCRegressionModelSelector.ipynb">CCRegressionModelSelector.ipynb</a> and <a href="http://localhost:8888/notebooks/CLRegressionModelSelector.ipynb">CLRegressionModelSelector.ipynb</a>.
    </li>
    <br/>
    <li>
        The analysed methods are linear regression, random forest and support vector regression. The method with the smallest mean absolute error on the training set (sklearn.metrics.mean_absolute_error) was selected.
    </li>
     <br/>
    <li>
      The best performing method for all three products is support vector regression.
    </li>
     <br/>
     <li>
      <a href="http://localhost:8888/edit/final_offer.xlsx">final_offer.xlsx</a> lists the 100 people most likely to buy a product (based on their probability of buying) together with their predicted revenues.
    </li>
     <br/>
     <li>
    The offers for the clients most likely to buy each product are in
         <a href="http://localhost:8888/edit/mf_offer.xlsx">mf_offer.xlsx</a>,
         <a href="http://localhost:8888/edit/cc_offer.xlsx">cc_offer.xlsx</a> and
         <a href="http://localhost:8888/edit/cl_offer.xlsx">cl_offer.xlsx</a>.
    </li>
     <br/>
    <li>
        The expected revenue, assuming that every offer results in a sale, is 669.93 EUR.
    </li>
        </ul>
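
The regression selection described in the list above can be sketched as follows. It is a minimal illustration on synthetic data (`make_regression`) with default hyperparameters, both of which are assumptions; the three candidate methods and the use of `mean_absolute_error` on the training set come from the text.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.svm import SVR

# Synthetic stand-in for the revenue training data (assumption).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

candidates = {
    "LinearRegression": LinearRegression(),
    "RandomForestRegressor": RandomForestRegressor(random_state=0),
    "SVR": SVR(),
}

# Fit each candidate and measure the mean absolute error on the training set,
# as described in the text.
errors = {}
for name, model in candidates.items():
    model.fit(X, y)
    errors[name] = mean_absolute_error(y, model.predict(X))

# Keep the candidate with the smallest error.
best_name = min(errors, key=errors.get)
print(best_name)
```

Because the error is measured on the same data the models were fitted on, flexible models tend to look better than they would on unseen clients; a held-out split would give a less optimistic comparison.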

In [21]:
from pandas import read_excel

# Load the per-product offers (mutual fund, credit card, consumer loan).
mf_offer = read_excel('mf_offer.xlsx')
cc_offer = read_excel('cc_offer.xlsx')
cl_offer = read_excel('cl_offer.xlsx')

In [22]:
mf_offer

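| Unnamed: 0 | Client | probability_x | mf_revenue_prediction |
| --- | --- | --- | --- |
| 0 | 1564 | 0.92 | 3.036735 |
| 1 | 544 | 0.91 | 2.784634 |
| 2 | 556 | 0.91 | 3.053878 |
| 3 | 1087 | 0.9 | 2.975829 |
| 4 | 390 | 0.9 | 2.951082 |
| 5 | 187 | 0.89 | 3.066447 |
| 6 | 1182 | 0.88 | 3.095037 |
| 7 | 1149 | 0.87 | 3.043596 |
| 8 | 1484 | 0.87 | 3.047654 |
| 9 | 725 | 0.86 | 3.04745 |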


In [23]:
cc_offer

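| Unnamed: 0 | Client | probability_y | cc_revenue_prediction |
| --- | --- | --- | --- |
| 0 | 529 | 0.93 | 4.482576 |
| 1 | 478 | 0.93 | 4.400068 |
| 2 | 938 | 0.92 | 5.000517 |
| 3 | 83 | 0.92 | 5.15 |
| 4 | 322 | 0.92 | 4.094715 |
| 5 | 1170 | 0.91 | 4.741387 |
| 6 | 1362 | 0.91 | 4.148265 |
| 7 | 1406 | 0.91 | 3.822676 |
| 8 | 1413 | 0.91 | 5.087236 |
| 9 | 963 | 0.9 | 4.801341 |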


In [24]:
cl_offer

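| Unnamed: 0 | Client | probability | cl_revenue_prediction |
| --- | --- | --- | --- |
| 0 | 1056 | 0.92 | 10.678685 |
| 1 | 1431 | 0.91 | 10.697129 |
| 2 | 1169 | 0.9 | 10.728321 |
| 3 | 1231 | 0.9 | 10.692662 |
| 4 | 632 | 0.89 | 10.690319 |
| 5 | 1175 | 0.89 | 11.774022 |
| 6 | 102 | 0.89 | 10.68365 |
| 7 | 1513 | 0.88 | 10.239042 |
| 8 | 1438 | 0.88 | 10.685839 |
| 9 | 491 | 0.88 | 10.864255 |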
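
The expected revenue can be sketched as the sum of the predicted revenues of the selected clients. The example below uses the first two rows of each offer table above, with the columns renamed to `probability` and `revenue` for brevity. How final_offer.xlsx resolves clients that appear in several offers is not stated, so the rule used here (one offer per client, the product with the highest probability) is an assumption.

```python
import pandas as pd

# First two rows of each offer table shown above (columns renamed for brevity).
mf_offer = pd.DataFrame({"Client": [1564, 544], "probability": [0.92, 0.91],
                         "revenue": [3.036735, 2.784634]})
cc_offer = pd.DataFrame({"Client": [529, 478], "probability": [0.93, 0.93],
                         "revenue": [4.482576, 4.400068]})
cl_offer = pd.DataFrame({"Client": [1056, 1431], "probability": [0.92, 0.91],
                         "revenue": [10.678685, 10.697129]})

# Stack the offers into one long table with a product label.
offers = pd.concat(
    [mf_offer.assign(product="MF"),
     cc_offer.assign(product="CC"),
     cl_offer.assign(product="CL")],
    ignore_index=True,
)

# Assumption: each client receives one offer, the product with the highest probability.
top_n = 4  # the document uses 100; 4 keeps the example small
final_offer = (
    offers.sort_values("probability", ascending=False)
    .drop_duplicates("Client")
    .head(top_n)
)

# Expected revenue if every selected offer results in a sale.
expected_revenue = final_offer["revenue"].sum()
print(round(expected_revenue, 2))
```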
