### Intro 
This notebook covers variable importance metrics for uplift models. It walks through:

- Data Generating Process
- Model Building 
- Variable Importance Metric 

For an introduction to uplift models, please see the [example on single responses](https://github.com/Ibotta/ibotta_uplift/blob/master/examples/ibotta_uplift_multiple_response_example.ipynb).


### Data Generating Process

The data used in this notebook are generated by the following process.

\begin{equation}
x_1  \sim runif(0,1)
\end{equation}

\begin{equation}
x_2 \sim runif(0,1)
\end{equation}

\begin{equation}
x_3 \sim rnorm(0,1)
\end{equation}


\begin{equation}
e_1 \sim rnorm(0,1)
\end{equation}

\begin{equation}
e_2 \sim rnorm(0,1)
\end{equation}

\begin{equation}
t \sim rbinom(1, .5)
\end{equation}

\begin{equation}
noise \sim rnorm(0,1)
\end{equation}

\begin{equation}
revenue = x_1*t + e_1
\end{equation}

\begin{equation}
costs = x_2*t + e_2
\end{equation}

\begin{equation}
profit = revenue - costs
\end{equation}
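
The equations above can be sketched directly in NumPy. This is an illustrative re-implementation only, not the source of `get_simple_uplift_data`; the function name `simulate_uplift_data`, the `seed` argument, and the column ordering are assumptions made for the example. The explanatory variable $x_3$ is left out here because the notebook adds it separately after loading the data.

```python
import numpy as np

def simulate_uplift_data(num_obs, seed=0):
    """Hypothetical sketch of the data generating process (not the library code)."""
    rng = np.random.default_rng(seed)

    # explanatory variables
    x_1 = rng.uniform(0, 1, num_obs)
    x_2 = rng.uniform(0, 1, num_obs)

    # binary treatment assigned with probability 0.5
    t = rng.binomial(1, 0.5, num_obs)

    # error terms and a pure-noise response
    e_1 = rng.normal(0, 1, num_obs)
    e_2 = rng.normal(0, 1, num_obs)
    noise = rng.normal(0, 1, num_obs)

    # the treatment raises revenue in proportion to x_1 and costs in proportion to x_2
    revenue = x_1 * t + e_1
    costs = x_2 * t + e_2
    profit = revenue - costs

    x = np.column_stack([x_1, x_2])
    y = np.column_stack([revenue, costs, noise, profit])
    return y, x, t

y_sim, x_sim, t_sim = simulate_uplift_data(1000)
print(y_sim.shape, x_sim.shape, t_sim.shape)
```

Under this process the expected effect of treatment is $x_1$ on revenue and $x_2$ on costs, so only $x_1$ and $x_2$ carry information about which users to treat.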




In [1]:
import numpy as np
import pandas as pd

from ibotta_uplift.dataset.data_simulation import get_simple_uplift_data
from ibotta_uplift.ibotta_uplift import IbottaUplift
from ggplot import *

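num_obs = 10000
# responses y, explanatory variables x, and treatment assignment t
y, x, t = get_simple_uplift_data(num_obs)

# name the response columns and derive profit
y = pd.DataFrame(y)
y.columns = ['revenue', 'cost', 'noise']
y['profit'] = y['revenue'] - y['cost']

# add a noise explanatory variable x_3 that affects none of the responses
x = pd.DataFrame(x)
x.columns = ['x_1', 'x_2']
x['x_3'] = np.random.normal(0, 1, num_obs)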






In [2]:
# build and fit the uplift model
uplift_model = IbottaUplift()
param_grid = dict(num_nodes=[8], dropout=[.1, .5], activation=['relu'],
                  num_layers=[1, 2], epochs=[25], batch_size=[30])
uplift_model.fit(x, y, t.reshape(-1, 1), param_grid=param_grid, n_jobs=1)





### Variable Importance Uplift

The variable importance metric described here is a variation on permutation importance: shuffle a column and measure how much the model's output disagrees with its output on the original data.

Continuing the notation from the [multiple response example](https://github.com/Ibotta/ibotta_uplift/blob/master/examples/ibotta_uplift_multiple_response_example.ipynb), we have the policy assignment $\pi(x_i, W)$ as a function of the weights $W$, the explanatory variables $X$, and the estimated model $E[\cdot]$.


\begin{equation}
    \pi(x_i, W) = \underset{t \in T}{\operatorname{argmax}} \sum_j w_j \, E[y_{j,i} \mid X=x_i, T=t]
\end{equation}
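
As a minimal sketch of this rule, the snippet below picks, for each observation, the treatment with the largest weighted sum of expected responses. The function name `assign_policy`, the array layout `(n_obs, n_treatments, n_responses)`, and the toy numbers are assumptions for illustration; they are not the library's internal representation.

```python
import numpy as np

def assign_policy(expected_responses, weights):
    """Return the index of the best treatment for each observation.

    expected_responses: array of shape (n_obs, n_treatments, n_responses),
        holding E[y_j | X = x_i, T = t] (hypothetical layout).
    weights: array of shape (n_responses,), the w_j in the formula.
    """
    # weighted sum over responses j -> shape (n_obs, n_treatments)
    weighted = expected_responses @ weights
    # argmax over treatments t
    return np.argmax(weighted, axis=1)

# toy example: two observations, two treatments (0 = control, 1 = treated),
# two responses (revenue, cost)
expected = np.array([
    [[0.0, 0.0], [0.9, 0.2]],  # treatment adds much revenue, little cost
    [[0.0, 0.0], [0.1, 0.8]],  # treatment adds little revenue, much cost
])
weights = np.array([0.6, -0.4])

decisions = assign_policy(expected, weights)
print(decisions)  # [1 0]
```

The first observation is treated because $0.6 \cdot 0.9 - 0.4 \cdot 0.2 = 0.46 > 0$, while the second is not because $0.6 \cdot 0.1 - 0.4 \cdot 0.8 = -0.26 < 0$.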


To obtain the variable importance of a particular explanatory variable $p$, that column is permuted across observations, giving a permuted copy $x_{i,permuted_p}$ of each user's data. The variable importance is the fraction of the $n$ observations whose decision under the permuted data disagrees with the original decision:


\begin{equation}
   \text{variable importance}_p = 1 - \frac{1}{n} \sum_{i=1}^{n} I\left(\pi(x_i, W) = \pi(x_{i,permuted_p}, W)\right)
\end{equation}

Intuitively, if the decisions made from the permuted data match those made from the unpermuted data, we can conclude that the variable is not important. Conversely, if the decisions are very different, we can conclude that the variable is very important.
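
The disagreement metric can be sketched in a few lines of NumPy. This is not the implementation behind `permutation_varimp`; the helper `permutation_disagreement` and the stand-in decision rule `oracle_decide` are hypothetical. The stand-in assumes a weight of 0.6 on revenue and -0.4 on cost, so that under the data generating process above the ideal rule is to treat whenever $0.6 x_1 - 0.4 x_2 > 0$.

```python
import numpy as np

def permutation_disagreement(decide, x, column, rng):
    """Share of observations whose decision changes when one column is shuffled.

    decide: function mapping an (n_obs, n_features) array to one decision per row
        (a hypothetical stand-in for the fitted model's policy).
    """
    original = decide(x)
    x_permuted = x.copy()
    # shuffle only the chosen column, leaving the others untouched
    x_permuted[:, column] = rng.permutation(x_permuted[:, column])
    permuted = decide(x_permuted)
    # 1 - (fraction of identical decisions)
    return 1.0 - np.mean(original == permuted)

def oracle_decide(x):
    # assumed ideal rule: treat when the weighted uplift 0.6*x_1 - 0.4*x_2 is positive
    return (0.6 * x[:, 0] - 0.4 * x[:, 1] > 0).astype(int)

rng = np.random.default_rng(0)
n = 5000
x_toy = np.column_stack([
    rng.uniform(0, 1, n),  # x_1
    rng.uniform(0, 1, n),  # x_2
    rng.normal(0, 1, n),   # x_3, pure noise
])

importances = [permutation_disagreement(oracle_decide, x_toy, p, rng) for p in range(3)]
print(importances)
```

Because the stand-in rule never reads the third column, shuffling it cannot change any decision and its importance is exactly zero. A fitted model will generally pick up a small amount of spurious dependence on a noise column, so its value will be small rather than exactly zero.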

Below is the variable importance from the fitted model. Note that the noise variable $x_3$ has very low importance relative to the other two. This makes sense since $x_3$ does not affect the response variables.



In [3]:
# weights follow the column order of y: revenue, cost, noise, profit
uplift_model.permutation_varimp(weights=np.array([.6, -.4, 0, 0]).reshape(1, -1))



| | permuation_varimp_metric | var_names |
|---|---|---|
| 0 | 0.362571 | x_1 |
| 1 | 0.234 | x_2 |
| 2 | 0.006571 | x_3 |
