### Causal Inference

- Aim: identify cause-and-effect relationships, i.e. did event A cause event B?
    - Examples: Does sending emails increase purchase conversion? Does changing the page design improve **click-through rate** (the percentage of impressions that result in a click)?
- Challenges:
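    - Confounders: variables that influence both the treatment and the outcome and cannot be controlled directly. (Randomization addresses this.)
    - Selection bias: the sample is not a good representation of the population.
    - Counterfactuals: for each unit, only one of the treated and untreated outcomes can be observed. (Matching addresses this.)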
- Assumptions:
    - Causal Markov Condition (Markov Assumption)
        - Causal Graph
        - Directed Acyclic Graph (DAG)
    - Stable Unit Treatment Value Assumption (SUTVA)
        - Treatment and control groups do not interact with each other.
    - Ignorability
        - There are no unmeasured confounders.
- Metrics:
    - Individual Treatment Effect
    - Average Treatment Effect 
    - Conditional Average Treatment Effect
- Types of Leads (Potential Customers)
    - Only target Persuadables.

| | |**Without Marketing Action**| |
|---|---|---|---|
| | |**Don't Convert**|**Convert**|
|**With Marketing Action**|**Convert**|Persuadables|Sure Things|
| |**Don't Convert**|Lost Causes|Sleeping Dogs|
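
The metrics listed above can be written in potential-outcomes notation. The symbols $Y_i(1)$ and $Y_i(0)$, the outcomes that unit $i$ would have with and without treatment, are an assumed notation that does not appear elsewhere in these notes:

$$\begin{align*}
\text{ITE}_i &= Y_i(1) - Y_i(0)\\
\text{ATE} &= \mathbb{E}\left[Y_i(1) - Y_i(0)\right]\\
\text{CATE}(x) &= \mathbb{E}\left[Y_i(1) - Y_i(0) \mid X_i = x\right]
\end{align*}$$

Only one of $Y_i(1)$ and $Y_i(0)$ is ever observed for a given unit, so the ITE cannot be measured directly and has to be estimated.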
    
    

- Techniques:
    1. **Randomized Controlled Trials (RCTs)** / A/B tests
        - Select participants.
        - Randomly split them into treatment and control groups.
        - Treat the two groups differently.
        - Monitor purchase conversion over time.
        - Decide whether the observed effect matches what was expected.
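
The final decision step of an A/B test is often backed by a significance test. The sketch below is one possible way to do it, using a pooled two-proportion z-test; the function name `two_proportion_z_test` and the conversion counts are made up for illustration and are not part of the original notes.

```python
import math

def two_proportion_z_test(conv_t, n_t, conv_c, n_c):
    """Compare conversion rates of treatment (t) and control (c) groups.

    Returns (difference in rates, z statistic, two-sided p-value).
    Hypothetical helper: assumes independent groups and samples large
    enough for the normal approximation to hold.
    """
    p_t = conv_t / n_t                      # treatment conversion rate
    p_c = conv_c / n_c                      # control conversion rate
    p_pool = (conv_t + conv_c) / (n_t + n_c)  # pooled rate under H0: no effect
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))
    z = (p_t - p_c) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_t - p_c, z, p_value

# Made-up counts: 120/1000 conversions with treatment, 90/1000 without.
uplift, z, p_value = two_proportion_z_test(conv_t=120, n_t=1000, conv_c=90, n_c=1000)
print(f"uplift={uplift:.3f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the difference in conversion is unlikely to be due to chance alone; the threshold to act on is a business choice, not something the test decides.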

### Uplift Modeling 
Which individuals should we target?
1. Meta-Learning Techniques
    - Two-model Approach
    - Class Transformation Approach
    
$$\begin{align*}ITE&=P(\text{Outcome}|\text{Treated})-P(\text{Outcome}|\text{Not Treated})\\
&=P(Y_i=1|X_i,W_i=1)-P(Y_i=1|X_i,W_i=0)
\end{align*}$$

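where $ITE\in [-1,1]$ is the Individual Treatment Effect (estimated here as a difference of two conditional probabilities given the features), $Y_i\in \{0,1\}$ is the purchase outcome, $X_i\in \mathbb{R}^D$ is the lead's feature vector, and $W_i\in \{0,1\}$ indicates the control group ($0$) or the treatment group ($1$).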

2. Direct Uplift Estimation Techniques

#### Two-Model Approach

Two models are trained: one predicts the probability of purchase for the treatment group, and the other predicts the probability of purchase for the control group.

During training, each model is fitted only on the rows from its own group,

$$\begin{align*}
P(Y_i=1|X_i,W_i=1) &= f_1(X_i[W_i=1]) \\
P(Y_i=1|X_i,W_i=0) &= f_2(X_i[W_i=0]) \\
\end{align*}$$

During inference, pass the target's feature vector to both models, calibrate the outputs, and take their difference to obtain the ITE,
$$\begin{align*}
P(Y_i=1|X_i,W_i=1) &= f_1(X_i) \\
P(Y_i=1|X_i,W_i=0) &= f_2(X_i) \\
ITE &= f_1(X_i) - f_2(X_i)
\end{align*}$$
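
A minimal sketch of the two-model approach follows. To stay dependency-free, each "model" is a toy estimator that predicts the observed conversion rate per customer segment; `SegmentRateModel`, the segment names, and the data are all hypothetical stand-ins for a real classifier and real features.

```python
from collections import defaultdict

class SegmentRateModel:
    """Toy stand-in for a probabilistic classifier (hypothetical):
    predicts P(Y=1 | segment) as the observed conversion rate."""

    def fit(self, segments, outcomes):
        totals, converts = defaultdict(int), defaultdict(int)
        for s, y in zip(segments, outcomes):
            totals[s] += 1
            converts[s] += y
        self.rates_ = {s: converts[s] / totals[s] for s in totals}
        return self

    def predict_proba(self, segment):
        return self.rates_[segment]

# Made-up rows of (segment X_i, treatment flag W_i, outcome Y_i).
rows = [
    ("new", 1, 1), ("new", 1, 1), ("new", 1, 0), ("new", 1, 1),
    ("new", 0, 0), ("new", 0, 1), ("new", 0, 0), ("new", 0, 0),
    ("loyal", 1, 1), ("loyal", 1, 1), ("loyal", 1, 0), ("loyal", 1, 1),
    ("loyal", 0, 1), ("loyal", 0, 1), ("loyal", 0, 1), ("loyal", 0, 0),
]

# Training: f1 sees only treated rows (W=1), f2 only control rows (W=0).
treated = [(x, y) for x, w, y in rows if w == 1]
control = [(x, y) for x, w, y in rows if w == 0]
f1 = SegmentRateModel().fit(*zip(*treated))
f2 = SegmentRateModel().fit(*zip(*control))

# Inference: score the same feature value with both models and subtract.
uplift = {s: f1.predict_proba(s) - f2.predict_proba(s) for s in ("new", "loyal")}
print(uplift)
```

In this made-up data the "new" segment responds to treatment while the "loyal" segment converts at the same rate either way, so only "new" leads would be worth targeting.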


#### Class Transformation Approach
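Only one model is used. The label is transformed from $Y$ to $Z$, where $Z_i=1$ for treated leads that convert ($Y_i=1,W_i=1$) and for control leads that do not convert ($Y_i=0,W_i=0$); these are the two observed outcomes consistent with a persuadable.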

$$\begin{align*}&Z_i = Y_iW_i+(1-Y_i)(1-W_i)\\
&P(Z_i=1|X_i) = f(X_i)
\end{align*}$$

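Assuming leads are assigned to treatment and control with equal probability, $P(W_i=1|X_i)=P(W_i=0|X_i)=\frac{1}{2}$, the individual treatment effect becomes,
$$\begin{align*}
P(Z_i=1|X_i) &= P(Y_i=1|X_i,W_i=1)P(W_i=1|X_i) + P(Y_i=0|X_i,W_i=0)P(W_i=0|X_i)\\
&= \tfrac{1}{2}\left[P(Y_i=1|X_i,W_i=1) + 1 - P(Y_i=1|X_i,W_i=0)\right]\\
\Rightarrow\quad ITE &= 2P(Z_i=1|X_i)-1
\end{align*}$$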
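
The class transformation can be sketched in a few lines. As an illustration only, $P(Z_i=1|X_i)$ is estimated here as a per-segment frequency rather than with a trained classifier; the function name `class_transformation_uplift`, the segments, and the data are made up, and the data is deliberately balanced (equal treated and control counts per segment) so that the equal-assignment assumption holds.

```python
from collections import defaultdict

# Made-up rows of (segment X_i, treatment flag W_i, outcome Y_i),
# with 4 treated and 4 control rows per segment.
rows = [
    ("new", 1, 1), ("new", 1, 1), ("new", 1, 0), ("new", 1, 1),
    ("new", 0, 0), ("new", 0, 1), ("new", 0, 0), ("new", 0, 0),
    ("loyal", 1, 1), ("loyal", 1, 1), ("loyal", 1, 0), ("loyal", 1, 1),
    ("loyal", 0, 1), ("loyal", 0, 1), ("loyal", 0, 1), ("loyal", 0, 0),
]

def class_transformation_uplift(rows):
    """Estimate uplift per segment as 2 * P(Z=1 | segment) - 1.

    Hypothetical helper: a frequency count stands in for the single
    classifier f(X_i); valid only under 50/50 treatment assignment.
    """
    totals, z_counts = defaultdict(int), defaultdict(int)
    for segment, w, y in rows:
        z = y * w + (1 - y) * (1 - w)  # Z=1 if treated+converted or control+not converted
        totals[segment] += 1
        z_counts[segment] += z
    return {s: 2 * z_counts[s] / totals[s] - 1 for s in totals}

uplift = class_transformation_uplift(rows)
print(uplift)
```

On this data the estimate for each segment equals the difference between its treated and control conversion rates (0.75 − 0.25 for "new", 0.75 − 0.75 for "loyal"), which is what the formula predicts when assignment is balanced.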