# Modelling Customer Relationships as Markov Chains
> Overview of the work from P. E. Pfeifer and R. L. Carraway

- toc: true 
- badges: true
- comments: true
- categories: [jupyter]
- image: images/markov.png

# Introduction

In this note I want to explore the work presented in the article ["Modelling customer relationships as Markov Chains" from P. E. Pfeifer and R. L. Carraway (2000)](http://web.stanford.edu/class/msande121/Links/crmmodelingmarkov.pdf).

Let's consider a business with non-contractual relations with its customers, for instance a shop selling products. The business spends money on direct marketing campaigns that aim to make customers who have purchased previously purchase again.

As the famous quote says, "Half the money I spend on advertising is wasted; the trouble is I don't know which half" (John Wanamaker). The purpose of the work analyzed here is to model customer purchasing behavior in the scenario depicted above, with the objective of optimizing the direct marketing spend of the business.

The authors start by presenting a simple toy scenario to illustrate their ideas. I will present this scenario here, with some modifications that I think help to make things more concrete.

Consider a shop that sells one product at a price $NC$ (in the notation of the authors, $NC$ is the net contribution to the company profits). Assume for simplicity that customers cannot buy more than one unit per month from this shop. Every month the company will spend $M$ on marketing for each customer, unless the customer's last purchase was 5 months ago or longer; when this happens, the shop considers the relationship with the customer finished.

We will characterize customers in terms of states. Two customers in the same state behave the same under our model. In this case, the state is the customer's recency, the number of months since their last purchase. Let's consider a concrete case: we have a new customer who purchases in January. In February we say that this customer has recency $r=1$, as one month has passed. In March, he will still have recency $r=1$ if he bought again in February; otherwise he will have recency $r=2$. If the customer has not bought anything by June, he will have recency $r=5$ by then and we will declare him inactive at that point.
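The recency bookkeeping described above can be sketched in a few lines. The function name `next_recency` and the `max_recency` parameter are my own illustrative choices, not notation from the paper.

```python
def next_recency(recency: int, purchased: bool, max_recency: int = 5) -> int:
    """Return next month's recency given this month's recency and outcome."""
    if recency >= max_recency:
        return max_recency  # inactive customers stay inactive
    # A purchase resets recency to 1; otherwise one more month has passed.
    return 1 if purchased else recency + 1


# A customer who buys in January and never again.
recency = 1  # recency in February
path = [recency]
for _ in range(4):  # March, April, May, June
    recency = next_recency(recency, purchased=False)
    path.append(recency)

print(path)  # [1, 2, 3, 4, 5]
```

The customer reaches $r=5$ in June, matching the example in the text.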

In the toy scenario we analyze, we will assume that the probability that a customer purchases again in a given month depends only on his recency. Direct marketers have observed that recency plays a very important role in determining the probability that a customer will purchase again, so even though the scenario is very simplified, it is not pointless.


The system we are describing is very special, in the sense that the next state of the customer is determined only by his current state, independently of how he arrived at that state. For instance, a customer with recency $r=3$ can either purchase again or not. The probability that he purchases again in that month depends only on his recency. If he purchases again his next state is $r=1$; otherwise his next state is $r=4$. This is what people call the Markov property.
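To make the Markov property concrete, here is a minimal simulation sketch of a single customer. The purchase probabilities in `p` are made-up values chosen only for illustration, and `simulate_path` is a hypothetical helper name. Note that each step looks only at the current recency, never at the earlier history.

```python
import random


def simulate_path(p, months, seed=0):
    """Simulate one customer's recency path.

    p maps recency -> probability of purchasing in that month.
    The state after the last key in p is treated as inactive.
    """
    rng = random.Random(seed)
    max_recency = len(p) + 1
    recency = 1  # the customer has just made a first purchase
    path = [recency]
    for _ in range(months):
        if recency < max_recency:
            # The draw depends only on the current recency (Markov property).
            purchased = rng.random() < p[recency]
            recency = 1 if purchased else recency + 1
        path.append(recency)
    return path


p = {1: 0.3, 2: 0.2, 3: 0.15, 4: 0.05}  # hypothetical probabilities
path = simulate_path(p, months=12)
print(path)
```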

If we draw this system as a diagram, it looks like this:


<img src="../images/markov.png" width="500">

The arrows represent the possible paths and $p_r$ represents the probability that a customer with recency $r$ will purchase again.   In matrix form, the previous diagram can be written as


\begin{equation*}
\mathbf{P} = 
\begin{pmatrix}
p_1 & 1 - p_{1} & 0 & 0 & 0 \\
p_2 & 0 & 1- p_2 & 0 & 0\\
p_3  & 0  & 0  & 1-p_3 & 0 \\
p_4  & 0 & 0 & 0 & 1-p_4 \\
0  & 0 & 0 & 0& 1 
\end{pmatrix}
\end{equation*}
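As a sketch, the matrix above can be built with NumPy from the purchase probabilities $p_1, \dots, p_4$. The numerical values below are assumptions for illustration only, and `transition_matrix` is a hypothetical helper name.

```python
import numpy as np


def transition_matrix(p):
    """Build the recency transition matrix from purchase probabilities p_1..p_4."""
    n = len(p) + 1  # one extra state for inactive customers
    P = np.zeros((n, n))
    for r, p_r in enumerate(p):
        P[r, 0] = p_r          # purchase: back to recency 1
        P[r, r + 1] = 1 - p_r  # no purchase: recency increases by one
    P[n - 1, n - 1] = 1.0      # the inactive state is never left
    return P


p = [0.3, 0.2, 0.15, 0.05]  # hypothetical probabilities
P = transition_matrix(p)
print(P)

# Each row is a probability distribution over next states.
assert np.allclose(P.sum(axis=1), 1.0)
```

Checking that every row sums to one is a cheap sanity test whenever a transition matrix is assembled by hand.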

Note that the $[\mathbf{P}]_{5,5}$ entry is $1$, as a customer in this state will never purchase again. Ultimately, we would like to estimate the lifetime value of our customers. For this, we need to consider the cash flows associated with each customer state. This information is contained in the vector $\mathbf{R}$

\begin{equation*}
\mathbf{R} = 
\begin{pmatrix}
NC - M \\
-M \\
-M \\
-M \\
0 
\end{pmatrix}
\end{equation*}

This can be interpreted as follows: when the customer goes to state $r=1$, he has just purchased one item (bringing $+NC$ in profits) and will cost $M$ in marketing expenditures in that month. For recency $r=2,3,4$ the customer costs $M$ in marketing, while for $r=5$ the shop has already declared the customer inactive and spends nothing.
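The cash-flow vector can be written down in the same way. This is only a sketch: `reward_vector` is a hypothetical helper, and the values used for $NC$ and $M$ are assumptions for illustration.

```python
import numpy as np


def reward_vector(nc, m, n_states=5):
    """Cash flow per state: NC - M at recency 1, -M while active, 0 once inactive."""
    R = np.full(n_states, -float(m))  # marketing cost in every state by default
    R[0] = nc - m                     # recency 1: a purchase just happened
    R[-1] = 0.0                       # inactive: no purchase, no marketing
    return R


R = reward_vector(nc=40.0, m=4.0)  # hypothetical NC and M
print(R)
```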

With all the information we have so far, we can estimate the Lifetime Value (LV) of a customer $T$ months after his first purchase. Assume that we have a customer making his first purchase in a given month. The LV at $T$ months is then given by

\begin{equation*}
\mathrm{LV} = \sum_{i=1}^{5} \left[\sum_{t=0}^{T} \mathbf{P}^{\,t}\right]_{1i} \, R_{i}
\end{equation*}

We can examine the first two terms. The first term ($t=0$) is simply $NC - M$, corresponding to the first purchase and the associated first marketing expenditure on that customer. The second term ($t=1$) is $p_1 (NC - M) - (1-p_1) M$, which accounts for the two possible scenarios at this stage (he buys again or not). This hopefully makes the previous formula clearer. Note that $[\mathbf{P}^{\,t}]_{ij}$ is the probability that the customer will be at recency $j$ at the end of month $t$ given that he started at recency $i$. That's one of the nice things about the Markov property: we can know what happens after several steps by simply taking powers of the one-step evolution matrix.
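The formula and the two terms discussed above can be checked numerically. The following is a minimal sketch with assumed values for $p_r$, $NC$ and $M$; `lifetime_value` is a hypothetical helper that implements the undiscounted sum used in this note.

```python
import numpy as np

# Hypothetical inputs, chosen only for illustration.
p = [0.3, 0.2, 0.15, 0.05]
NC, M = 40.0, 4.0

P = np.array([
    [p[0], 1 - p[0], 0, 0, 0],
    [p[1], 0, 1 - p[1], 0, 0],
    [p[2], 0, 0, 1 - p[2], 0],
    [p[3], 0, 0, 0, 1 - p[3]],
    [0, 0, 0, 0, 1],
])
R = np.array([NC - M, -M, -M, -M, 0.0])


def lifetime_value(P, R, T):
    """Undiscounted expected cash flow over months t = 0..T, starting at recency 1."""
    terms = [np.linalg.matrix_power(P, t) @ R for t in range(T + 1)]
    return float(sum(terms)[0])  # first entry corresponds to starting at r = 1


# The t = 0 and t = 1 terms written out by hand.
first_term = NC - M
second_term = p[0] * (NC - M) - (1 - p[0]) * M
assert np.isclose(lifetime_value(P, R, 0), first_term)
assert np.isclose(lifetime_value(P, R, 1), first_term + second_term)

# Row 1 of P^2: where a new customer can be after two months.
two_step = np.linalg.matrix_power(P, 2)[0]
print(two_step)
print(lifetime_value(P, R, T=12))
```

Using `np.linalg.matrix_power` keeps the code close to the formula; for large $T$ one could instead accumulate the powers iteratively to avoid recomputing them.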

Notice that throughout this discussion I am not including the effects of the time value of money as the authors do. I don't think they are crucial for the discussion, and leaving them out lets me work with simpler formulas.

# Markov Chains

Absorbing Markov chain

# General Formulation

# Example