Challenge: Write a Churn Simulator
==================================

- Start with a single customer in a random initial state (A, B, C)
 * What data type to use for the customer state?
- Change the state randomly using a transition probability matrix P
 * What values to use for the probability matrix?
 * What data type to use for the probability matrix?
- Simulate N time steps in the same way
- Write the sequence of states to an output file
 * What is the overall structure of the program?

--------------------------------------

- Do the same but for M customers
- Add new customers to the system at every time step
- Structure the code, clean it up, put it on GitHub etc.
 * What functions do I need?

--------------------------------------

OPTIONAL:

- Price the products, estimate customer lifetime value (CLV)
- Plot the number of customers in each state
- Try to find a steady state
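
For the steady-state and CLV items, one standard approach for a chain with an absorbing state is the fundamental matrix `N = (I - Q)^-1`, where `Q` is the block of transitions between non-churned states. The sketch below assumes this approach is acceptable here; the `prices` and `arrivals` vectors are invented for illustration, and the matrix values are the rounded ones from the transition table printed in this notebook. Note that without new arrivals the only steady state is "everyone churned", so the steady state computed here is the one with a constant inflow of customers.

```python
import numpy as np

# rounded values copied from the transition table shown in this notebook
P = np.array([
    [0.092487, 0.124075, 0.103017, 0.056915, 0.623506],
    [0.100091, 0.144928, 0.131793, 0.072464, 0.550725],
    [0.101566, 0.146353, 0.124572, 0.099853, 0.527655],
    [0.063365, 0.100584, 0.113196, 0.098431, 0.624423],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])
P = P / P.sum(axis=1, keepdims=True)  # remove rounding error in the row sums

# transitions among the non-churned states only
Q = P[:-1, :-1]

# fundamental matrix: N[i, j] = expected number of time steps spent in state j
# by a customer who starts in state i (the starting step included)
N = np.linalg.inv(np.eye(Q.shape[0]) - Q)

# expected number of time steps before a customer churns, per starting state
expected_lifetime = N.sum(axis=1)

# hypothetical price paid per time step in each non-churned state
prices = np.array([1.0, 0.5, 2.0, 1.5])
# expected total revenue per customer (undiscounted), per starting state
clv = N @ prices

# hypothetical inflow: 10 new customers per step, spread evenly over the states
arrivals = np.full(Q.shape[0], 2.5)
# expected active customers per state once inflow and churn balance:
# n = n @ Q + arrivals  =>  n = arrivals @ N
steady_active = arrivals @ N

print(expected_lifetime)
print(clv)
print(steady_active)
```

The CLV here ignores discounting and assumes exactly one purchase per time step at the price of the current state; both assumptions would need revisiting for a real pricing model.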

In [1]:
import pandas as pd
import numpy as np

In [2]:
# create a list that contains all the possible states
STATES = ['Apple', 'Banana', 'Coffee', 'Doughnut', 'churned']

# create a list that saves the history of the states
HISTORY = []

# create a dictionary for the state vector creation
D = {'Apple': 0, 'Banana': 1, 'Coffee': 2, 'Doughnut': 3, 'churned': 4}

# choose the number of steps for the simulation
N = 10

In [3]:
# create a random initial state
def initial_state(states):
    
    '''
    This function takes a list of possible states as an input and
    returns the initial state
    '''
    
    # np.asscalar() was removed in NumPy 1.23; .item() returns a plain Python str
    initial_state = np.random.choice(states, size=1)
    return initial_state.item()

In [4]:
# create an array for the next state
def next_state(initial, transition, states, dict_):
    
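    '''
    This function takes the current state, the transition probability matrix,
    the list of states and a dictionary mapping each state to its index,
    and returns the next state
    '''
    # one-hot vector marking the current state
    state = np.zeros(len(states), dtype=int)
    state[dict_[initial]] = 1
    
    # multiplying the one-hot vector by the matrix selects the row of the current state
    between_state = np.matmul(state, transition)
    
    new_state = np.random.choice(states, p=between_state, size=1)
    # np.asscalar() was removed in NumPy 1.23; .item() is the replacement
    return new_state.item()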
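
The matrix product in `next_state` multiplies a one-hot vector by the transition matrix, which amounts to picking the row that belongs to the current state. A minimal sketch of that equivalence, using a small made-up 3-state matrix (the values and the name `P_demo` are illustrative assumptions, not the notebook's data):

```python
import numpy as np

# hypothetical 3-state transition matrix; every row sums to 1
P_demo = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.0, 0.0, 1.0],   # absorbing state, like 'churned'
])

current = 1                        # index of the current state

# one-hot vector for the current state
one_hot = np.zeros(3)
one_hot[current] = 1

via_matmul = one_hot @ P_demo      # what next_state computes
via_indexing = P_demo[current]     # direct row lookup

print(np.allclose(via_matmul, via_indexing))
```

The one-hot form becomes useful once the state is a vector of customer counts or probabilities rather than a single customer; for one customer, plain row indexing gives the same distribution.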

1) Load the customer matrix and the transition probability matrix

In [5]:
cust = pd.read_csv('./data/customer_matrix.csv', index_col=0)
trans = pd.read_csv('./data/transition_probabilities.csv', index_col=0)

2) Start with a single customer in a random initial state (=random flavor)
 * What data type to use for the customer state?
 -> start with a numpy array

In [6]:
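# Add the probabilities for the state 'churned' (absorbing: a churned customer stays churned)
churn_row = pd.DataFrame([[0, 0, 0, 0, 1]], columns=trans.columns)
# DataFrame.append() was removed in pandas 2.0; pd.concat() is the replacement
trans = pd.concat([trans, churn_row])
trans = trans.rename({0: 'churned'}, axis='index')
trans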

Unnamed: 0,Apple,Banana,Coffee,Doughnut,churned
Apple,0.092487,0.124075,0.103017,0.056915,0.623506
Banana,0.100091,0.144928,0.131793,0.072464,0.550725
Coffee,0.101566,0.146353,0.124572,0.099853,0.527655
Doughnut,0.063365,0.100584,0.113196,0.098431,0.624423
churned,0.0,0.0,0.0,0.0,1.0


In [7]:
# transform the transition probability matrix to a numpy array for matrix multiplication
prob = np.array(trans)
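
Before sampling from the matrix it can help to confirm that it is a valid transition matrix: square, non-negative, and with every row summing to 1. The helper below is a hypothetical sketch (the name `check_transition_matrix` and the tolerance `atol` are assumptions); `np.random.choice` performs its own check on `p` and raises a `ValueError` if the probabilities do not sum to 1.

```python
import numpy as np


def check_transition_matrix(matrix, atol=1e-6):
    """Return `matrix` as a float array, or raise ValueError if it is invalid."""
    matrix = np.asarray(matrix, dtype=float)
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        raise ValueError('transition matrix must be square')
    if (matrix < 0).any():
        raise ValueError('probabilities must be non-negative')
    if not np.allclose(matrix.sum(axis=1), 1.0, atol=atol):
        raise ValueError('every row must sum to 1')
    return matrix


# small made-up example: state 0 churns with probability 0.1, state 1 is absorbing
demo = check_transition_matrix([[0.9, 0.1], [0.0, 1.0]])
print(demo.sum(axis=1))
```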

In [8]:
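# draw the initial state from the non-churned states only
init = initial_state(STATES[:-1])
HISTORY.append(init)
print(init)
# first transition
new_state = next_state(init, prob, STATES, D)
HISTORY.append(new_state)
print(new_state)

# N further transitions, so HISTORY ends up with N + 2 entries
for i in range(N):
    new_state = next_state(new_state, prob, STATES, D)
    HISTORY.append(new_state)
    print(new_state)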

Doughnut
churned
churned
churned
churned
churned
churned
churned
churned
churned
churned
churned


In [9]:
HISTORY

['Doughnut',
 'churned',
 'churned',
 'churned',
 'churned',
 'churned',
 'churned',
 'churned',
 'churned',
 'churned',
 'churned',
 'churned']
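
The challenge also asks for the sequence of states to be written to an output file, which the cells above do not do yet. A minimal sketch using the standard `csv` module follows; the function name `write_history`, the file name `history.csv` and the column names are assumptions, and the `history` list simply repeats the run shown above.

```python
import csv
from pathlib import Path

# the sequence from the run above: one initial state, then churned for the rest
history = ['Doughnut'] + ['churned'] * 11


def write_history(history, path):
    """Write one row per time step: the step number and the state at that step."""
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['step', 'state'])
        for step, state in enumerate(history):
            writer.writerow([step, state])


output_path = Path('history.csv')   # hypothetical output location
write_history(history, output_path)
print(output_path.read_text())
```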