<a href="https://colab.research.google.com/github/seanreed1111/BDA_py_demos/blob/master/btyd_test_brian_callander.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

see also: 
- https://www.briancallander.com/posts/customer_lifetime_value/pareto-nbd.html



In [1]:
# installation required
!pip install pyro-ppl


Collecting pyro-ppl
  Downloading pyro_ppl-1.8.0-py3-none-any.whl (713 kB)
     |████████████████████████████████| 713 kB 3.1 MB/s 
Collecting pyro-api>=0.1.1
  Downloading pyro_api-0.1.2-py3-none-any.whl (11 kB)
Installing collected packages: pyro-api, pyro-ppl
Successfully installed pyro-api-0.1.2 pyro-ppl-1.8.0


<a id = "7"></a><br>
# LIBRARIES

In [2]:
import os
import datetime as dt
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

import torch
from torch.distributions import constraints
from torch import tensor

import pyro
import pyro.distributions as dist
from pyro.infer import SVI,Trace_ELBO
from pyro.infer.autoguide  import AutoMultivariateNormal, AutoNormal, init_to_mean
from pyro.optim import ClippedAdam

assert pyro.__version__.startswith('1.8.0')
pyro.set_rng_seed(1)
torch.manual_seed(1)

# Set matplotlib settings
%matplotlib inline
plt.style.use('default')
plt.rcParams['figure.figsize'] = [12, 8]
import warnings 
warnings.filterwarnings("ignore")



Let’s describe the model first by simulation. 

Suppose we have a company that is 2 years old and has a total of 2000 customers, C, who have each made at least one purchase from us. 

We’ll assume customers are acquired at a constant rate, so that the first purchase date is simply a uniform random variable over the 2 years of the company’s existence. These assumptions are just to keep the example concrete, and are not so important for understanding the model.

Each customer c∈C is assumed to have a certain lifetime, τc, starting on their join-date. 

During their lifetime, they will purchase at a constant rate, λc, so that they will make k∼Poisson(tλc) purchases over a time-interval t. 

Once their lifetime is over, they will stop purchasing. We only observe the customer for Tc units of time, and this observation time can be either larger or smaller than the lifetime, τc. 

Since we don’t observe τc itself, we will assume it follows an exponential distribution, i.e. τc∼Exp(μc).

The expected lifetime in our simulated example will have a mean of about 2 months (60 days), with a standard deviation of 30 days.

The purchase rate will have a mean of one purchase every 14 days (1/14 purchases per day), with a standard deviation of 0.05 purchases per day.
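Written compactly, the generative story above amounts to the following. This is only a restatement of the assumptions already listed, with the customer index c made explicit on k:

$$
\tau_c \sim \mathrm{Exp}(\mu_c), \qquad
k_c \mid \lambda_c, \tau_c \sim \mathrm{Poisson}\big(\lambda_c \min(T_c, \tau_c)\big), \qquad
\mathbb{E}(\tau_c) = 1/\mu_c
$$

Purchases accrue at rate λc only while the customer is both alive and under observation, which is why the Poisson mean uses min(Tc, τc).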

In [None]:
#The following function generates possible observations given μ and λ.

# sample_conditional <- function(mu, lambda, T) {
  
#   # lifetime
#   tau <- rexp(1, mu)
  
#   # start with 0 purchases
#   t <- 0
#   k <- 0
  
#   # simulate time till next purchase
#   wait <- rexp(1, lambda)
  
#   # keep purchasing till end of life/observation time
#   while(t + wait <= pmin(T, tau)) {
#     t <- t + wait
#     k <- k + 1
#     wait <- rexp(1, lambda)
#   }
  
#   # return tabular data
#   tibble(
#     mu = mu,
#     lambda = lambda,
#     T = T,
#     tau = tau,
#     k = k,
#     t = t
#   )
# }

# s <- sample_conditional(0.01, 1, 30) 

In [None]:
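def sample_conditional(mu, lam, T):
  '''Simulate one customer given dropout rate mu, purchase rate lam and observation time T.'''
  T = torch.as_tensor(T, dtype=torch.float)  # accept plain numbers as well as tensors
  tau = dist.Exponential(mu).sample()  # lifetime
  t, k = tensor(0.), tensor(0)  # time of last purchase, number of purchases
  wait = dist.Exponential(lam).sample()  # time until next purchase
  # keep purchasing until the end of life or of the observation time
  while t + wait <= torch.minimum(T, tau):
    t = t + wait
    k = k + 1
    wait = dist.Exponential(lam).sample()
  # like the R version, also return t, the time of the most recent purchase
  return mu, lam, T, tau, k, t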
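As a sanity check on the simulation logic, here is a minimal sketch that repeats the same loop in plain Python and compares the average number of purchases with the closed-form mean λ(1 − e^(−μT))/μ implied by an exponential lifetime. The function name `simulate_customer` and the use of the standard library `random` module instead of torch are assumptions made to keep the check dependency-free; they are not part of the original notebook.

```python
import math
import random

def simulate_customer(mu, lam, T, rng):
    """Plain-Python analogue of sample_conditional: returns (k, t, tau)."""
    tau = rng.expovariate(mu)      # lifetime
    horizon = min(T, tau)          # purchases stop at end of life or of observation
    t, k = 0.0, 0                  # time of last purchase, number of purchases
    wait = rng.expovariate(lam)    # time until next purchase
    while t + wait <= horizon:
        t += wait
        k += 1
        wait = rng.expovariate(lam)
    return k, t, tau

rng = random.Random(1)
mu, lam, T = 0.01, 1.0, 30.0       # same values as the R call above
draws = [simulate_customer(mu, lam, T, rng) for _ in range(20000)]

mean_k = sum(k for k, _, _ in draws) / len(draws)
expected_k = lam * (1 - math.exp(-mu * T)) / mu   # lam * E[min(T, tau)]
print(f"simulated mean k = {mean_k:.2f}, closed form = {expected_k:.2f}")
```

The two numbers should agree up to Monte Carlo error, which is a quick way to catch an off-by-one in the purchase loop.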


Priors
Now the priors. Typically, μ and λ are given gamma priors, which we’ll use too. 
However, the expected mean lifetime 𝔼(τ)=1/μ is easier to reason about than μ, so we’ll put an inverse gamma distribution on 1/μ. 
The reciprocal of an inverse gamma random variable is gamma distributed, so μ will still end up with a gamma distribution.



In [None]:
etau_mean = 60
etau_variance = 30**2
Lambda_mean = 1 / 14
Lambda_variance = 0.05 **2
etau_beta  = etau_mean**3 / etau_variance + etau_mean
etau_alpha  = etau_mean**2 / etau_variance + 2

Lambda_beta = Lambda_mean / Lambda_variance
Lambda_alpha = Lambda_mean * Lambda_beta
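The hyperparameter formulas above come from matching the requested mean and variance to the moments of each distribution. The sketch below wraps the same arithmetic in two helper functions (the names `inverse_gamma_params` and `gamma_params` are hypothetical, introduced only for this check) and confirms that the resulting parameters reproduce the targets, using the standard moment formulas for the inverse gamma (mean β/(α−1), variance mean²/(α−2)) and the gamma in shape/rate form (mean α/β, variance α/β²).

```python
import math

def inverse_gamma_params(mean, variance):
    """Shape and scale of an inverse gamma with the given mean and variance."""
    alpha = mean**2 / variance + 2
    beta = mean * (alpha - 1)      # equals mean**3 / variance + mean
    return alpha, beta

def gamma_params(mean, variance):
    """Shape and rate of a gamma with the given mean and variance."""
    beta = mean / variance
    alpha = mean * beta
    return alpha, beta

etau_alpha, etau_beta = inverse_gamma_params(60, 30**2)
Lambda_alpha, Lambda_beta = gamma_params(1 / 14, 0.05**2)

# Recover the moments from the parameters.
etau_mean_check = etau_beta / (etau_alpha - 1)
etau_variance_check = etau_mean_check**2 / (etau_alpha - 2)
Lambda_mean_check = Lambda_alpha / Lambda_beta
Lambda_variance_check = Lambda_alpha / Lambda_beta**2

print(etau_alpha, etau_beta, etau_mean_check, etau_variance_check)
print(Lambda_alpha, Lambda_beta, Lambda_mean_check, Lambda_variance_check)
```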

In [None]:
def model(t_n, T_n, k_n):
  '''
  t_n, T_n, k_n are 1-D tensors of length n (one entry per customer)
  t_n  = time of the most recent purchase
  T_n  = total observation time
  k_n  = number of purchases observed (k must be >= 2)

  etau_alpha, etau_beta, Lambda_alpha, Lambda_beta are scalars read from the enclosing scope
  n = number of customers, taken from t_n.size(0)
  etau_alpha, etau_beta are the inverse gamma prior parameters for etau
  Lambda_alpha, Lambda_beta are the gamma prior parameters for Lambda
  '''
  with pyro.plate("data", t_n.size(0)):
    etau  = pyro.sample('etau', dist.InverseGamma(etau_alpha, etau_beta))
    mu = 1./etau
    Lambda = pyro.sample('Lambda', dist.Gamma(Lambda_alpha, Lambda_beta))
    # loglik returns one term per customer, so the factor belongs inside the plate
    pyro.factor('loglik', loglik(Lambda, mu, t_n, T_n, k_n))

In [3]:
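def loglik(Lambda, mu, t, T, k):
  '''Per-customer Pareto/NBD log-likelihood; all arguments are 1-D tensors of length n.'''
  # k * log(Lambda) - log(Lambda + mu), one entry per customer
  target = k * torch.log(Lambda) - torch.log(Lambda + mu)
  # log(Lambda * exp(-(Lambda + mu) * T) + mu * exp(-(Lambda + mu) * t)), elementwise
  target = target + torch.logaddexp(torch.log(Lambda) - (Lambda + mu) * T,
                                    torch.log(mu) - (Lambda + mu) * t)
  return target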
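For reference, the quantity that `loglik` evaluates for a single customer is the Pareto/NBD individual-level log-likelihood, written here in the notebook's symbols (λ for `Lambda`, μ for `mu`):

$$
\log L(\lambda, \mu \mid k, t, T)
= k \log \lambda - \log(\lambda + \mu)
+ \log\!\left( \lambda\, e^{-(\lambda + \mu) T} + \mu\, e^{-(\lambda + \mu) t} \right)
$$

The first term inside the last logarithm covers the case where the customer is still alive at the end of the observation period T, the second the case where they dropped out some time after their last purchase at t. Computing that sum with `logaddexp` on the log scale avoids underflow when (λ + μ)T is large.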

In [None]:
def create_customers(n, days_in_first_purchase_period):
  '''
  Returns an n x 2 tensor with one row per customer:
  column 0 = day of first purchase,
  column 1 = T = days_in_first_purchase_period - day of first purchase
  '''
  c = dist.Uniform(0, days_in_first_purchase_period).expand([n]).to_event(1).sample().floor()
  return torch.stack([c,days_in_first_purchase_period - c], dim=1)

In [None]:
# create_customers(25, 200) makes 25 customers, each with a first purchase on a uniformly drawn day.
# T = 200 - day of first purchase
create_customers(25,200)

tensor([[168.,  32.],
        [170.,  30.],
        [ 21., 179.],
        [196.,   4.],
        [  1., 199.],
        [157.,  43.],
        [107.,  93.],
        [147.,  53.],
        [ 45., 155.],
        [160.,  40.],
        [ 50., 150.],
        [ 11., 189.],
        [133.,  67.],
        [154.,  46.],
        [199.,   1.],
        [ 89., 111.],
        [196.,   4.],
        [164.,  36.],
        [ 50., 150.],
        [ 22., 178.],
        [156.,  44.],
        [ 45., 155.],
        [145.,  55.],
        [ 14., 186.],
        [172.,  28.]])

In [None]:
# Given μ and λ, the CLV is calculated as follows. 
# The remaining lifetime is the lifetime minus the age of the customer. 
# So if the customer is estimated to have a lifetime of 1 year and has been a customer for 3 months already, 
# then the remaining lifetime will be 9 months.

# lifetime <- function(n, mu, age=0) {
#   rexp(n, mu) %>% 
#     `-`(age) %>% 
#     pmax(0) # remaining lifetime always >= 0
# }
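A possible Python counterpart of the R `lifetime` function above, sketched with the standard library only. The `rng` argument is an assumption added for reproducibility and is not in the R original.

```python
import random

def lifetime(n, mu, age=0.0, rng=random):
    """Draw n remaining lifetimes: an Exp(mu) lifetime minus age, floored at 0."""
    return [max(rng.expovariate(mu) - age, 0.0) for _ in range(n)]

rng = random.Random(1)
# Time unit here is months: mean lifetime 12 months, customer already 3 months old.
remaining = lifetime(10000, mu=1 / 12, age=3, rng=rng)
mean_remaining = sum(remaining) / len(remaining)
print(f"mean remaining lifetime: {mean_remaining:.2f} months")
```

Draws where the simulated lifetime is shorter than the customer's age come out as exactly 0, meaning the customer has already churned in that scenario.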



In [None]:
# The number of purchases in a given timeframe (within the customer’s lifetime) is simply a Poisson random variable.

# purchases <- function(n, lambda, time) {
#   rpois(n, lambda * time)
# }
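The R `purchases` function could be mirrored in Python as below. Since the standard library has no Poisson sampler, this sketch counts exponential inter-purchase waits that fit into each time window, which yields a Poisson(λ · time) count. The signature (a list of window lengths instead of a separate `n`) and the `rng` argument are assumptions, not part of the original.

```python
import random

def purchases(lam, times, rng=random):
    """For each window length in times, count arrivals of a rate-lam Poisson process."""
    counts = []
    for time in times:
        k = 0
        elapsed = rng.expovariate(lam)       # time of the first purchase
        while elapsed <= time:
            k += 1
            elapsed += rng.expovariate(lam)  # time of the next purchase
        counts.append(k)
    return counts

rng = random.Random(1)
# One purchase every 14 days on average, observed over 140-day windows.
counts = purchases(1 / 14, [140.0] * 5000, rng=rng)
mean_count = sum(counts) / len(counts)
print(f"mean purchases per window: {mean_count:.2f}")
```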

In [None]:
# To simulate the CLV, we just simulate a possible lifetime remaining, 
# then simulate the number of purchases in that timeframe. 
# Repeating many times gives us the distribution of the total number of purchases the customer is expected to make.

# clv <- function(n, mu, lambda, age=0) {
#   lifetime(n, mu, age) %>% 
#     purchases(n, lambda, .)
# } 
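Putting the two steps together, a self-contained Python sketch of the R `clv` function might look like this. The helper names `remaining_lifetime` and `count_purchases` and the `rng` argument are hypothetical, and the parameter values in the example (mean lifetime of 60 days, one purchase every 14 days, a customer who is 30 days old) are illustrative choices taken from the prior means used earlier.

```python
import random

def remaining_lifetime(mu, age, rng):
    """One draw of the remaining lifetime, floored at 0."""
    return max(rng.expovariate(mu) - age, 0.0)

def count_purchases(lam, time, rng):
    """Number of rate-lam purchases that fit into a window of the given length."""
    k, elapsed = 0, rng.expovariate(lam)
    while elapsed <= time:
        k += 1
        elapsed += rng.expovariate(lam)
    return k

def clv(n, mu, lam, age=0.0, rng=random):
    """n simulated future purchase counts for a customer of the given age."""
    return [count_purchases(lam, remaining_lifetime(mu, age, rng), rng)
            for _ in range(n)]

rng = random.Random(1)
draws = clv(20000, mu=1 / 60, lam=1 / 14, age=30, rng=rng)
mean_purchases = sum(draws) / len(draws)
print(f"expected future purchases: {mean_purchases:.2f}")
```

The list `draws` approximates the distribution of future purchases, so quantiles or the share of zero-purchase scenarios can be read off it directly rather than only the mean.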