# Gamma-Gamma Model

In this notebook we show how to fit a Gamma-Gamma model in PyMC-Marketing. We compare the results with the [`lifetimes`](https://github.com/CamDavidsonPilon/lifetimes) package, which is no longer maintained (its last meaningful update was in July 2020). The model is presented in the paper: Fader, P. S., & Hardie, B. G. (2013). [The Gamma-Gamma model of monetary value](http://www.brucehardie.com/notes/025/gamma_gamma.pdf). February, 2, 1-9.

## Prepare Notebook

In [None]:
import arviz as az
import matplotlib.pyplot as plt
import pandas as pd
from lifetimes import GammaGammaFitter

from pymc_marketing import clv

# Plotting configuration
az.style.use("arviz-darkgrid")
plt.rcParams["figure.figsize"] = [10, 6]
plt.rcParams["figure.dpi"] = 100
plt.rcParams["figure.facecolor"] = "white"

%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = "retina"

## Load Data

We start by loading the `CDNOW` dataset.

In [None]:
data_path = "https://raw.githubusercontent.com/pymc-labs/pymc-marketing/main/data/clv_quickstart.csv"

summary_with_money_value = pd.read_csv(data_path)
summary_with_money_value["customer_id"] = summary_with_money_value.index
summary_with_money_value.head()

For the Gamma-Gamma model, we need to filter out customers who have made only one purchase, that is, we keep only the rows with `frequency > 0`.

In [None]:
returning_customers_summary = summary_with_money_value.query("frequency > 0")

returning_customers_summary.head()
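
The dataset above is already summarized per customer. To make the filter more concrete, here is a minimal sketch of how such a summary could be derived from a raw transaction log. It assumes the convention used by `lifetimes`-style summaries, where `frequency` counts repeat purchases and `monetary_value` averages only those repeat purchases. The transaction log and all variable names are hypothetical, and each transaction is assumed to fall in its own time period.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical transaction log: (customer_id, purchase_amount), in time order.
transactions = [
    ("a", 20.0), ("a", 30.0), ("a", 40.0),
    ("b", 15.0),
    ("c", 10.0), ("c", 50.0),
]

# Group purchase amounts by customer, preserving order.
amounts_by_customer = defaultdict(list)
for customer_id, amount in transactions:
    amounts_by_customer[customer_id].append(amount)

summary = {}
for customer_id, amounts in amounts_by_customer.items():
    # Assumption: the first purchase is excluded from both statistics.
    repeat_amounts = amounts[1:]
    summary[customer_id] = {
        "frequency": len(repeat_amounts),
        "monetary_value": fmean(repeat_amounts) if repeat_amounts else 0.0,
    }

# Same filter as above: keep only customers with at least one repeat purchase.
returning = {cid: row for cid, row in summary.items() if row["frequency"] > 0}
print(returning)
```

Under this convention a one-time buyer has no observed average spend to model, which is why such customers are dropped before fitting.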

## Model Specification

Here we briefly describe the assumptions and the parametrization of the Gamma-Gamma model from the paper above.

The model of spend per transaction is based on the following three general assumptions:

- The monetary value of a customer’s given transaction varies randomly around their average transaction value.
- Average transaction values vary across customers but do not vary over time for any given individual.
- The distribution of average transaction values across customers is independent of the transaction process.
  
For a customer with $x$ transactions, let $z_1, z_2, \ldots, z_x$ denote the value of each transaction. The customer’s observed average transaction value is given by

$$
\bar{z} = \frac{1}{x} \sum_{i=1}^{x} z_i
$$

Now let's describe the parametrization: 

1. We assume that $z_i \sim \text{Gamma}(p, \nu)$, with $E(Z_i \mid p, \nu) = \xi = p/\nu$.

    - Given the convolution properties of the gamma distribution, it follows that total spend across $x$ transactions is distributed $\text{Gamma}(px, \nu)$.

    - Given the scaling property of the gamma distribution, it follows that $\bar{z} \sim \text{Gamma}(px, \nu x)$.

2. We assume $\nu \sim \text{Gamma}(q, \gamma)$.

We are interested in estimating the parameters $p$, $q$ and $\gamma$. In the code below, $\gamma$ appears under the name `v`.
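
To make the generative story above concrete, here is a minimal simulation sketch using only the Python standard library. It assumes the shape-rate parametrization implied by $E(Z_i) = p/\nu$. The parameter values are hypothetical and chosen only for illustration; they are not estimates from the CDNOW data. The sketch checks that the simulated mean of $\bar{z}$ is close to the population mean $p\gamma/(q-1)$, which holds for $q > 1$.

```python
import random
import statistics


def simulate_average_spend(p, q, gamma, x, rng):
    """Draw one customer's observed average transaction value z-bar."""
    # Customer-level rate: nu ~ Gamma(shape=q, rate=gamma).
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    nu = rng.gammavariate(q, 1.0 / gamma)
    # Each transaction: z_i ~ Gamma(shape=p, rate=nu).
    spends = [rng.gammavariate(p, 1.0 / nu) for _ in range(x)]
    return sum(spends) / x


# Hypothetical parameter values, chosen only for illustration.
p, q, gamma = 6.0, 4.0, 15.0
rng = random.Random(42)

z_bars = [simulate_average_spend(p, q, gamma, x=3, rng=rng) for _ in range(50_000)]

# For q > 1 the population mean spend is E[p / nu] = p * gamma / (q - 1).
theoretical_mean = p * gamma / (q - 1)
simulated_mean = statistics.fmean(z_bars)
print(f"theoretical: {theoretical_mean:.2f}, simulated: {simulated_mean:.2f}")
```

The heterogeneity in $\nu$ is what makes average spend differ across customers, while $p$ is shared by everyone.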

```{note}
The Gamma-Gamma model assumes that there is no relationship between the monetary value and the purchase frequency. We can check this assumption by calculating the correlation between the average spend and the frequency of purchases.
```

In [None]:
returning_customers_summary[["monetary_value", "frequency"]].corr()

The value of this correlation is close to $0.11$, which in practice is considered low enough to proceed with the model.

## Lifetimes Implementation

First, we fit the model using the `lifetimes` package.

In [None]:
ggf = GammaGammaFitter()
ggf.fit(
    returning_customers_summary["frequency"],
    returning_customers_summary["monetary_value"],
)

In [None]:
ggf.summary

Once the model is fitted we can use the following method to compute the conditional expectation of the average profit per transaction for a group of one or more customers.

In [None]:
avg_profit = ggf.conditional_expected_average_profit(
    summary_with_money_value["frequency"], summary_with_money_value["monetary_value"]
)
avg_profit.head(10)

In [None]:
avg_profit.mean()
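
For intuition, the conditional expectation in the Fader and Hardie note is a weighted average of the population mean spend, $p\gamma/(q-1)$, and the customer's own observed average $\bar{z}$, with weight $px/(px+q-1)$ on the latter. The function below is a minimal sketch of that formula, not the `lifetimes` source code; the function name and parameter values are hypothetical.

```python
def conditional_expected_spend(frequency, monetary_value, p, q, gamma):
    """Posterior mean of a customer's average transaction value (q > 1)."""
    # Population mean spend implied by the model: p * gamma / (q - 1).
    population_mean = p * gamma / (q - 1)
    # Weight on the customer's own observed average; grows with frequency.
    weight = p * frequency / (p * frequency + q - 1)
    return (1 - weight) * population_mean + weight * monetary_value


# Hypothetical parameter values and customers, for illustration only.
p, q, gamma = 6.0, 4.0, 15.0

no_repeat = conditional_expected_spend(0, 0.0, p, q, gamma)
few_repeats = conditional_expected_spend(2, 50.0, p, q, gamma)
many_repeats = conditional_expected_spend(50, 50.0, p, q, gamma)
print(no_repeat, few_repeats, many_repeats)
```

A customer with `frequency` equal to 0 receives the population mean. This is why the prediction can be computed for all customers in `summary_with_money_value`, even though the model is fitted on returning customers only. As the number of repeat purchases grows, the estimate moves toward the customer's own average.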

## PyMC Marketing Implementation

We can use the pre-built PyMC-Marketing implementation of the Gamma-Gamma model, which also provides plotting and prediction methods.

We can first *build* the model to inspect its specification:

In [None]:
model = clv.GammaGammaModel(data=returning_customers_summary)
model.build_model()
model

```{note}
It is not necessary to build the model before fitting it. We can fit the model directly.
```

### Using MAP

To begin with, let's use a numerical optimizer (`L-BFGS-B`) from `scipy.optimize` to find the maximum a posteriori (MAP) estimate of the parameters.

In [None]:
# Despite its name, `idata_map` is a one-row DataFrame holding the MAP point estimates.
idata_map = model.fit(fit_method="map").posterior.to_dataframe()

In [None]:
idata_map

These values are very close to the ones obtained by the `lifetimes` package.

### MCMC

We can also use MCMC to sample from the posterior distribution of the parameters. Unlike MAP, which returns a single point estimate, MCMC characterizes the full posterior and therefore provides uncertainty estimates for the parameters.

In [None]:
sampler_kwargs = {
    "draws": 2_000,
    "target_accept": 0.9,
    "chains": 4,
    "random_seed": 42,
}

idata_mcmc = model.fit(**sampler_kwargs)

In [None]:
idata_mcmc

We can see some statistics of the posterior distribution of the parameters.

In [None]:
model.fit_summary()

Let's visualize the posterior distributions and the rank plot:

In [None]:
axes = az.plot_trace(
    data=model.idata,
    compact=True,
    kind="rank_bars",
    backend_kwargs={"figsize": (12, 9), "layout": "constrained"},
)
plt.gcf().suptitle("Gamma-Gamma Model Trace", fontsize=18, fontweight="bold");

We can compare the results with the ones obtained by the `lifetimes` package and the MAP estimation.

In [None]:
fig, axes = plt.subplots(
    nrows=3, ncols=1, figsize=(12, 10), sharex=False, sharey=False, layout="constrained"
)

for i, var_name in enumerate(["p", "q", "v"]):
    ax = axes[i]
    az.plot_posterior(
        idata_mcmc.posterior[var_name].values.flatten(),
        color="C0",
        point_estimate="mean",
        hdi_prob=0.95,
        ref_val=ggf.summary["coef"][var_name],
        ax=ax,
        label="MCMC",
    )
    ax.axvline(
        x=ggf.summary["lower 95% bound"][var_name],
        color="C1",
        linestyle="--",
        label="lifetimes 95% CI",
    )
    ax.axvline(
        x=ggf.summary["upper 95% bound"][var_name],
        color="C1",
        linestyle="--",
    )
    ax.axvline(x=idata_map[var_name].item(), color="C2", linestyle="-.", label="MAP")
    ax.legend(loc="upper right")

plt.gcf().suptitle("Gamma-Gamma Model Parameters", fontsize=18, fontweight="bold");

We see that the `lifetimes` and MAP estimates are essentially the same. Both of them are close to the mean of the posterior distribution obtained by MCMC.

## Expected Customer Spend

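Once we have the posterior distribution of the parameters, we can use the `expected_customer_spend` method to compute the conditional expectation of the average spend per transaction for a group of one or more customers. Because it is evaluated for every posterior draw, the result is a full posterior distribution per customer rather than a single number.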

In [None]:
expected_spend = model.expected_customer_spend(data=summary_with_money_value)

Let's see how it looks for a subset of customers.

In [None]:
az.summary(expected_spend.isel(customer_id=range(10)), kind="stats")

In [None]:
ax, *_ = az.plot_forest(
    data=expected_spend.isel(customer_id=(range(10))), combined=True, figsize=(8, 7)
)
ax.set(xlabel="Expected Spend (10 Customers)", ylabel="Customer ID")
ax.set_title("Expected Spend", fontsize=18, fontweight="bold");

Finally, let's look at some statistics and the distribution for the whole dataset.

In [None]:
az.summary(expected_spend.mean("customer_id"), kind="stats")

In [None]:
fig, ax = plt.subplots()
az.plot_dist(expected_spend.mean("customer_id"), label="Mean over Customer ID", ax=ax)
ax.axvline(x=expected_spend.mean(), color="black", ls="--", label="Overall Mean")
ax.legend(loc="upper right")
ax.set(xlabel="Expected Spend", ylabel="Density")
ax.set_title("Expected Spend", fontsize=18, fontweight="bold");

In [None]:
%load_ext watermark
%watermark -n -u -v -iv -w -p pymc,pytensor