# Modeling DAU, WAU, and MAU with a Markov chain

DAU, WAU, and MAU (daily, weekly, and monthly active users) are critical business metrics. The article ["How Duolingo reignited user growth"](https://www.lennysnewsletter.com/p/how-duolingo-reignited-user-growth) by [Jorge Mazal](https://www.linkedin.com/in/jorgemazal/), former CPO of Duolingo, is #1 in the Growth section of Lenny's Newsletter blog. In that article Jorge paid special attention to the methodology Duolingo used to model the DAU metric (see also the article ["Meaningful metrics: how data sharpened the focus of product teams"](https://blog.duolingo.com/growth-model-duolingo/) by [Erin Gustafson](https://blog.duolingo.com/author/erin/)). This methodology has multiple strengths, but here I focus on how one can use it for DAU forecasting.

The new year is coming soon, so many companies are planning next year's budgets right now. Cost estimates often require a DAU forecast. In this article I'll show you how to get this prediction using Duolingo's growth model, and share a DAU & MAU "calculator" built as a Google Spreadsheet.

## Methodology

A quick recap of how [Duolingo's growth model](https://blog.duolingo.com/growth-model-duolingo/) works. On day $d$ ($d = 1, 2, \ldots$) of a user's lifetime, the user can be in one of the following 7 mutually exclusive states:

<table>
<thead><tr><th>state</th><th>d = 1</th><th>is active today</th><th>was active in [d-6, d-1]</th><th>was active in [d-29, d-7]</th><th>was active before d-30</th></tr></thead>
<tr><td>new</td><td>✅</td><td>❓</td><td>❌</td><td>❌</td><td>❌</td></tr>
<tr><td>current</td><td>❌</td><td>✅</td><td>✅</td><td>❓</td><td>❓</td></tr>
<tr><td>reactivated</td><td>❌</td><td>✅</td><td>❌</td><td>✅</td><td>❓</td></tr>
<tr><td>resurrected</td><td>❌</td><td>✅</td><td>❌</td><td>❌</td><td>✅</td></tr>
<tr><td>at_risk_wau</td><td>❌</td><td>❌</td><td>✅</td><td>❓</td><td>❓</td></tr>
<tr><td>at_risk_mau</td><td>❌</td><td>❌</td><td>❌</td><td>✅</td><td>❓</td></tr>
<tr><td>dormant</td><td>❌</td><td>❌</td><td>❌</td><td>❌</td><td>✅</td></tr>
</table>
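
To make the table concrete, here is a minimal Python sketch of the classification. The function name `user_state` and the representation of a user's history as a set of active lifetime days are illustrative assumptions, not part of Duolingo's description. The sketch also assumes that any activity earlier than day $d-29$ counts towards the last column of the table.

```python
def user_state(active_days: set, d: int) -> str:
    """Classify a user on day d of their lifetime (day 1 = first active day).

    active_days: lifetime day numbers on which the user was active
    (hypothetical input format, chosen for illustration).
    """
    if d == 1:
        return "new"

    active_today = d in active_days
    # Activity in [d-6, d-1]
    active_last_week = any(k in active_days for k in range(d - 6, d))
    # Activity in [d-29, d-7]
    active_last_month = any(k in active_days for k in range(d - 29, d - 6))

    if active_today:
        if active_last_week:
            return "current"
        if active_last_month:
            return "reactivated"
        # Day 1 is always active, so some earlier activity must exist.
        return "resurrected"

    if active_last_week:
        return "at_risk_wau"
    if active_last_month:
        return "at_risk_mau"
    return "dormant"


# Usage: a user active on lifetime days 1, 2, 10 and 45.
history = {1, 2, 10, 45}
trajectory = [user_state(history, d) for d in range(1, 46)]
```

Applying the function to every lifetime day yields the sequence of states for one user.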

Having these states defined (as set $S$), we can consider a user's lifetime trajectory as a Markov chain. Let $M$ be the transition matrix associated with this Markov chain: $m_{i, j} = P(s_j \mid s_i)$ is the probability that a user moves to state $s_j$ right after being in state $s_i$, where $s_i, s_j \in S$. The matrix values can be estimated directly from historical data.
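
As a rough illustration of how $M$ could be estimated, the sketch below counts observed day-to-day transitions and normalizes each row so that it sums to 1. The input format (one list of consecutive daily states per user) and the names `STATES` and `estimate_transition_matrix` are assumptions made for this example.

```python
from collections import Counter

STATES = ["new", "current", "reactivated", "resurrected",
          "at_risk_wau", "at_risk_mau", "dormant"]


def estimate_transition_matrix(trajectories):
    """Return M as a list of rows, where M[i][j] estimates P(s_j | s_i)."""
    counts = Counter()
    for states in trajectories:
        # Pair each day's state with the next day's state.
        counts.update(zip(states, states[1:]))

    M = []
    for s_i in STATES:
        row_total = sum(counts[(s_i, s_j)] for s_j in STATES)
        # A state never observed as a source gets an all-zero row.
        M.append([counts[(s_i, s_j)] / row_total if row_total else 0.0
                  for s_j in STATES])
    return M


# Usage with two tiny made-up trajectories.
toy_trajectories = [
    ["new", "current", "current", "at_risk_wau"],
    ["new", "at_risk_wau", "at_risk_wau"],
]
M = estimate_transition_matrix(toy_trajectories)
```

With real data, the counting would typically be done over a chosen historical window, and the choice of that window affects the estimate.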

The beauty and simplicity of this approach is that the matrix $M$ fully describes the states of all users in the future. Suppose that a vector $u_0$ of length 7 contains the counts of users in each state on some calendar day, denoted as day 0. Then, according to the Markov model, on the next day we expect the following user counts $u_1$:

$$
\underbrace{
\begin{pmatrix}  \#New_1 \\ \#Current_1 \\ \#Reactivated_1 \\ \#Resurrected_1 \\ \#AtRiskWau_1 \\ \#AtRiskMau_1 \\ \#Dormant_1 \end{pmatrix}
}_{u_1} = M^T \cdot 
\underbrace{
\begin{pmatrix}  \#New_0 \\ \#Current_0 \\ \#Reactivated_0 \\ \#Resurrected_0 \\ \#AtRiskWau_0 \\ \#AtRiskMau_0 \\ \#Dormant_0 \end{pmatrix}
}_{u_0}
$$

Applying this formula recursively, we derive the number of users in each state on any day $t > 0$ in the future. The only thing we need to provide besides the initial distribution $u_0$ is the number of new users who will appear in the product on each future day: no state transitions into `new`, so the product $M^T u_t$ cannot produce these users on its own. We'll get this number by applying the [prophet](http://facebook.github.io/prophet/) library to historical data on new users.

Now, having $u_t$ calculated, we can calculate the DAU value on day $t$:
$$DAU_t = \#New_t + \#Current_t + \#Reactivated_t + \#Resurrected_t.$$

Additionally, we can easily calculate WAU and MAU metrics:
$$WAU_t = DAU_t +\#AtRiskWau_t,$$
$$MAU_t = DAU_t +\#AtRiskWau_t + \#AtRiskMau_t.$$
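
The three formulas translate directly into code. The sketch below assumes the state counts are stored in a dictionary keyed by state name; the function name `activity_metrics` and the numbers in the usage example are made up for illustration.

```python
def activity_metrics(u: dict) -> dict:
    """Compute DAU, WAU and MAU from the per-state user counts u."""
    dau = u["new"] + u["current"] + u["reactivated"] + u["resurrected"]
    wau = dau + u["at_risk_wau"]
    mau = wau + u["at_risk_mau"]  # same as DAU + AtRiskWau + AtRiskMau
    return {"dau": dau, "wau": wau, "mau": mau}


# Usage with made-up counts.
u_example = {
    "new": 10, "current": 100, "reactivated": 5, "resurrected": 2,
    "at_risk_wau": 50, "at_risk_mau": 80, "dormant": 500,
}
metrics = activity_metrics(u_example)
```

Dormant users do not contribute to any of the three metrics.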

Finally, the algorithm looks like this:

1. Calculate the initial counts $u_0$ for the day right before the prediction period.
2. Estimate the transition matrix $M$ from historical data.
3. For each prediction day $t=1, \ldots, T$, calculate the expected number of new users $\#New_1, \ldots, \#New_T$.
4. Recursively calculate $u_{t+1} = M^T u_t$.
5. Calculate DAU, WAU, and MAU for each prediction day $t=1, \ldots, T$.
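
The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names, the toy transition matrix, the initial counts, and the flat new-user forecast are all invented for the example. In particular, the sketch assumes that the $\#New_t$ component is overwritten with the forecast value after each multiplication, since no state transitions into `new`.

```python
STATES = ["new", "current", "reactivated", "resurrected",
          "at_risk_wau", "at_risk_mau", "dormant"]


def step(M, u):
    """One day forward: (M^T u)[j] = sum over i of M[i][j] * u[i]."""
    n = len(u)
    return [sum(M[i][j] * u[i] for i in range(n)) for j in range(n)]


def forecast(M, u0, new_users):
    """Return DAU/WAU/MAU for each prediction day t = 1..T."""
    u = list(u0)
    result = []
    for new_t in new_users:
        u = step(M, u)
        u[0] = new_t  # assumption: inject the forecast number of new users
        dau = sum(u[:4])       # new + current + reactivated + resurrected
        wau = dau + u[4]       # + at_risk_wau
        mau = wau + u[5]       # + at_risk_mau
        result.append({"dau": dau, "wau": wau, "mau": mau})
    return result


# Toy transition matrix (made-up numbers); rows and columns follow STATES.
M_toy = [
    [0, 0.5, 0,    0,    0.5, 0,    0],     # new
    [0, 0.8, 0,    0,    0.2, 0,    0],     # current
    [0, 0.6, 0,    0,    0.4, 0,    0],     # reactivated
    [0, 0.6, 0,    0,    0.4, 0,    0],     # resurrected
    [0, 0.1, 0,    0,    0.8, 0.1,  0],     # at_risk_wau
    [0, 0,   0.02, 0,    0,   0.93, 0.05],  # at_risk_mau
    [0, 0,   0,    0.01, 0,   0,    0.99],  # dormant
]
u0_toy = [100, 1000, 50, 50, 500, 1000, 10000]
predictions = forecast(M_toy, u0_toy, new_users=[100, 100])
```

Each row of the matrix sums to 1, so a multiplication step only redistributes existing users between states; the total grows solely through the injected new users.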

## Getting the states



## Predicting new users amount

## Predicting DAU

## Prediction testing

## Discussion

## Conclusions

