Merge 613940f into 9af2990

better · Dec 4, 2018 · ef8ac13
2 parents 9af2990 + 613940f
commit ef8ac13

Showing 5 changed files with 43 additions and 2 deletions.
diff --git a/docs/images/conversion.gif b/docs/images/conversion.gif
diff --git a/docs/images/convoys-markov-chain.png b/docs/images/convoys-markov-chain.png
diff --git a/docs/index.rst b/docs/index.rst
@@ -2,6 +2,7 @@
 
 .. include:: examples.rst
 
+.. include:: motivation.rst
 
 Full API documentation
 ======================

diff --git a/docs/introduction.rst b/docs/introduction.rst
@@ -14,7 +14,7 @@ The easiest way right now is to install it straight from Github using Pip:
     pip install -e git://github.com/better/convoys#egg=convoys
 
 
-Motivation
+Background
 ----------
 
 Predicting conversions is a really important problem for ecommerce, online advertising, and many other applications.
@@ -30,7 +30,7 @@ First of all, you can not learn from users that are younger than X.
 You also can not learn from users that convert *after* X.
 For an excellent introduction to this problem (and distributions like the `Weibull distribution <https://en.wikipedia.org/wiki/Weibull_distribution>`_), here's a blog post about `implementing a recurrent neural network to predict churn <https://ragulpr.github.io/2016/12/22/WTTE-RNN-Hackless-churn-modeling/>`_.
 
-Survival analysis to the rescue
+Survival analysis saves the day
 -------------------------------
 
 Luckily, there is a somewhat similar field called `survival analysis <https://en.wikipedia.org/wiki/Survival_analysis>`_.

diff --git a/docs/motivation.rst b/docs/motivation.rst
@@ -0,0 +1,40 @@
+Motivation
+==========
+
+A perfectly valid question is: why does Convoys really only implement one model, a generalized Gamma distribution multiplied by what is basically logistic regression? That seems like a very specialized distribution. To justify this choice, let's first look at a handful of conversion charts from real data at `Better <https://better.com>`_:
+
+.. image:: images/conversion.gif
+
+The legend, labels of the axes, and title are all removed so that no business secrets are revealed. The solid lines with shaded area are all the generalized Gamma fits, whereas the dashed lines are the Kaplan-Meier fits. Note that the fit is very good! In fact, we have observed that almost any conversion metric can be modeled reasonably well with the generalized Gamma model (multiplied by logistic regression).
+
+Empirically, this model seems to hold up pretty well.
+
+Some more mathematical justification
+------------------------------------
+
+A simple toy problem also demonstrates why we would expect to get a time-dependent distribution (like the exponential distribution) multiplied by a logistic function. Consider the `continuous-time Markov chain <https://en.wikipedia.org/wiki/Markov_chain#Continuous-time_Markov_chain>`_ with three states: undecided, converted, and died.
+
+.. image:: images/convoys-markov-chain.png
+   :height: 300px
+   :width: 300px
+
+Everyone starts out "undecided" but eventually either converts (at rate :math:`\lambda_1`) or dies (at rate :math:`\lambda_2`). However, we *only observe the conversions*, not the deaths.
+
+We can solve for the distribution by writing down the state probabilities as a system of ordinary differential equations in :math:`t`. The solution turns out to be quite simple:
+
+.. math::
+   P_{\mbox{converted}}(t) = \frac{\lambda_1}{\lambda_1 + \lambda_2}(1 - \exp(-(\lambda_1 + \lambda_2)t))
+
+As you can see, the solution is the cumulative distribution function of an exponential distribution (the :math:`1 - \exp(-(\lambda_1 + \lambda_2)t)` part) multiplied by a constant factor (the :math:`\lambda_1/(\lambda_1 + \lambda_2)` part).
+
+Turning it into a regression problem
+------------------------------------
+
+Note that :math:`\lambda_1` and :math:`\lambda_2` are positive numbers. For each observation with feature vector :math:`z`, let's write them as exponentials of linear combinations of the features: :math:`\lambda_1 = \exp(a^Tz)` and :math:`\lambda_2 = \exp(b^Tz)`, where :math:`a, b` are two unknown vectors.
+
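+With this transformation, the probability of eventual conversion (that is, as :math:`t \rightarrow \infty`) becomes
+
+.. math::
+   \lim_{t \rightarrow \infty} P_{\mbox{converted}}(t) = \frac{\lambda_1}{\lambda_1 + \lambda_2} = \frac{1}{1 + \exp(-(a-b)^Tz)}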
+
+This is the `sigmoid function <https://en.wikipedia.org/wiki/Sigmoid_function>`_, which means that we are basically doing logistic regression in the limit where :math:`t \rightarrow \infty`.
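
The derivation in the new ``motivation.rst`` can be sanity-checked numerically. The following is a minimal sketch and not part of Convoys: the function names and the values of ``a``, ``b`` and ``z`` are made up for illustration. It integrates the forward equations of the three-state chain with plain Euler steps, compares the result against the closed-form solution, and then checks that the long-run conversion probability matches the sigmoid of :math:`(a-b)^Tz`.

```python
import math


def p_converted_closed_form(t, lam1, lam2):
    """Closed-form P_converted(t) from the toy Markov chain."""
    total = lam1 + lam2
    return lam1 / total * (1.0 - math.exp(-total * t))


def p_converted_euler(t, lam1, lam2, steps=100_000):
    """Integrate the forward equations with explicit Euler steps.

    d(undecided)/dt = -(lam1 + lam2) * undecided
    d(converted)/dt = lam1 * undecided
    """
    dt = t / steps
    undecided, converted = 1.0, 0.0
    for _ in range(steps):
        converted += lam1 * undecided * dt
        undecided -= (lam1 + lam2) * undecided * dt
    return converted


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


# Hypothetical coefficient vectors and feature vector, for illustration only.
a = [0.3, -1.2]
b = [-0.5, 0.4]
z = [1.0, 2.0]

lam1 = math.exp(dot(a, z))  # conversion rate
lam2 = math.exp(dot(b, z))  # death rate

# The numerical ODE solution should agree with the closed form at a finite t.
numeric = p_converted_euler(2.0, lam1, lam2)
exact = p_converted_closed_form(2.0, lam1, lam2)

# The long-run conversion probability should equal sigmoid((a - b)^T z).
limit = lam1 / (lam1 + lam2)
a_minus_b = [ai - bi for ai, bi in zip(a, b)]
logistic = sigmoid(dot(a_minus_b, z))

print(numeric, exact)
print(limit, logistic)
```

Euler integration is used here only to keep the sketch dependency-free; a proper ODE solver would work equally well for this check.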