Merge 586eb4c into 9af2990
erikbern committed Dec 4, 2018
2 parents 9af2990 + 586eb4c commit 9bc305b
Showing 7 changed files with 65 additions and 2 deletions.
Binary file added docs/images/conversion.gif
Binary file added docs/images/exponential-markov-chain.png
Binary file added docs/images/gamma-markov-chain.png
Binary file added docs/images/probability-distributions.png
1 change: 1 addition & 0 deletions docs/index.rst
@@ -2,6 +2,7 @@

.. include:: examples.rst

.. include:: motivation.rst

Full API documentation
======================
4 changes: 2 additions & 2 deletions docs/introduction.rst
@@ -14,7 +14,7 @@ The easiest way right now is to install it straight from Github using Pip:
pip install -e git://github.com/better/convoys#egg=convoys


Motivation
Background
----------

Predicting conversions is a really important problem for ecommerce, online advertising, and many other applications.
@@ -30,7 +30,7 @@ First of all, you can not learn from users that are younger than X.
You also can not learn from users that convert *after* X.
For an excellent introduction to this problem (and distributions like the `Weibull distribution <https://en.wikipedia.org/wiki/Weibull_distribution>`_), here's a blog post about `implementing a recurrent neural network to predict churn <https://ragulpr.github.io/2016/12/22/WTTE-RNN-Hackless-churn-modeling/>`_.

Survival analysis to the rescue
Survival analysis saves the day
-------------------------------

Luckily, there is a somewhat similar field called `survival analysis <https://en.wikipedia.org/wiki/Survival_analysis>`_.
62 changes: 62 additions & 0 deletions docs/motivation.rst
@@ -0,0 +1,62 @@
Motivation
==========

A perfectly valid question is why Convoys really only implements one model: a generalized gamma distribution multiplied by what is basically logistic regression. That seems like a very specialized distribution. To justify this choice, let's first look at a handful of conversion charts from real data at `Better <https://better.com>`_:

.. image:: images/conversion.gif
    :align: center

The legend, axis labels, and title are all removed so that no business secrets are revealed. The solid lines with shaded areas are the generalized gamma fits, whereas the dashed lines are the Kaplan-Meier estimates. Note that the fit is very good! In fact, we have observed that almost any conversion metric can be modeled reasonably well with the generalized gamma model (multiplied by logistic regression).

Empirically, this model seems to hold up pretty well.

Some more mathematical justification
------------------------------------

A simple toy problem also demonstrates why we would expect to get a time-dependent distribution (like the exponential distribution) multiplied by a logistic function. Consider the `continuous-time Markov chain <https://en.wikipedia.org/wiki/Markov_chain#Continuous-time_Markov_chain>`_ with three states: undecided, converted, or dead.

.. image:: images/exponential-markov-chain.png
    :align: center
    :height: 200px

Everyone starts out "undecided" but either converts or dies with rates :math:`\lambda_1` and :math:`\lambda_2`, respectively. However, we *only observe the conversions,* not the deaths. We can solve for the distribution by writing the process as a system of ordinary differential equations. The undecided state is depleted at the combined rate :math:`\lambda_1 + \lambda_2`, and it feeds the other two states:

.. math::

    \frac{d P_{\text{undecided}}(t)}{d t} = -(\lambda_1 + \lambda_2) P_{\text{undecided}}(t) \\
    \frac{d P_{\text{converted}}(t)}{d t} = \lambda_1 P_{\text{undecided}}(t) \\
    \frac{d P_{\text{dead}}(t)}{d t} = \lambda_2 P_{\text{undecided}}(t)

The solution turns out to be quite simple:

.. math::

    P_{\text{converted}}(t) = \frac{\lambda_1}{\lambda_1 + \lambda_2}\left(1 - \exp(-(\lambda_1 + \lambda_2)t)\right)

As you can see, the solution is the cumulative distribution function of an exponential distribution (the :math:`1 - \exp(-(\lambda_1 + \lambda_2)t)` part) multiplied by a constant factor (the :math:`\lambda_1/(\lambda_1 + \lambda_2)` part).
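
As a quick sanity check, the closed-form solution can be compared against a direct numerical integration of the rate equations. The sketch below is illustrative only and is not part of Convoys: the function names and the rates 0.3 and 0.7 are made up for this example, and it relies on the undecided state draining at the total rate :math:`\lambda_1 + \lambda_2`.

```python
import math


def converted_closed_form(t, lambda_1, lambda_2):
    """Closed-form P_converted(t): a constant factor times an exponential CDF."""
    total = lambda_1 + lambda_2
    return lambda_1 / total * (1.0 - math.exp(-total * t))


def converted_numeric(t, lambda_1, lambda_2, steps=100_000):
    """Integrate the rate equations with forward Euler steps."""
    dt = t / steps
    undecided, converted = 1.0, 0.0  # everyone starts out undecided
    for _ in range(steps):
        # Conversions flow out of the undecided state at rate lambda_1 ...
        converted += lambda_1 * undecided * dt
        # ... while the undecided state drains at the combined rate.
        undecided -= (lambda_1 + lambda_2) * undecided * dt
    return converted


# Hypothetical rates, chosen only for illustration.
lambda_1, lambda_2 = 0.3, 0.7
for t in (0.5, 2.0, 10.0):
    exact = converted_closed_form(t, lambda_1, lambda_2)
    approx = converted_numeric(t, lambda_1, lambda_2)
    print(f"t={t:5.1f}  closed form={exact:.4f}  numeric={approx:.4f}")
```

For large :math:`t` both values approach :math:`\lambda_1/(\lambda_1 + \lambda_2)`, the fraction of users who eventually convert.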

Turning it into a regression problem
------------------------------------

Note that :math:`\lambda_1` and :math:`\lambda_2` are positive numbers. For each observation :math:`z`, let's set :math:`\lambda_1 = \exp(a^Tz)` and :math:`\lambda_2 = \exp(b^Tz)` where :math:`a, b` are two unknown vectors.

With this transformation, the probability of conversion in the limit :math:`t \rightarrow \infty` becomes

.. math::

    P_{\text{converted}}(t \rightarrow \infty) = \frac{\lambda_1}{\lambda_1 + \lambda_2} = \frac{1}{1 + \exp(-(a-b)^Tz)}

This is the `sigmoid function <https://en.wikipedia.org/wiki/Sigmoid_function>`_. If you set :math:`\beta = a - b` then it turns into ordinary logistic regression where :math:`\beta` is the vector of unknown feature weights that we are trying to learn. This shows that the regression method in Convoys turns into logistic regression in the limit where :math:`t \rightarrow \infty`.
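
The algebra behind this step can be checked numerically. The following sketch is not Convoys code: the helper names and the vectors ``a``, ``b`` and ``z`` are hypothetical values picked for illustration. It computes the limiting conversion probability once from the two rates and once from the sigmoid of :math:`(a-b)^Tz`.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def limit_from_rates(a, b, z):
    """lambda_1 / (lambda_1 + lambda_2) with log-linear rates."""
    lambda_1 = math.exp(dot(a, z))
    lambda_2 = math.exp(dot(b, z))
    return lambda_1 / (lambda_1 + lambda_2)


def limit_from_sigmoid(a, b, z):
    """Sigmoid of beta^T z, where beta = a - b."""
    beta = [ai - bi for ai, bi in zip(a, b)]
    return 1.0 / (1.0 + math.exp(-dot(beta, z)))


# Hypothetical weights and features, for illustration only.
a = [0.2, -1.0, 0.5]
b = [0.7, 0.3, -0.4]
z = [1.0, 2.0, -0.5]
print(limit_from_rates(a, b, z), limit_from_sigmoid(a, b, z))
```

The two printed values agree up to floating-point rounding, which is the sense in which only the difference :math:`a - b` matters for the final conversion rate.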

Weibull, gamma, and generalized gamma distributions
---------------------------------------------------

Moving on from exponential distributions, there are some good reasons we would want a bit more flexibility with the conversion rates. The `Weibull distribution <https://en.wikipedia.org/wiki/Weibull_distribution>`_ adds a single parameter and is widely used in time-to-event analysis. Empirically, the Weibull model seems to fit a large range of applications, where the common pattern is that conversions start immediately at :math:`t=0`.

Another class of processes models the behavior where there might be some internal states between "undecided" and "converted" that cause conversions not to start immediately. The sum of multiple independent, exponentially distributed waiting times with the same rate follows a `gamma distribution <https://en.wikipedia.org/wiki/Gamma_distribution>`_. It also requires one more parameter than the exponential distribution.

.. image:: images/gamma-markov-chain.png
    :align: center
    :height: 300px
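
To illustrate the chain of internal states, the sketch below simulates users who must pass through ``k`` exponential stages before converting and compares the simulated conversion curve with the gamma distribution. It is a toy example rather than anything Convoys does internally: it assumes all stages share the same rate (so the gamma distribution has integer shape, also known as an Erlang distribution), ignores the "dead" state, and uses made-up values for ``k`` and ``rate``.

```python
import math
import random


def erlang_cdf(t, k, rate):
    """CDF of a gamma distribution with integer shape k and the given rate."""
    x = rate * t
    return 1.0 - math.exp(-x) * sum(x ** n / math.factorial(n) for n in range(k))


def simulate_time_to_convert(k, rate, rng):
    """Total waiting time to pass through k exponential stages in sequence."""
    return sum(rng.expovariate(rate) for _ in range(k))


def empirical_cdf(t, samples):
    """Fraction of simulated users who have converted by time t."""
    return sum(s <= t for s in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
k, rate = 3, 1.5        # hypothetical values, for illustration only
samples = [simulate_time_to_convert(k, rate, rng) for _ in range(50_000)]
for t in (0.5, 2.0, 5.0):
    simulated = empirical_cdf(t, samples)
    print(f"t={t:.1f}  simulated={simulated:.3f}  gamma={erlang_cdf(t, k, rate):.3f}")
```

Unlike the exponential case, the curve starts out flat near :math:`t=0`, because a user has to clear every intermediate stage before a conversion can be observed.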

Finally, the generalized gamma distribution unifies the Weibull and the gamma distribution, and requires two more parameters than the exponential distribution. The relationship between all four distributions can be summarized in this chart:

.. image:: images/probability-distributions.png
    :align: center
    :height: 150px
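
The relationship between the four distributions can also be checked numerically. The sketch below assumes one common parameterization of the generalized gamma distribution, in which the cumulative distribution function is :math:`P(k, (\lambda t)^p)` with :math:`P` the regularized lower incomplete gamma function. That parameterization, the function names, and the parameter values are assumptions made for this example, not something taken from the text above. Setting :math:`k=1` should then give the Weibull distribution, :math:`p=1` the gamma distribution, and both together the exponential distribution.

```python
import math


def regularized_lower_gamma(k, x):
    """Regularized lower incomplete gamma function P(k, x) via its power series.

    Intended for moderate x only; this is a sketch, not a robust implementation.
    """
    if x <= 0.0:
        return 0.0
    # First term of the series: x^k * exp(-x) / Gamma(k + 1)
    term = math.exp(k * math.log(x) - x - math.lgamma(k + 1.0))
    total = term
    n = 1
    while term > 1e-17 * total:
        term *= x / (k + n)
        total += term
        n += 1
    return total


def generalized_gamma_cdf(t, k, lam, p):
    """Generalized gamma CDF under the assumed parameterization P(k, (lam * t)^p)."""
    return regularized_lower_gamma(k, (lam * t) ** p)


t, lam = 2.0, 0.5  # hypothetical values, for illustration only

# k = 1 reduces to the Weibull CDF 1 - exp(-(lam * t)^p)
print(generalized_gamma_cdf(t, 1.0, lam, 1.5), 1.0 - math.exp(-((lam * t) ** 1.5)))

# p = 1 with k = 3 reduces to a gamma CDF (written out in closed form for integer k)
x = lam * t
print(generalized_gamma_cdf(t, 3.0, lam, 1.0), 1.0 - math.exp(-x) * (1.0 + x + x * x / 2.0))

# k = 1 and p = 1 reduce to the exponential CDF 1 - exp(-lam * t)
print(generalized_gamma_cdf(t, 1.0, lam, 1.0), 1.0 - math.exp(-lam * t))
```

Each printed pair should agree up to floating-point rounding.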
