# The Kelly Criterion without the Log Transformation

Of the three main bet sizing methods in finance, the Kelly Criterion is arguably the most fun, perhaps because of its use in sports betting. (The other two are Markowitz optimization and the Gordon Growth model.) People have been trying to extend it for decades, and it is a popular discussion topic among investors. Warren Buffett once said:

>I have 2 views on diversification. If you are a professional and have confidence, then I would advocate lots of concentration. For everyone else, if it's not your game, participate in total diversification. So this means that professionals use Kelly and amateurs better off with index funds following the capital asset pricing model.

Kelly made an interesting modeling decision in his original 1956 publication: he wrapped potential gains inside a logarithm. Without fail, every paper since has used the same transformation. The main argument in the earliest papers was that the transform implied working with growth rates instead of resulting wealth. I'm going to show that this decision led to some awkward behavior and that removing it solves those issues.

There was one interesting discussion between the economist Paul Samuelson and Professor William Ziemba, which ran from 2008 to 2012, about how the logarithmic utility used by Kelly was not sound economics. You can find the conversation <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2161068">here</a>. This was also published in the Journal of Portfolio Management in 2015.

___
### Standard derivation of the Kelly Criterion

First, we derive the standard Kelly betting fraction.  

We assume that a proportion $f$ of your investment cash is put on each bet. The probability of winning is $\rho$ and is known with certainty. If we win a bet, we gain $b$ times the amount staked; if we lose, we lose the stake. We want to find the $f$ that maximizes our future growth.

After $N$ bets, of which $W$ are wins and $L$ are losses:

$V_T = V_o (1 + f\ b)^W (1 - f)^L$

If we divide by $V_o$ and take the $N^{th}$ root, we find the approximate return per bet (for large $N$, $W/N \approx \rho$ and $L/N \approx 1 - \rho$):

$\left(\frac{V_T}{V_o}\right)^{1/N} \approx (1 + f\ b)^\rho (1 - f)^{(1 - \rho)}$

Taking the natural logarithm:

$\frac{1}{N}(\ln(V_T) - \ln(V_o)) \approx \rho\ \ln(1 + f\ b) + (1 - \rho)\ \ln(1 - f)$


The optimal betting fraction is derived by taking the first derivative with respect to $f$ and setting it equal to zero:

$f = \rho - \frac{q}{b}$

where $q \equiv 1 - \rho$.
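
As a quick numeric sanity check of this formula, the sketch below compares the closed form with a brute-force search over the expected log growth. The helper names `kelly_fraction` and `log_growth` and the example values (a 60% win probability at even odds) are illustrative assumptions, not part of the original derivation.

```python
import math

def kelly_fraction(p, b):
    """Closed-form Kelly fraction f = p - q/b for a win-b / lose-the-stake bet."""
    q = 1.0 - p
    return p - q / b

def log_growth(f, p, b):
    """Expected log growth per bet: p*ln(1 + f*b) + q*ln(1 - f)."""
    return p * math.log(1.0 + f * b) + (1.0 - p) * math.log(1.0 - f)

p, b = 0.6, 1.0  # hypothetical example: 60% win probability, even odds
f_closed = kelly_fraction(p, b)

# brute-force search over a grid of fractions strictly inside (0, 1)
grid = [i / 1000 for i in range(1, 1000)]
f_grid = max(grid, key=lambda f: log_growth(f, p, b))

print(f_closed, f_grid)
```

Both values should land at, or very near, a fraction of 0.2 for these inputs.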

The following cells repeat this derivation more formally using SymPy, extending the result to situations where losses can be less than 100%.


In [39]:
import sympy as sp

f, b1, b2, 𝜌, q = sp.symbols('f b1 b2 𝜌 q')
kelly = 𝜌 * sp.ln(1 + b1 * f) + q * sp.ln(1 + b2 * f)

kelly

q*log(b2*f + 1) + 𝜌*log(b1*f + 1)

In [40]:
# taking the first derivative
first = sp.diff(kelly, f)
solution = sp.simplify(sp.solve(first, f)[0])

# a little light formatting to make it look more familiar
solution = solution.subs(𝜌+q, 1)
solution = sp.simplify(solution)

solution

-𝜌/b2 - q/b1

If we set $b_2$ equal to $-1$ and $b_1$ equal to $b$, we recover the solution from the standard derivation above.
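
A minimal sketch of that substitution, plus one partial-loss case, is shown here. The ASCII symbol name `rho` stands in for the notebook's `𝜌`, and the partial-loss numbers (win 100% with probability 3/5, lose 50% otherwise) are hypothetical values chosen only for illustration.

```python
import sympy as sp

b, b1, b2, rho, q = sp.symbols('b b1 b2 rho q')

# general stationary point from the cell above
general = -rho / b2 - q / b1

# classic case: win b per unit staked, lose the whole stake
classic = general.subs({b1: b, b2: -1})

# hypothetical partial-loss bet: win 100% with probability 3/5, lose 50% otherwise
partial = general.subs({rho: sp.Rational(3, 5), q: sp.Rational(2, 5),
                        b1: 1, b2: sp.Rational(-1, 2)})

print(classic)
print(partial)
```

For the partial-loss bet, the fraction works out to 4/5, larger than the 1/5 the classic formula gives for the same probabilities at even odds, because only half the stake is at risk.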
___
Of more interest is the second condition for a maximum: the second derivative must be negative. The next cell shows that the second derivative is negative everywhere, so every solution is a maximum.

In [42]:
second = sp.diff(first, f)
second

-b1**2*𝜌/(b1*f + 1)**2 - b2**2*q/(b2*f + 1)**2

This is a nonsensical result. We can see this by providing non-standard values for $b_1$ and $b_2$. For example, if both outcomes lead to a loss, such as $b_1 = -0.5$ and $b_2 = -1.0$, the Kelly criterion will always give a positive betting fraction. If both outcomes lead to a gain, the Kelly criterion recommends a short position.
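
To make the two cases concrete, the sketch below plugs numbers into the stationary point $-\rho/b_2 - q/b_1$. The loss values are the ones quoted above; the probabilities (3/5 and 2/5) and the mirror-image gain values are hypothetical choices for illustration.

```python
import sympy as sp

b1, b2, rho, q = sp.symbols('b1 b2 rho q')
stationary = -rho / b2 - q / b1  # stationary point of the log-transformed objective

probs = {rho: sp.Rational(3, 5), q: sp.Rational(2, 5)}  # hypothetical probabilities

# both outcomes lose: b1 = -0.5, b2 = -1.0 (values from the text)
both_lose = stationary.subs({b1: sp.Rational(-1, 2), b2: -1}).subs(probs)

# both outcomes win: b1 = 0.5, b2 = 1.0 (hypothetical mirror-image values)
both_win = stationary.subs({b1: sp.Rational(1, 2), b2: 1}).subs(probs)

print(both_lose, both_win)
```

With these inputs the losing bet gets a positive fraction (7/5) and the winning bet gets a negative one (-7/5), which is the awkward behavior described above.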

This is a direct result of the logarithmic transform. In the next cells, we will remove the transform and repeat the derivations. We will find that:

- we will achieve the same first derivative result
- the second derivative result will actually make sense
- further, we will find that the second derivative is negative if and only if $b_1$ and $b_2$ have opposite signs.

### The alternative derivation

In [44]:
alt_kelly = (1 + b1 * f)**𝜌 * (1 + b2 * f)**q

alt_kelly

(b1*f + 1)**𝜌*(b2*f + 1)**q

In [45]:
alt_first = sp.diff(alt_kelly, f)
alt_solution = sp.simplify(sp.solve(alt_first, f)[0])

# a little light formatting to make it look more familiar
alt_solution = alt_solution.subs(𝜌+q, 1)
alt_solution = sp.simplify(alt_solution)

alt_solution

-𝜌/b2 - q/b1

This is exactly the same solution obtained for the logarithm-transformed Kelly Criterion.
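
One way to see why the two stationary points coincide is the chain rule: the untransformed objective is the exponential of the log objective, so its first derivative is the log objective's derivative times a factor that is positive whenever both $(1 + b_i f)$ are positive. The sketch below spot-checks that identity numerically. The symbol name `rho` and the evaluation point are assumptions made for illustration.

```python
import sympy as sp

f, b1, b2, rho, q = sp.symbols('f b1 b2 rho q')

log_kelly = rho * sp.ln(1 + b1 * f) + q * sp.ln(1 + b2 * f)
alt_kelly = (1 + b1 * f)**rho * (1 + b2 * f)**q

# chain rule: d/df exp(h) = exp(h) * h', so this difference should vanish
residual = sp.diff(alt_kelly, f) - alt_kelly * sp.diff(log_kelly, f)

# hypothetical evaluation point where both factors (1 + b_i*f) are positive
point = {f: 0.1, b1: 1.0, b2: -0.5, rho: 0.6, q: 0.4}
residual_value = float(residual.subs(point))

print(residual_value)
```

The printed residual should be zero up to floating-point rounding, so both first derivatives vanish at the same $f$.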
___

In [47]:
# a little Sympy manipulation to make things look nice and then we show the actual first derivative

a = alt_first.as_ordered_terms()
alt_first = sp.simplify(a[0]) + sp.simplify(a[1])
alt_first

b1*𝜌*(b1*f + 1)**(𝜌 - 1)*(b2*f + 1)**q + b2*q*(b1*f + 1)**𝜌*(b2*f + 1)**(q - 1)

Before we move to the second derivative, a comment on the next cell. SymPy does not always make things easier to read, and it often leaves simple adjustments undone. Much of what is done here is for clarity of the end result.

Also, we will divide the second derivative by positive factors a few times to remove clutter.  We are only interested in the final result of knowing when the second derivative is negative.

Finally, we will assume that $(1 + b_i * f)$ is positive.  The negative and zero solutions are not valid in this context.

In [64]:
alt_second = sp.diff(alt_first, f)

a = alt_second.as_ordered_terms()
a = [x / 𝜌 / q for x in a]

bet1 = (1 + b1 * f)
bet2 = (1 + b2 * f)

# this is the most important division, removing most of the additional factors in the equation 
a = [x / bet1**(𝜌 - 1) / bet2**(-𝜌) for x in a]
a = [x.subs(𝜌 - 1, -q) for x in a]
a = [sp.simplify(x) for x in a]
a = [x.subs(𝜌 + q, 1) for x in a]

# in this step, we substitute in the solution for the extremum from the first derivative
a = [x.subs(f, alt_solution) for x in a]
a = [sp.simplify(x) for x in a]

# the rest is simply formatting for a more pleasant end result
a = [x.subs(b1 - 𝜌*b1, q*b1) for x in a]
a = [x.subs(q*b2 - b2, -𝜌*b2) for x in a]
a = [sp.simplify(x) for x in a]

a = [x.subs(1 - q, 𝜌) for x in a]
a = [sp.simplify(x) for x in a]


alt_second = sum(a)
alt_second = sp.simplify(alt_second)
alt_second = alt_second.subs(𝜌**2 + 2*𝜌*q + q**2, 1)
alt_second

b1*b2/(q*𝜌)

We see that the second derivative is proportional to $b_1 b_2$, implying that the solution is a maximum if and only if $b_1$ and $b_2$ have different signs. This is a very satisfying result. Even though log returns are more useful for statistical analysis, I have yet to find a mathematical derivation where removing the logarithm does not work.
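
The sign claim can be spot-checked numerically without any of the symbolic clean-up. The sketch below differentiates the untransformed objective twice and evaluates it at the stationary point. The helper name `second_at_stationary` and the example bet (win 100% or lose 50%, with a win probability of 0.6) are hypothetical.

```python
import sympy as sp

f, b1, b2, rho, q = sp.symbols('f b1 b2 rho q')
alt_kelly = (1 + b1 * f)**rho * (1 + b2 * f)**q
second = sp.diff(alt_kelly, f, 2)  # full second derivative, no simplification

def second_at_stationary(b1_val, b2_val, rho_val):
    """Evaluate the second derivative at f* = -rho/b2 - q/b1 for numeric inputs."""
    q_val = 1 - rho_val
    f_star = -rho_val / b2_val - q_val / b1_val
    values = {f: f_star, b1: b1_val, b2: b2_val, rho: rho_val, q: q_val}
    return f_star, float(second.subs(values))

# hypothetical opposite-sign bet: win 100% or lose 50%, win probability 0.6
f_star, curvature = second_at_stationary(1.0, -0.5, 0.6)

print(f_star, curvature)
```

Here both factors $(1 + b_i f)$ are positive at the stationary point and the curvature comes out negative, consistent with a maximum when $b_1$ and $b_2$ have opposite signs.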

In [66]:
# for completeness, we'll revert to the full second derivative and will display it
alt_second = alt_second * 𝜌 * q
alt_second = alt_second * bet1**(-q) * bet2**(-𝜌) 
alt_second

b1*b2/((b1*f + 1)**q*(b2*f + 1)**𝜌)