# Extending Kelly Betting Strategies using Bayesian Methods

This notebook extends the original betting strategies with a Bayesian running estimate of the probability of flipping heads. Bayesian methods are a natural fit here because they can work well with small amounts of data.

I explore placing a conjugate prior on the Binomial model used by the `KellyWithAnEstimate` class and compare the resulting wealth growth rates with those of the strategies proposed in the YouTube video and the original notebook.

In [1]:
import pandas as pd 
import numpy as np

from dataclasses import dataclass

from typing import Union

import altair as alt

import betting_strategies as bs

alt.data_transformers.disable_max_rows()

DataTransformerRegistry.enable('default')

In [2]:
%load_ext nb_black


In [3]:
# Args from previous notebook
prob_heads = 0.7
payout_ratio = 4 / 6
N_flips = 100
N_games = 1000
initial_wealth = 50


args = dict(
    prob_heads=prob_heads,
    payout_ratio=payout_ratio,
    N_flips=N_flips,
    N_games=N_games,
    initial_wealth=initial_wealth,
)


# Kelly Betting with a Bayesian Estimate

This section defines the conjugate prior and the class that implements the strategy.
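
For reference, the conjugate update that the `Beta` class relies on can be written out. With a Beta prior on the probability of heads $p$ and $x$ heads observed in $n$ flips, the posterior is again a Beta distribution. The symbols $\alpha$, $\beta$, $n$, and $x$ mirror the `alpha`, `beta`, `n`, and `x` names used in the code.

$$
p \sim \text{Beta}(\alpha, \beta), \quad x \mid p \sim \text{Binomial}(n, p) \implies p \mid x \sim \text{Beta}(\alpha + x,\ \beta + n - x)
$$

$$
\mathbb{E}[p \mid x] = \frac{\alpha + x}{\alpha + \beta + n}
$$

The posterior mean is the point estimate of $p$ that gets plugged into the Kelly bet.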

In [4]:
@dataclass 
class Beta: 
    """Class to represent a Beta distribution, the conjugate prior for the Binomial model.
    
    More info on conjugate priors here: https://en.wikipedia.org/wiki/Conjugate_prior
    
    """
    alpha: float 
    beta: float 

    def __post_init__(self) -> None: 
        assert 0 < self.alpha, "Alpha must be positive"
        assert 0 < self.beta, "Beta must be positive"

    @property 
    def mean(self) -> float: 
        return self.alpha / (self.alpha + self.beta)

    def conjugate_posterior(self, n: int, x: int) -> "Beta":
        """Return the posterior after observing x heads in n flips."""
        alpha_post = self.alpha + x
        beta_post = self.beta + n - x

        return Beta(alpha=alpha_post, beta=beta_post)

    def plot_distribution(self, n_samples: int = 1_000) -> alt.Chart: 
        """Helper function to display the distribution as a histogram."""
        df_samples = pd.DataFrame({"samples": np.random.beta(self.alpha, self.beta, size=n_samples)})

        return alt.Chart(df_samples).mark_bar().encode(
            alt.X("samples:Q", bin=alt.Bin(extent=[0, 1], step=0.05)), 
            y="count()"
        )


def create_bayes_strategy(prior: Beta, wait_time: Union[int, None] = 10): 
    """Return a class that overrides the _strategy method of BaseGame to use a Bayesian estimate.

    The wait before the first bet is optional and can be disabled.

    Args:
        prior: Prior distribution for estimating the probability of heads
        wait_time: Number of flips to observe before betting; None to bet from the first flip

    Returns:
        Child class of betting_strategies.BaseGame that uses a Bayesian estimate for the Kelly strategy

    """
    class KellyWithBayesEstimate(bs.BaseGame): 
        def _strategy(self, flips_so_far: np.ndarray, W: float) -> float:
            x = flips_so_far.sum()
            n = len(flips_so_far)

            # Sit out until enough flips have been observed
            if wait_time is not None and n < wait_time:
                return 0.0

            # Posterior mean of the probability of heads
            p = prior.conjugate_posterior(n, x).mean

            return W * self.kelly_bet(p, self.payout_ratio)

    return KellyWithBayesEstimate


### Uniform / Uninformed Prior

The probability of heads is unknown before the games start. This can be represented with a Beta(1, 1) prior, which is equivalent to a uniform distribution on the (0, 1) interval.

In [5]:
prior = Beta(1, 1)


Even after a single flip that lands tails, the posterior mean is not as extreme as the frequentist estimate, which would be 0% heads.
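
The gap between the two estimates is largest for the first few flips and narrows as data accumulates. The sketch below compares them for a uniform Beta(1, 1) prior, assuming the frequentist estimate is the observed share of heads. The function names and the example flip counts are illustrative only and are not part of `betting_strategies`.

```python
from fractions import Fraction


def frequentist_estimate(n: int, x: int) -> Fraction:
    """Observed share of heads, x / n (undefined when n == 0)."""
    return Fraction(x, n)


def posterior_mean(alpha: int, beta: int, n: int, x: int) -> Fraction:
    """Mean of the Beta(alpha + x, beta + n - x) posterior."""
    return Fraction(alpha + x, alpha + beta + n)


# Hypothetical (n flips, x heads) outcomes, chosen only for illustration.
for n, x in [(1, 0), (1, 1), (3, 3), (10, 7)]:
    freq = frequentist_estimate(n, x)
    bayes = posterior_mean(1, 1, n, x)  # Beta(1, 1) is the uniform prior
    print(
        f"n={n:>2}, x={x:>2}: "
        f"frequentist={float(freq):.3f}, posterior mean={float(bayes):.3f}"
    )
```

Exact fractions are used so that the arithmetic is easy to verify by hand.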

In [6]:
print(f"Mean estimate of posterior after one flip and no heads: {100 * prior.conjugate_posterior(n=1, x=0).mean:.2f}%")

Mean estimate of posterior after one flip and no heads: 33.33%



In [7]:
prior.plot_distribution(n_samples=10_000)


In [8]:
kelly_w_bayes_estimate = create_bayes_strategy(prior, wait_time=10)(**args)

kelly_w_bayes_estimate.plot_games(n_games=25, log=True, opacity=0.5)


In [9]:
kelly_w_bayes_estimate.plot_growth_rate_distribution(
    n_games=N_games, min_max_growth_rate=[-0.1, 0.1], step_size=0.002
)


Since Bayesian methods help with small amounts of data, let's not wait to bet money and instead make use of our prior information.
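
To see what betting without waiting implies, it helps to look at the Kelly fraction at the prior mean. The sketch below assumes the textbook form $f^* = p - (1 - p) / b$, where $b$ is the net payout per unit staked, floored at zero so that nothing is staked on an unfavorable bet. The helper `kelly_fraction` is hypothetical and may not match `bs.kelly_bet`, and reading `payout_ratio` as $b$ is also an assumption.

```python
def kelly_fraction(p: float, b: float) -> float:
    """Textbook Kelly fraction for win probability p and net odds b.

    Hypothetical stand-in for bs.kelly_bet; the real helper may differ.
    """
    return max(0.0, p - (1.0 - p) / b)


payout_ratio = 4 / 6  # same value as in the notebook arguments

# Under these assumptions a bet is only worthwhile when p > 1 / (1 + b).
break_even_p = 1 / (1 + payout_ratio)

print(f"break-even p: {break_even_p:.2f}")
print(f"fraction at uniform prior mean (p=0.5): {kelly_fraction(0.5, payout_ratio):.3f}")
print(f"fraction at true p=0.7: {kelly_fraction(0.7, payout_ratio):.3f}")
```

Under these assumptions, a uniform prior with no data gives a posterior mean of 0.5, which is below the break-even probability, so the no-wait strategy would stake nothing until the early flips push the estimate above it.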

In [10]:
kelly_w_bayes_estimate_no_wait = create_bayes_strategy(prior, wait_time=None)(**args)

kelly_w_bayes_estimate_no_wait.plot_games(n_games=25, log=True, opacity=0.5)


In [11]:
kelly_w_bayes_estimate_no_wait.plot_growth_rate_distribution(
    n_games=N_games, min_max_growth_rate=[-0.1, 0.1], step_size=0.002
)


### Informed Prior on the Coin

It may be plausible that the coin is roughly fair, but with a higher chance of being biased toward heads. A Beta(3, 2) prior, with mean 0.6, encodes this belief.
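
One way to read a Beta prior is as pseudo-counts: the posterior mean is a weighted average of the prior mean and the observed share of heads, and the prior's weight shrinks as flips accumulate. The sketch below illustrates this for a Beta(3, 2) prior. The function name and the example flip counts are illustrative and are not part of `betting_strategies`.

```python
from typing import Tuple


def posterior_mean_as_blend(
    alpha: float, beta: float, n: int, x: int
) -> Tuple[float, float]:
    """Return (prior_weight, posterior_mean) for a Beta(alpha, beta) prior.

    posterior_mean = w * prior_mean + (1 - w) * (x / n),
    where w = (alpha + beta) / (alpha + beta + n).
    """
    prior_mean = alpha / (alpha + beta)
    w = (alpha + beta) / (alpha + beta + n)
    # With no flips the sample share is undefined, but its weight (1 - w) is zero.
    sample_share = x / n if n > 0 else 0.0
    return w, w * prior_mean + (1 - w) * sample_share


# Hypothetical (n flips, x heads) outcomes, chosen only for illustration.
for n, x in [(0, 0), (5, 4), (20, 14), (100, 70)]:
    w, mean = posterior_mean_as_blend(3, 2, n, x)
    print(f"n={n:>3}: prior weight={w:.3f}, posterior mean={mean:.3f}")
```

The blend gives the same value as the direct formula `(alpha + x) / (alpha + beta + n)`; it only makes the prior's diminishing influence explicit.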

In [12]:
prior = Beta(3, 2)
prior.plot_distribution()


In [13]:
kelly_w_bayes_estimate_informed = create_bayes_strategy(prior, wait_time=10)(**args)

kelly_w_bayes_estimate_informed.plot_games(n_games=25, log=True, opacity=0.5)


In [14]:
kelly_w_bayes_estimate_informed.plot_growth_rate_distribution(
    n_games=N_games, min_max_growth_rate=[-0.1, 0.1], step_size=0.002
)


Again, let's skip the wait and make use of our prior information from the first flip.

In [15]:
kelly_w_bayes_estimate_informed_no_wait = create_bayes_strategy(prior, wait_time=None)(**args)

kelly_w_bayes_estimate_informed_no_wait.plot_games(n_games=25, log=True, opacity=0.5)


In [16]:
kelly_w_bayes_estimate_informed_no_wait.plot_growth_rate_distribution(
    n_games=N_games, min_max_growth_rate=[-0.1, 0.1], step_size=0.002
)


## Comparison

This is a comparison of the wealth growth rates of the strategies from the previous notebook, the four new Bayesian strategies, and the theoretical ideal strategy.

In [17]:
strategies = {
    # Previous notebook strategies
    "dummy": bs.BetLikeADummy(**args), 
    "constant": bs.ConstantDollar(**args), 
    "frequentist": bs.KellyWithAnEstimate(**args), 
    # Bayesian strategies
    "bayes": kelly_w_bayes_estimate, 
    "bayes_no_wait": kelly_w_bayes_estimate_no_wait, 
    "bayes_informed": kelly_w_bayes_estimate_informed, 
    "bayes_informed_no_wait": kelly_w_bayes_estimate_informed_no_wait, 
    # theoretical for comparison
    "theoretical": bs.Kelly(**args),
}

data = pd.DataFrame({
    name: strategy.simulate_growth_rates(N_games) for name, strategy in strategies.items()
}).melt()
data.columns = ["Strategy", "Growth Rate"]


In [18]:
bounds = {
    "lower": -0.1, 
    "upper": 0.15
}

# Clip growth rates so extreme outcomes do not stretch the axis
df_plot = data.assign(value=lambda df: df["Growth Rate"].clip(**bounds))

boxplot = (
    alt.Chart(df_plot)
    .mark_boxplot()
    .encode(
        y=alt.Y("Strategy:O", sort=list(strategies.keys())), 
        x=alt.X("value:Q", title="Growth Rate (Clipped)")
    )
)
line = alt.Chart(pd.DataFrame({"x": [0]})).mark_rule(strokeDash=[10, 10]).encode(x="x")

boxplot + line


The Bayesian versions of Kelly betting appear to hedge against extreme losses compared to the frequentist approach. That is, the boxplot for `frequentist` has simulation outcomes with growth rates below -6%, whereas none of the Bayesian methods do.

Interestingly, not waiting to collect data and betting from the start appears to hedge even more. This can be seen in the whiskers, which end around -4% for the `bayes` strategy but around -2% for `bayes_no_wait`. There is a similar ~1% change from `bayes_informed` to `bayes_informed_no_wait`.

Compared to the theoretical Kelly strategy, the Bayesian ones do not have as many positive growth rates. However, the share of positive growth rates for each Bayesian strategy is slightly better than for the frequentist one. The extreme positive values from the theoretical strategy are not as large as those from some of the Bayesian methods, but that is likely just random variation; the right end of the whisker in the boxplot is likely a more stable statistic, and it is still larger for the theoretical strategy.

Overall, the median outcomes do not appear to differ much between the frequentist and Bayesian strategies. However, the Bayesian strategies appear to slightly increase the probability of positive growth and to be beneficial for hedging against extreme losses.
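
The statistics discussed here (medians, lower tails, and the share of positive growth rates) can also be computed from the simulated data rather than read off the boxplot. This is a minimal sketch, assuming a long-format frame with the `Strategy` and `Growth Rate` columns used in the comparison. The `summarize_growth_rates` helper and the toy values are hypothetical placeholders, not results from the simulations.

```python
import pandas as pd


def summarize_growth_rates(data: pd.DataFrame) -> pd.DataFrame:
    """Per-strategy median, 5th percentile, and share of positive growth rates."""
    grouped = data.groupby("Strategy")["Growth Rate"]
    return pd.DataFrame(
        {
            "median": grouped.median(),
            "p05": grouped.quantile(0.05),
            "share_positive": grouped.apply(lambda s: (s > 0).mean()),
        }
    )


# Placeholder values only, not simulation output.
toy = pd.DataFrame(
    {
        "Strategy": ["a", "a", "a", "a", "b", "b", "b", "b"],
        "Growth Rate": [-0.02, 0.01, 0.02, 0.03, -0.05, -0.01, 0.02, 0.04],
    }
)
summary = summarize_growth_rates(toy)
print(summary)
```

Passing the notebook's `data` frame instead of `toy` would give one row per strategy.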