# **Interactive Campaign Optimization Simulator**
This notebook provides a game that lets you interactively optimize a campaign and compares your results to state-of-the-art multi-armed bandit algorithms: $\epsilon$-greedy, upper confidence bounds (UCB), and Thompson Sampling.

The difficulty in optimizing the campaign lies in the fact that the teaser $ctr$'s are not known with certainty. If the $ctr$'s were known exactly, the optimization problem would be trivial (play the best teaser $100\%$ of the time).

Instead of a pure optimization problem, the task is on the one hand to maximize the number of clicks, but on the other to simultaneously try out different configurations in a clever way to acquire new data (clicks & views) that are most informative about the true optimum.

This is often referred to as the *Exploration-Exploitation Dilemma*: we need to *explore* and play different teasers to see which one of them works best. But at the same time, we need to *exploit* our current knowledge and play the configuration we assume to be best as often as possible.

In [1]:
from simulator import HumanAgent, EpsilonGreedyAgent, ThompsonSamplingAgent, UCBAgent, NoOptimizationAgent, OptimizationGame
import scipy.stats as spst
import numpy as np
from numpy.random import default_rng
rng = default_rng()
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
output_notebook()

## **Setting up a campaign environment**
You first need to specify how many teasers your campaign should have by setting `n_teasers = X`. You can also specify the true teaser ctr's manually by setting a list like `true_ctrs = [.0005, .001, .0012, .003, .002, .004]`. If you do not specify anything, random ctr's are sampled with a mean of about 0.2%.

**Important note:** The true ctr's are only necessary to generate mock click data. Neither you nor any of the optimization algorithms is allowed to use this information for optimization!

In [2]:
n_teasers = 6
dist_alpha = 2.0
dist_beta = 1000.0
true_ctrs = rng.beta(a=dist_alpha, b=dist_beta, size=n_teasers)
# true_ctrs = [.0005, .001, .0012, .003, .002, .004]
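
The stated mean of the sampled ctr's can be checked directly from the Beta distribution used in the cell above. The following is a minimal sketch that only relies on `scipy.stats`; the variable names `ctr_dist`, `mean_ctr`, `low` and `high` are illustrative and not part of the simulator.

```python
import scipy.stats as spst

# Same shape parameters as in the cell above.
dist_alpha, dist_beta = 2.0, 1000.0
ctr_dist = spst.beta(a=dist_alpha, b=dist_beta)

# Mean of a Beta distribution: alpha / (alpha + beta).
mean_ctr = ctr_dist.mean()
# Central 90% range of the sampled ctr's.
low, high = ctr_dist.ppf([0.05, 0.95])

print(f"mean ctr: {100 * mean_ctr:.3f}%")
print(f"90% of the sampled ctr's lie between {100 * low:.3f}% and {100 * high:.3f}%")
```

Changing `dist_alpha` and `dist_beta` shifts and widens this range, which is a quick way to see what kind of campaign the random draw will produce.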

## **Setting up the agents**
<img src="https://upload.wikimedia.org/wikipedia/commons/1/1b/Reinforcement_learning_diagram.svg?export=view">

In reinforcement learning, an agent is a human or machine that observes the environment (i.e., tracks the teaser clicks and views) and, based on the observed data, performs actions (turns teasers on or off) to increase what is called the *reward* (here, the number of clicks).

This notebook implements several kinds of agents that differ in the strategy they play to increase the number of clicks. Parameters that all agents have in common are

*   `n_teasers=X` : The number of teasers an agent should optimize.

*   `n_views_update_data=X` : The number of views that pass before the agent becomes aware of fresh data. This corresponds to the daily import of tracking data from the ad server. The default is 10 000, i.e., every 10 000 views a daily import happens and the agent updates its knowledge about the teasers.

*   `n_views_update_configuration=X` : The number of views after which the agent can modify the configuration. If we assume 10 000 views per day, for an automated agent it would be reasonable to specify `n_views_update_configuration=500` to modify the configuration $\approx 20 \times$ a day. For a human agent, however, it would be more reasonable to set `n_views_update_configuration=70000`, which would correspond to an update of the configuration only once a week.

### **The NoOptimizationAgent**
The `NoOptimizationAgent` does nothing. It keeps all teasers active until the end of the game and only tracks the clicks and views of every teaser. Let's set one up for comparison:

In [3]:
no_optimization_agent = NoOptimizationAgent(n_teasers=n_teasers)

### **The HumanAgent**
The `HumanAgent` pauses the game after every `n_views_update_configuration` views, prints the currently known teaser clicks, views and the empirical click rate on the screen (from all the daily imports that have happened so far) and asks the user to activate or deactivate the specific teasers. Let's set it up here:

In [4]:
human_agent = HumanAgent(n_views_update_data=10000, n_views_update_configuration=70000, n_teasers=n_teasers)

### **The EpsilonGreedyAgent**
The `EpsilonGreedyAgent` plays the teaser with the highest empirical $ctr$ (i.e., recorded $\frac{N_{clicks}}{N_{views}}$) with probability $p = 1 - \epsilon$ and one of the other teasers with probability $p = \epsilon$. This simple algorithm works very well in practice, but $\epsilon$ needs to be tuned manually for every problem. We set up a $5\%$-greedy algorithm for our game:

In [5]:
e_greedy_agent = EpsilonGreedyAgent(epsilon=.05, n_teasers=n_teasers, n_views_update_data=10000, 
                                    n_views_update_configuration=500)
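
The selection rule described above can be sketched in a few lines. This is not the code of the `simulator` package; `choose_teaser_epsilon_greedy` is a hypothetical helper, and picking uniformly among the other teasers during exploration is an assumption.

```python
import numpy as np

def choose_teaser_epsilon_greedy(n_clicks, n_views, epsilon, rng):
    """Return the index of the teaser to play next (illustrative helper)."""
    n_clicks = np.asarray(n_clicks, dtype=float)
    n_views = np.asarray(n_views, dtype=float)
    # Empirical ctr; teasers without views get 0 to avoid dividing by zero.
    empirical_ctr = np.divide(n_clicks, n_views,
                              out=np.zeros_like(n_clicks), where=n_views > 0)
    best = int(np.argmax(empirical_ctr))
    if len(n_clicks) > 1 and rng.random() < epsilon:
        # Exploration step: pick uniformly among the other teasers (assumption).
        others = [i for i in range(len(n_clicks)) if i != best]
        return int(rng.choice(others))
    # Exploitation step: play the teaser with the highest empirical ctr.
    return best

# Clicks and views as they might look after one week of daily imports.
rng = np.random.default_rng(42)
n_clicks = [14, 21, 49, 15, 30, 12]
n_views = [9966, 10016, 10010, 9924, 9958, 10126]
choices = [choose_teaser_epsilon_greedy(n_clicks, n_views, 0.05, rng)
           for _ in range(1000)]
print("share of exploitation steps:", choices.count(2) / len(choices))
```

With `epsilon=0.05`, roughly 95% of the decisions should fall on teaser 3 (index 2), which has the highest empirical ctr in this example.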

### **The ThompsonSamplingAgent**
The `ThompsonSamplingAgent` tracks the teaser clicks and views and uses Bayes' law to compute probability distributions over the teaser ctr's that reflect the current state of knowledge ([for details, check the Insights documentation here](https://content-garden.gitlab.io/analytics-server/theory.html#thompson-sampling)). It then draws random realizations of the $ctr$'s from these distributions and performs the optimal action given these realizations, which is to only activate the teaser with highest sampled $ctr$. Additional parameters that can be set are

*   `prior_alpha=X, prior_beta=Y` : These parameters tell the agent what is known about the teaser $ctr$'s *before* any clicks and views have been observed, e.g., we may already know that teaser $ctr$'s are around $\approx 0.2\%$, ranging from $0.05\%$ to $0.5\%$, which would correspond to specific values for `prior_alpha, prior_beta`. It is reasonable to use the same values as were used to draw the `true_ctrs`, i.e., `prior_alpha=dist_alpha, prior_beta=dist_beta`.

Let's initialize a `ThompsonSamplingAgent`:

In [6]:
ts_agent = ThompsonSamplingAgent(n_views_update_data=10000, n_views_update_configuration=500, n_teasers=n_teasers,
                                 prior_alpha=dist_alpha, prior_beta=dist_beta)
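
One decision step of such an agent can be sketched as follows, assuming a conjugate Beta prior with a Bernoulli click model, so that the posterior is again a Beta distribution. The helper `choose_teaser_thompson` is hypothetical and may differ from the actual `ThompsonSamplingAgent` implementation.

```python
import numpy as np

def choose_teaser_thompson(n_clicks, n_views, prior_alpha, prior_beta, rng):
    """Sample one ctr per teaser from its posterior and pick the largest."""
    n_clicks = np.asarray(n_clicks, dtype=float)
    n_views = np.asarray(n_views, dtype=float)
    # Beta-Bernoulli update (assumption): clicks count as successes,
    # views without a click count as failures.
    post_alpha = prior_alpha + n_clicks
    post_beta = prior_beta + (n_views - n_clicks)
    sampled_ctr = rng.beta(post_alpha, post_beta)
    return int(np.argmax(sampled_ctr)), sampled_ctr

rng = np.random.default_rng(7)
n_clicks = [14, 21, 49, 15, 30, 12]
n_views = [9966, 10016, 10010, 9924, 9958, 10126]
chosen, sampled_ctr = choose_teaser_thompson(n_clicks, n_views, 2.0, 1000.0, rng)
print("sampled ctr's:", np.round(100 * sampled_ctr, 3), "%")
print("activated teaser:", chosen + 1)
```

Because the sampled values are random, repeated calls can activate different teasers. This is how the method explores without a separate exploration parameter.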

We also set up a `ThompsonSamplingAgent` with real-time tracking and immediate configuration updates:

In [7]:
perfect_ts_agent = ThompsonSamplingAgent(n_views_update_data=1, n_views_update_configuration=1, n_teasers=n_teasers,
                                 prior_alpha=dist_alpha, prior_beta=dist_beta)

Moreover, a priori information on the teaser performance can be passed to the Thompson Sampling agent. Assume we have a machine learning model that provides predictions of the teaser $ctr$. To mock this, we assume that the predictive mean equals the true $ctr$ while the predictive variance stays roughly the same:

In [8]:
prior_alpha = dist_beta * true_ctrs/(1.0 - true_ctrs)
a_priori_ts_agent = ThompsonSamplingAgent(n_views_update_data=1, n_views_update_configuration=1, n_teasers=n_teasers,
                                 prior_alpha=prior_alpha, prior_beta=dist_beta)
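
That this choice of `prior_alpha` centers each prior on the corresponding ctr follows from the mean of a Beta distribution, $\frac{\alpha}{\alpha + \beta}$. A short numerical check, using the example ctr list from the text instead of the random `true_ctrs` (the names `example_ctrs`, `prior_mean` and `prior_std` are illustrative):

```python
import numpy as np

dist_beta = 1000.0
# Example ctr's from the text; any values strictly between 0 and 1 work.
example_ctrs = np.array([.0005, .001, .0012, .003, .002, .004])

# Same formula as in the cell above.
prior_alpha = dist_beta * example_ctrs / (1.0 - example_ctrs)

# Mean and standard deviation of Beta(prior_alpha, dist_beta).
total = prior_alpha + dist_beta
prior_mean = prior_alpha / total
prior_std = np.sqrt(prior_alpha * dist_beta / (total**2 * (total + 1.0)))

print("prior mean [%]:", np.round(100 * prior_mean, 4))
print("prior std  [%]:", np.round(100 * prior_std, 4))
```

The prior means reproduce the ctr's exactly, while the standard deviations stay on the same order of magnitude as for the Beta distribution with `dist_alpha` and `dist_beta`.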

### **The UCBAgent**
The `UCBAgent` uses the same probability distributions over the ctr's as are used by the `ThompsonSamplingAgent` but instead of sampling from these distributions, it uses them to compute **U**pper **C**onfidence **B**ounds, typically the $95\%$ quantile. It then plays the teaser with highest upper confidence bound, therefore automatically balancing exploration and exploitation. It has the same parameters as the `ThompsonSamplingAgent` plus `ucb_quantile` which specifies the quantile used as an upper confidence bound.

In [9]:
ucb_agent = UCBAgent(n_views_update_data=10000, n_views_update_configuration=500, n_teasers=n_teasers,
                                 prior_alpha=dist_alpha, prior_beta=dist_beta, ucb_quantile=0.95)
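
A minimal sketch of this decision rule, assuming Beta posteriors as in the Thompson Sampling case and using the quantile function `scipy.stats.beta.ppf`. The helper `choose_teaser_ucb` is hypothetical and not taken from the `simulator` package.

```python
import numpy as np
import scipy.stats as spst

def choose_teaser_ucb(n_clicks, n_views, prior_alpha, prior_beta, ucb_quantile=0.95):
    """Pick the teaser whose posterior has the largest upper quantile."""
    n_clicks = np.asarray(n_clicks, dtype=float)
    n_views = np.asarray(n_views, dtype=float)
    # Beta posterior per teaser (assumed Beta-Bernoulli model).
    post_alpha = prior_alpha + n_clicks
    post_beta = prior_beta + (n_views - n_clicks)
    upper_bounds = spst.beta.ppf(ucb_quantile, post_alpha, post_beta)
    return int(np.argmax(upper_bounds)), upper_bounds

# Teaser 1 has plenty of data, teaser 2 has none yet.
chosen, upper_bounds = choose_teaser_ucb([30, 0], [10000, 0], 2.0, 1000.0)
print("upper bounds [%]:", np.round(100 * upper_bounds, 3))
print("activated teaser:", chosen + 1)
```

In this example the teaser without any views is activated: its posterior is still as wide as the prior, so its upper bound exceeds that of the well-measured teaser. This is the sense in which UCB balances exploration and exploitation.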

## **Setting up the OptimizationGame**
Finally, we need to set up an `OptimizationGame` which has the following parameters:

*   `agents=[...]` : A list of all the agents that have been set up and should participate in the game.
*   `true_ctrs=X` : The true ctr's that have been specified above. **These are not known to the agents!** They are only necessary to generate mock clicks and views.
*   `n_views_total=X` : The total amount of teaser views after which the game should end.

In [10]:
game = OptimizationGame([human_agent, no_optimization_agent, e_greedy_agent, ucb_agent, ts_agent, perfect_ts_agent, a_priori_ts_agent],
                        true_ctrs=true_ctrs, n_views_total=250000)

## **Before you start**
It is important to note that randomness plays a key role in this game. As all agents may activate/deactivate different teasers in every step, they necessarily need to run on different random sequences (i.e., different clicks and views).
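
To get a feeling for the size of this randomness, the following sketch simulates the total number of clicks of a single teaser many times. The ctr of 0.2% is a hypothetical value chosen to match the typical scale in this notebook; the binomial click model is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
n_views_total = 250_000
ctr = 0.002  # hypothetical ctr of 0.2%

# Each view is an independent click/no-click trial, so the total is binomial.
clicks = rng.binomial(n=n_views_total, p=ctr, size=10_000)
expected_std = np.sqrt(n_views_total * ctr * (1.0 - ctr))

print("mean clicks:", clicks.mean())
print("simulated std:", clicks.std(), "theoretical std:", expected_std)
```

Two agents that play exactly the same teaser can therefore end up with noticeably different click totals purely by chance.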

**It is possible that you will win a game from time to time!**

It would, however, be surprising if you won 6 out of 10 or even 51 out of 100 games.

## **Start the game**
Execute the next cell to run the game. Before the first view, you will be asked to activate the teasers you want to play. Is it reasonable to activate all the teasers in the beginning?

Then, after every `n_views_update_configuration` views you have set up for the `human_agent`, the currently known stats are shown and you can modify the configuration. Also, a short summary of the accumulated clicks for all agents is printed.

In [11]:
game.start()
print("true ctr's: ", true_ctrs)
ts_agent.update_data()
perfect_ts_agent.update_data()
no_optimization_agent.update_data()
human_agent.update_data()
ts_tot_clicks = ts_agent.n_clicks.sum()
perfect_ts_tot_clicks = perfect_ts_agent.n_clicks.sum()
no_opt_tot_clicks = no_optimization_agent.n_clicks.sum()
human_tot_clicks = human_agent.n_clicks.sum()
print('Improvement  no  optimization   -->     TS: ' + '%2.2f' % (100 * (ts_tot_clicks/no_opt_tot_clicks - 1.0)) + '%')
print('Improvement      human          -->     TS: ' + '%2.2f' % (100 * (ts_tot_clicks/human_tot_clicks - 1.0)) + '%')
print('Improvement no optimization --> perfect TS: ' + '%2.2f' % (100 * (perfect_ts_tot_clicks/no_opt_tot_clicks - 1.0)) + '%')
print('Improvement    human    -->     perfect TS: ' + '%2.2f' % (100 * (perfect_ts_tot_clicks/human_tot_clicks - 1.0)) + '%')

The teaser clicks and views you know from daily import are:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,True,0,0,
teaser 2,True,0,0,
teaser 3,True,0,0,
teaser 4,True,0,0,
teaser 5,True,0,0,
teaser 6,True,0,0,


Do you want to turn on teaser 1 (y/n)?
y
Do you want to turn on teaser 2 (y/n)?
y
Do you want to turn on teaser 3 (y/n)?
y
Do you want to turn on teaser 4 (y/n)?
y
Do you want to turn on teaser 5 (y/n)?
y
Do you want to turn on teaser 6 (y/n)?
y
Configuration updated, continue running...

The cumulative stats are:

human_agent_update_data=10000_update_config=70000:
clicks:        176
views:         70000
empirical ctr: 0.25142857142857145

no_optimization_agent:
clicks:        139
views:         70000
empirical ctr: 0.19857142857142857

0.05_greedy_agent_update_data=10000_update_config=500:
clicks:        325
views:         70000
empirical ctr: 0.4642857142857143

ucb_agent_update_data=10000_update_config=500:
clicks:        223
views:         70000
empirical ctr: 0.31857142857142856

thompson_agent_update_data=10000_update_config=500:
clicks:        296
views:         70000
empirical ctr: 0.4228571428571429

thompson_agent_update_data=1_update_config=1:
clicks:        306
views:      

Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,True,14,9966,0.14%
teaser 2,True,21,10016,0.21%
teaser 3,True,49,10010,0.49%
teaser 4,True,15,9924,0.15%
teaser 5,True,30,9958,0.30%
teaser 6,True,12,10126,0.12%


Do you want to turn on teaser 1 (y/n)?
n
Do you want to turn on teaser 2 (y/n)?
y
Do you want to turn on teaser 3 (y/n)?
y
Do you want to turn on teaser 4 (y/n)?
n
Do you want to turn on teaser 5 (y/n)?
y
Do you want to turn on teaser 6 (y/n)?
n
Configuration updated, continue running...

The cumulative stats are:

human_agent_update_data=10000_update_config=70000:
clicks:        392
views:         140000
empirical ctr: 0.28

no_optimization_agent:
clicks:        307
views:         140000
empirical ctr: 0.21928571428571428

0.05_greedy_agent_update_data=10000_update_config=500:
clicks:        660
views:         140000
empirical ctr: 0.4714285714285714

ucb_agent_update_data=10000_update_config=500:
clicks:        583
views:         140000
empirical ctr: 0.4164285714285714

thompson_agent_update_data=10000_update_config=500:
clicks:        637
views:         140000
empirical ctr: 0.455

thompson_agent_update_data=1_update_config=1:
clicks:        634
views:         140000
empirical ctr:

Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,20,11662,0.17%
teaser 2,True,73,31772,0.23%
teaser 3,True,151,31639,0.48%
teaser 4,False,18,11592,0.16%
teaser 5,True,80,31521,0.25%
teaser 6,False,12,11814,0.10%


Do you want to turn on teaser 1 (y/n)?


KeyboardInterrupt: Interrupted by user

# **Explanation of Algorithms**
## **$\epsilon$-greedy**
The idea of the $\epsilon$-greedy algorithm is simple: with probability $1 - \epsilon$, play the teaser that currently has the highest empirical $ctr$ (exploitation step). With probability $\epsilon$, randomly choose one of the other teasers (exploration step). The value of $\epsilon$ needs to be set by the user. High $\epsilon$ values drive exploration, low $\epsilon$ values drive exploitation. Reasonable values lie in the range $1\% \lesssim \epsilon \lesssim 10\%$. Let's play an optimization game with an $\epsilon$-greedy agent only and look more closely at what it is doing by setting the option `verbose=True`. The empirical $ctr$'s are visualized in a chart of blue bars. The teaser that will be active until the next configuration update is highlighted in orange. We use $\epsilon = 20\%$ so that exploration steps can be observed more often:

In [12]:
e_greedy_agent = EpsilonGreedyAgent(epsilon=.2, n_teasers=n_teasers, n_views_update_data=10000,
                                    n_views_update_configuration=3000, verbose=True)
game = OptimizationGame([e_greedy_agent], true_ctrs=true_ctrs, n_views_total=1000000)
game.start()

New data was imported!
New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,1,1617,0.06%
teaser 2,False,2,1646,0.12%
teaser 3,False,4,1634,0.24%
teaser 4,False,0,1730,0.00%
teaser 5,True,5,1668,0.30%
teaser 6,False,3,1705,0.18%


Chosen teaser =  teaser 6
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,1,1617,0.06%
teaser 2,False,2,1646,0.12%
teaser 3,False,4,1634,0.24%
teaser 4,False,0,1730,0.00%
teaser 5,True,5,1668,0.30%
teaser 6,False,3,1705,0.18%


Chosen teaser =  teaser 5
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,1,1617,0.06%
teaser 2,False,2,1646,0.12%
teaser 3,False,4,1634,0.24%
teaser 4,False,0,1730,0.00%
teaser 5,True,5,1668,0.30%
teaser 6,False,3,1705,0.18%


Chosen teaser =  teaser 4
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1976,0.10%
teaser 2,False,2,1955,0.10%
teaser 3,True,9,1991,0.45%
teaser 4,False,3,4053,0.07%
teaser 5,False,11,5004,0.22%
teaser 6,False,5,5021,0.10%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1976,0.10%
teaser 2,False,2,1955,0.10%
teaser 3,True,9,1991,0.45%
teaser 4,False,3,4053,0.07%
teaser 5,False,11,5004,0.22%
teaser 6,False,5,5021,0.10%


Chosen teaser =  teaser 1
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1976,0.10%
teaser 2,False,2,1955,0.10%
teaser 3,True,9,1991,0.45%
teaser 4,False,3,4053,0.07%
teaser 5,False,11,5004,0.22%
teaser 6,False,5,5021,0.10%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1976,0.10%
teaser 2,False,2,1955,0.10%
teaser 3,True,9,1991,0.45%
teaser 4,False,3,4053,0.07%
teaser 5,False,11,5004,0.22%
teaser 6,False,5,5021,0.10%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,6,4976,0.12%
teaser 2,False,2,1955,0.10%
teaser 3,True,43,7991,0.54%
teaser 4,False,4,5053,0.08%
teaser 5,False,11,5004,0.22%
teaser 6,False,5,5021,0.10%


Chosen teaser =  teaser 2


KeyboardInterrupt: Interrupted by user

$\epsilon$-greedy is a very simple algorithm that shows surprisingly good results. However, even in the limit of infinite views, when the best teaser is known with certainty, the algorithm still plays a different teaser with probability $\epsilon$. A schedule that decreases $\epsilon$ over time can mitigate this, but a badly chosen schedule can ruin the optimization.
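
The cost of never-ending exploration can be estimated with a short calculation. The sketch below assumes that the agent has already identified the best teaser correctly and explores uniformly among the remaining ones; `long_run_ctr` is an illustrative helper, and the ctr's are the example list from the setup section.

```python
import numpy as np

def long_run_ctr(true_ctrs, epsilon):
    """Expected ctr of a fixed-epsilon agent that knows the best teaser."""
    true_ctrs = np.asarray(true_ctrs, dtype=float)
    best_index = int(np.argmax(true_ctrs))
    best = true_ctrs[best_index]
    # Exploration steps are spread uniformly over the other teasers (assumption).
    others_mean = np.delete(true_ctrs, best_index).mean()
    return (1.0 - epsilon) * best + epsilon * others_mean

example_ctrs = [.0005, .001, .0012, .003, .002, .004]
for epsilon in (0.01, 0.05, 0.2):
    ctr = long_run_ctr(example_ctrs, epsilon)
    loss = 1.0 - ctr / max(example_ctrs)
    print(f"epsilon = {epsilon:.2f}: long-run ctr {100 * ctr:.4f}%, "
          f"{100 * loss:.1f}% below the best teaser")
```

The loss grows linearly with $\epsilon$, which is why large values such as the $20\%$ used in the demonstration above are not advisable for a real campaign.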

## **Thompson Sampling**
Thompson Sampling uses Bayes' law to compute distributions over the $ctr$'s given a certain amount of clicks and views ([for details, check the Insights documentation here](https://content-garden.gitlab.io/analytics-server/theory.html#thompson-sampling)). It then plays a teaser with the same probability as the probability of this teaser to be the best one ("probability matching decision strategy"). It achieves this by drawing random numbers from the inferred probability distributions over the $ctr$'s and then playing the teaser corresponding to the highest random number that was drawn.

Assume that, after recording some clicks and views for, say, 3 different teasers, we know that the first one has $10\%$, the second one $20\%$ and the third one $70\%$ probability to be the best teaser. Then Thompson Sampling will play the first one $10\%$, the second one $20\%$ and the third one $70\%$ of the time.

After some more views, our knowledge of the true teaser $ctr$'s improves, and the probabilities of being the best teaser may become $2\%$ for the first, $5\%$ for the second and $93\%$ for the third teaser. Then the first one is played $2\%$, the second $5\%$ and the third one $93\%$ of the time.

It may also be that the second teaser is actually better than the third and was merely unlucky during the first couple of views. Then our knowledge about the best teaser may be revised to, e.g., $4\%$ for the first, $60\%$ for the second and $36\%$ for the third, and the Thompson Sampling agent will play the teasers accordingly.

In the limit of infinite views, Thompson Sampling will always find the best teaser and play it almost $100\%$ of the time in future steps.
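
The probability of each teaser being the best one can be estimated by Monte Carlo sampling, which also shows why a single posterior draw implements probability matching. The sketch assumes Beta posteriors with the prior `Beta(2, 1000)`; the clicks and views are taken from the sample output of the verbose Thompson Sampling run in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
prior_alpha, prior_beta = 2.0, 1000.0
n_clicks = np.array([2, 2, 25, 1, 12, 1])
n_views = np.array([2032, 2024, 7022, 1999, 4950, 1973])

# Draw many joint samples of all six ctr's from their Beta posteriors.
samples = rng.beta(prior_alpha + n_clicks,
                   prior_beta + n_views - n_clicks,
                   size=(100_000, len(n_clicks)))

# In each joint sample, the teaser with the largest ctr "wins".
winners = samples.argmax(axis=1)
prob_best = np.bincount(winners, minlength=len(n_clicks)) / len(winners)

for i, p in enumerate(prob_best, start=1):
    print(f"teaser {i}: probability of being the best = {100 * p:.1f}%")
```

Each row of `samples` corresponds to one Thompson Sampling decision, so the frequencies in `prob_best` are exactly the shares with which the teasers would be played given this state of knowledge.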

A nice feature is that this algorithm does not require fine-tuning of a parameter, as is necessary in, e.g., the $\epsilon$-greedy method.

Let's set up a Thompson Sampling agent and set `verbose=True` to see some visualizations. Observe that whenever data is imported ("daily import"), the probability distributions over the teaser $ctr$'s change. Whenever the configuration is updated, random numbers from these distributions are drawn and visualized as dots on the $x$-axis of the same color as the probability distribution of the corresponding teaser $ctr$. Only the teaser with the largest drawn random number is activated, indicated by an encircled dot and a bigger line width. 

In [13]:
ts_agent = ThompsonSamplingAgent(n_views_update_data=10000, n_views_update_configuration=3000, n_teasers=n_teasers,
                                 prior_alpha=dist_alpha, prior_beta=dist_beta, verbose=True)
game = OptimizationGame([ts_agent], true_ctrs=true_ctrs, n_views_total=1000000)
game.start()

New data was imported!
New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1696,0.12%
teaser 2,False,2,1687,0.12%
teaser 3,True,5,1695,0.29%
teaser 4,False,1,1669,0.06%
teaser 5,False,5,1624,0.31%
teaser 6,False,1,1629,0.06%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1696,0.12%
teaser 2,False,2,1687,0.12%
teaser 3,False,5,1695,0.29%
teaser 4,False,1,1669,0.06%
teaser 5,True,5,1624,0.31%
teaser 6,False,1,1629,0.06%


Chosen teaser =  teaser 5
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1696,0.12%
teaser 2,False,2,1687,0.12%
teaser 3,True,5,1695,0.29%
teaser 4,False,1,1669,0.06%
teaser 5,False,5,1624,0.31%
teaser 6,False,1,1629,0.06%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,25,7022,0.36%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,25,7022,0.36%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,25,7022,0.36%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,25,7022,0.36%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,72,17022,0.42%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,72,17022,0.42%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,72,17022,0.42%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,121,27022,0.45%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,121,27022,0.45%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,121,27022,0.45%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,175,37022,0.47%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,2032,0.10%
teaser 2,False,2,2024,0.10%
teaser 3,True,175,37022,0.47%
teaser 4,False,1,1999,0.05%
teaser 5,False,12,4950,0.24%
teaser 6,False,1,1973,0.05%


Chosen teaser =  teaser 3


KeyboardInterrupt: Interrupted by user

## **Upper Confidence Bounds**
The UCB algorithm is closely related to Thompson Sampling, but instead of drawing random numbers from the inferred probability distributions over the teaser $ctr$'s, it computes upper confidence bounds and selects the teaser with the largest bound. A common upper confidence bound is the $95\%$ quantile (i.e., the $ctr$ value below which the true $ctr$ lies with $95\%$ probability and above which it lies with $5\%$ probability).
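
The upper confidence bound of a teaser moves closer to its posterior mean as more views are collected. The sketch below illustrates this for teaser 3, using clicks and views from the sample output of the verbose UCB run in this notebook and assuming Beta posteriors with the prior `Beta(2, 1000)`.

```python
import numpy as np
import scipy.stats as spst

prior_alpha, prior_beta = 2.0, 1000.0
# Clicks and views of teaser 3 after successive daily imports.
n_clicks = np.array([14, 53, 96, 149, 201])
n_views = np.array([1700, 10054, 20054, 30054, 40054])

post_alpha = prior_alpha + n_clicks
post_beta = prior_beta + n_views - n_clicks
post_mean = post_alpha / (post_alpha + post_beta)
# 95% quantile of each Beta posterior.
ucb = spst.beta.ppf(0.95, post_alpha, post_beta)
gap = ucb - post_mean

for v, m, u in zip(n_views, post_mean, ucb):
    print(f"{int(v):>6d} views: posterior mean {100 * m:.3f}%, UCB {100 * u:.3f}%")
```

A teaser that is played often loses its "uncertainty bonus", so rarely played teasers with a wide posterior get another chance once their bound becomes the largest.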

We specify a UCB agent, draw the probability distributions over the $ctr$'s at every configuration update, and indicate the upper confidence bounds ($95\%$ quantiles) with dots on the $x$-axis.

In [14]:
ucb_agent = UCBAgent(n_views_update_data=10000, n_views_update_configuration=3000, n_teasers=n_teasers,
                                 prior_alpha=dist_alpha, prior_beta=dist_beta, ucb_quantile=0.95, verbose=True)
game = OptimizationGame([ucb_agent], true_ctrs=true_ctrs, n_views_total=1000000)
game.start()

New data was imported!
New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,1,1670,0.06%
teaser 2,False,5,1708,0.29%
teaser 3,True,14,1700,0.82%
teaser 4,False,3,1623,0.18%
teaser 5,False,1,1643,0.06%
teaser 6,False,3,1656,0.18%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,1,1670,0.06%
teaser 2,False,5,1708,0.29%
teaser 3,True,14,1700,0.82%
teaser 4,False,3,1623,0.18%
teaser 5,False,1,1643,0.06%
teaser 6,False,3,1656,0.18%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,1,1670,0.06%
teaser 2,False,5,1708,0.29%
teaser 3,True,14,1700,0.82%
teaser 4,False,3,1623,0.18%
teaser 5,False,1,1643,0.06%
teaser 6,False,3,1656,0.18%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,53,10054,0.53%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,53,10054,0.53%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,53,10054,0.53%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,53,10054,0.53%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,96,20054,0.48%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,96,20054,0.48%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,96,20054,0.48%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,149,30054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,149,30054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,149,30054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,201,40054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,201,40054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,201,40054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,201,40054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,252,50054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,252,50054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,252,50054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,295,60054,0.49%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,295,60054,0.49%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,295,60054,0.49%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,347,70054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,347,70054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,347,70054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,347,70054,0.50%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,390,80054,0.49%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,390,80054,0.49%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,390,80054,0.49%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3
Hit enter to continue

New data was imported!
Known data after daily import:


Unnamed: 0,active,n_clicks,n_views,empirical ctr
teaser 1,False,2,1999,0.10%
teaser 2,False,6,2044,0.29%
teaser 3,True,428,90054,0.48%
teaser 4,False,3,1936,0.15%
teaser 5,False,3,1980,0.15%
teaser 6,False,3,1987,0.15%


Chosen teaser =  teaser 3


KeyboardInterrupt: Interrupted by user

The UCB algorithm performs similarly to, or even better than, Thompson Sampling. The drawback in our context is that, for a given set of known data, UCB always makes the same choice: it keeps playing the same teaser until fresh data is imported, so only a single teaser is active for a whole day.
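The following minimal sketch illustrates why this happens. It does not reproduce the `UCBAgent` from `simulator`, whose internals are not shown here; instead it assumes a Bayesian UCB rule (pick the teaser with the highest 95% posterior quantile) and the Beta prior from the appendix. The helpers `posterior_params`, `ucb_choice`, and `thompson_choice` are hypothetical names introduced only for this illustration.

```python
import numpy as np
import scipy.stats as spst

# Known data, copied from the tables of the run above (teaser 3 at 40054 views)
n_clicks = np.array([2, 6, 201, 3, 3, 3])
n_views = np.array([1999, 2044, 40054, 1936, 1980, 1987])

# Assumption: the same Beta prior as in the appendix plotter
PRIOR_ALPHA, PRIOR_BETA = 1.1, 300.0


def posterior_params(clicks, views):
    """Beta posterior parameters for each teaser's ctr."""
    return PRIOR_ALPHA + clicks, PRIOR_BETA + views - clicks


def ucb_choice(clicks, views, quantile=0.95):
    """Assumed Bayesian UCB rule: highest posterior quantile wins."""
    alpha, beta = posterior_params(clicks, views)
    upper_bounds = spst.beta.ppf(quantile, alpha, beta)
    return int(np.argmax(upper_bounds))


def thompson_choice(clicks, views, rng):
    """Thompson Sampling: draw one ctr per teaser, play the largest draw."""
    alpha, beta = posterior_params(clicks, views)
    return int(np.argmax(rng.beta(alpha, beta)))


# UCB depends only on the data, so repeated decisions are identical
ucb_picks = {ucb_choice(n_clicks, n_views) for _ in range(5)}

# Thompson Sampling is randomized, so repeated decisions can differ
rng = np.random.default_rng(0)
ts_picks = [thompson_choice(n_clicks, n_views, rng) for _ in range(1000)]
ts_counts = np.bincount(ts_picks, minlength=len(n_clicks))

print("distinct UCB picks (0-based index):", ucb_picks)
print("Thompson Sampling pick counts:", ts_counts)
```

Because the UCB index is a function of the known clicks and views only, nothing changes between two data imports. Thompson Sampling draws fresh samples for every decision and can therefore rotate teasers even when the data is stale.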

# **Summary**
This notebook presented the three most common Multi-Armed Bandit algorithms: $\epsilon$-greedy, UCB, and Thompson Sampling. The Bayesian methods UCB and Thompson Sampling generally show the best performance. However, in our case UCB would keep only a single teaser active on a campaign/placement pair for a whole day, because the configuration does not change until fresh data arrives.

Therefore, in our context, Thompson Sampling seems to be the most promising approach.

To better align Thompson Sampling with the restriction that configurations can only be updated a limited number of times, strategies that keep more than a single teaser active at once can be developed.
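One possible strategy of this kind, sketched here as an assumption rather than as the method used by the simulator, is to split traffic across teasers in proportion to each teaser's posterior probability of being the best one. The function name `thompson_traffic_shares` and its parameters are hypothetical; the prior values are taken from the appendix.

```python
import numpy as np


def thompson_traffic_shares(clicks, views, prior_alpha=1.1, prior_beta=300.0,
                            n_samples=10_000, seed=None):
    """Estimate, per teaser, the probability of having the highest ctr.

    The estimate is a Monte Carlo average over posterior draws and can be
    used as a traffic share until the next data import.
    """
    rng = np.random.default_rng(seed)
    clicks = np.asarray(clicks, dtype=float)
    views = np.asarray(views, dtype=float)

    # Beta posterior for each teaser
    alpha = prior_alpha + clicks
    beta = prior_beta + views - clicks

    # One row per simulated "Thompson round", one column per teaser
    samples = rng.beta(alpha, beta, size=(n_samples, len(clicks)))
    winners = np.argmax(samples, axis=1)

    # Fraction of rounds won by each teaser
    return np.bincount(winners, minlength=len(clicks)) / n_samples


# Data copied from the tables of the UCB run above
shares = thompson_traffic_shares(
    clicks=[2, 6, 201, 3, 3, 3],
    views=[1999, 2044, 40054, 1936, 1980, 1987],
    seed=0,
)
for i, share in enumerate(shares, start=1):
    print(f"teaser {i}: {share:.1%} of traffic")
```

With such a split, several teasers can stay active during a day without a configuration update, while most of the traffic still goes to the teaser that currently looks best.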

# **Appendix: Probability Distribution Plotter**
The next cell provides a plotting tool that visualizes the posterior probability distribution over a teaser's $ctr$ given a certain number of clicks and views. Set `n_clicks` and `n_views` and run the cell to see the distribution!

In [None]:
# Observed data for a single teaser
n_clicks = 1
n_views = 1000

# Beta prior over the ctr, updated with the observed clicks and non-clicks
prior_alpha = 1.1
prior_beta = 300.0
posterior_alpha = prior_alpha + n_clicks
posterior_beta = prior_beta + n_views - n_clicks

# Evaluate the posterior density on a grid of ctr values from 0% to 1.5%
ctr = np.linspace(0.0, 0.015, 501)
p_ctr = spst.beta.pdf(ctr, posterior_alpha, posterior_beta)

# `width`/`height` are used instead of `plot_width`/`plot_height`,
# which were deprecated in Bokeh 2.4 and removed in Bokeh 3.0
fig = figure(background_fill_color=(255, 255, 255), width=1000, height=300,
             tools="hover,pan,wheel_zoom,box_zoom,reset,crosshair,save", toolbar_location='right',
             title='Probability Distribution over Teaser CTR')
fig.line(ctr, p_ctr, line_width=3)
fig.y_range.start = 0.0

# Axis labels and font sizes
fig.xaxis.axis_label = 'ctr'
fig.yaxis.axis_label = 'p(ctr)'
fig.xaxis.major_label_text_font_size = "16px"
fig.yaxis.major_label_text_font_size = "16px"
fig.xaxis.axis_label_text_font_size = "16px"
fig.yaxis.axis_label_text_font_size = "16px"
show(fig)
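
As a numerical complement to the plot, the following sketch summarizes the same Beta posterior by its mean and an equal-tailed credible interval. The helper name `ctr_posterior_summary` is hypothetical and the default prior values are simply copied from the cell above.

```python
import scipy.stats as spst


def ctr_posterior_summary(n_clicks, n_views, prior_alpha=1.1, prior_beta=300.0,
                          level=0.95):
    """Return posterior mean and equal-tailed credible interval of the ctr."""
    alpha = prior_alpha + n_clicks
    beta = prior_beta + n_views - n_clicks
    mean = alpha / (alpha + beta)
    # Interval that contains `level` of the posterior probability mass
    lower, upper = spst.beta.interval(level, alpha, beta)
    return mean, lower, upper


mean, lower, upper = ctr_posterior_summary(n_clicks=1, n_views=1000)
print(f"posterior mean ctr: {mean:.3%}")
print(f"95% credible interval: [{lower:.3%}, {upper:.3%}]")
```

Increasing `n_views` while keeping the click rate fixed narrows the interval, which corresponds to the distribution in the plot becoming more peaked.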