# Running bandit optimization algorithms with synthetic arms

This section introduces optimization with bandit algorithms and shows how our framework works.

## Testing ε-greedy with Bernoulli arms

ε-greedy is one of the most common bandit algorithms. It takes a single parameter $\epsilon \in (0,1)$: at each step it explores (picks an arm uniformly at random) with probability $\epsilon$, and exploits (picks the arm with the highest estimated reward) with probability $1-\epsilon$.
We will try several values of ε and compare how well each one performs when optimizing over 5 Bernoulli arms.
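
To make the explore/exploit rule concrete, here is a minimal, self-contained sketch of ε-greedy on Bernoulli arms. It does not use the framework's `EpsilonGreedy` or `BernoulliArm` classes; the helper names `select_arm` and `run_epsilon_greedy` are hypothetical and exist only for illustration, and the running-average update is one common way to estimate each arm's mean.

```python
import random


def select_arm(estimates, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best estimate."""
    if rng.random() < epsilon:
        # Explore: pick any arm uniformly at random.
        return rng.randrange(len(estimates))
    # Exploit: pick the arm with the highest estimated mean (first one on ties).
    return max(range(len(estimates)), key=lambda i: estimates[i])


def run_epsilon_greedy(means, epsilon, horizon, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms (illustrative helper, not the framework API)."""
    rng = random.Random(seed)
    counts = [0] * len(means)       # number of pulls per arm
    estimates = [0.0] * len(means)  # running average reward per arm
    total_reward = 0
    for _ in range(horizon):
        arm = select_arm(estimates, epsilon, rng)
        # Bernoulli draw: reward 1 with probability means[arm], else 0.
        reward = 1 if rng.random() < means[arm] else 0
        counts[arm] += 1
        # Incremental update of the running average.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return counts, estimates, total_reward


counts, estimates, total_reward = run_epsilon_greedy(
    [0.1, 0.2, 0.3, 0.4, 0.9], epsilon=0.1, horizon=2000
)
print(counts, total_reward)
```

A larger ε spends more pulls on random arms, which speeds up finding the best arm but keeps costing reward after it has been found. That trade-off is what the comparison of different ε values is meant to show.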

In [None]:
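from pathlib import Path

from arms import BernoulliArm
# AnnealingEpsilonGreedy is used below; it is assumed to live in algos_regret alongside EpsilonGreedy
from algos_regret import EpsilonGreedy, AnnealingEpsilonGreedy
from algos_testfunc import test_algorithm_regret

# simulation settings used below (placeholder values, adjust as needed)
n_sims = 100  # number of independent simulations
n_horizon = 500  # number of time steps per simulation
output_dir = Path('.')  # directory for the result CSV files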

# Build the arms.
# Each Bernoulli arm draws from a Bernoulli distribution with its own mean.
# The arm with a mean of 0.9 is optimal: it returns a reward 90% of the time on average.
means = [0.1, 0.2, 0.3, 0.4, 0.9]
n_arms = len(means)
arms = [BernoulliArm(mean) for mean in means]

# test for epsilon greedy
for eps in [0.1, 0.3, 0.5]:
    algo = EpsilonGreedy(n_arms, eps)
    algo.reset(n_arms)
    results = test_algorithm_regret(algo, arms, n_sims, n_horizon)
    filename = f'epsilon_{eps}.csv'
    results.to_csv(output_dir / filename)

# test for epsilon greedy with annealing
algo = AnnealingEpsilonGreedy(n_arms)
algo.reset(n_arms)
results = test_algorithm_regret(algo, arms, n_sims, n_horizon)
results.to_csv(output_dir / 'annealing.csv')
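
Annealing shrinks ε as pulls accumulate, so the algorithm explores heavily at first and exploits more later. The schedule used inside `AnnealingEpsilonGreedy` is not shown in this section, so the sketch below is an illustration only: it assumes a simple inverse-log decay, and the function name `annealed_epsilon` is hypothetical.

```python
import math


def annealed_epsilon(t):
    """Exploration rate after t completed pulls.

    Assumed schedule for illustration: 1 / log(t + e), capped at 1.
    It starts at 1 (pure exploration) and decays slowly toward 0.
    """
    return min(1.0, 1.0 / math.log(t + math.e))


for t in (0, 10, 100, 1000):
    print(t, round(annealed_epsilon(t), 3))
```

With a schedule like this there is no ε to tune by hand, which is why the annealing variant above is constructed with `n_arms` only.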


# Running bandit optimization in real time
This section describes how to run optimization in real time, where experiments are conducted as they are proposed.

## Set up the reaction scope