<h1>
    <center>
       The Thompson Sampling Model
    </center>
</h1>

# Determining the machine with the highest chance of winning

The Thompson Sampling model will be used to determine which of the machines offers the highest chance of winning. The algorithm draws random samples from the Beta distribution shown below:

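\begin{equation}
    x \sim \mathrm{Beta}(a, b)
\end{equation}
where:
* $x$ is a random draw from the Beta distribution;
* $\mathrm{Beta}$ is the Beta distribution;
* $a$ is the first shape parameter (in the code below, the machine's number of wins plus 1);
* $b$ is the second shape parameter (in the code below, the machine's number of losses plus 1).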
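
To make the role of $a$ and $b$ concrete, the following minimal sketch draws samples from a Beta distribution for a single machine. The win and loss counts, the seed, and the variable names are illustrative assumptions, not values from the experiment in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

# Hypothetical win/loss counts for one machine (illustrative, not from the experiment)
wins, losses = 3, 17

# Shape parameters: counts plus 1, which corresponds to a uniform prior on the win rate
a, b = wins + 1, losses + 1
samples = rng.beta(a, b, size=10_000)

posterior_mean = a / (a + b)  # closed-form mean of Beta(a, b)
print(f"analytic mean: {posterior_mean:.3f}")
print(f"sample mean:   {samples.mean():.3f}")
```

As more rounds are observed, $a + b$ grows and the draws concentrate around the machine's observed win rate, which is why poorly performing machines are selected less and less often.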

## Importing the libraries 

In [1]:
import numpy as np

## Setting conversion rates and the number of samples

One point is essential here. This is a simulation of a real-life situation. In reality, every slot machine offers some chance of winning, and some machines offer a higher chance than others, so the simulated environment must do the same. The AI, however, does not know these predefined win rates: it cannot simply read them and pick the best machine, and must instead infer which machine is best from the outcomes it observes.

In [2]:
conversionRates = [0.15, 0.04, 0.13, 0.11, 0.05]

In [3]:
n = 10000                  # number of simulated rounds
d = len(conversionRates)   # number of slot machines

## Creating the dataset for training

In [4]:
x = np.zeros((n, d))

In [5]:
# Simulate every round: machine j pays out (1) with probability conversionRates[j]
for i in range(n):
    for j in range(d):
        if np.random.rand() < conversionRates[j]:
            x[i][j] = 1
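
The same kind of dataset can also be built without explicit loops. The sketch below is one possible vectorized variant, assuming NumPy's `Generator` API; the names `rng`, `conversion_rates`, `n_rounds`, and `rewards` are illustrative and do not appear elsewhere in the notebook.

```python
import numpy as np

rng = np.random.default_rng(42)  # seed chosen only for reproducibility

conversion_rates = np.array([0.15, 0.04, 0.13, 0.11, 0.05])
n_rounds = 10_000

# One uniform draw per (round, machine); broadcasting compares each column
# against that machine's conversion rate
rewards = (rng.random((n_rounds, conversion_rates.size)) < conversion_rates).astype(float)

# Column means should land close to the predefined conversion rates
print(rewards.mean(axis=0))
```

This produces a matrix of the same shape as `x`, with each column's fraction of ones close to the corresponding conversion rate.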

## Counters for wins and losses

In [6]:
n_pos_result = np.zeros(d)
n_neg_result = np.zeros(d)

## Selecting the best slot machine via the Beta distribution and updating its wins and losses

In [7]:
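for i in range(n):
    selected = 0
    maximum = 0

    # Draw one sample per machine from Beta(wins + 1, losses + 1) and keep the largest
    for j in range(d):
        beta = np.random.beta(n_pos_result[j] + 1, n_neg_result[j] + 1)

        if beta > maximum:
            selected, maximum = j, beta

    # Play the selected machine and record the outcome of this round
    if x[i][selected] == 1:
        n_pos_result[selected] += 1
    else:
        n_neg_result[selected] += 1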
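
The selection step can also be written with a single vectorized draw per round. The following is a sketch of the same procedure wrapped in a function, under the assumption that the rewards are stored in a 2-D array of zeros and ones; `thompson_sampling` and its argument names are hypothetical, introduced only for illustration.

```python
import numpy as np

def thompson_sampling(rewards, rng):
    """Run Thompson Sampling over a (rounds, machines) array of 0/1 rewards."""
    n_rounds, n_machines = rewards.shape
    wins = np.zeros(n_machines)
    losses = np.zeros(n_machines)

    for i in range(n_rounds):
        # One Beta draw per machine, all at once
        draws = rng.beta(wins + 1, losses + 1)
        selected = int(np.argmax(draws))

        # Only the selected machine's outcome is observed in this round
        if rewards[i, selected] == 1:
            wins[selected] += 1
        else:
            losses[selected] += 1

    return wins, losses

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
rates = np.array([0.15, 0.04, 0.13, 0.11, 0.05])
rewards = (rng.random((10_000, rates.size)) < rates).astype(float)

wins, losses = thompson_sampling(rewards, rng)
print(wins + losses)  # number of times each machine was selected
```

Passing the random generator in as an argument keeps the function deterministic for a given seed, which makes runs easier to compare.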

## Showing which slot machine is considered the best

In [14]:
n_selected = n_pos_result + n_neg_result

for i in range(d):
    print(f'Machine number {i+1} was selected {int(n_selected[i])} times!')

Machine number 1 was selected 8368 times!
Machine number 2 was selected 67 times!
Machine number 3 was selected 1172 times!
Machine number 4 was selected 333 times!
Machine number 5 was selected 60 times!


In [16]:
maquina = np.argmax(n_selected)

print(f'Machine number {maquina+1} has the best odds!')

Machine number 1 has the best odds!
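
Selection counts show which machine the algorithm favored, but the win and loss counters can also be turned into an estimated win rate per machine. The helper below is a hypothetical sketch: `estimated_win_rates` is not part of the notebook, and the example counts are made up purely to illustrate the call.

```python
import numpy as np

def estimated_win_rates(wins, losses):
    """Observed win fraction per machine; NaN where a machine was never played."""
    wins = np.asarray(wins, dtype=float)
    plays = wins + np.asarray(losses, dtype=float)
    rates = np.full(wins.shape, np.nan)
    # Divide only where the machine was played at least once
    np.divide(wins, plays, out=rates, where=plays > 0)
    return rates

# Hypothetical counts for three machines, chosen only to illustrate the call
example_wins = [30, 1, 0]
example_losses = [170, 9, 0]
print(estimated_win_rates(example_wins, example_losses))
```

In the notebook, the same function could be called with `n_pos_result` and `n_neg_result`. Estimates for rarely selected machines rest on few plays and are therefore much less reliable than the estimate for the most-selected machine.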
