## Point to Match Prediction
This notebook explores using point-level predictions to predict the outcome of a match. We first investigate Monte Carlo simulation of a match.

### Monte Carlo

In [1]:
import numpy as np
import scipy.stats  # import the submodule explicitly so scipy.stats.t is always available
from atp_forecaster.scripts.point_to_match import point_to_match_mc, point_to_match_dp

In [2]:
n_trials = 500
spw = 0.6
rpw = 0.4
p, winners = point_to_match_mc(spw, rpw, trials=n_trials, best_of=3)
print("match win probability: ", p)

match win probability:  0.514


`p` is a realisation of the sample-mean estimator $\hat{p}$ of the true match win probability $\theta$. By the central limit theorem, $\hat{p}$ is approximately normally distributed with mean $\theta$ and variance $\frac{\theta(1 - \theta)}{n}$, where $n$ is the number of trials. We estimate the variance by replacing $\theta$ with the sample proportion.
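
Written out, the interval computed in the next cell (which uses a $t$ quantile with $n - 1$ degrees of freedom rather than the normal quantile; for $n = 500$ the two are nearly identical) is:

$$\hat{p} \pm t_{0.975,\,n-1}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$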

In [3]:
variance = p * (1 - p) / n_trials

# 95% confidence interval
t = scipy.stats.t.ppf(0.975, n_trials - 1)
se = np.sqrt(variance)
ci = (p - t * se, p + t * se)
print("p:", p, "95% CI:", ci)

p: 0.514 95% CI: (np.float64(0.47008454023278257), np.float64(0.5579154597672175))


### Monte Carlo Results

With 50000 iterations, the standard error is still roughly 0.002 (a 95% confidence interval about 0.009 wide), and a single run takes roughly 5 seconds on a MacBook Air M2. This speed and accuracy are not sufficient for proper match estimation, and model calibration and backtesting would take too long. The Monte Carlo approach is therefore not viable.
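
The trade-off between trial count and interval width follows directly from the variance formula. The sketch below is a minimal illustration, assuming the normal-approximation (Wald) interval and the worst case $\theta = 0.5$; the helper names `ci_width` and `trials_for_width` are hypothetical and not part of `atp_forecaster`.

```python
import math
from statistics import NormalDist


def ci_width(p: float, n: int, level: float = 0.95) -> float:
    """Full width of the normal-approximation interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return 2 * z * math.sqrt(p * (1 - p) / n)


def trials_for_width(p: float, width: float, level: float = 0.95) -> int:
    """Approximate number of trials needed for an interval of the given full width."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.ceil(p * (1 - p) * (2 * z / width) ** 2)


# Worst case p = 0.5, close to the match simulated above.
print("width at 500 trials:   ", ci_width(0.5, 500))
print("width at 50000 trials: ", ci_width(0.5, 50_000))
print("trials for width 0.001:", trials_for_width(0.5, 0.001))
```

Because the width scales with $1/\sqrt{n}$, halving the interval requires four times as many trials, which is why adding iterations quickly becomes impractical.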

We now try a dynamic programming approach to calculate the exact probability.

### Dynamic Programming

In [4]:
true_win_prob = point_to_match_dp(spw, rpw, best_of=3)
print("match win probability: ", true_win_prob)

match win probability:  0.499999571090448
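
The internals of `point_to_match_dp` are not shown in this notebook. As an illustration of the idea only, the sketch below computes the exact probability that the server wins a single standard (advantage) game by recursing over point scores. The function `game_win_prob`, the constant per-point probability, and the closed form used at deuce are assumptions made for this example, not the package's implementation.

```python
from functools import lru_cache


def game_win_prob(spw: float) -> float:
    """Exact probability that the server wins a standard advantage game,
    assuming a constant probability `spw` of winning each service point."""
    # From deuce, the server wins two points in a row, loses two in a row,
    # or returns to deuce: D = spw^2 + 2*spw*(1-spw)*D, solved for D.
    deuce = spw**2 / (spw**2 + (1 - spw) ** 2)

    @lru_cache(maxsize=None)
    def win(a: int, b: int) -> float:
        # a, b: points won so far by the server and the returner.
        if a == 3 and b == 3:
            return deuce
        if a == 4:
            return 1.0
        if b == 4:
            return 0.0
        # Condition on the outcome of the next point.
        return spw * win(a + 1, b) + (1 - spw) * win(a, b + 1)

    return win(0, 0)


print("game win probability:", game_win_prob(0.6))
```

The same pattern, a recursion over score states with memoised sub-results, could be extended to games within a set and sets within a match, which is what makes an exact calculation cheap compared with simulation.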


In [5]:
n_iters = 1000
inside = 0
# The t quantile depends only on n_trials, so compute it once.
t = scipy.stats.t.ppf(0.975, n_trials - 1)
for i in range(n_iters):
    p, winners = point_to_match_mc(spw, rpw, trials=n_trials, best_of=3)
    # Recompute the standard error from this run's estimate
    # instead of reusing the variance from the earlier cell.
    se = np.sqrt(p * (1 - p) / n_trials)
    ci = (p - t * se, p + t * se)
    if ci[0] < true_win_prob < ci[1]:
        inside += 1
print("proportion of confidence intervals containing true win probability: ", inside / n_iters)

proportion of confidence intervals containing true win probability:  0.946


### DP Results

The DP method is much faster, running in less than 0.1 seconds. The cell above shows that approximately 95% of the 95% confidence intervals produced by the MC method contain the win probability calculated by the DP approach. This matches the nominal coverage of the interval, and indicates that the DP result agrees with the Monte Carlo estimate.

We will proceed with the DP method, as it is both more precise and faster than the Monte Carlo approach.