## 1. Introduction and Dataset

My brother Ben and I play a lot of ping-pong, and we recently decided to track the scores of our games!

We actually went a little overboard and designed a system where we can track the exact progression of points in each game. 

An example of the dataset we've collected is below.

In [4]:
import processing
data = processing.process_data()
data.tail()

| point_num | point | time | gametime | ash_points | ben_points | server | game_no |
|---|---|---|---|---|---|---|---|
| 45 | 0 | 1592016000.0 | 1592015311 | 23 | 23 | 1 | 52 |
| 46 | 0 | 1592016000.0 | 1592015311 | 23 | 24 | 0 | 52 |
| 47 | 1 | 1592016000.0 | 1592015311 | 24 | 24 | 1 | 52 |
| 48 | 0 | 1592016000.0 | 1592015311 | 24 | 25 | 0 | 52 |
| 49 | 0 | 1592016000.0 | 1592015311 | 24 | 26 | 1 | 52 |


Now that we have the data, we want to model it with two goals: (1) to determine who is better at ping-pong, and (2) to better understand the stochastic progression of points within games.

### 1.1 Dataset Details

In the above sample from the data, we have the following columns:
1. ``point_num`` is the index of the point within its game
2. ``point`` denotes who won the point, where 0 means Ben won the point and 1 means Asher won the point
3. ``time`` and ``gametime`` denote the time the point was logged and the time the game was started, respectively
4. ``ash_points`` and ``ben_points`` are the cumulative number of points scored by each player in the game so far
5. ``server`` denotes who was serving (1 = Asher, 0 = Ben)
6. ``game_no`` is an ID which identifies the game being played
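
As a quick sanity check on these column definitions, the sketch below verifies that each logged point adds exactly one to the winner's running total. It uses only the five sample rows shown above (with the two time columns omitted); the helper name `scores_consistent` is made up for illustration and is not part of the `processing` module.

```python
# Rows copied from the sample above, without the time columns:
# (point_num, point, ash_points, ben_points, server, game_no)
rows = [
    (45, 0, 23, 23, 1, 52),
    (46, 0, 23, 24, 0, 52),
    (47, 1, 24, 24, 1, 52),
    (48, 0, 24, 25, 0, 52),
    (49, 0, 24, 26, 1, 52),
]


def scores_consistent(rows):
    """Return True if every point adds exactly one to the winner's total."""
    for prev, cur in zip(rows, rows[1:]):
        if prev[5] != cur[5]:
            continue  # consecutive rows from different games are not comparable
        point = cur[1]
        d_ash = cur[2] - prev[2]  # change in Asher's cumulative score
        d_ben = cur[3] - prev[3]  # change in Ben's cumulative score
        # point == 1 means Asher scored; point == 0 means Ben scored
        if (d_ash, d_ben) != (point, 1 - point):
            return False
    return True


print(scores_consistent(rows))
```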

### 1.2 Game Rules

Some ground rules right off the bat:

1. Ben and I play to $21$ with a win-by-two rule.
2. We alternate serves in pairs (i.e. one person serves twice, then the other person serves twice, repeat).
3. If the game reaches the score $20-20$, we begin to alternate every serve (i.e. one person serves once, then the other person serves once, repeat).
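
The rules above can be sketched as two small functions. This is an illustration rather than the code we used to log the data, and it rests on a few assumptions: `point_num` counts from zero, the serving rotation simply continues (without resetting) once the score reaches $20-20$, and the `first_server` argument (who served the first point of the game) is a hypothetical input that is not a column in the dataset.

```python
def game_over(ash_points, ben_points, target=21):
    """A game ends once someone has at least `target` points and leads by two."""
    lead = abs(ash_points - ben_points)
    return max(ash_points, ben_points) >= target and lead >= 2


def server_for_point(point_num, first_server, target=21):
    """Return who serves point `point_num` (1 = Asher, 0 = Ben).

    Assumes zero-based point numbers. Any point with index >= 2 * (target - 1)
    can only be played after the score has reached 20-20.
    """
    deuce_start = 2 * (target - 1)  # 40 points have been played at 20-20
    if point_num < deuce_start:
        turn = point_num // 2  # serves come in pairs
    else:
        # continue the rotation, now switching server after every point
        turn = deuce_start // 2 + (point_num - deuce_start)
    return first_server if turn % 2 == 0 else 1 - first_server


print(game_over(24, 26))
print([server_for_point(n, first_server=0) for n in range(45, 50)])
```

Under these assumptions, `first_server=0` reproduces the `server` values in the five sample rows shown earlier.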

## 2. Model 

### 2.1 Substantive Framework

Currently, we use a hidden Markov model with a mean-reverting autoregressive hidden state to model the data.

In particular, consider point $i$ of game $j$. We observe $Y_{ij}$, the player who wins this point. However, we introduce a *latent skill variable* $X_{ij}$ which measures the difference in skill between my brother and me at a particular point in time. When $X_{ij} > 0$, I have higher skill than my brother, and vice versa.

We have chosen to model this skill difference as a *stochastic process*, rather than a fixed variable, because in all honesty, the skill difference between us does change with time! Some days I outplay my brother, and some days he outplays me. 

That said, there are underlying parameters which govern the distribution of $X_{ij}$. We can perform inference on these parameters later to test whether $X_{ij}$ is biased in one direction or the other (i.e. whether one player tends to have higher skill than the other).

### 2.2 Idealized Model

To simplify notation, we now drop the game index $j$ and let $i$ index points sequentially across the whole dataset. Denote:
- $Y_{i}$ as the indicator that I (Asher) win point $i$
- $X_{i}$ as the skill gap at that time

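In general, we have that
$$Y_{i} \sim \text{Bern}(\sigma(X_{i})) $$
where $\sigma(\cdot)$ applied to $X_i$ denotes the logistic sigmoid, $\sigma(x) = 1/(1+e^{-x})$, not the standard deviation $\sigma$ that appears below.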

Then for $i \ge 1$, we model
$$X_{i} = \rho_i X_{i-1} + \sqrt{1-\rho_i^2} \cdot Z_{i}, \quad Z_{i} \sim \mathcal{N}(\mu, \sigma^2) $$
where
$$\rho_i = \begin{cases} \rho_{game} & \text{if } i \text{ is the first point of a game} \\ \rho_{point} & \text{otherwise} \end{cases} $$

When $i = 0$, the first point in the dataset, we let
$$X_0 \sim \mathcal{N}(\mu, \sigma^2) $$

We are interested in inferring the parameter $\mu$, which is the mean of the skill gap, as well as understanding the posterior distribution of the latent states, $p(X \mid Y)$. Because each $Y_i$ depends only on $X_i$, and each $X_i$ depends only on $X_{i-1}$, this is a hidden Markov model.
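
To make the generative story concrete, here is a minimal simulation sketch of the model as written above. It is not our fitting code. It assumes $\sigma(\cdot)$ in the Bernoulli step is the logistic sigmoid, and it takes a hypothetical `game_lengths` argument (the number of points in each game) as given, rather than deriving game lengths from the scoring rules.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a skill gap to P(Asher wins the point)."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate(game_lengths, mu, sigma, rho_point, rho_game, seed=0):
    """Draw latent skill gaps X and point outcomes Y from the model.

    `sigma` is the standard deviation of Z_i (the square root of sigma^2).
    """
    rng = random.Random(seed)
    xs, ys = [], []
    x = None
    for length in game_lengths:
        for k in range(length):
            if x is None:
                # very first point in the dataset: X_0 ~ N(mu, sigma^2)
                x = rng.gauss(mu, sigma)
            else:
                # rho_game on the first point of a game, rho_point otherwise
                rho = rho_game if k == 0 else rho_point
                z = rng.gauss(mu, sigma)
                x = rho * x + math.sqrt(1.0 - rho ** 2) * z
            xs.append(x)
            # Y_i ~ Bern(sigmoid(X_i)); 1 means Asher wins the point
            ys.append(1 if rng.random() < sigmoid(x) else 0)
    return xs, ys


xs, ys = simulate([5, 3], mu=0.0, sigma=1.0, rho_point=0.9, rho_game=0.5)
print(len(xs), ys)
```

Setting both correlation parameters to $1$ freezes the skill gap at $X_0$, while setting them to $0$ makes every $X_i$ an independent draw from $\mathcal{N}(\mu, \sigma^2)$.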

### 2.3 Discretized Model

Beyond $\mu$, there are $3$ additional parameters we need to infer:

1. $\sigma^2$, the variance of the skill gap
2. $\rho_{point}$, the in-game correlation constant
3. $\rho_{game}$, the between-game correlation constant

To do this, we take a Bayesian approach. In particular, we consider the priors

$$ \mu \sim \mathcal{N}(0, 1) $$
$$\sigma^2 \sim \text{invGamma}(2, 1) $$
$$\rho_{point}, \rho_{game} \sim \text{Unif}(0,1) $$
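
As a small illustration of these priors, the sketch below evaluates their joint log-density using only the standard library. It assumes the four parameters are independent a priori and that $\text{invGamma}(2, 1)$ means shape $2$ and scale $1$, i.e. density $\frac{\beta^\alpha}{\Gamma(\alpha)} x^{-\alpha-1} e^{-\beta/x}$; the function name `log_prior` is made up for this example.

```python
import math


def log_prior(mu, sigma2, rho_point, rho_game):
    """Joint log-density of the priors, assuming independent parameters."""
    # Outside the support of the priors the density is zero
    if sigma2 <= 0 or not (0 < rho_point < 1) or not (0 < rho_game < 1):
        return -math.inf
    # mu ~ N(0, 1)
    log_p_mu = -0.5 * math.log(2 * math.pi) - 0.5 * mu ** 2
    # sigma^2 ~ InvGamma(alpha=2, beta=1), shape/scale form (an assumption)
    alpha, beta = 2.0, 1.0
    log_p_sigma2 = (
        alpha * math.log(beta)
        - math.lgamma(alpha)
        - (alpha + 1) * math.log(sigma2)
        - beta / sigma2
    )
    # rho_point, rho_game ~ Unif(0, 1) each contribute log(1) = 0
    return log_p_mu + log_p_sigma2


print(log_prior(mu=0.0, sigma2=1.0, rho_point=0.5, rho_game=0.5))
```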


Then we can evaluate a density for the whole dataset and a specific choice of parameters:

$$L(X, Y, \mu, \sigma, \rho, \gamma) =  $$