# Hypothesis

From the [rules](https://www.kaggle.com/c/lux-ai-2021/overview/evaluation)

> After an Episode finishes, we'll update the Rating estimate for all Submissions in that Episode.
> If one Submission won, we'll increase its $\mu$ and decrease its opponent's $\mu$ - if the result was a draw, then we'll move the two $\mu$ values closer towards their mean.
> The updates will have magnitude relative to the deviation from the expected result based on the previous $\mu$ values, and also relative to each Submission's uncertainty $\sigma$.
> We also reduce the $\sigma$ terms relative to the amount of information gained by the result. The score by which your bot wins or loses an Episode does not affect the skill rating updates.

We hypothesize that the winner's rating update depends on only two variables: the initial rating difference and the initial confidence of your agent.

We try to derive the winner's rating update from these two variables alone.

We will also calculate the expected rating difference between two bots given a win probability.

# Data Analysis

We use matches from an agent submitted by Toad Brigade. The matches were crawled by Rogba. See data section for details.

In [None]:
import glob
import json

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

In [None]:
diffs = []
confs_1 = []
confs_2 = []
target_1 = []
target_2 = []
match_times = []

pattern = "/kaggle/input/simulations-episode-scraper-match-downloader/*_info.json"
for filename in sorted(glob.glob(pattern)):
    with open(filename) as fd:
        json_data_full = json.load(fd)
        json_data = json_data_full["agents"]
        
        # Put the tracked submission at index a0 and its opponent at a1.
        if json_data[0]["submissionId"] == 22777661:
            a0, a1 = 0, 1
        else:
            a0, a1 = 1, 0
        
        if json_data[a0]["initialConfidence"] == 0:
            continue

#         if json_data[a0]["initialConfidence"] != 35:
#             continue
#         if json_data[a1]["initialConfidence"] != 35:
#             continue

        diffs.append(json_data[a0]["initialScore"] - json_data[a1]["initialScore"])
        confs_1.append(json_data[a0]["initialConfidence"])
        confs_2.append(json_data[a1]["initialConfidence"])
        target_1.append(json_data[a0]["updatedScore"] - json_data[a0]["initialScore"])
        target_2.append(json_data[a1]["updatedScore"] - json_data[a1]["initialScore"])
        match_times.append(json_data_full["createTime"]["seconds"])
        

match_times = np.array(match_times)
diffs = np.array(diffs)
confs_1 = np.array(confs_1)
confs_2 = np.array(confs_2)
target_1 = np.array(target_1)
target_2 = np.array(target_2)

In [None]:
json_data_full

These are the contents of one sample `info.json` file (the last one loaded above).

In [None]:
plt.plot(match_times)
plt.show()

We see that the first 40 matches are played in quick succession, and the following matches are played at a slower rate.

This also checks that the matches that we are enumerating are in chronological order.
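
The ordering can also be checked programmatically rather than by eye. The sketch below is a minimal example; `is_chronological` and the sample timestamps are hypothetical and only illustrate the check that would be applied to `match_times`.

```python
def is_chronological(timestamps):
    """Return True if every timestamp is >= the one before it."""
    return all(a <= b for a, b in zip(timestamps, timestamps[1:]))


# Hypothetical createTime values in seconds, for illustration only.
sample_times = [1630000000, 1630000060, 1630000180, 1630003600]
print(is_chronological(sample_times))
print(is_chronological([3, 1, 2]))
```

In the notebook, the same function could be called on `match_times` after loading the files.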

In [None]:
plt.plot(confs_1)
plt.show()

We see that the `initialConfidence` of the submitted agent generally decreases over time, and converges to 35 around the 70th match.

However, the `initialConfidence` can apparently increase at times during the initial matches.

In [None]:
target_1_filtered = []
target_2_filtered = []
target_difference = []

for c1, c2, t1, t2 in zip(confs_1, confs_2, target_1, target_2):
    if c1 == c2 == 35:
        target_1_filtered.append(t1)
        target_2_filtered.append(t2)
        target_difference.append(t1+t2)
        
plt.plot(target_1_filtered)
plt.plot(target_2_filtered)
plt.plot(target_difference)
plt.show()

If the `initialConfidence` of both agents is 35, the two rating updates cancel each other out: their sum is zero.
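
The cancellation can be verified numerically instead of being read off the plot. This is a minimal sketch: `updates_cancel`, its tolerance, and the sample values are assumptions for illustration, and real data may need a looser tolerance because of floating-point residue.

```python
import math


def updates_cancel(updates_a, updates_b, tol=1e-6):
    """Return True if each pair of rating updates sums to (nearly) zero."""
    return all(
        math.isclose(a + b, 0.0, abs_tol=tol)
        for a, b in zip(updates_a, updates_b)
    )


# Made-up rating updates, not taken from the dataset.
agent_updates = [4.2, -3.1, 5.0]
opponent_updates = [-4.2, 3.1, -5.0]
print(updates_cancel(agent_updates, opponent_updates))
```

In the notebook, this could be applied to `target_1_filtered` and `target_2_filtered`.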

# Fixed `initialConfidence`

In [None]:
diffs = []
confs_1 = []
confs_2 = []
target_1 = []
target_2 = []
match_times = []

pattern = "/kaggle/input/simulations-episode-scraper-match-downloader/*_info.json"
for filename in sorted(glob.glob(pattern)):
    with open(filename) as fd:
        json_data_full = json.load(fd)
        json_data = json_data_full["agents"]
        
        if json_data[0]["reward"] > json_data[1]["reward"]:
            a0, a1 = 0, 1
        elif json_data[0]["reward"] < json_data[1]["reward"]:
            a0, a1 = 1, 0
        else:
            continue
        
        if json_data[a0]["initialConfidence"] == 0:
            continue

        if json_data[a0]["initialConfidence"] != 35:
            continue
#         if json_data[a1]["initialConfidence"] != 35:
#             continue

        diffs.append(json_data[a0]["initialScore"] - json_data[a1]["initialScore"])
        confs_1.append(json_data[a0]["initialConfidence"])
        confs_2.append(json_data[a1]["initialConfidence"])
        target_1.append(json_data[a0]["updatedScore"] - json_data[a0]["initialScore"])
        target_2.append(json_data[a1]["updatedScore"] - json_data[a1]["initialScore"])
        match_times.append(json_data_full["createTime"]["seconds"])

match_times = np.array(match_times)
diffs = np.array(diffs)
confs_1 = np.array(confs_1)
confs_2 = np.array(confs_2)
target_1 = np.array(target_1)
target_2 = np.array(target_2)

In [None]:
x = diffs
y = target_1 ** 0.5  # square root of the winner's rating update

plt.scatter(x, y, c=confs_2)

X, Y = np.array(x).reshape(-1,1), np.array(y).reshape(-1,1)
lr = LinearRegression().fit(X, Y)
plt.plot(X, lr.predict(X))
plt.show()

Despite the different `initialConfidence` values of the other agent (see color), the initial rating difference and the square root of the rating update follow a strong linear relationship.

Therefore, we conclude that the other agent's `initialConfidence` does not affect your rating update.
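
The slope and intercept of such a fit can be read off directly; for the scikit-learn model above they are available as `lr.coef_` and `lr.intercept_`. The sketch below shows the same idea with a small least-squares helper on synthetic data. The helper `fit_line`, the made-up line, and the sample rating differences are all assumptions for illustration and are not taken from the dataset.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic winner updates generated from a made-up line, so the fit
# should recover that line after taking square roots.
true_slope, true_intercept = -0.004, 2.0
rating_diffs = [-200.0, -100.0, 0.0, 100.0, 200.0]
winner_updates = [(true_slope * d + true_intercept) ** 2 for d in rating_diffs]

slope, intercept = fit_line(rating_diffs, [math.sqrt(u) for u in winner_updates])
print(slope, intercept)
```

Taking the square root before fitting is what turns the quadratic relation into a straight line, which is why the regression target is `target_1 ** 0.5` rather than `target_1`.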

In [None]:
predicted = (diffs * -0.0038636399198901045 + 2.114669551580679) ** 2
actual = target_1

plt.scatter(predicted, actual, c=confs_2)
plt.show()

We now have a closed-form fit for the winner's rating update, given that its `initialConfidence` is 35.
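
As a small worked example, the fitted relation can be wrapped in a function. The function name `winner_update_at_conf_35` is hypothetical; the two constants are the ones used in the cell above, and the result is only as good as that fit.

```python
# Fitted constants from the regression above (winner's initialConfidence == 35).
M = -0.0038636399198901045
C = 2.114669551580679


def winner_update_at_conf_35(rating_diff):
    """Estimated rating gain for the winner.

    rating_diff is the winner's initial rating minus the loser's.
    """
    return (M * rating_diff + C) ** 2


for diff in (-100, 0, 100):
    print(diff, round(winner_update_at_conf_35(diff), 3))
```

With equal ratings the estimate is `C ** 2`, roughly 4.47 points, and the gain grows when the winner started as the lower-rated agent.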

# Variable `initialConfidence`

However, the `initialConfidence` of your agent might not be 35. We attempt to derive the rating update value of the winner in the general case.

In [None]:
diffs = []
confs_1 = []
confs_2 = []
target_1 = []
target_2 = []
match_times = []

pattern = "/kaggle/input/simulations-episode-scraper-match-downloader/*_info.json"
for filename in sorted(glob.glob(pattern)):
    with open(filename) as fd:
        json_data_full = json.load(fd)
        json_data = json_data_full["agents"]
        
        if json_data[0]["reward"] > json_data[1]["reward"]:
            a0, a1 = 0, 1
        elif json_data[0]["reward"] < json_data[1]["reward"]:
            a0, a1 = 1, 0
        else:
            continue
        
        if json_data[a0]["initialConfidence"] == 0:
            continue

#         if json_data[a0]["initialConfidence"] != 35:
#             continue
#         if json_data[a1]["initialConfidence"] != 35:
#             continue

        diffs.append(json_data[a0]["initialScore"] - json_data[a1]["initialScore"])
        confs_1.append(json_data[a0]["initialConfidence"])
        confs_2.append(json_data[a1]["initialConfidence"])
        target_1.append(json_data[a0]["updatedScore"] - json_data[a0]["initialScore"])
        target_2.append(json_data[a1]["updatedScore"] - json_data[a1]["initialScore"])
        match_times.append(json_data_full["createTime"]["seconds"])

match_times = np.array(match_times)
diffs = np.array(diffs)
confs_1 = np.array(confs_1)
confs_2 = np.array(confs_2)
target_1 = np.array(target_1)
target_2 = np.array(target_2)

In [None]:
x = confs_1
y = ((diffs * -0.0038636399198901045 + 2.114669551580679) - target_1 ** 0.5)

plt.scatter(x, y, c=confs_1)

X, Y = np.array(x).reshape(-1,1), np.array(y).reshape(-1,1)
lr = LinearRegression().fit(X, Y)
plt.plot(X, lr.predict(X))
plt.show()

The error term and the confidence rating follow a somewhat strong linear relationship.

In [None]:
predicted = ((diffs * -0.0038636399198901045 + 2.114669551580679) - (confs_1*-0.04830843616997959 + 1.6686794100048548)) ** 2
actual = target_1

plt.scatter(predicted, actual, c=confs_1)
plt.show()

This is not a very good estimate, but it is probably usable for now.

How the `initialConfidence` converges is also unexplored.
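
The two-step fit above can be collected into one function. This is a sketch under the notebook's own fitted constants; the function name `winner_update` and the names `A` and `B` for the confidence correction are hypothetical, and the estimate inherits the looseness of the second regression.

```python
# Constants from the two regressions above.
M = -0.0038636399198901045   # slope with respect to rating difference
C = 2.114669551580679        # intercept at initialConfidence == 35
A = -0.04830843616997959     # slope of the error term with respect to confidence
B = 1.6686794100048548       # intercept of the error term


def winner_update(rating_diff, own_confidence):
    """Rough estimate of the winner's rating gain for any initialConfidence."""
    base_root = M * rating_diff + C
    correction = A * own_confidence + B
    return (base_root - correction) ** 2


print(winner_update(0, 35))
print(winner_update(0, 100))
```

At `own_confidence == 35` the correction term is about -0.022, so this estimate stays close to the fixed-confidence formula, as it should.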

# Expected Rating Convergence

From the result with the `initialConfidence` of your agent fixed at 35, the rating update if you win will be

`(mx + c)**2`

where
* m = -0.0038636399198901045
* x = how much more initial rating you have compared to your opponent
* c = 2.114669551580679

For a win probability of `p`, your rating stops drifting when the expected gain equals the expected loss. Assuming a loss costs you exactly what your opponent gains (as observed above when both confidences are 35), this means `p*(mx+c)**2 = (1-p)*(c-mx)**2`. Taking square roots and solving for `x`, the expected rating advantage you are going to converge to is `(c-qc)/(m+mq)`

where `q = (p/(1-p))**0.5`

In [None]:
eps = 10**(-9)
p = np.linspace(0+eps,1-eps,100+1)
q = (p/(1-p))**0.5
m = -0.0038636399198901045
c = 2.114669551580679
converged_difference = (c-q*c)/(m+m*q)

In [None]:
plt.plot(p,converged_difference)
plt.show()

For example
- if your bot's win rate against another bot is 100%, your rating will converge to about 547.3 (that is, `-c/m`) more than your opponent's
- if your bot's win rate against another bot is 50%, your rating will converge to 0 more than your opponent's
- if your bot's win rate against another bot is 75%, your rating will converge to about 146.7 more than your opponent's

In [None]:
converged_difference[75]
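
One way to sanity-check the converged difference is to solve the balance condition numerically. The sketch below assumes that a loss costs you exactly what your opponent gains, which was observed above only when both confidences are 35. The function names `expected_change` and `converged_difference_numeric` are hypothetical, and bisection is a choice made here for simplicity, not part of the original analysis.

```python
M = -0.0038636399198901045
C = 2.114669551580679


def expected_change(x, p):
    """Expected rating change per match at rating advantage x and win rate p."""
    gain = (M * x + C) ** 2   # your update if you win
    loss = (C - M * x) ** 2   # opponent's update if they win (difference is -x)
    return p * gain - (1 - p) * loss


def converged_difference_numeric(p, iterations=200):
    """Find the advantage x where the expected change is zero, by bisection."""
    if p < 0.5:
        # By symmetry, the weaker bot converges to the mirrored disadvantage.
        return -converged_difference_numeric(1 - p, iterations)
    lo, hi = 0.0, -C / M      # expected change is >= 0 at lo and <= 0 at hi
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if expected_change(mid, p) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for win_rate in (0.5, 0.75, 0.9):
    print(win_rate, converged_difference_numeric(win_rate))
```

Because the expected change decreases monotonically in `x` on this interval, bisection is enough and no external solver is needed.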