<h2><center> Rock, Paper, Scissors competition - leaderboard analysis </center></h2>

<h2><center> <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/7/78/Cup-vang-chat-luong-AnThaiCafe.png/800px-Cup-vang-chat-luong-AnThaiCafe.png" alt="Rock, Paper, Scissors img"></center></h2>

In [None]:
import json
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from kaggle_environments import (evaluate, make, utils, get_episode_replay, list_episodes, list_episodes_for_team, list_episodes_for_submission)
import datetime
import glob
import warnings
from IPython.display import display, Markdown
pd.set_option("display.max_rows", 200)
pd.options.display.float_format = '{:,.2f}'.format
warnings.filterwarnings('ignore')

In [None]:
!wget "https://www.kaggle.com/c/rock-paper-scissors/leaderboard.json?includeBeforeUser=true&includeAfterUser=false" -O leaderboard.json

In [None]:
with open("leaderboard.json") as f:
    jsn = json.load(f)
# Build the table from a list of dicts (DataFrame.append was removed in pandas 2.0).
rows = [{"team_name": user["teamName"],
         "team_id": user["teamId"],
         "score": user["score"],
         "n_agents": user["entries"],
         "team_rank": user["rank"]}
        for user in jsn["beforeUser"] + jsn["afterUser"]]
leaderboard = pd.DataFrame(rows, columns=["team_name", "team_id", "score", "n_agents", "team_rank"])
leaderboard[["score", "n_agents", "team_rank"]] = leaderboard[["score", "n_agents", "team_rank"]].apply(pd.to_numeric)

# 1. Team score distribution

The median team score is well above 600, the starting score of every bot. In other words, most teams have at least one decent bot.
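
To make this claim easy to re-check, here is a minimal sketch that reports the median and the share of teams above the starting score. The `summarize_vs_start` helper, the `START_SCORE` constant and the `toy_scores` values are hypothetical and only illustrate the idea; on the real data you would pass `leaderboard["score"]` instead.

```python
import pandas as pd

START_SCORE = 600  # starting score of every bot, as stated above


def summarize_vs_start(scores: pd.Series, start: float = START_SCORE) -> dict:
    """Return the median and the share of scores strictly above `start`."""
    return {
        "median": float(scores.median()),
        "share_above_start": float((scores > start).mean()),
    }


# Hypothetical toy values for illustration only (not real leaderboard scores).
toy_scores = pd.Series([450.0, 600.0, 640.0, 700.0, 820.0])
summary = summarize_vs_start(toy_scores)
print(summary)
```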

In [None]:
# Sort once and look thresholds up by position, so the result does not depend on the original row labels.
sorted_scores = leaderboard["score"].sort_values(ascending=False).reset_index(drop=True)
gold_min_score = sorted_scores.iloc[12]
silver_min_score = sorted_scores.iloc[77]
bronze_min_score = sorted_scores.iloc[156]
plt.figure(figsize=(25,8))
plt.hist(leaderboard["score"], color="lightsteelblue", bins=200)
plt.axvline(x=gold_min_score, color="gold")
plt.axvline(x=silver_min_score, color="silver")
plt.axvline(x=bronze_min_score, color="peru")
plt.xlabel("Team score")
plt.ylabel("Number of teams")
plt.legend(title="Team score distribution (vertical lines are medal thresholds)", loc="upper center", title_fontsize=25)
plt.show()

In [None]:
leaderboard["score"].describe()

In [None]:
plt.figure(figsize=(25,8))
plt.hist(leaderboard["score"][leaderboard["score"] > 900], color="papayawhip", bins=100)
plt.axvline(x=gold_min_score, color="gold")
plt.axvline(x=silver_min_score, color="silver")
plt.axvline(x=bronze_min_score, color="peru")
plt.xlabel("Team score")
plt.ylabel("Number of teams")
plt.legend(title="Team score distribution (only teams with score > 900, vertical lines are medal thresholds)", loc="upper center", title_fontsize=25)
plt.show()

# 2. Number of submissions

Teams with a score below 600 are not participating actively. They typically have only one or two submissions.
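
A quick way to back this up with numbers is to compare the typical number of agents on both sides of the starting score. The sketch below does that on a made-up `toy` table (hypothetical values); `agents_by_score_group` is an illustrative helper and not part of the notebook, and it assumes the same `score` and `n_agents` columns as `leaderboard`.

```python
import pandas as pd


def agents_by_score_group(df: pd.DataFrame, start: float = 600) -> pd.Series:
    """Median number of agents for teams below vs. at or above `start`."""
    group = (df["score"] < start).map({True: "below start", False: "at or above start"})
    return df.groupby(group)["n_agents"].median()


# Hypothetical toy values for illustration only.
toy = pd.DataFrame({"score": [520, 580, 590, 650, 800, 950],
                    "n_agents": [1, 2, 1, 10, 40, 120]})
medians = agents_by_score_group(toy)
print(medians)
```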

In [None]:
plt.figure(figsize=(25,8))
plt.hist(leaderboard["n_agents"][leaderboard["score"] < 600], color="teal", bins=50)
plt.xlabel("Number of submissions (agents)")
plt.ylabel("Number of teams")
plt.legend(title="Number of agents (teams with score < 600)", loc="upper center", title_fontsize=25)
plt.show()

In general, there is some correlation between the number of agents and the team score. This is hardly surprising, since you need more submissions to test new models and tune parameters. Or, as [@karham](https://www.kaggle.com/karham) [stressed](https://www.kaggle.com/c/rock-paper-scissors/discussion/211336):
> Correlation is not necessarily causation. ... Do remember though that those who have 300 submissions are very likely not simply doing it for the sake of having submissions. They are probably obsessed, as I am, with this contest, constantly iterating and improving. So it's possible that high submission count is merely an indicator of effort and dedication, which generally translates to score too.
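
Since the number of submissions is heavily skewed, a rank-based (Spearman) correlation may be a useful complement to the Pearson coefficient. The sketch below is only an illustration on made-up values: it computes Spearman as the Pearson correlation of ranks, which avoids any extra dependency. The `toy` table is hypothetical; it mimics the `n_agents` and `score` columns of `leaderboard`.

```python
import pandas as pd

# Hypothetical toy values: score grows with the number of agents, but not linearly.
toy = pd.DataFrame({"n_agents": [1, 2, 5, 10, 50, 300],
                    "score": [580, 610, 650, 700, 760, 800]})

# Pearson measures linear association.
pearson = toy["n_agents"].corr(toy["score"])

# Spearman is the Pearson correlation of the ranks, so it measures monotonic association.
spearman = toy["n_agents"].rank().corr(toy["score"].rank())

print("Pearson:", pearson, "Spearman:", spearman)
```

On monotonic but non-linear data like this, the rank-based coefficient is higher than the linear one, so a modest Pearson value does not rule out a strong monotonic relationship.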

In [None]:
print("Correlation coef.:", np.corrcoef(leaderboard["n_agents"], leaderboard["score"])[0, 1])

In [None]:
sns.set(style="white")
sns.lmplot(x="n_agents", y="score", data=leaderboard, scatter_kws={"alpha": 0.5}, line_kws={"color": "red"}, height=10)
plt.legend(title="Number of agents vs team score", loc="lower center", title_fontsize = 25)
plt.show()

The same for medal-winning teams:

In [None]:
sns.set(style="white")
sns.lmplot(x="n_agents", y="score", data=leaderboard[leaderboard["score"] > bronze_min_score],
           scatter_kws={"alpha": 0.5}, line_kws={"color": "coral"}, height=8)
plt.legend(title="Number of agents vs team score (medal-winning teams)", loc="lower center", title_fontsize = 25)
plt.show()

# 3. Agent score distribution

Below I use [the data](https://www.kaggle.com/demche/rock-paper-scissors-agents-data) collected from the Kaggle API and Meta Kaggle. Due to API limitations, the team is unknown for some bots.

In [None]:
episodes = pd.read_csv("../input/meta-kaggle/Episodes.csv")
gaps = sorted(set(range(episodes[episodes["CompetitionId"] == 22838]["Id"].min(), episodes["Id"].max() + 1)) - set(episodes["Id"].values), reverse=True)
episodes = episodes.loc[episodes["CompetitionId"] == 22838]
episodes["CreateTime"] = pd.to_datetime(episodes["CreateTime"], format="%m/%d/%Y %H:%M:%S")
episodes = episodes[["Id", "CreateTime"]]

episode_agents = pd.read_csv("../input/meta-kaggle/EpisodeAgents.csv")
episode_agents = pd.merge(episode_agents, episodes, left_on="EpisodeId", right_on="Id")
episode_agents = episode_agents[["EpisodeId", "CreateTime", "SubmissionId", "UpdatedScore"]]
episode_agents = episode_agents.drop_duplicates()
mapping_rows = []  # one dict per submission: team, submission id, submission time

# Latest episode of every submission; the API response for it also lists the submission's team.
episodes_to_consider = episode_agents[episode_agents["EpisodeId"].isin(episodes["Id"])].groupby(["SubmissionId"])["EpisodeId"].max().to_list()
for i in range(0, len(episodes_to_consider), 1000):
    batch = episodes_to_consider[i:i + 1000]
    try:
        resp = list_episodes(batch)
        for submission in resp["result"]["submissions"]:
            mapping_rows.append({"team_id": submission["teamId"],
                                 "submission_id": submission["id"],
                                 "submission_dt": datetime.datetime.strptime(submission["dateSubmitted"][:19], "%Y-%m-%dT%H:%M:%S")})
    except Exception as ex:
        print("Error:", ex)
        continue

agents_mapping = pd.DataFrame(mapping_rows, columns=["team_id", "submission_id", "submission_dt"])

new_episode_rows = []  # episode/agent rows that are missing from Meta Kaggle
new_mapping_rows = []  # submission -> team rows returned alongside them
for i in range(0, len(gaps), 1000):
    batch = gaps[i:i + 1000]
    try:
        resp = list_episodes(batch)
        if len(resp["result"]["episodes"]) != 0:
            for episode in resp["result"]["episodes"]:
                if episode["competitionId"] != 22838:
                    continue
                # `datetime` is imported as a module, hence datetime.datetime.strptime
                create_time = datetime.datetime.strptime(episode["createTime"][:19], "%Y-%m-%dT%H:%M:%S")
                for agent in episode["agents"]:
                    new_episode_rows.append({"EpisodeId": episode["id"],
                                             "CreateTime": create_time,
                                             "SubmissionId": agent["submissionId"],
                                             "UpdatedScore": agent["updatedScore"]})
            # Submissions come from the API response, not from the `episodes` DataFrame
            for submission in resp["result"]["submissions"]:
                new_mapping_rows.append({"team_id": submission["teamId"],
                                         "submission_id": submission["id"],
                                         "submission_dt": datetime.datetime.strptime(submission["dateSubmitted"][:19], "%Y-%m-%dT%H:%M:%S")})
    except Exception as ex:
        print("Error:", ex)
        continue

if new_episode_rows:
    episode_agents = pd.concat([episode_agents, pd.DataFrame(new_episode_rows)], ignore_index=True)
if new_mapping_rows:
    agents_mapping = pd.concat([agents_mapping, pd.DataFrame(new_mapping_rows)], ignore_index=True)
        
agents_mapping = agents_mapping.drop_duplicates(subset=["submission_id"])
episode_agents = episode_agents[episode_agents["SubmissionId"].isin(agents_mapping["submission_id"])]
episode_agents = episode_agents.drop_duplicates()
episode_agents["date"] = episode_agents["CreateTime"].dt.date
agents = episode_agents.loc[episode_agents.groupby("SubmissionId").CreateTime.idxmax()].dropna(subset=["UpdatedScore"]).\
    loc[:, ["SubmissionId", "UpdatedScore"]].reset_index(drop=True)
agents.columns = ["submission_id", "score"]
agents = pd.merge(agents, agents_mapping, on="submission_id", how="left")
agents = agents.drop_duplicates(subset=["submission_id"])
agents = pd.merge(agents, leaderboard.loc[:, ["team_name", "team_id"]], on="team_id", how="left")
agents["medal"] = ["gold" if x >= gold_min_score else "silver" if x >= silver_min_score else "peru" if x >= bronze_min_score else "lightblue" \
     for x in agents["score"]]


In [None]:
plt.figure(figsize=(25,8))
plt.hist(agents["score"], color="thistle", bins=250)
plt.axvline(x=gold_min_score, color="gold")
plt.axvline(x=silver_min_score, color="silver")
plt.axvline(x=bronze_min_score, color="peru")
plt.xlabel("Agent score")
plt.ylabel("Number of individual agents")
plt.legend(title="Score distribution for individual agents (vertical lines are medal thresholds)", loc="upper center", title_fontsize=25)
plt.show()

In [None]:
agents["score"].describe()

In [None]:
stat_rps = episode_agents.groupby("date")["UpdatedScore"].agg(["max", "min", "mean", "median"])
stat_rps.plot(figsize=(25,10), title="Score summary statistics for individual agents", colormap="PuOr")
plt.show()

Agents at or above the bronze medal threshold:

In [None]:
plt.figure(figsize=(25,8))
plt.hist(agents["score"][agents["score"] >= bronze_min_score], color="pink", bins=200)
plt.axvline(x=gold_min_score, color="gold")
plt.axvline(x=silver_min_score, color="silver")
plt.axvline(x=bronze_min_score, color="peru")
plt.xlabel("Agent score")
plt.ylabel("Number of individual agents")
plt.legend(title="Score distribution for individual agents (medal zone only)", loc="upper center", title_fontsize=25)
plt.show()

In [None]:
agents["score"][agents["score"] >= bronze_min_score].describe()

# 4. Submission time for agents

There is [an intuition](https://www.kaggle.com/c/rock-paper-scissors/discussion/205162) that we need some time to evaluate the true value of an agent. However, it's not clear whether the score [depends on](https://www.kaggle.com/c/rock-paper-scissors/discussion/213306) submission timing.
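
One simple way to probe this (a sketch only, not something the linked discussions prescribe) is to compare the median agent score per submission week. The `median_score_by_week` helper and the `toy_agents` values are hypothetical; the sketch assumes a table with `submission_dt` and `score` columns like `agents`.

```python
import pandas as pd


def median_score_by_week(df: pd.DataFrame) -> pd.Series:
    """Median agent score per calendar week of submission."""
    weeks = df["submission_dt"].dt.to_period("W")
    return df.groupby(weeks)["score"].median()


# Hypothetical toy values: two agents submitted in each of two consecutive weeks.
toy_agents = pd.DataFrame({
    "submission_dt": pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-12", "2021-01-13"]),
    "score": [600.0, 700.0, 800.0, 900.0],
})
weekly = median_score_by_week(toy_agents)
print(weekly)
```

A flat weekly median would suggest that timing matters little. A trend would need a closer look, since later agents are probably also better tuned.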

In [None]:
plt.figure(figsize=(25,15))
plt.scatter(agents["submission_dt"], agents["score"], c=agents["medal"], alpha=0.7)
plt.xlabel("Submission date")
plt.ylabel("Score")
plt.legend(title="Agent scores by submission time", loc="lower center", title_fontsize=25)
plt.show()

Next, let's look at time trends for the top 100 teams.

In [None]:
top_teams_agents = agents[agents["team_id"].isin(leaderboard.sort_values(by=["score"], ascending=False).head(100)["team_id"])].reset_index()
top_teams_agents["medal"] = ["gold" if x >= gold_min_score else "silver" if x >= silver_min_score else "peru" if x >= bronze_min_score else "blue" \
     for x in top_teams_agents["score"]]
top_teams_agents = pd.merge(top_teams_agents, leaderboard[["team_id", "team_rank"]], on=["team_id"], how="left")

In [None]:
for team_rank, group in top_teams_agents.groupby("team_rank"):
    display(Markdown("Team #" + str(team_rank) + ": " + group["team_name"].unique()[0]))
    plt.figure(figsize=(20, 5))
    plt.scatter(group["submission_dt"], group["score"], c=group["medal"])
    plt.ylim(400, 1300)
    plt.title("Team #" + str(team_rank) + ": " + group["team_name"].unique()[0])
    plt.show()

# 5. Top 100 agents

The leaderboard reveals only the top score for each team. So, let's look at the top 100 agents (NB: [@robga](https://www.kaggle.com/robga) was the first to [do that](https://www.kaggle.com/c/rock-paper-scissors/discussion/202556)).
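
To see why the two views can differ, here is a minimal sketch on made-up data: the team view keeps only the best score per team, while the agent view ranks every agent. The `toy_agents` table is hypothetical and only mimics the `team_name` and `score` columns of `agents`.

```python
import pandas as pd

# Hypothetical toy values: team A has three strong agents, teams B and C have one each.
toy_agents = pd.DataFrame({
    "team_name": ["A", "A", "A", "B", "C"],
    "score": [1100, 1080, 1050, 1060, 900],
})

# Leaderboard-style view: one row per team, best agent only.
team_view = toy_agents.groupby("team_name")["score"].max().sort_values(ascending=False)

# Agent-level view: the three best agents overall, whatever their team.
agent_view = toy_agents.nlargest(3, "score")
teams_in_top3 = agent_view["team_name"].value_counts()

print(team_view)
print(teams_in_top3)
```

In this toy example team A holds two of the three best agents, which the team view alone cannot show.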

In [None]:
top100_agents = agents.sort_values(by=["score"], ascending=False).head(100).reset_index()
top100_agents.loc[:, ["team_name", "submission_id", "score"]]

In [None]:
plt.figure(figsize=(20, 30))
sns.countplot(y=top100_agents['team_name'], color = 'wheat', order=top100_agents['team_name'].value_counts().index)
plt.legend(title="TOP-100 agents by team", loc="upper right", title_fontsize = 15)
plt.show()

# 6. Agents with high scores

In [None]:
# Count, for each team, the agents scoring above each threshold.
# rename()/rename_axis() keep the column names stable across pandas versions.
thresholds = [900, 950, 1000, 1050]
counts = [agents.loc[agents["score"] > t, "team_name"].value_counts().rename(f"score > {t}")
          for t in thresholds]
top_scored_agents = pd.concat(counts, axis=1).fillna(0).astype(int).rename_axis("team").reset_index()
top_scored_agents.style.background_gradient(cmap="Wistia")

# 7. Agents in "medal zone"

In [None]:
# Count, for each team, the agents in each medal zone ("peru" is the color label used for bronze).
medal_colors = {"gold": "gold", "silver": "silver", "bronze": "peru"}
medal_counts = [agents.loc[agents["medal"] == color, "team_name"].value_counts().rename(medal)
                for medal, color in medal_colors.items()]
medal_zone_agents = pd.concat(medal_counts, axis=1).fillna(0).astype(int).rename_axis("team").reset_index()
medal_zone_agents.style.background_gradient(cmap="Wistia")

# 8. Best agents for top-100 teams

As an additional check, we can look at the best agents of the top-100 teams. Some leaderboard positions may be partly down to luck (i.e. they come from a single lucky bot). The top 30 agents of each top team should be sufficient for this:

In [None]:
top_teams_agents_best30 = top_teams_agents.sort_values("score",ascending = False).groupby("team_id").head(30).reset_index(drop=True)
# One grouped aggregation instead of merging frames with different index and column levels.
view = top_teams_agents_best30.groupby(["team_rank", "team_name"]).agg(
    min=("score", "min"), mean=("score", "mean"), median=("score", "median"), max=("score", "max"),
    agents=("submission_id", "count"))
view

In [None]:
group_order = top_teams_agents_best30.groupby("team_name")["score"].max().sort_values(ascending=False).index.to_list()
plt.figure(figsize=(20, 100))
sns.boxplot(x="score", y="team_name", palette="vlag", data=top_teams_agents_best30, order=group_order)
sns.stripplot(x="score", y="team_name", data=top_teams_agents_best30, size=6, color=".4", linewidth=0, order=group_order)
plt.legend(title="Top team scores (best 30 agents for each team)", loc="upper right", title_fontsize = 15)
plt.show()

Agent score from 1st to 10th for each top-100 team:

In [None]:
top_teams_agents_best10 = top_teams_agents.sort_values("score",ascending = False).groupby("team_id").head(10).reset_index(drop=True)
top_teams_agents_best10 = top_teams_agents_best10.loc[:, ["team_name", "score"]]
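# method="first" gives unique ranks 1..10 even for tied scores, so the pivot below has no duplicate cells
top_teams_agents_best10["rank"] = top_teams_agents_best10.groupby("team_name")["score"].rank(method="first", ascending=False).astype(int)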
top_teams_agents_best10.pivot(index="team_name", columns="rank", values="score").sort_values(1,ascending = False).style.background_gradient(cmap="YlGn")

# 9. Best agent vs top agents

The best agent is clearly an outlier if its score differs significantly from the scores of the team's other top agents.
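
The idea can be written as a per-team gap: best score minus the mean of the team's top-k scores. The sketch below illustrates it on made-up data; `best_vs_topk_gap` and the `toy_agents` values are hypothetical, and the sketch assumes the `team_name` and `score` columns used in `agents`.

```python
import pandas as pd


def best_vs_topk_gap(df: pd.DataFrame, k: int = 10) -> pd.DataFrame:
    """Per team: best score, mean of the top-k scores, and their difference."""
    ordered = df.sort_values("score", ascending=False)
    topk = ordered.groupby("team_name").head(k)  # k best agents of each team
    stats = topk.groupby("team_name")["score"].agg(best="max", topk_mean="mean")
    stats["gap"] = stats["best"] - stats["topk_mean"]
    return stats


# Hypothetical toy values: both teams share the same best score.
toy_agents = pd.DataFrame({
    "team_name": ["steady"] * 3 + ["lucky"] * 3,
    "score": [1000.0, 990.0, 980.0, 1000.0, 700.0, 600.0],
})
gaps = best_vs_topk_gap(toy_agents, k=3)
print(gaps)
```

Both toy teams would sit at the same leaderboard position, but the much larger gap of the "lucky" team flags its best agent as an outlier.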

In [None]:
best1_agents = agents.sort_values("score",ascending = False).groupby("team_name").head(1).reset_index(drop=True).\
    groupby("team_name").agg({"score": np.mean}).rename(columns={"score": "best agent"})
best10_agents = agents.sort_values("score",ascending = False).groupby("team_name").head(10).reset_index(drop=True).\
    groupby("team_name").agg({"score": np.mean}).rename(columns={"score": "top 10 agents"})
best30_agents = agents.sort_values("score",ascending = False).groupby("team_name").head(30).reset_index(drop=True).\
    groupby("team_name").agg({"score": np.mean}).rename(columns={"score": "top 30 agents"})
best_agents = pd.merge(pd.merge(pd.merge(best1_agents, best10_agents, on=["team_name"]), best30_agents, on=["team_name"]), 
                leaderboard.loc[:, ["team_name", "n_agents"]], on=["team_name"])
best_agents["difference best - top 10"] = best_agents.apply(lambda x: x["best agent"] - x["top 10 agents"], axis=1)
best_agents["difference best - top 30"] = best_agents.apply(lambda x: x["best agent"] - x["top 30 agents"], axis=1)
best_agents["medal"] = ["gold" if x >= gold_min_score else "silver" if x >= silver_min_score else "peru" if x >= bronze_min_score else "lightblue" \
     for x in best_agents["best agent"]]

In [None]:
plt.figure(figsize=(25,15))
plt.scatter(best_agents["best agent"], best_agents["top 10 agents"], c=best_agents["medal"], alpha=0.7)
plt.xlabel("Best agent score")
plt.ylabel("Top-10 agents score (mean)")
plt.legend(title="Top-10 agents score (mean) vs best agent score", loc="lower center", title_fontsize=25)
plt.show()

In [None]:
plt.figure(figsize=(25,8))
plt.hist(best_agents["difference best - top 10"][best_agents["difference best - top 10"] != 0], color="plum", bins=200)
plt.xlabel("Difference between best agent and top 10 agents")
plt.ylabel("Number of teams")
plt.legend(title="Difference between top-10 agents score (mean) and best agent score", loc="upper center", title_fontsize=25)
plt.show()

In [None]:
plt.figure(figsize=(25,15))
plt.scatter(best_agents["best agent"], best_agents["top 30 agents"], c=best_agents["medal"], alpha=0.7)
plt.xlabel("Best agent score")
plt.ylabel("Top-30 agents score (mean)")
plt.legend(title="Top-30 agents score (mean) vs best agent score", loc="lower center", title_fontsize=25)
plt.show()

In [None]:
plt.figure(figsize=(25,8))
plt.hist(best_agents["difference best - top 30"][best_agents["difference best - top 30"] != 0], color="slategrey", bins=200)
plt.xlabel("Difference between best agent and top 30 agents")
plt.ylabel("Number of teams")
plt.legend(title="Difference between top-30 agents score (mean) and best agent score", loc="upper center", title_fontsize=25)
plt.show()

Difference for medal-winning teams:

In [None]:
best_agents[best_agents["team_name"].isin(leaderboard.sort_values("score", ascending=False)["team_name"][0:156])].\
    loc[:, ["team_name", "best agent", "top 10 agents", "top 30 agents", "difference best - top 10", 
             "difference best - top 30", "n_agents"]].sort_values("best agent", ascending=False).reset_index(drop=True)

P.S. I'm going to regularly update this notebook till the end of the competition. Any feedback is greatly appreciated.