# Who is the Best Among Us Player? 🔎🔎🔎

Among Us, arguably the most popular online multiplayer social deduction game ever, was one of my favorite games from September until its gradual decline in 2021. As a token of my love for Among Us, I've decided to analyze some data related to the game! In this notebook, I use one of the csv files in <a href="https://www.kaggle.com/mrisdal/among-us-gameplay"> this Among Us dataset </a> to ultimately answer the question: **Who is the best Among Us player**?

# Essential Imports

This is the general import template I like to use. Some of these imports may go unused.

In [None]:
#-----General------#
import numpy as np
import pandas as pd
import os
import sys

#-----Plotting-----#
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import plotly.express as px
import plotly.offline as py
py.init_notebook_mode(connected=True)
import seaborn as sns
from pandas_profiling import ProfileReport

#-----Utility-----#
import math
import itertools
import warnings
warnings.filterwarnings("ignore")
import re
import gc
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from scipy.stats import pearsonr

In [None]:
LOOK_AT = 10 # My favorite "LOOK_AT" variable, which controls the number of players you see in each visualization!

In [None]:
players = pd.read_csv("../input/among-us-gameplay/player_analysis.csv").drop("Index", axis=1)
players

Note that there are other csv files in this dataset titled ```crewmate_voting.csv```, ```game_feed.csv```, and ```player_record.csv```. I only focus on what I believe to be the most illustrative csv in this dataset, ```player_analysis.csv```.

<h2> What Columns Can We Manipulate? </h2>

In [None]:
players.columns

# What Players are in the Dataset?

Among Us is defined by its players, and thus, so is this dataset. See if you recognize anyone!

In [None]:
players['Players'].sort_values().unique()

Honestly, the only people I know in this dataset are 5up, Bloody, Skadj, Whyin, jorbs, and hafu. Let me know if you know more of them!

In [None]:
players.index = players['Players']
players.sort_values("Total_Games", ascending=False, inplace=True)
fig = px.bar(players[:LOOK_AT], y=["Crew_Games", "Impostor_Games"])
fig.update_layout(title={'text': f"Top {LOOK_AT} Among Us Players With the Most Games Played", 'x': 0.5,
                         'xanchor': 'center', 'font': {'size': 20}}, legend=dict(title="Legend"), xaxis_title="", yaxis_title="Count")
fig.show()

# What Defines a Winner?

Does a player who plays more games, on average, win or lose more of their games?

In [None]:
fig = px.scatter(players, x="Combined_Win_Percentage", y="Total_Games", hover_data=["Players"])
fig.update_layout(title={'text': f"Number of Games vs Win Percentage", 'x': 0.5,
                         'xanchor': 'center', 'font': {'size': 20}})
fig.show()

From this graph, it's pretty clear that there's no correlation between win percentage and the total number of games played. What is obvious is that many players in the dataset have fewer than 10 total games, which is just not enough data to perform any conclusive analysis. Thus, I remove all players with fewer than 10 total games from the dataset.

In [None]:
players = players.loc[players['Total_Games'] >= 10]
players
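
To get a rough sense of why a handful of games tells us so little, here is a minimal sketch of the standard error of a win rate. It assumes every game is an independent win/lose trial with a fixed win probability, which is a simplification I'm making for illustration and not something the dataset guarantees. The function name `win_rate_standard_error` is my own.

```python
import math

def win_rate_standard_error(win_rate, games):
    """Standard error of an observed win rate under a simple binomial model.

    Assumption: each game is an independent trial with the same win probability.
    """
    return math.sqrt(win_rate * (1 - win_rate) / games)

# How much a 50% win rate could plausibly wobble for different sample sizes.
for games in (5, 10, 40, 100):
    se = win_rate_standard_error(0.5, games)
    print(f"{games:>3} games: +/- {se:.3f}")
```

Under this simple model the uncertainty shrinks with the square root of the number of games, so going from 10 to 40 games only halves it.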

# Who is the Best Crewmate?

We calculate this by first converting our original DataFrame into z-scores (normalized units) and then using the formula ```Crew Win Percentage + Crew Bodies Found Per Life + Crew Voting Accuracy Per Life```. Of course, this formula is not 100% robust, but it provides a solid introductory foundation for future work.

In [None]:
abridged_df = players.loc[:, players.dtypes == "float64"]
standard_players = pd.DataFrame(StandardScaler().fit_transform(abridged_df), columns=abridged_df.columns, index=abridged_df.index).round(3)
standard_players
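
For anyone unfamiliar with z-scores: each value is rewritten as the number of standard deviations it sits above or below its column mean, so columns with different units can be added together. Below is a small sketch of that calculation on made-up numbers using only the standard library. `StandardScaler` divides by the population standard deviation, which is why `pstdev` is used here. The helper name `z_scores` and the toy values are hypothetical.

```python
from statistics import fmean, pstdev

def z_scores(values):
    """Return (value - mean) / population std for each value."""
    mean = fmean(values)
    std = pstdev(values)  # population std (ddof=0), as StandardScaler uses
    return [(v - mean) / std for v in values]

# Made-up columns on very different scales.
toy_win_pct = [40.0, 50.0, 60.0, 70.0]
toy_bodies_found = [0.10, 0.30, 0.20, 0.40]

# After standardizing, both columns have mean 0 and std 1, so each
# contributes equally when they are summed into one score.
toy_score = [w + b for w, b in zip(z_scores(toy_win_pct), z_scores(toy_bodies_found))]
print([round(s, 3) for s in toy_score])
```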

In [None]:
standard_players['Crewmate Score'] = (standard_players['Crew_Win_Percentage'] + standard_players['Crew_Bondies_Found_Per_Life'] + standard_players['Crew_Voting_Accuracy_Per_Life'])
standard_players.sort_values("Crewmate Score", ascending=False, inplace=True)
fig = px.bar(standard_players[:LOOK_AT], y="Crewmate Score", 
             hover_data=["Crew_Win_Percentage", "Crew_Bondies_Found_Per_Life", "Crew_Voting_Accuracy_Per_Life"], color="Crewmate Score")
fig.update_layout(title={'text': f"Top {LOOK_AT} Crewmates by Z-Score Algorithm", 'x': 0.5,
                         'xanchor': 'center', 'font': {'size': 20}})
fig.show()

So, according to my algorithm, the best crewmates are, in order: Stranjak, Colossus, julie, ScaldingHotSoup, Issac, Stunlock, 5up, synthe, Trotske, and Chosaucer. Let's see if any of these players also make it to the top of the best impostors list!

# Who is the Best Impostor?

Similar to the algorithm used above to calculate who the best crewmate is, we will normalize the data into Z-scores and then use the formula: ```2*Impostor Win Percentage + Impostor Kills Per Life - Voted Off As Impostor```.
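
To make the signs and weights in that formula concrete, here is a tiny sketch with made-up z-scores for two hypothetical players. The function name `impostor_score` and all of the numbers are mine, purely for illustration.

```python
def impostor_score(win_z, kills_z, voted_off_z):
    """Weighted z-score sum: wins count double, being voted off counts against."""
    return 2 * win_z + kills_z - voted_off_z

# Made-up z-scores. A negative voted_off_z means "voted off less than average",
# so subtracting it raises the score.
careful = impostor_score(win_z=1.0, kills_z=-0.5, voted_off_z=-1.0)
reckless = impostor_score(win_z=0.5, kills_z=1.5, voted_off_z=1.0)

print(careful, reckless)  # 2.5 1.5
```

Even though the "reckless" player racks up more kills, the doubled win term and the voted-off penalty put the "careful" player ahead.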

In [None]:
# Column assignment aligns on the index, so no sort is needed before assigning.
standard_players['Impostor Score'] = (2*standard_players['Impostor_Win_Percentage'] + standard_players['Impostor_Kills_Per_Life'] - standard_players['Voted_Off_As_Impostor'])
standard_players.sort_values("Impostor Score", ascending=False, inplace=True)
fig = px.bar(standard_players[:LOOK_AT], y="Impostor Score", color="Impostor Score", hover_data=["Impostor_Win_Percentage", "Impostor_Kills_Per_Life", "Voted_Off_As_Impostor"])
fig.update_layout(title={'text': f"Top {LOOK_AT} Impostors by Z-Score Algorithm", 'x': 0.5,
                         'xanchor': 'center', 'font': {'size': 20}})
fig.show()

According to this algorithm, the top 10 impostors are: Corey, 5up, FoxBox, Kibler, flutter, LSV, Zyla, wrapter, BK, and Keaton. Now let's put these two pieces together to answer our ultimate question: **Who is the Best Among Us Player**?

# Who is the Best Overall Player?

In [None]:
standard_players['Crewmate Rank'] = standard_players['Crewmate Score'].rank(ascending=False)
standard_players['Impostor Rank'] = standard_players['Impostor Score'].rank(ascending=False)
standard_players['Average Rank'] = (standard_players['Crewmate Rank'] + standard_players['Impostor Rank'])/2 
standard_players.sort_values("Average Rank", ascending=True, inplace=True)
fig = px.bar(standard_players[:LOOK_AT], y="Average Rank", color="Average Rank")
fig.update_layout(title={'text': f"Top {LOOK_AT} Overall Players by Average Rank", 'x': 0.5,
                         'xanchor': 'center', 'font': {'size': 20}})
fig.show()

In [None]:
score_df = pd.DataFrame(StandardScaler().fit_transform(standard_players[['Crewmate Score', 'Impostor Score']]), 
                        index=standard_players.index, columns=["Crewmate Score", "Impostor Score"]).round(3)
score_df['Overall Score'] = score_df['Crewmate Score'] + score_df['Impostor Score']
score_df.sort_values("Overall Score", ascending=False, inplace=True)
fig = px.bar(score_df[:LOOK_AT], y="Overall Score", color="Overall Score", hover_data=["Crewmate Score", "Impostor Score"])
fig.update_layout(title={'text': f"Top {LOOK_AT} Overall Players by Z-Score Sum", 'x': 0.5,
                         'xanchor': 'center', 'font': {'size': 20}})
fig.show()

As we can see from these two different rankings of overall players, ```5up``` is clearly the top Among Us player in this data, with an absolutely insane Average Rank of 2.5 and by far the largest Z-score sum. ```ScaldingHotSoup``` is also a fantastic Among Us player, with the 3rd-best Average Rank and the 2nd-highest Z-score sum.

Players in bold appear on both top 10 lists.

Top 10 overall players by average rank: **5up**, **flutter**, **ScaldingHotSoup**, **Keaton**, **Chosaucer**, **Mani**, **TomM**, Pheylop, BK, jorbs.

Top 10 overall players by Z-score sum: **5up**, **ScaldingHotSoup**, FoxBox, **flutter**, Corey, **Chosaucer**, **TomM**, **Keaton**, Naribly, **Mani**.
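
The two lists can disagree because averaging ranks throws away how far apart the scores are, while summing scores keeps those gaps. Here's a toy sketch with three hypothetical players and made-up scores. It skips the extra re-standardization step the notebook applies before summing, and the helper `ranks_descending` is my own name.

```python
def ranks_descending(scores):
    """Map each name to its rank, where rank 1 is the highest score (assumes no ties)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

# Made-up scores for three hypothetical players.
crew_scores = {"A": 3.0, "B": 0.5, "C": 0.4}
impostor_scores = {"A": -0.2, "B": 0.0, "C": -0.1}

crew_rank = ranks_descending(crew_scores)
impostor_rank = ranks_descending(impostor_scores)

average_rank = {p: (crew_rank[p] + impostor_rank[p]) / 2 for p in crew_scores}
score_sum = {p: crew_scores[p] + impostor_scores[p] for p in crew_scores}

best_by_rank = min(average_rank, key=average_rank.get)  # lower rank is better
best_by_sum = max(score_sum, key=score_sum.get)         # higher sum is better
print(best_by_rank, best_by_sum)  # B A
```

Player A's huge crewmate score dominates the sum, but the rank average only sees "1st and 3rd", so the steadier player B comes out on top there.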

# What Makes 5up so Good?

Let's take a look at the specific stats pink man ```5up``` has to try and determine why his ranking is so much higher than everyone else's.

In [None]:
players.loc['5up']

His statistics are crazy: a 75% crewmate win rate, a **100% impostor win rate**, an absurd 88.2% voting accuracy, and an average of 2.8 kills per impostor round. No wonder he's at the top of the top. Of course, we do have to take into consideration that he played only 17 total games in this dataset, but still, averaging those types of numbers over 17 games is crazy.

Let's take a look at the data to see if players with fewer games have higher scores.

In [None]:
fig = px.scatter(pd.concat((players['Total_Games'], score_df['Overall Score']), axis=1).reset_index(), x="Overall Score", y="Total_Games", hover_data=["Players"])
fig.update_layout(title={'text': f"Scatter Plot of Overall Scores vs Total Games Played", 'x': 0.5,
                         'xanchor': 'center', 'font': {'size': 20}})
fig.show()

Indeed, the top 5 players in the dataset all have ```< 40``` total games played, which may not be enough data to conclude anything meaningful. However, what **is** certain is that, across all of the games recorded in this dataset, ```5up``` is by far the highest-performing player overall. So, that answers our question, and we are now done!

# Bonus: Who is the Most Aggressively Voting Player?

In [None]:
players = players.round(3)
fig = px.bar(players.sort_values("Voting_Aggression", ascending=False)[:LOOK_AT], y="Voting_Aggression", color="Voting_Aggression")
fig.update_layout(title={'text': f"Top {LOOK_AT} Most Aggressively Voting Players", 'x': 0.5,
                         'xanchor': 'center', 'font': {'size': 20}})
fig.show()

Ok, so all of these players love voting. But are they accurate with their votes? Is this confidence feigned or is it pure detective work?

In [None]:
fig = px.scatter(players, x='Voting_Aggression', y='Crew_Voting_Accuracy_Per_Life')
fig.update_layout(title={'text': f"Does Voting More Aggressively Make You More Accurate?", 'x': 0.5,
                         'xanchor': 'center', 'font': {'size': 20}})
fig.show()

In [None]:
print("Correlation: %.3f" % pearsonr(players['Voting_Aggression'], players['Crew_Voting_Accuracy_Per_Life'])[0])

There is only a slight positive correlation between voting aggression and voting accuracy, and it is nothing significant.
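
Side note: `pearsonr` returns the p-value as its second element, which the cell above throws away by taking `[0]`. As a rough illustration of what sits behind a significance check, here's a standard-library sketch that computes Pearson's r and the matching t statistic on made-up data. The helper names and numbers are hypothetical, and turning the t statistic into a p-value still needs a t-distribution, which is the part SciPy handles for you.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic for testing r == 0 with n pairs (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Made-up paired observations, not taken from the dataset.
toy_aggression = [1, 2, 3, 4, 5]
toy_accuracy = [2, 1, 4, 3, 5]

r = pearson_r(toy_aggression, toy_accuracy)
t = t_statistic(r, len(toy_aggression))
print(f"r = {r:.3f}, t = {t:.3f}")
```

The same r is more convincing with more data points, since the t statistic grows with the sample size.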

# Conclusion

Thank you for reading through this notebook! If you found it particularly interesting or helpful, I would really appreciate it if you would give the notebook an <span style="color: green"> upvote </span> or a <span style="color: blue"> comment</span>!