<a href="https://colab.research.google.com/github/vmokashi01/SSBU-Tiers/blob/master/SSBU_Tiers_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Super Smash Bros Ultimate Tier List Analysis

Super Smash Bros. Ultimate (SSBU) is a fighting game that caters to both casual and competitive players. Its roster consists of playable fighters, each with distinct attacks and in-game stats such as weight, gravity, and walking speed. The competitive community produces tier lists that rank fighters by their perceived ability to win matches. These rankings implicitly account for how well a fighter matches up against other characters, the fighter's skill cap, and the strategies and attack combinations available to the character.

The aim of this exploration is to investigate the attributes of specific tiers within the most widely accepted SSBU tier list and to develop a model that can be used to predict the tier of a new character. 
 


1. Data Processing
2. Data Exploration
3. Modelling
4. Conclusions



In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt



In [2]:
from google.colab import drive
drive.mount('/content/drive/')

Mounted at /content/drive/


## Data Processing

At this stage, a few tasks must be completed before the data set can be worked with. I will use the 'Character' column as the key between the tables, so each table must be checked to ensure that every character is listed and that each character's name is written the same way everywhere. For example, 'Pokemon Trainer - Ivysaur' and 'PT [Ivysaur]' would not match when the tables are joined. Next, the data type of each column should be changed to reflect the data it holds. Lastly, empty columns should be dropped and the individual tables joined, so that the relationships between all data points for each character can be viewed together.
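The coverage check and the type conversion described above can be sketched as follows. This is a minimal illustration, not the notebook's actual pipeline: the two toy tables, the column names 'Fall Speed' and 'Increase', and all values are assumptions made up for the example.

```python
import pandas as pd

# Toy stand-ins for the real tables; names and values are illustrative only.
tiers = pd.DataFrame({'Character': ['Bowser', 'Ivysaur', 'Byleth'],
                      'Tier': ['B', 'A', 'B']})
fall_speed = pd.DataFrame({'Character': ['Bowser', 'PT [Ivysaur]'],
                           'Fall Speed': ['1.5', '1.25'],
                           'Increase': ['60%', '60%']})

# 1. Coverage check: which characters appear in one table but not the other?
missing = sorted(set(tiers['Character']) - set(fall_speed['Character']))
unexpected = sorted(set(fall_speed['Character']) - set(tiers['Character']))
print('missing from fall_speed:', missing)
print('not in tier list:', unexpected)

# 2. Type conversion: strip a trailing '%' and parse the stat columns as numbers.
numeric = fall_speed.copy()
for col in ['Fall Speed', 'Increase']:
    numeric[col] = pd.to_numeric(numeric[col].str.rstrip('%'))
```

A name that shows up in both lists under different spellings (here 'Ivysaur' and 'PT [Ivysaur]') points to a naming mismatch rather than genuinely missing data.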

Notes:
- Some characters are not listed in some tables. The data for these characters will be filled either from an additional data source or with the mean of characters in a similar weight class
- Ice Climbers consist of two partnered characters (one player-controlled, the other a CPU) with slightly different stats; they will be treated as a single unit by taking the mean of their individual data points, and will always be viewed in their partnered form
- Rosalina & Luma will always be considered in their partnered form
- Some characters have alternate forms (such as Joker and Shulk), which shift their character data points; the exploration will consider their base data points
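The weight-class imputation mentioned in the notes could look roughly like the sketch below. The weight-class bin edges, the placeholder character names, and the 'Gravity' values are all assumptions chosen for illustration; the notebook does not define what counts as a "similar weight class".

```python
import pandas as pd

# Illustrative values only; character names and numbers are made up.
stats = pd.DataFrame({
    'Character': ['A', 'B', 'C', 'D', 'E'],
    'Weight':    [130, 125, 80, 78, 76],
    'Gravity':   [0.10, None, 0.08, 0.12, None],
})

# Assumed weight classes: these bin edges are arbitrary and would need tuning.
stats['Weight Class'] = pd.cut(stats['Weight'], bins=[0, 90, 110, 200],
                               labels=['light', 'medium', 'heavy']).astype(str)

# Replace each missing value with the mean of the character's weight class.
class_mean = stats.groupby('Weight Class')['Gravity'].transform('mean')
stats['Gravity'] = stats['Gravity'].fillna(class_mean)
print(stats)
```

A class with no observed values would still yield a missing mean, so those characters would need the additional data source instead.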


In [3]:
df = pd.read_csv('/content/drive/My Drive/SSBU-Tiers/tier_data.csv', delimiter=';')
df = df.drop(columns=['Unnamed: 4']) #drop empty column
df['Difficulty'] = df['Difficulty'].str.replace('Difficulty: ', '', regex=False) #strip the label prefix
df['Archetype'] = df['Archetype'].str.replace(' Archetype', '', regex=False) #strip the label suffix
df

Unnamed: 0,Character,Difficulty,Archetype,Tier
0,Banjo & Kazooie,Intermediate,Projectile,B
1,Bayonetta,Hard,Rushdown,D
2,Bowser,Easy,Heavy,B
3,Bowser Jr.,Easy,Rushdown,D
4,Byleth,Intermediate,Sword,B
...,...,...,...,...
77,Wolf,Easy,Projectile,A
78,Yoshi,Intermediate,Rushdown,B
79,Young Link,Intermediate,Projectile,A
80,Zelda,Easy,Projectile,B


I have written a 'clean' function to handle many of the formatting inconsistencies in the data (e.g. naming conventions). I also use this function to manually enter missing characters and values for individual tables.

**Note:**
Currently, many of the methods are specific to the tables I have collected. As a next step, I am going to generalize the processes within this function to support continuous data collection.

Fixes:
- Rosa & Luma naming convention
- PT naming convention
- Meta Knight naming convention
- IC naming convention
- missing values for some characters
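One way to handle the naming fixes listed above is a single alias lookup applied to every table. The sketch below is an assumption-laden illustration: the alias spellings in `ALIASES` are hypothetical, since the exact variants in the source tables are not shown here, and `normalize_name` is a helper name invented for this example.

```python
import re

# Hypothetical alias map: keys are lower-cased spellings that might appear in
# individual tables, values are the canonical name used as the join key.
ALIASES = {
    'rosa & luma': 'Rosalina & Luma',
    'metaknight': 'Meta Knight',
    'ic': 'Ice Climbers',
}

PT_FIGHTERS = ('Ivysaur', 'Charizard', 'Squirtle')


def normalize_name(name):
    """Return a canonical character name for a raw table entry."""
    # Collapse repeated whitespace and trim the ends.
    cleaned = re.sub(r'\s+', ' ', name).strip()
    # Pokemon Trainer entries such as 'PT [Ivysaur]' collapse to the fighter name.
    for fighter in PT_FIGHTERS:
        if fighter in cleaned:
            return fighter
    # Fall back to the cleaned input when no alias is known.
    return ALIASES.get(cleaned.lower(), cleaned)


print(normalize_name('PT [Ivysaur]'))
print(normalize_name('Rosa & Luma'))
```

With pandas, the same helper could be applied to a table with `table['Character'] = table['Character'].map(normalize_name)`.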

In [0]:
def clean(table, name):
  table = table.iloc[:, :-1].copy() #drop the trailing "Unnamed" column; copy so later edits do not act on a view

  #drop Rank columns
  if 'Rank' in table.columns:
    table = table.drop(columns=['Rank'])

  #uniform names for PT fighters
  table.loc[table['Character'].str.contains('Ivysaur'), 'Character'] = 'Ivysaur'
  table.loc[table['Character'].str.contains('Charizard'), 'Character'] = 'Charizard'
  table.loc[table['Character'].str.contains('Squirtle'), 'Character'] = 'Squirtle'

  # data from https://www.ssbwiki.com/Air_acceleration#Super_Smash_Bros._Ultimate

  if name == 'acceleration':
    #manually add missing characters; each row must have one value per column
    rows = [['Joker', '0.01', '0.07', '0.08'], ['Terry', '0.01', '0.05', '0.06']]
    table = pd.concat([table, pd.DataFrame(rows, columns=table.columns)], ignore_index=True)

  if name == 'air_speed':
    table = pd.concat([table, pd.DataFrame([['Byleth', '0.89']], columns=table.columns)], ignore_index=True)

  if name == 'dash_speed':
    #average the leader and partner Ice Climbers rows into a single 'Ice Climbers' row
    ic_names = ['Ice Climbers (partner)', 'Ice Climbers (leader)']
    stat_cols = ['Initial Dash', 'Run Speed', 'Dash Frames', 'Pivot Dash Frames']
    ic_rows = table[table['Character'].isin(ic_names)]
    means = [str(ic_rows[col].astype(float).mean()) for col in stat_cols]
    ic_row = pd.DataFrame([['Ice Climbers'] + means], columns=table.columns)
    table = pd.concat([table[~table['Character'].isin(ic_names)], ic_row], ignore_index=True)

  if name == 'fall_speed':
    table = pd.concat([table, pd.DataFrame([['Byleth', '1.6', '2.56', '60%']], columns=table.columns)], ignore_index=True)

  if name == 'gravity':
    rows = [['Byleth', '0.089'], ['Terry', '0.09']]
    table = pd.concat([table, pd.DataFrame(rows, columns=table.columns)], ignore_index=True)

  if name == 'jump_height':
    rows = [['Byleth', '26.5', '14', '28.5'], ['Joker', '32.5', '14.2', '34'], ['Terry', '27', '15.2', '29']]
    table = pd.concat([table, pd.DataFrame(rows, columns=table.columns)], ignore_index=True)

  if name == 'landing':
    pass #TODO: split mirrors into separate lines

  if name == 'ledge':
    pass #TODO: add Terry

  if name == 'shield':
    #TODO: PAC MAN capitalization
    #TODO: drop Rosa no luma
    pass

  table = table.sort_values(by = 'Character')
          
  return table

Given the amount of missing data, SSBWiki appears to be a more reliable source for frame data. For now I have entered the missing values manually, which does not scale; in the next iteration I will scrape the data from SSBWiki in order to maintain access to the most up-to-date character information.

In [0]:
tables = ['acceleration', 'air_speed', 'dash_speed', 'fall_speed', 'gravity', 'jump_duration', 'jump_height', 'landing', 'ledge', 'shield', 'walk_speed', 'weight']

In [53]:
for name in tables:
  table = pd.read_csv('/content/drive/My Drive/SSBU-Tiers/' + name + '.csv', delimiter=';')
  print(name)
  table = clean(table, name)
  df = df.set_index('Character').join(table.set_index('Character'))


acceleration
air_speed


KeyError: ignored
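The traceback is not shown, so the following is an inference rather than a confirmed diagnosis. One likely cause of the KeyError above is that the loop calls `df.set_index('Character')` on every pass: after the first join, 'Character' is the index of `df` rather than a column, so the second call cannot find it. A minimal sketch of an alternative, using toy frames in place of the real CSV files (the 'Gravity' values are made up, and `stat_tables` and `merged` are names chosen for this example):

```python
import pandas as pd

# Toy frames standing in for tier_data.csv and the stat tables.
df = pd.DataFrame({'Character': ['Bowser', 'Fox'], 'Tier': ['B', 'A']})
stat_tables = {
    'weight': pd.DataFrame({'Character': ['Fox', 'Bowser'], 'Weight': [77, 135]}),
    'gravity': pd.DataFrame({'Character': ['Bowser', 'Fox'], 'Gravity': [0.1, 0.2]}),
}

merged = df.copy()
for name, table in stat_tables.items():
    # on='Character' matches the column in `merged` against the index of `table`,
    # so 'Character' stays a regular column for the next pass.
    merged = merged.join(table.set_index('Character'), on='Character')

print(merged)
```

If two stat tables share a column name, `join` raises an error about overlapping columns, so the tables may also need distinct column names or a suffix before joining.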

In [52]:
df

Unnamed: 0,Character,Difficulty,Archetype,Tier
0,Banjo & Kazooie,Intermediate,Projectile,B
1,Bayonetta,Hard,Rushdown,D
2,Bowser,Easy,Heavy,B
3,Bowser Jr.,Easy,Rushdown,D
4,Byleth,Intermediate,Sword,B
...,...,...,...,...
77,Wolf,Easy,Projectile,A
78,Yoshi,Intermediate,Rushdown,B
79,Young Link,Intermediate,Projectile,A
80,Zelda,Easy,Projectile,B


In [22]:
weight = pd.read_csv('/content/drive/My Drive/SSBU-Tiers/weight.csv', delimiter=';')
weight = clean(weight, 'weight')
weight
#df.set_index('Character').join(weight.set_index('Character'))

Unnamed: 0,Character,Weight
0,Bowser,135
1,King K. Rool,133
2,Donkey Kong,127
3,King Dedede,127
4,Ganondorf,118
...,...,...
77,Fox,77
78,Mr. Game & Watch,75
79,Squirtle,75
80,Jigglypuff,68


In [54]:
shield = pd.read_csv('/content/drive/My Drive/SSBU-Tiers/shield.csv', delimiter=';')
shield = clean(shield, 'shield')
shield

Unnamed: 0,Character,#1,Fastest Move(s),#2,2nd Fastest Move(s),#3,3rd Fastest Move(s),Grab,"Grab, Post-Shieldstun",Item Throw(Forward),Item Throw(Back),Jump+Z-Drop (Front),Jump+Z-Drop(Behind)
0,Banjo & Kazooie,9,Usmash,10,Uair,11,Bair,7,11,**,**,**,**
1,Bayonetta,6,UpB,10,"Fair, SideB (Air)",12,"Nair, Uair",7,11,10,10,4,5
2,Bowser,6,UpB,9,SideB (Air),11,Nair,8,12,10,13,4,5
3,Bowser Jr.,7,Usmash,9,Uair,10,Nair,11,15,8,11,4,5
4,Byleth,9,NAir,11,Up B,13,"UAir, USmash",6,10,**,**,**,**
...,...,...,...,...,...,...,...,...,...,...,...,...,...
78,Wolf,10,"Nair, Fair, Uair, DownB (Air)",13,Usmash,15,Jab,6,9,8,11,4,5
79,Yoshi,6,Nair,8,Uair,11,Usmash,14,18,8,7,4,5
80,Young Link,7,Nair,8,Uair,9,UpB,12,16,8,10,4,5
81,Zelda,6,UpB,9,"Nair, Fair, Bair, Usmash",14,NeutralB (Air),10,14,10,10,4,5


In [47]:
acc = pd.read_csv('/content/drive/My Drive/SSBU-Tiers/acceleration.csv', delimiter=';')
acc
acc = clean(acc, 'a')
acc.columns

Index(['Character', 'Base', 'Additional', 'Max'], dtype='object')