# Converting Slippi to a DataFrame: The Frames
## Table of Contents
1. [Imports](#import)
2. [Reading in Metadata](#metadata)
3. [Mapping Character Names to Values](#character)
4. [Mapping Stage Names to Values](#stage)
5. [Filtering Games](#filter)
6. [Parsing Frame Data](#parse)<br>
    a. [Getting Fox and Falco Ports](#ports)<br>
    b. [Getting the Filepaths](#filepaths)<br>
    c. [The Function](#function)
7. [Exploring the Frames](#eda)
<a id = 'import'></a>
## Imports

In [1]:
import pandas as pd
import numpy as np
import slippi as slp
import os

<a id = 'metadata'></a>
## Reading in Metadata

In [2]:
df_fp9 = pd.read_csv('../data/fp9.csv')
df_fp9.head()

Unnamed: 0.1,Unnamed: 0,game_id,date,duration,platform,p1_char,p2_char,p3_char,p4_char,stage,is_teams,is_pal
0,0,USB1-20190406T180838,2019-04-06 18:08:38+00:00,18517,Platform.NINTENDONT,15.0,,,14.0,2,False,False
1,1,USB1-20190406T172110,2019-04-06 17:21:10+00:00,11637,Platform.NINTENDONT,,9.0,,16.0,3,False,False
2,2,USB1-20190406T171424,2019-04-06 17:14:24+00:00,5305,Platform.NINTENDONT,,9.0,,16.0,31,False,False
3,3,USB1-20190406T174216,2019-04-06 17:42:16+00:00,15823,Platform.NINTENDONT,14.0,,15.0,,31,False,False
4,4,USB1-20190406T175743,2019-04-06 17:57:43+00:00,12348,Platform.NINTENDONT,14.0,,14.0,,31,False,False


In [3]:
# Dropping Unnamed: 0 column
df_fp9.drop(columns = ['Unnamed: 0'], inplace = True)
df_fp9.head()

Unnamed: 0,game_id,date,duration,platform,p1_char,p2_char,p3_char,p4_char,stage,is_teams,is_pal
0,USB1-20190406T180838,2019-04-06 18:08:38+00:00,18517,Platform.NINTENDONT,15.0,,,14.0,2,False,False
1,USB1-20190406T172110,2019-04-06 17:21:10+00:00,11637,Platform.NINTENDONT,,9.0,,16.0,3,False,False
2,USB1-20190406T171424,2019-04-06 17:14:24+00:00,5305,Platform.NINTENDONT,,9.0,,16.0,31,False,False
3,USB1-20190406T174216,2019-04-06 17:42:16+00:00,15823,Platform.NINTENDONT,14.0,,15.0,,31,False,False
4,USB1-20190406T175743,2019-04-06 17:57:43+00:00,12348,Platform.NINTENDONT,14.0,,14.0,,31,False,False


Are there any duplicate games? If no rows are duplicated, set `game_id` as the index. I expect the check to pass, because any game that might be a duplicate carries ` copy` in its `game_id`, which keeps the ids distinct.

In [4]:
if df_fp9.shape == df_fp9.drop_duplicates().shape:
    df_fp9.set_index('game_id', inplace = True)
    print(True)
else:
    print(False)
df_fp9.head()

True


Unnamed: 0_level_0,date,duration,platform,p1_char,p2_char,p3_char,p4_char,stage,is_teams,is_pal
game_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
USB1-20190406T180838,2019-04-06 18:08:38+00:00,18517,Platform.NINTENDONT,15.0,,,14.0,2,False,False
USB1-20190406T172110,2019-04-06 17:21:10+00:00,11637,Platform.NINTENDONT,,9.0,,16.0,3,False,False
USB1-20190406T171424,2019-04-06 17:14:24+00:00,5305,Platform.NINTENDONT,,9.0,,16.0,31,False,False
USB1-20190406T174216,2019-04-06 17:42:16+00:00,15823,Platform.NINTENDONT,14.0,,15.0,,31,False,False
USB1-20190406T175743,2019-04-06 17:57:43+00:00,12348,Platform.NINTENDONT,14.0,,14.0,,31,False,False
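The check above compares entire rows. A narrower variant, sketched below on a small hypothetical frame (the ids and the ` copy` suffix are illustrative and not taken from `fp9.csv`), asks only whether `game_id` repeats before promoting it to the index.

```python
import pandas as pd

# Hypothetical toy metadata; the second id imitates a file saved with a " copy" suffix
toy = pd.DataFrame({
    'game_id': ['USB1-A', 'USB1-A copy', 'USB2-B'],
    'duration': [100, 100, 250],
})

# Series.is_unique is True only when no game_id value repeats
ids_unique = toy['game_id'].is_unique

# duplicated(keep=False) flags every row that shares a game_id with another row
repeated = toy[toy['game_id'].duplicated(keep=False)]

if ids_unique:
    toy = toy.set_index('game_id')

print(ids_unique, len(repeated))
```

Testing the key column directly makes the intent explicit: two rows with the same id but different metadata would pass a whole-row comparison and still break a unique index.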


<a id = 'character'></a>
## Mapping Character Names to Values

In [5]:
# Reminder: Someone is Player 1 if their port index is 0 in the Game.start.players tuple
df_fp9['p1_char'].value_counts()

9.0     161
20.0    146
2.0     137
0.0      62
15.0     56
19.0     42
16.0     30
14.0     22
22.0     19
1.0      19
12.0     14
25.0     12
7.0       6
13.0      3
18.0      3
8.0       3
3.0       2
6.0       2
5.0       1
4.0       1
21.0      1
Name: p1_char, dtype: int64

To make it easier to see which character each player used, I will use the [documentation](https://py-slippi.readthedocs.io/en/latest/source/slippi.html) to map the character values to names. The docs list two enumeration objects for characters, `CSSCharacter` and `InGameCharacter`. Both label all tournament-legal characters, but in different orders. For example, Mario has a value of 0 in `InGameCharacter` but a value of 8 in `CSSCharacter`. Since I know which characters are played most often in tournaments, I'll take the value counts of one character column and work out whether the values follow `CSSCharacter` or `InGameCharacter`.

| Character Value | Count |  CSSCharacter  | InGameCharacter |
|:---------------:|:-----:|:--------------:|:---------------:|
|        9        |  161  |      Marth     |      Peach      |
|        20       |  146  |      Falco     |    Young Link   |
|        2        |  137  |       Fox      |  Captain Falcon |
|        0        |  62   | Captain Falcon |      Mario      |

The most frequently used character among those I established to be Player 1 is the character whose value is 9. Since Marth and Peach are both popular relative to the rest of the cast, I cannot yet tell which of the two a value of 9 represents.

The second most frequent character is value 20. I am very confident that this is Falco, because Falco performs far better in tournaments than Young Link. So I'm inclined to say that the values represent the `CSSCharacter` object rather than the `InGameCharacter` object. Just to be more confident than very confident, I will continue down the list.

Next is the character value of 2. This value can represent either Fox or Captain Falcon. There is not much for me to interpret here because both characters are very popular in tournament play.

The character value of 0 represents either Captain Falcon or Mario. This is a similar comparison to Falco and Young Link: Captain Falcon and Falco both get much more tournament play than Mario and Young Link. I can comfortably continue mapping the characters using the `CSSCharacter` values.
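The reasoning above can be sketched as a small scoring exercise. The counts and names are copied from the table; the set of characters I expect to see often is my own assumption for illustration, not something the library provides.

```python
# Counts copied from the value_counts output; names copied from the comparison table
counts = {9: 161, 20: 146, 2: 137, 0: 62}
css_names = {9: 'Marth', 20: 'Falco', 2: 'Fox', 0: 'Captain Falcon'}
in_game_names = {9: 'Peach', 20: 'Young Link', 2: 'Captain Falcon', 0: 'Mario'}

# Assumption for illustration: characters expected to be common in tournament play
expected_common = {'Marth', 'Falco', 'Fox', 'Captain Falcon', 'Peach'}

def plausible_games(names):
    '''Sum the games whose decoded character is in the expected-common set'''
    return sum(n for value, n in counts.items() if names[value] in expected_common)

css_score = plausible_games(css_names)
in_game_score = plausible_games(in_game_names)
print(css_score, in_game_score)
```

Under this assumption the `CSSCharacter` reading accounts for all of the games in the four most common values, while the `InGameCharacter` reading leaves the Young Link and Mario rows unexplained.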

In [6]:
# A dictionary to map CSSCharacter values to their respective character names
csscharacter = {
    0: 'Captain Falcon',
    1: 'Donkey Kong',
    2: 'Fox',
    3: 'Game and Watch',
    4: 'Kirby',
    5: 'Bowser',
    6: 'Link',
    7: 'Luigi',
    8: 'Mario',
    9: 'Marth',
    10: 'Mewtwo',
    11: 'Ness',
    12: 'Peach',
    13: 'Pikachu',
    14: 'Ice Climbers',
    15: 'Jigglypuff',
    16: 'Samus',
    17: 'Yoshi',
    18: 'Zelda',
    19: 'Sheik',
    20: 'Falco',
    21: 'Young Link',
    22: 'Dr. Mario',
    23: 'Roy',
    24: 'Pichu',
    25: 'Ganondorf'
}

In [7]:
# Applying the map to each appropriate column
df_fp9['p1_char_name'] = df_fp9['p1_char'].map(csscharacter)
df_fp9['p2_char_name'] = df_fp9['p2_char'].map(csscharacter)
df_fp9['p3_char_name'] = df_fp9['p3_char'].map(csscharacter)
df_fp9['p4_char_name'] = df_fp9['p4_char'].map(csscharacter)
df_fp9.head()

Unnamed: 0_level_0,date,duration,platform,p1_char,p2_char,p3_char,p4_char,stage,is_teams,is_pal,p1_char_name,p2_char_name,p3_char_name,p4_char_name
game_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1
USB1-20190406T180838,2019-04-06 18:08:38+00:00,18517,Platform.NINTENDONT,15.0,,,14.0,2,False,False,Jigglypuff,,,Ice Climbers
USB1-20190406T172110,2019-04-06 17:21:10+00:00,11637,Platform.NINTENDONT,,9.0,,16.0,3,False,False,,Marth,,Samus
USB1-20190406T171424,2019-04-06 17:14:24+00:00,5305,Platform.NINTENDONT,,9.0,,16.0,31,False,False,,Marth,,Samus
USB1-20190406T174216,2019-04-06 17:42:16+00:00,15823,Platform.NINTENDONT,14.0,,15.0,,31,False,False,Ice Climbers,,Jigglypuff,
USB1-20190406T175743,2019-04-06 17:57:43+00:00,12348,Platform.NINTENDONT,14.0,,14.0,,31,False,False,Ice Climbers,,Ice Climbers,


<a id = "stage"></a>
## Mapping Stage Names to Values

In [8]:
df_fp9['stage'].value_counts()

31    268
32    200
28    182
8     171
3     143
2     143
20      1
14      1
Name: stage, dtype: int64

I know that only six stages are tournament legal, so I anticipate that stage values 20 and 14 are not tournament-legal stages. The other six should be Final Destination, Battlefield, Yoshi's Story, Fountain of Dreams, Dream Land 64, and Pokemon Stadium. Pokemon Stadium should be either value 2 or 3, the two least-played of the six, because players cannot start a set of games on that stage unless they both consent to it.
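A rough way to check this expectation before mapping any names is to split the stage values by how often they appear. The counts below are copied from the output above; the cut-off of 10 games is an arbitrary threshold chosen for illustration.

```python
# Stage value counts copied from the value_counts output above
stage_counts = {31: 268, 32: 200, 28: 182, 8: 171, 3: 143, 2: 143, 20: 1, 14: 1}

# Assumption: a tournament-legal stage shows up far more than a handful of times
threshold = 10
likely_legal = sorted(v for v, n in stage_counts.items() if n >= threshold)
likely_illegal = sorted(v for v, n in stage_counts.items() if n < threshold)

# Share of all games that were played on the rarely seen stages
total = sum(stage_counts.values())
rare_share = sum(stage_counts[v] for v in likely_illegal) / total
print(likely_legal, likely_illegal, total)
```

Exactly six values clear the threshold, which matches the six legal stages, and the two remaining values together account for only two games.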

In [9]:
stages = {
    2: 'Fountain of Dreams',
    3: 'Pokemon Stadium',
    4: "Princess Peach's Castle",
    5: 'Kongo Jungle',
    6: 'Brinstar',
    7: 'Corneria',
    8: "Yoshi's Story",
    9: 'Onett',
    10: 'Mute City',
    11: 'Rainbow Cruise',
    12: 'Jungle Japes',
    13: 'Great Bay',
    14: 'Hyrule Temple',
    15: 'Brinstar Depths',
    16: "Yoshi's Island",
    17: 'Green Greens',
    18: 'Fourside',
    19: 'Mushroom Kingdom I',
    20: 'Mushroom Kingdom II',
    22: 'Venom',
    23: 'Poke Floats',
    24: 'Big Blue',
    25: 'Icicle Mountain',
    26: 'Icetop',
    27: 'Flat Zone',
    28: 'Dream Land 64',
    29: "Yoshi's Island 64",
    30: 'Kongo Jungle 64',
    31: 'Battlefield',
    32: 'Final Destination'
}

In [10]:
df_fp9['stage_name'] = df_fp9['stage'].map(stages)
df_fp9.head()

Unnamed: 0_level_0,date,duration,platform,p1_char,p2_char,p3_char,p4_char,stage,is_teams,is_pal,p1_char_name,p2_char_name,p3_char_name,p4_char_name,stage_name
game_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
USB1-20190406T180838,2019-04-06 18:08:38+00:00,18517,Platform.NINTENDONT,15.0,,,14.0,2,False,False,Jigglypuff,,,Ice Climbers,Fountain of Dreams
USB1-20190406T172110,2019-04-06 17:21:10+00:00,11637,Platform.NINTENDONT,,9.0,,16.0,3,False,False,,Marth,,Samus,Pokemon Stadium
USB1-20190406T171424,2019-04-06 17:14:24+00:00,5305,Platform.NINTENDONT,,9.0,,16.0,31,False,False,,Marth,,Samus,Battlefield
USB1-20190406T174216,2019-04-06 17:42:16+00:00,15823,Platform.NINTENDONT,14.0,,15.0,,31,False,False,Ice Climbers,,Jigglypuff,,Battlefield
USB1-20190406T175743,2019-04-06 17:57:43+00:00,12348,Platform.NINTENDONT,14.0,,14.0,,31,False,False,Ice Climbers,,Ice Climbers,,Battlefield


In [11]:
df_fp9['stage'].value_counts()

31    268
32    200
28    182
8     171
3     143
2     143
20      1
14      1
Name: stage, dtype: int64

In [12]:
df_fp9['stage_name'].value_counts()

Battlefield            268
Final Destination      200
Dream Land 64          182
Yoshi's Story          171
Fountain of Dreams     143
Pokemon Stadium        143
Hyrule Temple            1
Mushroom Kingdom II      1
Name: stage_name, dtype: int64

My hypotheses about what each stage value represents held true. Values 20 and 14 are stages that are not tournament legal, and Pokemon Stadium is value 3. I'm just too good.

<a id = 'filter'></a>
## Filtering Games
Now that we are able to determine which characters are in the game, we can begin filtering the games to those with...
- Team Battle set to off
- 1:1 matches only (no 1:1:1 matches or others)
- Fox vs. Falco
- On Final Destination

Final Destination was selected as the stage of choice because it is a simple stage with no floating platforms. In later models, I will see how well the neural network can learn a player's behavior on other stages and in other matchups.

<img src="../images/final-destination.jpg" alt="Drawing" style="width: 600px;"/><center>Final Destination</center>
<img src="../images/battlefield.png" alt="Drawing" style="width: 600px;"/><center>Battlefield</center>

In [13]:
# Masks to help filter through games
mask_fd = (df_fp9['stage_name'] == 'Final Destination')
mask_fox = (df_fp9['p1_char_name'] == 'Fox') | (df_fp9['p2_char_name'] == 'Fox') | (df_fp9['p3_char_name'] == 'Fox') | (df_fp9['p4_char_name'] == 'Fox')
mask_falco = (df_fp9['p1_char_name'] == 'Falco') | (df_fp9['p2_char_name'] == 'Falco') | (df_fp9['p3_char_name'] == 'Falco') | (df_fp9['p4_char_name'] == 'Falco') 
mask_teams = (df_fp9['is_teams'] == False)

# Of all the columns that represent which character each player is using,
# provide the rows that have two characters in the game by counting non-missing values
mask_1v1 = (df_fp9[['p1_char', 'p2_char', 'p3_char', 'p4_char']].apply(lambda value: value.count(), axis = 1) == 2)

In [14]:
def check_shape_frames(df0, df1):
    '''
    Prints the shape of two dataframes
    '''
    print(f'Shape of first dataframe: {df0.shape}')
    print(f'Shape of second dataframe: {df1.shape}')
    print()
    print(f'Number of frames in first dataframe: {sum(df0["duration"])}')
    print(f'Number of frames in second dataframe: {sum(df1["duration"])}')
    print(f'Difference of frames between dataframes: {sum(df0["duration"]) - sum(df1["duration"])}')

While I could run the following Python code to apply the masks all in one go, I was curious to see how many frames and rows would be dropped after applying each mask individually.

```python
# Applying the masks to get only the games we care about
df_games = df_fp9.loc[mask_fox & mask_falco & mask_fd & mask_teams & mask_1v1]

check_shape_frames(df_fp9, df_games)
```
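The mask-by-mask bookkeeping can also be sketched as a loop that narrows a running mask and reports what is left after each step. The toy frame and its values are hypothetical stand-ins for `df_fp9`; only the column names follow the notebook.

```python
import pandas as pd

# Hypothetical toy metadata standing in for df_fp9
toy = pd.DataFrame({
    'duration': [100, 200, 300, 400],
    'is_teams': [False, False, True, False],
    'stage_name': ['Final Destination', 'Battlefield',
                   'Final Destination', 'Final Destination'],
    'p1_char': [2.0, 2.0, 2.0, 2.0],
    'p2_char': [20.0, 20.0, 20.0, None],
})

# Named masks, applied in order; each entry is (label, boolean Series)
masks = [
    ('no teams', ~toy['is_teams']),
    ('Final Destination', toy['stage_name'] == 'Final Destination'),
    # DataFrame.count(axis=1) counts the non-missing values in each row
    ('1v1', toy[['p1_char', 'p2_char']].count(axis=1) == 2),
]

keep = pd.Series(True, index=toy.index)
remaining = []
for label, mask in masks:
    keep &= mask  # narrow the running mask by one more condition
    rows = int(keep.sum())
    frames = int(toy.loc[keep, 'duration'].sum())
    remaining.append((label, rows, frames))
    print(f'{label}: {rows} rows, {frames} frames')
```

Because every mask is aligned to the original index, the order of application changes the intermediate counts but not the final selection.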

In [15]:
# Get the rows in which Fox is played on any of the four ports
df_games = df_fp9.loc[mask_fox]

check_shape_frames(df_fp9, df_games)

Shape of first dataframe: (1109, 15)
Shape of second dataframe: (428, 15)

Number of frames in first dataframe: 10848849
Number of frames in second dataframe: 3895543
Difference of frames between dataframes: 6953306


In [16]:
# Get the remaining rows in which Falco is played on any of the four ports
df_games = df_games.loc[mask_falco]

check_shape_frames(df_fp9, df_games)

Shape of first dataframe: (1109, 15)
Shape of second dataframe: (103, 15)

Number of frames in first dataframe: 10848849
Number of frames in second dataframe: 901708
Difference of frames between dataframes: 9947141


In [17]:
# Get the remaining rows in which Team Battle is set to off
df_games = df_games.loc[mask_teams]
check_shape_frames(df_fp9, df_games)

Shape of first dataframe: (1109, 15)
Shape of second dataframe: (85, 15)

Number of frames in first dataframe: 10848849
Number of frames in second dataframe: 721110
Difference of frames between dataframes: 10127739


In [18]:
# Get the remaining rows in which the players are on Final Destination
df_games = df_games.loc[mask_fd]
check_shape_frames(df_fp9, df_games)

Shape of first dataframe: (1109, 15)
Shape of second dataframe: (12, 15)

Number of frames in first dataframe: 10848849
Number of frames in second dataframe: 96711
Difference of frames between dataframes: 10752138


In [19]:
# Get the remaining rows that have exactly two players
df_games = df_games.loc[mask_1v1]
check_shape_frames(df_fp9, df_games)

Shape of first dataframe: (1109, 15)
Shape of second dataframe: (12, 15)

Number of frames in first dataframe: 10848849
Number of frames in second dataframe: 96711
Difference of frames between dataframes: 10752138


In [20]:
# Check that the only characters remaining are Fox and Falco and missing values
print(set(df_games['p1_char_name']))
print(set(df_games['p2_char_name']))
print(set(df_games['p3_char_name']))
print(set(df_games['p4_char_name']))

{nan, 'Fox', 'Falco'}
{nan, 'Fox', 'Falco'}
{nan, 'Fox', 'Falco'}
{nan, 'Fox', 'Falco'}


In [21]:
print(df_games.shape)
df_games.head()

(12, 15)


Unnamed: 0_level_0,date,duration,platform,p1_char,p2_char,p3_char,p4_char,stage,is_teams,is_pal,p1_char_name,p2_char_name,p3_char_name,p4_char_name,stage_name
game_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
USB10-20190406T144505,2019-04-06 14:45:05+00:00,8449,Platform.NINTENDONT,,,20.0,2.0,32,False,False,,,Falco,Fox,Final Destination
USB10-20190406T114015,2019-04-06 11:40:15+00:00,9572,Platform.NINTENDONT,20.0,,2.0,,32,False,False,Falco,,Fox,,Final Destination
USB10-20190406T143503,2019-04-06 14:35:03+00:00,7148,Platform.NINTENDONT,,,20.0,2.0,32,False,False,,,Falco,Fox,Final Destination
USB12-20190406T102328,2019-04-06 10:23:28+00:00,9281,Platform.NINTENDONT,20.0,,2.0,,32,False,False,Falco,,Fox,,Final Destination
USB13-20190406T190420,2019-04-06 19:04:20+00:00,7437,Platform.NINTENDONT,20.0,,,2.0,32,False,False,Falco,,,Fox,Final Destination


After applying the appropriate masks to filter out the games we do not want, we have 12 games for a total of 96,711 frames.
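To put that frame count into more familiar units, here is a quick conversion. It assumes a rate of 60 frames per second, which is my assumption and is not read from the replay files.

```python
FRAMES_PER_SECOND = 60  # assumption: the game advances 60 frames every second

total_frames = 96_711   # total reported above for the 12 filtered games
n_games = 12

total_seconds = total_frames / FRAMES_PER_SECOND
total_minutes = total_seconds / 60
avg_seconds_per_game = total_seconds / n_games
print(round(total_minutes, 1), round(avg_seconds_per_game, 1))
```

Under that assumption the filtered set is a little under half an hour of play, or a bit over two minutes per game on average.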

<a id = "parse"></a>
## Parsing Frame Data
<a id = "ports"></a>
### Getting Fox and Falco's Ports

Since I want to predict all of Fox's actions in this tournament when fighting against Falco, I will need to have a column that will specify which port is controlling Fox and which port is controlling Falco.

In [22]:
list(df_games['p1_char_name'])

[nan,
 'Falco',
 nan,
 'Falco',
 'Falco',
 'Falco',
 'Falco',
 'Falco',
 'Fox',
 nan,
 'Fox',
 nan]

In [23]:
list(df_games['p2_char_name'])

[nan, nan, nan, nan, nan, nan, 'Fox', nan, 'Falco', 'Falco', nan, 'Falco']

In [24]:
list(df_games['p3_char_name'])

['Falco', 'Fox', 'Falco', 'Fox', nan, nan, nan, nan, nan, nan, nan, 'Fox']

In [25]:
list(df_games['p4_char_name'])

['Fox', nan, 'Fox', nan, 'Fox', 'Fox', nan, 'Fox', nan, 'Fox', 'Falco', nan]

In [26]:
# Create a list of tuples of the following structure:
# ((row, character), port number)
p1_fox = [(character, 1) for character in list(enumerate(list(df_games['p1_char_name']))) if character[1] == 'Fox']
p1_fox

[((8, 'Fox'), 1), ((10, 'Fox'), 1)]

In [27]:
p2_fox = [(character, 2) for character in list(enumerate(list(df_games['p2_char_name']))) if character[1] == 'Fox']
p2_fox

[((6, 'Fox'), 2)]

In [28]:
p3_fox = [(character, 3) for character in list(enumerate(list(df_games['p3_char_name']))) if character[1] == 'Fox']
p3_fox

[((1, 'Fox'), 3), ((3, 'Fox'), 3), ((11, 'Fox'), 3)]

In [29]:
p4_fox = [(character, 4) for character in list(enumerate(list(df_games['p4_char_name']))) if character[1] == 'Fox']
p4_fox

[((0, 'Fox'), 4),
 ((2, 'Fox'), 4),
 ((4, 'Fox'), 4),
 ((5, 'Fox'), 4),
 ((7, 'Fox'), 4),
 ((9, 'Fox'), 4)]

In [30]:
# Put all the tuples into a single list
p1_fox.extend(p2_fox)
p1_fox.extend(p3_fox)
p1_fox.extend(p4_fox)
p1_fox

[((8, 'Fox'), 1),
 ((10, 'Fox'), 1),
 ((6, 'Fox'), 2),
 ((1, 'Fox'), 3),
 ((3, 'Fox'), 3),
 ((11, 'Fox'), 3),
 ((0, 'Fox'), 4),
 ((2, 'Fox'), 4),
 ((4, 'Fox'), 4),
 ((5, 'Fox'), 4),
 ((7, 'Fox'), 4),
 ((9, 'Fox'), 4)]

In [31]:
# Sort the list according to row order rather than sorted by port
fox_ports_enum = sorted(p1_fox, key = lambda x: x[0][0])
fox_ports_enum

[((0, 'Fox'), 4),
 ((1, 'Fox'), 3),
 ((2, 'Fox'), 4),
 ((3, 'Fox'), 3),
 ((4, 'Fox'), 4),
 ((5, 'Fox'), 4),
 ((6, 'Fox'), 2),
 ((7, 'Fox'), 4),
 ((8, 'Fox'), 1),
 ((9, 'Fox'), 4),
 ((10, 'Fox'), 1),
 ((11, 'Fox'), 3)]

In [32]:
# Using the sorted list, maintain the order
# and create a new list of the ports used for the given
# character, Fox
fox_ports = [port[1] for port in fox_ports_enum]
fox_ports

[4, 3, 4, 3, 4, 4, 2, 4, 1, 4, 1, 3]

In [33]:
# Create a column that shows which port the Fox player used
df_games['fox_port'] = fox_ports
df_games.head()

Unnamed: 0_level_0,date,duration,platform,p1_char,p2_char,p3_char,p4_char,stage,is_teams,is_pal,p1_char_name,p2_char_name,p3_char_name,p4_char_name,stage_name,fox_port
game_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1
USB10-20190406T144505,2019-04-06 14:45:05+00:00,8449,Platform.NINTENDONT,,,20.0,2.0,32,False,False,,,Falco,Fox,Final Destination,4
USB10-20190406T114015,2019-04-06 11:40:15+00:00,9572,Platform.NINTENDONT,20.0,,2.0,,32,False,False,Falco,,Fox,,Final Destination,3
USB10-20190406T143503,2019-04-06 14:35:03+00:00,7148,Platform.NINTENDONT,,,20.0,2.0,32,False,False,,,Falco,Fox,Final Destination,4
USB12-20190406T102328,2019-04-06 10:23:28+00:00,9281,Platform.NINTENDONT,20.0,,2.0,,32,False,False,Falco,,Fox,,Final Destination,3
USB13-20190406T190420,2019-04-06 19:04:20+00:00,7437,Platform.NINTENDONT,20.0,,,2.0,32,False,False,Falco,,,Fox,Final Destination,4


In [34]:
def get_char_ports(char:str, dataframe:pd.DataFrame = df_games):
    '''
    Returns a list of zero-based port indices, one per row in which `char`
    appears, in row order (0 for `p1_char_name` through 3 for `p4_char_name`)
    --------------------------------------------
    char (string): Character to get ports for
    dataframe (pd.DataFrame): Dataframe to search for `char` in.
        Must have columns `p1_char_name`, `p2_char_name`, `p3_char_name`, `p4_char_name`
        These columns hold the name of the character that was used on the respective port
    '''
    p1_char = [(character, 0) for character in list(enumerate(list(dataframe['p1_char_name']))) if character[1] == char]
    p1_char.extend([(character, 1) for character in list(enumerate(list(dataframe['p2_char_name']))) if character[1] == char])
    p1_char.extend([(character, 2) for character in list(enumerate(list(dataframe['p3_char_name']))) if character[1] == char])
    p1_char.extend([(character, 3) for character in list(enumerate(list(dataframe['p4_char_name']))) if character[1] == char])
    char_ports = sorted(p1_char, key = lambda x: x[0][0])
    return [port[1] for port in char_ports]
    

In [35]:
get_char_ports('Fox')

[3, 2, 3, 2, 3, 3, 1, 3, 0, 3, 0, 2]
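The same port lookup can be sketched without building and sorting tuples, by comparing all four name columns at once. The helper name `port_index` is hypothetical, and the toy frame reuses the first five rows of the character-name columns shown earlier.

```python
import pandas as pd

# First five rows of the character-name columns (None marks an empty port)
toy = pd.DataFrame({
    'p1_char_name': [None, 'Falco', None, 'Falco', 'Falco'],
    'p2_char_name': [None, None, None, None, None],
    'p3_char_name': ['Falco', 'Fox', 'Falco', 'Fox', None],
    'p4_char_name': ['Fox', None, 'Fox', None, 'Fox'],
})

def port_index(df, char):
    '''Zero-based port index of `char` in each row (hypothetical helper)'''
    cols = ['p1_char_name', 'p2_char_name', 'p3_char_name', 'p4_char_name']
    hits = df[cols].eq(char).to_numpy()
    # Guard: argmax returns 0 for a row with no match, so require one match per row
    if not (hits.sum(axis=1) == 1).all():
        raise ValueError(f'{char} must appear exactly once in every row')
    return hits.argmax(axis=1).tolist()

fox_idx = port_index(toy, 'Fox')
falco_idx = port_index(toy, 'Falco')
print(fox_idx, falco_idx)
```

The explicit guard matters: the list-based version silently returns a shorter list when a character is missing from a row, which would misalign the new column.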

In [36]:
def create_char_col(char:str, dataframe:pd.DataFrame = df_games):
    '''
    Returns given dataframe with new column that details the port number
    that the given character played in
    --------------------------------------------------------------------
    char (string): Character to get ports for
    dataframe (pd.DataFrame): Dataframe to search for `char` in.
    '''
    df = dataframe.copy()
    char_ports = get_char_ports(char, dataframe)
    df[f'{char.lower()}_port_index'] = char_ports
    return df
    

In [37]:
df_games = create_char_col('Fox')
df_games.head()

Unnamed: 0_level_0,date,duration,platform,p1_char,p2_char,p3_char,p4_char,stage,is_teams,is_pal,p1_char_name,p2_char_name,p3_char_name,p4_char_name,stage_name,fox_port,fox_port_index
game_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1
USB10-20190406T144505,2019-04-06 14:45:05+00:00,8449,Platform.NINTENDONT,,,20.0,2.0,32,False,False,,,Falco,Fox,Final Destination,4,3
USB10-20190406T114015,2019-04-06 11:40:15+00:00,9572,Platform.NINTENDONT,20.0,,2.0,,32,False,False,Falco,,Fox,,Final Destination,3,2
USB10-20190406T143503,2019-04-06 14:35:03+00:00,7148,Platform.NINTENDONT,,,20.0,2.0,32,False,False,,,Falco,Fox,Final Destination,4,3
USB12-20190406T102328,2019-04-06 10:23:28+00:00,9281,Platform.NINTENDONT,20.0,,2.0,,32,False,False,Falco,,Fox,,Final Destination,3,2
USB13-20190406T190420,2019-04-06 19:04:20+00:00,7437,Platform.NINTENDONT,20.0,,,2.0,32,False,False,Falco,,,Fox,Final Destination,4,3


In [38]:
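# Note: `dataframe` defaults to the `df_games` object that existed when
# create_char_col was defined, so the result does not include `fox_port_index`
df_games = create_char_col('Falco')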
df_games.head()

Unnamed: 0_level_0,date,duration,platform,p1_char,p2_char,p3_char,p4_char,stage,is_teams,is_pal,p1_char_name,p2_char_name,p3_char_name,p4_char_name,stage_name,fox_port,falco_port_index
game_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1
USB10-20190406T144505,2019-04-06 14:45:05+00:00,8449,Platform.NINTENDONT,,,20.0,2.0,32,False,False,,,Falco,Fox,Final Destination,4,2
USB10-20190406T114015,2019-04-06 11:40:15+00:00,9572,Platform.NINTENDONT,20.0,,2.0,,32,False,False,Falco,,Fox,,Final Destination,3,0
USB10-20190406T143503,2019-04-06 14:35:03+00:00,7148,Platform.NINTENDONT,,,20.0,2.0,32,False,False,,,Falco,Fox,Final Destination,4,2
USB12-20190406T102328,2019-04-06 10:23:28+00:00,9281,Platform.NINTENDONT,20.0,,2.0,,32,False,False,Falco,,Fox,,Final Destination,3,0
USB13-20190406T190420,2019-04-06 19:04:20+00:00,7437,Platform.NINTENDONT,20.0,,,2.0,32,False,False,Falco,,,Fox,Final Destination,4,0
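In the output above, `fox_port_index` is absent after the second call. That is consistent with how Python treats default arguments: a default such as `dataframe = df_games` is evaluated once, when the function is defined, so rebinding the name `df_games` afterwards does not change it. A minimal sketch with plain dictionaries, where `table` and `add_col` are hypothetical stand-ins for `df_games` and `create_char_col`:

```python
table = {'fox_port': 4}            # stand-in for df_games at definition time

def add_col(name, data=table):     # the default is bound to this object now
    out = dict(data)               # copy the input, as create_char_col does
    out[name] = 0
    return out

table = add_col('fox_port_index')          # rebinds the name, not the default
lost = add_col('falco_port_index')         # still copies the original object
kept = add_col('falco_port_index', table)  # passing the current object keeps both

print(sorted(lost))
print(sorted(kept))
```

Passing the current frame explicitly on each call is one way to keep both columns.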


<a id = "filepaths"></a>
### Getting the Filepaths
In the previous notebook, we used the filepaths to create the `game_id`s. Now we will work backwards and create the filepaths out of the `game_id`s.

In [39]:
df_games

Unnamed: 0_level_0,date,duration,platform,p1_char,p2_char,p3_char,p4_char,stage,is_teams,is_pal,p1_char_name,p2_char_name,p3_char_name,p4_char_name,stage_name,fox_port,falco_port_index
game_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1
USB10-20190406T144505,2019-04-06 14:45:05+00:00,8449,Platform.NINTENDONT,,,20.0,2.0,32,False,False,,,Falco,Fox,Final Destination,4,2
USB10-20190406T114015,2019-04-06 11:40:15+00:00,9572,Platform.NINTENDONT,20.0,,2.0,,32,False,False,Falco,,Fox,,Final Destination,3,0
USB10-20190406T143503,2019-04-06 14:35:03+00:00,7148,Platform.NINTENDONT,,,20.0,2.0,32,False,False,,,Falco,Fox,Final Destination,4,2
USB12-20190406T102328,2019-04-06 10:23:28+00:00,9281,Platform.NINTENDONT,20.0,,2.0,,32,False,False,Falco,,Fox,,Final Destination,3,0
USB13-20190406T190420,2019-04-06 19:04:20+00:00,7437,Platform.NINTENDONT,20.0,,,2.0,32,False,False,Falco,,,Fox,Final Destination,4,0
USB13-20190406T183745,2019-04-06 18:37:45+00:00,6712,Platform.NINTENDONT,20.0,,,2.0,32,False,False,Falco,,,Fox,Final Destination,4,0
USB13-20190406T105900,2019-04-06 10:59:00+00:00,12435,Platform.NINTENDONT,20.0,2.0,,,32,False,False,Falco,Fox,,,Final Destination,2,0
USB15-20190406T214523,2019-04-06 21:45:23+00:00,6220,Platform.NINTENDONT,20.0,,,2.0,32,False,False,Falco,,,Fox,Final Destination,4,0
USB2-20190406T175827,2019-04-06 17:58:27+00:00,8705,Platform.NINTENDONT,2.0,20.0,,,32,False,False,Fox,Falco,,,Final Destination,1,1
USB5-20190406T190322,2019-04-06 19:03:22+00:00,8206,Platform.NINTENDONT,,20.0,,2.0,32,False,False,,Falco,,Fox,Final Destination,4,1


In [40]:
game_id = df_games.index
game_id

Index(['USB10-20190406T144505', 'USB10-20190406T114015',
       'USB10-20190406T143503', 'USB12-20190406T102328',
       'USB13-20190406T190420', 'USB13-20190406T183745',
       'USB13-20190406T105900', 'USB15-20190406T214523',
       'USB2-20190406T175827', 'USB5-20190406T190322', 'USB6-20190309T012347',
       'USB7-20190406T104203'],
      dtype='object', name='game_id')

In [41]:
game_id[0].strip('-')

'USB10-20190406T144505'

In [42]:
# Use string formatting to add the relative filepath
# and the file extension to each value in game_id
games = list(map(lambda val: f'../../fp9/{val.split("-")[0]}/Game_{val.split("-")[-1]}.slp', game_id))
games

['../../fp9/USB10/Game_20190406T144505.slp',
 '../../fp9/USB10/Game_20190406T114015.slp',
 '../../fp9/USB10/Game_20190406T143503.slp',
 '../../fp9/USB12/Game_20190406T102328.slp',
 '../../fp9/USB13/Game_20190406T190420.slp',
 '../../fp9/USB13/Game_20190406T183745.slp',
 '../../fp9/USB13/Game_20190406T105900.slp',
 '../../fp9/USB15/Game_20190406T214523.slp',
 '../../fp9/USB2/Game_20190406T175827.slp',
 '../../fp9/USB5/Game_20190406T190322.slp',
 '../../fp9/USB6/Game_20190309T012347.slp',
 '../../fp9/USB7/Game_20190406T104203.slp']
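The conversion between `game_id` and filepath can be sketched as a pair of small helpers. The function names are hypothetical, and the root folder `../../fp9` is taken from the paths above.

```python
import os

def game_id_to_path(game_id, root='../../fp9'):
    '''Build the relative .slp path for a game_id such as "USB10-20190406T144505"'''
    # partition splits on the first hyphen only: (folder, '-', timestamp)
    folder, _, stamp = game_id.partition('-')
    return f'{root}/{folder}/Game_{stamp}.slp'

def path_to_game_id(path):
    '''Invert game_id_to_path (hypothetical helper)'''
    folder = os.path.basename(os.path.dirname(path))
    stem, _ = os.path.splitext(os.path.basename(path))
    # Slice off the literal "Game_" prefix; str.strip('Game_') would instead
    # remove any of the characters G, a, m, e, _ from both ends
    stamp = stem[len('Game_'):] if stem.startswith('Game_') else stem
    return f'{folder}-{stamp}'

path = game_id_to_path('USB10-20190406T144505')
print(path)
print(path_to_game_id(path))
```

Keeping both directions next to each other makes it easy to confirm that a path maps back to the same `game_id` used as the metadata index.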

<a id = "function"></a>
### The Function

In [44]:
def frames_to_df_fox(slp_paths):
    '''
    Returns a dataframe of frame data for each game
    slp_paths (list): A list of filepaths that lead to Slippi files
    '''
    length = len(slp_paths)
    count = 0

    # Dictionaries to keep track of which buttons on the controller were pressed for each frame
    # Fox
    fox_button_dict = {'Trigger Analog':[],'Start': [],'Y': [],'X': [],'B': [],'A': [],'L': [],'R': [],
                      'Z': [],'Dpad-Up': [],'Dpad-Down': [],'Dpad-Right': [],'Dpad-Left': []}

    # falco = Not Fox -> Falco
    falco_button_dict = {'Trigger Analog':[],'Start': [],'Y': [],'X': [],'B': [],'A': [],'L': [],'R': [],
                      'Z': [],'Dpad-Up': [],'Dpad-Down': [],'Dpad-Right': [],'Dpad-Left': []}
    
    # foreign key to metadata dataframe
    game_id = list()
    
    # frame index
    index = list()
    
    # feature per frame for fox (one independent list per name)
    (fox_combo_count, fox_dmg, fox_direction, fox_last_attack_landed,
     fox_last_hit_by, fox_position_x, fox_position_y, fox_shield, fox_state,
     fox_state_age, fox_stocks, fox_cstick_x, fox_cstick_y, fox_joystick_x,
     fox_joystick_y, fox_position, fox_raw_analog_x) = (list() for _ in range(17))
    
    # feature per frame for not fox (one independent list per name)
    (falco_combo_count, falco_dmg, falco_direction, falco_last_attack_landed,
     falco_last_hit_by, falco_position_x, falco_position_y, falco_shield, falco_state,
     falco_state_age, falco_stocks, falco_cstick_x, falco_cstick_y, falco_joystick_x,
     falco_joystick_y, falco_position, falco_raw_analog_x) = (list() for _ in range(17))
    
    # For each filepath in the provided list of filepaths
    for path in slp_paths:

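        # Create game_id from the file name, e.g. 'Game_20190406T144505.slp' -> '20190406T144505'
        # (str.strip removes a set of characters, not a literal prefix or suffix)
        curr_gameid = os.path.splitext(os.path.basename(path))[0].replace('Game_', '', 1)
        print(f'Parsing file {count + 1} of {length}')
        
        # Try to instantiate the Game object. If it cannot be parsed, skip it
        try:
            game = slp.Game(path)
        except Exception:
            print(f'Skip game {count + 1} of {length}')
            continue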

        # get fox ports and non-fox ports
        fox_ports = get_char_ports('Fox')
        falco_ports = get_char_ports('Falco')
        
        # for each Frame object of all frames in a specific game
        frame_length = len(game.frames)
        frame_count = 0
        for frame in game.frames:
            frame_count += 1
            print(f'Parsing frame {frame_count} of {frame_length}: {round(frame_count / frame_length * 100, 2)}%', end = '\r')
            
            # Tell me what game this frame came from
            game_id.append(curr_gameid)
            
            # Tell me the frame index
            index.append(frame.index)
            
            # Tell me the Positional X value of the cstick of each player/character
            fox_cstick_x.append(frame.ports[fox_ports[count]].leader.pre.cstick.x)
            falco_cstick_x.append(frame.ports[falco_ports[count]].leader.pre.cstick.x)

            # Positional Y value of the cstick of each player/character
            fox_cstick_y.append(frame.ports[fox_ports[count]].leader.pre.cstick.y)
            falco_cstick_y.append(frame.ports[falco_ports[count]].leader.pre.cstick.y)
            
            # Positional X value of the joystick of each player/character
            fox_joystick_x.append(frame.ports[fox_ports[count]].leader.pre.joystick.x)
            falco_joystick_x.append(frame.ports[falco_ports[count]].leader.pre.joystick.x)
            
            # Positional Y value of the joystick
            fox_joystick_y.append(frame.ports[fox_ports[count]].leader.pre.joystick.y)
            falco_joystick_y.append(frame.ports[falco_ports[count]].leader.pre.joystick.y)
            
            # Combo Count
            fox_combo_count.append(frame.ports[fox_ports[count]].leader.post.combo_count)
            falco_combo_count.append(frame.ports[falco_ports[count]].leader.post.combo_count)
            
            # Damage
            fox_dmg.append(frame.ports[fox_ports[count]].leader.post.damage)
            falco_dmg.append(frame.ports[falco_ports[count]].leader.post.damage)
            
            # Direction
            fox_direction.append(frame.ports[fox_ports[count]].leader.post.direction)
            falco_direction.append(frame.ports[falco_ports[count]].leader.post.direction)
            
            # Last move hit by
            fox_last_hit_by.append(frame.ports[fox_ports[count]].leader.post.last_hit_by)
            falco_last_hit_by.append(frame.ports[falco_ports[count]].leader.post.last_hit_by)
            
            # Character's X coordinate position
            fox_position_x.append(frame.ports[fox_ports[count]].leader.post.position.x)
            falco_position_x.append(frame.ports[falco_ports[count]].leader.post.position.x)
            
            # Character's Y coordinate position
            fox_position_y.append(frame.ports[fox_ports[count]].leader.post.position.y)
            falco_position_y.append(frame.ports[falco_ports[count]].leader.post.position.y)
            
            # Character's shield size
            fox_shield.append(frame.ports[fox_ports[count]].leader.post.shield)
            falco_shield.append(frame.ports[falco_ports[count]].leader.post.shield)
            
            # Character's action state
            fox_state.append(frame.ports[fox_ports[count]].leader.post.state)
            falco_state.append(frame.ports[falco_ports[count]].leader.post.state)
            
            # How long the character has been in their current action state
            fox_state_age.append(frame.ports[fox_ports[count]].leader.post.state_age)
            falco_state_age.append(frame.ports[falco_ports[count]].leader.post.state_age)
            
            # Number of stocks remaining
            fox_stocks.append(frame.ports[fox_ports[count]].leader.post.stocks)
            falco_stocks.append(frame.ports[falco_ports[count]].leader.post.stocks)

            # Get the inputs of the Fox's controller
            fox_ins = str(frame.ports[fox_ports[count]].leader.pre.buttons.logical).split('.')[1].split('|')
            
            # For each key/button or stick on the controller
            for button in fox_button_dict:

                # If that button was pressed this frame, record 1
                if button in fox_ins:
                    fox_button_dict[button].append(1)
                # Otherwise, record 0
                else:
                    fox_button_dict[button].append(0)
            
            falco_ins = str(frame.ports[falco_ports[count]].leader.pre.buttons.logical).split('.')[1].split('|')
            for button in falco_button_dict:
                if button in falco_ins:
                    falco_button_dict[button].append(1)
                else:
                    falco_button_dict[button].append(0)
            
        count += 1

    return pd.DataFrame({
        'game_id': game_id,
        'frame_index': index,
        
        # Fox
        'fox_cstick_x': fox_cstick_x,
        'fox_cstick_y': fox_cstick_y,
        'fox_joystick_x': fox_joystick_x,
        'fox_joystick_y': fox_joystick_y,
        'fox_trigger_analog': fox_button_dict['Trigger Analog'],
        'fox_Y': fox_button_dict['Y'],
        'fox_X': fox_button_dict['X'],
        'fox_B': fox_button_dict['B'],
        'fox_A': fox_button_dict['A'],
        'fox_L': fox_button_dict['L'],
        'fox_R': fox_button_dict['R'],
        'fox_Z': fox_button_dict['Z'],
        'fox_Dpad_Up': fox_button_dict['Dpad-Up'],
        'fox_Dpad_Down': fox_button_dict['Dpad-Down'],
        'fox_Dpad_Right': fox_button_dict['Dpad-Right'],
        'fox_Dpad_Left': fox_button_dict['Dpad-Left'],
        'fox_combo_count': fox_combo_count,
        'fox_dmg': fox_dmg,
        'fox_direction': fox_direction,
        'fox_last_hit_by': fox_last_hit_by,
        'fox_position_x': fox_position_x,
        'fox_position_y': fox_position_y,
        'fox_shield': fox_shield,
        'fox_state': fox_state,
        'fox_state_age': fox_state_age,
        'fox_stocks': fox_stocks,
        
        # Falco
        'falco_cstick_x': falco_cstick_x,
        'falco_cstick_y': falco_cstick_y,
        'falco_joystick_x': falco_joystick_x,
        'falco_joystick_y': falco_joystick_y,
        'falco_trigger_analog': falco_button_dict['Trigger Analog'],
        'falco_Y': falco_button_dict['Y'],
        'falco_X': falco_button_dict['X'],
        'falco_B': falco_button_dict['B'],
        'falco_A': falco_button_dict['A'],
        'falco_L': falco_button_dict['L'],
        'falco_R': falco_button_dict['R'],
        'falco_Z': falco_button_dict['Z'],
        'falco_Dpad_Up': falco_button_dict['Dpad-Up'],
        'falco_Dpad_Down': falco_button_dict['Dpad-Down'],
        'falco_Dpad_Right': falco_button_dict['Dpad-Right'],
        'falco_Dpad_Left': falco_button_dict['Dpad-Left'],
        'falco_combo_count': falco_combo_count,
        'falco_dmg': falco_dmg,
        'falco_direction': falco_direction,
        'falco_last_hit_by': falco_last_hit_by,
        'falco_position_x': falco_position_x,
        'falco_position_y': falco_position_y,
        'falco_shield': falco_shield,
        'falco_state': falco_state,
        'falco_state_age': falco_state_age,
        'falco_stocks': falco_stocks
    })
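The long run of paired `append` calls in the function above could be condensed by looping over attribute names. The following is only a sketch under stated assumptions: `extract_post_fields` and `POST_FIELDS` are hypothetical names that do not appear in this notebook, and `SimpleNamespace` objects stand in for the post-frame data so the snippet runs without any replay files.

```python
from types import SimpleNamespace

import pandas as pd

# Post-frame attributes read in the function above. Position is handled
# separately because it is a nested object with x and y.
POST_FIELDS = ['combo_count', 'damage', 'direction', 'last_hit_by',
               'shield', 'state', 'state_age', 'stocks']


def extract_post_fields(post, prefix):
    """Return one flat dict of post-frame values, keyed by '<prefix>_<field>'."""
    row = {f'{prefix}_{name}': getattr(post, name) for name in POST_FIELDS}
    row[f'{prefix}_position_x'] = post.position.x
    row[f'{prefix}_position_y'] = post.position.y
    return row


# Hypothetical stand-in for frame.ports[port].leader.post
fake_post = SimpleNamespace(
    combo_count=0, damage=0.0, direction=1, last_hit_by=None,
    shield=60.0, state=322, state_age=-1.0, stocks=4,
    position=SimpleNamespace(x=-60.0, y=10.0),
)

# One dict per frame; a list of dicts converts directly to a DataFrame
rows = [extract_post_fields(fake_post, 'fox')]
df_demo = pd.DataFrame(rows)
```

Note that this sketch would name the damage column `fox_damage` rather than `fox_dmg`, so the column names would need to be reconciled before swapping it in.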

In [45]:
get_char_ports('Falco')

[2, 0, 2, 0, 0, 0, 0, 0, 1, 1, 3, 1]

In [45]:
fox_ports

[4, 3, 4, 3, 4, 4, 2, 4, 1, 4, 1, 3]

In [46]:
frames_to_df_fox(games[:5])

Parsing file 1 of 5
Parsing frame 8449 of 8449: 100.0%
Parsing file 2 of 5
Parsing frame 9572 of 9572: 100.0%
Parsing file 3 of 5
Parsing frame 7148 of 7148: 100.0%
Parsing file 4 of 5
Parsing frame 9281 of 9281: 100.0%
Parsing file 5 of 5
Parsing frame 7437 of 7437: 100.0%

Unnamed: 0,game_id,frame_index,fox_cstick_x,fox_cstick_y,fox_joystick_x,fox_joystick_y,fox_trigger_analog,fox_Y,fox_X,fox_B,...,falco_combo_count,falco_dmg,falco_direction,falco_last_hit_by,falco_position_x,falco_position_y,falco_shield,falco_state,falco_state_age,falco_stocks
0,20190406T144505,-123,0.0,0.0,0.0,0.0,0,0,0,0,...,0,0.000000,1,,-60.000000,10.000000,60.0,322,-1.0,4
1,20190406T144505,-122,0.0,0.0,0.0,0.0,0,0,0,0,...,0,0.000000,1,,-60.000000,10.000000,60.0,322,-1.0,4
2,20190406T144505,-121,0.0,0.0,0.0,0.0,0,0,0,0,...,0,0.000000,1,,-60.000000,10.000000,60.0,322,-1.0,4
3,20190406T144505,-120,0.0,0.0,0.0,0.0,0,0,0,0,...,0,0.000000,1,,-60.000000,10.000000,60.0,322,-1.0,4
4,20190406T144505,-119,0.0,0.0,0.0,0.0,0,0,0,0,...,0,0.000000,1,,-60.000000,10.000000,60.0,322,-1.0,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
41882,20190406T190420,7309,0.0,0.0,0.0,0.0,0,0,0,0,...,1,13.699999,1,3.0,-51.966187,-122.926659,60.0,247,33.0,1
41883,20190406T190420,7310,0.0,0.0,0.0,0.0,0,0,0,0,...,1,13.699999,1,3.0,-51.452328,-126.125114,60.0,247,34.0,1
41884,20190406T190420,7311,0.0,0.0,0.0,0.0,0,0,0,0,...,1,13.699999,1,3.0,-50.987770,-129.281830,60.0,247,35.0,1
41885,20190406T190420,7312,0.0,0.0,0.0,0.0,0,0,0,0,...,1,13.699999,1,3.0,-50.572514,-132.396805,60.0,247,36.0,1


In [46]:
# Parse each Slippi game
df_frames = frames_to_df_fox(games)
df_frames.head()

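Parsing file 1 of 12
Parsing frame 8449 of 8449: 100.0%
Parsing file 2 of 12
Parsing frame 9572 of 9572: 100.0%
Parsing file 3 of 12
Parsing frame 7148 of 7148: 100.0%
Parsing file 4 of 12
Parsing frame 9281 of 9281: 100.0%
Parsing file 5 of 12
Parsing frame 7437 of 7437: 100.0%
Parsing file 6 of 12
Parsing frame 6712 of 6712: 100.0%
Parsing file 7 of 12
Parsing frame 12435 of 12435: 100.0%
Parsing file 8 of 12
Parsing frame 6220 of 6220: 100.0%
Parsing file 9 of 12
Parsing frame 8705 of 8705: 100.0%
Parsing file 10 of 12
Parsing frame 8206 of 8206: 100.0%
Parsing file 11 of 12
Parsing frame 5429 of 5429: 100.0%
Parsing file 12 of 12
Parsing frame 7117 of 7117: 100.0%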

Unnamed: 0,game_id,frame_index,fox_cstick_x,fox_cstick_y,fox_joystick_x,fox_joystick_y,fox_trigger_analog,fox_Y,fox_X,fox_B,...,falco_combo_count,falco_dmg,falco_direction,falco_last_hit_by,falco_position_x,falco_position_y,falco_shield,falco_state,falco_state_age,falco_stocks
0,20190406T144505,-123,0.0,0.0,0.0,0.0,0,0,0,0,...,0,0.0,1,,-60.0,10.0,60.0,322,-1.0,4
1,20190406T144505,-122,0.0,0.0,0.0,0.0,0,0,0,0,...,0,0.0,1,,-60.0,10.0,60.0,322,-1.0,4
2,20190406T144505,-121,0.0,0.0,0.0,0.0,0,0,0,0,...,0,0.0,1,,-60.0,10.0,60.0,322,-1.0,4
3,20190406T144505,-120,0.0,0.0,0.0,0.0,0,0,0,0,...,0,0.0,1,,-60.0,10.0,60.0,322,-1.0,4
4,20190406T144505,-119,0.0,0.0,0.0,0.0,0,0,0,0,...,0,0.0,1,,-60.0,10.0,60.0,322,-1.0,4


<a id = 'eda'></a>
## Exploring the Frames

In [47]:
# Checking for null values
df_frames.isnull().mean()

game_id                 0.000000
frame_index             0.000000
fox_cstick_x            0.000000
fox_cstick_y            0.000000
fox_joystick_x          0.000000
fox_joystick_y          0.000000
fox_trigger_analog      0.000000
fox_Y                   0.000000
fox_X                   0.000000
fox_B                   0.000000
fox_A                   0.000000
fox_L                   0.000000
fox_R                   0.000000
fox_Z                   0.000000
fox_Dpad_Up             0.000000
fox_Dpad_Down           0.000000
fox_Dpad_Right          0.000000
fox_Dpad_Left           0.000000
fox_combo_count         0.000000
fox_dmg                 0.000000
fox_direction           0.000000
fox_last_hit_by         0.540518
fox_position_x          0.000000
fox_position_y          0.000000
fox_shield              0.000000
fox_state               0.000000
fox_state_age           0.000000
fox_stocks              0.000000
falco_cstick_x          0.000000
falco_cstick_y          0.000000
falco_joys

It appears that the `last_hit_by` features have a lot of missing data. Since I want the final product of this tool to work with a user's raw game files, I cannot expect a user to take the time to clean their game files and impute the missing values. As such, I will leave these values as they are and see how well my model does.
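With more than fifty columns, it can be easier to list only the columns that actually contain nulls. A small sketch on made-up data (`demo` and its values are invented for illustration, not taken from the replays):

```python
import numpy as np
import pandas as pd

# Toy frame: one complete column and one that is half missing
demo = pd.DataFrame({
    'fox_dmg': [0.0, 0.0, 12.5, 12.5],
    'fox_last_hit_by': [np.nan, np.nan, 3.0, 3.0],
})

# Share of missing values per column, as in the cell above
null_share = demo.isnull().mean()

# Keep only the columns with at least one missing value
cols_with_nulls = null_share[null_share > 0]
```

Applied to `df_frames`, the same filter would presumably leave only the `last_hit_by` columns.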

In [48]:
df_frames.shape[0] == sum(df_games['duration'])

True
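The check above compares totals only, so two per-game errors could cancel each other out. A per-game version is sketched below on invented data. It assumes that both frames share a `game_id` column in the same format, which may not hold for `df_games` without some cleaning; `frames_demo` and `games_demo` are hypothetical names.

```python
import pandas as pd

# Toy stand-ins for df_frames and df_games
frames_demo = pd.DataFrame({'game_id': ['g1', 'g1', 'g1', 'g2', 'g2']})
games_demo = pd.DataFrame({'game_id': ['g1', 'g2'], 'duration': [3, 2]})

# Number of parsed rows per game
frames_per_game = frames_demo.groupby('game_id').size()

# Expected number of frames per game, aligned to the same index
expected = games_demo.set_index('game_id')['duration'].reindex(frames_per_game.index)

# Games whose row count differs from the recorded duration
mismatched = frames_per_game[frames_per_game != expected]
```

An empty `mismatched` means every game contributed exactly as many rows as its recorded duration.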

According to user __Summate__ from the Slippi Discord channel, each character has unique action states. That much is obvious, but the crucial point is that two characters' action state values can be identical while representing different moves. For example, an action state value of 341 means, for both Fox and Falco, that the character will fire a laser from his blaster. The values are the same and the animations are very similar, if not identical, but the effects of the two lasers are very different: Fox can fire his blaster faster than Falco, but his laser does not put the opponent in hitstun. It is therefore crucial to map these values so that they are unique to each character. That way, the model can interpret them as different moves rather than the same one.

In [49]:
# If the value is less than 341 or greater than 375, then it is not a unique action state to Fox
# Otherwise, it is a unique action state to Fox, so add 42 to each state to make it unique
df_frames['fox_state'] = df_frames['fox_state'].map(lambda val: val if (val < 341) or (val > 375) else val + 42)

In [50]:
# Same idea as the above cell, but add 77 to each unique action state value
df_frames['falco_state'] = df_frames['falco_state'].map(lambda val: val if (val < 341) or (val > 375) else val + 77)
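The same remapping can be written without a lambda by using a boolean mask. This is a sketch of an equivalent approach, not the code that was run above; `offset_unique_states` is a hypothetical helper name.

```python
import pandas as pd


def offset_unique_states(states, offset, low=341, high=375):
    """Add offset to values in [low, high]; leave every other value unchanged."""
    in_range = states.between(low, high)  # inclusive on both ends
    # where() keeps values where the condition is True and substitutes elsewhere
    return states.where(~in_range, states + offset)


demo = pd.Series([322, 341, 375, 376])
fox_demo = offset_unique_states(demo, 42)
falco_demo = offset_unique_states(demo, 77)
```

With these offsets, Fox's character-specific states land in 383 to 417 and Falco's in 418 to 452, so the two ranges do not overlap.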

Since we need to tell the model how much time elapses from one row to the next, we need to reset the index of the dataframe. I will use video editing as an analogy to explain why resetting the index achieves this.

If the index is reset, then the first row has an index of 0 and the final row has an index of 96,710. This is similar to taking the videos of all 12 games and appending them to one another to make one large video that is 96,711 frames long.

In [51]:
df_frames.reset_index(inplace = True, drop = True)
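To make the analogy concrete, here is a toy sketch with two invented "games" (`game_a` and `game_b` are made up for illustration). Each starts with its own index, and resetting after concatenation gives one continuous timeline.

```python
import pandas as pd

# Two tiny games, each with its own per-game frame_index
game_a = pd.DataFrame({'game_id': 'a', 'frame_index': [-123, -122, -121]})
game_b = pd.DataFrame({'game_id': 'b', 'frame_index': [-123, -122]})

# After concatenation the index repeats: 0, 1, 2, 0, 1
combined = pd.concat([game_a, game_b])
index_before = combined.index.tolist()

# After the reset the index runs continuously across both games
combined.reset_index(drop=True, inplace=True)
index_after = combined.index.tolist()
```

The per-game position is still available in `frame_index`, while the new index plays the role of the frame counter of the single appended video.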

In [52]:
df_frames.columns

Index(['game_id', 'frame_index', 'fox_cstick_x', 'fox_cstick_y',
       'fox_joystick_x', 'fox_joystick_y', 'fox_trigger_analog', 'fox_Y',
       'fox_X', 'fox_B', 'fox_A', 'fox_L', 'fox_R', 'fox_Z', 'fox_Dpad_Up',
       'fox_Dpad_Down', 'fox_Dpad_Right', 'fox_Dpad_Left', 'fox_combo_count',
       'fox_dmg', 'fox_direction', 'fox_last_hit_by', 'fox_position_x',
       'fox_position_y', 'fox_shield', 'fox_state', 'fox_state_age',
       'fox_stocks', 'falco_cstick_x', 'falco_cstick_y', 'falco_joystick_x',
       'falco_joystick_y', 'falco_trigger_analog', 'falco_Y', 'falco_X',
       'falco_B', 'falco_A', 'falco_L', 'falco_R', 'falco_Z', 'falco_Dpad_Up',
       'falco_Dpad_Down', 'falco_Dpad_Right', 'falco_Dpad_Left',
       'falco_combo_count', 'falco_dmg', 'falco_direction',
       'falco_last_hit_by', 'falco_position_x', 'falco_position_y',
       'falco_shield', 'falco_state', 'falco_state_age', 'falco_stocks'],
      dtype='object')

After checking the set of values for each button on the controller, I found that the buttons below are never pressed. This makes sense for Start and the Dpad buttons. If a player pauses mid-match, they must forfeit one stock under tournament rules. The Dpad buttons cause a character to taunt, and players rarely taunt in a match because it is usually considered bad sportsmanship and leaves their character vulnerable for a long time.

In [53]:
set(df_frames['falco_trigger_analog'].values)

{0}

In [54]:
set(df_frames['fox_trigger_analog'].values)

{0}

In [55]:
set(df_frames['falco_Dpad_Up'].values)

{0}

In [56]:
set(df_frames['falco_Dpad_Down'].values)

{0}

In [57]:
set(df_frames['falco_Dpad_Left'].values)

{0}

In [58]:
set(df_frames['falco_Dpad_Right'].values)

{0}

In [59]:
set(df_frames['fox_Dpad_Up'].values)

{0}

In [60]:
set(df_frames['fox_Dpad_Down'].values)

{0}

In [61]:
set(df_frames['fox_Dpad_Left'].values)

{0}

In [62]:
set(df_frames['fox_Dpad_Right'].values)

{0}
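The ten checks above can also be done in one pass. A minimal sketch on invented data (`demo` is a toy frame, not `df_frames`):

```python
import pandas as pd

demo = pd.DataFrame({
    'fox_A': [0, 1, 0],
    'fox_Dpad_Up': [0, 0, 0],
    'fox_trigger_analog': [0, 0, 0],
})

# Columns whose only observed value is 0, i.e. buttons that were never pressed
never_pressed = [col for col in demo.columns if demo[col].eq(0).all()]
```

On `df_frames`, the list comprehension could be restricted to the button columns so that numeric features such as damage are not included by accident.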

I will keep the Dpad buttons in because taunting is an action that a character can perform, and in niche situations a taunt can affect the outcome of a skirmish. In addition, Samus can perform an extended grab using the Dpad buttons, something that is exclusive to her.
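An all-zero column can also come from a dictionary key that never matches the text returned by `str()`. If the parser named a member `DPAD_UP`, for instance, a key spelled `Dpad-Up` would never match. It may therefore be worth confirming that the keys of `fox_button_dict` agree with the names the parser uses. Testing the flags directly avoids string parsing altogether. The sketch below assumes the logical buttons behave like a Python `enum.IntFlag`; `DemoButtons` is a hypothetical stand-in with invented values, not the parser's actual class.

```python
from enum import IntFlag


class DemoButtons(IntFlag):
    """Hypothetical stand-in for a logical-button bitmask (values are invented)."""
    A = 1
    B = 2
    DPAD_UP = 4


def decode_buttons(pressed, flag_type):
    """Map every member name of flag_type to 1 if it is set in pressed, else 0."""
    return {member.name: int(bool(pressed & member)) for member in flag_type}


# A frame in which A and Dpad-Up are held at the same time
decoded = decode_buttons(DemoButtons.A | DemoButtons.DPAD_UP, DemoButtons)
```

Because the keys come from the enum itself, this approach cannot silently miss a button through a spelling difference.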

In [63]:
df_frames.to_csv('../data/fox-falco-fd-frames.csv')
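By default, `to_csv` also writes the index, which is what produces an `Unnamed: 0` column when the file is read back in, as happened with the metadata file at the top of this notebook. A small sketch of the difference, using an in-memory buffer and invented data rather than the real file:

```python
import io

import pandas as pd

demo = pd.DataFrame({'game_id': ['a', 'a'], 'frame_index': [-123, -122]})

# Default behaviour: the index is written as an unnamed first column
with_index = io.StringIO()
demo.to_csv(with_index)
with_index.seek(0)
cols_with_index = pd.read_csv(with_index).columns.tolist()

# index=False leaves the index out of the file
without_index = io.StringIO()
demo.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = pd.read_csv(without_index).columns.tolist()
```

Passing `index = False` when saving would remove the need to drop that column later, provided no later notebook relies on it.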
