# Position Predictor (In Progress)

`Given the positions of 1, 3, or 5 players plus the ball, have the model predict the position of the remaining player.`

The intent of this notebook is to document model selection for this position prediction task.

In [1]:
import carball
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import importlib

## Data Exploration

Before doing anything else, I need to understand what the data contains and what can be done with it.

In [2]:
import data
importlib.reload(data)
data_manager = data.Calculated()

In [3]:
replays = data_manager.get_replay_list(num=10)

In [4]:
replays

['D4DE3A894229D5B8A5D0AE9091D3CA6C',
 '4DC852DA4D28F5D9047C509CA03412C8',
 'EF578C404F791CE044EFDEB0036DA5EA',
 'B828B7FD472EE7A19E9A9C8ECB7CA14B',
 'A21C39DD402315E9831EC58C7A331F64',
 '1C1AA99611E7A6E035C32F9551CD5D38',
 '7901299C11E7D3B2E06E8CB0DF9F6172',
 '997C8F1E11E7F7D384BBD0BC1A6F09FC',
 'F15A0B1211E814E30254D78BDDA89CDC',
 'B61A9AA211E829665CEB87A590E3C53A']

In [5]:
df0 = data_manager.get_pandas(replays[1])

In [6]:
df0.head()

| | Hunter746 | Hunter746 | Hunter746 | Hunter746 | Hunter746 | Hunter746 | Hunter746 | Hunter746 | Hunter746 | Hunter746 | ... | xcuttington21 | xcuttington21 | xcuttington21 | xcuttington21 | game | game | game | game | game | game |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| index | pos_x | pos_y | pos_z | rot_x | rot_y | rot_z | vel_x | vel_y | vel_z | ang_vel_x | ... | double_jump_active | dodge_active | ping | boost_collect | time | delta | seconds_remaining | replicated_seconds_remaining | ball_has_been_hit | goal_number |
| 0 | -2048.0 | -2560.0 | 18.0 | -6.272256 | -2.356183 | -0.00278 | 0.0 | 0.0 | 81.0 | 0.0 | ... | False | False | | | 19.28497 | 0.004 | 300.0 | | | |
| 1 | -2048.0 | -2560.0 | 18.0 | -6.272256 | -2.356183 | -0.00278 | 0.0 | 0.0 | 81.0 | 0.0 | ... | False | False | | | 19.321173 | 0.036207 | 300.0 | | | |
| 2 | -2048.0 | -2560.0 | 18.0 | -6.272256 | -2.356183 | -0.00278 | 0.0 | 0.0 | 81.0 | 0.0 | ... | False | False | | | 19.35717 | 0.036002 | 300.0 | | | |
| 3 | -2048.0 | -2560.0 | 18.0 | -6.272256 | -2.356183 | -0.00278 | 0.0 | 0.0 | 81.0 | 0.0 | ... | False | False | | | 19.393167 | 0.036002 | 300.0 | | | |
| 4 | -2048.0 | -2560.0 | 18.0 | -6.272256 | -2.356183 | -0.00278 | 0.0 | 0.0 | 81.0 | 0.0 | ... | False | False | | | 19.429302 | 0.036138 | 300.0 | | | |


In [7]:
df0.columns

MultiIndex(levels=[['Hunter746', 'Hunters Coach', 'ball', 'game', 'jjgamer345', 'madkillerDC', 'oeskrew187', 'xcuttington21'], ['ang_vel_x', 'ang_vel_y', 'ang_vel_z', 'ball_cam', 'ball_has_been_hit', 'boost', 'boost_active', 'boost_collect', 'delta', 'dodge_active', 'double_jump_active', 'goal_number', 'handbrake', 'hit_team_no', 'jump_active', 'ping', 'pos_x', 'pos_y', 'pos_z', 'replicated_seconds_remaining', 'rot_x', 'rot_y', 'rot_z', 'seconds_remaining', 'steer', 'throttle', 'time', 'vel_x', 'vel_y', 'vel_z']],
           codes=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 3, 3, 3, 3, 3, 3]

In [8]:
df0['ball'].tail()

| index | pos_x | pos_y | pos_z | rot_x | rot_y | rot_z | vel_x | vel_y | vel_z | ang_vel_x | ang_vel_y | ang_vel_z | hit_team_no |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11145 | 3304 | 1088 | 157 | -5.481956 | -1.391006 | -1.539565 | -4687.0 | -13167.0 | -6975.0 | -4736.0 | -1810.0 | 2918.0 | 0.0 |
| 11146 | 3281 | 1024 | 123 | -5.28637 | -1.30213 | -1.596035 | -2668.0 | -16516.0 | -2480.0 | 1694.0 | -53.0 | 1439.0 | 0.0 |
| 11147 | 3272 | 969 | 115 | -5.340156 | -1.279407 | -1.626716 | -2665.0 | -16500.0 | -2694.0 | 1694.0 | -53.0 | 1439.0 | 0.0 |
| 11148 | 3263 | 914 | 105 | -5.393559 | -1.25563 | -1.657204 | -2662.0 | -16483.0 | -2908.0 | 1694.0 | -53.0 | 1439.0 | 0.0 |
| 11149 | 3254 | 859 | 95 | -5.446386 | -1.230799 | -1.6875 | -2660.0 | -16466.0 | -3122.0 | 1694.0 | -53.0 | 1439.0 | 0.0 |


In [9]:
df0['game'].head()

| index | time | delta | seconds_remaining | replicated_seconds_remaining | ball_has_been_hit | goal_number |
|---|---|---|---|---|---|---|
| 0 | 19.28497 | 0.004 | 300.0 | | | |
| 1 | 19.321173 | 0.036207 | 300.0 | | | |
| 2 | 19.35717 | 0.036002 | 300.0 | | | |
| 3 | 19.393167 | 0.036002 | 300.0 | | | |
| 4 | 19.429302 | 0.036138 | 300.0 | | | |


In [10]:
df0['Hunter746'].tail()

| index | pos_x | pos_y | pos_z | rot_x | rot_y | rot_z | vel_x | vel_y | vel_z | ang_vel_x | ... | steer | handbrake | ball_cam | jump_active | double_jump_active | boost | boost_active | dodge_active | ping | boost_collect |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11145 | 3559.0 | 904.0 | 18.0 | -6.263627 | 0.570314 | -0.006328 | -17230.0 | -10152.0 | 108.0 | -87.0 | ... | 156.0 | | True | 4 | False | 170.129738 | True | 4 | 6 | |
| 11146 | 3559.0 | 904.0 | 18.0 | -6.263627 | 0.570314 | -0.006328 | -17230.0 | -10152.0 | 108.0 | -87.0 | ... | 156.0 | | True | 4 | False | 167.022022 | True | 4 | 6 | |
| 11147 | 3430.0 | 822.0 | 18.0 | -6.2689 | 0.630524 | -0.00441 | -17207.0 | -11465.0 | 93.0 | -36.0 | ... | 0.0 | | True | 4 | False | 163.925097 | True | 4 | 6 | |
| 11148 | 3430.0 | 822.0 | 18.0 | -6.2689 | 0.630524 | -0.00441 | -17207.0 | -11465.0 | 93.0 | -36.0 | ... | 0.0 | | True | 4 | False | 160.821183 | True | 4 | 6 | |
| 11149 | 3282.0 | 726.0 | 18.0 | -6.271105 | 0.540497 | -0.003452 | -18253.0 | -11285.0 | 87.0 | -13.0 | ... | 0.0 | | True | 4 | False | 157.723972 | True | 4 | 6 | |


In [11]:
df0['Hunter746'].columns

Index(['pos_x', 'pos_y', 'pos_z', 'rot_x', 'rot_y', 'rot_z', 'vel_x', 'vel_y',
       'vel_z', 'ang_vel_x', 'ang_vel_y', 'ang_vel_z', 'throttle', 'steer',
       'handbrake', 'ball_cam', 'jump_active', 'double_jump_active', 'boost',
       'boost_active', 'dodge_active', 'ping', 'boost_collect'],
      dtype='object')

## Preprocessing

This stage takes replay dataframes and protobufs and prepares them as a NumPy matrix suitable for model ingestion and analysis. There are two candidate schemas for state representation that I want to explore.


### Schema: raw positions

Records shall be represented as flat position vectors, laid out as shown below:

```
<ball> <t0_p0> <t0_p1> <t0_p2> <t1_p0> <t1_p1> <t1_p2>
foreach: <pos_x> <pos_y> <pos_z>
```
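As a rough sketch of how this schema could be built from a replay dataframe like `df0`, the function below selects `pos_x`, `pos_y`, `pos_z` for the ball and each player and concatenates them in schema order. The function name `to_raw_positions` and the `team0`/`team1` arguments are hypothetical: the dataframe shown above does not carry team membership, so the sketch assumes the team rosters are supplied separately (for example from the protobuf). The demo frame is synthetic.

```python
import numpy as np
import pandas as pd

POS_COLS = ["pos_x", "pos_y", "pos_z"]

def to_raw_positions(df, team0, team1):
    """Flatten a replay dataframe into rows of <ball> <t0_p*> <t1_p*> positions.

    `team0` and `team1` are lists of player names (an assumed input; team
    membership is not part of the dataframe itself).
    """
    entities = ["ball"] + list(team0) + list(team1)
    # One (n_frames, 3) block per entity, in schema order.
    blocks = [df[name][POS_COLS].to_numpy(dtype=np.float32) for name in entities]
    return np.concatenate(blocks, axis=1)

# Synthetic stand-in for a replay dataframe with (entity, field) columns.
names = ["ball", "a", "b", "c", "d", "e", "f"]
columns = pd.MultiIndex.from_product([names, POS_COLS])
demo = pd.DataFrame(np.arange(42, dtype=float).reshape(2, 21), columns=columns)

records = to_raw_positions(demo, team0=["a", "b", "c"], team1=["d", "e", "f"])
print(records.shape)  # one 21-value record per frame
```

Frames where a player has no position (NaN) would still need to be filtered or imputed before training; this sketch does not handle that.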

### Schema: rasterized positions

Records shall be represented as 3-channel rasterized images, with one channel each for the ball, team_0, and team_1. These can be generated from the condensed (raw-position) records.
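One possible way to rasterize a raw-position record is sketched below. It is a top-down projection that discards `pos_z`, which is a simplification rather than something this notebook specifies. The field half-extents (`FIELD_X`, `FIELD_Y`), the grid size, and the function name `rasterize` are all assumptions chosen for illustration.

```python
import numpy as np

# Assumed field half-extents in game units; not stated in this notebook.
FIELD_X = 4096.0
FIELD_Y = 5120.0

def rasterize(record, shape=(80, 64)):
    """Turn one 21-value raw-position record into a (3, H, W) occupancy image.

    Channel 0 is the ball, channel 1 is team_0, channel 2 is team_1.
    pos_z is ignored (top-down view). Assumes the record has no NaNs.
    """
    height, width = shape
    points = np.asarray(record, dtype=np.float32).reshape(7, 3)
    channels = np.array([0, 1, 1, 1, 2, 2, 2])
    image = np.zeros((3, height, width), dtype=np.float32)
    # Map pos_x to a column and pos_y to a row, clipping out-of-bounds values.
    cols = ((points[:, 0] + FIELD_X) / (2 * FIELD_X) * (width - 1)).round()
    rows = ((points[:, 1] + FIELD_Y) / (2 * FIELD_Y) * (height - 1)).round()
    cols = np.clip(cols, 0, width - 1).astype(int)
    rows = np.clip(rows, 0, height - 1).astype(int)
    # np.add.at accumulates correctly when two players share a cell.
    np.add.at(image, (channels, rows, cols), 1.0)
    return image

image = rasterize(np.zeros(21))
print(image.shape)
```

A finer grid or a Gaussian splat per entity may work better than single-cell occupancy; that choice is left open here.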

### Reordering

The data could then be expanded to capture the symmetries of team and player reordering. Player ordering within a team is arbitrary in the condensed schema, so each record could generate `3! * 3! = 36` permutations. Team ordering is not arbitrary, but teams could be swapped by negating `pos_y` values. This would increase the total available permutations to `36 * 2 = 72`. There is also x-axis symmetry, so negating `pos_x` values gives another mirror, bringing the total to `72 * 2 = 144` permutations.
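The count of 144 can be checked with a small generator over one raw-position record. This is only a sketch: the function name `augment` is hypothetical, and it interprets the team swap as exchanging the two team blocks while negating `pos_y` for every entity (ball included), which is my reading of the description above.

```python
from itertools import permutations

import numpy as np

def augment(record):
    """Yield the 144 symmetric variants of one 21-value raw-position record."""
    points = np.asarray(record, dtype=np.float32).reshape(7, 3)
    ball, team0, team1 = points[:1], points[1:4], points[4:]
    for order0 in permutations(range(3)):        # 3! orderings of team_0
        for order1 in permutations(range(3)):    # 3! orderings of team_1
            a, b = team0[list(order0)], team1[list(order1)]
            for swap in (False, True):           # team swap: negate pos_y
                for mirror in (False, True):     # x mirror: negate pos_x
                    first, second = (b, a) if swap else (a, b)
                    sign = np.array(
                        [-1.0 if mirror else 1.0, -1.0 if swap else 1.0, 1.0],
                        dtype=np.float32,
                    )
                    out = np.concatenate([ball, first, second]) * sign
                    yield out.reshape(-1)

record = np.arange(1, 22, dtype=np.float32)
variants = list(augment(record))
print(len(variants))  # 3! * 3! * 2 * 2
```

Generating variants lazily like this avoids storing a 144x copy of the dataset; they could also be sampled at random per batch instead of enumerated.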


### Dropout

The original plan was to omit one player and make the predictor guess its position. If we have already generated all reorderings, it should be sufficient to simply omit the final player. However, I would also like to explore the possibility of a GAN or autoencoder where one position within a valid record is fabricated, and the predictor must reconstruct the valid position from the invalid one.
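Both ideas can be sketched on raw-position records. `split_last_player` drops the final player to form an input/target pair, and `fabricate_one_player` overwrites one randomly chosen player slot with a uniformly sampled position so that a reconstruction model has a (corrupted, clean) pair to learn from. The function names, the uniform sampling, and the bounds `LOW`/`HIGH` are assumptions for illustration, not a settled design.

```python
import numpy as np

def split_last_player(records):
    """Split (N, 21) records into 18-value inputs and the final player's position."""
    records = np.asarray(records, dtype=np.float32)
    features = records[:, :-3]   # ball + first five players
    target = records[:, -3:]     # final player's pos_x, pos_y, pos_z
    return features, target

def fabricate_one_player(record, rng, low, high):
    """Return (corrupted, clean), with one player slot replaced by a random position."""
    clean = np.asarray(record, dtype=np.float32)
    corrupted = clean.copy()
    slot = rng.integers(1, 7)    # slots 1..6 are players; slot 0 is the ball
    corrupted[3 * slot:3 * slot + 3] = rng.uniform(low, high)
    return corrupted, clean

# Assumed sampling bounds in game units (x, y, z); not stated in this notebook.
LOW = np.array([-4096.0, -5120.0, 0.0])
HIGH = np.array([4096.0, 5120.0, 2044.0])

rng = np.random.default_rng(0)
records = rng.uniform(LOW, HIGH, size=(4, 7, 3)).reshape(4, 21)
features, target = split_last_player(records)
corrupted, clean = fabricate_one_player(records[0], rng, LOW, HIGH)
print(features.shape, target.shape)
```

Uniform sampling produces fabricated positions that are often easy to spot; sampling from other frames of the same replay may give harder, more realistic negatives.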