# Teamfight Tactics - Set 2: Top Player Analysis

An analysis of the top 200 players for each region at the end of Set 2.

This notebook brings all of the data sources together and formats them so that they are ready for further analysis.

## Data Sources

### tft.db
 - **players**: Data for the top 200 players for each region acquired from [lolchess.gg](https://lolchess.gg/)
 - **matches**: Data for 10 matches for each of the top players acquired from [lolchess.gg](https://lolchess.gg/)

## Imports

In [2]:
import sqlite3
import csv
import pickle
from datetime import datetime
from pathlib import Path

## Input/Output file paths and directories

In [3]:
today = datetime.today()
db_path = Path.cwd() / ".." / "data" / "raw" / "tft.db"
players_file_path = Path.cwd() / ".." / "data" / "interim" / "players_data.csv"
matches_file_path = Path.cwd() / ".." / "data" / "interim" / "matches_data.csv"
matches_pkl_traits_path = Path.cwd() / ".." / "data" / "interim" / "matches_traits.pkl"
matches_pkl_units_path = Path.cwd() / ".." / "data" / "interim" / "matches_units.pkl"

## Retrieve the Data from the database

In [4]:
# Initialise a connection to the database
conn = sqlite3.connect(db_path)
cur = conn.cursor()

In [5]:
# Get all of the data from the players table
cur.execute('SELECT * FROM players')
players = cur.fetchall()

# Get the column names from the players table
players_cols = [description[0] for description in cur.description]

In [6]:
# Get all of the data from the matches table
cur.execute('SELECT * FROM matches')
matches = cur.fetchall()

# Get the column names from the matches table
matches_cols = [description[0] for description in cur.description]

## Format the Data

The data from both tables in the database needs to be formatted before it can be analysed and written to a new file. The raw data must remain unchanged throughout the formatting. The following tasks will be performed:
- **Column Heading Preparation**
    - Separate multiple words with "_"
    - Remove leading and trailing whitespace
    - Convert to lowercase
- **Data Preparation**
    - Make sure all data is in the type expected/required
    - Manipulate data into necessary format (output into a single pickle file)
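
The three heading steps listed above could also be applied in a single pass. The following is a minimal sketch, not the approach this notebook takes: `normalise_heading` and the sample headings are hypothetical, and it assumes words are separated by spaces or hyphens.

```python
import re

def normalise_heading(heading):
    """Strip whitespace, lowercase, and join words with underscores."""
    cleaned = heading.strip().lower()
    # Collapse any run of whitespace or hyphens between words into a single "_"
    return re.sub(r"[\s\-]+", "_", cleaned)

# Hypothetical raw headings, not taken from tft.db
raw_headings = [" Win Rate ", "LP", "player id"]
clean_headings = [normalise_heading(h) for h in raw_headings]
print(clean_headings)  # ['win_rate', 'lp', 'player_id']
```

Headings with no separator between words would still need to be renamed by hand.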

### Column Heading Preparation

#### Separate multiple words with "_"

In [7]:
players_cols[0] = 'player_id'
players_cols[5] = 'win_rate'

matches_cols[0] = 'match_id'
matches_cols[1] = 'player_id'

#### Remove leading and trailing whitespace

In [8]:
players_cols = [column.strip() for column in players_cols]
matches_cols = [column.strip() for column in matches_cols]

#### Convert to lowercase

In [9]:
players_cols = [column.lower() for column in players_cols]
matches_cols = [column.lower() for column in matches_cols]

### Data Preparation

#### Make sure all data is in the type expected/required

1 - Define a helper function that checks a value's type and prints any mismatch:

In [10]:
def check_type(var, expected):
    '''
    Check that the variable is of the expected type
    and if it isn't then print out the type mismatch
    '''
    if not isinstance(var, expected):
        print('TYPE MISMATCH: {0} is not type {1}'.format(var, expected))

2 - Check the types for each field in the players table:

In [11]:
# player_id
check_type(players[0][0], int)

# rank
check_type(players[0][1], int)

# name
check_type(players[0][2], str)

# tier
check_type(players[0][3], str)

# lp
check_type(players[0][4], int)

# win_rate
check_type(players[0][5], float)

# played
check_type(players[0][6], int)

# wins
check_type(players[0][7], int)

# losses
check_type(players[0][8], int)

# region
check_type(players[0][9], str)
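
The checks above inspect only the first row of the table. As a rough sketch of how the same idea might be extended to every row, the hypothetical helper below collects mismatches instead of printing them. The function name, the sample rows and the expected types are all assumptions made for illustration.

```python
def find_type_mismatches(rows, expected_types):
    """Return (row_index, column_index, value) for every value of the wrong type."""
    mismatches = []
    for row_index, row in enumerate(rows):
        for column_index, (value, expected) in enumerate(zip(row, expected_types)):
            if not isinstance(value, expected):
                mismatches.append((row_index, column_index, value))
    return mismatches

# Hypothetical rows shaped like (player_id, rank, name); not real data
sample_rows = [(1, 1, "alice"), (2, "2", "bob")]
sample_types = (int, int, str)
print(find_type_mismatches(sample_rows, sample_types))  # [(1, 1, '2')]
```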

3 - Check the types for each field in the matches table:

1 - Write the units and traits values to .pkl files. The values are already pickled, so write the bytes directly to the file rather than pickling them a second time
2 - Unpickle these files
3 - Check the types
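
The sketch below illustrates why the bytes are written directly, assuming the traits and units columns hold bytes that were pickled before being stored. The `blob` value and the file name are hypothetical stand-ins, and a temporary directory is used so nothing is left on disk.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a BLOB column value: bytes that are already pickled
blob = pickle.dumps(["Inferno", "Mage"])

with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_path = Path(tmp_dir) / "example_traits.pkl"
    # Write the bytes as-is; calling pickle.dump here would pickle them a second time
    pkl_path.write_bytes(blob)
    with open(pkl_path, "rb") as pkl_file:
        traits_from_file = pickle.load(pkl_file)

# pickle.loads gives the same result without touching the disk
traits_from_bytes = pickle.loads(blob)
print(traits_from_file == traits_from_bytes)  # True
```

Unpickling can execute arbitrary code, so this is only appropriate for data from a trusted source.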

The final output of all of this should be a CSV file.
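
One possible shape for that CSV output, using the standard `csv` module, is sketched here. The headings and rows are hypothetical, and an in-memory buffer stands in for the real output file.

```python
import csv
import io

# Hypothetical headings and rows; the real ones come from the players table
example_cols = ["player_id", "rank", "name"]
example_rows = [(1, 1, "alice"), (2, 2, "bob")]

# An in-memory buffer stands in for the file here.
# With a real path, open the file with newline="" as the csv docs recommend.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(example_cols)   # header row
writer.writerows(example_rows)  # one line per data row

csv_text = buffer.getvalue()
print(csv_text)
```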

In [23]:
with open(matches_pkl_traits_path, 'wb') as traits_pkl_file:
    traits_pkl_file.write(matches[0][5])
    
with open(matches_pkl_traits_path, 'rb') as traits_pkl_file:
    traits = pickle.load(traits_pkl_file)
    
with open(matches_pkl_units_path, 'wb') as units_pkl_file:
    units_pkl_file.write(matches[0][6])
    
with open(matches_pkl_units_path, 'rb') as units_pkl_file:
    units = pickle.load(units_pkl_file)

# match_id
check_type(matches[0][0], int)

# player_id
check_type(matches[0][1], int)

# placement
check_type(matches[0][2], int)

# mode
check_type(matches[0][3], str)

# length
check_type(matches[0][4], datetime)
print(type(matches[0][4]))

# traits
check_type(traits, list)

# units
check_type(units, list)

TYPE MISMATCH: 37:45 is not type <class 'datetime.datetime'>
<class 'str'>
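
The mismatch above shows that the length field is stored as a string such as `37:45` rather than as a `datetime`. If that string is a duration in minutes and seconds (an assumption, since the format is not documented here), it could be converted to a `timedelta` along these lines. `parse_match_length` is a hypothetical helper.

```python
from datetime import timedelta

def parse_match_length(length_text):
    """Convert a "MM:SS" string into a timedelta (assumed format)."""
    minutes_text, seconds_text = length_text.split(":")
    return timedelta(minutes=int(minutes_text), seconds=int(seconds_text))

match_length = parse_match_length("37:45")
print(match_length.total_seconds())  # 2265.0
```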
