# 1. <a id='toc1_'></a>[Preparation for the NBA Dex app](#toc0_)

The complete glossary of the statistics used in this notebook is available at: https://www.nba.com/stats/help/glossary

**Table of contents**<a id='toc0_'></a>    
- 1. [Preparation for the NBA Dex app](#toc1_)    
- 2. [Imports](#toc2_)    
  - 2.1. [Libraries](#toc2_1_)    
  - 2.2. [Data](#toc2_2_)    
- 3. [Data preparation](#toc3_)    
  - 3.1. [Merging the dataframes](#toc3_1_)    
  - 3.2. [Converting and creating new features](#toc3_2_)    
    - 3.2.1. [Converting player heights from inches to cm](#toc3_2_1_)    
    - 3.2.2. [Converting player weights from pounds to kg](#toc3_2_2_)    
  - 3.3. [Selecting features to be shown](#toc3_3_)    
  - 3.4. [Exporting the filtered DataFrame as a CSV file](#toc3_4_)    
- 4. [Making some charts](#toc4_)    
- 5. [Transforming the numerical attributes to be plotted as radar charts](#toc5_)    
    - 5.1.1. [Exporting the transformed DataFrame as a CSV file](#toc5_1_1_)    
- 6. [Plotting the radar charts](#toc6_)    
    - 6.1.1. [Importing the transformed data](#toc6_1_1_)    

<!-- vscode-jupyter-toc-config
	numbering=true
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

# 2. <a id='toc2_'></a>[Imports](#toc0_)

## 2.1. <a id='toc2_1_'></a>[Libraries](#toc0_)

In [85]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go

pd.set_option('display.max_columns', None)

## 2.2. <a id='toc2_2_'></a>[Data](#toc0_)

In [3]:
players_trad = pd.read_csv('/home/bruno/repos/NBA_2022-2023/data/scraped_2022-23/players_stats_2022-23.csv', low_memory=False)
players_bios = pd.read_csv('/home/bruno/repos/NBA_2022-2023/data/scraped_2022-23/players_bios_2022-23.csv', low_memory=False)
players_hustle = pd.read_csv('/home/bruno/repos/NBA_2022-2023/data/scraped_2022-23/players_hustle_2022-23.csv', low_memory=False)
players_index = pd.read_csv('/home/bruno/repos/NBA_2022-2023/data/scraped_2022-23/players_index_2022-23.csv', low_memory=False)

In [4]:
players_trad.columns

Index(['PLAYER_ID', 'PLAYER_NAME', 'NICKNAME', 'TEAM_ID', 'TEAM_ABBREVIATION',
       'AGE', 'GP', 'W', 'L', 'W_PCT', 'MIN', 'FGM', 'FGA', 'FG_PCT', 'FG3M',
       'FG3A', 'FG3_PCT', 'FTM', 'FTA', 'FT_PCT', 'OREB', 'DREB', 'REB', 'AST',
       'TOV', 'STL', 'BLK', 'BLKA', 'PF', 'PFD', 'PTS', 'PLUS_MINUS',
       'NBA_FANTASY_PTS', 'DD2', 'TD3', 'WNBA_FANTASY_PTS', 'GP_RANK',
       'W_RANK', 'L_RANK', 'W_PCT_RANK', 'MIN_RANK', 'FGM_RANK', 'FGA_RANK',
       'FG_PCT_RANK', 'FG3M_RANK', 'FG3A_RANK', 'FG3_PCT_RANK', 'FTM_RANK',
       'FTA_RANK', 'FT_PCT_RANK', 'OREB_RANK', 'DREB_RANK', 'REB_RANK',
       'AST_RANK', 'TOV_RANK', 'STL_RANK', 'BLK_RANK', 'BLKA_RANK', 'PF_RANK',
       'PFD_RANK', 'PTS_RANK', 'PLUS_MINUS_RANK', 'NBA_FANTASY_PTS_RANK',
       'DD2_RANK', 'TD3_RANK', 'WNBA_FANTASY_PTS_RANK'],
      dtype='object')

In [5]:
players_bios.columns

Index(['PLAYER_ID', 'PLAYER_NAME', 'TEAM_ID', 'TEAM_ABBREVIATION', 'AGE',
       'PLAYER_HEIGHT', 'PLAYER_HEIGHT_INCHES', 'PLAYER_WEIGHT', 'COLLEGE',
       'COUNTRY', 'DRAFT_YEAR', 'DRAFT_ROUND', 'DRAFT_NUMBER', 'GP', 'PTS',
       'REB', 'AST', 'NET_RATING', 'OREB_PCT', 'DREB_PCT', 'USG_PCT', 'TS_PCT',
       'AST_PCT'],
      dtype='object')

In [6]:
players_hustle.columns

Index(['PLAYER_ID', 'PLAYER_NAME', 'TEAM_ID', 'TEAM_ABBREVIATION', 'AGE', 'G',
       'MIN', 'CONTESTED_SHOTS', 'CONTESTED_SHOTS_2PT', 'CONTESTED_SHOTS_3PT',
       'DEFLECTIONS', 'CHARGES_DRAWN', 'SCREEN_ASSISTS', 'SCREEN_AST_PTS',
       'OFF_LOOSE_BALLS_RECOVERED', 'DEF_LOOSE_BALLS_RECOVERED',
       'LOOSE_BALLS_RECOVERED', 'PCT_LOOSE_BALLS_RECOVERED_OFF',
       'PCT_LOOSE_BALLS_RECOVERED_DEF', 'OFF_BOXOUTS', 'DEF_BOXOUTS',
       'BOX_OUTS', 'BOX_OUT_PLAYER_TEAM_REBS', 'BOX_OUT_PLAYER_REBS',
       'PCT_BOX_OUTS_OFF', 'PCT_BOX_OUTS_DEF', 'PCT_BOX_OUTS_TEAM_REB',
       'PCT_BOX_OUTS_REB'],
      dtype='object')

In [7]:
players_index.columns

Index(['PERSON_ID', 'PLAYER_LAST_NAME', 'PLAYER_FIRST_NAME', 'PLAYER_SLUG',
       'TEAM_ID', 'TEAM_SLUG', 'IS_DEFUNCT', 'TEAM_CITY', 'TEAM_NAME',
       'TEAM_ABBREVIATION', 'JERSEY_NUMBER', 'POSITION', 'HEIGHT', 'WEIGHT',
       'COLLEGE', 'COUNTRY', 'DRAFT_YEAR', 'DRAFT_ROUND', 'DRAFT_NUMBER',
       'ROSTER_STATUS', 'PTS', 'REB', 'AST', 'STATS_TIMEFRAME', 'FROM_YEAR',
       'TO_YEAR'],
      dtype='object')

In [8]:
# Align the index table's key and bio column names with the other tables
players_index = players_index.rename(columns={'PERSON_ID': 'PLAYER_ID',
                                              'HEIGHT': 'PLAYER_HEIGHT',
                                              'WEIGHT': 'PLAYER_WEIGHT'})

# 3. <a id='toc3_'></a>[Data preparation](#toc0_)

## 3.1. <a id='toc3_1_'></a>[Merging the dataframes](#toc0_)

In [9]:
df_aux = pd.merge(players_bios, players_hustle, how='left', on = ['PLAYER_ID', 'PLAYER_NAME', 'TEAM_ID', 'TEAM_ABBREVIATION', 'AGE'])

In [10]:
df_aux2 = pd.merge(df_aux, players_trad, how='left', on = ['PLAYER_ID', 'PLAYER_NAME', 'TEAM_ID', 'TEAM_ABBREVIATION', 'AGE', 'GP', 'PTS', 'REB', 'AST', 'MIN'])

In [11]:
df = pd.merge(df_aux2, players_index, how='left', on = ['PLAYER_ID', 'TEAM_ID', 'TEAM_ABBREVIATION', 'PTS', 
                                                        'REB', 'AST', 'PLAYER_HEIGHT', 'PLAYER_WEIGHT'])

In [12]:
# Drop the duplicated bio columns that came from players_index (suffix _y)
df = df.drop(columns=['COLLEGE_y', 'COUNTRY_y', 'DRAFT_YEAR_y', 'DRAFT_ROUND_y', 'DRAFT_NUMBER_y'])
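
A left join on several keys can silently leave unmatched rows filled with NaN, or duplicate rows when a key repeats. The sketch below shows one way such a merge could be checked with the `validate` and `indicator` arguments of `pd.merge`. The two toy frames and their values are invented for illustration and only stand in for `players_bios` and `players_hustle`.

```python
import pandas as pd

# Toy stand-ins for players_bios and players_hustle (values are invented).
bios = pd.DataFrame({
    'PLAYER_ID': [1, 2, 3],
    'PLAYER_NAME': ['A', 'B', 'C'],
    'PLAYER_HEIGHT_INCHES': [78, 77, 80],
})
hustle = pd.DataFrame({
    'PLAYER_ID': [1, 2],
    'PLAYER_NAME': ['A', 'B'],
    'DEFLECTIONS': [0.33, 0.23],
})

# validate='one_to_one' raises MergeError if a key is duplicated on either side;
# indicator=True adds a _merge column saying where each row came from.
checked = pd.merge(bios, hustle, how='left',
                   on=['PLAYER_ID', 'PLAYER_NAME'],
                   validate='one_to_one', indicator=True)

# Rows found only in the left frame have NaN in every right-hand column.
unmatched = checked.loc[checked['_merge'] == 'left_only', 'PLAYER_NAME'].tolist()
print(unmatched)
```

Running the same check on the real frames would show how many players have no hustle, traditional, or index record before the column list is reassigned.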

In [13]:
# print(*df.columns, sep= '\n')

In [14]:
df.columns = ['PLAYER_ID',
'PLAYER_NAME',
'TEAM_ID',
'TEAM_ABBREVIATION',
'AGE',
'PLAYER_HEIGHT_FT',
'PLAYER_HEIGHT_INCHES',
'PLAYER_WEIGHT_LBS',
'COLLEGE',
'COUNTRY',
'DRAFT_YEAR',
'DRAFT_ROUND',
'DRAFT_NUMBER',
'GP',
'PTS',
'REB',
'AST',
'NET_RATING',
'OREB_PCT',
'DREB_PCT',
'USG_PCT',
'TS_PCT',
'AST_PCT',
'G',
'MIN',
'CONTESTED_SHOTS',
'CONTESTED_SHOTS_2PT',
'CONTESTED_SHOTS_3PT',
'DEFLECTIONS',
'CHARGES_DRAWN',
'SCREEN_ASSISTS',
'SCREEN_AST_PTS',
'OFF_LOOSE_BALLS_RECOVERED',
'DEF_LOOSE_BALLS_RECOVERED',
'LOOSE_BALLS_RECOVERED',
'PCT_LOOSE_BALLS_RECOVERED_OFF',
'PCT_LOOSE_BALLS_RECOVERED_DEF',
'OFF_BOXOUTS',
'DEF_BOXOUTS',
'BOX_OUTS',
'BOX_OUT_PLAYER_TEAM_REBS',
'BOX_OUT_PLAYER_REBS',
'PCT_BOX_OUTS_OFF',
'PCT_BOX_OUTS_DEF',
'PCT_BOX_OUTS_TEAM_REB',
'PCT_BOX_OUTS_REB',
'NICKNAME',
'W',
'L',
'W_PCT',
'FGM',
'FGA',
'FG_PCT',
'FG3M',
'FG3A',
'FG3_PCT',
'FTM',
'FTA',
'FT_PCT',
'OREB',
'DREB',
'TOV',
'STL',
'BLK',
'BLKA',
'PF',
'PFD',
'PLUS_MINUS',
'NBA_FANTASY_PTS',
'DD2',
'TD3',
'WNBA_FANTASY_PTS',
'GP_RANK',
'W_RANK',
'L_RANK',
'W_PCT_RANK',
'MIN_RANK',
'FGM_RANK',
'FGA_RANK',
'FG_PCT_RANK',
'FG3M_RANK',
'FG3A_RANK',
'FG3_PCT_RANK',
'FTM_RANK',
'FTA_RANK',
'FT_PCT_RANK',
'OREB_RANK',
'DREB_RANK',
'REB_RANK',
'AST_RANK',
'TOV_RANK',
'STL_RANK',
'BLK_RANK',
'BLKA_RANK',
'PF_RANK',
'PFD_RANK',
'PTS_RANK',
'PLUS_MINUS_RANK',
'NBA_FANTASY_PTS_RANK',
'DD2_RANK',
'TD3_RANK',
'WNBA_FANTASY_PTS_RANK',
'PLAYER_LAST_NAME',
'PLAYER_FIRST_NAME',
'PLAYER_SLUG',
'TEAM_SLUG',
'IS_DEFUNCT',
'TEAM_CITY',
'TEAM_NAME',
'JERSEY_NUMBER',
'POSITION',
'ROSTER_STATUS',
'STATS_TIMEFRAME',
'FROM_YEAR',
'TO_YEAR']

## 3.2. <a id='toc3_2_'></a>[Converting and creating new features](#toc0_)

### 3.2.1. <a id='toc3_2_1_'></a>[Converting player heights from inches to cm](#toc0_)

In [15]:
# 1 inch = 2.54 cm; round to whole centimeters
df['PLAYER_HEIGHT_CM'] = (df['PLAYER_HEIGHT_INCHES'] * 2.54).round(0)

### 3.2.2. <a id='toc3_2_2_'></a>[Converting player weights from pounds to kg](#toc0_)

In [16]:
# 1 lb = 0.453592 kg (approx.); round to one decimal place
df['PLAYER_WEIGHT_KG'] = (df['PLAYER_WEIGHT_LBS'] * 0.453592).round(1)

In [17]:
df.sample(2)

Unnamed: 0,PLAYER_ID,PLAYER_NAME,TEAM_ID,TEAM_ABBREVIATION,AGE,PLAYER_HEIGHT_FT,PLAYER_HEIGHT_INCHES,PLAYER_WEIGHT_LBS,COLLEGE,COUNTRY,DRAFT_YEAR,DRAFT_ROUND,DRAFT_NUMBER,GP,PTS,REB,AST,NET_RATING,OREB_PCT,DREB_PCT,USG_PCT,TS_PCT,AST_PCT,G,MIN,CONTESTED_SHOTS,CONTESTED_SHOTS_2PT,CONTESTED_SHOTS_3PT,DEFLECTIONS,CHARGES_DRAWN,SCREEN_ASSISTS,SCREEN_AST_PTS,OFF_LOOSE_BALLS_RECOVERED,DEF_LOOSE_BALLS_RECOVERED,LOOSE_BALLS_RECOVERED,PCT_LOOSE_BALLS_RECOVERED_OFF,PCT_LOOSE_BALLS_RECOVERED_DEF,OFF_BOXOUTS,DEF_BOXOUTS,BOX_OUTS,BOX_OUT_PLAYER_TEAM_REBS,BOX_OUT_PLAYER_REBS,PCT_BOX_OUTS_OFF,PCT_BOX_OUTS_DEF,PCT_BOX_OUTS_TEAM_REB,PCT_BOX_OUTS_REB,NICKNAME,W,L,W_PCT,FGM,FGA,FG_PCT,FG3M,FG3A,FG3_PCT,FTM,FTA,FT_PCT,OREB,DREB,TOV,STL,BLK,BLKA,PF,PFD,PLUS_MINUS,NBA_FANTASY_PTS,DD2,TD3,WNBA_FANTASY_PTS,GP_RANK,W_RANK,L_RANK,W_PCT_RANK,MIN_RANK,FGM_RANK,FGA_RANK,FG_PCT_RANK,FG3M_RANK,FG3A_RANK,FG3_PCT_RANK,FTM_RANK,FTA_RANK,FT_PCT_RANK,OREB_RANK,DREB_RANK,REB_RANK,AST_RANK,TOV_RANK,STL_RANK,BLK_RANK,BLKA_RANK,PF_RANK,PFD_RANK,PTS_RANK,PLUS_MINUS_RANK,NBA_FANTASY_PTS_RANK,DD2_RANK,TD3_RANK,WNBA_FANTASY_PTS_RANK,PLAYER_LAST_NAME,PLAYER_FIRST_NAME,PLAYER_SLUG,TEAM_SLUG,IS_DEFUNCT,TEAM_CITY,TEAM_NAME,JERSEY_NUMBER,POSITION,ROSTER_STATUS,STATS_TIMEFRAME,FROM_YEAR,TO_YEAR,PLAYER_HEIGHT_CM,PLAYER_WEIGHT_KG
465,1627885,Shaquille Harrison,1610612747,LAL,29.0,6-4,76,195,Tulsa,USA,Undrafted,Undrafted,Undrafted,5,8.8,4.4,6.0,-0.4,0.008,0.184,0.162,0.516,0.306,5,24.0,2.8,1.8,1.0,3.2,0.0,0.0,0.0,0.0,0.8,0.8,0.0,1.0,0.0,0.4,0.4,0.4,0.4,0.0,1.0,1.0,1.0,Shaquille,1.0,4.0,0.2,3.0,7.2,0.417,0.6,2.0,0.3,2.2,3.0,0.733,0.2,4.2,1.2,2.2,0.4,0.4,2.4,2.4,-0.2,29.7,1.0,0.0,25.0,499.0,507.0,480.0,505.0,189.0,240.0,206.0,401.0,317.0,305.0,382.0,104.0,98.0,329.0,475.0,85.0,144.0,33.0,183.0,2.0,182.0,199.0,94.0,108.0,220.0,264.0,98.0,192.0,39.0,126.0,Harrison,Shaquille,shaquille-harrison,lakers,0.0,Los Angeles,Lakers,0,G,1.0,Season,2017.0,2022.0,193.0,88.5
200,1630702,Jaden Hardy,1610612742,DAL,20.0,6-3,75,198,,USA,2022,2,37,48,8.8,1.9,1.4,-3.1,0.015,0.106,0.248,0.571,0.147,48,14.8,2.02,0.85,1.17,0.65,0.06,0.1,0.25,0.13,0.15,0.27,0.462,0.538,0.0,0.25,0.25,0.25,0.15,0.0,1.0,1.0,0.583,Jaden,20.0,28.0,0.417,3.0,6.9,0.438,1.3,3.3,0.404,1.4,1.6,0.823,0.2,1.6,1.0,0.4,0.1,0.6,1.3,1.3,-0.7,13.5,0.0,0.0,14.3,302.0,310.0,214.0,391.0,348.0,236.0,218.0,329.0,161.0,190.0,78.0,192.0,215.0,167.0,462.0,369.0,401.0,274.0,244.0,381.0,398.0,102.0,362.0,255.0,222.0,301.0,310.0,253.0,39.0,285.0,Hardy,Jaden,jaden-hardy,mavericks,0.0,Dallas,Mavericks,3,G,1.0,Season,2022.0,2022.0,190.0,89.8
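
As a quick sanity check of the two conversion factors, the sketch below applies them to the height and weight in Shaquille Harrison's row above (76 in, 195 lb). The helper names are hypothetical and not part of the notebook.

```python
CM_PER_INCH = 2.54        # exact by definition
KG_PER_POUND = 0.453592   # rounded factor, as used in the notebook

def inches_to_cm(inches):
    """Convert inches to centimeters, rounded to whole centimeters."""
    return round(inches * CM_PER_INCH, 0)

def pounds_to_kg(pounds):
    """Convert pounds to kilograms, rounded to one decimal place."""
    return round(pounds * KG_PER_POUND, 1)

height_cm = inches_to_cm(76)    # 76 * 2.54 = 193.04
weight_kg = pounds_to_kg(195)   # 195 * 0.453592 = 88.45044
print(height_cm, weight_kg)     # 193.0 88.5
```

Both results agree with the `PLAYER_HEIGHT_CM` and `PLAYER_WEIGHT_KG` values in that row.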


## 3.3. <a id='toc3_3_'></a>[Selecting features to be shown](#toc0_)

In [66]:
selected_columns = ['PLAYER_ID',
                    'PLAYER_NAME',
                    'PLAYER_LAST_NAME',
                    'PLAYER_FIRST_NAME',
                    'TEAM_ID',
                    'TEAM_ABBREVIATION',
                    'AGE',
                    'JERSEY_NUMBER',
                    'POSITION',
                    'PLAYER_HEIGHT_INCHES',
                    'PLAYER_HEIGHT_CM',
                    'PLAYER_WEIGHT_LBS',
                    'PLAYER_WEIGHT_KG',
                    'COLLEGE',
                    'COUNTRY',
                    'DRAFT_YEAR',
                    'PLUS_MINUS',
                    'GP',
                    'PTS',
                    'REB',
                    'AST',
                    'G',
                    'MIN',
                    'PFD',
                    'FGM',
                    'FGA',
                    'FG_PCT',
                    'FG3M',
                    'FG3A',
                    'FG3_PCT',
                    'FTM',
                    'FTA',
                    'FT_PCT',
                    'OREB',
                    'DREB',
                    'STL',
                    'BLK',
                    'BLKA',
                    'TOV',
                    'PF',
                    'CONTESTED_SHOTS',
                    'CONTESTED_SHOTS_2PT',
                    'CONTESTED_SHOTS_3PT',
                    'DEFLECTIONS',
                    'CHARGES_DRAWN',
                    'SCREEN_ASSISTS',
                    'OFF_BOXOUTS',
                    'DEF_BOXOUTS',
                    'BOX_OUTS']

In [67]:
filtered_df = df[selected_columns]

In [68]:
filtered_df

Unnamed: 0,PLAYER_ID,PLAYER_NAME,PLAYER_LAST_NAME,PLAYER_FIRST_NAME,TEAM_ID,TEAM_ABBREVIATION,AGE,JERSEY_NUMBER,POSITION,PLAYER_HEIGHT_INCHES,PLAYER_HEIGHT_CM,PLAYER_WEIGHT_LBS,PLAYER_WEIGHT_KG,COLLEGE,COUNTRY,DRAFT_YEAR,PLUS_MINUS,GP,PTS,REB,AST,G,MIN,PFD,FGM,FGA,FG_PCT,FG3M,FG3A,FG3_PCT,FTM,FTA,FT_PCT,OREB,DREB,STL,BLK,BLKA,TOV,PF,CONTESTED_SHOTS,CONTESTED_SHOTS_2PT,CONTESTED_SHOTS_3PT,DEFLECTIONS,CHARGES_DRAWN,SCREEN_ASSISTS,OFF_BOXOUTS,DEF_BOXOUTS,BOX_OUTS
0,1630639,A.J. Lawson,Lawson,A.J.,1610612742,DAL,22.0,9,G,78,198.0,179,81.2,South Carolina,Canada,Undrafted,-3.1,15,3.7,1.4,0.1,15,7.2,0.4,1.5,2.9,0.500,0.7,1.7,0.400,0.1,0.5,0.250,0.4,1.0,0.1,0.0,0.2,0.2,0.7,0.93,0.47,0.47,0.33,0.00,0.07,0.00,0.00,0.00
1,1631260,AJ Green,Green,AJ,1610612749,MIL,23.0,20,G,77,196.0,190,86.2,Northern Iowa,USA,Undrafted,-0.7,35,4.4,1.3,0.6,35,9.9,0.1,1.5,3.6,0.424,1.3,3.0,0.419,0.1,0.1,1.000,0.2,1.1,0.2,0.0,0.0,0.3,0.9,1.29,0.57,0.71,0.23,0.03,0.14,0.00,0.11,0.11
2,1631100,AJ Griffin,Griffin,AJ,1610612737,ATL,19.0,14,F,78,198.0,220,99.8,Duke,USA,2022,0.9,72,8.9,2.1,1.0,72,19.5,0.6,3.4,7.4,0.465,1.4,3.6,0.390,0.6,0.7,0.894,0.5,1.6,0.6,0.2,0.3,0.6,1.2,2.88,1.42,1.46,0.88,0.00,0.17,0.00,0.08,0.08
3,203932,Aaron Gordon,Gordon,Aaron,1610612743,DEN,27.0,50,F,80,203.0,235,106.6,Arizona,USA,2014,7.6,68,16.3,6.6,3.0,68,30.2,3.6,6.3,11.2,0.564,0.9,2.5,0.347,2.8,4.6,0.608,2.4,4.1,0.8,0.8,1.0,1.4,1.9,5.50,3.93,1.57,1.16,0.01,0.87,0.18,0.29,0.47
4,1628988,Aaron Holiday,Holiday,Aaron,1610612737,ATL,26.0,3,G,72,183.0,185,83.9,UCLA,USA,2018,0.3,63,3.9,1.2,1.4,63,13.4,0.8,1.5,3.5,0.418,0.6,1.4,0.409,0.4,0.5,0.844,0.4,0.8,0.6,0.2,0.3,0.6,1.3,1.84,0.94,0.90,1.11,0.00,0.03,0.00,0.14,0.14
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
534,1628380,Zach Collins,Collins,Zach,1610612759,SAS,25.0,23,F-C,83,211.0,250,113.4,Gonzaga,USA,2017,-3.8,63,11.6,6.4,2.9,63,22.9,2.3,4.5,8.7,0.518,0.9,2.3,0.374,1.7,2.3,0.761,1.8,4.5,0.6,0.8,0.6,2.0,3.2,10.14,8.67,1.48,0.94,0.06,3.29,0.27,1.29,1.56
535,203897,Zach LaVine,LaVine,Zach,1610612741,CHI,28.0,8,G,77,196.0,200,90.7,UCLA,USA,2014,0.2,77,24.8,4.5,4.2,77,35.9,3.8,8.7,18.0,0.485,2.6,7.1,0.375,4.7,5.6,0.848,0.5,3.9,0.9,0.2,1.0,2.5,2.1,5.68,2.44,3.23,1.25,0.00,0.32,0.00,0.19,0.19
536,1630192,Zeke Nnaji,Nnaji,Zeke,1610612743,DEN,22.0,22,F-C,81,206.0,240,108.9,Arizona,USA,2020,-1.8,53,5.2,2.6,0.3,53,13.7,1.2,2.1,3.7,0.561,0.3,1.2,0.262,0.8,1.2,0.645,1.2,1.4,0.3,0.4,0.2,0.6,2.0,3.79,2.70,1.09,0.57,0.00,1.08,0.25,0.32,0.57
537,1630533,Ziaire Williams,Williams,Ziaire,1610612763,MEM,21.0,8,F,81,206.0,185,83.9,Stanford,USA,2021,-2.1,37,5.7,2.1,0.9,37,15.2,0.9,2.3,5.3,0.429,0.7,2.6,0.258,0.5,0.6,0.773,0.4,1.7,0.4,0.2,0.1,1.0,1.6,3.35,1.51,1.84,0.81,0.00,0.19,0.00,0.16,0.16


## 3.4. <a id='toc3_4_'></a>[Exporting the filtered DataFrame as a CSV file](#toc0_)

In [69]:
filtered_df.to_csv('/home/bruno/repos/NBA_2022-2023/data/filtered_df.csv', index=False)

# 4. <a id='toc4_'></a>[Making some charts](#toc0_)

In [70]:
px.box(filtered_df,
       x = 'POSITION',
       y = 'PTS',
       color = 'POSITION',
       hover_name='PLAYER_NAME',
       title = 'Points per Game by Position',
       labels = {'PTS':'Points', 'POSITION': 'Position'},
       category_orders = {'POSITION': ['G', 'G-F', 'F-G', 'F', 'F-C', 'C-F', 'C']},  # key must match the column name
       template='plotly_dark')

In [71]:
px.box(filtered_df,
       x = 'POSITION',
       y = 'CONTESTED_SHOTS',
       color = 'POSITION',
       hover_name='PLAYER_NAME',
       title = 'Contested shots per Game by Position',
       labels = {'CONTESTED_SHOTS':'Contested shots', 'POSITION': 'Position'},
       category_orders = {'POSITION': ['G', 'G-F', 'F-G', 'F', 'F-C', 'C-F', 'C']},  # key must match the column name
       template='plotly_dark')

In [72]:
px.box(filtered_df,
       x = 'POSITION',
       y = 'MIN',
       color = 'POSITION',
       hover_name='PLAYER_NAME',
       title = 'Minutes per Game by Position',
       labels = {'MIN':'Minutes', 'POSITION': 'Position'},
       category_orders = {'POSITION': ['G', 'G-F', 'F-G', 'F', 'F-C', 'C-F', 'C']},  # key must match the column name
       template='plotly_dark')

# 5. <a id='toc5_'></a>[Transforming the numerical attributes to be plotted as radar charts](#toc0_)

In [73]:
df_selected = pd.read_csv('/home/bruno/repos/NBA_2022-2023/data/filtered_df.csv', low_memory=False)

In [74]:
num_attributes = df_selected.select_dtypes( include=['int64', 'float64'] )
num_attributes_not_transform = ['AGE', 'PLAYER_ID', 'TEAM_ID', 
                                'JERSEY_NUMBER', 'PLAYER_HEIGHT_INCHES', 
                                'PLAYER_HEIGHT_CM', 'PLAYER_WEIGHT_LBS', 
                                'PLAYER_WEIGHT_KG', 'PLUS_MINUS']
num_attributes = num_attributes.drop(num_attributes_not_transform, axis = 1)
num_attributes.head()

Unnamed: 0,GP,PTS,REB,AST,G,MIN,PFD,FGM,FGA,FG_PCT,FG3M,FG3A,FG3_PCT,FTM,FTA,FT_PCT,OREB,DREB,STL,BLK,BLKA,TOV,PF,CONTESTED_SHOTS,CONTESTED_SHOTS_2PT,CONTESTED_SHOTS_3PT,DEFLECTIONS,CHARGES_DRAWN,SCREEN_ASSISTS,OFF_BOXOUTS,DEF_BOXOUTS,BOX_OUTS
0,15,3.7,1.4,0.1,15,7.2,0.4,1.5,2.9,0.5,0.7,1.7,0.4,0.1,0.5,0.25,0.4,1.0,0.1,0.0,0.2,0.2,0.7,0.93,0.47,0.47,0.33,0.0,0.07,0.0,0.0,0.0
1,35,4.4,1.3,0.6,35,9.9,0.1,1.5,3.6,0.424,1.3,3.0,0.419,0.1,0.1,1.0,0.2,1.1,0.2,0.0,0.0,0.3,0.9,1.29,0.57,0.71,0.23,0.03,0.14,0.0,0.11,0.11
2,72,8.9,2.1,1.0,72,19.5,0.6,3.4,7.4,0.465,1.4,3.6,0.39,0.6,0.7,0.894,0.5,1.6,0.6,0.2,0.3,0.6,1.2,2.88,1.42,1.46,0.88,0.0,0.17,0.0,0.08,0.08
3,68,16.3,6.6,3.0,68,30.2,3.6,6.3,11.2,0.564,0.9,2.5,0.347,2.8,4.6,0.608,2.4,4.1,0.8,0.8,1.0,1.4,1.9,5.5,3.93,1.57,1.16,0.01,0.87,0.18,0.29,0.47
4,63,3.9,1.2,1.4,63,13.4,0.8,1.5,3.5,0.418,0.6,1.4,0.409,0.4,0.5,0.844,0.4,0.8,0.6,0.2,0.3,0.6,1.3,1.84,0.94,0.9,1.11,0.0,0.03,0.0,0.14,0.14


In [75]:
num_attributes = num_attributes.apply(lambda x: x/x.max(), axis = 0)
df_transformed = df_selected.copy()
df_transformed[num_attributes.columns] = num_attributes
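
Dividing each column by its maximum maps non-negative statistics into the interval [0, 1], with the column leader at exactly 1, which matches the radial range used by the radar charts. A column with negative values, such as `PLUS_MINUS`, would not fit that range, which is consistent with it being left out of the transformation. The toy frame below uses invented numbers, not the scraped data.

```python
import pandas as pd

# Invented per-game values for three players.
toy = pd.DataFrame({'PTS': [10.0, 20.0, 40.0],
                    'AST': [2.0, 8.0, 4.0]})

# Same max scaling as in the notebook: each column divided by its own maximum.
scaled = toy.apply(lambda col: col / col.max(), axis=0)

# An equivalent vectorized form that relies on column-wise broadcasting.
scaled_alt = toy / toy.max()

print(scaled)
```

Each column is scaled independently, so a value of 0.5 means "half of the best value in that statistic" rather than a percentile.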

In [76]:
df_transformed.head()

Unnamed: 0,PLAYER_ID,PLAYER_NAME,PLAYER_LAST_NAME,PLAYER_FIRST_NAME,TEAM_ID,TEAM_ABBREVIATION,AGE,JERSEY_NUMBER,POSITION,PLAYER_HEIGHT_INCHES,PLAYER_HEIGHT_CM,PLAYER_WEIGHT_LBS,PLAYER_WEIGHT_KG,COLLEGE,COUNTRY,DRAFT_YEAR,PLUS_MINUS,GP,PTS,REB,AST,G,MIN,PFD,FGM,FGA,FG_PCT,FG3M,FG3A,FG3_PCT,FTM,FTA,FT_PCT,OREB,DREB,STL,BLK,BLKA,TOV,PF,CONTESTED_SHOTS,CONTESTED_SHOTS_2PT,CONTESTED_SHOTS_3PT,DEFLECTIONS,CHARGES_DRAWN,SCREEN_ASSISTS,OFF_BOXOUTS,DEF_BOXOUTS,BOX_OUTS
0,1630639,A.J. Lawson,Lawson,A.J.,1610612742,DAL,22.0,9.0,G,78,198.0,179,81.2,South Carolina,Canada,Undrafted,-3.1,0.180723,0.111782,0.112,0.009346,0.180723,0.176471,0.044444,0.133929,0.130631,0.5,0.142857,0.149123,0.4,0.01,0.04065,0.25,0.078431,0.104167,0.033333,0.0,0.105263,0.04878,0.14,0.053295,0.029898,0.129477,0.0825,0.0,0.012411,0.0,0.0,0.0
1,1631260,AJ Green,Green,AJ,1610612749,MIL,23.0,20.0,G,77,196.0,190,86.2,Northern Iowa,USA,Undrafted,-0.7,0.421687,0.132931,0.104,0.056075,0.421687,0.242647,0.011111,0.133929,0.162162,0.424,0.265306,0.263158,0.419,0.01,0.00813,1.0,0.039216,0.114583,0.066667,0.0,0.0,0.073171,0.18,0.073926,0.03626,0.195592,0.0575,0.034091,0.024823,0.0,0.057592,0.035256
2,1631100,AJ Griffin,Griffin,AJ,1610612737,ATL,19.0,14.0,F,78,198.0,220,99.8,Duke,USA,2022,0.9,0.86747,0.268882,0.168,0.093458,0.86747,0.477941,0.066667,0.303571,0.333333,0.465,0.285714,0.315789,0.39,0.06,0.056911,0.894,0.098039,0.166667,0.2,0.066667,0.157895,0.146341,0.24,0.165043,0.090331,0.402204,0.22,0.0,0.030142,0.0,0.041885,0.025641
3,203932,Aaron Gordon,Gordon,Aaron,1610612743,DEN,27.0,50.0,F,80,203.0,235,106.6,Arizona,USA,2014,7.6,0.819277,0.492447,0.528,0.280374,0.819277,0.740196,0.4,0.5625,0.504505,0.564,0.183673,0.219298,0.347,0.28,0.373984,0.608,0.470588,0.427083,0.266667,0.266667,0.526316,0.341463,0.38,0.315186,0.25,0.432507,0.29,0.011364,0.154255,0.139535,0.151832,0.150641
4,1628988,Aaron Holiday,Holiday,Aaron,1610612737,ATL,26.0,3.0,G,72,183.0,185,83.9,UCLA,USA,2018,0.3,0.759036,0.117825,0.096,0.130841,0.759036,0.328431,0.088889,0.133929,0.157658,0.418,0.122449,0.122807,0.409,0.04,0.04065,0.844,0.078431,0.083333,0.2,0.066667,0.157895,0.146341,0.26,0.105444,0.059796,0.247934,0.2775,0.0,0.005319,0.0,0.073298,0.044872


In [65]:
# # Transforming the PLUS_MINUS feature, but it doesn't make sense

# df_transformed['PLUS_MINUS'].apply(
#     lambda x: x/abs(df_transformed['PLUS_MINUS'].min()) if x<0 else
#               x/abs(df_transformed['PLUS_MINUS'].max()) if x>0 else
#               x).sample(10)

### 5.1.1. <a id='toc5_1_1_'></a>[Exporting the transformed DataFrame as a CSV file](#toc0_)

In [77]:
df_transformed.to_csv('/home/bruno/repos/NBA_2022-2023/data/transformed_df.csv', index=False)

# 6. <a id='toc6_'></a>[Plotting the radar charts](#toc0_)

### 6.1.1. <a id='toc6_1_1_'></a>[Importing the transformed data](#toc0_)

In [78]:
df_transformed = pd.read_csv('/home/bruno/repos/NBA_2022-2023/data/transformed_df.csv', low_memory=False)

## Processing the data

In [90]:
# Features of the charts
offensive_features = ['PTS', 'AST', 'FG_PCT', 'FG3_PCT', 'FT_PCT']
defensive_features = ['OREB', 'DREB', 'STL', 'BLK', 'CONTESTED_SHOTS', 'BOX_OUTS', 'CHARGES_DRAWN']
descriptive_features = ['GP', 'MIN', 'PF', 'PFD', 'TOV', 'REB']

In [91]:
# Players to be compared
playerA = 'Jayson Tatum'
playerB = 'Joel Embiid'

In [92]:
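# Caution: each block below reassigns auxA and auxB, so after this cell runs
# they hold only the descriptive-feature values.
# Offensive features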
auxA = df_transformed[df_transformed['PLAYER_NAME'] == playerA][offensive_features].T
auxA.columns = [playerA]
auxB = df_transformed[df_transformed['PLAYER_NAME'] == playerB][offensive_features].T
auxB.columns = [playerB]

# Defensive features
auxA = df_transformed[df_transformed['PLAYER_NAME'] == playerA][defensive_features].T
auxA.columns = [playerA]
auxB = df_transformed[df_transformed['PLAYER_NAME'] == playerB][defensive_features].T
auxB.columns = [playerB]

# Descriptive features
auxA = df_transformed[df_transformed['PLAYER_NAME'] == playerA][descriptive_features].T
auxA.columns = [playerA]
auxB = df_transformed[df_transformed['PLAYER_NAME'] == playerB][descriptive_features].T
auxB.columns = [playerB]
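
The three selections above repeat the same pattern and reuse the same two variable names. One possible refactoring, sketched here under the assumption that player names are unique in the table, is a small helper that returns the values for one player and one feature group. `radar_values` is a hypothetical name, and the toy frame uses invented numbers in place of `df_transformed`.

```python
import pandas as pd

def radar_values(df, player, features):
    """Return one player's values for the given features as a Series."""
    rows = df.loc[df['PLAYER_NAME'] == player, features]
    if rows.empty:
        raise ValueError(f'player not found: {player}')
    return rows.iloc[0]

# Invented, already scaled values standing in for df_transformed.
toy = pd.DataFrame({
    'PLAYER_NAME': ['Player A', 'Player B'],
    'PTS': [0.9, 0.7],
    'AST': [0.4, 0.8],
    'STL': [0.5, 0.3],
})

groups = {'offensive': ['PTS', 'AST'], 'defensive': ['STL']}

# One Series per feature group, kept under separate keys so nothing is overwritten.
values = {name: radar_values(toy, 'Player A', feats)
          for name, feats in groups.items()}
print(values['offensive'].tolist())
```

Keeping the results in a dictionary keyed by group name means each chart can look up exactly the values that correspond to its own `theta` labels.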

## Plotting

#### Offensive chart

In [93]:
plt.rcParams['figure.figsize'] = [8, 8]

fig = go.Figure()

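# Select the offensive values directly: auxA/auxB were reassigned in the
# processing cell and no longer hold the offensive features.
fig.add_trace(go.Scatterpolar(
      r=df_transformed.loc[df_transformed['PLAYER_NAME'] == playerA, offensive_features].iloc[0],
      theta=offensive_features,
      fill='toself',
      name=playerA
))

fig.add_trace(go.Scatterpolar(
      r=df_transformed.loc[df_transformed['PLAYER_NAME'] == playerB, offensive_features].iloc[0],
      theta=offensive_features,
      fill='toself',
      name=playerB
))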

fig.update_layout(
  polar=dict(
    radialaxis=dict(
      visible=True,
      range=[0, 1]
    )),
  showlegend=True,
  width=800, height=800,
  template="plotly_dark",
  title = 'Offensive Features'
)

fig.show()

#### Defensive chart

In [94]:
plt.rcParams['figure.figsize'] = [8, 8]

fig = go.Figure()

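# Select the defensive values directly: auxA/auxB were reassigned in the
# processing cell and no longer hold the defensive features.
fig.add_trace(go.Scatterpolar(
      r=df_transformed.loc[df_transformed['PLAYER_NAME'] == playerA, defensive_features].iloc[0],
      theta=defensive_features,
      fill='toself',
      name=playerA
))

fig.add_trace(go.Scatterpolar(
      r=df_transformed.loc[df_transformed['PLAYER_NAME'] == playerB, defensive_features].iloc[0],
      theta=defensive_features,
      fill='toself',
      name=playerB
))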

fig.update_layout(
  polar=dict(
    radialaxis=dict(
      visible=True,
      range=[0, 1]
    )),
  showlegend=True,
  width=800, height=800,
  template="plotly_dark",
  title = 'Defensive Features'
)

fig.show()

#### Descriptive chart

In [95]:
plt.rcParams['figure.figsize'] = [8, 8]

fig = go.Figure()

fig.add_trace(go.Scatterpolar(
      r=auxA[playerA],
      theta=descriptive_features,
      fill='toself',
      name=playerA
))

fig.add_trace(go.Scatterpolar(
      r=auxB[playerB],
      theta=descriptive_features,
      fill='toself',
      name=playerB
))

fig.update_layout(
  polar=dict(
    radialaxis=dict(
      visible=True,
      range=[0, 1]
    )),
  showlegend=True,
  width=800, height=800,
  template="plotly_dark",
  title = 'Descriptive Features'
)

fig.show()