# OVERVIEW

This program uses two Python libraries, nba_api and pandas, to download roster data from the NBA website and work with it. It does the following:

- Imports the commonteamroster module from nba_api, and imports pandas under the name pd.
- Requests the Toronto Raptors roster from the NBA website and stores the result as a list of tables, which pandas calls dataframes.
- Builds a table called roster_df that keeps only five columns from the first table in the list: 'NUM', 'PLAYER', 'HEIGHT', 'WEIGHT', and 'AGE'.
- Converts the 'NUM' column to numbers and sorts the table by that column from smallest to largest.
- Prints only the 'NUM' and 'PLAYER' columns as text, without the row numbers, so the output lists the Raptors players in order of jersey number.

# SETUP

In [9]:
# Import the commonteamroster module from the nba_api.stats.endpoints package
# The commonteamroster module provides access to the CommonTeamRoster class,
# which can be used to retrieve the roster information for a given NBA team.
from nba_api.stats.endpoints import commonteamroster

# Import the pandas library as pd
import pandas as pd

# INPUT

In [10]:
# Create an instance of the CommonTeamRoster class with team ID 1610612761
# This team ID corresponds to the Toronto Raptors
roster = commonteamroster.CommonTeamRoster(team_id=1610612761)

# Get the list of dataframes from the roster instance
# The first dataframe in the list has the player names, positions, jersey numbers, heights, weights, and other details
roster_df = roster.get_data_frames()[0]

# Get the column names of the dataframe
roster_df.columns

Index(['TeamID', 'SEASON', 'LeagueID', 'PLAYER', 'NICKNAME', 'PLAYER_SLUG',
       'NUM', 'POSITION', 'HEIGHT', 'WEIGHT', 'BIRTH_DATE', 'AGE', 'EXP',
       'SCHOOL', 'PLAYER_ID', 'HOW_ACQUIRED'],
      dtype='object')

# PROCESS

In [11]:
# Select only the columns 'NUM', 'PLAYER', 'HEIGHT', 'WEIGHT', and 'AGE' from the roster_df dataframe
# Assign the result back to roster_df, replacing the full table with the smaller one
roster_df = roster_df[["NUM", "PLAYER", "HEIGHT", "WEIGHT", "AGE"]]

# Get the column names of the new dataframe
# The column names are: 'NUM', 'PLAYER', 'HEIGHT', 'WEIGHT', and 'AGE'
roster_df.columns

Index(['NUM', 'PLAYER', 'HEIGHT', 'WEIGHT', 'AGE'], dtype='object')
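
The double square brackets in the selection above matter. As a minimal sketch, assuming a small hand-built table (the `sample` dataframe below is illustrative and not fetched from the NBA website), a list of column names returns a dataframe, while a single name returns one column as a Series:

```python
import pandas as pd

# Illustrative sample using two rows from the roster shown in this notebook.
sample = pd.DataFrame(
    {
        "NUM": ["1", "3"],
        "PLAYER": ["Will Barton", "O.G. Anunoby"],
        "AGE": [32.0, 25.0],
    }
)

# A single column name returns a Series (one column of values).
one_column = sample["PLAYER"]

# A list of column names returns a DataFrame, even if the list has one name.
two_columns = sample[["NUM", "PLAYER"]]
still_a_table = sample[["PLAYER"]]

print(type(one_column).__name__)
print(type(two_columns).__name__)
print(type(still_a_table).__name__)
```

This is why `roster_df[["NUM", "PLAYER", "HEIGHT", "WEIGHT", "AGE"]]` gives back a table that still has a `.columns` attribute.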

In [12]:
# Print the first five rows of the roster_df dataframe
# This will show a sample of the data for the selected columns
roster_df.head()

|   | NUM | PLAYER | HEIGHT | WEIGHT | AGE |
|---|---|---|---|---|---|
| 0 | 1 | Will Barton | 6-5 | 181 | 32.0 |
| 1 | 3 | O.G. Anunoby | 6-7 | 232 | 25.0 |
| 2 | 4 | Scottie Barnes | 6-8 | 225 | 21.0 |
| 3 | 5 | Precious Achiuwa | 6-8 | 225 | 23.0 |
| 4 | 8 | Ron Harper Jr. | 6-5 | 245 | 23.0 |


Just for fun I'm going to sort the data by age:

In [13]:
# Sort the roster_df dataframe by the 'AGE' column in ascending order
# Assign the result to the same dataframe name
roster_df = roster_df.sort_values(by="AGE")

# Print the first five rows of the sorted dataframe
# This will show the youngest players on the roster
roster_df.head()

|   | NUM | PLAYER | HEIGHT | WEIGHT | AGE |
|---|---|---|---|---|---|
| 2 | 4 | Scottie Barnes | 6-8 | 225 | 21.0 |
| 14 | 35 | Christian Koloko | 7-0 | 230 | 22.0 |
| 16 | 45 | Dalano Banton | 6-7 | 204 | 23.0 |
| 3 | 5 | Precious Achiuwa | 6-8 | 225 | 23.0 |
| 4 | 8 | Ron Harper Jr. | 6-5 | 245 | 23.0 |
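
Several players in the output above share the same age, and a single-column sort does not say how such ties should be ordered. The sketch below, which assumes a small hand-built `sample` table rather than the live roster, shows one way to sort oldest first and break ties alphabetically by passing lists to `sort_values`:

```python
import pandas as pd

# Illustrative sample built from rows that appear in this notebook.
sample = pd.DataFrame(
    {
        "PLAYER": [
            "Scottie Barnes",
            "Christian Koloko",
            "Dalano Banton",
            "Precious Achiuwa",
            "Ron Harper Jr.",
            "Will Barton",
        ],
        "AGE": [21.0, 22.0, 23.0, 23.0, 23.0, 32.0],
    }
)

# Sort by AGE from oldest to youngest, then by PLAYER from A to Z for ties.
oldest_first = sample.sort_values(by=["AGE", "PLAYER"], ascending=[False, True])

print(oldest_first.to_string(index=False))
```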


Keeping only five columns made the table much easier to read: now we only have the columns we might care about.

Now let's work on printing the players' names sorted by their number. 

Notice in this example I'm only printing two columns.

In [14]:
# Sort the roster_df dataframe by the 'NUM' column in ascending order
# This will arrange the players by their jersey numbers
roster_df = roster_df.sort_values(by="NUM")

# Print only the 'NUM' and 'PLAYER' columns of the sorted dataframe
# This will show the names of the players along with their jersey numbers
print(roster_df[["NUM", "PLAYER"]])

   NUM            PLAYER
0    1       Will Barton
5   11      Joe Wieskamp
6   19      Jakob Poeltl
7   20   Jeff Dowtin Jr.
8   21    Thaddeus Young
9   22     Malachi Flynn
10  23     Fred VanVleet
11  25     Chris Boucher
1    3      O.G. Anunoby
12  32   Otto Porter Jr.
13  33    Gary Trent Jr.
14  35  Christian Koloko
2    4    Scottie Barnes
15  43     Pascal Siakam
16  45     Dalano Banton
3    5  Precious Achiuwa
4    8    Ron Harper Jr.


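Notice that the numbers didn't print in numeric order? That's because pandas treats the values in the NUM column as text (we call them *strings* in Python), not numbers. Strings are compared one character at a time, so "11" sorts before "3" because the character "1" comes before the character "3".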
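
The same effect can be reproduced without pandas. This is a minimal sketch in plain Python, using a short illustrative list of jersey numbers stored as strings:

```python
# Jersey numbers stored as text, the way they arrive in the NUM column.
jersey_strings = ["8", "11", "3", "45", "1"]

# Sorting strings compares them character by character.
sorted_as_text = sorted(jersey_strings)

# Passing key=int compares the values as whole numbers instead.
sorted_as_numbers = sorted(jersey_strings, key=int)

print(sorted_as_text)     # ['1', '11', '3', '45', '8']
print(sorted_as_numbers)  # ['1', '3', '8', '11', '45']
```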

Let's convert them to numbers and print them again:

In [15]:
# Convert the 'NUM' column of the roster_df dataframe to numeric type
# This will ensure that the sorting by this column is done correctly
roster_df["NUM"] = pd.to_numeric(roster_df["NUM"])

# Sort the roster_df dataframe by the 'NUM' column in ascending order
# This will arrange the players by their jersey numbers
roster_df = roster_df.sort_values(by="NUM")

# Print only the 'NUM' and 'PLAYER' columns of the sorted dataframe
# This will show the names of the players along with their jersey numbers
print(roster_df[["NUM", "PLAYER"]])

    NUM            PLAYER
0     1       Will Barton
1     3      O.G. Anunoby
2     4    Scottie Barnes
3     5  Precious Achiuwa
4     8    Ron Harper Jr.
5    11      Joe Wieskamp
6    19      Jakob Poeltl
7    20   Jeff Dowtin Jr.
8    21    Thaddeus Young
9    22     Malachi Flynn
10   23     Fred VanVleet
11   25     Chris Boucher
12   32   Otto Porter Jr.
13   33    Gary Trent Jr.
14   35  Christian Koloko
15   43     Pascal Siakam
16   45     Dalano Banton


# OUTPUT

For the final output we're going to get rid of the index because we don't need it and it makes the table ugly!

In [16]:
# Print only the 'NUM' and 'PLAYER' columns of the roster_df dataframe as a string
# This will remove the index column and align the output
print(roster_df[["NUM", "PLAYER"]].to_string(index=False))

 NUM           PLAYER
   1      Will Barton
   3     O.G. Anunoby
   4   Scottie Barnes
   5 Precious Achiuwa
   8   Ron Harper Jr.
  11     Joe Wieskamp
  19     Jakob Poeltl
  20  Jeff Dowtin Jr.
  21   Thaddeus Young
  22    Malachi Flynn
  23    Fred VanVleet
  25    Chris Boucher
  32  Otto Porter Jr.
  33   Gary Trent Jr.
  35 Christian Koloko
  43    Pascal Siakam
  45    Dalano Banton
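
Hiding the index is one option. Another, sketched here on a small hand-built `sample` table, is to renumber the rows after sorting with `reset_index(drop=True)`, so the index counts up from 0 in the new order:

```python
import pandas as pd

# Illustrative sample using three rows from the roster in this notebook.
sample = pd.DataFrame(
    {
        "NUM": [35, 4, 45],
        "PLAYER": ["Christian Koloko", "Scottie Barnes", "Dalano Banton"],
    }
)

# Sorting keeps each row's original label, so the index is now out of order.
sorted_df = sample.sort_values(by="NUM")

# drop=True discards the old labels instead of keeping them as a new column.
renumbered = sorted_df.reset_index(drop=True)

print(sorted_df.index.tolist())
print(renumbered.index.tolist())
```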


# IN SUMMARY...

Here is a brief explanation of each term:

- **nba_api.stats.endpoints** is a package that contains classes for accessing various endpoints on the NBA website that provide statistical data about the league, its teams, and its players¹.
- **pandas** is a library that provides data analysis and manipulation tools for working with tabular and multidimensional data structures in Python².
- **df.columns** is an attribute that returns the column labels of a pandas DataFrame³.
- **df.head()** is a method that returns the first n rows of a pandas DataFrame, where n is an optional parameter that defaults to 5².
- **df.sort_values()** is a method that sorts a pandas DataFrame by one or more columns in ascending or descending order².
- **df.to_string(index=False)** is a method that converts a pandas DataFrame to a string representation, where index=False means that the row labels are not included in the output².
- **pd.to_numeric()** is a function that converts a single column (a pandas Series), a list, or a single value to a numeric type; its `errors` parameter controls what happens to values that cannot be converted².

Source: Conversation with Bing, 2023-05-23
1. NBA API Library - nbasense. http://nbasense.com/nba-api/.
2. swar/nba_api: An API Client package to access the APIs for NBA.com - GitHub. https://github.com/swar/nba_api.
3. nba_api/shotchartdetail.md at master · swar/nba_api · GitHub. https://github.com/swar/nba_api/blob/master/docs/nba_api/stats/endpoints/shotchartdetail.md.
4. Connection Timeout error is thrown while using leaguegamefinder of nba .... https://stackoverflow.com/questions/66736607/connection-timeout-error-is-thrown-while-using-leaguegamefinder-of-nba-api-stats.
5. nba_api/playercareerstats.md at master · swar/nba_api · GitHub. https://github.com/swar/nba_api/blob/master/docs/nba_api/stats/endpoints/playercareerstats.md.
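
As a recap, the pandas calls summarized in this section can be combined into one helper. This is only a sketch: the function name `jersey_table` is made up for illustration, and the `sample` table is hand-built from rows shown earlier rather than fetched with nba_api.

```python
import pandas as pd


def jersey_table(roster_df: pd.DataFrame) -> str:
    """Return NUM and PLAYER as text without the index, ordered by jersey number."""
    # Work on a copy so the caller's dataframe is left unchanged.
    table = roster_df[["NUM", "PLAYER"]].copy()
    # Convert the jersey numbers from strings to numbers before sorting.
    table["NUM"] = pd.to_numeric(table["NUM"])
    table = table.sort_values(by="NUM")
    return table.to_string(index=False)


# Illustrative sample with NUM stored as strings, as in the original data.
sample = pd.DataFrame(
    {
        "NUM": ["4", "35", "45", "5"],
        "PLAYER": ["Scottie Barnes", "Christian Koloko", "Dalano Banton", "Precious Achiuwa"],
        "AGE": [21.0, 22.0, 23.0, 23.0],
    }
)

text = jersey_table(sample)
print(text)
```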

# EXERCISES

1. Modify our example above to also print "POSITION". 
1. We imported the data using this command, which uses index 0 of the data:
   <br>`roster_df = roster.get_data_frames()[0]`
   <br>What is in index 1? Print something interesting from the data that involves sorting.