![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

# Basketball

<img src=https://upload.wikimedia.org/wikipedia/commons/1/10/Basketball_through_hoop.jpg width=600>
<p>
<a href='https://en.wikipedia.org/wiki/Basketball#/media/File:Basketball_through_hoop.jpg'>Basketball falling through hoop</a>

We are going to look at NBA shot data using the [nba_api](https://github.com/swar/nba_api) code library.

To start, we will get data for the LA Lakers for the 2021-22 and 2022-23 seasons and store it in a dataframe called `shots`.

In [None]:
%pip install -q pyodide_http plotly nba_api nbformat
import pyodide_http
pyodide_http.patch_all()

team_name = 'Lakers'
seasons = ['2021-22', '2022-23']

from nba_api.stats.static import teams
from nba_api.stats.endpoints import shotchartdetail
import pandas as pd

team = teams.find_teams_by_full_name(team_name)[0]
team_id = team['id']
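# Collect one dataframe per season, then combine them in a single step
season_frames = []
for season in seasons:
    season_shots = shotchartdetail.ShotChartDetail(
        team_id=team_id,
        player_id=0,  # 0 requests shots for every player on the team
        season_nullable=season,
        season_type_all_star=['Regular Season', 'Playoffs'],
        context_measure_simple='FGA',  # field goal attempts
    ).get_data_frames()[0]
    season_shots['SEASON'] = season  # label each row with its season
    season_frames.append(season_shots)
shots = pd.concat(season_frames, ignore_index=True)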
shots

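To see what `pd.concat` does when it combines the per-season pieces, here is a minimal sketch using two tiny made-up dataframes (`first` and `second` are hypothetical stand-ins, not real shot data). It also shows the optional `ignore_index=True` argument, which renumbers the rows instead of keeping each piece's own row labels.

```python
import pandas as pd

# Hypothetical per-season frames standing in for the downloaded shot data.
first = pd.DataFrame({'LOC_X': [10, -5], 'SEASON': '2021-22'})
second = pd.DataFrame({'LOC_X': [0], 'SEASON': '2022-23'})

# Stack the rows; each frame keeps its own row labels, so labels can repeat.
stacked = pd.concat([first, second])

# Stack the rows and relabel them 0, 1, 2, ...
renumbered = pd.concat([first, second], ignore_index=True)

print(stacked.index.tolist())
print(renumbered.index.tolist())
```
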
There are quite a few interesting columns in the dataset. We can list them using `shots.columns`.

In [None]:
shots.columns

We can also see all the possible values in a particular column using `.unique()`.

In [None]:
shots['SHOT_ZONE_AREA'].unique()

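If we want to know how often each value occurs, and not just which values exist, `value_counts()` is a useful companion to `.unique()`. The sketch below uses a small made-up dataframe (`sample_shots`, with hypothetical zone values) so that it runs without downloading anything.

```python
import pandas as pd

# Hypothetical sample; the real `shots` dataframe has many more rows and columns.
sample_shots = pd.DataFrame({
    'SHOT_ZONE_AREA': ['Center(C)', 'Left Side(L)', 'Center(C)',
                       'Right Side(R)', 'Center(C)'],
})

# Distinct values, in order of first appearance
zones = sample_shots['SHOT_ZONE_AREA'].unique()

# Number of rows for each distinct value, most frequent first
zone_counts = sample_shots['SHOT_ZONE_AREA'].value_counts()

print(zones)
print(zone_counts)
```
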
Let's create a scatterplot of the locations of the shots.

In [None]:
import plotly.express as px
px.scatter(shots, x='LOC_X', y='LOC_Y')

That looks a little like half of a basketball court. Let's resize it using `height` and `width`.

In [None]:
px.scatter(shots, x='LOC_X', y='LOC_Y', height=1000, width=800)

We can also color the points by whether the shot was made.

In [None]:
px.scatter(shots, x='LOC_X', y='LOC_Y', height=1000, width=800, color='SHOT_MADE_FLAG')

The computer thinks that the `SHOT_MADE_FLAG` value is a number, when it should just be true or false, which we call a boolean value. Let's change that and then remake the graph.

In [None]:
shots['SHOT_MADE_FLAG'] = shots['SHOT_MADE_FLAG'].astype(bool)
px.scatter(shots, x='LOC_X', y='LOC_Y', height=1000, width=800, color='SHOT_MADE_FLAG')

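The effect of `astype(bool)` is easy to check on a tiny example. This sketch assumes the flag is stored as 0 and 1, and uses a made-up dataframe (`sample_shots`) in place of the real data.

```python
import pandas as pd

# Hypothetical sample of the 0/1 flag column.
sample_shots = pd.DataFrame({'SHOT_MADE_FLAG': [1, 0, 0, 1]})
print(sample_shots['SHOT_MADE_FLAG'].dtype)  # an integer type before converting

# Convert the numbers to booleans: 0 becomes False, anything else becomes True.
sample_shots['SHOT_MADE_FLAG'] = sample_shots['SHOT_MADE_FLAG'].astype(bool)
print(sample_shots['SHOT_MADE_FLAG'].tolist())
```
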
We can also pass a list of columns as `hover_data`, such as `SHOT_DISTANCE` and `PLAYER_NAME`. We should also add a title.

In [None]:
px.scatter(shots, x='LOC_X', y='LOC_Y', color='SHOT_MADE_FLAG', hover_data=['SHOT_DISTANCE','PLAYER_NAME'], height=1000, width=800, title='Lakers Shot Chart 2021-2023')

We may want to look at just one season using `shots[shots['SEASON']=='2022-23']`, and also color the points by player.

In [None]:
season_data = shots[shots['SEASON']=='2022-23']
px.scatter(season_data, x='LOC_X', y='LOC_Y', color='PLAYER_NAME', hover_data=['SHOT_MADE_FLAG'], height=1000, width=800, title='Lakers Shot Chart 2022-23')

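Filtering with `shots[shots['SEASON']=='2022-23']` works in two steps: the comparison produces a column of true and false values, and the dataframe keeps only the rows where that value is true. Here is a minimal sketch of those two steps on a made-up dataframe (`sample_shots` and the player names are hypothetical).

```python
import pandas as pd

# Hypothetical sample standing in for the real shot data.
sample_shots = pd.DataFrame({
    'SEASON': ['2021-22', '2022-23', '2022-23'],
    'PLAYER_NAME': ['Player A', 'Player B', 'Player A'],
})

# Step 1: compare every row's season to the one we want.
mask = sample_shots['SEASON'] == '2022-23'
print(mask.tolist())

# Step 2: keep only the rows where the comparison is True.
season_data = sample_shots[mask]
print(season_data)
```
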
With these data we can also look for relationships between columns, such as between the time remaining in the period and the shot distance.

To do that we need to create a new column, measured in seconds, that combines `MINUTES_REMAINING` and `SECONDS_REMAINING`. We also convert the `PERIOD` column to a string so that we can select individual periods in the legend.

In [None]:
shots['TIME_REMAINING'] = shots['MINUTES_REMAINING'] * 60 + shots['SECONDS_REMAINING']
shots['PERIOD'] = shots['PERIOD'].astype(str)
px.scatter(shots, x='TIME_REMAINING', y='SHOT_DISTANCE', color='PERIOD', title='Shot Distance by Time Remaining', hover_data=['SHOT_DISTANCE','SHOT_MADE_FLAG'], height=800)

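To check the arithmetic behind `TIME_REMAINING`, here is a small worked example with made-up clock values (the `sample` dataframe is hypothetical). A shot taken with 11 minutes and 30 seconds left becomes 11 × 60 + 30 = 690 seconds.

```python
import pandas as pd

# Hypothetical clock readings for three shots.
sample = pd.DataFrame({
    'MINUTES_REMAINING': [11, 0, 5],
    'SECONDS_REMAINING': [30, 4, 0],
    'PERIOD': [1, 4, 2],
})

# Convert minutes to seconds and add the leftover seconds.
sample['TIME_REMAINING'] = sample['MINUTES_REMAINING'] * 60 + sample['SECONDS_REMAINING']

# Store the period as text so it is treated as a category, not a number.
sample['PERIOD'] = sample['PERIOD'].astype(str)

print(sample)
```
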
Another option is to use a histogram to see if there are patterns in the data.

In [None]:
px.histogram(shots, x='SHOT_DISTANCE', color='SHOT_MADE_FLAG', title='Shot Distance Frequencies')

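A histogram shows counts, so it can help to check a pattern numerically as well. As one possible follow-up, the sketch below groups shots into distance ranges and computes the share of shots made in each range. The dataframe, the bin edges, and the `DISTANCE_BIN` column are all made up for illustration.

```python
import pandas as pd

# Hypothetical sample of shot distances (in feet) and outcomes.
sample_shots = pd.DataFrame({
    'SHOT_DISTANCE': [1, 2, 3, 12, 15, 24, 25, 26],
    'SHOT_MADE_FLAG': [True, True, False, True, False, False, True, False],
})

# Sort each shot into a distance range; right=False means 10 falls in '10-19'.
sample_shots['DISTANCE_BIN'] = pd.cut(
    sample_shots['SHOT_DISTANCE'],
    bins=[0, 10, 20, 30],
    labels=['0-9', '10-19', '20-29'],
    right=False,
)

# The mean of a True/False column is the fraction of True values.
made_share = sample_shots.groupby('DISTANCE_BIN', observed=True)['SHOT_MADE_FLAG'].mean()
print(made_share)
```
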
With any of these visualizations, we can use a column as an `animation_frame` to show how the data change based on that column.

When creating animations, it is a good idea to set the x-axis and y-axis ranges to reasonable values for your data, so that they don't change with every animation frame.

In [None]:
px.histogram(shots, x='SHOT_DISTANCE', color='SHOT_MADE_FLAG', animation_frame='SEASON', title='Shot Distance Frequencies by Season').update_xaxes(range=[0, 80]).update_yaxes(range=[0, 1000])

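One way to choose fixed axis ranges is to compute them from the data instead of guessing. The sketch below, on a made-up dataframe, finds the largest shot distance for the x-axis and the largest number of shots at any single distance in any one season for the y-axis. It assumes the histogram bars are one foot wide, which may not match the bins Plotly picks automatically, so treat the result as a starting point.

```python
import pandas as pd

# Hypothetical sample: four shots in one season and five in the next.
sample_shots = pd.DataFrame({
    'SEASON': ['2021-22'] * 4 + ['2022-23'] * 5,
    'SHOT_DISTANCE': [2, 2, 25, 30, 1, 1, 1, 24, 47],
})

# The x-axis needs to reach the longest shot in any season.
x_max = sample_shots['SHOT_DISTANCE'].max()

# The y-axis needs to reach the tallest bar in any animation frame:
# count shots per (season, distance) pair and take the largest count.
y_max = sample_shots.groupby(['SEASON', 'SHOT_DISTANCE']).size().max()

print(x_max, y_max)
```
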
You can continue your own analysis in the [next notebook](basketball-challenge.ipynb).

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)