# Final Project, Part 2

The purpose of this assignment is to create a 'Viz for Experts' with an interactive dashboard interface for exploring your data.

For this submission option, you will submit your work through this Workspace.
    
**Please see Homework Prompt in PrairieLearn interface for more details on the requirements for this assignment.**

A rough outline of elements of code and write-up is shown below:

## NBA Team Performance Analysis Dashboard

### Team Members
Max Zhang

Jimmy Qiu

Zhuokai WU

This interactive dashboard explores NBA player and team performance statistics for the 2023-24 regular season.

## Code:

 * An interactive dashboard within your Workspace that helps an expert explore your dataset thoroughly.
 * There should be a "dashboard" type aspect to this - i.e. a linked view exploring your dataset in an interactive way (like in Lab \#4) with [bqplot](https://bqplot.github.io/bqplot/).
 * Do not delete any cells, *just comment them out*. Show your work.



In [58]:
import pandas as pd
df = pd.read_csv('2023-2024 NBA Player Stats - Regular.csv', sep=';', encoding='latin1')
df.head()

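|  | Rk | Player | Pos | Age | Tm | G | GS | MP | FG | FGA | ... | FT% | ORB | DRB | TRB | AST | STL | BLK | TOV | PF | PTS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Precious Achiuwa | PF-C | 24 | TOT | 74 | 18 | 21.9 | 3.2 | 6.3 | ... | 0.616 | 2.6 | 4.0 | 6.6 | 1.3 | 0.6 | 0.9 | 1.1 | 1.9 | 7.6 |
| 1 | 1 | Precious Achiuwa | C | 24 | TOR | 25 | 0 | 17.5 | 3.1 | 6.8 | ... | 0.571 | 2.0 | 3.4 | 5.4 | 1.8 | 0.6 | 0.5 | 1.2 | 1.6 | 7.7 |
| 2 | 1 | Precious Achiuwa | PF | 24 | NYK | 49 | 18 | 24.2 | 3.2 | 6.1 | ... | 0.643 | 2.9 | 4.3 | 7.2 | 1.1 | 0.6 | 1.1 | 1.1 | 2.1 | 7.6 |
| 3 | 2 | Bam Adebayo | C | 26 | MIA | 71 | 71 | 34.0 | 7.5 | 14.3 | ... | 0.755 | 2.2 | 8.1 | 10.4 | 3.9 | 1.1 | 0.9 | 2.3 | 2.2 | 19.3 |
| 4 | 3 | Ochai Agbaji | SG | 23 | TOT | 78 | 28 | 21.0 | 2.3 | 5.6 | ... | 0.661 | 0.9 | 1.8 | 2.8 | 1.1 | 0.6 | 0.6 | 0.8 | 1.5 | 5.8 |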


In [62]:
import numpy as np
import bqplot as bq
import ipywidgets as widgets

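# Drop the combined 'TOT' rows so a traded player appears once per team played for.
# .copy() makes clean_df an independent frame rather than a view of df.
clean_df = df[df['Tm'] != 'TOT'].copy()
clean_df.Tm.unique()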

array(['TOR', 'NYK', 'MIA', 'UTA', 'MEM', 'MIN', 'PHO', 'CLE', 'NOP',
       'MIL', 'ORL', 'WAS', 'POR', 'DET', 'CHO', 'PHI', 'BOS', 'SAS',
       'SAC', 'BRK', 'LAC', 'OKC', 'ATL', 'CHI', 'DEN', 'HOU', 'IND',
       'DAL', 'LAL', 'GSW'], dtype=object)
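
Players traded mid-season appear once per team plus a combined `TOT` row, as Precious Achiuwa does in the preview above. A minimal sketch of the filter on a toy frame follows; the three rows are copied from that preview, and `toy` and `per_team` are illustrative names that are not used elsewhere in the notebook.

```python
import pandas as pd

# Three rows copied from the df.head() preview: one combined 'TOT' row
# and the two per-team rows it summarizes.
toy = pd.DataFrame({
    'Player': ['Precious Achiuwa'] * 3,
    'Tm': ['TOT', 'TOR', 'NYK'],
    'G': [74, 25, 49],
    'PTS': [7.6, 7.7, 7.6],
})

# Dropping 'TOT' keeps one row per (player, team) stint.
per_team = toy[toy['Tm'] != 'TOT'].copy()

print(per_team['Tm'].tolist())   # ['TOR', 'NYK']
print(int(per_team['G'].sum()))  # 74, the games recorded on the 'TOT' row
```

Keeping the per-team rows means a traded player can show up in the top-5 list of more than one team.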

In [63]:
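# Create team summary statistics for the heatmap (driver plot).
# Each value is the unweighted mean of the per-game averages of a team's players,
# not the team's own per-game total.
team_stats = clean_df.groupby('Tm').agg({
    'PTS': 'mean',    # Mean of player points per game
    'AST': 'mean',    # Mean of player assists per game
    'TRB': 'mean',    # Mean of player total rebounds per game
    'FG%': 'mean',    # Mean of player field goal percentages
    '3P%': 'mean'     # Mean of player 3-point percentages
}).round(2)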

print("\nPreview of team statistics:")
print(team_stats.head())


Preview of team statistics:
      PTS   AST   TRB   FG%   3P%
Tm                               
ATL  9.21  2.11  3.65  0.44  0.31
BOS  8.53  1.89  3.52  0.49  0.37
BRK  8.30  2.25  3.58  0.43  0.29
CHI  8.96  1.83  3.71  0.44  0.28
CHO  9.30  2.45  3.53  0.44  0.29
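
The `mean` used above counts every player row equally, so a player with a handful of games influences the team value as much as a full-season starter. One possible alternative, shown only as a sketch and not what the dashboard uses, is to weight each player by games played (`G`). The team code `AAA` and the numbers below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical per-player rows for a single team (not taken from the dataset).
toy = pd.DataFrame({
    'Tm':  ['AAA', 'AAA', 'AAA'],
    'G':   [80, 80, 2],
    'PTS': [20.0, 10.0, 0.0],
})

# Unweighted: every player row counts equally.
unweighted = toy.groupby('Tm')['PTS'].mean()

# Weighted by games played: a two-game player barely moves the average.
weighted = pd.Series({
    team: np.average(group['PTS'], weights=group['G'])
    for team, group in toy.groupby('Tm')
})

print(round(unweighted['AAA'], 2))  # 10.0
print(round(weighted['AAA'], 2))    # 14.81
```

Minutes played (`MP`) would be another reasonable weight; which one is appropriate depends on the question the dashboard is meant to answer.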


In [65]:
# Create scales for the heatmap
teams_sc = bq.OrdinalScale()
stats_sc = bq.OrdinalScale()
col_sc = bq.ColorScale(scheme='YlOrRd')  # Yellow-Orange-Red color scheme to show intensity

# Create heatmap (driver plot)
heatmap = bq.GridHeatMap(
    row=team_stats.index.tolist(),
    column=team_stats.columns.tolist(),
    color=team_stats.values,
    scales={'row': teams_sc, 'column': stats_sc, 'color': col_sc},
    interactions={'click': 'select'},
    selected_style={'fill': 'purple'},
    title='NBA Team Statistics')

# Create axes for heatmap
x_ax_heat = bq.Axis(
    scale=stats_sc, 
    label='Statistics', 
    orientation='horizontal',
    tick_style={'font-size': 12})
y_ax_heat = bq.Axis(
    scale=teams_sc, 
    label='Teams', 
    orientation='vertical',
    tick_style={'font-size': 12})
col_ax = bq.ColorAxis(
    scale=col_sc, 
    orientation='vertical',
    label='Value')

# Create heatmap figure
fig_heatmap = bq.Figure(
    marks=[heatmap],
    axes=[x_ax_heat, y_ax_heat, col_ax],
    title='Team Statistics Heatmap',
    fig_margin={'top': 60, 'bottom': 50, 'left': 70, 'right': 100},
    min_aspect_ratio=0.8)

# Create scales for bar chart
x_sc = bq.OrdinalScale()
y_sc = bq.LinearScale()

# Create initial bar chart (driven plot)
bars = bq.Bars(
    x=[],
    y=[],
    scales={'x': x_sc, 'y': y_sc},
    colors=['#1f77b4'],  # Professional blue color
    padding=0.2)

# Create axes for bar chart
x_ax_bar = bq.Axis(
    scale=x_sc, 
    label='Players', 
    orientation='horizontal',
    tick_style={'text-anchor': 'end', 'transform': 'rotate(-45)', 'font-size': 12})
y_ax_bar = bq.Axis(
    scale=y_sc, 
    label='Value', 
    orientation='vertical',
    tick_style={'font-size': 12})

# Create bar chart figure
fig_bars = bq.Figure(
    marks=[bars],
    axes=[x_ax_bar, y_ax_bar],
    title='Top 5 Players',
    fig_margin={'top': 60, 'bottom': 100, 'left': 70, 'right': 50},
    min_aspect_ratio=0.8)

# Create selection function for linking plots
def on_selection(change):
    selected = change['new']
    # Nothing to draw when the selection is cleared
    if selected is None or len(selected) == 0:
        return

    # Each entry of `selected` is a [row, column] index pair; use the first
    row_idx, col_idx = selected[0]
    team = team_stats.index[row_idx]
    stat = team_stats.columns[col_idx]

    # Get top 5 players from selected team for selected statistic
    team_players = clean_df[clean_df['Tm'] == team].nlargest(5, stat)

    # Update bar chart data
    bars.x = team_players['Player'].tolist()
    bars.y = team_players[stat].tolist()

    # Update bar chart labels
    y_ax_bar.label = f'{stat} per Game'
    fig_bars.title = f'Top 5 Players - {team} ({stat})'

# Connect the plots
heatmap.observe(on_selection, names=['selected'])

# Lay out the heatmap (driver) and bar chart (driven) side by side as the dashboard
dashboard = widgets.HBox([fig_heatmap, fig_bars], layout={'height': '600px'})
display(dashboard)

HBox(children=(Figure(axes=[Axis(label='Statistics', scale=OrdinalScale(), tick_style={'font-size': 12}), Axis…

ValueError: Unsupported dtype object

ValueError: Unsupported dtype object
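
In the heatmap built above, all five statistics share a single `ColorScale`, so columns on very different scales (PTS near 9, FG% near 0.4 in the preview) compete for the same color range. A sketch of one way to address this, assuming a per-column min-max rescale is acceptable, is shown below. The toy values are copied from the `team_stats` preview; `toy_stats` and `scaled` are illustrative names, and using the result would be a change from the code above.

```python
import pandas as pd

# Values copied from the team_stats preview (ATL, BOS, BRK; PTS and FG% only).
toy_stats = pd.DataFrame(
    {'PTS': [9.21, 8.53, 8.30], 'FG%': [0.44, 0.49, 0.43]},
    index=['ATL', 'BOS', 'BRK'],
)

# Min-max rescale each column independently: 0 = lowest team, 1 = highest.
# A column whose values are all equal would divide by zero and yield NaN.
col_min = toy_stats.min()
col_range = toy_stats.max() - col_min
scaled = (toy_stats - col_min) / col_range

print(scaled.round(2))
```

With this approach, `scaled.values` could be passed as `color` in place of `team_stats.values`, so that each column's colors compare teams within that statistic. The color axis would then read as a relative rank rather than a raw value.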

## Prose:

* One paragraph explaining how to use the dashboard you created, to help someone who is not an expert understand your dataset.
* A list of 1 or more contextual datasets you have identified, links to where they reside, and a sentence about why they might be useful in telling the final story.
  * Here, "contextual dataset" means a dataset that would add context to your chosen dataset. For example, if your dataset is the Champaign bus routes, some interesting contextual datasets could be the Chicago bus routes, the Springfield bus routes, or the Amtrak routes in Champaign.
  * You do not have to do anything with this dataset at the moment beyond writing a bit about why it would be useful. Looking forward, you will want to include "contextual visualizations" (which you may or may not generate on your own) in your Final Project, Part 3, and identifying a possibly useful dataset is a great way to start looking for contextual visualizations.
* If you have identified your dataset as a "large one" (i.e. larger than the GitHub file upload limit), comment on whether you want to revise your plan for hosting this data. If this does not apply to your dataset, please explicitly state this.
* Additionally, please note that as of writing, it is not possible to embed images within Starboard. Be sure to address how you plan on including your contextual dataset to add context to your main dataset given that you won't be able to directly embed images if you plan on using Starboard for Part 3.1 of the Final Project.


## Dashboard Description for Non-Experts
This interactive dashboard helps you explore NBA team statistics and identify top performers for the 2023-24 season. It consists of two main parts:

1. Team Statistics Heatmap (left):
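   - Each row shows a different NBA team
   - Each column represents a key statistic (PTS = points, AST = assists, TRB = total rebounds, FG% = field goal percentage, 3P% = three-point percentage)
   - Each cell shows the average of that statistic across the team's players
   - Darker colors indicate higher values
   - Click any cell to see the top players for that team and statistic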

2. Player Performance Chart (right):
   - Shows the top 5 players from the selected team, ranked by the selected statistic
   - Updates automatically when you click the heatmap
   - Helps identify key contributors in specific statistical categories

For example, to see who the best scorers are on the Boston Celtics, click the cell where the Celtics row (BOS) intersects the PTS column.
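
For readers who want to see the lookup behind a click, the idea can be sketched as a standalone function. `top_players` is a hypothetical helper that is not part of the dashboard code, and the player names and numbers below are made up for illustration.

```python
import pandas as pd

def top_players(players, team, stat, n=5):
    """Return the n rows of `players` with the largest `stat` for `team`."""
    on_team = players[players['Tm'] == team]
    return on_team.nlargest(n, stat)[['Player', stat]]

# Hypothetical rows for illustration (not taken from the dataset).
toy = pd.DataFrame({
    'Player': ['A', 'B', 'C', 'D'],
    'Tm': ['BOS', 'BOS', 'BOS', 'MIA'],
    'PTS': [26.9, 23.0, 15.2, 18.0],
})

result = top_players(toy, 'BOS', 'PTS', n=2)
print(result['Player'].tolist())  # ['A', 'B']
```

Clicking the BOS/PTS cell corresponds to calling the helper with `team='BOS'` and `stat='PTS'`. A team with fewer than `n` players simply returns all of them.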

## Dataset Size Statement
Our dataset is approximately 95 KB, well below GitHub's file size limits, so the "large dataset" question does not apply to us. No special hosting considerations are needed, and we will continue with the same approach used in Part 1.
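
The size statement can be checked programmatically. The sketch below assumes GitHub's 100 MiB per-file block as the relevant limit (browser uploads have a lower cap); `fits_on_github` is a hypothetical helper, and a temporary 95 KB stand-in file is used instead of the real CSV.

```python
import os
import tempfile

# Assumed limit: GitHub blocks individual files larger than 100 MiB.
GITHUB_FILE_LIMIT_BYTES = 100 * 1024 * 1024

def fits_on_github(path, limit_bytes=GITHUB_FILE_LIMIT_BYTES):
    """Return True if the file at `path` is no larger than `limit_bytes`."""
    return os.path.getsize(path) <= limit_bytes

# Demonstrate on a temporary 95 KB stand-in file rather than the real CSV.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'x' * 95_000)
    tmp_path = tmp.name

size_ok = fits_on_github(tmp_path)
os.remove(tmp_path)
print(size_ok)  # True
```

In the notebook itself, the same function could be called with the path of the CSV loaded in the first code cell.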

## Plot Summary

Summarize the characteristics of the dataset in words: what does it represent, what are the fields/columns/rows, what data types are they, etc.