![Banner](https://github.com/Data-Dunkers/lessons/blob/main/images/top-banner.jpg?raw=true)

# Player Comparisons

In this activity, you'll work with real NBA player statistics to understand how different players contribute to the game. We will narrow the data to a few key columns, sort it to find top performers, and visualize the trade-off between scoring volume and efficiency.

## 1. Load the Dataset

First, we'll load the NBA player statistics from the official Data Dunkers repository.

In [None]:
import pandas as pd

url = "https://raw.githubusercontent.com/Data-Dunkers/data/refs/heads/main/NBA/player/nba_player_stats_2024-2025.csv"
df = pd.read_csv(url)
df.head()

## 2. Select Statistics

The dataset has many columns. Let's focus on the essential ones for comparing player roles: position (`POS`), playing time (`GP`, `MIN`), production (`PTS`, `REB`, `AST`), and shooting percentages (`FG%`, `3P%`, `FT%`).

In [None]:
essential_columns = [
    "Name", "POS", "GP", "MIN",
    "PTS", "REB", "AST",
    "FG%", "3P%", "FT%"
]

# Keep only the selected columns; .copy() makes df_reduced independent of df
df_reduced = df[essential_columns].copy()
df_reduced.head()
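Selecting a column that does not exist raises a `KeyError`, so it can help to check the names first. The sketch below uses a small made-up table (the `toy` DataFrame and its values are hypothetical, not taken from the real dataset) to show one way to separate the columns that are present from those that are missing.

```python
import pandas as pd

# Hypothetical miniature table standing in for the real dataset;
# the names and values are made up for illustration only.
toy = pd.DataFrame({
    "Name": ["Player A", "Player B"],
    "POS": ["G", "C"],
    "PTS": [20.0, 10.0],
    "FG%": [45.0, 60.0],
})

wanted = ["Name", "POS", "PTS", "FG%", "3P%"]

# Split the wanted names into those the table has and those it lacks.
present = [col for col in wanted if col in toy.columns]
missing = [col for col in wanted if col not in toy.columns]

print("Missing columns:", missing)

# Select only the columns that exist; .copy() keeps the result independent of `toy`.
toy_reduced = toy[present].copy()
print(toy_reduced)
```

The same check could be run on `df` with `essential_columns` in place of `wanted`.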

## 3. Explore the Data

Let's look at the summary statistics to understand what a "typical" player looks like.

In [None]:
df_reduced.describe()
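When reading the summary, it is worth comparing the `mean` row with the `50%` row (the median). A few very high scorers can pull the mean up, so the median is often closer to a "typical" player. The sketch below illustrates this with made-up numbers; the values are assumptions chosen for the example, not real statistics.

```python
import pandas as pd

# Made-up points values: five modest scorers and one very high scorer.
pts = pd.Series([4.0, 5.0, 6.0, 7.0, 8.0, 30.0], name="PTS")

summary = pts.describe()

mean_pts = summary["mean"]    # average of all six values
median_pts = summary["50%"]   # middle of the sorted values

print("mean:", mean_pts)
print("median:", median_pts)
```

Here the single high scorer lifts the mean well above the median, even though most of the values sit near the median.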

## 4. Find Top Performers

Who are the top scorers in the league? We can sort the data to find out.

In [None]:
df_reduced.sort_values(by="PTS", ascending=False).head(10)

Now, let's find the most efficient shooters. A player who appeared in only a few games or took only a few shots can post an extreme `FG%` that says little about how well they really shoot, so we first filter for players who have played enough games.

In [None]:
# Filter for players with at least 50 games played
qualified_players = df_reduced[df_reduced["GP"] >= 50]
qualified_players.sort_values(by="FG%", ascending=False).head(10)
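Games played is only one way to decide who qualifies. The sketch below combines a games threshold with a shot-attempts threshold, assuming a field-goal-attempts column such as `FGA` is available. The `toy` table, the cut-off values, and the interpretation of `FGA` as attempts per game are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical players with made-up numbers.
toy = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "GP": [70, 12, 65, 55],
    "FGA": [15.0, 2.0, 1.5, 9.0],
    "FG%": [47.0, 80.0, 75.0, 52.0],
})

min_games = 50      # assumed cut-off for games played
min_attempts = 5.0  # assumed cut-off for shot attempts

# Combine both conditions with & (each condition needs its own parentheses).
qualified = toy[(toy["GP"] >= min_games) & (toy["FGA"] >= min_attempts)]

# Rank the remaining players by shooting percentage, best first.
ranked = qualified.sort_values(by="FG%", ascending=False)
print(ranked)
```

Player B (few games) and Player C (few attempts) have the highest percentages in the toy table, but neither meets both thresholds, so they drop out of the ranking.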

## 5. Visualize: Volume vs. Efficiency

A common way to compare players is to look at how many points they score (`PTS`) versus how efficiently they shoot (`FG%`).

- **High PTS, High FG%**: Superstars
- **Low PTS, High FG%**: Role players / Centers (often take easy shots)
- **High PTS, Low FG%**: Volume scorers (inefficient)
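The groups above can also be assigned in code. The sketch below is one possible approach, assuming the median of each column is a reasonable dividing line between "high" and "low". The `toy` table, the `label` function, and the `group` column are hypothetical names introduced for this example.

```python
import pandas as pd

# Hypothetical players with made-up numbers.
toy = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "PTS": [28.0, 6.0, 24.0, 5.0],
    "FG%": [55.0, 65.0, 41.0, 38.0],
})

# Use each column's median as the (assumed) boundary between high and low.
pts_cut = toy["PTS"].median()
fg_cut = toy["FG%"].median()

def label(row):
    """Return the volume/efficiency group for one player row."""
    high_pts = row["PTS"] > pts_cut
    high_fg = row["FG%"] > fg_cut
    if high_pts and high_fg:
        return "High PTS, High FG%"
    if high_fg:
        return "Low PTS, High FG%"
    if high_pts:
        return "High PTS, Low FG%"
    return "Low PTS, Low FG%"

# Apply the function to every row (axis=1) and store the result.
toy["group"] = toy.apply(label, axis=1)
print(toy)
```

A different boundary, such as the mean or a fixed value, would move some players between groups, so the labels are only as meaningful as the cut-offs chosen.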

We'll use an interactive scatter plot to explore this relationship.

In [None]:
import plotly.express as px

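# Drop rows with missing values in the plotted columns;
# marker sizes cannot be computed from missing (NaN) minutes
plot_df = df_reduced.dropna(subset=["PTS", "FG%", "MIN"])

fig = px.scatter(
    plot_df,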
    x="PTS",
    y="FG%",
    color="POS",
    size="MIN",
    hover_name="Name",
    title="Scoring Volume vs. Efficiency (Size = Minutes Played)",
    height=600
)

fig.show()

## Reflection Questions

1. Who are the outliers in the top-right corner (high efficiency, high scoring)?
2. How does position (color) relate to shooting percentage? Do Centers (C) tend to have higher or lower percentages than Guards (G)?
3. Why might a player have a very high shooting percentage but low points per game?

---
### Online Access
You can run this notebook online using the following links:

*   [**Google Colab**](https://colab.research.google.com/github/Data-Dunkers/student/blob/main/activities/player-comparisons.ipynb)
*   [**Callysto Hub**](https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2FData-Dunkers%2Fstudent&branch=main&subPath=activities/player-comparisons.ipynb&depth=1)