# Mean, Median, and Mode

In statistics, "measures of central tendency" summarize a dataset with a single typical value. In basketball, these measures help us describe the typical scoring or playing time of a team's players.

* **Mean (Average):** The sum of all values divided by the number of values.
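* **Median (Middle):** The middle value when all values are sorted from lowest to highest. With an even number of values, it is the average of the two middle values.
* **Mode (Most Common):** The value that appears most frequently in the dataset. A dataset can have more than one mode when several values tie.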
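
To make these definitions concrete, here is a minimal sketch using Python's built-in `statistics` module. The list `points` is a made-up example, not Pacers data.

```python
import statistics

# Hypothetical points scored in six games (not real Pacers data)
points = [8, 12, 12, 15, 20, 29]

# Mean: sum of the values divided by how many there are
mean_value = statistics.mean(points)

# Median: with an even count, the average of the two middle values
median_value = statistics.median(points)

# Mode: the value that occurs most often
mode_value = statistics.mode(points)

print(f"Mean: {mean_value}")
print(f"Median: {median_value}")
print(f"Mode: {mode_value}")
```

Because the list has six values, the median falls halfway between the third and fourth sorted values rather than on a value that appears in the list.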

We will use the mean, median, and mode to understand the typical scoring contribution of players on the **Indiana Pacers in the 2024-2025 season**. We'll focus on **Points Per Game (PTS)**.

In [None]:
import pandas as pd
import plotly.express as px

url = "https://raw.githubusercontent.com/Data-Dunkers/data/refs/heads/main/NBA/team/2024-2025/IND_2024-2025_players.csv"
df = pd.read_csv(url)

# Remove the "Total" row
df = df[df['Name'] != 'Total']

# Display the first few rows
df[['Name', 'PTS']].head()

## Calculating Central Tendency

Let's calculate the Mean, Median, and Mode for the `PTS` column.

In [None]:
mean_pts = df['PTS'].mean()
median_pts = df['PTS'].median()
# mode() returns every tied value in ascending order; keep the first (smallest)
mode_pts = df['PTS'].mode().iloc[0]

print(f"Mean PTS: {mean_pts:.2f}")
print(f"Median PTS: {median_pts:.2f}")
print(f"Mode PTS: {mode_pts:.2f}")
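
One caveat about the mode: per-game averages are decimals, so it is possible that no value repeats. In that case pandas reports every value as a mode, and taking the first one simply returns the smallest value. The sketch below illustrates this with a made-up `pts` series (not Pacers data) and shows one possible workaround, rounding to whole points first. Whether rounding is appropriate depends on the question being asked.

```python
import pandas as pd

# Hypothetical per-game scoring averages (not real Pacers data)
pts = pd.Series([3.2, 4.8, 5.1, 9.8, 10.2, 18.3])

# Every value occurs exactly once, so all of them tie for "most frequent"
modes = pts.mode()
print(f"Modes found: {len(modes)} of {len(pts)} values")
print(f"First mode (the smallest value): {modes.iloc[0]}")

# Possible workaround: round to whole points so nearby values can match
rounded_modes = pts.round().mode()
print(f"Modes after rounding: {rounded_modes.tolist()}")
```

Comparing `len(modes)` with `len(pts)` is a quick way to tell whether the reported mode is meaningful or just an artifact of every value being unique.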

## Visualization

Now, let's create a scatter plot that shows every player and their points per game, along with lines indicating the mean, median, and mode.

In [None]:
fig = px.scatter(df.sort_values("PTS"), x="Name", y="PTS", 
                 title="Indiana Pacers 2024-2025: Points Per Game",
                 labels={"PTS": "Points Per Game (PTS)", "Name": "Player"})

# Add horizontal lines for Mean, Median, and Mode
fig.add_hline(y=mean_pts, line_dash="dash", line_color="red", annotation_text=f"Mean: {mean_pts:.1f}", annotation_position="top left")
fig.add_hline(y=median_pts, line_dash="dot", line_color="green", annotation_text=f"Median: {median_pts:.1f}", annotation_position="bottom left")
fig.add_hline(y=mode_pts, line_dash="dot", line_color="orange", annotation_text=f"Mode: {mode_pts:.1f}", annotation_position="top right")

fig.show()

## Reflection Questions

1. Looking at the mean line, which players score well above the team average?
2. Is the median higher or lower than the mean? What does this tell you about the distribution of points on the team?
3. How would adding a 'superstar' player who scores 35 points per game affect the mean? How would it affect the median?
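
One way to explore question 3 is to experiment on a small made-up roster before trying it on the real data. The sketch below uses hypothetical values in `roster` (not Pacers data) and compares the mean and median before and after adding a single 35-point scorer.

```python
import statistics

# Hypothetical points-per-game values for a small roster (not real Pacers data)
roster = [4.0, 6.0, 8.0, 10.0, 12.0]

# Add one hypothetical player who averages 35 points per game
with_star = roster + [35.0]

for label, values in [("Without star", roster), ("With star", with_star)]:
    mean_value = statistics.mean(values)
    median_value = statistics.median(values)
    print(f"{label}: mean = {mean_value:.2f}, median = {median_value:.2f}")
```

Run it and compare how far each measure moves, then try the same idea on the `PTS` column from the dataset.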