### **Module 5: Descriptive Statistics**

---

### **Introduction to Descriptive Statistics**

Descriptive statistics summarize data in a meaningful way to make it easier to understand. In sports data, you might use descriptive statistics to find:
- The average number of points per game.
- The highest jump height.
- The total number of goals scored.

Python makes it easy to calculate these statistics using lists, dictionaries, NumPy arrays, and pandas DataFrames.

---

### **Calculating Statistics with Lists**

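A Python list can store numerical data, and you can use built-in functions like `sum()`, `len()`, and `max()` to calculate statistics. There is no built-in `mean()` function, so the average is computed as `sum(...) / len(...)`.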




In [None]:
# Example: Points scored by a player in the last 5 games
points = [24, 30, 28, 22, 26]

# Calculate statistics
average_points = sum(points) / len(points)
max_points = max(points)
total_points = sum(points)

print("Average points:", average_points)
print("Maximum points:", max_points)
print("Total points:", total_points)
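Beyond the built-in functions, Python's standard-library `statistics` module offers ready-made helpers. The sketch below reuses the same `points` list and is only one possible approach; the variable names `mean_points`, `median_points`, and `min_points` are chosen here for illustration.

```python
import statistics

# Same example data: points scored in the last 5 games
points = [24, 30, 28, 22, 26]

mean_points = statistics.mean(points)      # equivalent to sum(points) / len(points)
median_points = statistics.median(points)  # middle value once the list is sorted
min_points = min(points)                   # lowest-scoring game

print("Mean points:", mean_points)
print("Median points:", median_points)
print("Minimum points:", min_points)
```

The median is often worth reporting next to the mean, because one unusually high or low game affects it less.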

### **Calculating Statistics with Dictionaries**
If your data is stored in a dictionary, use `.values()` to extract the numbers, then calculate statistics the same way you would with a list.

In [None]:
# Example: Points scored by players
player_points = {
    "LeBron James": 27.2,
    "Stephen Curry": 24.6,
    "Kevin Durant": 26.9,
}

# Extract values
points = list(player_points.values())

# Calculate statistics (rounded to one decimal place to match the input data)
average_points = sum(points) / len(points)
print("Average points:", round(average_points, 1))
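A dictionary also keeps each value tied to its key, so you can ask which player holds the maximum, not only what the maximum is. The following is a minimal sketch using the same `player_points` data; `top_scorer`, `top_average`, and `ranking` are illustrative names.

```python
# Same example data: points per game by player
player_points = {
    "LeBron James": 27.2,
    "Stephen Curry": 24.6,
    "Kevin Durant": 26.9,
}

# max() over a dictionary iterates its keys; key=player_points.get
# tells it to compare the keys by their associated values
top_scorer = max(player_points, key=player_points.get)
top_average = player_points[top_scorer]

# Sort the (name, value) pairs from highest to lowest value
ranking = sorted(player_points.items(), key=lambda item: item[1], reverse=True)

print("Top scorer:", top_scorer, "with", top_average)
print("Ranking:", ranking)
```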


### **Calculating Statistics with NumPy Arrays**
NumPy provides functions such as `np.mean()`, `np.max()`, and `np.sum()` that calculate statistics over an entire array in one call.



In [None]:
import numpy as np

# Example: Jump heights in cm
jump_heights = np.array([35, 42, 38, 45, 40])

# Calculate statistics
average_height = np.mean(jump_heights)
max_height = np.max(jump_heights)
total_height = np.sum(jump_heights)

print("Average height:", average_height)
print("Maximum height:", max_height)
print("Total height:", total_height)
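The same statistics are also available as array methods, and NumPy includes further summaries such as the median and standard deviation. This sketch reuses the `jump_heights` array; note that `np.std()` computes the population standard deviation by default, and passing `ddof=1` gives the sample version.

```python
import numpy as np

# Same example data: jump heights in cm
jump_heights = np.array([35, 42, 38, 45, 40])

min_height = jump_heights.min()           # method form of np.min(jump_heights)
median_height = np.median(jump_heights)   # middle value of the sorted data
std_height = np.std(jump_heights)         # population standard deviation (ddof=0)
sample_std = np.std(jump_heights, ddof=1) # sample standard deviation

print("Minimum height:", min_height)
print("Median height:", median_height)
print("Standard deviation:", round(float(std_height), 2))
print("Sample standard deviation:", round(float(sample_std), 2))
```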


### **Calculating Statistics with pandas DataFrames**
pandas DataFrames make it easy to calculate statistics for entire columns of data.

In [None]:
import pandas as pd

# Example: Player data
player_data = {
    "Name": ["LeBron James", "Stephen Curry", "Kevin Durant"],
    "Points Per Game": [27.2, 24.6, 26.9],
}
df = pd.DataFrame(player_data)

# Calculate statistics
average_points = df["Points Per Game"].mean()
max_points = df["Points Per Game"].max()
total_points = df["Points Per Game"].sum()

print("Average points:", average_points)
print("Maximum points:", max_points)
print("Total points:", total_points)
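When you want several statistics at once, the `describe()` method returns the count, mean, standard deviation, minimum, quartiles, and maximum of a column in a single call. The sketch below rebuilds the same DataFrame and also uses `idxmax()` to look up the row with the highest value; `summary` and `top_row` are illustrative names.

```python
import pandas as pd

# Same example data: player names and points per game
df = pd.DataFrame({
    "Name": ["LeBron James", "Stephen Curry", "Kevin Durant"],
    "Points Per Game": [27.2, 24.6, 26.9],
})

# Summary statistics for one column, returned as a Series
summary = df["Points Per Game"].describe()

# idxmax() gives the index label of the largest value;
# .loc then retrieves the whole row for that label
top_row = df.loc[df["Points Per Game"].idxmax()]

print(summary)
print("Top scorer:", top_row["Name"])
```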


### **Your Turn: Exercises**

1. Use a **list** to store the game scores of a basketball team: `[102, 96, 110, 98, 115]`.
   - Calculate the average, maximum, and total score.

2. Use a **dictionary** to store the points scored by players:  
   `{"Player A": 20, "Player B": 18, "Player C": 25, "Player D": 15}`  
   - Calculate the total and average points.

3. Create a **NumPy array** for the times (in seconds) of 5 sprints: `[11.2, 10.9, 11.5, 10.8, 11.0]`.
   - Calculate the average and fastest sprint time. (The fastest sprint is the smallest time, so look for the minimum.)

4. Create a **pandas DataFrame** with two columns:
   - **Columns**: `Player`, `Goals Scored`
   - **Data**:  
     - `Player 1`, `12`  
     - `Player 2`, `9`  
     - `Player 3`, `15`  
     - `Player 4`, `10`  
   - Find the total and average goals scored.
