# Lesson 6: Basic Statistical Operations in NumPy


Welcome to a new lesson! 🎉 Today, we'll explore **basic statistical operations** using Python's NumPy library, with SciPy's `stats` module for the mode. These operations, including **mean**, **median**, **mode**, **variance**, and **standard deviation**, are essential tools for understanding and interpreting data. After learning each operation, we'll apply these concepts to a real-world dataset. 🚀

---

## Mean, Median, Mode in NumPy 📊

- **Mean**: The average value, calculated as the sum of all values divided by the number of values. Use `np.mean(array)` in Python.
- **Median**: The middle value in a sorted list, calculated with `np.median(array)`.
- **Mode**: The most frequent value in a dataset, calculated using `stats.mode(array)` from the SciPy library.

### Example Code:
```python
import numpy as np
from scipy import stats

grades = np.array([85, 87, 89, 82, 86, 80, 92, 80])
print("Mean:", np.mean(grades))  # Mean: 85.125
print("Median:", np.median(grades))  # Median: 85.5
print("Mode:", stats.mode(grades))  # ModeResult with mode 80 and count 2 (exact repr varies by version)
```

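> **Note**: `stats.mode` returns a `ModeResult` object containing the mode and its frequency. In SciPy 1.11 and later, both are scalars for a 1-D array, so read the value from the `.mode` attribute (older versions returned one-element arrays and needed an extra `[0]`):
```python
print("Mode:", stats.mode(grades).mode)  # Mode: 80
```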
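
If SciPy is not available, the mode can also be found with NumPy alone. The following is a minimal sketch rather than part of the lesson's original code: it relies on `np.unique` with `return_counts=True`, and the names `values`, `counts`, and `mode_value` are illustrative.

```python
import numpy as np

grades = np.array([85, 87, 89, 82, 86, 80, 92, 80])

# np.unique returns the sorted distinct values and how often each occurs
values, counts = np.unique(grades, return_counts=True)

# The mode is the value with the highest count; argmax picks the first
# (smallest) value if several values are tied
mode_value = values[np.argmax(counts)]
print("Mode:", mode_value)  # Mode: 80
```

This approach avoids any dependence on the SciPy version, at the cost of two lines instead of one.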

---

## Variance and Standard Deviation in NumPy 📈

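- **Variance**: Measures the spread of the data as the average squared deviation from the mean. Use `np.var(array)`, which computes the population variance by default (pass `ddof=1` for the sample variance).
- **Standard Deviation**: The square root of the variance, expressed in the same units as the data; it indicates how far values typically lie from the mean. Use `np.std(array)`.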

### Example Code:
```python
print("Variance:", np.var(grades))  # Variance: 16.109375
print("Standard Deviation:", np.std(grades))  # Standard Deviation: approximately 4.0136
```
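
To connect `np.var` to its definition, here is a small sketch that computes the variance by hand and compares it with NumPy's result. It also shows the `ddof` parameter, which switches between the population variance (divide by N, the default) and the sample variance (divide by N - 1). The variable names are illustrative.

```python
import numpy as np

grades = np.array([85, 87, 89, 82, 86, 80, 92, 80])

# Population variance from the definition: mean of squared deviations
deviations = grades - np.mean(grades)
manual_variance = np.mean(deviations ** 2)

# np.var divides by N by default (ddof=0); ddof=1 divides by N - 1
population_variance = np.var(grades)
sample_variance = np.var(grades, ddof=1)

print("Manual variance:", manual_variance)
print("Population variance:", population_variance)  # 16.109375
print("Sample variance:", sample_variance)
print("Sample standard deviation:", np.std(grades, ddof=1))
```

Which divisor is appropriate depends on whether the array holds the whole population or a sample drawn from it; the lesson's examples use the default.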

---

## Summary 📝

Congratulations! 🎉 You've learned the following basic statistical operations using Python's NumPy and SciPy libraries:

1. **Mean**: `np.mean()`
2. **Median**: `np.median()`
3. **Mode**: `stats.mode()`
4. **Variance**: `np.var()`
5. **Standard Deviation**: `np.std()`

These tools are powerful for analyzing real-world datasets. Next up, try some hands-on exercises to apply these techniques and deepen your understanding. Let's get practicing! 💪


## Class Performance Statistics with NumPy

Let's take a look at how to compute some basic statistical metrics for a set of grades! The code provided below will calculate the mean, median, variance, and standard deviation of the grades. All you need to do is click Run to see these statistics in action. It's like getting a quick glimpse into the class performance report card!

```python
import numpy as np

grades = np.array([88, 92, 75, 95, 77, 83, 91, 89, 94])
print("Mean Grade:", np.mean(grades))
print("Median Grade:", np.median(grades))
print("Variance of Grades:", np.var(grades))
print("Standard Deviation of Grades:", np.std(grades))
```


## Analyze Class Performance with Statistical Metrics 📊

Let's compute some basic statistical metrics for a set of grades! 📝 The following code calculates:

1. **Mean**: The average grade.
2. **Median**: The middle value when grades are sorted.
3. **Variance**: The spread of the grades.
4. **Standard Deviation**: How much the grades deviate from the average.

Simply run the code below to get insights into the class performance report card. 🚀

### Code Example:
```python
import numpy as np

# Array of student grades
grades = np.array([88, 92, 75, 95, 77, 83, 91, 89, 94])

# Calculate and display statistical metrics
print("Mean Grade:", np.mean(grades))  # Average of all grades
print("Median Grade:", np.median(grades))  # Middle value of sorted grades
print("Variance of Grades:", np.var(grades))  # Measure of grade spread
print("Standard Deviation of Grades:", np.std(grades))  # Spread from the mean
```

### Expected Output:
Variance and standard deviation are rounded here to four decimal places; NumPy prints more digits.
```
Mean Grade: 87.11111111111111
Median Grade: 89.0
Variance of Grades: 46.5432
Standard Deviation of Grades: 6.8223
```

---

## Key Insights:

- **Mean Grade**: Reflects the overall class performance.
- **Median Grade**: Provides the middle point and is far less sensitive to extreme values than the mean.
- **Variance**: Indicates how spread out the grades are.
- **Standard Deviation**: Helps understand the consistency of the grades (lower = more consistent).

Use these metrics to quickly assess the class's performance and identify trends! 📚✨
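
The difference between the mean and the median under extreme values can be checked directly. The sketch below assumes a hypothetical scenario, not part of the original exercise, in which one additional student scores 0; it then measures how far each statistic moves.

```python
import numpy as np

grades = np.array([88, 92, 75, 95, 77, 83, 91, 89, 94])

# Hypothetical scenario: one extra student misses the test and scores 0
with_outlier = np.append(grades, 0)

# How far does each statistic move when the outlier is added?
mean_shift = np.mean(grades) - np.mean(with_outlier)
median_shift = np.median(grades) - np.median(with_outlier)

print("Mean shift:", mean_shift)
print("Median shift:", median_shift)  # Median shift: 0.5
```

The mean drops by several points while the median moves by only half a point, which is why the median is often preferred when a dataset may contain outliers.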


## Statistical Insight: Finding the Mode

You've been doing stellar work! Next, change the starter code to also calculate and display the mode of the scores. Remember to use what you learned about statistical operations and apply that knowledge here.

Let's dive into the data!

```python
import numpy as np

scores = np.array([70, 85, 90, 95, 65, 80, 75, 80])
print("Mean score:", np.mean(scores))
print("Median score:", np.median(scores))
print("Variance of scores:", np.var(scores))
print("Standard Deviation of scores:", np.std(scores))
```

Here's the updated code with the calculation for the **mode** added:

```python
import numpy as np
from scipy import stats  # Import the stats module for mode calculation

scores = np.array([70, 85, 90, 95, 65, 80, 75, 80])

# Calculate and display statistical metrics
print("Mean score:", np.mean(scores))
print("Median score:", np.median(scores))
print("Variance of scores:", np.var(scores))
print("Standard Deviation of scores:", np.std(scores))

# Calculate and display the mode
mode_result = stats.mode(scores)  # Returns a ModeResult object
# SciPy 1.11+ returns scalars for 1-D input; older versions returned
# one-element arrays and needed an extra [0]
print("Mode of scores:", mode_result.mode)  # Most frequent value
print("Mode count:", mode_result.count)  # Frequency of the mode
```

### Output:
Running the updated code prints the following (standard deviation rounded to three decimal places):
```
Mean score: 80.0
Median score: 80.0
Variance of scores: 87.5
Standard Deviation of scores: 9.354
Mode of scores: 80
Mode count: 2
```

### Explanation:
- **Mode**: The most frequently occurring score is `80`.
- **Mode count**: The value `80` appears `2` times in the dataset.

Now you have the mode included in your analysis! 😊
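
One caveat worth knowing: `stats.mode` reports a single value even when several values are tied for most frequent. The sketch below uses a hypothetical array (not the exercise data) in which two scores tie, and finds every mode with `np.unique`; the names `tied_scores` and `all_modes` are illustrative.

```python
import numpy as np

# Hypothetical scores in which 70 and 80 each appear twice
tied_scores = np.array([70, 85, 70, 95, 65, 80, 75, 80])

# Sorted distinct values and the number of times each occurs
values, counts = np.unique(tied_scores, return_counts=True)

# Keep every value whose count equals the highest count
all_modes = values[counts == counts.max()]
print("All modes:", all_modes)  # All modes: [70 80]
```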

Navigating through data fields is key, Space Voyager! The classroom has completed a math test, and your task is to uncover some insights. Calculate the average score and inspect the degree to which the scores vary by calculating the standard deviation.

```python
import numpy as np

# Classroom scores for a math test
scores = np.array([70, 85, 65, 90, 75, 80, 80, 95])
# TODO: Calculate and print the average score for the math test
# TODO: Calculate and print the standard deviation for the scores
```

Here's the updated code to calculate and print the average score and the standard deviation for the math test scores:

```python
import numpy as np

# Classroom scores for a math test
scores = np.array([70, 85, 65, 90, 75, 80, 80, 95])

# Calculate and print the average score
average_score = np.mean(scores)
print("Average Score:", average_score)

# Calculate and print the standard deviation of the scores
standard_deviation = np.std(scores)
print("Standard Deviation of Scores:", standard_deviation)
```

### Output:
If you run the code, you will see (standard deviation rounded to three decimal places):

```
Average Score: 80.0
Standard Deviation of Scores: 9.354
```

### Explanation:
- **Average Score**: The mean value of all scores is `80.0`, representing the central tendency of the data.
- **Standard Deviation**: The value of about `9.354` indicates the spread of the scores around the mean: a typical score lies roughly 9 points from the average.
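
One way to make the standard deviation concrete is to count how many scores fall within one standard deviation of the mean. This is an illustrative sketch, not part of the original task; the name `within_one_std` is made up for the example.

```python
import numpy as np

scores = np.array([70, 85, 65, 90, 75, 80, 80, 95])

mean = np.mean(scores)
std = np.std(scores)

# Boolean mask: True where a score is within one standard deviation of the mean
within_one_std = np.abs(scores - mean) <= std

print("Scores within one std:", scores[within_one_std])  # [85 75 80 80]
print("Count:", within_one_std.sum())  # Count: 4
```

Here half of the class scored within one standard deviation of the average, while scores of 70 and 90 fall just outside that band.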

Feel free to further explore the dataset for additional insights! 🚀