# Day 10: Introduction to SciPy

Welcome to Day 10! Today you'll get an introduction to SciPy, a library that builds on NumPy and provides higher-level scientific algorithms. We'll explore two of its most widely used sub-packages: `scipy.stats` for statistics and `scipy.optimize` for optimization.

As always, let's start by importing the necessary libraries.

In [1]:
import numpy as np
from scipy import stats
from scipy import optimize

---

## Part 1: Statistical Functions with `scipy.stats`

The `scipy.stats` module contains a large number of probability distributions and a growing library of statistical functions.
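
As a small illustration of the distribution objects mentioned above, the sketch below freezes a normal distribution with `stats.norm` and queries its cumulative distribution function and percent-point function. The mean of 85 and standard deviation of 5 are assumed values chosen only for illustration.

```python
from scipy import stats

# Hypothetical score distribution (assumed values): mean 85, std dev 5
score_dist = stats.norm(loc=85, scale=5)

# Probability that a score is at most 90, one standard deviation above the mean
p_below_90 = score_dist.cdf(90)

# Score below which 95% of the distribution lies
score_95th = score_dist.ppf(0.95)

print(f"P(score <= 90): {p_below_90:.4f}")
print(f"95th percentile: {score_95th:.2f}")
```

The same `cdf` and `ppf` methods are available on the other continuous distributions in `scipy.stats`.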

A common task in statistics is to compare the means of two groups. An independent two-sample t-test estimates whether the observed difference between the two sample means is larger than would be expected from random variation alone. Let's create some sample data representing test scores from two different classes.

In [None]:
# No random seed is set, so the generated scores change on every run.
# Class A scores - 30 draws from a normal distribution with mean 85 and std dev 5
class_a_scores = np.random.normal(loc=85, scale=5, size=30)

# Class B scores - normally distributed with mean 88 and std dev 5
class_b_scores = np.random.normal(loc=88, scale=5, size=30)

print(f"Class A Mean Score: {np.mean(class_a_scores):.2f}")
print(f"Class B Mean Score: {np.mean(class_b_scores):.2f}")

**Exercise 1.1:** Perform an independent t-test to see if the difference in mean scores between Class A and Class B is statistically significant. Use `stats.ttest_ind()`.

In [None]:
# Your code here

**Solution 1.1:**

In [None]:
# Solution
t_statistic, p_value = stats.ttest_ind(class_a_scores, class_b_scores)

print(f"T-statistic: {t_statistic:.4f}")
print(f"P-value: {p_value:.4f}")

# Interpretation of the p-value
alpha = 0.05  # Significance level
if p_value < alpha:
    print("The difference in mean scores is statistically significant.")
else:
    print("The difference in mean scores is not statistically significant.")
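
By default, `stats.ttest_ind()` assumes that both groups share the same variance. When that assumption is doubtful, passing `equal_var=False` runs Welch's t-test instead. The following is a minimal sketch on made-up data: the seed, the variable names, and the larger spread in the second group are assumptions for illustration and are not part of the exercise.

```python
import numpy as np
from scipy import stats

# Seeded generator so the sketch is reproducible (seed chosen arbitrarily)
rng = np.random.default_rng(seed=42)

# Hypothetical groups with different spreads
group_a = rng.normal(loc=85, scale=5, size=30)
group_b = rng.normal(loc=88, scale=10, size=30)

# Student's t-test (default, pooled variance)
student = stats.ttest_ind(group_a, group_b)

# Welch's t-test (does not assume equal variances)
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"Student's t-test p-value: {student.pvalue:.4f}")
print(f"Welch's t-test p-value:   {welch.pvalue:.4f}")
```

Because the two groups here have the same size, both tests produce the same t-statistic. Only the degrees of freedom, and therefore the p-value, differ.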

---

## Part 2: Optimization with `scipy.optimize`

Optimization is the problem of finding the input to a function that results in the minimum (or maximum) output. The `scipy.optimize` module provides several commonly used optimization algorithms. Its routines minimize; to maximize a function, minimize its negative.

Let's define a simple quadratic function that we want to minimize. For example: `f(x) = (x - 3)^2 + 5`. Because the squared term `(x - 3)^2` is never negative and equals zero only when x = 3, the minimum value is 5, which occurs at x = 3.

In [None]:
def my_function(x):
    return (x - 3)**2 + 5

**Exercise 2.1:** Use `optimize.minimize()` to find the value of `x` that minimizes `my_function`. Start the search from an initial guess of `x=0`.

In [None]:
# Your code here

**Solution 2.1:**

In [None]:
# Solution
initial_guess = 0
result = optimize.minimize(my_function, initial_guess)

print(result)

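The printed result reports the minimizer in `result.x`, a one-element array whose value should be very close to 3. The `result.fun` attribute gives the function value at that point, which should be very close to 5, and `result.success` indicates whether the optimizer converged.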
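
Rather than printing the whole result object, you can read individual fields from it. The sketch below repeats the function definition so that it runs on its own, checks `result.success` before using the answer, and also tries `optimize.minimize_scalar()`, which is designed for functions of a single variable. The variable names `best_x` and `scalar_result` are illustrative.

```python
from scipy import optimize

def my_function(x):
    return (x - 3)**2 + 5

result = optimize.minimize(my_function, 0)

if result.success:
    best_x = result.x[0]  # result.x is a one-element array
    print(f"Minimum at x = {best_x:.4f}, f(x) = {result.fun:.4f}")
else:
    print(f"Optimization failed: {result.message}")

# For one-variable functions, minimize_scalar needs no initial guess
scalar_result = optimize.minimize_scalar(my_function)
print(f"minimize_scalar: x = {scalar_result.x:.4f}")
```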

---

## Part 3: Descriptive Statistics

The `scipy.stats` module can also provide a quick and comprehensive summary of descriptive statistics for a dataset.

**Exercise 3.1:** Using the `class_a_scores` from Part 1, use the `stats.describe()` function to get a full statistical summary.

In [None]:
# Your code here

**Solution 3.1:**

In [None]:
# Solution
summary_stats = stats.describe(class_a_scores)
print(summary_stats)

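This single function returns the number of observations (`nobs`), the minimum and maximum as a tuple (`minmax`), the mean, the variance (computed with `ddof=1` by default), the skewness, and the kurtosis (Fisher's definition, which is 0 for a normal distribution).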
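
The returned object is a named tuple, so each statistic can be read by name. The following sketch uses its own seeded sample, an assumption made so that it runs independently of Part 1, and checks that the reported variance matches NumPy's sample variance.

```python
import numpy as np
from scipy import stats

# Seeded sample for reproducibility (seed chosen arbitrarily)
rng = np.random.default_rng(seed=0)
scores = rng.normal(loc=85, scale=5, size=30)

summary = stats.describe(scores)
lowest, highest = summary.minmax  # minmax is a (min, max) tuple

print(f"n = {summary.nobs}, range = [{lowest:.2f}, {highest:.2f}]")
print(f"mean = {summary.mean:.2f}, variance = {summary.variance:.2f}")

# describe() uses ddof=1 by default, matching the sample variance
print(np.isclose(summary.variance, np.var(scores, ddof=1)))
```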

---

### Excellent work!

You've just scratched the surface of what SciPy can do. It's a vast library with powerful tools for interpolation, integration, signal processing, and much more. Tomorrow, we'll dive into interpolation and integration.