# Module 5: SciPy - Comprehensive Test

This test covers all topics from Module 5:
- Statistics (descriptive stats, distributions, hypothesis testing)
- Interpolation and curve fitting
- Optimization (minimize, root finding)
- Integration and ODEs

**Instructions:**
1. Complete all 12 questions
2. Write your code in the provided code cells
3. Run your code to verify it works
4. Some questions have multiple parts - make sure to complete all parts

**Scoring:** Each question is worth the number of points shown in its heading. Partial credit may be given.

In [None]:
# Required imports - run this cell first
import numpy as np
from scipy import stats
from scipy import interpolate
from scipy.interpolate import interp1d, CubicSpline, UnivariateSpline
from scipy.optimize import minimize, minimize_scalar, root, brentq, curve_fit
from scipy.integrate import quad, dblquad, solve_ivp
import matplotlib.pyplot as plt

# Set random seed for reproducibility
np.random.seed(42)

# Set up plotting style
plt.style.use('seaborn-v0_8-whitegrid')

print("All imports successful!")

---

## Part 1: Statistics (Questions 1-4)

### Question 1: Descriptive Statistics (8 points)

A researcher collected reaction times (in milliseconds) from 40 participants in a cognitive experiment.

Using the data provided:

1. Calculate the mean, median, and standard deviation (2 points)
2. Use `scipy.stats.describe()` to get comprehensive statistics (2 points)
3. Calculate the 25th, 50th, and 75th percentiles, and the interquartile range (IQR = 75th - 25th percentile) (2 points)
4. Determine the skewness and interpret what it means for this data (2 points)
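
As a reminder of the relevant calls, here is a minimal sketch on a small made-up sample (not the test data). The array `sample` and all variable names are assumptions chosen for illustration only.

```python
import numpy as np
from scipy import stats

# Hypothetical sample, unrelated to the reaction-time data
sample = np.array([2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 6.0, 12.0])

mean = np.mean(sample)
median = np.median(sample)
std = np.std(sample, ddof=1)        # ddof=1 gives the sample standard deviation

# describe() reports nobs, minmax, mean, variance (ddof=1), skewness, kurtosis
summary = stats.describe(sample)

q25, q50, q75 = np.percentile(sample, [25, 50, 75])
iqr = q75 - q25                     # interquartile range

skewness = stats.skew(sample)       # > 0 suggests a longer right tail

print(f"mean={mean:.2f}, median={median:.2f}, std={std:.2f}")
print(summary)
print(f"Q1={q25:.2f}, Q2={q50:.2f}, Q3={q75:.2f}, IQR={iqr:.2f}")
print(f"skewness={skewness:.2f}")
```

A mean noticeably above the median, together with positive skewness, usually points to a few large values pulling the right tail out.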

In [None]:
# Reaction time data (milliseconds)
np.random.seed(42)
reaction_times = np.concatenate([
    np.random.normal(250, 30, 30),  # Normal responses
    np.random.normal(350, 40, 10)   # Slower responses (fatigue)
])

# Your code here


### Question 2: Probability Distributions (8 points)

A quality control engineer knows that the lifespan of LED bulbs follows an exponential distribution with a mean of 50,000 hours.

1. Create the exponential distribution object (2 points)
2. Calculate the probability that a bulb lasts more than 60,000 hours (2 points)
3. Calculate the probability that a bulb lasts between 30,000 and 70,000 hours (2 points)
4. Find the median lifespan (the time by which 50% of bulbs have failed) (2 points)
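
The sketch below shows how a frozen `scipy.stats.expon` object can answer tail, interval, and quantile questions. The mean of 10 minutes and the names `mean_wait` and `dist` are hypothetical and unrelated to the bulb data; the point to remember is that SciPy parameterizes the exponential by `scale` (the mean), not by the rate.

```python
import numpy as np
from scipy import stats

mean_wait = 10.0                          # hypothetical mean, in minutes
dist = stats.expon(scale=mean_wait)       # scale = mean = 1 / rate

p_more_than_15 = dist.sf(15)              # survival function: P(X > 15)
p_between = dist.cdf(20) - dist.cdf(5)    # P(5 < X < 20)
median = dist.ppf(0.5)                    # inverse CDF at 0.5

print(f"P(X > 15)     = {p_more_than_15:.4f}")
print(f"P(5 < X < 20) = {p_between:.4f}")
print(f"median        = {median:.4f}")
```

Using `sf` instead of `1 - cdf` tends to be more accurate far out in the tail.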

In [None]:
# Mean lifespan
mean_lifespan = 50000  # hours

# Your code here


### Question 3: Hypothesis Testing (10 points)

A pharmaceutical company claims their new medication reduces blood pressure by an average of 15 mmHg. A clinical trial with 30 patients showed the following blood pressure reductions.

1. State the null and alternative hypotheses (2 points)
2. Perform a one-sample t-test to test the company's claim (3 points)
3. Calculate the 95% confidence interval for the mean reduction (3 points)
4. Based on the p-value (alpha = 0.05), state your conclusion (2 points)
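
One possible way to run a two-sided one-sample t-test and build the matching confidence interval is sketched below. The measurements in `sample` and the reference value `mu0` are made up for illustration and are not the trial data.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements and reference value
sample = np.array([9.8, 10.4, 10.1, 9.6, 10.9, 10.3, 9.9, 10.6])
mu0 = 10.0

# H0: population mean == mu0, H1: population mean != mu0 (two-sided)
t_stat, p_value = stats.ttest_1samp(sample, popmean=mu0)

# 95% confidence interval from the t distribution with n - 1 degrees of freedom
n = len(sample)
mean = sample.mean()
sem = stats.sem(sample)                   # standard error of the mean (ddof=1)
ci_low, ci_high = stats.t.interval(0.95, n - 1, loc=mean, scale=sem)

alpha = 0.05
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
print(f"95% CI: ({ci_low:.3f}, {ci_high:.3f})")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

For a two-sided test at level alpha, rejecting H0 is equivalent to `mu0` falling outside the (1 - alpha) confidence interval, which gives a quick consistency check.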

In [None]:
# Blood pressure reductions (mmHg) from clinical trial
np.random.seed(123)
bp_reductions = np.random.normal(loc=12.5, scale=5, size=30)

# Claimed reduction
claimed_reduction = 15  # mmHg

# Your code here


### Question 4: Chi-Square Test (8 points)

A survey examined the relationship between education level and smartphone brand preference. The contingency table below shows the observed frequencies.

1. Perform a chi-square test of independence (3 points)
2. Display the expected frequencies (2 points)
3. Interpret the results at alpha = 0.05 (3 points)
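
As an illustration of the API, the following sketch applies `chi2_contingency` to a small invented 2x3 table (the counts in `table` are hypothetical, not the survey data). The function returns the statistic, the p-value, the degrees of freedom, and the expected frequencies under independence.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x3 table of observed counts
table = np.array([
    [10, 20, 30],
    [20, 20, 20],
])

chi2, p_value, dof, expected = chi2_contingency(table)

alpha = 0.05
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.4f}")
print("expected frequencies:\n", expected)
print("reject independence" if p_value < alpha else "fail to reject independence")
```

The degrees of freedom are (rows - 1) x (columns - 1), and each expected count is (row total x column total) / grand total.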

In [None]:
# Observed frequencies
# Rows: Education (High School, Bachelor's, Graduate)
# Columns: Brand (Apple, Samsung, Other)
observed = np.array([
    [30, 45, 25],   # High School
    [55, 40, 15],   # Bachelor's
    [65, 30, 15],   # Graduate
])

# Your code here


---

## Part 2: Interpolation and Curve Fitting (Questions 5-7)

### Question 5: Interpolation (8 points)

Temperature measurements were taken at a weather station every 3 hours.

1. Create a cubic spline interpolation of the data (2 points)
2. Estimate the temperature at 10:00 AM (hour 10) and 4:30 PM (hour 16.5) (2 points)
3. Find the time and value of the maximum temperature (2 points)
4. Plot the original data points and the interpolated curve (2 points)
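
A minimal sketch of cubic-spline interpolation and one way to locate the maximum of the interpolant follows. The arrays `x` and `y` are invented, and finding the maximum through the roots of the spline's derivative is just one approach; evaluating on a dense grid is a simpler alternative.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical samples, unrelated to the temperature data
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8])

cs = CubicSpline(x, y)
y_at_1_5 = cs(1.5)                        # interpolated value between samples

# Candidate maxima: stationary points of the spline plus the two endpoints
stationary = cs.derivative().roots(extrapolate=False)
candidates = np.concatenate([[x[0], x[-1]], stationary])
x_max = candidates[np.argmax(cs(candidates))]
y_max = cs(x_max)

print(f"cs(1.5) = {y_at_1_5:.3f}")
print(f"maximum {y_max:.3f} at x = {x_max:.3f}")
```

Note that the maximum of the interpolant can lie between sample points and exceed the largest sampled value.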

In [None]:
# Time (hours from midnight) and temperature (Celsius)
hours = np.array([0, 3, 6, 9, 12, 15, 18, 21, 24])
temperatures = np.array([15, 13, 14, 19, 26, 28, 24, 19, 16])

# Your code here


### Question 6: Curve Fitting (10 points)

A biologist is studying bacterial growth and collected population data over time. The data appears to follow logistic growth:

$$P(t) = \frac{K}{1 + \frac{K - P_0}{P_0} e^{-rt}}$$

where K is the carrying capacity, r is the growth rate, and P0 is the initial population.

1. Define the logistic growth function (2 points)
2. Use `curve_fit` to fit the model to the data (3 points)
3. Extract the fitted parameters with their uncertainties (2 points)
4. Plot the data and fitted curve (3 points)
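
The general `curve_fit` workflow can be sketched on a different model. Here a hypothetical exponential decay `decay(t, a, k)` is fitted to synthetic data; the model, the true parameters, and the noise level are assumptions for illustration, not part of the test.

```python
import numpy as np
from scipy.optimize import curve_fit

def decay(t, a, k):
    """Hypothetical model: exponential decay with amplitude a and rate k."""
    return a * np.exp(-k * t)

# Synthetic data generated from known parameters plus noise
rng = np.random.default_rng(0)
t = np.linspace(0, 5, 25)
y = decay(t, 3.0, 0.7) + rng.normal(0, 0.05, t.size)

# p0 is the initial guess; a reasonable guess matters for nonlinear models
popt, pcov = curve_fit(decay, t, y, p0=[1.0, 1.0])

# One-sigma uncertainties are the square roots of the covariance diagonal
perr = np.sqrt(np.diag(pcov))

for name, value, err in zip(["a", "k"], popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

For models with several parameters, starting values taken from the data (for example the first and last observations) usually help the fit converge.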

In [None]:
# Time (hours) and bacterial population (millions)
t_data = np.array([0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20])
population = np.array([10, 18, 32, 55, 85, 120, 150, 175, 190, 198, 200])

# Your code here


### Question 7: Smoothing Noisy Data (6 points)

A sensor recorded noisy measurements of a periodic signal.

1. Use `UnivariateSpline` to smooth the data with an appropriate smoothing parameter (2 points)
2. Compare at least two different smoothing levels (2 points)
3. Plot the original noisy data and smoothed curves (2 points)
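
The role of the smoothing parameter `s` can be sketched as follows. In `UnivariateSpline`, `s` is an upper bound on the sum of squared residuals, so a common rule of thumb (an assumption here, not a requirement of the test) is to start near `n * sigma**2`, where `sigma` is the estimated noise level. The signal and noise below are synthetic.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Synthetic noisy signal, unrelated to the sensor data
rng = np.random.default_rng(1)
x = np.linspace(0, 2 * np.pi, 50)
sigma = 0.2                               # assumed noise standard deviation
y = np.cos(x) + rng.normal(0, sigma, x.size)

# Two smoothing levels around the rule-of-thumb value n * sigma**2
s_light = 0.5 * len(x) * sigma**2
s_heavy = 2.0 * len(x) * sigma**2
spl_light = UnivariateSpline(x, y, s=s_light)
spl_heavy = UnivariateSpline(x, y, s=s_heavy)

for name, spl in [("light", spl_light), ("heavy", spl_heavy)]:
    print(f"{name}: knots = {len(spl.get_knots())}, "
          f"residual = {spl.get_residual():.3f}")
```

A smaller `s` follows the data more closely and typically uses more knots; a larger `s` gives a smoother curve at the cost of a larger residual.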

In [None]:
# Noisy sensor data
np.random.seed(42)
x_sensor = np.linspace(0, 4*np.pi, 60)
y_true = np.sin(x_sensor) + 0.3*np.sin(3*x_sensor)
y_noisy = y_true + np.random.normal(0, 0.25, len(x_sensor))

# Your code here


---

## Part 3: Optimization (Questions 8-10)

### Question 8: Scalar Optimization (6 points)

Find the minimum of the function:

$$f(x) = x^3 - 6x^2 + 9x + 1$$

on the interval [0, 5].

1. Use `minimize_scalar` with bounds to find the interior local minimum (2 points)
2. Also find any local maximum in the interval (2 points)
3. Plot the function and mark the extrema (2 points)
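
A short sketch of bounded scalar minimization, and of finding a maximum by minimizing the negated function, is given below. The function `h` and the chosen intervals are hypothetical and differ from the function in this question.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def h(x):
    """Hypothetical example: local maximum at x = 0, local minimum at x = 2."""
    return (x - 2.0) ** 2 * (x + 1.0)

# Bounded search for a minimum on [0, 4]
res_min = minimize_scalar(h, bounds=(0, 4), method="bounded")

# A maximum of h is a minimum of -h; search on [-1, 1]
res_max = minimize_scalar(lambda x: -h(x), bounds=(-1, 1), method="bounded")

print(f"minimum: h({res_min.x:.4f}) = {res_min.fun:.4f}")
print(f"maximum: h({res_max.x:.4f}) = {-res_max.fun:.4f}")
```

The bounded method is a local search inside the interval, so it is worth comparing the result with the function values at the endpoints before calling it the global minimum.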

In [None]:
def f(x):
    return x**3 - 6*x**2 + 9*x + 1

# Your code here


### Question 9: Constrained Optimization (10 points)

A company manufactures two products. Profit is given by:

$$P(x, y) = 50x + 40y$$

Subject to constraints:
- Machine A: 2x + y <= 100 (hours)
- Machine B: x + 3y <= 90 (hours)
- x, y >= 0 (non-negative production)

1. Set up the optimization problem (maximize profit = minimize -profit) (3 points)
2. Define all constraints properly (3 points)
3. Solve and report optimal production quantities and maximum profit (4 points)
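
The constraint format expected by `minimize` can be sketched on a smaller, invented problem: maximize 3x + 4y subject to x + y <= 4, x + 3y <= 6, and x, y >= 0. The coefficients are hypothetical. With `method='SLSQP'`, each inequality constraint is a function that must be non-negative at a feasible point.

```python
import numpy as np
from scipy.optimize import minimize

def neg_objective(v):
    """Negated objective, because minimize() minimizes."""
    x, y = v
    return -(3 * x + 4 * y)

# 'ineq' means fun(v) >= 0, so "x + y <= 4" becomes "4 - x - y >= 0"
constraints = [
    {"type": "ineq", "fun": lambda v: 4 - v[0] - v[1]},
    {"type": "ineq", "fun": lambda v: 6 - v[0] - 3 * v[1]},
]
bounds = [(0, None), (0, None)]           # non-negativity of x and y

res = minimize(neg_objective, x0=[1.0, 1.0], method="SLSQP",
               bounds=bounds, constraints=constraints)

print(f"x = {res.x[0]:.3f}, y = {res.x[1]:.3f}, objective = {-res.fun:.3f}")
```

Because this example is linear, `scipy.optimize.linprog` would be the dedicated tool; `minimize` is used here only to show the constraint dictionaries.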

In [None]:
# Your code here


### Question 10: Root Finding (8 points)

The Colebrook equation is used in fluid mechanics to find the friction factor f:

$$\frac{1}{\sqrt{f}} = -2 \log_{10}\left(\frac{\epsilon/D}{3.7} + \frac{2.51}{Re \sqrt{f}}\right)$$

Given: Reynolds number Re = 100000, relative roughness epsilon/D = 0.001

1. Rearrange as g(f) = 0 for root finding (2 points)
2. Use `brentq` to solve for f in the interval [0.01, 0.1] (3 points)
3. Verify your solution by substituting back (3 points)
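
The bracketing workflow for `brentq` can be illustrated on a different equation, x = cos(x), rewritten as g(x) = cos(x) - x = 0. The function `g` and the bracket [0, 1] are chosen for illustration; `brentq` requires that the function change sign across the bracket.

```python
import numpy as np
from scipy.optimize import brentq

def g(x):
    """Residual form of the equation x = cos(x)."""
    return np.cos(x) - x

a, b = 0.0, 1.0
# brentq needs g(a) and g(b) to have opposite signs
assert g(a) * g(b) < 0

root_x = brentq(g, a, b)
residual = g(root_x)                      # should be very close to zero

print(f"root = {root_x:.10f}, residual = {residual:.2e}")
print(f"check: cos(root) = {np.cos(root_x):.10f}")
```

Substituting the root back into both sides of the original equation, as in the last line, is a simple way to verify the solution.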

In [None]:
# Given parameters
Re = 100000
eps_D = 0.001  # epsilon/D

# Your code here


---

## Part 4: Integration and ODEs (Questions 11-12)

### Question 11: Numerical Integration (8 points)

The gamma function is defined as:

$$\Gamma(n) = \int_0^\infty t^{n-1} e^{-t} dt$$

For positive integers, Gamma(n) = (n-1)!

1. Implement the gamma function using `quad` (3 points)
2. Calculate Gamma(5) and verify it equals 4! = 24 (2 points)
3. Calculate Gamma(0.5) and verify it equals sqrt(pi) (3 points)
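
Two features of `quad` are useful for integrals of this kind: it accepts `np.inf` as a limit, and it can handle integrable singularities at an endpoint. The sketch below shows both on integrals with known closed forms; the integrands are chosen for illustration and are not the gamma integrand.

```python
import numpy as np
from scipy.integrate import quad

# Infinite upper limit: integral of exp(-x^2) over [0, inf) equals sqrt(pi)/2
gauss_value, gauss_err = quad(lambda x: np.exp(-x**2), 0, np.inf)

# Integrable singularity at x = 0: integral of x^(-1/2) over [0, 1] equals 2
sing_value, sing_err = quad(lambda x: 1.0 / np.sqrt(x), 0, 1)

print(f"Gaussian: {gauss_value:.10f} (exact {np.sqrt(np.pi) / 2:.10f})")
print(f"Singular: {sing_value:.10f} (exact 2)")
print(f"error estimates: {gauss_err:.1e}, {sing_err:.1e}")
```

`quad` returns a tuple of the value and an estimate of the absolute error, so comparing the difference from the exact result with that estimate is a reasonable sanity check.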

In [None]:
# Your code here


### Question 12: Solving ODEs (10 points)

A pendulum's motion is described by:

$$\frac{d^2\theta}{dt^2} = -\frac{g}{L}\sin(\theta) - b\frac{d\theta}{dt}$$

where g = 9.81 m/s^2, L = 1 m (length), and b = 0.5 s^-1 (damping coefficient).

Initial conditions: theta(0) = pi/4 (45 degrees), omega(0) = 0 (starts from rest)

1. Convert to a system of first-order ODEs (2 points)
2. Solve using `solve_ivp` for t = 0 to 20 seconds (3 points)
3. Plot theta vs time (2 points)
4. Create a phase portrait (theta vs omega) (3 points)
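
Reducing a second-order ODE to a first-order system can be sketched on a simpler case, the undamped linear oscillator x'' = -omega^2 x, whose exact solution is known. The value `omega = 2.0`, the state ordering, and the tolerances are assumptions made for this illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

omega = 2.0                               # hypothetical angular frequency

def rhs(t, state):
    """First-order system: state = [x, v] with x' = v and v' = -omega^2 x."""
    x, v = state
    return [v, -omega**2 * x]

t_span = (0.0, 5.0)
t_eval = np.linspace(*t_span, 201)        # times at which to store the solution
sol = solve_ivp(rhs, t_span, [1.0, 0.0], t_eval=t_eval,
                rtol=1e-8, atol=1e-10)

x, v = sol.y                              # rows of sol.y follow the state order
max_err = np.max(np.abs(x - np.cos(omega * t_eval)))
print(f"success = {sol.success}, max error vs cos(omega t) = {max_err:.2e}")
```

Plotting `x` against `sol.t` gives the time series, and plotting `v` against `x` gives the phase portrait.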

In [None]:
# Parameters
g = 9.81  # m/s^2
L = 1.0   # m
b = 0.5   # damping coefficient

# Initial conditions
theta0 = np.pi / 4  # 45 degrees
omega0 = 0          # starts from rest

# Your code here


---

## End of Test

**Total Points: 100**

Make sure all cells have been executed and your answers are complete before submitting.