# Statistics Exercises

## Part 1: Type I and Type II Errors
### Task: Evaluate the Effectiveness of a Training Program
- A coach believes that a new training program improves the sprint speed of athletes.
- Define what a Type I error and a Type II error would mean in this context.
- What are the potential consequences of each type of error?

## Part 2: Hypothesis Testing & Confidence Interval
### Task: Soccer Team's Goal Scoring Rate
- A soccer team claims that they score an average of at least 2.5 goals per match.
- Formulate null and alternative hypotheses for this claim.
- Use a t-test to test the hypothesis and construct a 95% confidence interval for the average goals per match.
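
One possible way to set this up is sketched below. It assumes the claim itself is the null hypothesis ($H_0: \mu \ge 2.5$ against $H_1: \mu < 2.5$), so a one-sided, one-sample t-test is used. The match data is hypothetical: it is simulated from a Poisson distribution with an arbitrarily chosen rate, and the variable names are illustrative rather than part of the exercise.

```python
import numpy as np
from scipy import stats

np.random.seed(42)

# Hypothetical data (assumption): goals in 30 matches, simulated from a
# Poisson distribution with an arbitrarily chosen rate of 2.3.
goals = np.random.poisson(lam=2.3, size=30)

claimed_mean = 2.5

# H0: mu >= 2.5 (the team's claim); H1: mu < 2.5.
# alternative="less" makes the one-sample t-test one-sided.
t_stat, p_value = stats.ttest_1samp(goals, popmean=claimed_mean, alternative="less")

# Two-sided 95% confidence interval for the mean, based on the t distribution.
sample_mean = goals.mean()
sem = stats.sem(goals)  # standard error of the mean (uses ddof=1)
ci_low, ci_high = stats.t.interval(0.95, len(goals) - 1, loc=sample_mean, scale=sem)

print(f"sample mean = {sample_mean:.2f}")
print(f"t = {t_stat:.3f}, one-sided p = {p_value:.3f}")
print(f"95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

If the p-value falls below the chosen significance level, the data would count as evidence against the claim; otherwise the claim is not rejected. Note that the confidence interval here is two-sided, while the test is one-sided, so the two do not have to agree exactly at the same level.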

## Part 3: Correlation
### Task: Relationship Between Age and Performance
- You have data on the ages of marathon runners and their finishing times (in minutes).
- Calculate both the Pearson and Spearman correlation coefficients to assess the relationship between age and finishing time.
- Interpret the results.

In [1]:
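import numpy as np
from scipy.stats import pearsonr, spearmanr

np.random.seed(42)  # fix the seed so the simulated data is reproducible
num_runners = 50

# Ages drawn uniformly from 20 to 59 (the upper bound of randint is exclusive)
ages = np.random.randint(20, 60, num_runners)

# Finishing times in minutes: linear in age plus Gaussian noise (sd = 10)
finishing_times = 300 - (ages * 2) + np.random.normal(0, 10, num_runners)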
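
A minimal sketch of how both coefficients could be computed for this simulated data follows. It regenerates the same arrays so that it runs on its own; the result variable names are illustrative.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Recreate the simulated data from the cell above.
np.random.seed(42)
num_runners = 50
ages = np.random.randint(20, 60, num_runners)
finishing_times = 300 - (ages * 2) + np.random.normal(0, 10, num_runners)

# Pearson measures the strength of the linear relationship.
pearson_r, pearson_p = pearsonr(ages, finishing_times)

# Spearman measures the strength of the monotonic (rank-based) relationship.
spearman_rho, spearman_p = spearmanr(ages, finishing_times)

print(f"Pearson r    = {pearson_r:.3f} (p = {pearson_p:.3g})")
print(f"Spearman rho = {spearman_rho:.3f} (p = {spearman_p:.3g})")
```

Because the simulated times decrease linearly with age, both coefficients should come out strongly negative and close to each other. A large gap between them would instead hint at a monotonic but non-linear relationship, or at outliers.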

## Part 4: Central Limit Theorem & Law of Large Numbers
### Task: Free Throw Shooting in Basketball
- Assume a basketball player has a free-throw success rate of 75%.
- Simulate the player's success rate over different numbers of attempts (10, 100, 1,000, and 10,000).
- Use the simulations to demonstrate the Law of Large Numbers and the Central Limit Theorem.
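
The simulation could be sketched as follows, assuming each free throw is an independent Bernoulli trial with success probability 0.75. The number of repeated sessions (`num_repeats`) and all variable names are assumptions chosen for illustration.

```python
import numpy as np

np.random.seed(42)

p_success = 0.75
attempt_counts = [10, 100, 1000, 10_000]

# Law of Large Numbers: the observed success rate of a single session
# tends to get closer to 0.75 as the number of attempts grows.
observed_rates = {n: np.random.binomial(n, p_success) / n for n in attempt_counts}
for n, rate in observed_rates.items():
    print(f"n = {n:>6}: observed rate = {rate:.4f}")

# Central Limit Theorem: across many repeated sessions, the sample success
# rate is approximately normal with mean p and standard error sqrt(p(1-p)/n).
num_repeats = 5000  # assumption: number of simulated sessions per n
empirical_sd = {}
theoretical_se = {}
for n in attempt_counts:
    rates = np.random.binomial(n, p_success, size=num_repeats) / n
    empirical_sd[n] = rates.std(ddof=1)
    theoretical_se[n] = np.sqrt(p_success * (1 - p_success) / n)
    print(f"n = {n:>6}: mean = {rates.mean():.4f}, "
          f"sd = {empirical_sd[n]:.4f}, theoretical se = {theoretical_se[n]:.4f}")
```

Plotting a histogram of `rates` for each `n` (for example with `matplotlib`) would make the shrinking, increasingly bell-shaped distribution visible.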

## Part 5: Z-Test, T-Test, ANOVA, and Confidence Interval
### Task: Compare Average Running Speeds across Different Sports
- You have data on the running speeds (in km/h) of players from three sports: soccer, basketball, and baseball.
- Use an ANOVA test to determine if there's a statistically significant difference in running speeds across the sports.
- Construct a 95% confidence interval for the average speed of each sport.

In [2]:
np.random.seed(42)

soccer_mean, soccer_std, soccer_count = 28, 3, 50   # Soccer players are generally faster
basketball_mean, basketball_std, basketball_count = 25, 2, 50  # Basketball players are moderately fast
baseball_mean, baseball_std, baseball_count = 22, 4, 50  # Baseball players are typically slower

soccer_speeds = np.random.normal(soccer_mean, soccer_std, soccer_count)
basketball_speeds = np.random.normal(basketball_mean, basketball_std, basketball_count)
baseball_speeds = np.random.normal(baseball_mean, baseball_std, baseball_count)
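
One way to run the test and build the intervals is sketched below. It regenerates the same simulated samples so that it runs on its own, uses `scipy.stats.f_oneway` for a classical one-way ANOVA, and builds each interval from the t distribution. The dictionary and variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Recreate the simulated samples from the cell above.
np.random.seed(42)
soccer_speeds = np.random.normal(28, 3, 50)
basketball_speeds = np.random.normal(25, 2, 50)
baseball_speeds = np.random.normal(22, 4, 50)

# One-way ANOVA. H0: all three sports have the same mean running speed.
f_stat, p_value = stats.f_oneway(soccer_speeds, basketball_speeds, baseball_speeds)
print(f"F = {f_stat:.2f}, p = {p_value:.3g}")

# 95% confidence interval for each sport's mean, based on the t distribution.
samples = {
    "soccer": soccer_speeds,
    "basketball": basketball_speeds,
    "baseball": baseball_speeds,
}
intervals = {}
for name, speeds in samples.items():
    mean = speeds.mean()
    sem = stats.sem(speeds)  # standard error of the mean (uses ddof=1)
    intervals[name] = stats.t.interval(0.95, len(speeds) - 1, loc=mean, scale=sem)
    low, high = intervals[name]
    print(f"{name:>10}: mean = {mean:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Classical one-way ANOVA assumes equal variances across groups, while the simulated groups use different standard deviations (3, 2, and 4), so checking that assumption is a sensible part of the interpretation. A significant result only says that at least one mean differs; it does not identify which one.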

## Part 6: Statistical Significance and Power
### Task: Power Analysis for a Sports Training Program
- A new training method is hypothesized to improve the endurance of cyclists.
- You are planning a study to test this training method's effect.
- Calculate the sample size needed to achieve 80% power for detecting a medium effect size at a significance level of 0.05.

Consult the <a href='https://www.statsmodels.org/stable/generated/statsmodels.stats.power.FTestAnovaPower.solve_power.html#statsmodels.stats.power.FTestAnovaPower.solve_power'>documentation</a> to learn how the `TTestIndPower` class and its `solve_power` method work, and how you can use them to complete the exercise.

In [3]:
from statsmodels.stats.power import TTestIndPower
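
A possible sketch of the calculation follows. It rests on several assumptions that the task leaves open: the study compares two independent groups of equal size, "medium effect size" is read as Cohen's d = 0.5, and the test is two-sided.

```python
from math import ceil

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Leaving nobs1 unspecified makes solve_power solve for the sample size
# of the first group.
n_per_group = analysis.solve_power(
    effect_size=0.5,          # assumption: Cohen's d = 0.5 as "medium"
    alpha=0.05,               # significance level
    power=0.8,                # target power
    ratio=1.0,                # assumption: equally sized groups
    alternative="two-sided",  # assumption: two-sided test
)

# Round up, since a fractional participant is not possible.
required_n = ceil(float(n_per_group))
print(f"Required sample size per group: {required_n}")
```

Since the hypothesis is directional (the method is expected to improve endurance), `alternative="larger"` would also be defensible and would lead to a smaller required sample size.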