**Exercise 1:**

Problem: You are about to flip a coin five times. The probability of getting heads (success) is 0.5. What is the probability of getting 3 heads?

In [1]:
from scipy.stats import binom

n = 5  # number of trials
p = 0.5  # probability of success
k = 3  # number of successes

# calculate binomial distribution probability
prob = binom.pmf(k, n, p)
print("The probability of getting 3 heads is", prob)


The probability of getting 3 heads is 0.3124999999999998


**Exercise 2:**

Problem: A manufacturer knows that 10% of his products are defective. He sells products in boxes of 20. What is the probability that a box will contain exactly 2 defective products?

In [2]:
from scipy.stats import binom

n = 20  # number of trials
p = 0.1  # probability of success (defective is considered 'success' in this context)
k = 2  # number of successes

# calculate binomial distribution probability
prob = binom.pmf(k, n, p)
print("The probability that a box will contain exactly 2 defective products is", prob)


The probability that a box will contain exactly 2 defective products is 0.28517980706429846


***Exercise 3:***

Problem: A multiple-choice quiz contains 10 questions. Each question has four possible answers, of which one is correct. If a student guesses the answer to each question at random, what is the probability that the student will answer exactly 4 questions correctly?

In [3]:
from scipy.stats import binom

n = 10  # number of trials
p = 0.25  # probability of success (guessing correctly)
k = 4  # number of successes

# calculate binomial distribution probability
prob = binom.pmf(k, n, p)
print("The probability that the student will answer exactly 4 questions correctly is", prob)


The probability that the student will answer exactly 4 questions correctly is 0.14599800109863273


**Exercise 4:**
    
Problem: A fast food restaurant serves an average of 10 customers every 15 minutes. What is the probability of serving exactly 7 customers in a 15 minute interval?

In [4]:
from scipy.stats import poisson

mu = 10  # average rate of success
k = 7  # number of successes

# calculate Poisson distribution probability
prob = poisson.pmf(k, mu)
print("The probability of serving exactly 7 customers in 15 minutes is", prob)


The probability of serving exactly 7 customers in 15 minutes is 0.090079225719216


**Exercise 5:**
    
Problem: An IQ test is scored such that the mean score is 100 and the standard deviation is 15. What is the probability of a person scoring higher than 130?

In [5]:
from scipy.stats import norm

mu = 100  # mean
sigma = 15  # standard deviation
x = 130  # score

# calculate standard normal distribution probability
prob = 1 - norm.cdf(x, mu, sigma)
print("The probability of a score being higher than 130 is", prob)


The probability of a score being higher than 130 is 0.02275013194817921


**Exercise 6:**
    
Problem: A sample of 20 students' test scores has a mean of 76 and a standard deviation of 10. What is the probability of a student scoring less than 70?

In [1]:
from scipy.stats import t
import numpy as np

# Given values
n = 20  # Sample size
mean = 76  # Sample mean
std_dev = 10  # Sample standard deviation
score = 70  # Score to find the probability for

# Calculate the t-score
t_score = (score - mean) / (std_dev / np.sqrt(n))

# Degrees of freedom
df = n - 1

# Calculate the probability using the CDF of the t-distribution
probability = t.cdf(t_score, df)

print(f"The probability of a student scoring less than 70 is {probability:.4f}")


The probability of a student scoring less than 70 is 0.0074


**Extra-exercises: Relevant use cases are in context of hypotheses and model estimation**

**Exercise 7:**

Suppose the variance of a sample of 20 observations is 5. What is the probability that the sample variance is greater than 6?

In [7]:
from scipy.stats import chi2

df = 19  # degrees of freedom
sample_variance = 5
x = df * 6 / sample_variance  # chi-square value

# calculate Chi-square distribution probability
prob = 1 - chi2.cdf(x, df)
print("The probability that the sample variance is greater than 6 is", prob)


The probability that the sample variance is greater than 6 is 0.2462658529094548


**Exercise 8:**

If two groups of data are sampled from normal distributions with the same variance, the ratio of their sample variances will follow an F-distribution. Suppose you have two samples of sizes 15 and 20 with variances 4 and 2, respectively. What is the probability that the ratio of these sample variances is less than 1?

In [8]:
from scipy.stats import f

df1 = 14  # degrees of freedom for the first sample
df2 = 19  # degrees of freedom for the second sample
F = 4 / 2  # F value

# calculate F-distribution probability
prob = f.cdf(F, df1, df2)
print("The probability that the ratio of the sample variances is less than 1 is", prob)


The probability that the ratio of the sample variances is less than 1 is 0.9201831653918014


In [2]:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Generate sample data for 10 students
np.random.seed(42)  # for reproducibility
scores = np.random.normal(75, 10, 10).round().astype(int)

# Calculate sample statistics
sample_mean = np.mean(scores)
sample_std = np.std(scores, ddof=1)  # ddof=1 for sample standard deviation
n = len(scores)

print(f"Student Scores: {scores}")
print(f"Sample Mean: {sample_mean:.2f}")
print(f"Sample Standard Deviation: {sample_std:.2f}")

# Calculate the t-statistic for a score of 85
score_to_check = 85
t_stat = (score_to_check - sample_mean) / (sample_std / np.sqrt(n))

# Degrees of freedom
df = n - 1

# Calculate the probability of scoring above 85
p_value = 1 - stats.t.cdf(t_stat, df)

print(f"\nProbability of scoring above {score_to_check}: {p_value:.4f}")

# Calculate 95% confidence interval for the population mean
conf_interval = stats.t.interval(alpha=0.95, df=df, loc=sample_mean, scale=sample_std/np.sqrt(n))
print(f"\n95% Confidence Interval for population mean: {conf_interval}")

# Perform one-sample t-test to check if the mean score is significantly different from 70
t_statistic, p_value = stats.ttest_1samp(scores, 70)
print(f"\nOne-sample t-test (H0: mean = 70):")
print(f"t-statistic: {t_statistic:.4f}")
print(f"p-value: {p_value:.4f}")

# Plot the t-distribution and highlight the observed t-statistic
x = np.linspace(stats.t.ppf(0.001, df), stats.t.ppf(0.999, df), 100)
plt.figure(figsize=(10, 6))
plt.plot(x, stats.t.pdf(x, df), 'b-', lw=2, label='t-distribution')
plt.fill_between(x[x>t_stat], stats.t.pdf(x[x>t_stat], df), color='red', alpha=0.3, label='p-value region')
plt.axvline(t_stat, color='r', linestyle='--', label=f't-statistic ({t_stat:.2f})')
plt.title(f"T-Distribution (df={df}) with Observed t-statistic")
plt.xlabel('t-value')
plt.ylabel('Probability Density')
plt.legend()
plt.grid(True)
plt.show()

Student Scores: [80 74 81 90 73 73 91 83 70 80]
Sample Mean: 79.50
Sample Standard Deviation: 7.17

Probability of scoring above 85: 0.0191


TypeError: rv_generic.interval() missing 1 required positional argument: 'confidence'

In [3]:
from scipy import stats
import numpy as np

# Given values
n = 20  # Sample size
sample_mean = 76  # Sample mean
sample_std = 10  # Sample standard deviation
score_threshold = 70  # Score to find the probability for

# Calculate degrees of freedom
df = n - 1

# Calculate the t-score
t_score = (score_threshold - sample_mean) / (sample_std / np.sqrt(n))

# Calculate the probability using the cumulative distribution function (CDF) of the t-distribution
probability = stats.t.cdf(t_score, df)

print(f"Sample size: {n}")
print(f"Sample mean: {sample_mean}")
print(f"Sample standard deviation: {sample_std}")
print(f"Score threshold: {score_threshold}")
print(f"Degrees of freedom: {df}")
print(f"T-score: {t_score:.4f}")
print(f"The probability of a student scoring less than {score_threshold} is {probability:.4f}")
print(f"This is equivalent to {probability * 100:.2f}%")

Sample size: 20
Sample mean: 76
Sample standard deviation: 10
Score threshold: 70
Degrees of freedom: 19
T-score: -2.6833
The probability of a student scoring less than 70 is 0.0074
This is equivalent to 0.74%
