**Homework 1**

You are an Analyst working for a high-end clothes design boutique.

The company is developing a new line of clothes for very tall people. Your team is analyzing the viability of the project from a sales perspective, and your manager has asked you to assist with some input variables to help test the financial forecast.

You need to create two distributions:
- A normal distribution of 1000 observations for heights of men in the US.
- A normal distribution of 1000 observations for heights of women in the US.

Also, for each of the populations, you have been asked to identify the minimum height of the tallest 2.2% of people in that population.

In the US, men's heights have a mean of 69.1 inches (175.5 cm) and a standard deviation of 2.9 inches (7.4 cm), while women's heights have a mean of 63.7 inches (161.8 cm) and a standard deviation of 2.7 inches (6.9 cm).

In [None]:
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

mean_men, std_men = 69.1, 2.9  # Men's height (inches)
mean_women, std_women = 63.7, 2.7  # Women's height (inches)

np.random.seed(42)
heights_men = np.random.normal(mean_men, std_men, 1000)
heights_women = np.random.normal(mean_women, std_women, 1000)

# Cutoff for the tallest 2.2%: the 97.8th percentile of each normal distribution
cutoff_men = stats.norm.ppf(0.978, mean_men, std_men)
cutoff_women = stats.norm.ppf(0.978, mean_women, std_women)

print(f"Minimum height for the tallest 2.2% of men: {cutoff_men:.2f} inches")
print(f"Minimum height for the tallest 2.2% of women: {cutoff_women:.2f} inches")

# Visualization
plt.figure(figsize=(12, 5))

# Histogram for men
plt.hist(heights_men, bins=30, alpha=0.6, color='blue', label="Men's Heights")
plt.axvline(cutoff_men, color='blue', linestyle='dashed', linewidth=2, label=f"97.8th Percentile (Men) = {cutoff_men:.2f} in")

# Histogram for women
plt.hist(heights_women, bins=30, alpha=0.6, color='pink', label="Women's Heights")
plt.axvline(cutoff_women, color='red', linestyle='dashed', linewidth=2, label=f"97.8th Percentile (Women) = {cutoff_women:.2f} in")

plt.title("Height Distributions of Men and Women in the US")
plt.xlabel("Height (inches)")
plt.ylabel("Frequency")
plt.legend()
plt.show()
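
The cutoffs above come from the theoretical normal distribution, not from the 1000 simulated observations. As a rough sanity check, the sketch below compares each theoretical cutoff with the empirical 97.8th percentile of a simulated sample and reports the share of the sample above the cutoff. The `results` dictionary and the use of `np.random.default_rng` are illustrative choices, not part of the original solution, so the draws differ from those produced by `np.random.seed(42)`.

```python
import numpy as np
import scipy.stats as stats

mean_men, std_men = 69.1, 2.9        # Men's height (inches)
mean_women, std_women = 63.7, 2.7    # Women's height (inches)

rng = np.random.default_rng(42)      # assumption: any fixed seed would do
samples = {
    "men": (rng.normal(mean_men, std_men, 1000), mean_men, std_men),
    "women": (rng.normal(mean_women, std_women, 1000), mean_women, std_women),
}

results = {}  # hypothetical container: name -> (theoretical, empirical, share above)
for name, (sample, mu, sigma) in samples.items():
    theoretical = stats.norm.ppf(0.978, mu, sigma)   # cutoff from the normal model
    empirical = np.quantile(sample, 0.978)           # cutoff from the simulated data
    share_above = np.mean(sample > theoretical)      # should be close to 0.022
    results[name] = (theoretical, empirical, share_above)
    print(f"{name}: theoretical {theoretical:.2f} in, "
          f"empirical {empirical:.2f} in, share above {share_above:.3f}")
```

With only 1000 observations, the empirical percentile and the share above the cutoff will fluctuate around the theoretical values rather than match them exactly.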

**Homework 2**

You are a Quant working for a Wall Street investment fund.

The team of traders under your supervision earns profits per trade that can be approximated with a Laplace distribution. Per-trade profits have a mean of $95.70 and a standard deviation of $1,247.
Your team makes about 100 trades every week.

Questions:
A. What is the probability of your team making a loss in any given week?
B. What is the probability of your team making over $20,000 in any given week?

In [None]:
import scipy.stats as stats

mu_trade = 95.70  # Mean profit per trade
sigma_trade = 1247  # Std dev per trade
n_trades = 100  # Weekly trades

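# Weekly profit is the sum of 100 trades (assumed independent). By the Central Limit
# Theorem, the sum is approximately normal even though each trade is Laplace-distributed.
mu_weekly = n_trades * mu_trade
sigma_weekly = (n_trades**0.5) * sigma_trade  # Std dev of a sum of independent trades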

# Probability of making a loss
z_loss = (0 - mu_weekly) / sigma_weekly
p_loss = stats.norm.cdf(z_loss)

# Probability of making over $20,000
z_profit = (20000 - mu_weekly) / sigma_weekly
p_profit = 1 - stats.norm.cdf(z_profit)

print(f"Probability of making a loss: {p_loss:.4f} ({p_loss*100:.2f}%)")
print(f"Probability of making over $20,000: {p_profit:.4f} ({p_profit*100:.2f}%)")
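
The solution above replaces the Laplace distribution with a normal approximation of the weekly total. One way to check how reasonable that is would be a Monte Carlo simulation that draws the individual trades from a Laplace distribution directly. The sketch below assumes the trades are independent and identically distributed; `n_weeks`, the seed, and the `_sim` variable names are illustrative and not part of the original solution.

```python
import numpy as np

mu_trade = 95.70      # Mean profit per trade
sigma_trade = 1247    # Std dev per trade
n_trades = 100        # Weekly trades
n_weeks = 20_000      # assumption: number of simulated weeks

# A Laplace distribution with scale b has variance 2 * b**2,
# so matching the given standard deviation requires b = sigma / sqrt(2).
scale_trade = sigma_trade / np.sqrt(2)

rng = np.random.default_rng(0)  # assumption: fixed seed for reproducibility
trades = rng.laplace(loc=mu_trade, scale=scale_trade, size=(n_weeks, n_trades))
weekly_profit = trades.sum(axis=1)  # one total per simulated week

p_loss_sim = np.mean(weekly_profit < 0)
p_profit_sim = np.mean(weekly_profit > 20000)

print(f"Simulated probability of a weekly loss: {p_loss_sim:.4f}")
print(f"Simulated probability of making over $20,000: {p_profit_sim:.4f}")
```

If the simulated probabilities land close to the normal-approximation values, that supports using the Central Limit Theorem for 100 trades per week.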

**Homework 3**

You are a consultant engaged by a factory that manufactures spoons.

The factory executives recently spent $10,000,000 upgrading equipment and processes in order to combat excessively high defects in manufacturing (23%) which were leading to high return rates from clients.

You have been asked to prove (with a confidence level of 95%) that the new equipment has improved the situation and that the share of defective spoons has decreased to under 18%. You have been supplied with a random sample of 150 spoons and found that 23 spoons have defects.

In [None]:
import scipy.stats as stats
import numpy as np

p0 = 0.18  # Null hypothesis defect rate
n = 150  # Sample size
x = 23  # Defective spoons
p_hat = x / n  # Sample proportion

# standard error
se = np.sqrt(p0 * (1 - p0) / n)

# Z-score
z = (p_hat - p0) / se

# Left-tailed p-value for H0: p >= 0.18 versus H1: p < 0.18
p_value = stats.norm.cdf(z)

print(f"Sample Proportion: {p_hat:.4f}")
print(f"Z-score: {z:.4f}")
print(f"P-value: {p_value:.4f}")

# Decision
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: The defect rate has significantly decreased.")
else:
    print("Fail to reject the null hypothesis: No strong evidence that the defect rate is below 18%.")
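
The z-test above relies on a normal approximation to the binomial distribution. As a cross-check, the same left-tailed hypothesis can be evaluated with the exact binomial distribution. The sketch below uses `scipy.stats.binom.cdf`; the names `p_value_exact` and `reject_null` are illustrative and not part of the original solution.

```python
import scipy.stats as stats

p0 = 0.18   # Null hypothesis defect rate
n = 150     # Sample size
x = 23      # Defective spoons

# Exact left-tailed p-value: P(X <= 23) when X ~ Binomial(150, 0.18)
p_value_exact = stats.binom.cdf(x, n, p0)

alpha = 0.05
reject_null = p_value_exact < alpha

print(f"Exact binomial p-value: {p_value_exact:.4f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

The exact p-value will differ somewhat from the z-test value because the approximation ignores the discreteness of the count, but both tests should point to the same decision at the 5% level.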
