# Bootstrap Sampling Activity

In this notebook, you will explore how bootstrap sampling can be used to estimate the variability of a statistic and to build a confidence interval for it.

### Step 1: Start with a Dataset

We have a dataset of 10 measurements, one per person; the values (150 to 180) are consistent with heights in centimeters. Let's visualize the dataset.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Original dataset: 10 observations
data = [150, 160, 165, 170, 155, 180, 175, 162, 158, 168]

# Histogram of the raw values
plt.hist(data, bins=5, alpha=0.7, color='blue', label='Original Dataset')
plt.title('Dataset Distribution')
plt.xlabel('Value')
plt.ylabel('Count')
plt.legend()
plt.show()

### Step 2: Perform Bootstrap Sampling

Generate 1000 bootstrap samples by randomly selecting 10 values (with replacement) from the dataset, then calculate the mean of each sample.

In [None]:
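# Bootstrap sampling: draw resamples of the same size as the data,
# with replacement, and record the mean of each one
n_resamples = 1000
bootstrap_means = [
    np.mean(np.random.choice(data, size=len(data), replace=True))
    for _ in range(n_resamples)
]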

# Visualize the bootstrap means
plt.hist(bootstrap_means, bins=30, alpha=0.7, color='green', label='Bootstrap Means')
plt.title('Bootstrap Sampling Distribution')
plt.legend()
plt.show()
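
The resampling step above can also be written in vectorized form, which makes it easy to summarize the spread of the bootstrap means. The following is a minimal sketch, not part of the original activity: the seed value and the names `rng`, `idx`, `boot_means`, `boot_se`, and `formula_se` are illustrative assumptions. It compares the standard deviation of the bootstrap means with the textbook standard error of the mean, s / sqrt(n).

```python
import numpy as np

data = np.array([150, 160, 165, 170, 155, 180, 175, 162, 158, 168])
n_resamples = 1000  # same number of resamples as in the cell above

# Seeded generator so the run is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)

# Draw all resamples at once: each row holds len(data) indices
# sampled with replacement from 0 .. len(data) - 1.
idx = rng.integers(0, len(data), size=(n_resamples, len(data)))
boot_means = data[idx].mean(axis=1)

# Spread of the bootstrap means vs. the formula-based standard error.
boot_se = boot_means.std(ddof=1)
formula_se = data.std(ddof=1) / np.sqrt(len(data))

print(f"Sample mean:            {data.mean():.2f}")
print(f"Mean of bootstrap means: {boot_means.mean():.2f}")
print(f"Bootstrap SE of mean:   {boot_se:.2f}")
print(f"Formula SE (s/sqrt(n)): {formula_se:.2f}")
```

The two standard errors should come out close to each other, with the bootstrap value typically a little smaller for a sample this small.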

### Step 3: Compute Confidence Intervals

Calculate a 95% confidence interval for the mean with the percentile method: take the 2.5th and 97.5th percentiles of the bootstrap means.

In [None]:
# Percentile interval: these two cut-offs bracket the middle 95% of the bootstrap means
lower = np.percentile(bootstrap_means, 2.5)
upper = np.percentile(bootstrap_means, 97.5)

print(f'95% Confidence Interval for the Mean: ({lower:.2f}, {upper:.2f})')
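
The same percentile calculation can be wrapped in a small reusable function so that other statistics and confidence levels can be tried. This is a sketch under stated assumptions: `bootstrap_ci` and its parameters (`stat`, `n_resamples`, `level`, `seed`) are hypothetical names introduced here for illustration, not part of NumPy or of the original notebook.

```python
import numpy as np

def bootstrap_ci(values, stat=np.mean, n_resamples=1000, level=0.95, seed=None):
    """Percentile bootstrap confidence interval for `stat` (hypothetical helper)."""
    values = np.asarray(values)
    rng = np.random.default_rng(seed)
    # Recompute the statistic on each resample drawn with replacement.
    stats = np.array([
        stat(rng.choice(values, size=len(values), replace=True))
        for _ in range(n_resamples)
    ])
    # For level=0.95 the cut-offs are the 2.5th and 97.5th percentiles.
    tail = (1 - level) / 2 * 100
    lower, upper = np.percentile(stats, [tail, 100 - tail])
    return lower, upper

data = [150, 160, 165, 170, 155, 180, 175, 162, 158, 168]

mean_ci = bootstrap_ci(data, seed=0)
median_ci = bootstrap_ci(data, stat=np.median, seed=0)
wide_ci = bootstrap_ci(data, level=0.99, seed=0)

print(f"95% CI for the mean:   ({mean_ci[0]:.2f}, {mean_ci[1]:.2f})")
print(f"95% CI for the median: ({median_ci[0]:.2f}, {median_ci[1]:.2f})")
print(f"99% CI for the mean:   ({wide_ci[0]:.2f}, {wide_ci[1]:.2f})")
```

Passing the same `seed` reuses the same resamples, so the 99% interval for the mean always contains the 95% interval.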

### Questions for Discussion:

1. How does the distribution of bootstrap means compare to the original dataset?
2. What does the 95% confidence interval tell us about the population mean?
3. Why is replacement important in bootstrap sampling?
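
To explore question 3 empirically, one option is to repeat the resampling with and without replacement and compare how much the resulting means vary. The sketch below is an illustration only; the helper `resample_means` and the seed are assumptions introduced here.

```python
import numpy as np

data = np.array([150, 160, 165, 170, 155, 180, 175, 162, 158, 168])
rng = np.random.default_rng(0)  # arbitrary seed for reproducibility

def resample_means(replace, n_resamples=1000):
    """Means of resamples of size len(data), drawn with or without replacement."""
    return np.array([
        rng.choice(data, size=len(data), replace=replace).mean()
        for _ in range(n_resamples)
    ])

with_repl = resample_means(replace=True)
without_repl = resample_means(replace=False)

print(f"Std of means, with replacement:    {with_repl.std():.3f}")
print(f"Std of means, without replacement: {without_repl.std():.3f}")
```

Drawing 10 values without replacement from 10 observations only reshuffles the original data, so every resample has the same mean and the spread collapses to zero.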