### EXERCISE 1: Permutation Test
Perform a permutation test to determine if there is a significant difference between the means of two samples. Use the following sample data:

Sample 1: [14, 15, 16, 19, 22, 24]
Sample 2: [11, 12, 13, 15, 18, 21]

Steps to perform the permutation test:
1. Combine the two samples into one dataset.
2. Randomly shuffle the combined dataset and split it into two new samples of the same size as the original samples.
3. Calculate the difference in means between the two new samples.
4. Repeat steps 2 and 3 many times (e.g., 10000 permutations) to create a distribution of differences in means.
5. Calculate the p-value as the proportion of permutations whose difference in means is at least as extreme, in absolute value, as the observed difference (a two-sided test).
6. Compare the p-value to your significance level (e.g., 0.05) to determine if the difference is significant.

In [1]:
# EXERCISE 1
import numpy as np

# Sample data
sample1 = np.array([14, 15, 16, 19, 22, 24])
sample2 = np.array([11, 12, 13, 15, 18, 21])

# Observed difference in means
obs_diff = np.mean(sample1) - np.mean(sample2)

# Combine the samples
combined = np.concatenate([sample1, sample2])

# Number of permutations
n_permutations = 10000

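# Permutation test
diffs = np.zeros(n_permutations)
for i in range(n_permutations):
    # Shuffle in place; the pooled values themselves never change
    np.random.shuffle(combined)
    # The first len(sample1) values form one group, the rest form the other
    new_sample1 = combined[:len(sample1)]
    new_sample2 = combined[len(sample1):]
    diffs[i] = np.mean(new_sample1) - np.mean(new_sample2)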

# Calculate the two-sided p-value: the share of permutations at least as extreme as the observed difference
p_value = np.sum(np.abs(diffs) >= np.abs(obs_diff)) / n_permutations
print(f"Observed difference in means: {obs_diff}")
print(f"P-value: {p_value}")

# Sanity check: a p-value is a proportion, so it must lie in [0, 1]
assert 0.0 <= p_value <= 1.0, "P-value should be between 0 and 1"


Observed difference in means: 3.333333333333332
P-value: 0.1856
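
The p-value printed above comes from random shuffles, so it varies slightly from run to run. With only 12 pooled values, every possible split into two groups of 6 can be enumerated instead, which removes that Monte Carlo noise. The following is a minimal sketch of such an exact permutation test, not part of the original exercise; the helper `mean_diff_from_sum` and the names `n_splits`, `n_extreme`, and `exact_p_value` are hypothetical, introduced here for illustration.

```python
from itertools import combinations

import numpy as np

# Sample data from the exercise
sample1 = np.array([14, 15, 16, 19, 22, 24])
sample2 = np.array([11, 12, 13, 15, 18, 21])

combined = np.concatenate([sample1, sample2])
n1 = len(sample1)
n2 = len(sample2)
total = int(combined.sum())


def mean_diff_from_sum(sum1):
    """Difference in means implied by the sum of the first group (hypothetical helper)."""
    return sum1 / n1 - (total - sum1) / n2


obs_diff = mean_diff_from_sum(int(sample1.sum()))

# Enumerate every way to choose which positions form the first group
n_splits = 0
n_extreme = 0
for idx in combinations(range(len(combined)), n1):
    sum1 = int(combined[list(idx)].sum())
    n_splits += 1
    # Two-sided comparison; the small tolerance guards against floating-point ties
    if abs(mean_diff_from_sum(sum1)) >= abs(obs_diff) - 1e-12:
        n_extreme += 1

exact_p_value = n_extreme / n_splits
print(f"Number of splits: {n_splits}")
print(f"Exact p-value: {exact_p_value:.4f}")
```

The exact value should land close to the simulated one. As in step 6, a p-value above the chosen significance level (e.g., 0.05) means the difference in means is not significant at that level. Enumeration is only practical for small samples, since the number of splits grows very quickly with sample size.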


## Bootstrap

### EXERCISE 2: Bootstrap Confidence Interval
Use the bootstrap method to estimate the 95% confidence interval for the mean of a sample. Use the following sample data:

Sample: [2, 3, 5, 7, 11, 13, 17, 19]

Steps to perform the bootstrap method:
1. Generate a large number of bootstrap samples (e.g., 10000 samples) by resampling with replacement from the original sample.
2. Calculate the mean of each bootstrap sample.
3. Take the 2.5th and 97.5th percentiles of the bootstrap means as the bounds of the 95% confidence interval for the mean (the percentile method).

In [2]:
# EXERCISE 2
import numpy as np

# Sample data
sample = np.array([2, 3, 5, 7, 11, 13, 17, 19])

# Number of bootstrap samples
n_bootstrap_samples = 10000

# Bootstrap method
bootstrap_means = np.zeros(n_bootstrap_samples)
for i in range(n_bootstrap_samples):
    bootstrap_sample = np.random.choice(sample, size=len(sample), replace=True)
    bootstrap_means[i] = np.mean(bootstrap_sample)

# Calculate the 95% confidence interval
confidence_interval = np.percentile(bootstrap_means, [2.5, 97.5])
print(f"95% confidence interval for the mean: {confidence_interval}")

# Assert statement to check if the confidence interval is reasonable
assert confidence_interval[0] < np.mean(sample) < confidence_interval[1], "Mean should be within the confidence interval"


95% confidence interval for the mean: [ 5.5  13.75]
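
The loop above can also be written without an explicit Python loop. The sketch below draws all resamples at once as a 2-D array and uses a seeded generator so the interval is reproducible. It is one possible variant, not the exercise's required solution; the seed value `0` and the names `rng`, `resamples`, `ci_lower`, and `ci_upper` are assumptions made for illustration.

```python
import numpy as np

# Sample data from the exercise
sample = np.array([2, 3, 5, 7, 11, 13, 17, 19])
n_bootstrap_samples = 10000

# Seeded generator; the seed value 0 is an arbitrary choice for this sketch
rng = np.random.default_rng(0)

# Draw all resamples at once: one row per bootstrap sample
resamples = rng.choice(sample, size=(n_bootstrap_samples, len(sample)), replace=True)

# Mean of each row gives one bootstrap mean per resample
bootstrap_means = resamples.mean(axis=1)

# Percentile method: the middle 95% of the bootstrap means
ci_lower, ci_upper = np.percentile(bootstrap_means, [2.5, 97.5])
print(f"95% confidence interval for the mean: [{ci_lower:.3f}, {ci_upper:.3f}]")
```

Because the generator is seeded, rerunning the cell gives the same interval. The bounds should be close to, but not necessarily identical to, those printed by the loop version, since the two draw different random resamples.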
