### Hypothesis Testing: Definition, Steps, Types, and Examples

**Tutorial 6.1: Illustration of hypothesis testing using the example ‘men are taller than women on average’**

In [1]:
# import the scipy.stats library
import scipy.stats as stats
# define the significance level
# alpha = 0.05, which means there is a 5% chance of making a type I error (rejecting the null hypothesis when it is true)
alpha = 0.05
# generate some random data for men and women heights (in cm)
# you can replace this with your own data
men_heights = stats.norm.rvs(
    loc=175, scale=10, size=100)  # mean = 175, std = 10
women_heights = stats.norm.rvs(
    loc=165, scale=8, size=100)  # mean = 165, std = 8
# calculate the sample means and standard deviations
men_mean = men_heights.mean()
men_std = men_heights.std()
women_mean = women_heights.mean()
women_std = women_heights.std()
# print the sample statistics
print("Men: mean = {:.2f}, std = {:.2f}".format(men_mean, men_std))
print("Women: mean = {:.2f}, std = {:.2f}".format(women_mean, women_std))
# perform a two-sample t-test
# the null hypothesis is that the population means are equal
# the alternative hypothesis is that the population means are not equal
t_stat, p_value = stats.ttest_ind(men_heights, women_heights)
# print the test statistic and the p-value
print("t-statistic = {:.2f}".format(t_stat))
print("p-value = {:.4f}".format(p_value))
# compare the p-value with the significance level and make a decision
if p_value <= alpha:
    print("Reject the null hypothesis: the population means are not equal.")
else:
    print("Fail to reject the null hypothesis: there is not enough evidence that the population means differ.")

Men: mean = 175.50, std = 10.35
Women: mean = 164.90, std = 7.37
t-statistic = 8.30
p-value = 0.0000
Reject the null hypothesis: the population means are not equal.
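
The tutorial title makes a directional claim (men are taller), while the cell above runs a two-sided test. A minimal sketch of the directional version follows. It assumes SciPy 1.6 or later (for the `alternative` argument), uses an arbitrarily chosen seed so the simulated data is reproducible, and swaps in Welch's variant (`equal_var=False`) because the two groups are simulated with different standard deviations.

```python
import numpy as np
import scipy.stats as stats

# Assumption: the seed value 0 is arbitrary; it only makes the run reproducible
rng = np.random.default_rng(0)
men_heights = rng.normal(loc=175, scale=10, size=100)    # mean = 175, std = 10
women_heights = rng.normal(loc=165, scale=8, size=100)   # mean = 165, std = 8

# One-sided Welch t-test
# H0: mean(men) <= mean(women); H1: mean(men) > mean(women)
t_stat, p_value = stats.ttest_ind(
    men_heights, women_heights, equal_var=False, alternative="greater")
print("t-statistic = {:.2f}".format(t_stat))
print("p-value = {:.4g}".format(p_value))
```

With a one-sided alternative, a small p-value supports the directional statement itself rather than only "the means differ".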


**Tutorial 6.2: Illustration of hypothesis testing using the `jar of candies` example**

In [14]:
# import the scipy.stats library
import scipy.stats as stats
# define the significance level
alpha = 0.05
# generate some random data for the number of red and blue candies in a handful
# you can replace this with your own data
n = 20  # number of trials (candies)
p = 0.5  # probability of success (red candy)
red_candies = stats.binom.rvs(n, p)  # number of red candies
blue_candies = n - red_candies  # number of blue candies
# print the sample data
print("Red candies: {}".format(red_candies))
print("Blue candies: {}".format(blue_candies))
# perform a binomial test
# the null hypothesis is that the probability of success is 0.5
# the alternative hypothesis is that the probability of success is not 0.5
p_value = stats.binomtest(red_candies, n, p, alternative='two-sided')
# print the p-value
print("p-value = {:.4f}".format(p_value.pvalue))
# compare the p-value with the significance level and make a decision
if p_value.pvalue <= alpha:
    print("Reject the null hypothesis: the probability of success is not 0.5.")
else:
    print("Fail to reject the null hypothesis: the probability of success is 0.5.")

Red candies: 6
Blue candies: 14
p-value = 0.1153
Fail to reject the null hypothesis: the probability of success is 0.5.
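
To see where the p-value of 0.1153 comes from, the two-sided binomial p-value can be rebuilt from the cumulative distribution function. This is a sketch for the specific run shown above (6 red candies out of 20); the shortcut of doubling one tail is valid here only because the null distribution with p = 0.5 is symmetric.

```python
import scipy.stats as stats

n, p = 20, 0.5
red_candies = 6  # the count observed in the run above

# With p = 0.5 the binomial distribution is symmetric, so the two-sided
# p-value is twice the probability of the observed tail, P(X <= 6)
tail = stats.binom.cdf(red_candies, n, p)
p_manual = 2 * tail

# The same quantity as reported by binomtest
p_scipy = stats.binomtest(red_candies, n, p, alternative="two-sided").pvalue

print("manual p-value    = {:.4f}".format(p_manual))
print("binomtest p-value = {:.4f}".format(p_scipy))
```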


**Tutorial 6.3: One-sided and two-sided tests**

In [1]:
# Import the scipy.stats module
import scipy.stats as stats
# Define the scores of both classes as lists
class1 = [80, 85, 90, 95, 100, 105, 110, 115, 120, 125]
class2 = [75, 80, 85, 90, 95, 100, 105, 110, 115, 120]
# Perform a one-sided test to see if class1 is smarter than class2
# The null hypothesis is that the mean of class1 is less than or equal to the mean of class2
# The alternative hypothesis is that the mean of class1 is greater than the mean of class2
t_stat, p_value = stats.ttest_ind(class1, class2, alternative='greater')
print('One-sided test results:')
print('t-statistic:', t_stat)
print('p-value:', p_value)
# Compare the p-value with the significance level
if p_value < 0.05:
    print('We reject the null hypothesis and conclude that class1 is smarter than class2.')
else:
    print('We fail to reject the null hypothesis and cannot conclude that class1 is smarter than class2.')

One-sided test results:
t-statistic: 0.7385489458759964
p-value: 0.23485103640040045
We fail to reject the null hypothesis and cannot conclude that class1 is smarter than class2.


**Tutorial 6.4: An illustration of two-sided testing using the example ‘My class (Class 1) and the other class are different in smartness’ from the previous tutorial**

In [4]:
# Import the scipy.stats module
import scipy.stats as stats
# Define the scores of both classes as lists
class1 = [80, 85, 90, 95, 100, 105, 110, 115, 120, 125]
class2 = [75, 80, 85, 90, 95, 100, 105, 110, 115, 120]
# Perform a two-sided test to see if class1 and class2 are different in smartness
# The null hypothesis is that the mean of class1 is equal to the mean of class2
# The alternative hypothesis is that the mean of class1 is not equal to the mean of class2
t_stat, p_value = stats.ttest_ind(class1, class2, alternative='two-sided')
print('Two-sided test results:')
print('t-statistic:', t_stat)
print('p-value:', p_value)
# Compare the p-value with the significance level
if p_value < 0.05:
    print('We reject the null hypothesis and conclude that class1 and class2 are different in smartness.')
else:
    print('We fail to reject the null hypothesis and cannot conclude that class1 and class2 are different in smartness.')

Two-sided test results:
t-statistic: 0.7385489458759964
p-value: 0.4697020728008009
We fail to reject the null hypothesis and cannot conclude that class1 and class2 are different in smartness.
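
The one-sided and two-sided results above share the same t-statistic, and only the p-value changes. For a symmetric test statistic such as t, the two-sided p-value is twice the one-sided p-value when the observed effect points in the direction of the one-sided alternative. A short check on the same data:

```python
import scipy.stats as stats

class1 = [80, 85, 90, 95, 100, 105, 110, 115, 120, 125]
class2 = [75, 80, 85, 90, 95, 100, 105, 110, 115, 120]

# Same data, same t-statistic, two different alternatives
_, p_one = stats.ttest_ind(class1, class2, alternative="greater")
_, p_two = stats.ttest_ind(class1, class2, alternative="two-sided")

print("one-sided p-value:", p_one)
print("two-sided p-value:", p_two)
print("two-sided equals twice one-sided:", abs(p_two - 2 * p_one) < 1e-9)
```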


##### One-Sample and Two-Sample Tests
**Tutorial 6.5: An illustration of the one-sample testing using the example ‘My class is taller than the average height for kids my age’**

In [5]:
# Import the scipy.stats module
import scipy.stats as stats
# Define the heights of your class as a list
my_class = [150, 155, 160, 165, 170, 175, 180, 185, 190, 195]
# Perform a one-sample test to see if your class is taller than the average height for kids your age
# The null hypothesis is that the mean of your class is equal to the population mean
# The alternative hypothesis is that the mean of your class is not equal to the population mean (two-sided)
# or that the mean of your class is greater than the population mean (one-sided)
# According to the WHO, the average height for kids aged 12 years is 152.4 cm for boys and 151.3 cm for girls
# We will use the average of these two values as the population mean
pop_mean = (152.4 + 151.3) / 2
t_stat, p_value = stats.ttest_1samp(
    my_class, pop_mean, alternative='two-sided')
print('One-sample test results:')
print('t-statistic:', t_stat)
print('p-value:', p_value)
# Compare the p-value with the significance level
if p_value < 0.05:
    print('We reject the null hypothesis and conclude that your class is different in height from the average height for kids your age.')
else:
    print('We fail to reject the null hypothesis and cannot conclude that your class is different in height from the average height for kids your age.')

One-sample test results:
t-statistic: 4.313644314582188
p-value: 0.0019512458685808432
We reject the null hypothesis and conclude that your class is different in height from the average height for kids your age.
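
The one-sample t-statistic reported above follows from the formula $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$, where $\bar{x}$ is the sample mean, $\mu_0$ the hypothesized population mean, $s$ the sample standard deviation, and $n$ the sample size. The sketch below recomputes it by hand and also runs the one-sided version (`alternative='greater'`, assuming SciPy 1.6 or later), which matches the "taller than" wording of the tutorial title.

```python
import math
import statistics
import scipy.stats as stats

my_class = [150, 155, 160, 165, 170, 175, 180, 185, 190, 195]
pop_mean = (152.4 + 151.3) / 2

n = len(my_class)
sample_mean = statistics.mean(my_class)
sample_std = statistics.stdev(my_class)  # sample standard deviation (n - 1 denominator)

# t = (sample mean - hypothesized mean) / standard error of the mean
t_manual = (sample_mean - pop_mean) / (sample_std / math.sqrt(n))

# One-sided test: H1 is that the class mean is greater than the population mean
t_scipy, p_greater = stats.ttest_1samp(my_class, pop_mean, alternative="greater")

print("manual t-statistic:", t_manual)
print("scipy t-statistic: ", t_scipy)
print("one-sided p-value: ", p_greater)
```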


**Tutorial 6.6: An illustration of the two-sample testing using the example ‘My class is taller than the other class’**

In [7]:
# Import the scipy.stats module
import scipy.stats as stats
# Define the heights of your class as a list
my_class = [150, 155, 160, 165, 170, 175, 180, 185, 190, 195]
# Perform a two-sample test to see if your class is taller than the other class
# The null hypothesis is that the means of both classes are equal
# The alternative hypothesis is that the means of both classes are not equal (two-sided)
# or that the mean of your class is greater than the mean of the other class (one-sided)
# Define the heights of the other class as a list
other_class = [145, 150, 155, 160, 165, 170, 175, 180, 185, 190]
t_stat, p_value = stats.ttest_ind(
    my_class, other_class, alternative='two-sided')
print('Two-sample test results:')
print('t-statistic:', t_stat)
print('p-value:', p_value)
# Compare the p-value with the significance level
if p_value < 0.05:
    print('We reject the null hypothesis and conclude that your class and the other class are different in height.')
else:
    print('We fail to reject the null hypothesis and cannot conclude that your class and the other class are different in height.')

Two-sample test results:
t-statistic: 0.7385489458759964
p-value: 0.4697020728008009
We fail to reject the null hypothesis and cannot conclude that your class and the other class are different in height.


##### Paired and Independent Tests

**Tutorial 6.7: An illustration of the paired testing using the example ‘My happiness before and after the field trip’**

In [8]:
# We use scipy.stats.ttest_rel to perform a paired t-test
# We assume that the happiness ratings are on a scale of 1 to 10
import scipy.stats as stats
# The happiness ratings of the class before and after the field trip
before = [7, 8, 6, 9, 5, 7, 8, 6, 7, 9]
after = [8, 9, 7, 10, 6, 8, 9, 7, 8, 10]
# Perform the paired t-test
t_stat, p_value = stats.ttest_rel(before, after)
# Print the results
print("Paired t-test results:")
print("t-statistic:", t_stat)
print("p-value:", p_value)

Paired t-test results:
t-statistic: -inf
p-value: 0.0


  res = hypotest_fun_out(*samples, **kwds)
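
The `-inf` t-statistic and the warning fragment above are a consequence of the example data: every student's rating rises by exactly 1, so the paired differences have zero variance and the t-statistic divides by zero. The sketch below shows this, then reruns the test with hypothetical ratings (`after_varied` is invented for illustration and is not part of the original tutorial) whose gains vary from student to student.

```python
import numpy as np
import scipy.stats as stats

before = np.array([7, 8, 6, 9, 5, 7, 8, 6, 7, 9])
after = np.array([8, 9, 7, 10, 6, 8, 9, 7, 8, 10])

# Every difference is -1, so the standard deviation of the differences is 0
diff = before - after
print("differences:", diff)
print("std of differences:", diff.std(ddof=1))

# Hypothetical data (an assumption for illustration): gains of 0, 1, or 2 points
after_varied = np.array([8, 8, 7, 10, 7, 8, 9, 6, 9, 10])
t_stat, p_value = stats.ttest_rel(before, after_varied)
print("t-statistic:", t_stat)
print("p-value:", p_value)
```

With non-constant differences the test returns a finite statistic and a p-value that can be compared with the significance level in the usual way.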


**Tutorial 6.8: An illustration of the independent test using the example ‘My happiness and the happiness of the other class’** 

In [10]:
# We use scipy.stats.ttest_ind to perform an independent t-test
# We assume that the happiness ratings of the other class are also on a scale of 1 to 10
import scipy.stats as stats
# The happiness ratings of the other class before and after the field trip
other_before = [6, 7, 5, 8, 4, 6, 7, 5, 6, 8]
other_after = [7, 8, 6, 9, 5, 7, 8, 6, 7, 9]
# Perform the independent t-test (`after` holds our class's ratings from Tutorial 6.7)
t_stat, p_value = stats.ttest_ind(after, other_after)
# Print the results
print("Independent t-test results:")
print("t-statistic:", t_stat)
print("p-value:", p_value)

Independent t-test results:
t-statistic: 1.698415551216892
p-value: 0.10664842826837892


##### Parametric and Non-parametric Tests

**Tutorial 6.9: An illustration of the parametric test**

In [11]:
# We use scipy.stats.ttest_ind to perform a parametric t-test
# We assume that the data follows a normal distribution
import scipy.stats as stats
# The number of students who like chocolate and vanilla ice cream
chocolate = [25, 27, 29, 28, 26, 30, 31, 24, 27, 29]
vanilla = [22, 23, 21, 24, 25, 26, 20, 19, 23, 22]
# Perform the parametric t-test
t_stat, p_value = stats.ttest_ind(chocolate, vanilla)
# Print the results
print("Parametric t-test results:")
print("t-statistic:", t_stat)
print("p-value:", p_value)

Parametric t-test results:
t-statistic: 5.190169516378603
p-value: 6.162927154861931e-05


**Tutorial 6.10: An illustration of the nonparametric test**

In [12]:
# We use scipy.stats.mannwhitneyu to perform a nonparametric Mann-Whitney U test
# We do not assume any distribution for the data
import scipy.stats as stats
# The number of students who like chocolate and vanilla ice cream
chocolate = [25, 27, 29, 28, 26, 30, 31, 24, 27, 29]
vanilla = [22, 23, 21, 24, 25, 26, 20, 19, 23, 22]
# Perform the nonparametric Mann-Whitney U test
u_stat, p_value = stats.mannwhitneyu(chocolate, vanilla)
# Print the results
print("Nonparametric Mann-Whitney U test results:")
print("U-statistic:", u_stat)
print("p-value:", p_value)

Nonparametric Mann-Whitney U test results:
U-statistic: 95.5
p-value: 0.0006480405677249192
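
The parametric tutorial assumes normally distributed data, while the Mann-Whitney U test does not. One common way to inform that choice is a normality check such as the Shapiro-Wilk test. The sketch below is one possible approach rather than a rule: with only 10 observations per group the test has little power, so a large p-value is weak evidence of normality.

```python
import scipy.stats as stats

chocolate = [25, 27, 29, 28, 26, 30, 31, 24, 27, 29]
vanilla = [22, 23, 21, 24, 25, 26, 20, 19, 23, 22]

# Shapiro-Wilk test: H0 is that the sample comes from a normal distribution
normality = {}
for name, sample in [("chocolate", chocolate), ("vanilla", vanilla)]:
    w_stat, p = stats.shapiro(sample)
    normality[name] = p
    print("{}: W = {:.3f}, p-value = {:.3f}".format(name, w_stat, p))

# A small p-value (below the chosen alpha) would argue for a nonparametric test
alpha = 0.05
use_nonparametric = any(p < alpha for p in normality.values())
print("Prefer a nonparametric test:", use_nonparametric)
```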


#### Significance Tests: Definition, Interpretation, Calculation, and Examples

**Tutorial 6.11: An illustration of the significance testing based on coin toss example**

In [1]:
# Import the binomtest function from scipy.stats
from scipy.stats import binomtest
# Ask the user to input the number of correct guesses by their friend
correct = int(
    input("How many correct guesses did your friend make out of 10 coin tosses? "))
# Calculate the p-value using the binomtest function
# The arguments are: number of successes, number of trials, probability of success, alternative hypothesis
p_value = binomtest(correct, 10, 0.5, alternative="greater")
# Print the p-value
print("p-value = {:.4f}".format(p_value.pvalue))
# Compare the p-value with the cutoff of 0.05
if p_value.pvalue < 0.05:
    # If the p-value is less than 0.05, reject the claim that the coin is fair and the friend is guessing
    print("This result is statistically significant. We reject the claim that the coin is fair and the friend is guessing.")
else:
    # If the p-value is 0.05 or greater, do not reject the claim that the coin is fair and the friend is guessing
    print("This result is not statistically significant. We do not reject the claim that the coin is fair and the friend is guessing.")

How many correct guesses did your friend make out of 10 coin tosses?  2


p-value = 0.9893
This result is not statistically significant. We do not reject the claim that the coin is fair and the friend is guessing.
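
Because the cell above reads its input interactively, it is awkward to rerun for several values. A small non-interactive sketch follows; the helper name `guess_p_value` is hypothetical and introduced only for illustration.

```python
from scipy.stats import binomtest


def guess_p_value(correct, tosses=10):
    """Return the one-sided p-value for `correct` right guesses out of `tosses`.

    H0: the friend is guessing (success probability 0.5).
    H1: the friend guesses correctly more often than chance.
    """
    return binomtest(correct, tosses, 0.5, alternative="greater").pvalue


# Compare a few possible outcomes against the 0.05 cutoff
for correct in (2, 8, 9):
    p = guess_p_value(correct)
    verdict = "significant" if p < 0.05 else "not significant"
    print("{} correct: p-value = {:.4f} ({})".format(correct, p, verdict))
```

The value for 2 correct guesses reproduces the 0.9893 shown above.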


**Tutorial 6.12: An illustration of the significance testing based on candy and smartness example**

In [16]:
# Import the ttest_rel function from scipy.stats
from scipy.stats import ttest_rel
# Define the IQ scores of the candy group before and after the treatment
candy_before = [100, 105, 110, 115, 120, 125, 130, 135, 140]
candy_after = [104, 105, 110, 120, 123, 125, 135, 135, 144]
# Define the IQ scores of the placebo group before and after the treatment
placebo_before = [101, 106, 111, 116, 121, 126, 131, 136, 141]
placebo_after = [100, 104, 109, 113, 117, 121, 125, 129, 133]
# Calculate the difference in IQ scores for each group
candy_diff = [candy_after[i] - candy_before[i] for i in range(9)]
placebo_diff = [placebo_after[i] - placebo_before[i] for i in range(9)]
# Perform a paired t-test on the difference scores
# The null hypothesis is that the mean difference is zero
# The alternative hypothesis is that the mean difference is positive
t_stat, p_value = ttest_rel(candy_diff, placebo_diff, alternative="greater")
# Print the test statistic and the p-value
print(f"The test statistic is {t_stat:.4f}")
print(f"The p-value is {p_value:.4f}")
# Compare the p-value with the significance level of 0.05
if p_value < 0.05:
    # If the p-value is less than 0.05, reject the null hypothesis and accept the alternative hypothesis
    print("This result is statistically significant. We reject the null hypothesis and accept the alternative hypothesis.")
    print("We conclude that the candy makes the children smarter.")
else:
    # If the p-value is greater than 0.05, do not reject the null hypothesis and do not accept the alternative hypothesis
    print("This result is not statistically significant. We do not reject the null hypothesis and do not accept the alternative hypothesis.")
    print("We cannot conclude that the candy makes the children smarter.")

The test statistic is 5.6127
The p-value is 0.0003
This result is statistically significant. We reject the null hypothesis and accept the alternative hypothesis.
We conclude that the candy makes the children smarter.
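
The cell above pairs the candy group's difference scores with the placebo group's. If the two groups consist of different children, which the example does not state explicitly, an independent two-sample test on the difference scores is an alternative worth considering. The sketch below makes that assumption and swaps in Welch's t-test (`equal_var=False`); it is not the original tutorial's method.

```python
from scipy.stats import ttest_ind

candy_before = [100, 105, 110, 115, 120, 125, 130, 135, 140]
candy_after = [104, 105, 110, 120, 123, 125, 135, 135, 144]
placebo_before = [101, 106, 111, 116, 121, 126, 131, 136, 141]
placebo_after = [100, 104, 109, 113, 117, 121, 125, 129, 133]

# Within-child change in IQ score for each group
candy_diff = [a - b for a, b in zip(candy_after, candy_before)]
placebo_diff = [a - b for a, b in zip(placebo_after, placebo_before)]

# Assumption: the groups are independent, so compare the mean changes with Welch's t-test
# H0: mean change (candy) <= mean change (placebo); H1: mean change (candy) is greater
t_stat, p_value = ttest_ind(
    candy_diff, placebo_diff, equal_var=False, alternative="greater")
print(f"The test statistic is {t_stat:.4f}")
print(f"The p-value is {p_value:.4g}")
```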


**Tutorial 6.13: To compute the p-value of getting 8 heads and 2 tails when a coin is flipped 10 times, with a significance level of 0.05**

In [3]:
# Import the scipy library for statistical functions
import scipy.stats as stats
# Define the parameters of the binomial distribution
n = 10  # number of flips
k = 8  # number of heads
p = 0.5  # probability of heads
# Calculate the p-value using the cumulative distribution function (cdf)
# The p-value is the probability of getting at least k heads, so we use 1 - cdf(k-1)
p_value = 1 - stats.binom.cdf(k-1, n, p)
# Print the p-value
print(f"The p-value is {p_value:.4f}")
# Compare the p-value with the significance level
alpha = 0.05  # significance level
if p_value < alpha:
    print("The result is statistically significant.")
else:
    print("The result is not statistically significant.")

The p-value is 0.0547
The result is not statistically significant.
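
The same upper-tail probability can be obtained in two other ways: the survival function `binom.sf`, which computes `1 - cdf` directly, and `binomtest` with a one-sided alternative. A brief cross-check:

```python
import scipy.stats as stats

n = 10   # number of flips
k = 8    # number of heads
p = 0.5  # probability of heads

# sf(k - 1) = P(X > k - 1) = P(X >= k)
p_sf = stats.binom.sf(k - 1, n, p)

# binomtest with alternative="greater" reports the same upper-tail probability
p_test = stats.binomtest(k, n, p, alternative="greater").pvalue

print(f"binom.sf p-value:  {p_sf:.4f}")
print(f"binomtest p-value: {p_test:.4f}")
```

Both equal 56/1024, since there are 45 + 10 + 1 = 56 ways to get 8, 9, or 10 heads in 10 flips.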


**Tutorial 6.14: To illustrate the `z-test` based on the student height example above**

In [3]:
# import the ztest function from statsmodels package
from statsmodels.stats.weightstats import ztest
# Create a list of heights (in cm) for each team with extended sample sizes
teamA = [180, 182, 185, 189, 191, 191, 192, 194, 199,
         199, 205, 209, 209, 209, 210, 212, 212, 213, 214, 214,
         180, 183, 186, 190, 192, 193, 195, 198, 200,
         200, 206, 207, 208, 210, 212, 215, 216, 218, 220, 221]

teamB = [190, 191, 191, 191, 195, 195, 199, 199, 208,
         209, 209, 214, 215, 216, 217, 217, 228, 229, 230, 233,
         190, 192, 194, 196, 198, 200, 202, 204, 206,
         208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228]
# perform a two sample z-test to compare the mean heights of the two teams
# the null hypothesis is that the mean heights are equal
# the alternative hypothesis is that the mean heights are different
# we use a two-tailed test with a significance level of 0.05
z_stat, p_value = ztest(teamA, teamB, value=0)
# print the test statistic and the p-value
print("Z-statistic:", z_stat)
print("P-value:", p_value)
# interpret the result
if p_value < 0.05:
    print("We reject the null hypothesis and conclude that the mean heights of the two teams are significantly different.")
else:
    print("We fail to reject the null hypothesis and conclude that the mean heights of the two teams are not significantly different.")

Z-statistic: -2.6516583421751276
P-value: 0.008009754638178521
We reject the null hypothesis and conclude that the mean heights of the two teams are significantly different.
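
For reference, the z-statistic can be rebuilt by hand with NumPy and the standard normal distribution. This sketch assumes a pooled variance estimate, which is the documented default of `ztest` (`usevar='pooled'`), and a two-sided alternative.

```python
import numpy as np
from scipy.stats import norm

teamA = np.array([180, 182, 185, 189, 191, 191, 192, 194, 199,
                  199, 205, 209, 209, 209, 210, 212, 212, 213, 214, 214,
                  180, 183, 186, 190, 192, 193, 195, 198, 200,
                  200, 206, 207, 208, 210, 212, 215, 216, 218, 220, 221])
teamB = np.array([190, 191, 191, 191, 195, 195, 199, 199, 208,
                  209, 209, 214, 215, 216, 217, 217, 228, 229, 230, 233,
                  190, 192, 194, 196, 198, 200, 202, 204, 206,
                  208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228])

n_a, n_b = len(teamA), len(teamB)

# Pooled sample variance of the two groups
pooled_var = ((n_a - 1) * teamA.var(ddof=1) + (n_b - 1) * teamB.var(ddof=1)) / (n_a + n_b - 2)

# Standard error of the difference in means, then the z-statistic
std_err = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))
z_manual = (teamA.mean() - teamB.mean()) / std_err

# Two-sided p-value from the standard normal distribution
p_manual = 2 * norm.sf(abs(z_manual))

print("Z-statistic:", z_manual)
print("P-value:", p_manual)
```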


**Tutorial 6.15: To illustrate the `t-test` based on the pizza delivery time example above, testing for a significant difference**

In [4]:
# Import the ttest_ind function from scipy.stats package
from scipy.stats import ttest_ind
# Create a list of delivery times (in minutes) for each pizza place
placeA = [15, 18, 20, 22, 25, 28, 30, 32, 35, 40]
placeB = [12, 14, 16, 18, 20, 22, 24, 26, 28, 30]
# Perform a two-sample t-test to compare the mean delivery times of the two pizza places
# The null hypothesis is that the mean delivery times are equal
# The alternative hypothesis is that the mean delivery times are different
# We use a two-tailed test with a significance level of 0.05
t_stat, p_value = ttest_ind(placeA, placeB)
# Print the test statistic and the p-value
print("T-statistic:", t_stat)
print("P-value:", p_value)
# Interpret the result
if p_value < 0.05:
    print("We reject the null hypothesis and conclude that the mean delivery times of the two pizza places are significantly different.")
else:
    print("We fail to reject the null hypothesis and conclude that the mean delivery times of the two pizza places are not significantly different.")

T-statistic: 1.7407039045950503
P-value: 0.0988019572356951
We fail to reject the null hypothesis and conclude that the mean delivery times of the two pizza places are not significantly different.


**Tutorial 6.16: To illustrate the `chi-square test` based on the pet and favorite color example**

In [3]:
# import the chi2_contingency function
from scipy.stats import chi2_contingency
# create a contingency table as a list of lists
data = [[12, 18, 10, 15], [8, 14, 12, 11], [5, 9, 15, 6]]
# perform the chi-square test
stat, p, dof, expected = chi2_contingency(data)
# print the test statistic, the p-value, and the expected frequencies
print("Test statistic:", stat)
print("P-value:", p)
print("Expected frequencies:")
print(expected)
# interpret the result
significance_level = 0.05
if p <= significance_level:
    print("We reject the null hypothesis and conclude that there is a significant association between the type of pet and the favorite color.")
else:
    print("We fail to reject the null hypothesis and conclude that there is no significant association between the type of pet and the favorite color.")

Test statistic: 6.740632143071166
P-value: 0.34550083293175876
Expected frequencies:
[[10.18518519 16.7037037  15.07407407 13.03703704]
 [ 8.33333333 13.66666667 12.33333333 10.66666667]
 [ 6.48148148 10.62962963  9.59259259  8.2962963 ]]
We fail to reject the null hypothesis and conclude that there is no significant association between the type of pet and the favorite color.
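
The expected frequencies printed above come from the independence assumption: each cell's expected count is (row total × column total) / grand total. The following sketch rebuilds the expected table, the chi-square statistic, and the degrees of freedom with NumPy and compares them with `chi2_contingency`.

```python
import numpy as np
from scipy.stats import chi2_contingency

data = np.array([[12, 18, 10, 15], [8, 14, 12, 11], [5, 9, 15, 6]])

# Expected count per cell under independence
row_totals = data.sum(axis=1)
col_totals = data.sum(axis=0)
expected_manual = np.outer(row_totals, col_totals) / data.sum()

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells
chi2_manual = ((data - expected_manual) ** 2 / expected_manual).sum()

# Degrees of freedom: (rows - 1) * (columns - 1)
dof_manual = (data.shape[0] - 1) * (data.shape[1] - 1)

stat, p, dof, expected = chi2_contingency(data)
print("Manual statistic:", chi2_manual, "| scipy statistic:", stat)
print("Manual dof:", dof_manual, "| scipy dof:", dof)
print("Expected tables match:", np.allclose(expected_manual, expected))
```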


**Tutorial 6.17: To illustrate the `one-way ANOVA` test based on baking contest example**

In [4]:
import numpy as np
import scipy.stats as stats
# Define the ratings of the cakes by the judges
cake1 = [8.4, 7.6, 9.2, 8.9, 7.8]  # Cake made with flour type 1
cake2 = [6.5, 5.7, 7.3, 6.8, 6.4]  # Cake made with flour type 2
cake3 = [7.1, 6.9, 8.2, 7.4, 7.0]  # Cake made with flour type 3
# Perform one-way ANOVA
f_stat, p_value = stats.f_oneway(cake1, cake2, cake3)
# Print the results
print("F-statistic:", f_stat)
print("P-value:", p_value)

F-statistic: 11.716117216117217
P-value: 0.001509024295003377
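
The F-statistic is the ratio of the between-group mean square to the within-group mean square. A minimal sketch that recomputes it by hand for the same ratings and then applies a 0.05 significance level, a step the cell above leaves out:

```python
import numpy as np
import scipy.stats as stats

cake1 = [8.4, 7.6, 9.2, 8.9, 7.8]  # Cake made with flour type 1
cake2 = [6.5, 5.7, 7.3, 6.8, 6.4]  # Cake made with flour type 2
cake3 = [7.1, 6.9, 8.2, 7.4, 7.0]  # Cake made with flour type 3
groups = [np.array(g) for g in (cake1, cake2, cake3)]

grand_mean = np.concatenate(groups).mean()
k = len(groups)                      # number of groups
n_total = sum(len(g) for g in groups)

# Between-group and within-group sums of squares
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F = mean square between / mean square within
f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))

f_stat, p_value = stats.f_oneway(cake1, cake2, cake3)
print("Manual F:", f_manual, "| scipy F:", f_stat)

alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: at least one flour type has a different mean rating.")
else:
    print("Fail to reject the null hypothesis: no evidence that the mean ratings differ.")
```

A significant F-test says only that some group mean differs; identifying which pairs differ requires a follow-up (post hoc) comparison.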


**Tutorial 6.18: To illustrate the `two-way ANOVA` test using music type and time of day as factors for a score**

In [5]:
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm
# Define the data
data = {"music": ["classical", "classical", "classical", "classical", "classical",
                  "rock", "rock", "rock", "rock", "rock",
                  "pop", "pop", "pop", "pop", "pop"],
        "time": ["morning", "morning", "afternoon", "afternoon", "evening",
                 "morning", "morning", "afternoon", "afternoon", "evening",
                 "morning", "morning", "afternoon", "afternoon", "evening"],
        "score": [12, 14, 11, 10, 9,
                  8, 7, 9, 8, 6,
                  10, 11, 12, 13, 14]}
# Create a pandas DataFrame
df = pd.DataFrame(data)
# Perform two-way ANOVA
model = ols("score ~ C(music) + C(time) + C(music):C(time)", data=df).fit()
aov_table = anova_lm(model, typ=2)
# Print the results
print(aov_table)

                     sum_sq   df          F    PR(>F)
C(music)          54.933333  2.0  36.622222  0.000434
C(time)            1.433333  2.0   0.955556  0.436256
C(music):C(time)  24.066667  4.0   8.022222  0.013788
Residual           4.500000  6.0        NaN       NaN


**Tutorial 6.19: To illustrate hypothesis testing and significance on the diabetes dataset, where the null hypothesis is that the mean BMI of diabetic patients is equal to the mean BMI of non-diabetic patients**

In [9]:
import pandas as pd
from scipy import stats
# Load the diabetes data from a csv file
data = pd.read_csv(
    "/workspaces/ImplementingStatisticsWithPython/data/chapter1/diabetes.csv")
# Null hypothesis: the mean BMI of diabetic and non-diabetic patients is equal
# Alternative hypothesis: the mean BMI differs between diabetic and non-diabetic patients
# Separate the BMI values for diabetic and non-diabetic patients
bmi_diabetic = data[data["Outcome"] == 1]["BMI"]
bmi_non_diabetic = data[data["Outcome"] == 0]["BMI"]
# Perform a two-sample t-test to compare the means of the two groups
t, p = stats.ttest_ind(bmi_diabetic, bmi_non_diabetic)
# Print the test statistic and the p-value
print("Test statistic:", t)
print("P-value:", p)
# Set a significance level
alpha = 0.05
# Compare the p-value with the significance level and make a decision
if p <= alpha:
    print("We reject the null hypothesis and conclude that there is a significant difference in the mean BMI of diabetic and non-diabetic patients.")
else:
    print("We fail to reject the null hypothesis and conclude that there is not enough evidence to support a significant difference in the mean BMI of diabetic and non-diabetic patients.")

Test statistic: 8.47183994786525
P-value: 1.2298074873116022e-16
We reject the null hypothesis and conclude that there is a significant difference in the mean BMI of diabetic and non-diabetic patients.


**Tutorial 6.20: To illustrate hypothesis testing and significance on diabetes dataset to measure if there is an association between the number of pregnancies and the outcome**

In [11]:
import pandas as pd
from scipy import stats
# Load the diabetes data from a csv file
data = pd.read_csv(
    "/workspaces/ImplementingStatisticsWithPython/data/chapter1/diabetes.csv")
# Separate the number of pregnancies and the outcome for each patient
pregnancies = data["Pregnancies"]
outcome = data["Outcome"]
# Perform a chi-square test to test the independence of the two variables
chi2, p, dof, expected = stats.chi2_contingency(
    pd.crosstab(pregnancies, outcome))
# Print the test statistic and the p-value
print("Test statistic:", chi2)
print("P-value:", p)
# Set a significance level
alpha = 0.05
# Compare the p-value with the significance level and make a decision
if p <= alpha:
    print("We reject the null hypothesis and conclude that there is a significant association between the number of pregnancies and the outcome.")
else:
    print("We fail to reject the null hypothesis and conclude that there is not enough evidence to support a significant association between the number of pregnancies and the outcome.")

Test statistic: 64.59480868723006
P-value: 8.648349123362548e-08
We reject the null hypothesis and conclude that there is a significant association between the number of pregnancies and the outcome.


#### Sampling Techniques and Sampling Distributions

**Tutorial 6.21: A simple illustration of the sampling technique using 15 random numbers**

In [18]:
import random
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
sample_size = 5
sample = random.sample(data, sample_size)
print(f"The sample of size {sample_size} is: {sample}")

The sample of size 5 is: [8, 11, 9, 14, 4]
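
Each run of the cell above draws a different sample. When a reproducible result is needed, for instance in a tutorial or a test, one option is a dedicated seeded generator. The seed value below is arbitrary.

```python
import random

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
sample_size = 5

# A seeded generator yields the same sample on every run
rng = random.Random(42)
sample = rng.sample(data, sample_size)  # simple random sampling without replacement
print(f"The sample of size {sample_size} is: {sample}")
```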


**Tutorial 6.22: A simple illustration of the sampling distribution using 1000 samples of size 5 drawn from a list of 15 integers. We calculate the mean of each sample, store it in a list, and then calculate the mean of the sample means.**

In [5]:
import random
sample_size = 5
num_samples = 1000
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
sample_means = []
for i in range(num_samples):
    sample = random.sample(data, sample_size)
    sample_mean = sum(sample) / sample_size
    sample_means.append(sample_mean)
print(f"The mean of the sample means is: {sum(sample_means) / num_samples}")

The mean of the sample means is: 7.996399999999991
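
The mean of the sample means lands close to the population mean of 8. The spread of the sample means can be checked as well: for sampling without replacement from a finite population of size $N$, the standard error of the mean is $\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$. The sketch below compares this value with a simulation; the seed and the number of samples (5000) are arbitrary choices.

```python
import math
import random
import statistics

data = list(range(1, 16))
sample_size = 5
num_samples = 5000
rng = random.Random(0)  # arbitrary seed for reproducibility

# Simulate the sampling distribution of the mean
sample_means = [
    statistics.mean(rng.sample(data, sample_size)) for _ in range(num_samples)
]

pop_mean = statistics.mean(data)
pop_std = statistics.pstdev(data)  # population standard deviation
N = len(data)

# Standard error with the finite population correction
fpc = math.sqrt((N - sample_size) / (N - 1))
theoretical_se = pop_std / math.sqrt(sample_size) * fpc
empirical_se = statistics.pstdev(sample_means)

print(f"Population mean: {pop_mean}, mean of sample means: {statistics.mean(sample_means):.3f}")
print(f"Theoretical standard error: {theoretical_se:.3f}")
print(f"Empirical standard error:   {empirical_se:.3f}")
```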
