# In-Class Lab: T-Test (One Sample & Two Sample)

In [1]:
from scipy import stats
import numpy as np

### Exercise 1: One-Sample T-Test - Mean Weight of a Population

You have a sample of weights (in kg) from a certain population. Test the
hypothesis that the mean weight is 70 kg.
* Sample: [72, 68, 75, 71, 69, 70, 73, 68]
* Null Hypothesis: The population mean is 70 kg.

**Task**: Conduct a one-sample t-test to verify if the population mean is statistically
different from 70 kg.

In [2]:
# One Sample t-test
weights = np.array([72, 68, 75, 71, 69, 70, 73, 68])

In [3]:
# Hypothesized mean weight
popmean = 70

In [4]:
# Perform one-sample t-test
t_stat, p_value = stats.ttest_1samp(weights, popmean)
print("T statistic:", t_stat)
print("P-value:", p_value)

T statistic: 0.8509629433967633
P-value: 0.42294092668021854


In [5]:
# Setting significance level
alpha = 0.05

In [6]:
# Interpret the results
if p_value < alpha:
    print("Reject the null hypothesis; there is a significant difference between the sample mean and the hypothesized mean weight.")
else:
    print("Fail to reject the null hypothesis; there is no significant difference between the sample mean and the hypothesized mean weight.")

Fail to reject the null hypothesis; there is no significant difference between the sample mean and the hypothesized mean weight.


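Findings: With p ≈ 0.42 > 0.05, we fail to reject the null hypothesis. The sample gives no significant evidence that the population mean weight differs from 70 kg. This does not prove that the mean is exactly 70 kg.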
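To make the calculation behind `stats.ttest_1samp` more concrete, the following sketch recomputes the statistic by hand as t = (sample mean - hypothesized mean) / (s / sqrt(n)), with n - 1 degrees of freedom. The variable names (`t_manual`, `p_manual`, and so on) are illustrative and not part of the lab.

```python
import numpy as np
from scipy import stats

weights = np.array([72, 68, 75, 71, 69, 70, 73, 68])
popmean = 70

n = weights.size
sample_mean = weights.mean()
sample_sd = weights.std(ddof=1)          # sample standard deviation (n - 1 in the denominator)
standard_error = sample_sd / np.sqrt(n)  # standard error of the mean

# t statistic and two-sided p-value from the t distribution with n - 1 degrees of freedom
t_manual = (sample_mean - popmean) / standard_error
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# Compare with SciPy's built-in test
t_scipy, p_scipy = stats.ttest_1samp(weights, popmean)
print("Manual:", t_manual, p_manual)
print("SciPy: ", t_scipy, p_scipy)
```

Both lines should agree up to floating-point rounding.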

### Exercise 2: One-Sample T-Test - Average Height of Students

A school claims the average height of its students is 165 cm. You take a random
sample of 10 students:

* Sample: [164, 162, 168, 167, 165, 166, 160, 159, 170, 163]
* Null Hypothesis: The mean height is 165 cm.

**Task**: Test the school’s claim using a one-sample t-test.

In [7]:
# One Sample t-test
heights = np.array([164, 162, 168, 167, 165, 166, 160, 159, 170, 163])

In [8]:
# Hypothesized mean height
popmean = 165

In [9]:
# Perform one-sample t-test
t_stat, p_value = stats.ttest_1samp(heights, popmean)
print("T statistic:", t_stat)
print("P-value:", p_value)

T statistic: -0.5417363388859563
P-value: 0.6011521875426783


In [10]:
# Interpret the results
if p_value < alpha:
    print("Reject the null hypothesis; there is a significant difference between the sample mean and the hypothesized mean height.")
else:
    print("Fail to reject the null hypothesis; there is no significant difference between the sample mean and the hypothesized mean height.")

Fail to reject the null hypothesis; there is no significant difference between the sample mean and the hypothesized mean height.


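Findings: With p ≈ 0.60 > 0.05, we fail to reject the null hypothesis. The sample is consistent with the school's claim of a 165 cm mean height, but it does not prove that the mean is exactly 165 cm.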

### Exercise 3: One-Sample T-Test - Customer Satisfaction Scores

A company claims their average customer satisfaction score is 4.5 out of 5. You
sample 12 customers:

* Sample: [4.2, 4.4, 4.5, 4.7, 4.5, 4.6, 4.4, 4.3, 4.5, 4.6, 4.2, 4.5]
* Null Hypothesis: The mean satisfaction score is 4.5.

**Task**: Conduct a one-sample t-test to evaluate the company’s claim.

In [11]:
# One Sample t-test
satisfactions = np.array([4.2, 4.4, 4.5, 4.7, 4.5, 4.6, 4.4, 4.3, 4.5, 4.6, 4.2, 4.5])

In [12]:
# Hypothesized mean satisfaction score
popmean = 4.5

In [13]:
# Perform one-sample t-test
t_stat, p_value = stats.ttest_1samp(satisfactions, popmean)
print("T statistic:", t_stat)
print("P-value:", p_value)

T statistic: -1.1055415967851299
P-value: 0.2925184553957747


In [14]:
# Interpret the results
if p_value < alpha:
    print("Reject the null hypothesis; there is a significant difference between the sample mean and the hypothesized mean satisfaction score.")
else:
    print("Fail to reject the null hypothesis; there is no significant difference between the sample mean and the hypothesized mean satisfaction score.")

Fail to reject the null hypothesis; there is no significant difference between the sample mean and the hypothesized mean satisfaction score.


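Findings: With p ≈ 0.29 > 0.05, we fail to reject the null hypothesis. The sample is consistent with the company's claimed average satisfaction score of 4.5 out of 5; there is no significant evidence against the claim.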

### Exercise 4: Two-Sample T-Test - Exam Scores of Two Classes

Compare the exam scores of two different classes:
* Class A: [85, 78, 90, 88, 84, 91, 89]
* Class B: [82, 80, 88, 86, 85, 79, 87]

**Task**: Perform a two-sample t-test to check if there is a significant difference in
the mean exam scores between the two classes.

In [15]:
# Two Sample t-test
scores_a = np.array([85, 78, 90, 88, 84, 91, 89])
scores_b = np.array([82, 80, 88, 86, 85, 79, 87])

In [16]:
# Perform two-sample t-test
t_stat, p_value = stats.ttest_ind(scores_a, scores_b)
print("T statistic:", t_stat)
print("P-value:", p_value)

T statistic: 1.1886087235395915
P-value: 0.257584411334151


In [17]:
# Interpret the results
if p_value < alpha:
    print("Reject the null hypothesis; there is a significant difference between the mean exam scores of Class A and the mean exam scores of Class B.")
else:
    print("Fail to reject the null hypothesis; there is no significant difference between the mean exam scores of Class A and the mean exam scores of Class B.")

Fail to reject the null hypothesis; there is no significant difference between the mean exam scores of Class A and the mean exam scores of Class B.


Findings: There is no significant difference in mean exam scores between the two classes (p ≈ 0.26).
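By default, `stats.ttest_ind` assumes that both groups have equal variances. If that assumption is in doubt, Welch's t-test (`equal_var=False`) is a common alternative. The sketch below runs both variants on the class data; the names `t_pooled` and `t_welch` are illustrative.

```python
import numpy as np
from scipy import stats

scores_a = np.array([85, 78, 90, 88, 84, 91, 89])
scores_b = np.array([82, 80, 88, 86, 85, 79, 87])

# Pooled-variance (Student) t-test: the default used in the exercise
t_pooled, p_pooled = stats.ttest_ind(scores_a, scores_b)

# Welch's t-test: does not assume equal variances
t_welch, p_welch = stats.ttest_ind(scores_a, scores_b, equal_var=False)

print("Pooled:", t_pooled, p_pooled)
print("Welch: ", t_welch, p_welch)
```

Because both classes have the same number of students, the two t statistics coincide; only the degrees of freedom, and therefore the p-value, differ.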

### Exercise 5: Two-Sample T-Test - Sales Performance Before and After Training

A sales team underwent a training program, and their sales were recorded before
and after the training:
* Before: [1200, 1300, 1250, 1400, 1350, 1500]
* After: [1450, 1380, 1550, 1600, 1500, 1580]

**Task**: Conduct a two-sample t-test to assess if the training significantly improved
sales performance.

In [18]:
# Two Sample t-test
performances_a = np.array([1200, 1300, 1250, 1400, 1350, 1500])
performances_b = np.array([1450, 1380, 1550, 1600, 1500, 1580])

In [19]:
# Perform two-sample t-test
t_stat, p_value = stats.ttest_ind(performances_a, performances_b)
print("T statistic:", t_stat)
print("P-value:", p_value)

T statistic: -3.163967663492102
P-value: 0.010090540561009647


In [20]:
# Interpret the results
if p_value < alpha:
    print("Reject the null hypothesis; there is a significant difference between the sales performance before training and the sales performance after training.")
else:
    print("Fail to reject the null hypothesis; there is no significant difference between the sales performance before training and the sales performance after training.")

Reject the null hypothesis; there is a significant difference between the sales performance before training and the sales performance after training.


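Findings: Sales differ significantly before and after training (p ≈ 0.010). The negative t statistic means the "before" mean is lower than the "after" mean, which is consistent with an improvement. Note that `stats.ttest_ind` treats the two samples as independent and is two-sided; if the figures come from the same salespeople, a paired test (`stats.ttest_rel`) would be the more appropriate choice.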
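The task asks whether training *improved* sales, which is a one-sided question, and before/after measurements are often paired. The sketch below shows both variations. It assumes that each position in the two arrays belongs to the same salesperson, which the exercise does not state, and that the installed SciPy supports the `alternative` parameter (added in SciPy 1.6). The names `before` and `after` are illustrative.

```python
import numpy as np
from scipy import stats

before = np.array([1200, 1300, 1250, 1400, 1350, 1500])
after = np.array([1450, 1380, 1550, 1600, 1500, 1580])

# One-sided independent test: is the mean of `after` greater than the mean of `before`?
t_ind, p_ind = stats.ttest_ind(after, before, alternative="greater")

# One-sided paired test (ASSUMPTION: element i of both arrays is the same salesperson)
t_rel, p_rel = stats.ttest_rel(after, before, alternative="greater")

print("Independent, one-sided:", t_ind, p_ind)
print("Paired, one-sided:     ", t_rel, p_rel)
```

With the arguments in this order, a positive t statistic indicates higher sales after training.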

### Exercise 6: Two-Sample T-Test - Blood Pressure Medication
You are given two groups: one taking a blood pressure medication and one taking
a placebo. Their blood pressure reduction is measured:
* Medication group: [10, 12, 9, 14, 11, 13]
* Placebo group: [3, 5, 2, 4, 6, 5]

**Task**: Use a two-sample t-test to determine whether the medication has a
statistically significant effect on blood pressure.

In [21]:
# Two Sample t-test
blood_pressures_a = np.array([10, 12, 9, 14, 11, 13])
blood_pressures_b = np.array([3, 5, 2, 4, 6, 5])

In [22]:
# Perform two-sample t-test
t_stat, p_value = stats.ttest_ind(blood_pressures_a, blood_pressures_b)
print("T statistic:", t_stat)
print("P-value:", p_value)

T statistic: 7.545937746270389
P-value: 1.9570173447435993e-05


In [23]:
# Interpret the results
if p_value < alpha:
    print("Reject the null hypothesis; there is a significant difference in mean blood pressure reduction between the medication group and the placebo group.")
else:
    print("Fail to reject the null hypothesis; there is no significant difference in mean blood pressure reduction between the medication group and the placebo group.")

Reject the null hypothesis; there is a significant difference in mean blood pressure reduction between the medication group and the placebo group.


Findings: The medication has a statistically significant effect on blood pressure (p ≈ 2.0e-05). The mean reduction in the medication group is higher than in the placebo group.
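A p-value indicates whether a difference is detectable, not how large it is. As a supplementary sketch, the size of the difference can be summarized with Cohen's d, the difference in means divided by the pooled standard deviation. This measure is not part of the original lab, and the names below are illustrative.

```python
import numpy as np

medication = np.array([10, 12, 9, 14, 11, 13])
placebo = np.array([3, 5, 2, 4, 6, 5])

n_med, n_pla = medication.size, placebo.size

# Pooled variance: weighted average of the two sample variances
pooled_var = (
    (n_med - 1) * medication.var(ddof=1) + (n_pla - 1) * placebo.var(ddof=1)
) / (n_med + n_pla - 2)

# Cohen's d: standardized difference between the group means
cohens_d = (medication.mean() - placebo.mean()) / np.sqrt(pooled_var)
print("Cohen's d:", cohens_d)
```

For the pooled-variance t-test, the t statistic equals d divided by sqrt(1/n1 + 1/n2), so the two quantities carry the same information apart from the sample sizes.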

### Exercise 7: One-Sample T-Test - Test if a Coin is Fair
You flip a coin 100 times, and it lands on heads 58 times. A fair coin should have
50 heads out of 100 flips.
* Null Hypothesis: The proportion of heads is 0.50.

**Task**: Perform a one-sample t-test on the proportion of heads to determine if the
coin is biased.

In [24]:
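# One Sample t-test
# Encode each flip as 0 or 1. Note: with this encoding the 58 heads are 0 and the
# 42 tails are 1, so the sample mean is the proportion of tails (0.42) and the
# t statistic below is negative. The two-sided p-value is the same as it would be
# for the proportion of heads (0.58).
proportions = np.concatenate((np.zeros(58, dtype=int), np.ones(42, dtype=int)))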

In [25]:
# Hypothesized proportion of heads
popmean = 0.50

In [26]:
# Perform one-sample t-test
t_stat, p_value = stats.ttest_1samp(proportions, popmean)
print("T statistic:", t_stat)
print("P-value:", p_value)

T statistic: -1.612757024996288
P-value: 0.10998066941759047


In [27]:
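# Interpret the results
if p_value < alpha:
    print("Reject the null hypothesis; there is significant evidence that the coin is biased.")
else:
    print("Fail to reject the null hypothesis; there is no significant evidence that the coin is biased.")

Fail to reject the null hypothesis; there is no significant evidence that the coin is biased.


Findings: With p ≈ 0.11 > 0.05, there is no significant evidence that the coin is biased. This does not prove that the coin is fair.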
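A t-test on 0/1 data is only an approximation. For a count of successes out of a fixed number of trials, an exact binomial test is the more usual tool. The sketch below assumes SciPy 1.7 or later, where `stats.binomtest` is available; it goes beyond what the exercise asks for.

```python
from scipy import stats

# 58 heads in 100 flips, tested against a fair coin (p = 0.5), two-sided
result = stats.binomtest(k=58, n=100, p=0.5, alternative="two-sided")
p_binom = result.pvalue

alpha = 0.05
print("Exact binomial p-value:", p_binom)
print("Significant at alpha = 0.05:", p_binom < alpha)
```

The exact test leads to the same decision as the t-test here: the null hypothesis of a fair coin is not rejected at the 5% level.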

### Exercise 8: Two-Sample T-Test - Comparison of Test Scores

A researcher believes that students who study with music perform differently than
those who study in silence. You have two groups:
* Music: [80, 85, 78, 90, 87, 76]
* Silence: [88, 82, 84, 89, 91, 85]

**Task**: Use a two-sample t-test to compare the test scores of both groups and
determine if the difference is significant.

In [28]:
# Two Sample t-test
scores_music = np.array([80, 85, 78, 90, 87, 76])
scores_silence = np.array([88, 82, 84, 89, 91, 85])

In [29]:
# Perform two-sample t-test
t_stat, p_value = stats.ttest_ind(scores_music, scores_silence)
print("T statistic:", t_stat)
print("P-value:", p_value)

T statistic: -1.4529052821772128
P-value: 0.1768975161673121


In [30]:
# Interpret the results
if p_value < alpha:
    print("Reject the null hypothesis; there is a significant difference in mean test scores between the music group and the silence group.")
else:
    print("Fail to reject the null hypothesis; there is no significant difference in mean test scores between the music group and the silence group.")

Fail to reject the null hypothesis; there is no significant difference in mean test scores between the music group and the silence group.


Findings: The difference between the two groups is not statistically significant (p ≈ 0.18). These samples provide no significant evidence that studying with music affects test scores.

### Exercise 9: One-Sample T-Test - Average Lifespan of a Product
A manufacturer claims that the average lifespan of their product is 5 years. A
sample of products shows these lifespans:
* Sample: [4.8, 5.1, 4.9, 5.3, 4.7, 5.2, 5.0, 4.9]
* Null Hypothesis: The mean lifespan is 5 years.

**Task**: Perform a one-sample t-test to check if the manufacturer’s claim is
statistically valid.

In [31]:
# One Sample t-test
lifespans = np.array([4.8, 5.1, 4.9, 5.3, 4.7, 5.2, 5.0, 4.9])

In [32]:
# Hypothesized average lifespan
popmean = 5

In [33]:
# Perform one-sample t-test
t_stat, p_value = stats.ttest_1samp(lifespans, popmean)
print("T statistic:", t_stat)
print("P-value:", p_value)

T statistic: -0.17407765595570038
P-value: 0.8667318497550061


In [34]:
# Interpret the results
if p_value < alpha:
    print("Reject the null hypothesis; there is a significant difference between the sample mean and the claimed mean lifespan.")
else:
    print("Fail to reject the null hypothesis; there is no significant difference between the sample mean and the claimed mean lifespan.")

Fail to reject the null hypothesis; there is no significant difference between the sample mean and the claimed mean lifespan.


Findings: With p ≈ 0.87 > 0.05, the sample is consistent with the manufacturer's claim of a 5-year average lifespan. There is no significant evidence against the claim, although this does not prove it.
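A confidence interval is a complementary way to read a one-sample result: the two-sided test at the 5% level fails to reject exactly when the claimed mean lies inside the 95% interval. The sketch below computes that interval from the t distribution; the names `ci_low` and `ci_high` are illustrative.

```python
import numpy as np
from scipy import stats

lifespans = np.array([4.8, 5.1, 4.9, 5.3, 4.7, 5.2, 5.0, 4.9])
claimed_mean = 5

confidence = 0.95
df = lifespans.size - 1

# t-based confidence interval for the mean; stats.sem uses ddof=1 by default
ci_low, ci_high = stats.t.interval(
    confidence, df, loc=lifespans.mean(), scale=stats.sem(lifespans)
)

print("95% CI for the mean lifespan:", (ci_low, ci_high))
print("Claimed mean inside interval:", ci_low < claimed_mean < ci_high)
```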

### Exercise 10: Two-Sample T-Test - Average Income of Two Cities
Compare the average income of two cities based on a sample of residents:
* City A: [45000, 48000, 47000, 50000, 49000, 46000]
* City B: [43000, 42000, 44000, 41000, 45000, 43000]

**Task**: Conduct a two-sample t-test to determine if there is a significant difference
in income between the two cities.

In [35]:
# Two Sample t-test
incomes_a = np.array([45000, 48000, 47000, 50000, 49000, 46000])
incomes_b = np.array([43000, 42000, 44000, 41000, 45000, 43000])

In [36]:
# Perform two-sample t-test
t_stat, p_value = stats.ttest_ind(incomes_a, incomes_b)
print("T statistic:", t_stat)
print("P-value:", p_value)

T statistic: 4.700096710803842
P-value: 0.0008414426956321737


In [37]:
# Interpret the results
if p_value < alpha:
    print("Reject the null hypothesis; there is a significant difference in income between city A and city B.")
else:
    print("Fail to reject the null hypothesis; there is no significant difference in income between city A and city B.")

Reject the null hypothesis; there is a significant difference in income between city A and city B.


Findings: There is a significant difference in income between the two cities.