In [24]:
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
import seaborn as sns

## Hypothesis testing

- method for inferring population parameters based on sample data
- structured approach for evaluating claims or assumptions about a population using empirical evidence
- two complementary statements:
    - null hypothesis ($H_0$): statement of no effect, difference, or relationship
        - represents the status quo or the current understanding
    - alternative hypothesis ($H_1$): statement that contradicts the null hypothesis
        - represents the claim or new understanding that the researcher seeks evidence for (a test can reject $H_0$, but it cannot prove $H_1$)

##### Errors in hypothesis testing

|  | $H_0$ is true | $H_0$ is false |
| --- | --- | --- |
| Reject $H_0$ | Type I error | Correct decision |
| Do not reject $H_0$ | Correct decision | Type II error |

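The two error types in the table can be estimated by simulation. The sketch below draws many samples for which $H_0$ is true and many for which it is false, then counts wrong decisions at significance level $\alpha = 0.05$. The sample size, effect size (true mean 0.5), number of repetitions, and seed are arbitrary assumptions chosen for illustration.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(0)  # arbitrary seed (assumption)
alpha = 0.05                    # significance level
n_sim, n = 5000, 30             # simulated samples and sample size (assumptions)

# Case 1: H0 (population mean = 0) is true by construction
null_samples = rng.normal(loc=0.0, scale=1.0, size=(n_sim, n))
_, p_null = stats.ttest_1samp(null_samples, popmean=0.0, axis=1)
type_1_rate = np.mean(p_null < alpha)   # rejecting a true H0

# Case 2: H0 is false, the true mean is 0.5 (assumed effect size)
alt_samples = rng.normal(loc=0.5, scale=1.0, size=(n_sim, n))
_, p_alt = stats.ttest_1samp(alt_samples, popmean=0.0, axis=1)
type_2_rate = np.mean(p_alt >= alpha)   # failing to reject a false H0

print(f'Type I error rate  = {type_1_rate:.3f}')
print(f'Type II error rate = {type_2_rate:.3f}')
```

The Type I error rate should land close to $\alpha$, while the Type II error rate depends on the effect size and the sample size.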
### T-test
- used to determine whether there is a significant difference between the means of two groups or between a sample mean and a known value
- particularly useful when dealing with small sample sizes or when the population standard deviation is unknown
- assumptions:
    - in each group, the data are approximately normally distributed
    - homogeneity of variances of the two groups (independent two-sample test with pooled variance only)
    - independence of observations within each group
- types:
    - one-sample t-test: compares the mean of a single sample to a known value or population mean
    - independent two-sample t-test: compares the means of two independent groups
    - paired t-test: compares means from the same group at different times or under different conditions

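The normality and equal-variance assumptions listed above can be screened with standard tests. The sketch below uses `scipy.stats.shapiro` (Shapiro-Wilk) and `scipy.stats.levene` on two made-up groups; the data, group sizes, and seed are assumptions for illustration only.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(1)                  # arbitrary seed (assumption)
group_a = rng.normal(loc=70, scale=8, size=25)  # hypothetical measurements, group A
group_b = rng.normal(loc=74, scale=8, size=25)  # hypothetical measurements, group B

# normality in each group: Shapiro-Wilk test (H0: data come from a normal distribution)
for name, group in [('A', group_a), ('B', group_b)]:
    w, p_norm = stats.shapiro(group)
    print(f'group {name}: W = {w:.4f}, p = {p_norm:.4f}')

# homogeneity of variances: Levene's test (H0: the groups have equal variances)
lev_stat, p_var = stats.levene(group_a, group_b)
print(f'Levene: statistic = {lev_stat:.4f}, p = {p_var:.4f}')
```

A large p-value here only means no evidence against the assumption was found. Independence of observations cannot be checked this way; it follows from how the data were collected.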
#### Types of t-tests

| | One-sample t-test | Independent two-sample t-test | Paired t-test |
| --- | --- | --- | --- |
| Synonyms | Student’s t-test | Independent groups / samples t-test, Equal variances t-test, Pooled t-test, Unequal variances t-test | Dependent samples t-test |
| Data | one sample | two independent samples | paired measurements on the same subjects |
| Purpose | is population mean equal to a specific value or not | are population means for two different groups equal or not | is difference between paired measurements for a population zero or not |
| Example: test if... | mean heart rate of group of people $= 65$ or not | mean HR for two groups of people are the same or not | mean difference in HR for group of people before and after exercise is zero or not |
| Estimate of population $\mu$ | sample average | sample average for each group | sample average of differences in paired measurements |
| Population $\sigma$ | unk., use sample std. dev. | unk., use sample std. devs. for each group | unk., use sample std. dev. of differences in paired measurements |
| Degrees of freedom | observations in sample $- 1$, or $n-1$ | $n_1 + n_2 - 2$ | paired observations in sample $- 1$, or $n-1$ |

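Each column of the table corresponds to a SciPy function. The sketch below runs all three tests on made-up heart-rate data that mirror the examples in the table; every value is hypothetical.

```python
import numpy as np
import scipy.stats as stats

# hypothetical heart rates in beats per minute (all values made up for illustration)
hr_before = np.array([62, 68, 71, 65, 70, 66, 73, 64, 69, 67])
hr_after = np.array([75, 80, 79, 72, 84, 77, 85, 70, 78, 76])  # same people after exercise
hr_other = np.array([66, 72, 75, 70, 68, 74, 77, 71, 69, 73])  # a second, independent group

# one-sample: is the mean heart rate equal to 65?
res_one = stats.ttest_1samp(hr_before, popmean=65)

# independent two-sample: pooled (equal variances) and Welch (unequal variances)
res_pooled = stats.ttest_ind(hr_before, hr_other, equal_var=True)
res_welch = stats.ttest_ind(hr_before, hr_other, equal_var=False)

# paired: is the mean before/after difference zero?
res_paired = stats.ttest_rel(hr_after, hr_before)

results = [('one-sample', res_one), ('pooled', res_pooled),
           ('Welch', res_welch), ('paired', res_paired)]
for name, res in results:
    print(f'{name}: t = {res.statistic:.4f}, p = {res.pvalue:.4f}')

# the paired test is a one-sample test on the differences
res_diff = stats.ttest_1samp(hr_after - hr_before, popmean=0)
```

The last line illustrates why the paired test has $n-1$ degrees of freedom: it reduces to a one-sample test on the $n$ differences.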
In [26]:
x = stats.t.rvs(10, size=10000) # generate random numbers from t-distribution with df=10
m, v, s, k = stats.t.stats(df=10, moments='mvsk') # theoretical mean, variance, skew, excess kurtosis for df=10
n, (smin, smax), sm, sv, ss, sk = stats.describe(x) # empirical mean, variance, skew, excess kurtosis of the sample
print(f'distribution: mean = {m:.4f}, variance = {v:.4f}, skew = {s:.4f}, kurtosis = {k:.4f}')
print(f'sample: mean = {sm:.4f}, variance = {sv:.4f}, skew = {ss:.4f}, kurtosis = {sk:.4f}')

distribution: mean = 0.0000, variance = 1.2500, skew = 0.0000, kurtosis = 1.0000
sample: mean = 0.0012, variance = 1.2415, skew = 0.0124, kurtosis = 0.8707

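The theoretical values printed above follow from closed-form expressions for Student's t-distribution: the variance is $\nu / (\nu - 2)$ for $\nu > 2$ and the excess kurtosis is $6 / (\nu - 4)$ for $\nu > 4$, where $\nu$ is the degrees of freedom. A short check, assuming `df = 10` as in the cell above:

```python
import scipy.stats as stats

df = 10
variance = df / (df - 2)        # valid for df > 2
excess_kurtosis = 6 / (df - 4)  # valid for df > 4; SciPy reports excess kurtosis (normal = 0)

m, v, s, k = stats.t.stats(df=df, moments='mvsk')
print(f'formula: variance = {variance:.4f}, excess kurtosis = {excess_kurtosis:.4f}')
print(f'scipy:   variance = {float(v):.4f}, excess kurtosis = {float(k):.4f}')
```

Both the variance and the excess kurtosis shrink toward the normal distribution's values (1 and 0) as the degrees of freedom grow.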

##### One-sample t-test
- formula: $$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
    - $\bar{x}$: sample mean
    - $\mu$: hypothesized population mean (the value stated in $H_0$)
    - $s$: sample standard deviation
    - $n$: sample size

In [36]:
sample = [20.7, 27.46, 22.15, 19.85, 21.29, 24.75, 20.75, 22.91, 25.34, 20.33, 21.54, 21.08, 22.14, 19.56, 21.1, 18.04, 24.12, 19.95, 19.72, 18.28, 16.26, 17.46, 20.53, 22.12, 25.06, 22.44, 19.08, 19.88, 21.39, 22.33, 25.79]
mu = 20

# t-test for one sample
t, p = stats.ttest_1samp(sample, mu)
print(f't = {t:.4f}, p = {p:.4f}')

# t-test for one sample (manual)
# 1. calculate t-value
t = (np.mean(sample) - mu) / (np.std(sample, ddof=1) / np.sqrt(len(sample))) # ddof=1 for sample standard deviation

# 2. calculate critical value (reject H0 at alpha = 0.05 if |t| > t_critical)
t_critical = stats.t.ppf(0.975, len(sample)-1) # 0.975 for two-tailed test at alpha = 0.05

# 3. calculate two-tailed p-value
p = stats.t.sf(np.abs(t), len(sample)-1) * 2
print(f't = {t:.4f}, p = {p:.4f}')

t = 3.0668, p = 0.0046
t = 3.0668, p = 0.0046

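The manual calculation above computes `t_critical` but does not use it for a decision. The sketch below, which assumes $\alpha = 0.05$ and reuses the same sample, shows the critical-value decision and the matching 95% confidence interval for the mean.

```python
import numpy as np
import scipy.stats as stats

sample = [20.7, 27.46, 22.15, 19.85, 21.29, 24.75, 20.75, 22.91, 25.34, 20.33,
          21.54, 21.08, 22.14, 19.56, 21.1, 18.04, 24.12, 19.95, 19.72, 18.28,
          16.26, 17.46, 20.53, 22.12, 25.06, 22.44, 19.08, 19.88, 21.39, 22.33, 25.79]
mu = 20
alpha = 0.05  # significance level (assumption)

n = len(sample)
mean = np.mean(sample)
sem = np.std(sample, ddof=1) / np.sqrt(n)       # standard error of the mean
t = (mean - mu) / sem
t_critical = stats.t.ppf(1 - alpha / 2, n - 1)  # two-tailed critical value

# decision by critical value: reject H0 when |t| exceeds the critical value
reject = abs(t) > t_critical
print(f'|t| = {abs(t):.4f}, t_critical = {t_critical:.4f}, reject H0: {reject}')

# equivalent view: the confidence interval excludes mu exactly when H0 is rejected
ci_low = mean - t_critical * sem
ci_high = mean + t_critical * sem
print(f'95% CI for the mean: ({ci_low:.4f}, {ci_high:.4f})')
```

The critical-value rule, the p-value rule ($p < \alpha$), and the confidence-interval rule always agree for a two-tailed test at the same $\alpha$.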



### Z-test
- used to determine whether there is a significant difference between a sample mean and a population mean, or between the means of two groups, when the population variance is known and the sample size is large
- primarily used when the sample size exceeds 30, so that the normal distribution approximates the distribution of the test statistic well
- assumptions:
    - known population variance
    - large sample size (typically $\geq 30$)
    - normal distribution of the population (less critical for large samples)
- types:
    - one-sample Z-test: compares the mean of a single sample to a known population mean
    - two-sample Z-test: compares the means of two independent samples
    - proportion Z-test: compares the proportion of a certain characteristic in a sample to a known population proportion or between two sample proportions

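SciPy has no dedicated one-sample Z-test function, so the sketch below writes one by hand from the statistic $Z = (\bar{x} - \mu) / (\sigma / \sqrt{n})$. The helper name `z_test_1samp`, the height data, and the "known" $\sigma = 7$ are all assumptions made for illustration.

```python
import numpy as np
import scipy.stats as stats

def z_test_1samp(sample, mu, sigma):
    """Two-tailed one-sample Z-test with known population standard deviation.

    Hypothetical helper written for illustration; not part of SciPy.
    """
    sample = np.asarray(sample, dtype=float)
    z = (sample.mean() - mu) / (sigma / np.sqrt(sample.size))
    p = 2 * stats.norm.sf(abs(z))  # two-tailed p-value from the standard normal
    return z, p

rng = np.random.default_rng(2)                   # arbitrary seed (assumption)
heights = rng.normal(loc=171, scale=7, size=50)  # hypothetical heights in cm, n > 30
z, p = z_test_1samp(heights, mu=170, sigma=7)    # sigma = 7 is assumed to be known
print(f'z = {z:.4f}, p = {p:.4f}')
```

The only differences from the one-sample t-test are the known $\sigma$ in place of the sample standard deviation and the normal distribution in place of the t-distribution.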
### What is a t-test?
A t-test is a statistical test used to determine whether there is a significant difference between the means of two groups or between a sample mean and a known value. It is particularly useful when dealing with small sample sizes or when the population standard deviation is unknown.

The test statistic for a one-sample t-test is calculated using the formula:

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

where:

- $\bar{x}$ is the sample mean
- $\mu$ is the population mean (or the mean of the comparison group)
- $s$ is the sample standard deviation
- $n$ is the sample size

#### Types of t-tests
There are three main types of t-tests. Each compares means under different conditions:

- One-sample t-test: compares the mean of a single sample to a known value or population mean. It determines whether the sample mean deviates significantly from a specific benchmark. For example, we can use a one-sample t-test to evaluate whether the average test score of a small class differs from the national average.
- Independent two-sample t-test: compares the means of two independent groups to determine whether there is a statistically significant difference between them. It is commonly used in experiments where two groups undergo different treatments or conditions. For instance, we could compare test scores between students taught with two different teaching methods to see whether one method is more effective.
- Paired t-test: compares means from the same group at different times or under different conditions. It evaluates whether there is a significant change within the same group after an intervention or over time. An example is measuring student performance before and after implementing a new teaching strategy to assess its impact.

#### Assumptions of the t-test
The t-test relies on certain assumptions to provide valid results:

- Normality of the data: the data in each group are assumed to be approximately normally distributed. This is especially important for small sample sizes. If the data are not normally distributed, the t-test results may be unreliable.
- Homogeneity of variances: for an independent two-sample t-test, the variances of the two groups being compared are assumed to be equal. This assumption ensures that the test correctly accounts for variability within each group. If the variances are not equal, the accuracy of the test can suffer.
- Independence of observations: the observations within each group should be independent, meaning that the value of one observation should not influence or be related to the value of another. Violating this assumption can lead to incorrect conclusions.

It is important to check these assumptions before applying the t-test in any analysis to ensure the validity of the results.

### What is a Z-test?
A Z-test is a statistical test used to determine whether there is a significant difference between the sample mean and the population mean, or between the means of two groups, when the population variance is known and the sample size is large.

It is primarily used when the sample size exceeds 30, allowing the use of the normal distribution to approximate the distribution of the test statistic.

The test statistic for a one-sample Z-test is calculated using the formula:

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

where:

- $\bar{x}$ is the sample mean
- $\mu$ is the population mean
- $\sigma$ is the population standard deviation
- $n$ is the sample size

#### Types of Z-tests
There are three main types of Z-tests:

- One-sample Z-test: compares the mean of a single sample to a known population mean. It is used to assess whether the sample mean deviates significantly from the population mean, assuming the population variance is known. For example, a one-sample Z-test might be used to determine whether the average height of a group of more than 30 people differs from the known national average height.
- Two-sample Z-test: compares the means of two independent samples to determine whether there is a significant difference between them. It is used when both samples are large and the population variances are known. An example is comparing the average test scores of students from two different schools to see whether their performance differs significantly.
- Proportion Z-test: compares the proportion of a certain characteristic in a sample to a known population proportion, or compares two sample proportions. It evaluates whether the observed proportion differs significantly from what is expected. For instance, a proportion Z-test might compare the proportion of voters favoring a particular candidate in a sample to the proportion observed in previous elections.

There are additional variations of the test, such as the paired Z-test, the Z-test for regression coefficients, and the Z-test for differences in means.

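As an illustration of the proportion Z-test described above, the sketch below tests a sample proportion against a reference value using the statistic $Z = (\hat{p} - p_0) / \sqrt{p_0 (1 - p_0) / n}$. The helper name `proportion_z_test` and the poll numbers are hypothetical.

```python
import numpy as np
import scipy.stats as stats

def proportion_z_test(successes, n, p0):
    """Two-tailed one-proportion Z-test (hypothetical helper, not a SciPy function)."""
    p_hat = successes / n
    se = np.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    p_value = 2 * stats.norm.sf(abs(z))
    return z, p_value

# made-up poll: 540 of 1000 sampled voters favor a candidate; previous elections gave 50%
z, p_value = proportion_z_test(successes=540, n=1000, p0=0.5)
print(f'z = {z:.4f}, p = {p_value:.4f}')
```

The standard error uses $p_0$ rather than $\hat{p}$ because the test statistic is evaluated under the assumption that $H_0$ is true.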
#### Assumptions of the Z-test
The Z-test relies on certain assumptions to provide valid results:

- Known population variance: the Z-test assumes that the population variance is known. This is a key distinction from the t-test, where the population variance is typically unknown. The known variance allows the standard normal distribution to be used to assess the significance of the test statistic.
- Large sample size: the Z-test assumes a large sample size, typically greater than 30. With larger samples, the sampling distribution of the sample mean approaches a normal distribution even if the original data are not normally distributed, according to the Central Limit Theorem.
- Normal distribution of the population: the data are assumed to be drawn from a normally distributed population. This assumption is less critical for large samples but still matters when the sample size is moderate.

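The Central Limit Theorem claim above can be checked by simulation. The sketch below draws sample means from a strongly right-skewed exponential population and reports their skewness for several sample sizes; the choice of distribution, sample sizes, repetition count, and seed are assumptions for illustration.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(3)  # arbitrary seed (assumption)
n_means = 20000                 # number of simulated sample means (assumption)

# exponential data are strongly right-skewed (population skewness = 2)
skews = {}
for n in (2, 30, 200):
    means = rng.exponential(scale=1.0, size=(n_means, n)).mean(axis=1)
    skews[n] = stats.skew(means)
    print(f'n = {n:3d}: skewness of sample means = {skews[n]:.3f}')
```

The skewness of the sample means shrinks toward 0, the value for a normal distribution, as the sample size grows.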
### Key differences between t-tests and Z-tests
Both the t-test and the Z-test compare sample statistics to population parameters, but they differ in their underlying assumptions, applications, and the conditions under which they are most appropriate.

#### Sample size considerations
- t-test: typically used when the sample size is small, generally less than 30. It remains valid for small samples from an approximately normal population, where the Central Limit Theorem cannot be relied on.
- Z-test: used when the sample size is large, typically greater than 30. In large samples, the sampling distribution of the mean is approximately normal, which justifies using the Z-test.

#### Population variance knowledge
- t-test: used when the population variance is unknown. The sample variance replaces the population variance in the test statistic. The t-distribution, which has heavier tails than the normal distribution, accounts for the additional uncertainty introduced by estimating the population variance.
- Z-test: requires that the population variance is known. This is a key assumption because it allows the standard normal distribution to be used for the test statistic. With a known variance, the test statistic carries no additional uncertainty from estimating it.

#### Distribution assumptions
- t-test: assumes that the data within each group are approximately normally distributed, which is particularly important for small sample sizes. The test statistic follows a t-distribution, whose wider tails account for the additional variability and uncertainty of estimating the population standard deviation from a small sample.
- Z-test: assumes that the data are normally distributed or that the sample size is large enough for the Central Limit Theorem to apply. The theorem ensures that, for large samples, the sampling distribution of the mean is approximately normal even if the underlying data are not perfectly normal.

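The heavier tails of the t-distribution translate into larger critical values than those of the normal distribution. The sketch below compares two-tailed critical values at $\alpha = 0.05$ for a few degrees of freedom; the particular values of `df` are arbitrary choices for illustration.

```python
import scipy.stats as stats

z_critical = stats.norm.ppf(0.975)  # two-tailed critical value at alpha = 0.05
print(f'normal: z critical = {z_critical:.4f}')

t_criticals = {}
for df in (5, 10, 30, 100, 1000):
    t_criticals[df] = stats.t.ppf(0.975, df)
    gap = t_criticals[df] - z_critical
    print(f'df = {df:4d}: t critical = {t_criticals[df]:.4f}, excess over normal = {gap:.4f}')
```

The gap closes as the degrees of freedom grow, which is why the two tests give nearly identical results for large samples.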
#### Practical applications and use cases
- t-test: commonly used in small-sample studies, such as pilot studies, where the population variance is unknown. Examples include comparing the effectiveness of two treatments in a small group or assessing changes within the same group over time.
- Z-test: used in large-sample studies or when dealing with well-established populations where the variance is known. It is often applied in quality control, survey analysis, and large-scale experimental studies.

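Here is a table with the key differences:

| | t-test | Z-test |
| --- | --- | --- |
| Sample size | small, generally less than 30 | large, typically greater than 30 |
| Population variance | unknown, estimated from the sample | known |
| Distribution of the test statistic | t-distribution (heavier tails) | standard normal distribution |
| Typical use cases | small-sample studies, such as pilot studies | quality control, survey analysis, large-scale experimental studies |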


### Conclusion
This tutorial introduced you to hypothesis testing and two commonly used tests: t-tests and Z-tests. We covered each test's definition, types, and assumptions, examined their key differences, and identified which test suits which scenario.


