## Hypothesis Testing

Hypothesis testing is a statistical technique for drawing conclusions about a population from a sample of data. It involves formulating two competing hypotheses: the null hypothesis (H0) and the alternative hypothesis (H1 or Ha). The goal is to assess whether the sample provides enough evidence to reject the null hypothesis in favor of the alternative hypothesis.

The steps involved in hypothesis testing include:
1. Formulating the null and alternative hypotheses based on the research question.
2. Choosing a significance level (alpha) to determine the threshold for rejecting the null hypothesis.
3. Computing a test statistic and its p-value with a test suited to the data (for example, a t-test for means or a chi-square test for counts).
4. Comparing the obtained p-value with the significance level to make a decision: reject the null hypothesis if the p-value is less than alpha, or fail to reject the null hypothesis if the p-value is greater than or equal to alpha.
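To make the role of alpha in step 2 concrete, the following is a minimal simulation sketch. It assumes both groups are drawn from the same normal population, so the null hypothesis is true by construction, and counts how often the decision rule in step 4 rejects it anyway. The number of experiments, the sample size, and the seed are arbitrary choices made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
alpha = 0.05
n_experiments = 2000  # illustrative choice
sample_size = 30      # illustrative choice

# Both groups come from the same population, so H0 (equal means) is true
group_a = rng.normal(loc=10, scale=5, size=(n_experiments, sample_size))
group_b = rng.normal(loc=10, scale=5, size=(n_experiments, sample_size))

# Run one two-sample t-test per row (one row = one simulated experiment)
result = stats.ttest_ind(group_a, group_b, axis=1)
p_values = result.pvalue

# Share of experiments in which a true H0 is rejected
false_rejection_rate = np.mean(p_values < alpha)
print(f"Share of experiments rejecting a true H0: {false_rejection_rate:.3f}")
```

The printed share should be close to alpha: the significance level is the long-run rate at which a true null hypothesis is rejected.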

In the code below, we first import the necessary libraries: NumPy for numerical operations and SciPy for statistical tests.

We generate two samples (`data1` and `data2`) with the `np.random.normal()` function. They represent draws from two normal populations with different means (10 and 12) and the same standard deviation (5).

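We then perform a two-sample t-test using the `stats.ttest_ind()` function from SciPy. This test compares the means of two independent samples under the null hypothesis that the population means are equal, and returns the t-statistic and the p-value. By default, it assumes that both populations have the same variance.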

Next, we set the significance level (`alpha`) to 0.05, which is commonly used in hypothesis testing. This represents the threshold for determining statistical significance.

We check the p-value against the significance level. If the p-value is less than alpha, we reject the null hypothesis; otherwise, we fail to reject the null hypothesis.

Finally, we display the t-statistic and p-value to provide additional information about the test results.

In [None]:
# Importing the required libraries
import numpy as np
import scipy.stats as stats

In [None]:
# Generating two sets of data
np.random.seed(42)
data1 = np.random.normal(loc=10, scale=5, size=100)
data2 = np.random.normal(loc=12, scale=5, size=100)

In [None]:
# Performing a two-sample t-test
t_statistic, p_value = stats.ttest_ind(data1, data2)

In [None]:
# Setting the significance level (alpha)
alpha = 0.05

In [None]:
# Checking the p-value against the significance level
if p_value < alpha:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")

In [None]:
# Displaying the t-statistic and p-value
print(f"t-statistic: {t_statistic}")
print(f"p-value: {p_value}")
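When the equal-variance assumption is in doubt, `stats.ttest_ind()` accepts `equal_var=False`, which performs Welch's t-test instead. The sketch below regenerates the same two samples and runs both variants side by side. It is an illustrative comparison, not a recommendation for which variant to use on a given dataset.

```python
import numpy as np
from scipy import stats

# Same samples as in the cells above
np.random.seed(42)
data1 = np.random.normal(loc=10, scale=5, size=100)
data2 = np.random.normal(loc=12, scale=5, size=100)

# Student's t-test (default): assumes equal population variances
student = stats.ttest_ind(data1, data2)

# Welch's t-test: does not assume equal population variances
welch = stats.ttest_ind(data1, data2, equal_var=False)

print(f"Student: t = {student.statistic:.4f}, p = {student.pvalue:.6f}")
print(f"Welch:   t = {welch.statistic:.4f}, p = {welch.pvalue:.6f}")
```

Because the two samples have the same size, both variants produce the same t-statistic here. Only the degrees of freedom, and therefore the p-value, can differ.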

Example 1: One-sample t-test: This test compares the mean of a sample to a hypothesized population mean using `stats.ttest_1samp()`. The null hypothesis is that the population mean equals the hypothesized value (`popmean`), and the test checks whether the sample mean differs significantly from it.

In [None]:
# Example 1: One-sample t-test
np.random.seed(42)
data = np.random.normal(loc=15, scale=3, size=100)
t_statistic, p_value = stats.ttest_1samp(data, popmean=10)
if p_value < alpha:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")
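To show what `stats.ttest_1samp()` computes, here is a sketch that rebuilds the statistic by hand from the standard one-sample formula, t = (sample mean - popmean) / (s / sqrt(n)), and derives a two-sided p-value from the t distribution with n - 1 degrees of freedom. The variable names `t_manual` and `p_manual` are introduced only for this illustration.

```python
import numpy as np
from scipy import stats

# Same sample as in Example 1
np.random.seed(42)
data = np.random.normal(loc=15, scale=3, size=100)
popmean = 10

n = data.size
# Standard error of the mean, using the sample standard deviation (ddof=1)
standard_error = data.std(ddof=1) / np.sqrt(n)

# t-statistic and two-sided p-value computed by hand
t_manual = (data.mean() - popmean) / standard_error
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# Reference values from SciPy
t_scipy, p_scipy = stats.ttest_1samp(data, popmean=popmean)

print(f"manual: t = {t_manual:.4f}, p = {p_manual:.3e}")
print(f"scipy:  t = {t_scipy:.4f}, p = {p_scipy:.3e}")
```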

Example 2: Paired t-test: This test compares the means of two related samples (e.g., before and after treatment) using `stats.ttest_rel()`. It tests whether the mean of the paired differences is significantly different from zero.

In [None]:
# Example 2: Paired t-test
np.random.seed(42)
before = np.random.normal(loc=10, scale=3, size=100)
after = before + np.random.normal(loc=2, scale=1, size=100)
t_statistic, p_value = stats.ttest_rel(before, after)
if p_value < alpha:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")
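A paired t-test is equivalent to a one-sample t-test on the pairwise differences against a mean of zero. The sketch below checks this equivalence on the same simulated data. The name `differences` is introduced only for this illustration.

```python
import numpy as np
from scipy import stats

# Same paired data as in Example 2
np.random.seed(42)
before = np.random.normal(loc=10, scale=3, size=100)
after = before + np.random.normal(loc=2, scale=1, size=100)

# Paired t-test on the two related samples
paired = stats.ttest_rel(before, after)

# One-sample t-test on the differences, with H0: mean difference = 0
differences = before - after
one_sample = stats.ttest_1samp(differences, popmean=0)

print(f"paired:     t = {paired.statistic:.4f}, p = {paired.pvalue:.3e}")
print(f"one-sample: t = {one_sample.statistic:.4f}, p = {one_sample.pvalue:.3e}")
```

This is why pairing matters: the test works on the within-pair differences, so variation between subjects that is shared by both measurements cancels out.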

Example 3: Chi-square test: This test examines the association between two categorical variables using `stats.chi2_contingency()`. It compares the observed frequencies in a contingency table with the frequencies that would be expected if the two variables were independent, which is the null hypothesis.

For each example, the code performs the hypothesis test, checks the p-value against the significance level (`alpha`), and prints whether to reject or fail to reject the null hypothesis.

In [None]:
# Example 3: Chi-square test
# Observed counts: rows and columns are the levels of two categorical variables
observed = np.array([[25, 15, 10], [30, 35, 25]])
chi2, p_value, dof, expected = stats.chi2_contingency(observed)
if p_value < alpha:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")
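The `expected` array returned by `stats.chi2_contingency()` holds the counts implied by independence: each cell is (row total × column total) / grand total. The sketch below recomputes the expected counts, the statistic, and the p-value by hand for the same table. The `*_manual` names are introduced only for this illustration. The comparison relies on the table having more than one degree of freedom, since SciPy applies Yates' continuity correction by default only when there is exactly one.

```python
import numpy as np
from scipy import stats

# Same contingency table as in Example 3
observed = np.array([[25, 15, 10], [30, 35, 25]])

# Expected counts under independence: outer product of the margins / total
row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
grand_total = observed.sum()
expected_manual = np.outer(row_totals, col_totals) / grand_total

# Pearson chi-square statistic, degrees of freedom, and p-value by hand
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_manual = stats.chi2.sf(chi2_manual, dof_manual)

# Reference values from SciPy
chi2, p_value, dof, expected = stats.chi2_contingency(observed)

print(f"manual: chi2 = {chi2_manual:.4f}, dof = {dof_manual}, p = {p_manual:.4f}")
print(f"scipy:  chi2 = {chi2:.4f}, dof = {dof}, p = {p_value:.4f}")
```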