# Statistical Inference and Hypothesis Testing

**Statistical Inference:**

Statistical inference is a process of drawing conclusions about a population based on a sample of data from that population. It involves making predictions, estimates, or decisions about a population parameter (such as a mean or proportion) using information obtained from a subset of that population (the sample). The goal is to generalize the findings from the sample to the larger population while accounting for the inherent uncertainty involved in the sampling process.

There are two main types of statistical inference:



*   Estimation: This involves estimating an unknown population parameter based on sample data. Common estimation techniques include point estimation, where a single value is used to estimate the parameter, and interval estimation, where a range of values (confidence interval) is provided.

*   Hypothesis Testing: This involves making decisions about a population parameter by checking whether the sample data are consistent with a stated hypothesis. The most common form tests a null hypothesis against an alternative hypothesis to determine whether there is enough evidence to reject the null in favor of the alternative.
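
To make the estimation idea concrete, here is a minimal sketch of a point estimate and a 95% confidence interval for a population mean. The sample values are hypothetical and chosen only for illustration, and the interval assumes the data are roughly normally distributed so that the t distribution applies.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of measurements (illustrative values, not real data)
sample = np.array([72.0, 68.5, 80.0, 75.5, 77.0, 69.0, 83.5, 74.0, 71.5, 78.0])

n = sample.size
point_estimate = sample.mean()   # point estimate of the population mean
std_error = stats.sem(sample)    # standard error: s / sqrt(n), with s computed using n - 1

# 95% confidence interval from the t distribution with n - 1 degrees of freedom
t_crit = stats.t.ppf(0.975, df=n - 1)
ci_low = point_estimate - t_crit * std_error
ci_high = point_estimate + t_crit * std_error

print(f"Point estimate: {point_estimate:.2f}")
print(f"95% confidence interval: ({ci_low:.2f}, {ci_high:.2f})")
```

The point estimate is a single number, while the interval expresses the uncertainty that comes from having observed only a sample.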

**Hypothesis Testing:**

Hypothesis testing is a structured statistical method for making decisions or drawing inferences about population parameters using sample data.

1. Formulate Hypotheses:

 *   Null Hypothesis $H_0$: This is a statement of no effect or difference, often reflecting the status quo or a default assumption.
 *   Alternative Hypothesis $H_1$ or $H_a$: This is a statement contradicting the null hypothesis, representing the effect or difference that the researcher aims to identify.

2. Choose Significance Level $α$:

 *  The significance level, denoted as $α$, is the probability of committing a Type I error (incorrectly rejecting a true null hypothesis). It is chosen before the data are analyzed. Commonly used values for $α$ include 0.05 and 0.01.

3. Collect and Analyze Data:

 *  Collect a sample of data and choose a statistical test appropriate to the hypothesis under examination (e.g., t-test, chi-square test).

4. Calculate Test Statistic and P-value:

 *  Compute a test statistic that summarizes the information in the sample, then determine the p-value: the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true.

5. Make a Decision:

* Compare the p-value to the significance level:
 * If $p ≤ α$, reject the null hypothesis, indicating sufficient evidence to support the alternative hypothesis.
 * If $p > α$, fail to reject the null hypothesis, signifying insufficient evidence to support the alternative hypothesis.
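
The meaning of $α$ in this decision rule can be illustrated by simulation. The sketch below assumes a setting in which the null hypothesis is actually true, runs many one-sample t-tests, and counts how often the rule $p ≤ α$ rejects anyway. The seed, the true mean of 50, and the number of simulated experiments are arbitrary choices made for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
alpha = 0.05
true_mean = 50.0                 # H0 is true: the data really have this mean
n_experiments = 5000             # number of simulated studies (assumption)
sample_size = 30

# Draw all experiments at once: one row per simulated sample
samples = rng.normal(loc=true_mean, scale=10.0, size=(n_experiments, sample_size))

# Run a one-sample t-test on each row against the true mean
result = stats.ttest_1samp(samples, popmean=true_mean, axis=1)

# Every rejection here is a Type I error, because H0 holds by construction
false_positive_rate = np.mean(result.pvalue <= alpha)

print(f"Share of true null hypotheses rejected: {false_positive_rate:.3f} (alpha = {alpha})")
```

The observed rejection rate should land close to $α$, which is what it means for $α$ to be the probability of a Type I error.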

<img src= "https://stats.libretexts.org/@api/deki/files/855/null_hypothesis_1.png?revision=1&size=bestfit&width=762&height=298" >



```python
import numpy as np
from scipy.stats import ttest_1samp

# Generate or load your dataset
# For this example, simulate a dataset of exam scores
np.random.seed(42)  # fixed seed so the results are reproducible
exam_scores = np.random.normal(loc=75, scale=10, size=30)  # mean=75, std=10, sample size=30

# Hypothesized population mean (the value we want to test against)
population_mean = 80

# Perform a two-sided one-sample t-test (H0: the population mean equals 80)
t_statistic, p_value = ttest_1samp(exam_scores, population_mean)

# Display the results
print(f'Test Statistic: {t_statistic}')
print(f'P-value: {p_value}')

# Check if the null hypothesis is rejected at a significance level of 0.05
alpha = 0.05
if p_value <= alpha:
    print("Reject the null hypothesis: The sample mean is significantly different from the population mean.")
else:
    print("Fail to reject the null hypothesis: There is not enough evidence to claim a significant difference.")
```


Output:

```
Test Statistic: -4.18789873337477
P-value: 0.00023965836838861642
Reject the null hypothesis: The sample mean is significantly different from the population mean.
```
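
To show where these numbers come from, the following sketch recomputes the test statistic by hand using $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$ and obtains the two-sided p-value from the t distribution with $n - 1$ degrees of freedom. It regenerates the same simulated exam scores with the same seed; the names `t_manual` and `p_manual` are introduced here for illustration only.

```python
import numpy as np
from scipy import stats

# Recreate the simulated exam scores (same seed and parameters as the example above)
np.random.seed(42)
exam_scores = np.random.normal(loc=75, scale=10, size=30)
population_mean = 80   # hypothesized mean under H0

n = exam_scores.size
sample_mean = exam_scores.mean()
sample_std = exam_scores.std(ddof=1)   # sample standard deviation (n - 1 in the denominator)

# Test statistic: distance between sample mean and hypothesized mean, in standard errors
t_manual = (sample_mean - population_mean) / (sample_std / np.sqrt(n))

# Two-sided p-value: probability of a statistic at least this extreme under H0
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

print(f"Manual t statistic: {t_manual:.4f}")
print(f"Manual p-value:     {p_manual:.6f}")
```

The values should agree with those returned by `ttest_1samp`, which performs the same calculation internally for a two-sided test.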


# Conclusion

The one-sample t-test on a simulated dataset of 30 exam scores tested whether the sample mean differed from a hypothesized population mean of 80. The test produced a test statistic of about -4.19 and a p-value of about 0.00024. Because the p-value is below the significance level of 0.05, the null hypothesis is rejected: the sample mean differs significantly from 80. This is the expected outcome, since the scores were generated from a distribution with a mean of 75. Had the p-value been greater than 0.05, the null hypothesis would not have been rejected, indicating insufficient evidence to claim a difference. In this way, the test statistic and p-value guide conclusions about the population based on the analyzed sample.