# Statistics Advance Assignment - 3

# Q1: What is Estimation Statistics? Explain point estimate and interval estimate.

Estimation Statistics:

Estimation statistics is a branch of statistics concerned with estimating population parameters based on sample data. In estimation statistics, we use information from a sample to make inferences about unknown parameters of the population from which the sample was drawn. The two main types of estimation in statistics are point estimation and interval estimation.

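Point Estimate:

A point estimate is a single value that serves as the best guess or approximation of an unknown population parameter. It is obtained by calculating a statistic from sample data. For example, the sample mean is a point estimate of the population mean.

Interval Estimate:

An interval estimate is a range of values, calculated from sample data, that is likely to contain the unknown population parameter. It is reported together with a confidence level, such as a 95% confidence interval for the population mean. Unlike a point estimate, it also conveys how much uncertainty surrounds the estimate.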
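As a rough illustration of the two kinds of estimate, the sketch below computes a point estimate and a 95% interval estimate of a population mean. The sample values are hypothetical and chosen only for illustration, and the interval uses the normal approximation, which is an assumption that suits larger samples better than this small one.

```python
import math
import statistics

# Hypothetical sample of 10 measurements (illustrative values only)
sample = [48, 52, 50, 47, 53, 49, 51, 50, 46, 54]

# Point estimate: a single value for the population mean
point_estimate = statistics.mean(sample)

# Interval estimate: point estimate +/- margin of error (normal approximation)
sample_std = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
standard_error = sample_std / math.sqrt(len(sample))
z_score = statistics.NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
margin_of_error = z_score * standard_error
interval_estimate = (point_estimate - margin_of_error, point_estimate + margin_of_error)

print(f"Point estimate: {point_estimate:.2f}")
print(f"Interval estimate: ({interval_estimate[0]:.2f}, {interval_estimate[1]:.2f})")
```

The point estimate is one number, while the interval estimate brackets it with a margin of error that shrinks as the sample grows.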

# Q2. Write a Python function to estimate the population mean using a sample mean and standard deviation.

In [1]:
import scipy.stats as stats

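def estimate_population_mean(sample_mean, sample_std, sample_size, confidence_level=0.95):
    """Return a (lower, upper) confidence interval for the population mean.

    Uses the normal (Z) approximation, which is reasonable for large samples.
    """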
    # Calculate the standard error (standard deviation of the sampling distribution)
    standard_error = sample_std / (sample_size ** 0.5)
    
    # Calculate the margin of error using the Z-score for the desired confidence level
    z_score = stats.norm.ppf(1 - (1 - confidence_level) / 2)
    margin_of_error = z_score * standard_error
    
    # Calculate the lower and upper bounds of the confidence interval
    lower_bound = sample_mean - margin_of_error
    upper_bound = sample_mean + margin_of_error
    
    return lower_bound, upper_bound

# Example usage:
sample_mean = 50
sample_std = 10
sample_size = 100
confidence_level = 0.95

lower_bound, upper_bound = estimate_population_mean(sample_mean, sample_std, sample_size, confidence_level)
print(f"Estimated population mean lies between {lower_bound:.2f} and {upper_bound:.2f} with {confidence_level*100}% confidence.")


Estimated population mean lies between 48.04 and 51.96 with 95.0% confidence.
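The Z-based interval above is a large-sample approximation. When the sample is small and only the sample standard deviation is available, a common alternative is to use a critical value from Student's t distribution. The sketch below shows that variant; the helper name `t_confidence_interval` and the sample sizes are hypothetical and not part of the original answer.

```python
import math
from scipy import stats

def t_confidence_interval(sample_mean, sample_std, sample_size, confidence_level=0.95):
    """Hypothetical helper: confidence interval for the mean using a t critical value."""
    standard_error = sample_std / math.sqrt(sample_size)
    degrees_of_freedom = sample_size - 1
    # Two-sided critical value from Student's t distribution
    t_critical = stats.t.ppf(1 - (1 - confidence_level) / 2, df=degrees_of_freedom)
    margin_of_error = t_critical * standard_error
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Same mean and standard deviation as above, with a small and a large sample size
small_sample_interval = t_confidence_interval(50, 10, 15)
large_sample_interval = t_confidence_interval(50, 10, 100)

print(f"n = 15:  {small_sample_interval[0]:.2f} to {small_sample_interval[1]:.2f}")
print(f"n = 100: {large_sample_interval[0]:.2f} to {large_sample_interval[1]:.2f}")
```

For a sample size of 100 the t-based interval is only slightly wider than the Z-based one, while for a sample size of 15 the difference is noticeable.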


# Q3: What is Hypothesis testing? Why is it used? State the importance of Hypothesis testing.

Hypothesis Testing:

Hypothesis testing is a statistical method used to make inferences or decisions about population parameters based on sample data. It involves formulating a hypothesis about the population parameter of interest and then using sample data to assess the plausibility of the hypothesis. Hypothesis testing is widely used in scientific research, quality control, business decision-making, and various other fields to test theories, validate assumptions, and make evidence-based decisions.

Importance of Hypothesis Testing:

Scientific Research: 

Hypothesis testing is essential in scientific research to evaluate theories, hypotheses, and research questions. It allows researchers to determine whether observed differences or relationships in data are statistically significant or simply due to chance.

Quality Control and Process Improvement: 

In industries such as manufacturing, healthcare, and finance, hypothesis testing is used to assess the effectiveness of process improvements, product enhancements, or quality control measures. It helps organizations identify areas for improvement and make data-driven decisions to enhance efficiency and quality.

Business Decision-Making: 

Hypothesis testing is valuable in business decision-making to evaluate marketing strategies, product launches, pricing changes, and other business initiatives. It enables organizations to assess the impact of interventions and make informed decisions to optimize performance and profitability.

Policy Evaluation: 

In public policy and social sciences, hypothesis testing is used to evaluate the effectiveness of policies, programs, and interventions. It helps policymakers and researchers determine whether interventions have the intended effects and whether resources are being allocated efficiently.

Medical and Healthcare Research:

Hypothesis testing plays a crucial role in medical and healthcare research to assess the efficacy of treatments, interventions, and diagnostic procedures. It helps healthcare professionals make evidence-based decisions and improve patient outcomes.

Legal and Regulatory Compliance:

In legal and regulatory contexts, hypothesis testing is used to assess compliance with laws, regulations, and standards. It helps ensure fairness, accountability, and transparency in various domains, such as environmental protection, consumer safety, and financial regulation.

Educational Assessment: 

Hypothesis testing is employed in educational research and assessment to evaluate teaching methods, curriculum changes, and educational interventions. It helps educators and policymakers identify effective educational practices and improve student outcomes.

In summary, hypothesis testing is a fundamental statistical method used to evaluate hypotheses, make decisions, and draw conclusions based on sample data. Its importance lies in its wide-ranging applications across various fields, enabling researchers, practitioners, and decision-makers to make informed decisions, solve problems, and advance knowledge.

# Q4. Create a hypothesis that states whether the average weight of male college students is greater than the average weight of female college students.

Null Hypothesis (H0):
The average weight of male college students is less than or equal to the average weight of female college students (μ_male ≤ μ_female).

Alternative Hypothesis (H1):
The average weight of male college students is greater than the average weight of female college students (μ_male > μ_female). Because H1 specifies a direction, this is a one-tailed (right-tailed) test.
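A directional hypothesis like this one could be checked with a one-sided two-sample t-test. The sketch below is one possible way to do it, assuming SciPy 1.6 or later (for the `alternative` argument of `ttest_ind`). The weights are hypothetical values made up for illustration, and the choice of Welch's test (`equal_var=False`) is an assumption rather than something the question prescribes.

```python
from scipy import stats

# Hypothetical weights in kg (illustrative values, not real data)
male_weights = [72, 75, 68, 80, 77, 74, 70, 79]
female_weights = [60, 63, 58, 66, 62, 65, 59, 61]

alpha = 0.05  # significance level

# alternative="greater" tests H1: mean(male) > mean(female)
# equal_var=False selects Welch's t-test, which does not assume equal variances
result = stats.ttest_ind(male_weights, female_weights,
                         alternative="greater", equal_var=False)
reject_null = result.pvalue < alpha

print("t-statistic:", result.statistic)
print("one-sided p-value:", result.pvalue)
print("Reject H0" if reject_null else "Fail to reject H0")
```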

# Q5. Write a Python script to conduct a hypothesis test on the difference between two population means, given a sample from each population.

In [None]:
from scipy import stats

def independent_t_test(sample1, sample2, alpha=0.05):
    """Run a two-sided independent-samples t-test (equal variances assumed, SciPy's default).

    Returns the t-statistic, the p-value, and whether H0 is rejected at level alpha.
    """
    t_statistic, p_value = stats.ttest_ind(sample1, sample2)
    reject_null = p_value < alpha
    return t_statistic, p_value, reject_null

# Example usage:
sample1 = [65, 68, 70, 72, 71]  # Sample from population 1
sample2 = [60, 62, 64, 63, 66]  # Sample from population 2
alpha = 0.05  # Significance level

t_statistic, p_value, reject_null = independent_t_test(sample1, sample2, alpha)

print("Independent Samples t-Test Results:")
print("t-statistic:", t_statistic)
print("p-value:", p_value)

# Interpret the results based on the p-value and chosen significance level
if reject_null:
    print("Reject the null hypothesis; there is a significant difference between the population means.")
else:
    print("Fail to reject the null hypothesis; there is insufficient evidence of a difference between the population means.")


# Q6: What is a null and alternative hypothesis? Give some examples.

In statistical hypothesis testing, the null hypothesis (denoted as H0) and the alternative hypothesis (denoted as H1 or Ha) are two complementary statements that are used to assess the validity of a claim about a population parameter based on sample data.

Null Hypothesis (H0):
The null hypothesis is a statement that represents the default or status quo assumption. It states that there is no effect, no difference, or no relationship between variables in the population. In other words, the null hypothesis proposes that any observed difference or effect in the sample data is due to random chance or sampling variability. It is the hypothesis that is tested directly: we reject it only when the sample data would be sufficiently unlikely if it were true.

Alternative Hypothesis (H1 or Ha):
The alternative hypothesis is a statement that contradicts the null hypothesis. It represents the claim, effect, or difference that the researcher is interested in investigating. The alternative hypothesis states that there is a real effect, difference, or relationship between variables in the population, beyond what would be expected by chance alone.

Examples:

Example 1 - Mean Comparison:

Null Hypothesis (H0): The average height of male and female students in a school is the same.

Alternative Hypothesis (H1): The average height of male students is different from the average height of female students.

Example 2 - Proportion Comparison:

Null Hypothesis (H0): The proportion of defective products produced by Machine A is equal to the proportion of defective products produced by Machine B.

Alternative Hypothesis (H1): The proportion of defective products produced by Machine A is different from the proportion of defective products produced by Machine B.

Example 3 - Correlation Analysis:

Null Hypothesis (H0): There is no correlation between hours of study and exam scores.

Alternative Hypothesis (H1): There is a nonzero correlation between hours of study and exam scores.

Example 4 - Treatment Effect:

Null Hypothesis (H0): The mean blood pressure of patients before and after treatment is the same.

Alternative Hypothesis (H1): The mean blood pressure of patients after treatment is different from the mean blood pressure before treatment.
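To make one of these hypothesis pairs concrete, the sketch below tests the treatment-effect example with a paired t-test, since the same patients are measured before and after treatment. The blood pressure readings are hypothetical values invented for illustration, and using `scipy.stats.ttest_rel` is one reasonable choice rather than the only one.

```python
from scipy import stats

# Hypothetical systolic blood pressure readings for the same 8 patients
before_treatment = [150, 142, 138, 160, 155, 147, 152, 149]
after_treatment = [144, 139, 135, 151, 150, 145, 146, 143]

alpha = 0.05  # significance level

# Paired (dependent-samples) t-test: H0 says the mean before/after difference is zero
t_statistic, p_value = stats.ttest_rel(before_treatment, after_treatment)

print("t-statistic:", t_statistic)
print("p-value:", p_value)
if p_value < alpha:
    print("Reject H0: the mean blood pressure differs before and after treatment.")
else:
    print("Fail to reject H0: insufficient evidence of a change in mean blood pressure.")
```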

# Q7: Write down the steps involved in hypothesis testing.