<span style=color:red;font-size:50px>ASSIGNMENT</span>

<span style=color:blue;font-size:46px>STATISTICS ADVANCE-3</span>

<span style=color:green>Q1: What is Estimation Statistics? Explain point estimate and interval estimate</span>

Ans-

### Estimation Statistics

Estimation statistics involves the use of sample data to estimate unknown population parameters. There are two main types of estimates: point estimates and interval estimates.

#### Point Estimate:

A point estimate is a single value that is used to approximate an unknown population parameter. It's a specific numerical value calculated from the sample data. For example, if you want to estimate the average height of a population based on a sample, the sample mean would be a point estimate for the population mean.

#### Interval Estimate:

An interval estimate provides a range within which the true population parameter is likely to fall, which acknowledges the uncertainty of estimating a parameter from a sample. Confidence intervals are the most common form of interval estimate. A confidence interval combines a point estimate with a margin of error (point estimate ± margin of error) at a chosen confidence level. For instance, a 95% confidence interval for the average height of a population might be [65 inches, 70 inches]. This means the interval was produced by a procedure that captures the true population mean in about 95% of repeated samples, which is what is meant by being 95% confident that the true mean lies in this interval.

In summary, while a point estimate gives a single best guess for the parameter, an interval estimate provides a range of values that is believed to contain the true parameter. The choice between the two depends on the level of precision and confidence required in the estimation process.
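
As a small illustration of the two kinds of estimate, the sketch below computes a point estimate and a 95% interval estimate from a made-up sample of heights. The data values and variable names are assumptions for illustration only, and the interval uses the t-distribution because the sample is small and the population standard deviation is unknown.

```python
import numpy as np
from scipy.stats import t

# Hypothetical sample of heights in inches (illustrative values only)
heights = np.array([66.1, 68.4, 67.2, 69.0, 65.8, 70.3, 67.9, 66.5])

n = heights.size
point_estimate = heights.mean()                # point estimate of the population mean
std_error = heights.std(ddof=1) / np.sqrt(n)   # standard error from the sample std

# Critical t-value for a two-sided 95% interval with n - 1 degrees of freedom
t_critical = t.ppf(0.975, df=n - 1)
margin_of_error = t_critical * std_error

# Interval estimate: point estimate plus and minus the margin of error
interval_estimate = (point_estimate - margin_of_error,
                     point_estimate + margin_of_error)

print(f"Point estimate: {point_estimate:.2f}")
print(f"95% interval estimate: ({interval_estimate[0]:.2f}, {interval_estimate[1]:.2f})")
```

The point estimate is a single number, while the interval widens or narrows with the sample's variability and size.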


<span style=color:green>Q2. Write a Python function to estimate the population mean using a sample mean and standard
deviation.</span>

Ans-

In [4]:
def estimate_population_mean(sample_mean, sample_size, sample_std):
    """
    Estimate the population mean using a sample mean, sample size, and sample standard deviation.

    Parameters:
    - sample_mean (float): The mean of the sample.
    - sample_size (int): The size of the sample.
    - sample_std (float): The standard deviation of the sample.

    Returns:
    - dict: The point estimate ("estimated_mean") and the 95% confidence interval ("confidence_interval") as a (lower, upper) tuple.
    """
    # Calculate the standard error of the mean
    standard_error = sample_std / (sample_size ** 0.5)

    # Calculate the margin of error for a 95% confidence level
    # 1.96 is the critical z-value for 95%; change it for other confidence levels
    z_critical = 1.96
    margin_of_error = z_critical * standard_error

    # Calculate the lower and upper bounds of the confidence interval
    lower_bound = sample_mean - margin_of_error
    upper_bound = sample_mean + margin_of_error

    # Return the estimated population mean and the confidence interval
    return {
        "estimated_mean": sample_mean,
        "confidence_interval": (lower_bound, upper_bound)
    }

# Example usage:
sample_mean = 60.0
sample_size = 110
sample_std = 40.0

result = estimate_population_mean(sample_mean, sample_size, sample_std)
print(f"Estimated Population Mean: {result['estimated_mean']}")
print(f"95% Confidence Interval: {result['confidence_interval']}")


Estimated Population Mean: 60.0
95% Confidence Interval: (52.524853300314554, 67.47514669968544)
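
The value 1.96 used above is the two-sided critical z-value for 95% confidence. A minimal sketch of how that value could be obtained for other confidence levels follows; `z_critical_value` is a hypothetical helper name, not part of the original answer.

```python
from scipy.stats import norm

def z_critical_value(confidence=0.95):
    """Two-sided critical z-value for a confidence level (hypothetical helper)."""
    # Split the leftover probability equally between the two tails
    return norm.ppf(1 - (1 - confidence) / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {z_critical_value(level):.4f}")
```

A higher confidence level gives a larger critical value and therefore a wider interval.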


<span style=color:green>Q3: What is Hypothesis testing? Why is it used? State the importance of Hypothesis testing.</span>

Ans-

### Hypothesis Testing

**Definition:**
Hypothesis testing is a statistical method used to make inferences about population parameters based on a sample of data. It involves the formulation of two competing hypotheses: a null hypothesis (H0) and an alternative hypothesis (H1). The goal is to assess the evidence provided by the sample data to either reject the null hypothesis in favor of the alternative hypothesis or fail to reject the null hypothesis.

**Components of Hypothesis Testing:**
1. **Null Hypothesis (H0):** A statement that there is no significant difference or effect.
2. **Alternative Hypothesis (H1):** A statement that contradicts the null hypothesis, suggesting a significant difference or effect.
3. **Significance Level (α):** The predetermined level of significance that represents the probability of rejecting the null hypothesis when it is true.
4. **Test Statistic:** A calculated value from the sample data used to make a decision about the null hypothesis.
5. **P-value:** The probability of observing a test statistic as extreme as, or more extreme than, the one calculated from the sample data, assuming the null hypothesis is true.

**Importance of Hypothesis Testing:**

1. **Inference:** Hypothesis testing allows researchers to draw conclusions about population parameters based on a sample. It provides a structured approach to inferential statistics.

2. **Decision-Making:** Hypothesis testing provides a systematic method for decision-making in various fields, including science, business, medicine, and social sciences. Decisions are made based on statistical evidence rather than intuition.

3. **Scientific Research:** Hypothesis testing is fundamental to scientific research. It helps researchers determine whether their findings are statistically significant and whether they can reject the null hypothesis in favor of a new theory.

4. **Quality Control:** In industries, hypothesis testing is used to ensure the quality and consistency of products or processes. For example, it may be employed to test whether a manufacturing process is meeting certain specifications.

5. **Policy and Strategy Development:** Decision-makers in various fields use hypothesis testing to evaluate the effectiveness of policies, strategies, or interventions. It provides a basis for evidence-based decision-making.

6. **Risk Management:** Hypothesis testing helps assess risks and uncertainties by providing a statistical framework for evaluating the likelihood of specific outcomes.

In summary, hypothesis testing is a crucial statistical tool that provides a systematic and objective approach to drawing conclusions from sample data, making informed decisions, and advancing scientific knowledge in various domains.


<span style=color:green>Q4. Create a hypothesis that states whether the average weight of male college students is greater than
the average weight of female college students.

</span>

Ans-

### Hypothesis for Average Weight Comparison Between Male and Female College Students

**Null Hypothesis (H0):**
The average weight of male college students is equal to or less than the average weight of female college students.

**Alternative Hypothesis (H1):**
The average weight of male college students is greater than the average weight of female college students.

In symbols:

- H0: μ_male ≤ μ_female
- H1: μ_male > μ_female

Here, μ_male represents the population mean weight of male college students, and μ_female represents the population mean weight of female college students.

This hypothesis can be tested using statistical methods and sample data to determine whether there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis, suggesting that, on average, male college students weigh more than female college students.
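
One way such a one-sided test might be run is sketched below. The weights are invented for illustration, and the choice of Welch's t-test (`equal_var=False`) is an assumption rather than something the question specifies. The one-sided p-value is derived from the two-sided result so the sketch does not depend on newer SciPy options.

```python
import numpy as np
from scipy.stats import ttest_ind

# Hypothetical weights in kilograms (illustrative values only)
male_weights = np.array([72.5, 80.1, 68.3, 77.4, 85.0, 74.2, 79.8, 70.6])
female_weights = np.array([58.2, 63.7, 55.9, 61.4, 66.0, 59.3, 62.8, 57.5])

# Welch's t-test (no equal-variance assumption); the p-value returned is two-sided
t_statistic, p_two_sided = ttest_ind(male_weights, female_weights, equal_var=False)

# Convert to a one-sided p-value for H1: mu_male > mu_female
if t_statistic > 0:
    p_one_sided = p_two_sided / 2
else:
    p_one_sided = 1 - p_two_sided / 2

alpha = 0.05
print(f"T-Statistic: {t_statistic:.4f}")
print(f"One-sided P-Value: {p_one_sided:.4f}")
if p_one_sided < alpha:
    print("Reject H0: evidence that male students weigh more on average.")
else:
    print("Fail to reject H0.")
```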


<span style=color:green>Q5. Write a Python script to conduct a hypothesis test on the difference between two population means,
given a sample from each population.</span>

Ans-

In [5]:
import numpy as np
from scipy.stats import ttest_ind

def conduct_two_sample_t_test(sample1, sample2, alpha=0.05):
    """
    Conducts a two-sample t-test on the difference between two population means.

    Parameters:
    - sample1 (array-like): Sample data for the first population.
    - sample2 (array-like): Sample data for the second population.
    - alpha (float): Significance level (default is 0.05).

    Returns:
    - dict: Test result summary.
    """
    # Perform the t-test for independent samples
    t_statistic, p_value = ttest_ind(sample1, sample2)

    # Compare p-value to the significance level
    if p_value < alpha:
        result = "Reject the null hypothesis"
    else:
        result = "Fail to reject the null hypothesis"

    # Return test results
    return {
        "t_statistic": t_statistic,
        "p_value": p_value,
        "result": result
    }

# Example usage:
# Assuming you have two samples, sample1 and sample2
# Replace these with your actual data
sample1 = np.array([23, 78, 55, 94, 80])
sample2 = np.array([32, 78, 40, 55, 76])

# Set the significance level (alpha)
alpha = 0.05

# Conduct the two-sample t-test
test_result = conduct_two_sample_t_test(sample1, sample2, alpha)

# Display the results
print("Two-Sample T-Test Results:")
print(f"T-Statistic: {test_result['t_statistic']}")
print(f"P-Value: {test_result['p_value']}")
print(f"Conclusion: {test_result['result']} at {alpha} significance level")


Two-Sample T-Test Results:
T-Statistic: 0.6318768178707783
P-Value: 0.5450967193380833
Conclusion: Fail to reject the null hypothesis at 0.05 significance level


<span style=color:green>Q6: What is a null and alternative hypothesis? Give some examples.</span>

Ans-

### Null and Alternative Hypotheses

**Null Hypothesis (H0):**
The null hypothesis is a statement that there is no significant difference or effect, or that a parameter is equal to a specific value. It serves as the default assumption or starting point in a hypothesis test. The null hypothesis is typically denoted as H0.

**Alternative Hypothesis (H1 or Ha):**
The alternative hypothesis is a statement that contradicts the null hypothesis. It suggests the presence of a significant difference, effect, or relationship. The alternative hypothesis is what researchers aim to support or demonstrate through their analysis. The alternative hypothesis is denoted as H1 or Ha.

**Examples:**

1. **Example for a Mean Comparison:**
   - Null Hypothesis (H0): The average height of male and female students is equal.
   - Alternative Hypothesis (H1): The average height of male students is different from the average height of female students.

2. **Example for a Proportion Comparison:**
   - Null Hypothesis (H0): The proportion of customers satisfied with the product is 0.8.
   - Alternative Hypothesis (H1): The proportion of customers satisfied with the product is not equal to 0.8.

3. **Example for a Correlation Test:**
   - Null Hypothesis (H0): There is no correlation between hours of study and exam scores.
   - Alternative Hypothesis (H1): There is a nonzero correlation between hours of study and exam scores.

4. **Example for a Difference in Means:**
   - Null Hypothesis (H0): The average response time of two different systems is the same.
   - Alternative Hypothesis (H1): The average response time of one system is greater than the average response time of the other system.

5. **Example for a Difference in Proportions:**
   - Null Hypothesis (H0): The proportion of defects in two manufacturing processes is equal.
   - Alternative Hypothesis (H1): The proportion of defects in one manufacturing process is greater than the proportion in the other.

In each example, the null hypothesis represents a statement of equality or no effect, while the alternative hypothesis suggests a difference, effect, or relationship that the researcher is interested in testing. The goal is to collect and analyze data to determine whether there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis.


<span style=color:green>Q7: Write down the steps involved in hypothesis testing</span>

Ans-

### Steps Involved in Hypothesis Testing

1. **Formulate the Hypotheses:**
   - Null Hypothesis (H0): The default assumption that there is no significant difference or effect.
   - Alternative Hypothesis (H1): The statement that contradicts the null hypothesis, suggesting a significant difference or effect.

2. **Set the Significance Level (α):**
   - Choose a significance level (α), typically 0.05, which represents the probability of rejecting the null hypothesis when it is true.

3. **Collect and Prepare the Data:**
   - Gather data through sampling or experimentation.
   - Ensure that the data is representative of the population of interest.

4. **Select the Appropriate Test:**
   - Choose a statistical test based on the nature of the data and the hypotheses being tested (e.g., t-test, chi-square test, ANOVA).

5. **Determine the Test Statistic:**
   - Calculate the test statistic based on the sample data.
   - The test statistic depends on the chosen statistical test.

6. **Calculate the P-value:**
   - Determine the probability of obtaining a test statistic at least as extreme as the one observed, under the assumption that the null hypothesis is true.
   - The p-value is compared to the significance level.

7. **Make a Decision:**
   - If the p-value is less than or equal to the significance level (α), reject the null hypothesis.
   - If the p-value is greater than the significance level, fail to reject the null hypothesis.

8. **Draw a Conclusion:**
   - Based on the decision in step 7, draw a conclusion about the null hypothesis.
   - If the null hypothesis is rejected, provide evidence in support of the alternative hypothesis.

9. **Interpret the Results:**
   - Consider the practical significance of the findings.
   - Discuss the implications of the results in the context of the research question.

10. **Document and Communicate:**
   - Clearly document the results, including the test statistic, p-value, and conclusion.
   - Communicate the findings in a report or presentation.

These steps provide a systematic framework for conducting hypothesis testing. It's important to note that hypothesis testing is a probabilistic approach, and results are subject to uncertainty. The choice of the significance level and the interpretation of the p-value are critical aspects of the process.
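
The steps above can be sketched in code for a simple one-sample case. The hypothesized mean, the data, and the variable names below are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import ttest_1samp

# Step 1: hypotheses (hypothetical example). H0: mu = 50, H1: mu != 50
hypothesized_mean = 50.0

# Step 2: significance level
alpha = 0.05

# Step 3: collected data (illustrative values only)
sample = np.array([51.2, 49.8, 52.5, 50.9, 48.7, 53.1, 51.8, 50.4, 52.0, 49.5])

# Steps 4-6: a one-sample t-test gives the test statistic and two-sided p-value
t_statistic, p_value = ttest_1samp(sample, popmean=hypothesized_mean)

# Step 7: decision rule
decision = "Reject H0" if p_value <= alpha else "Fail to reject H0"

# Steps 8-10: report the result
print(f"T-Statistic: {t_statistic:.4f}")
print(f"P-Value: {p_value:.4f}")
print(f"Decision at alpha = {alpha}: {decision}")
```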


<span style=color:green>Q8. Define p-value and explain its significance in hypothesis testing</span>

Ans-

### P-value in Hypothesis Testing

**Definition:**
The p-value, or probability value, is a measure that helps assess the evidence against a null hypothesis in hypothesis testing. It quantifies the probability of obtaining a test statistic as extreme as, or more extreme than, the one observed in the sample, assuming that the null hypothesis is true.

**Significance Level (α):**
The p-value is compared to a pre-determined significance level (α), commonly set at 0.05. If the p-value is less than or equal to α, it is considered statistically significant, leading to the rejection of the null hypothesis. Conversely, if the p-value is greater than α, there is insufficient evidence to reject the null hypothesis.

**Interpretation:**
- Small p-value (typically ≤ α): Indicates strong evidence against the null hypothesis. It suggests that the observed data is unlikely under the assumption that the null hypothesis is true.
  
- Large p-value (typically > α): Suggests weak evidence against the null hypothesis. It indicates that the observed data is not surprising or unusual under the assumption that the null hypothesis is true.

**Significance in Hypothesis Testing:**

1. **Decision Rule:**
   - If p-value ≤ α: Reject the null hypothesis.
   - If p-value > α: Fail to reject the null hypothesis.

2. **Strength of Evidence:**
   - A smaller p-value provides stronger evidence against the null hypothesis. It means that results at least as extreme as those observed would be unlikely if the null hypothesis were true. It is not the probability that the null hypothesis itself is true.

3. **False Positive Rate (Type I Error):**
   - The significance level (α) represents the probability of making a Type I error (incorrectly rejecting a true null hypothesis). Choosing a lower α reduces the chance of a Type I error but may increase the chance of a Type II error (incorrectly failing to reject a false null hypothesis).

4. **Practical Significance:**
   - While statistical significance is crucial, it's also important to consider the practical significance of the findings. A small p-value may not necessarily imply practical importance.

5. **Decision-Making:**
   - The p-value provides a basis for decision-making in hypothesis testing. Researchers use it to determine whether the observed results are statistically significant and whether they have enough evidence to reject the null hypothesis.

In summary, the p-value is a key component in hypothesis testing, serving as a bridge between the observed data and the decision about the null hypothesis. It helps researchers make informed conclusions about the population parameters based on the sample data.
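
To make the definition concrete, the sketch below converts a t-statistic into p-values using the t-distribution's CDF and survival function. The statistic (-2.0) and degrees of freedom (24) are taken from the Q15 calculation in this assignment; everything else is illustrative.

```python
from scipy.stats import t

t_statistic = -2.0
degrees_of_freedom = 24

# Left-tailed p-value: probability of a statistic this small or smaller under H0
p_left = t.cdf(t_statistic, degrees_of_freedom)

# Right-tailed p-value: probability of a statistic this large or larger under H0
p_right = t.sf(t_statistic, degrees_of_freedom)

# Two-sided p-value: probability of a statistic at least this far from zero
p_two_sided = 2 * t.sf(abs(t_statistic), degrees_of_freedom)

print(f"Left-tailed p-value:  {p_left:.4f}")
print(f"Right-tailed p-value: {p_right:.4f}")
print(f"Two-sided p-value:    {p_two_sided:.4f}")
```

Which tail to use depends on the alternative hypothesis, so the same statistic can lead to different p-values.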


<span style=color:green>Q9. Generate a Student's t-distribution plot using Python's matplotlib library, with the degrees of freedom
parameter set to 10.</span>

Ans-

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import t

# Set the degrees of freedom
degrees_of_freedom = 10

# Generate x values
x = np.linspace(-3, 3, 1000)

# Calculate the probability density function (PDF) for the t-distribution
pdf_values = t.pdf(x, df=degrees_of_freedom)

# Plot the t-distribution
plt.plot(x, pdf_values, label=f't-distribution (df={degrees_of_freedom})')
plt.title("Student's t-distribution")
plt.xlabel('x')
plt.ylabel('Probability Density Function (PDF)')
plt.legend()
plt.grid(True)
plt.show()


<span style=color:green>Q10. Write a Python program to calculate the two-sample t-test for independent samples, given two
random samples of equal size and a null hypothesis that the population means are equal.</span>

Ans-

In [2]:
import numpy as np
from scipy.stats import ttest_ind

def two_sample_t_test(sample1, sample2):
    """
    Perform a two-sample t-test for independent samples.

    Parameters:
    - sample1 (array-like): First sample data.
    - sample2 (array-like): Second sample data.

    Returns:
    - float: t-statistic
    - float: p-value
    """
    # Perform the two-sample t-test
    t_statistic, p_value = ttest_ind(sample1, sample2)

    return t_statistic, p_value

# Example usage:
# Replace these with your actual sample data
sample_size = 100
sample1 = np.random.normal(loc=5, scale=2, size=sample_size)
sample2 = np.random.normal(loc=6, scale=2, size=sample_size)

# Perform the two-sample t-test
t_statistic, p_value = two_sample_t_test(sample1, sample2)

# Display the results
print("Two-Sample T-Test Results:")
print(f"T-Statistic: {t_statistic}")
print(f"P-Value: {p_value}")

# Check the significance level (e.g., α = 0.05)
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis (population means are not equal).")
else:
    print("Fail to reject the null hypothesis (no significant evidence that population means are different).")


Two-Sample T-Test Results:
T-Statistic: -2.262376877963864
P-Value: 0.024759988927294578
Reject the null hypothesis (population means are not equal).


<span style=color:green>Q11: What is Student’s t distribution? When to use the t-Distribution.</span>

Ans-

### Student’s t-Distribution

**Definition:**
Student’s t-distribution (or simply t-distribution) is a probability distribution that arises in the context of estimating the mean of a normally distributed population when the sample size is small, and the population standard deviation is unknown. It is named after William Sealy Gosset, who published under the pseudonym "Student." The t-distribution is similar in shape to the normal distribution but has heavier tails.

**Characteristics:**
- Bell-shaped and symmetric like the normal distribution.
- Controlled by a parameter called degrees of freedom (df).
- As the degrees of freedom increase, the t-distribution approaches the standard normal distribution.

**Probability Density Function (PDF):**
The probability density function for the t-distribution with df degrees of freedom is given by:
\[ f(t; df) = \frac{\Gamma\left(\frac{df+1}{2}\right)}{\sqrt{\pi df}\Gamma\left(\frac{df}{2}\right)} \left(1 + \frac{t^2}{df}\right)^{-\frac{df+1}{2}} \]
where \(\Gamma\) is the gamma function.

### When to Use the t-Distribution

1. **Small Sample Size:**
   - The t-distribution is particularly useful when dealing with small sample sizes (typically when the sample size is less than 30) and the population standard deviation is unknown.

2. **Population Standard Deviation Unknown:**
   - When the population standard deviation is unknown and must be estimated from the sample data, the t-distribution is used for more accurate inference.

3. **Estimating Confidence Intervals:**
   - In situations where you need to construct a confidence interval for the population mean based on a sample mean, especially with a small sample size and unknown population standard deviation.

4. **Hypothesis Testing:**
   - For hypothesis testing involving the mean of a sample when the population standard deviation is unknown and the sample size is small.

### Comparison with Normal Distribution:

- The t-distribution becomes very close to the standard normal distribution as the degrees of freedom increase.
- For larger sample sizes (typically above 30), the normal distribution is often used instead of the t-distribution.

In summary, the t-distribution is a valuable tool in statistics, especially when dealing with small sample sizes and unknown population standard deviations. It provides a more realistic model for the variability of sample means in such scenarios.
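
The convergence toward the normal distribution can be checked numerically. This sketch compares two-sided 95% critical values of the t-distribution with the normal critical value; the list of degrees of freedom is an arbitrary choice for illustration.

```python
from scipy.stats import norm, t

# Critical value of the standard normal for a two-sided 95% interval
z_critical = norm.ppf(0.975)

# Corresponding t critical values for increasing degrees of freedom
dfs = [5, 10, 30, 100, 1000]
t_criticals = [t.ppf(0.975, df) for df in dfs]

print(f"Normal: {z_critical:.4f}")
for df, value in zip(dfs, t_criticals):
    print(f"t (df={df}): {value:.4f}")
```

The t critical values are always larger than the normal one, reflecting the heavier tails, and the gap shrinks as the degrees of freedom grow.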


<span style=color:green>Q12: What is t-statistic? State the formula for t-statistic.</span>

Ans-

### T-Statistic

The t-statistic is a measure used in hypothesis testing to assess whether the means of two groups are significantly different from each other or whether the mean of a single group is significantly different from a known value (e.g., a population mean). It is based on the Student's t-distribution and is commonly used when dealing with small sample sizes and situations where the population standard deviation is unknown.

### Formula for T-Statistic (Two-Sample T-Test):

For a two-sample t-test comparing the means of two independent samples (assuming equal variances), the formula for the t-statistic is given by:

\[ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \]

Where:
- \(\bar{X}_1\) and \(\bar{X}_2\) are the sample means of the two groups.
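- \(s^2\) is the pooled sample variance, \(s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\), where \(s_1^2\) and \(s_2^2\) are the two sample variances.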
- \(n_1\) and \(n_2\) are the sample sizes of the two groups.

### Formula for T-Statistic (One-Sample T-Test):

For a one-sample t-test comparing the mean of a single sample to a known value (e.g., a population mean), the formula for the t-statistic is given by:

\[ t = \frac{\bar{X} - \mu}{\frac{s}{\sqrt{n}}} \]

Where:
- \(\bar{X}\) is the sample mean.
- \(\mu\) is the population mean under the null hypothesis.
- \(s\) is the sample standard deviation.
- \(n\) is the sample size.

### Interpretation:

- If the absolute value of the t-statistic is large, it suggests that the difference between the sample mean and the hypothesized population mean is significant.
- The t-statistic is compared to a critical value or used to calculate a p-value, which helps determine whether to reject the null hypothesis in favor of the alternative hypothesis.

In summary, the t-statistic quantifies how far a sample mean is from the population mean (or from another sample mean) in terms of standard errors. It plays a crucial role in hypothesis testing, providing a standardized measure of the difference between sample and population means.
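
As a check on the two-sample formula, the sketch below computes the pooled-variance t-statistic by hand and compares it with `scipy.stats.ttest_ind`, which assumes equal variances by default. The samples are the ones used in the Q5 example; the intermediate variable names are illustrative.

```python
import numpy as np
from scipy.stats import ttest_ind

sample1 = np.array([23, 78, 55, 94, 80])
sample2 = np.array([32, 78, 40, 55, 76])
n1, n2 = sample1.size, sample2.size

# Sample variances (ddof=1 gives the unbiased estimator)
var1 = sample1.var(ddof=1)
var2 = sample2.var(ddof=1)

# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# Two-sample t-statistic from the formula
t_manual = (sample1.mean() - sample2.mean()) / np.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Library result for comparison
t_scipy, _ = ttest_ind(sample1, sample2)

print(f"Manual t-statistic: {t_manual:.6f}")
print(f"SciPy t-statistic:  {t_scipy:.6f}")
```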


<span style=color:green>Q13. A coffee shop owner wants to estimate the average daily revenue for their shop. They take a random
sample of 50 days and find the sample mean revenue to be $500 with a standard deviation of $50.
Estimate the population mean revenue with a 95% confidence interval.</span>

Ans-

In [3]:
import numpy as np

# Given data
sample_mean = 500  # in dollars
sample_std_dev = 50  # in dollars
sample_size = 50

# Z-score for a 95% confidence interval
z_score = 1.96

# Calculate the margin of error
margin_of_error = z_score * (sample_std_dev / np.sqrt(sample_size))

# Calculate the confidence interval
confidence_interval_lower = sample_mean - margin_of_error
confidence_interval_upper = sample_mean + margin_of_error

# Display the results
print(f"95% Confidence Interval for Population Mean Revenue: ${confidence_interval_lower:.2f} to ${confidence_interval_upper:.2f}")


95% Confidence Interval for Population Mean Revenue: $486.14 to $513.86
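
Because the $50 standard deviation comes from the sample, a t-based interval is a reasonable alternative to the z-based one above. The following sketch computes both margins for comparison. It is an optional cross-check under that assumption, not a replacement for the answer given.

```python
import numpy as np
from scipy.stats import norm, t

sample_mean = 500      # in dollars
sample_std_dev = 50    # in dollars
sample_size = 50

std_error = sample_std_dev / np.sqrt(sample_size)

# Margin of error using the normal critical value
z_margin = norm.ppf(0.975) * std_error

# Margin of error using the t critical value with n - 1 degrees of freedom
t_margin = t.ppf(0.975, df=sample_size - 1) * std_error

print(f"z-based interval: ${sample_mean - z_margin:.2f} to ${sample_mean + z_margin:.2f}")
print(f"t-based interval: ${sample_mean - t_margin:.2f} to ${sample_mean + t_margin:.2f}")
```

With 50 observations the two intervals differ only slightly, and the t-based one is a little wider.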


<span style=color:green>Q14. A researcher hypothesizes that a new drug will decrease blood pressure by 10 mmHg. They conduct a
clinical trial with 100 patients and find that the sample mean decrease in blood pressure is 8 mmHg with a
standard deviation of 3 mmHg. Test the hypothesis with a significance level of 0.05.</span>

Ans-

In [4]:
import numpy as np
from scipy.stats import t

# Given data
sample_mean = 8  # in mmHg
hypothesized_mean = 10  # hypothesized population mean under the null hypothesis
sample_std_dev = 3  # in mmHg
sample_size = 100

# Calculate the t-statistic
t_statistic = (sample_mean - hypothesized_mean) / (sample_std_dev / np.sqrt(sample_size))

# Degrees of freedom for a one-sample t-test
degrees_of_freedom = sample_size - 1

# Calculate the critical value for a two-tailed test at a 0.05 significance level
alpha = 0.05
critical_value = t.ppf(1 - alpha / 2, degrees_of_freedom)

# Perform the hypothesis test
p_value = 2 * (1 - t.cdf(np.abs(t_statistic), degrees_of_freedom))

# Display the results
print(f"T-Statistic: {t_statistic:.4f}")
print(f"Critical Value: ±{critical_value:.4f}")
print(f"P-Value: {p_value:.4f}")

# Check the hypothesis based on the p-value
# H0: the mean decrease is 10 mmHg; H1: the mean decrease differs from 10 mmHg
if p_value < alpha:
    print("Reject the null hypothesis. There is significant evidence that the mean decrease in blood pressure differs from 10 mmHg.")
else:
    print("Fail to reject the null hypothesis. There is insufficient evidence that the mean decrease in blood pressure differs from 10 mmHg.")


T-Statistic: -6.6667
Critical Value: ±1.9842
P-Value: 0.0000
Reject the null hypothesis. There is significant evidence that the mean decrease in blood pressure differs from 10 mmHg.


<span style=color:green>Q15. An electronics company produces a certain type of product with a mean weight of 5 pounds and a
standard deviation of 0.5 pounds. A random sample of 25 products is taken, and the sample mean weight
is found to be 4.8 pounds. Test the hypothesis that the true mean weight of the products is less than 5
pounds with a significance level of 0.01.</span>

Ans-

In [5]:
import numpy as np
from scipy.stats import t

# Given data
hypothesized_mean = 5  # hypothesized population mean under the null hypothesis
sample_mean = 4.8  # in pounds
sample_std_dev = 0.5  # in pounds (stated in the problem as the product's standard deviation)
sample_size = 25

# Calculate the t-statistic
t_statistic = (sample_mean - hypothesized_mean) / (sample_std_dev / np.sqrt(sample_size))

# Degrees of freedom for a one-sample t-test
degrees_of_freedom = sample_size - 1

# Calculate the critical value for a one-tailed test at a 0.01 significance level
alpha = 0.01
critical_value = t.ppf(alpha, degrees_of_freedom)

# Perform the hypothesis test
p_value = t.cdf(t_statistic, degrees_of_freedom)

# Display the results
print(f"T-Statistic: {t_statistic:.4f}")
print(f"Critical Value: {critical_value:.4f}")
print(f"P-Value: {p_value:.4f}")

# Check the hypothesis based on the p-value
if p_value < alpha:
    print("Reject the null hypothesis. There is significant evidence that the mean weight is less than 5 pounds.")
else:
    print("Fail to reject the null hypothesis. There is insufficient evidence that the mean weight is less than 5 pounds.")


T-Statistic: -2.0000
Critical Value: -2.4922
P-Value: 0.0285
Fail to reject the null hypothesis. There is insufficient evidence that the mean weight is less than 5 pounds.
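
Since the problem states the product's standard deviation (0.5 pounds) rather than estimating it from the sample, a z-test is also a natural choice here. The sketch below repeats the test with the normal distribution. Treating 0.5 as the known population standard deviation is an assumption about how the question is meant to be read.

```python
import numpy as np
from scipy.stats import norm

hypothesized_mean = 5    # pounds, under the null hypothesis
sample_mean = 4.8        # pounds
population_std_dev = 0.5 # pounds, assumed known
sample_size = 25
alpha = 0.01

# z-statistic: distance of the sample mean from the hypothesized mean in standard errors
z_statistic = (sample_mean - hypothesized_mean) / (population_std_dev / np.sqrt(sample_size))

# Left-tailed critical value and p-value for H1: mu < 5
critical_value = norm.ppf(alpha)
p_value = norm.cdf(z_statistic)

print(f"Z-Statistic: {z_statistic:.4f}")
print(f"Critical Value: {critical_value:.4f}")
print(f"P-Value: {p_value:.4f}")
if p_value < alpha:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
```

The conclusion matches the t-test above: the p-value stays above 0.01, so the null hypothesis is not rejected.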


<span style=color:green>Q16. Two groups of students are given different study materials to prepare for a test. The first group (n1 =
30) has a mean score of 80 with a standard deviation of 10, and the second group (n2 = 40) has a mean
score of 75 with a standard deviation of 8. Test the hypothesis that the population means for the two
groups are equal with a significance level of 0.01.</span>

Ans-

In [8]:
import numpy as np
from scipy.stats import t

# Given data for Group 1
mean1 = 80
std_dev1 = 10
sample_size1 = 30

# Given data for Group 2
mean2 = 75
std_dev2 = 8
sample_size2 = 40

# Calculate the two-sample t-test statistic
t_statistic = (mean1 - mean2) / np.sqrt((std_dev1**2 / sample_size1) + (std_dev2**2 / sample_size2))

# Calculate degrees of freedom
df = ((std_dev1**2 / sample_size1) + (std_dev2**2 / sample_size2))**2 / \
     (((std_dev1**2 / sample_size1)**2 / (sample_size1 - 1)) + ((std_dev2**2 / sample_size2)**2 / (sample_size2 - 1)))

# Calculate the critical value for a two-tailed test at a 0.01 significance level
alpha = 0.01
critical_value = t.ppf(1 - alpha / 2, df)

# Perform the hypothesis test
p_value = 2 * (1 - t.cdf(np.abs(t_statistic), df))

# Display the results
print(f"T-Statistic: {t_statistic:.4f}")
print(f"Degrees of Freedom: {df:.0f}")
print(f"Critical Value: ±{critical_value:.4f}")
print(f"P-Value: {p_value:.4f}")

# Check the hypothesis based on the p-value
if p_value < alpha:
    print("Reject the null hypothesis. There is significant evidence that the population means are not equal.")
else:
    print("Fail to reject the null hypothesis. There is insufficient evidence that the population means are not equal.")


T-Statistic: 2.2511
Degrees of Freedom: 54
Critical Value: ±2.6696
P-Value: 0.0285
Fail to reject the null hypothesis. There is insufficient evidence that the population means are not equal.
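
The calculation above is Welch's t-test, which does not assume equal variances. As a cross-check, the same summary statistics can be passed to `scipy.stats.ttest_ind_from_stats`; the pooled-variance version is included only for comparison and is not part of the original answer.

```python
from scipy.stats import ttest_ind_from_stats

alpha = 0.01

# Welch's t-test from summary statistics (unequal variances)
welch = ttest_ind_from_stats(mean1=80, std1=10, nobs1=30,
                             mean2=75, std2=8, nobs2=40,
                             equal_var=False)

# Pooled-variance t-test from the same summary statistics
pooled = ttest_ind_from_stats(mean1=80, std1=10, nobs1=30,
                              mean2=75, std2=8, nobs2=40,
                              equal_var=True)

print(f"Welch:  t = {welch.statistic:.4f}, p = {welch.pvalue:.4f}")
print(f"Pooled: t = {pooled.statistic:.4f}, p = {pooled.pvalue:.4f}")
print("Reject H0 (Welch):", welch.pvalue < alpha)
print("Reject H0 (pooled):", pooled.pvalue < alpha)
```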
