**Q1: What is Estimation Statistics? Explain point estimate and interval estimate.**

Estimation statistics involves the use of sample data to make inferences or predictions about population parameters. There are two main types of estimates: point estimates and interval estimates.

1. **Point Estimate:**
   - A point estimate is a single value that is used to estimate the population parameter.
   - For example, if you calculate the mean (average) of a sample and use it to estimate the population mean, the sample mean is a point estimate.

2. **Interval Estimate:**
   - An interval estimate provides a range within which the true population parameter is likely to fall.
   - It recognizes the inherent variability in the estimation process.
   - The most common form of interval estimate is the confidence interval.
   - Confidence intervals are constructed around a point estimate and provide a range of values that are likely to contain the true parameter with a certain level of confidence.

   For instance, a 95% confidence interval for the population mean is built by a procedure that captures the true mean in about 95% of repeated samples. Informally, you are 95% confident that the true population mean falls within the calculated interval.

In summary, estimation in statistics involves providing both point estimates and interval estimates. Point estimates offer a single value as an estimate, while interval estimates provide a range of values, typically with a certain level of confidence, within which the true parameter is likely to be found.
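
The meaning of "95% confidence" can be illustrated with a small simulation. The sketch below assumes a hypothetical normal population (mean 100, standard deviation 15; both values are made up for illustration), draws many samples, and counts how often the t-based interval around each sample mean contains the true mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)     # fixed seed for reproducibility
true_mean, true_sd = 100.0, 15.0   # hypothetical population parameters
n, trials = 30, 2000

t_crit = stats.t.ppf(0.975, df=n - 1)  # two-sided 95% critical value

covered = 0
for _ in range(trials):
    sample = rng.normal(true_mean, true_sd, size=n)
    point_estimate = sample.mean()           # point estimate of the mean
    sem = sample.std(ddof=1) / np.sqrt(n)    # standard error of the mean
    lower = point_estimate - t_crit * sem    # interval estimate: lower bound
    upper = point_estimate + t_crit * sem    # interval estimate: upper bound
    covered += int(lower <= true_mean <= upper)

coverage = covered / trials
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, which is what the confidence level describes: a property of the interval-building procedure across repeated samples.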

**Q2. Write a Python function to estimate the population mean using a sample mean and standard
deviation.**

Certainly! You can use the following Python function to estimate the population mean based on a sample mean, sample size, and sample standard deviation:

```python
import math

def estimate_population_mean(sample_mean, sample_std_dev, sample_size):
    """
    Estimate the population mean using a sample mean, sample standard deviation, and sample size.

    Parameters:
    - sample_mean: The mean of the sample.
    - sample_std_dev: The standard deviation of the sample.
    - sample_size: The size of the sample.

    Returns:
    - population_mean_estimate: The estimated population mean.
    """
    # Standard error of the mean (SEM): measures the precision of the estimate.
    # It does not change the point estimate itself.
    standard_error = sample_std_dev / math.sqrt(sample_size)

    # The point estimate of the population mean is simply the sample mean
    population_mean_estimate = sample_mean

    return population_mean_estimate

# Example usage:
sample_mean = 50.0
sample_std_dev = 10.0
sample_size = 30

estimated_mean = estimate_population_mean(sample_mean, sample_std_dev, sample_size)
print(f"Estimated Population Mean: {estimated_mean}")
```

This function returns the sample mean as the point estimate of the population mean. The standard error of the mean (SEM), `sample_std_dev / sqrt(sample_size)`, is computed inside the function but does not change the point estimate; it measures how precise that estimate is. To provide an interval estimate, you would typically build a confidence interval around the point estimate using the SEM.
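
A minimal sketch of such an interval estimate, assuming a t-based interval is appropriate (roughly normal data, standard deviation estimated from the sample). The helper name `mean_confidence_interval` and its `confidence` parameter are illustrative, not part of the original function.

```python
import math
from scipy import stats

def mean_confidence_interval(sample_mean, sample_std_dev, sample_size, confidence=0.95):
    """Return (lower, upper) t-based confidence bounds for the population mean."""
    standard_error = sample_std_dev / math.sqrt(sample_size)
    # Two-sided critical value from the t-distribution with n - 1 degrees of freedom
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=sample_size - 1)
    margin = t_crit * standard_error
    return sample_mean - margin, sample_mean + margin

# Same example values as above
lower, upper = mean_confidence_interval(50.0, 10.0, 30)
print(f"95% CI for the population mean: ({lower:.2f}, {upper:.2f})")
```

The interval is centered on the point estimate, and its width shrinks as the sample size grows or the sample standard deviation falls.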

**Q3: What is Hypothesis testing? Why is it used? State the importance of Hypothesis testing.**

**Hypothesis testing** is a statistical method used to make inferences about population parameters based on a sample of data. It involves formulating a hypothesis about the population parameter, collecting and analyzing sample data, and then making a decision about whether to reject or fail to reject the null hypothesis.

Here's a breakdown of the key components of hypothesis testing:

1. **Null Hypothesis (H0):**
   - This is a statement of no effect or no difference. It represents the status quo or a default assumption.
   - Typically denoted as H0.

2. **Alternative Hypothesis (H1 or Ha):**
   - This is a statement that contradicts the null hypothesis and suggests the presence of an effect or difference.
   - The hypothesis the researcher aims to support.
   - Denoted as H1 or Ha.

3. **Significance Level (α):**
   - The significance level represents the probability of rejecting the null hypothesis when it is actually true.
   - Commonly set at 0.05, meaning a 5% chance of making a Type I error (incorrectly rejecting a true null hypothesis).

4. **Test Statistic:**
   - A value calculated from the sample data that is used to assess the evidence against the null hypothesis.

5. **P-value:**
   - The probability of obtaining a test statistic as extreme as, or more extreme than, the one observed, assuming the null hypothesis is true.
   - If the p-value is smaller than the significance level, the null hypothesis is rejected.

6. **Decision:**
   - Based on the p-value and the significance level, the researcher decides whether to reject the null hypothesis or fail to reject it.

**Importance of Hypothesis Testing:**

1. **Scientific Rigor:** Hypothesis testing provides a structured and rigorous method for evaluating hypotheses and making statistical inferences.

2. **Decision-Making:** It helps in making decisions based on evidence from data, especially in scientific research, business, and various fields where data-driven decisions are crucial.

3. **Statistical Inference:** Hypothesis testing allows researchers to draw conclusions about population parameters based on sample data, providing a way to generalize findings.

4. **Comparisons:** It enables comparisons between groups or conditions, helping to assess whether observed differences are statistically significant or if they could have occurred by chance.

5. **Quality Control:** In manufacturing and quality control processes, hypothesis testing is used to ensure products meet certain specifications and standards.

6. **Research Validation:** It is a key tool in validating or refuting research hypotheses, contributing to the advancement of scientific knowledge.

In summary, hypothesis testing is a fundamental statistical tool that allows researchers and decision-makers to draw conclusions about population parameters, make informed decisions, and assess the significance of observed effects in various fields.

**Q4. Create a hypothesis that states whether the average weight of male college students is greater than
the average weight of female college students.**

Certainly! Let's formulate a hypothesis to test whether the average weight of male college students is greater than the average weight of female college students. 

**Null Hypothesis (H0):**
The null hypothesis typically assumes no effect, no difference, or equality. In this case:<br>
$ H_0: \mu_{\text{male}} \leq \mu_{\text{female}} $

This null hypothesis states that the average weight of male college students, $\mu_{\text{male}}$, is less than or equal to the average weight of female college students, $\mu_{\text{female}}$.

**Alternative Hypothesis (H1 or Ha):**
The alternative hypothesis contradicts the null hypothesis and suggests the presence of an effect or difference:<br>
$ H_1: \mu_{\text{male}} > \mu_{\text{female}} $

This alternative hypothesis states that the average weight of male college students is greater than the average weight of female college students.

In summary, the hypotheses are as follows:

- Null Hypothesis (H0): $\mu_{\text{male}} \leq \mu_{\text{female}}$
- Alternative Hypothesis (H1): $\mu_{\text{male}} > \mu_{\text{female}}$

This hypothesis could be tested through data collection and statistical analysis to determine whether there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis.
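
As a rough illustration of how this one-sided hypothesis might be tested, the sketch below runs a one-tailed Welch t-test on made-up weights (in kg). The data are hypothetical, and the `alternative` argument of `scipy.stats.ttest_ind` assumes SciPy 1.6 or newer.

```python
from scipy import stats

# Hypothetical sample weights in kg (invented for illustration only)
male_weights = [72, 80, 76, 85, 78, 90, 74, 82]
female_weights = [58, 62, 65, 60, 70, 55, 63, 61]

# H0: mu_male <= mu_female, H1: mu_male > mu_female
# alternative="greater" makes the test one-tailed in the direction of H1;
# equal_var=False avoids assuming equal population variances (Welch's test).
t_stat, p_value = stats.ttest_ind(
    male_weights, female_weights, equal_var=False, alternative="greater"
)

alpha = 0.05
print(f"t = {t_stat:.3f}, one-tailed p = {p_value:.5f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```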

**Q5. Write a Python script to conduct a hypothesis test on the difference between two population means,
given a sample from each population.**

Certainly! To conduct a hypothesis test on the difference between two population means, you can use the t-test for independent samples. Here's a Python script using the `scipy.stats` module, assuming you have two sets of sample data (one for each population):

```python
import numpy as np
from scipy import stats

def two_sample_t_test(sample1, sample2, alpha=0.05):
    """
    Conducts a two-sample t-test for the difference between two population means.

    Parameters:
    - sample1: The sample data for the first population.
    - sample2: The sample data for the second population.
    - alpha: The significance level (default is 0.05).

    Returns:
    - result: A tuple containing the test statistic and p-value.
    """
    # Calculate means and standard deviations
    mean1, mean2 = np.mean(sample1), np.mean(sample2)
    std1, std2 = np.std(sample1, ddof=1), np.std(sample2, ddof=1)
    n1, n2 = len(sample1), len(sample2)

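    # Calculate the standard error of the difference between means (unpooled)
    var_term1, var_term2 = std1**2 / n1, std2**2 / n2
    std_error = np.sqrt(var_term1 + var_term2)

    # Calculate the t-statistic
    t_statistic = (mean1 - mean2) / std_error

    # Calculate degrees of freedom (Welch-Satterthwaite, consistent with the unpooled standard error)
    degrees_of_freedom = (var_term1 + var_term2)**2 / (var_term1**2 / (n1 - 1) + var_term2**2 / (n2 - 1))

    # Calculate the two-tailed p-value
    p_value = 2 * stats.t.sf(np.abs(t_statistic), df=degrees_of_freedom)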

    # Check for statistical significance
    if p_value < alpha:
        print("Reject the null hypothesis. There is a significant difference between population means.")
    else:
        print("Fail to reject the null hypothesis. There is not enough evidence to support a significant difference.")

    return t_statistic, p_value

# Example usage:
population1 = [68, 72, 75, 71, 73, 70, 74, 69]
population2 = [62, 65, 63, 68, 66, 67, 64, 68]

t_stat, p_val = two_sample_t_test(population1, population2)
print(f"T-Statistic: {t_stat}\nP-Value: {p_val}")
```

Replace `population1` and `population2` with your actual sample data. This script will perform a two-sample t-test and provide the test statistic and p-value. The script also prints whether the null hypothesis should be rejected based on the specified significance level (`alpha`). Adjust `alpha` according to your desired level of significance.
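
As a cross-check, SciPy can run this kind of test directly. The sketch below reuses the example data and passes `equal_var=False`, which selects Welch's t-test (unpooled standard error, no equal-variance assumption). The variable names `sample_a` and `sample_b` are illustrative.

```python
from scipy import stats

sample_a = [68, 72, 75, 71, 73, 70, 74, 69]
sample_b = [62, 65, 63, 68, 66, 67, 64, 68]

# Welch's two-sample t-test (two-tailed by default)
welch_result = stats.ttest_ind(sample_a, sample_b, equal_var=False)

print(f"t = {welch_result.statistic:.3f}")
print(f"p = {welch_result.pvalue:.5f}")
```

The t-statistic here is the difference in sample means divided by the unpooled standard error, so it should agree with a hand computation that uses the same standard error.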

**Q6: What is a null and alternative hypothesis? Give some examples.**

**Null Hypothesis (H0):**
The null hypothesis is a statement that there is no significant difference, effect, or relationship. It represents the default assumption or the status quo. In hypothesis testing, the null hypothesis is what researchers aim to test against. It is often denoted as $H_0$.

**Alternative Hypothesis (H1 or Ha):**
The alternative hypothesis is a statement that contradicts the null hypothesis and suggests the presence of a significant difference, effect, or relationship. It is what researchers are trying to support or demonstrate. The alternative hypothesis is denoted as $H_1$ or $H_a$.

Here are some examples to illustrate the concept:

**Example 1 (Two Population Means):**
- Null Hypothesis $H_0$: The average height of men is equal to the average height of women.
- Alternative Hypothesis $H_1$: The average height of men is not equal to the average height of women.

**Example 2 (Correlation):**
- Null Hypothesis $H_0$: There is no correlation between the amount of time spent studying and exam scores.
- Alternative Hypothesis $H_1$: There is a significant correlation between the amount of time spent studying and exam scores.

**Example 3 (Difference in Proportions):**
- Null Hypothesis $H_0$: The proportion of customers who prefer product A is the same as the proportion who prefer product B.
- Alternative Hypothesis $H_1$: The proportion of customers who prefer product A is different from the proportion who prefer product B.

**Example 4 (One-Sample Mean):**
- Null Hypothesis $H_0$: The average weight of a certain product meets the specified standard.
- Alternative Hypothesis $H_1$: The average weight of the product is less than the specified standard.

In each example, the null hypothesis assumes no effect, no difference, or equality, while the alternative hypothesis proposes a specific effect, difference, or relationship that the researcher is trying to find evidence for based on collected data. The goal is to conduct statistical tests to determine whether there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis.
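
To make one of these concrete, the sketch below tests Example 2 (correlation) on invented data. The study hours and exam scores are hypothetical; `scipy.stats.pearsonr` returns the sample correlation and a two-sided p-value for the null hypothesis of no correlation.

```python
from scipy import stats

# Hypothetical data: hours studied and exam scores for eight students
hours_studied = [1, 2, 3, 4, 5, 6, 7, 8]
exam_scores = [52, 55, 61, 60, 68, 72, 75, 80]

# H0: no correlation between study time and score
# H1: there is a correlation
r, p_value = stats.pearsonr(hours_studied, exam_scores)

alpha = 0.05
print(f"r = {r:.3f}, p = {p_value:.5f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```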

**Q7: Write down the steps involved in hypothesis testing.**

Hypothesis testing involves a series of steps to make inferences about population parameters based on sample data. Here are the key steps involved in hypothesis testing:

1. **Formulate the Hypotheses:**
   - Null Hypothesis $H_0$: A statement of no effect, no difference, or equality.
   - Alternative Hypothesis $H_1$ or $H_a$: A statement contradicting the null hypothesis, suggesting the presence of an effect, difference, or relationship.

2. **Set the Significance Level (α):**
   - Choose a significance level $\alpha$ representing the probability of rejecting the null hypothesis when it is actually true. Common choices include 0.05, 0.01, or 0.10.

3. **Select the Test Statistic:**
   - Choose a statistical test based on the nature of the data and the hypotheses being tested (e.g., t-test, z-test, chi-square test).

4. **Collect and Analyze the Data:**
   - Collect a sample from the population and analyze the relevant data to obtain summary statistics.

5. **Calculate the Test Statistic:**
   - Use the collected data to calculate the test statistic based on the chosen statistical test.

6. **Determine the P-value:**
   - Calculate the p-value, representing the probability of obtaining the observed data or more extreme results under the assumption that the null hypothesis is true.

7. **Make a Decision:**
   - Compare the p-value to the significance level $\alpha$.
   - If $p \leq \alpha$, reject the null hypothesis in favor of the alternative hypothesis.
   - If $p > \alpha$, fail to reject the null hypothesis.

8. **Draw a Conclusion:**
   - Based on the decision, draw a conclusion about the null hypothesis and the presence or absence of a significant effect or difference.

9. **Interpret the Results:**
   - Provide an interpretation of the results in the context of the specific problem or question being investigated.

10. **Document the Findings:**
    - Clearly document the findings, including the test statistic, p-value, decision, and conclusion.

It's important to note that hypothesis testing is a structured approach to making statistical inferences, and the results are subject to the chosen significance level. The process is iterative, and researchers need to carefully design their studies, choose appropriate tests, and interpret results in a meaningful way.
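
The steps above can be sketched end to end with a one-sample t-test. The scenario (a target fill weight of 500 g) and the ten measurements are made up for illustration.

```python
from scipy import stats

# Steps 1-2: hypotheses and significance level
# H0: mu = 500, H1: mu != 500 (hypothetical target fill weight in grams)
hypothesized_mean = 500
alpha = 0.05

# Steps 3-4: a one-sample t-test suits a single small sample
# whose population standard deviation is unknown
sample = [498, 502, 497, 495, 501, 499, 496, 494, 500, 497]  # invented data

# Steps 5-6: test statistic and two-tailed p-value
t_stat, p_value = stats.ttest_1samp(sample, popmean=hypothesized_mean)

# Steps 7-8: decision and conclusion
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

# Steps 9-10: report the findings
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, decision: {decision}")
```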

**Q8. Define p-value and explain its significance in hypothesis testing.**

The **p-value** (probability value) is a crucial concept in hypothesis testing. It quantifies the evidence against a null hypothesis and helps researchers make decisions about whether to reject the null hypothesis in favor of the alternative hypothesis. The p-value represents the probability of observing a test statistic as extreme as, or more extreme than, the one observed in the sample, assuming that the null hypothesis is true.

Here's a breakdown of the significance of the p-value in hypothesis testing:

1. **Interpretation:**
   - A low p-value (typically below the chosen significance level, $\alpha$) suggests that the observed data is unlikely under the assumption that the null hypothesis is true.
   - A high p-value indicates that the observed data is likely, or not unusual, given the null hypothesis.

2. **Decision Rule:**
   - If the p-value is less than or equal to the significance level ($p \leq \alpha$), researchers reject the null hypothesis.
   - If the p-value is greater than the significance level ($p > \alpha$), researchers fail to reject the null hypothesis.

3. **Significance Level $\alpha$:**
   - The significance level is the threshold set by the researcher to determine whether the evidence against the null hypothesis is strong enough to reject it.
   - Common choices for $\alpha$ include 0.05, 0.01, or 0.10.

4. **Strength of Evidence:**
   - A smaller p-value indicates stronger evidence against the null hypothesis.
   - A larger p-value suggests weaker evidence against the null hypothesis.

5. **Decision-Making:**
   - The p-value guides decision-making in hypothesis testing. If the p-value is low, it provides support for the alternative hypothesis. If the p-value is high, it suggests that the observed data is consistent with the null hypothesis.

6. **Error Rates:**
   - The p-value is related to the error rates in hypothesis testing:
      - Type I Error: Incorrectly rejecting a true null hypothesis. The probability of Type I Error is $\alpha$.
      - Type II Error: Incorrectly failing to reject a false null hypothesis.

In summary, the p-value is a critical tool in hypothesis testing that quantifies the strength of evidence against the null hypothesis. Researchers use the p-value, in conjunction with the chosen significance level, to make informed decisions about whether to reject or fail to reject the null hypothesis based on the observed data.
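
A short sketch of how a p-value follows from a test statistic, assuming the statistic follows a t-distribution under the null hypothesis. The observed value 2.1 and the 20 degrees of freedom are hypothetical.

```python
from scipy import stats

t_observed = 2.1   # hypothetical observed t-statistic
df = 20            # hypothetical degrees of freedom

# Right-tailed: probability of a statistic at least this large under H0
p_right = stats.t.sf(t_observed, df)

# Left-tailed: probability of a statistic at least this small under H0
p_left = stats.t.cdf(t_observed, df)

# Two-tailed: probability of a statistic at least this extreme in either direction
p_two_tailed = 2 * stats.t.sf(abs(t_observed), df)

print(f"right-tailed p = {p_right:.4f}")
print(f"left-tailed  p = {p_left:.4f}")
print(f"two-tailed   p = {p_two_tailed:.4f}")
```

Which of the three applies depends on the alternative hypothesis: $H_1: \mu > \mu_0$ uses the right tail, $H_1: \mu < \mu_0$ the left tail, and $H_1: \mu \neq \mu_0$ both tails.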

**Q9. Generate a Student's t-distribution plot using Python's matplotlib library, with the degrees of freedom
parameter set to 10.**

Certainly! You can use the `scipy.stats` module to generate a Student's t-distribution and the `matplotlib` library to create a plot. Here's an example Python script:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import t

# Set degrees of freedom
degrees_of_freedom = 10

# Generate x values
x_values = np.linspace(-4, 4, 1000)

# Calculate the probability density function (PDF) for the t-distribution
pdf_values = t.pdf(x_values, df=degrees_of_freedom)

# Plot the t-distribution
plt.figure(figsize=(8, 6))
plt.plot(x_values, pdf_values, label=f't-distribution (df={degrees_of_freedom})', color='blue')
plt.title("Student's t-distribution")
plt.xlabel('x')
plt.ylabel('Probability Density Function (PDF)')
plt.legend()
plt.grid(True)
plt.show()
```

This script generates a Student's t-distribution plot with degrees of freedom set to 10. You can adjust the `degrees_of_freedom` parameter to see how the shape of the distribution changes with different degrees of freedom. The `linspace` function is used to create a range of x values, and the `t.pdf` function is used to calculate the probability density function for the t-distribution. Finally, the plot is created using `matplotlib`.

**Q10. Write a Python program to calculate the two-sample t-test for independent samples, given two
random samples of equal size and a null hypothesis that the population means are equal.**

Certainly! You can use the `scipy.stats` module in Python to perform a two-sample t-test for independent samples. Here's an example Python script:

```python
import numpy as np
from scipy import stats

def two_sample_t_test(sample1, sample2, alpha=0.05):
    """
    Conducts a two-sample t-test for independent samples.

    Parameters:
    - sample1: The first sample data.
    - sample2: The second sample data.
    - alpha: The significance level (default is 0.05).

    Returns:
    - result: A tuple containing the test statistic and p-value.
    """
    # Perform a two-sample t-test
    t_statistic, p_value = stats.ttest_ind(sample1, sample2)

    # Check for statistical significance
    if p_value < alpha:
        print("Reject the null hypothesis. There is a significant difference between population means.")
    else:
        print("Fail to reject the null hypothesis. There is not enough evidence to support a significant difference.")

    return t_statistic, p_value

# Example usage:
np.random.seed(42)  # Set a seed for reproducibility

# Generate two random samples of equal size
sample_size = 30
sample1 = np.random.normal(loc=5, scale=2, size=sample_size)
sample2 = np.random.normal(loc=7, scale=2, size=sample_size)

# Perform two-sample t-test
t_stat, p_val = two_sample_t_test(sample1, sample2)
print(f"T-Statistic: {t_stat}\nP-Value: {p_val}")
```

In this example, two random samples (`sample1` and `sample2`) are generated using the NumPy library. The `ttest_ind` function from `scipy.stats` is then used to perform the two-sample t-test. The result includes the test statistic and p-value. The script also prints whether the null hypothesis should be rejected based on the specified significance level (`alpha`). Adjust `alpha` according to your desired level of significance.

**Q11: What is Student’s t distribution? When to use the t-Distribution.**

**Student's t-distribution** (or simply t-distribution) is a probability distribution that arises in statistical inference when estimating the population mean from a sample and when conducting hypothesis tests on the mean. It is named after William Sealy Gosset, who published under the pseudonym "Student" in 1908.

The t-distribution is similar to the normal distribution but has heavier tails, which makes it more suitable for small sample sizes. As the sample size increases, the t-distribution approaches the standard normal distribution. The shape of the t-distribution depends on the degrees of freedom, which is determined by the sample size.

**Key Characteristics of the t-Distribution:**
1. **Symmetry:** Like the normal distribution, the t-distribution is symmetric around its mean.
2. **Bell-shaped:** It has a bell-shaped curve, but with heavier tails compared to the normal distribution.
3. **Location:** The center of the distribution is at zero.
4. **Degrees of Freedom:** The spread and shape of the t-distribution depend on the degrees of freedom $df$. As $df$ increases, the t-distribution approaches the normal distribution.

**When to Use the t-Distribution:**
1. **Small Sample Sizes:** The t-distribution is particularly useful when dealing with small sample sizes (typically $n < 30$). In such cases, it provides better estimates of uncertainty compared to the normal distribution.

2. **Unknown Population Standard Deviation:** When the population standard deviation is unknown and must be estimated from the sample data, the t-distribution is used in place of the normal distribution.

3. **Estimating Population Mean:** When estimating the population mean $\mu$ based on a sample mean $\bar{x}$, especially in situations where the sample size is small.

4. **Hypothesis Testing:** The t-distribution is commonly used in hypothesis testing when comparing sample means or conducting t-tests.

**Mathematically, the Probability Density Function (PDF) of the t-Distribution:**
$f(t) = \frac{\Gamma\left(\frac{df+1}{2}\right)}{\sqrt{\pi df} \Gamma\left(\frac{df}{2}\right)} \left(1 + \frac{t^2}{df}\right)^{-\frac{df+1}{2}} $

Here, $t$ is the random variable, $\Gamma$ is the gamma function, and $df$ is the degrees of freedom.

In summary, the t-distribution is a fundamental probability distribution used in statistical inference, particularly when dealing with small sample sizes or situations where the population standard deviation is unknown. It provides a more accurate representation of the uncertainty associated with estimating population parameters from limited sample data.
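
The heavier tails can be seen numerically. The sketch below compares the probability of landing more than 2 units from zero under t-distributions with several degrees of freedom and under the standard normal distribution. The threshold of 2 and the chosen degrees of freedom are arbitrary illustrative values.

```python
from scipy import stats

threshold = 2.0

# Two-sided tail probability P(|Z| > 2) for the standard normal
normal_tail = 2 * stats.norm.sf(threshold)

# The same tail probability for t-distributions with increasing df
t_tails = {df: 2 * stats.t.sf(threshold, df) for df in (2, 5, 10, 30, 100)}

for df, tail in t_tails.items():
    print(f"df = {df:>3}: P(|T| > 2) = {tail:.4f}")
print(f"normal  : P(|Z| > 2) = {normal_tail:.4f}")
```

The tail probability shrinks toward the normal value as the degrees of freedom grow, which is the convergence described above.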

**Q12: What is t-statistic? State the formula for t-statistic.**

The **t-statistic** is a measure that quantifies how far a sample mean is from the hypothesized population mean in terms of standard errors. It is commonly used in hypothesis testing to assess whether there is a significant difference between a sample mean and a population mean, or between two sample means.

The formula for the t-statistic depends on the context. Here are two common scenarios:

1. **One-Sample t-Test:**
   - The one-sample t-test is used to test whether the mean of a single sample is significantly different from a known or hypothesized population mean $\mu$.
   - The formula for the t-statistic in a one-sample t-test is:
     $ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} $
     where:
     - $\bar{x}$ is the sample mean.
     - $\mu$ is the hypothesized population mean.
     - $s$ is the sample standard deviation.
     - $n$ is the sample size.

2. **Two-Sample t-Test (Independent Samples):**
   - The two-sample t-test is used to compare the means of two independent samples.
   - The formula for the t-statistic in a two-sample t-test is:
     $ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $
     where:
     - $\bar{x}_1$ and $\bar{x}_2$ are the sample means of the two samples.
     - $s_1$ and $s_2$ are the sample standard deviations of the two samples.
     - $n_1$ and $n_2$ are the sample sizes of the two samples.

In both formulas, the t-statistic represents the number of standard errors that the sample mean is away from the hypothesized population mean or the difference between two sample means. The larger the absolute value of the t-statistic, the more evidence there is against the null hypothesis, indicating a potentially significant result. The t-statistic is then compared to critical values or p-values to make decisions in hypothesis testing.
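
Both formulas translate directly into code. The sketch below implements them with hypothetical helper names (`one_sample_t`, `two_sample_t`) and invented data, then compares the results with SciPy's built-in tests as a sanity check.

```python
import math
import numpy as np
from scipy import stats

def one_sample_t(sample, mu):
    """t = (x_bar - mu) / (s / sqrt(n))"""
    x_bar, s, n = np.mean(sample), np.std(sample, ddof=1), len(sample)
    return (x_bar - mu) / (s / math.sqrt(n))

def two_sample_t(sample1, sample2):
    """t = (x_bar1 - x_bar2) / sqrt(s1^2/n1 + s2^2/n2)"""
    v1 = np.var(sample1, ddof=1) / len(sample1)
    v2 = np.var(sample2, ddof=1) / len(sample2)
    return (np.mean(sample1) - np.mean(sample2)) / math.sqrt(v1 + v2)

# Invented data for illustration
sample_x = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 4.7, 5.5]
sample_y = [6.2, 6.8, 5.9, 7.1, 6.5, 6.0, 7.3, 6.6]

t_one = one_sample_t(sample_x, mu=5.0)
t_two = two_sample_t(sample_x, sample_y)

# SciPy equivalents (equal_var=False matches the unpooled formula)
scipy_one = stats.ttest_1samp(sample_x, popmean=5.0).statistic
scipy_two = stats.ttest_ind(sample_x, sample_y, equal_var=False).statistic

print(f"one-sample: manual {t_one:.4f}, scipy {scipy_one:.4f}")
print(f"two-sample: manual {t_two:.4f}, scipy {scipy_two:.4f}")
```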

**Q13. A coffee shop owner wants to estimate the average daily revenue for their shop. They take a random
sample of 50 days and find the sample mean revenue to be \\$500 with a standard deviation of \\$50.
Estimate the population mean revenue with a 95% confidence interval.**

To estimate the population mean revenue with a 95% confidence interval, you can use the following formula for the confidence interval:

$ \text{Confidence Interval}  = \bar{x} \pm \left( \text{critical value} \times \frac{s}{\sqrt{n}} \right) $

Where:
- $\bar{x}$ is the sample mean,
- $s$ is the sample standard deviation,
- $n$ is the sample size,
- The critical value depends on the desired confidence level.

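For a 95% confidence interval based on the normal distribution, the critical value (z-value) is approximately 1.96. With $n = 50$ the normal approximation is reasonable, although a t critical value gives a slightly wider interval when $s$ is estimated from the sample. The formula becomes: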

$ \text{Confidence Interval} = \bar{x} \pm \left( 1.96 \times \frac{s}{\sqrt{n}} \right) $

Given your data:
- $\bar{x}$ = \\$500 (sample mean revenue),
- $s$ = \\$50 (sample standard deviation),
- $n = 50$ (sample size),
- Critical value (z-value) for a 95% confidence interval is approximately 1.96.

Let's substitute these values into the formula:

$ \text{Confidence Interval} = 500 \pm \left( 1.96 \times \frac{50}{\sqrt{50}} \right) $ 

Now, calculate the margin of error and then construct the confidence interval:

```python
import math

# Given values
sample_mean = 500
sample_std_dev = 50
sample_size = 50
confidence_level = 0.95

# Critical value for a 95% confidence interval
critical_value = 1.96

# Calculate the margin of error
margin_of_error = critical_value * (sample_std_dev / math.sqrt(sample_size))

# Construct the confidence interval
confidence_interval_lower = sample_mean - margin_of_error
confidence_interval_upper = sample_mean + margin_of_error

print(f"95% Confidence Interval: (${confidence_interval_lower:.2f}, ${confidence_interval_upper:.2f})")
```

So, the margin of error is $1.96 \times 50/\sqrt{50} \approx 13.86$, and the 95% confidence interval for the average daily revenue is approximately (\\$486.14, \\$513.86). This means that we are 95% confident that the true population mean revenue falls within this interval.
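
Because the standard deviation comes from the sample, a t critical value with $n - 1 = 49$ degrees of freedom is a common alternative to 1.96. The sketch below computes both margins side by side; the variable names are illustrative.

```python
import math
from scipy import stats

sample_mean = 500
sample_std_dev = 50
sample_size = 50
confidence_level = 0.95

standard_error = sample_std_dev / math.sqrt(sample_size)
tail_prob = 1 - (1 - confidence_level) / 2   # 0.975 for a 95% interval

# Critical values from the normal and t distributions
z_crit = stats.norm.ppf(tail_prob)
t_crit = stats.t.ppf(tail_prob, df=sample_size - 1)

z_margin = z_crit * standard_error
t_margin = t_crit * standard_error

print(f"z-based: ({sample_mean - z_margin:.2f}, {sample_mean + z_margin:.2f})")
print(f"t-based: ({sample_mean - t_margin:.2f}, {sample_mean + t_margin:.2f})")
```

The t-based interval is slightly wider, and the gap narrows as the sample size increases.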

**Q14. A researcher hypothesizes that a new drug will decrease blood pressure by 10 mmHg. They conduct a
clinical trial with 100 patients and find that the sample mean decrease in blood pressure is 8 mmHg with a
standard deviation of 3 mmHg. Test the hypothesis with a significance level of 0.05.**

To test the hypothesis about the decrease in blood pressure using a one-sample t-test, we can use the following hypotheses:

- Null Hypothesis $ H_0 $: The true mean decrease in blood pressure $\mu$ is equal to 10 mmHg.<br>
  $ H_0: \mu = 10 $

- Alternative Hypothesis $H_1$: The true mean decrease in blood pressure $\mu$ is not equal to 10 mmHg.<br>
  $ H_1: \mu \neq 10 $

The test statistic for a one-sample t-test is given by:
$ t = \frac{\bar{x} - \mu_0}{\frac{s}{\sqrt{n}}} $

Where:
- $\bar{x}$ is the sample mean decrease in blood pressure,
- $\mu_0$ is the hypothesized population mean (10 mmHg),
- $s$ is the sample standard deviation,
- $n$ is the sample size.

Given your data:
- $\bar{x} = 8$ mmHg (sample mean decrease),
- $\mu_0 = 10$ mmHg (hypothesized population mean),
- $s = 3$ mmHg (sample standard deviation),
- $n = 100$ (sample size).

Let's calculate the t-statistic and perform the hypothesis test:

```python
import math
from scipy import stats

# Given values
sample_mean = 8
hypothesized_mean = 10
sample_std_dev = 3
sample_size = 100
significance_level = 0.05

# Calculate the t-statistic
t_statistic = (sample_mean - hypothesized_mean) / (sample_std_dev / math.sqrt(sample_size))

# Calculate degrees of freedom
degrees_of_freedom = sample_size - 1

# Calculate the p-value for a two-tailed test
p_value = 2 * (1 - stats.t.cdf(abs(t_statistic), df=degrees_of_freedom))

# Compare the p-value to the significance level
if p_value < significance_level:
    print("Reject the null hypothesis. The sample provides enough evidence to suggest a significant difference.")
else:
    print("Fail to reject the null hypothesis. There is not enough evidence to suggest a significant difference.")

print(f"t-Statistic: {t_statistic}\nP-Value: {p_value}")
```

The script calculates the t-statistic and compares the p-value to the significance level. Here $t = (8 - 10) / (3/\sqrt{100}) \approx -6.67$, and the resulting p-value is far below 0.05, so we reject the null hypothesis: the data indicate that the mean decrease in blood pressure differs from the hypothesized 10 mmHg.
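
The same decision can be reached with the critical-value approach: reject $H_0$ when $|t|$ exceeds the two-tailed critical value. A minimal sketch using the values from the question:

```python
import math
from scipy import stats

sample_mean = 8
hypothesized_mean = 10
sample_std_dev = 3
sample_size = 100
significance_level = 0.05

t_statistic = (sample_mean - hypothesized_mean) / (sample_std_dev / math.sqrt(sample_size))

# Two-tailed critical value with n - 1 degrees of freedom
t_critical = stats.t.ppf(1 - significance_level / 2, df=sample_size - 1)

# Reject H0 when the statistic falls in either rejection region
reject_null = bool(abs(t_statistic) > t_critical)

print(f"|t| = {abs(t_statistic):.3f}, critical value = {t_critical:.3f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```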

**Q15. An electronics company produces a certain type of product with a mean weight of 5 pounds and a
standard deviation of 0.5 pounds. A random sample of 25 products is taken, and the sample mean weight
is found to be 4.8 pounds. Test the hypothesis that the true mean weight of the products is less than 5
pounds with a significance level of 0.01.**

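To test the hypothesis that the true mean weight of the products is less than 5 pounds, we can use a one-sample t-test, treating the stated standard deviation of 0.5 pounds as an estimate. (If 0.5 pounds is taken as the known population standard deviation, a z-test is the textbook choice; for these numbers it leads to the same decision.) The hypotheses are as follows: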

- Null Hypothesis $H_0$: The true mean weight $\mu$ is equal to 5 pounds.<br>
  $ H_0: \mu = 5 $

- Alternative Hypothesis $H_1$: The true mean weight $\mu$ is less than 5 pounds.<br>
  $ H_1: \mu < 5 $

The test statistic for a one-sample t-test is given by:
$ t = \frac{\bar{x} - \mu_0}{\frac{s}{\sqrt{n}}} $

Where:
- $\bar{x}$ is the sample mean weight,
- $\mu_0$ is the hypothesized population mean (5 pounds),
- $s$ is the sample standard deviation,
- $n$ is the sample size.

Given your data:
- $\bar{x} = 4.8$ pounds (sample mean weight),
- $\mu_0 = 5$ pounds (hypothesized population mean),
- $s = 0.5$ pounds (sample standard deviation),
- $n = 25$ (sample size).

The significance level $\alpha$ is given as 0.01.

Let's calculate the t-statistic and perform the hypothesis test:

```python
import math
from scipy import stats

# Given values
sample_mean = 4.8
hypothesized_mean = 5
sample_std_dev = 0.5
sample_size = 25
significance_level = 0.01

# Calculate the t-statistic
t_statistic = (sample_mean - hypothesized_mean) / (sample_std_dev / math.sqrt(sample_size))

# Calculate degrees of freedom
degrees_of_freedom = sample_size - 1

# Calculate the p-value for a one-tailed test
p_value = stats.t.cdf(t_statistic, df=degrees_of_freedom)

# Compare the p-value to the significance level
if p_value < significance_level:
    print("Reject the null hypothesis. The sample provides enough evidence to suggest the true mean weight is less than 5 pounds.")
else:
    print("Fail to reject the null hypothesis. There is not enough evidence to suggest the true mean weight is less than 5 pounds.")

print(f"t-Statistic: {t_statistic}\nP-Value: {p_value}")
```

The script calculates the t-statistic and compares the one-tailed p-value to the significance level. Here $t = (4.8 - 5) / (0.5/\sqrt{25}) = -2.0$ with 24 degrees of freedom, which gives a p-value between 0.025 and 0.05. Since this exceeds $\alpha = 0.01$, we fail to reject the null hypothesis: there is not enough evidence at the 1% level that the true mean weight is less than 5 pounds.
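
The question states the standard deviation of the product itself, so one reasonable reading is that 0.5 pounds is the known population standard deviation. Under that assumption a one-tailed z-test applies; a minimal sketch:

```python
import math
from scipy import stats

sample_mean = 4.8
hypothesized_mean = 5
population_std_dev = 0.5   # assumed known, per the problem statement
sample_size = 25
significance_level = 0.01

# z-statistic: distance from the hypothesized mean in standard errors
z_statistic = (sample_mean - hypothesized_mean) / (population_std_dev / math.sqrt(sample_size))

# Left-tailed p-value, since H1 is mu < 5
p_value = stats.norm.cdf(z_statistic)

reject_null = bool(p_value < significance_level)

print(f"z = {z_statistic:.3f}, p = {p_value:.4f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

With $z = -2.0$ the p-value is about 0.023, which is above 0.01, so the null hypothesis is not rejected under this reading either.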

**Q16. Two groups of students are given different study materials to prepare for a test. The first group (n1 = 30) has a mean score of 80 with a standard deviation of 10, and the second group (n2 = 40) has a mean score of 75 with a standard deviation of 8. Test the hypothesis that the population means for the two groups are equal with a significance level of 0.01.**

To test the hypothesis that the population means for the two groups are equal, you can use a two-sample t-test for independent samples. The hypotheses are as follows:

- Null Hypothesis $H_0$: The true mean of the first group is equal to the true mean of the second group.<br>
  $ H_0: \mu_1 = \mu_2 $

- Alternative Hypothesis $H_1$: The true mean of the first group is not equal to the true mean of the second group.<br>
  $ H_1: \mu_1 \neq \mu_2 $

The test statistic for a two-sample t-test that does not assume equal population variances (Welch's form) is given by:
$ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $

Where:
- $\bar{x}_1$ and $\bar{x}_2$ are the sample means of the two groups.
- $s_1$ and $s_2$ are the sample standard deviations of the two groups.
- $n_1$ and $n_2$ are the sample sizes of the two groups.

Given your data:
- For Group 1: $\bar{x}_1 = 80$, $s_1 = 10$, $n_1 = 30$
- For Group 2: $\bar{x}_2 = 75$, $s_2 = 8$, $n_2 = 40$

The significance level $\alpha$ is given as 0.01.
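
Substituting the group summaries into the formula gives the statistic by hand:

$ t = \frac{80 - 75}{\sqrt{\frac{10^2}{30} + \frac{8^2}{40}}} = \frac{5}{\sqrt{3.333 + 1.6}} \approx \frac{5}{2.221} \approx 2.25 $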

Let's calculate the t-statistic and perform the hypothesis test:

```python
import math
from scipy import stats

# Given values for Group 1
mean_group1 = 80
std_dev_group1 = 10
size_group1 = 30

# Given values for Group 2
mean_group2 = 75
std_dev_group2 = 8
size_group2 = 40

# Significance level
significance_level = 0.01

# Calculate the t-statistic for a two-sample t-test
t_statistic = (mean_group1 - mean_group2) / math.sqrt((std_dev_group1**2 / size_group1) + (std_dev_group2**2 / size_group2))

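# Calculate degrees of freedom with the Welch-Satterthwaite approximation,
# which matches the unpooled standard error used in the t-statistic
var_term1 = std_dev_group1**2 / size_group1
var_term2 = std_dev_group2**2 / size_group2
degrees_of_freedom = (var_term1 + var_term2)**2 / (
    var_term1**2 / (size_group1 - 1) + var_term2**2 / (size_group2 - 1)
)

# Calculate the p-value for a two-tailed test (sf is the upper-tail probability)
p_value = 2 * stats.t.sf(abs(t_statistic), df=degrees_of_freedom)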

# Compare the p-value to the significance level
if p_value < significance_level:
    print("Reject the null hypothesis. There is enough evidence to suggest a significant difference in means.")
else:
    print("Fail to reject the null hypothesis. There is not enough evidence to suggest a significant difference in means.")

print(f"t-Statistic: {t_statistic}\nP-Value: {p_value}")
```

The script calculates the t-statistic and the two-tailed p-value, then compares the p-value to the significance level. With the given data, the t-statistic is about 2.25 and the p-value is roughly 0.03. Since this is greater than $\alpha = 0.01$, we fail to reject the null hypothesis: at the 1% level, there is not enough evidence that the two population means differ.
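
As a cross-check, SciPy can run the test directly from summary statistics with `scipy.stats.ttest_ind_from_stats`. The sketch below runs it both without the equal-variance assumption (`equal_var=False`, Welch's test) and with it (`equal_var=True`, pooled variance). Which assumption is appropriate is a judgment call the question does not settle, so both are shown.

```python
from scipy import stats

# Summary statistics from the question
group1 = dict(mean1=80, std1=10, nobs1=30)
group2 = dict(mean2=75, std2=8, nobs2=40)
significance_level = 0.01

# Welch's t-test: does not assume equal population variances
welch = stats.ttest_ind_from_stats(**group1, **group2, equal_var=False)

# Pooled (Student's) t-test: assumes equal population variances
pooled = stats.ttest_ind_from_stats(**group1, **group2, equal_var=True)

for label, result in (("Welch", welch), ("Pooled", pooled)):
    decision = "reject H0" if result.pvalue < significance_level else "fail to reject H0"
    print(f"{label}: t = {result.statistic:.3f}, p = {result.pvalue:.4f} -> {decision}")
```

Both variants give a p-value above 0.01 for this data, so the conclusion does not depend on the variance assumption here.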

**Q17. A marketing company wants to estimate the average number of ads watched by viewers during a TV program. They take a random sample of 50 viewers and find that the sample mean is 4 with a standard deviation of 1.5. Estimate the population mean with a 99% confidence interval.**

To estimate the population mean with a confidence interval, you can use the following formula for the confidence interval:

$ \text{Confidence Interval} = \bar{x} \pm \left( \text{critical value} \times \frac{s}{\sqrt{n}} \right) $

Where:
- $\bar{x}$ is the sample mean.
- $s$ is the sample standard deviation.
- $n$ is the sample size.

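The critical value depends on the desired confidence level. Because the population standard deviation is unknown, a t critical value with $n - 1 = 49$ degrees of freedom (about 2.68) is strictly the more appropriate choice. With a sample of 50, however, the normal approximation is close, so this solution uses it. For a 99% confidence interval with a normal distribution, the critical value (z-value) is approximately 2.576.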

Given your data:
- $\bar{x} = 4$ (sample mean),
- $s = 1.5$ (sample standard deviation),
- $n = 50$ (sample size),
- Critical value (z-value) for a 99% confidence interval is approximately 2.576.

Let's substitute these values into the formula:

$ \text{Confidence Interval} = 4 \pm \left( 2.576 \times \frac{1.5}{\sqrt{50}} \right) $

Now, calculate the margin of error and then construct the confidence interval:

```python
import math

# Given values
sample_mean = 4
sample_std_dev = 1.5
sample_size = 50
confidence_level = 0.99

# Critical value for a 99% confidence interval
critical_value = 2.576

# Calculate the margin of error
margin_of_error = critical_value * (sample_std_dev / math.sqrt(sample_size))

# Construct the confidence interval
confidence_interval_lower = sample_mean - margin_of_error
confidence_interval_upper = sample_mean + margin_of_error

print(f"99% Confidence Interval: ({confidence_interval_lower:.2f}, {confidence_interval_upper:.2f})")
```

The margin of error is $2.576 \times \frac{1.5}{\sqrt{50}} \approx 0.55$, so the 99% confidence interval for the average number of ads watched by viewers is approximately (3.45, 4.55). This means that we are 99% confident that the true population mean falls within this interval.
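
The interval above uses a hard-coded normal critical value. As a sketch of an alternative, the critical value can be derived from the confidence level itself, and the result compared with a t-based interval, which is the more conservative choice when the standard deviation is estimated from the sample. The variable names here are illustrative.

```python
import math
from scipy import stats

# Given values
sample_mean = 4
sample_std_dev = 1.5
sample_size = 50
confidence_level = 0.99

standard_error = sample_std_dev / math.sqrt(sample_size)

# Upper quantile that leaves (1 - confidence_level) / 2 in each tail
upper_quantile = (1 + confidence_level) / 2

# Critical values from the normal and t distributions
z_critical = stats.norm.ppf(upper_quantile)
t_critical = stats.t.ppf(upper_quantile, df=sample_size - 1)

z_interval = (sample_mean - z_critical * standard_error,
              sample_mean + z_critical * standard_error)
t_interval = (sample_mean - t_critical * standard_error,
              sample_mean + t_critical * standard_error)

print(f"z-based 99% CI: ({z_interval[0]:.2f}, {z_interval[1]:.2f})")
print(f"t-based 99% CI: ({t_interval[0]:.2f}, {t_interval[1]:.2f})")
```

The t-based interval is slightly wider, roughly (3.43, 4.57), because the t distribution has heavier tails than the normal distribution at 49 degrees of freedom.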