### Q1: What is Estimation Statistics? Explain point estimate and interval estimate.

**Estimation Statistics**:
- Estimation statistics involve using sample data to estimate population parameters. The goal is to infer characteristics of a population based on a sample.

**Point Estimate**:
- A point estimate is a single value used to estimate a population parameter. For example, the sample mean is a point estimate of the population mean.

**Interval Estimate**:
- An interval estimate provides a range of values within which the population parameter is expected to lie, with a certain level of confidence. For example, a 95% confidence interval for the mean is produced by a procedure that, over repeated sampling, captures the true population mean in about 95% of the intervals it generates.
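
As a rough illustration of the two kinds of estimate, the sketch below computes both from a single sample. The data are simulated purely for illustration (the mean of 50, standard deviation of 10, and sample size of 40 are assumptions, not values from any real study).

```python
import numpy as np
from scipy import stats

# Hypothetical sample, simulated only for illustration
rng = np.random.default_rng(seed=0)
sample = rng.normal(loc=50, scale=10, size=40)

# Point estimate: a single number, here the sample mean
point_estimate = sample.mean()

# Interval estimate: a 95% t-interval centered on the point estimate
standard_error = stats.sem(sample)  # s / sqrt(n), using ddof=1
lower, upper = stats.t.interval(
    0.95, len(sample) - 1, loc=point_estimate, scale=standard_error
)

print(f"Point estimate: {point_estimate:.2f}")
print(f"95% interval estimate: ({lower:.2f}, {upper:.2f})")
```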

### Q2: Write a Python function to estimate the population mean using a sample mean and standard deviation.

```python
import math

import scipy.stats as stats


def estimate_population_mean(sample_mean, sample_std_dev, confidence_level=0.95, sample_size=30):
    """Return a confidence interval for the population mean from summary statistics."""
    # Critical value from the t-distribution, because the population standard
    # deviation is unknown and estimated by the sample standard deviation
    alpha = 1 - confidence_level
    critical_value = stats.t.ppf(1 - alpha / 2, df=sample_size - 1)

    # Margin of error = critical value * standard error of the mean
    margin_of_error = critical_value * (sample_std_dev / math.sqrt(sample_size))

    # The sample mean is the point estimate; the interval is centered on it
    lower_bound = sample_mean - margin_of_error
    upper_bound = sample_mean + margin_of_error

    return lower_bound, upper_bound


# Example usage:
sample_mean = 50
sample_std_dev = 10
lower, upper = estimate_population_mean(sample_mean, sample_std_dev, sample_size=30)
print(f"95% Confidence Interval for the Population Mean: ({lower:.2f}, {upper:.2f})")
```

### Q3: What is Hypothesis Testing? Why is it used? State the importance of Hypothesis Testing.

**Hypothesis Testing**:
- Hypothesis testing is a statistical method used to make inferences or draw conclusions about a population based on a sample of data. It involves formulating and testing hypotheses to determine if there is enough evidence to support a specific claim or theory.

**Why It Is Used**:
- To evaluate claims or assumptions about a population.
- To make decisions based on sample data.
- To determine if observed effects are statistically significant.

**Importance**:
- Provides a systematic method to make data-driven decisions.
- Helps in assessing the validity of scientific and practical claims.
- Guards against mistaking random sampling variation for a real effect by controlling the probability of a false positive (Type I error).

### Q4: Create a hypothesis that states whether the average weight of male college students is greater than the average weight of female college students.

**Null Hypothesis (H0)**:
- \( H_0: \mu_{m} \leq \mu_{f} \) (The average weight of male college students is less than or equal to the average weight of female college students.)

**Alternative Hypothesis (H1)**:
- \( H_1: \mu_{m} > \mu_{f} \) (The average weight of male college students is greater than the average weight of female college students.)
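
A one-sided hypothesis like this one could be tested along the following lines. This is only a sketch: the weights (in kg) are hypothetical values made up for illustration, Welch's test is one reasonable choice among several, and the `alternative` keyword assumes SciPy 1.6 or later.

```python
import numpy as np
from scipy import stats

# Hypothetical weights in kg, invented for illustration only
male_weights = np.array([72, 80, 68, 75, 90, 77, 83, 70])
female_weights = np.array([58, 62, 65, 55, 70, 60, 63, 59])

# H0: mu_m <= mu_f   versus   H1: mu_m > mu_f
# alternative="greater" makes the test one-sided in the direction of H1;
# equal_var=False selects Welch's test (no equal-variance assumption)
t_statistic, p_value = stats.ttest_ind(
    male_weights, female_weights, equal_var=False, alternative="greater"
)

alpha = 0.05
print(f"t = {t_statistic:.3f}, one-sided p = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```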

### Q5: Write a Python script to conduct a hypothesis test on the difference between two population means, given a sample from each population.

```python
import scipy.stats as stats

# Sample data
mean1 = 70  # Mean of sample 1
std1 = 10   # Standard deviation of sample 1
n1 = 30     # Sample size of sample 1

mean2 = 65  # Mean of sample 2
std2 = 12   # Standard deviation of sample 2
n2 = 40     # Sample size of sample 2

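# Perform a two-sample t-test from summary statistics.
# equal_var=True (the SciPy default) is the pooled-variance Student's t-test;
# pass equal_var=False for Welch's test when the variances may differ.
t_statistic, p_value = stats.ttest_ind_from_stats(
    mean1, std1, n1, mean2, std2, n2, equal_var=True
)

print(f"T-statistic: {t_statistic:.4f}")
print(f"P-value: {p_value:.4f}")
```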

### Q6: What is a null and alternative hypothesis? Give some examples.

**Null Hypothesis (\(H_0\))**:
- The null hypothesis is a statement that there is no effect or no difference. It serves as the default assumption that any observed difference is due to random chance.
- **Example**: A drug has no effect on blood pressure; \( H_0: \mu = 0 \), where \( \mu \) is the mean change in blood pressure.

**Alternative Hypothesis (\(H_A\) or \(H_1\))**:
- The alternative hypothesis is a statement that indicates the presence of an effect or a difference. It is the claim the researcher seeks evidence for.
- **Example**: A drug does affect blood pressure; \( H_A: \mu \neq 0 \).

### Q7: Write down the steps involved in hypothesis testing.

1. **State the Hypotheses**:
   - Formulate the null hypothesis (\(H_0\)) and the alternative hypothesis (\(H_A\)).

2. **Choose the Significance Level**:
   - Decide on the significance level (\(\alpha\)), commonly 0.05 or 0.01.

3. **Select the Appropriate Test**:
   - Choose a statistical test based on the type of data and the hypotheses (e.g., t-test, z-test).

4. **Collect Data**:
   - Gather the sample data required for the test.

5. **Calculate the Test Statistic**:
   - Compute the test statistic using the sample data.

6. **Determine the p-value or Critical Value**:
   - Compare the test statistic to the critical value or use the p-value to assess significance.

7. **Make a Decision**:
   - Reject the null hypothesis if the p-value is less than \(\alpha\) or if the test statistic exceeds the critical value. Otherwise, fail to reject the null hypothesis.

8. **Draw a Conclusion**:
   - Summarize the findings and interpret the results in the context of the research question.
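
The steps above can be sketched in code as follows. The claimed mean of 100 and the ten measurements are hypothetical values chosen only to make the example runnable, and a one-sample t-test stands in for whichever test step 3 would select.

```python
import numpy as np
from scipy import stats

# Step 1: state the hypotheses (hypothetical claim: mu = 100)
#   H0: mu = 100    HA: mu != 100
mu_0 = 100

# Step 2: choose the significance level
alpha = 0.05

# Step 3: select the test (one-sample t-test, since sigma is unknown)
# Step 4: collect data (hypothetical measurements)
sample = np.array([102, 98, 105, 110, 99, 101, 104, 107, 95, 103])

# Steps 5 and 6: compute the test statistic and its two-sided p-value
t_statistic, p_value = stats.ttest_1samp(sample, popmean=mu_0)

# Step 7: make a decision
reject_null = p_value < alpha

# Step 8: report the conclusion in context
print(f"t = {t_statistic:.3f}, p = {p_value:.3f}")
if reject_null:
    print("Reject H0: the mean appears to differ from 100.")
else:
    print("Fail to reject H0: no significant difference from 100 detected.")
```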

### Q8: Define p-value and explain its significance in hypothesis testing.

**P-value**:
- The p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one actually observed. It quantifies the strength of the evidence against the null hypothesis.

**Significance**:
- A small p-value (less than the chosen significance level \(\alpha\)) indicates that the observed data would be unlikely if the null hypothesis were true, leading to its rejection.
- A large p-value means the data are consistent with the null hypothesis, so we fail to reject it. This is not proof that the null hypothesis is true.
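
To make the definition concrete, the sketch below converts a test statistic into one-sided and two-sided p-values using the survival function of the t-distribution. The statistic of 2.1 and the 20 degrees of freedom are hypothetical inputs.

```python
from scipy import stats

# Hypothetical test result
t_statistic = 2.1
degrees_of_freedom = 20

# One-sided p-value: P(T >= t) under H0, via the survival function (1 - CDF)
p_one_sided = stats.t.sf(t_statistic, degrees_of_freedom)

# Two-sided p-value: probability of a result at least this extreme in either tail
p_two_sided = 2 * stats.t.sf(abs(t_statistic), degrees_of_freedom)

print(f"One-sided p-value: {p_one_sided:.4f}")
print(f"Two-sided p-value: {p_two_sided:.4f}")
```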

### Q9: Generate a Student's t-distribution plot using Python's matplotlib library, with the degrees of freedom parameter set to 10.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import t

# Parameters
df = 10  # degrees of freedom
x = np.linspace(-5, 5, 1000)
y = t.pdf(x, df)

# Plot
plt.figure(figsize=(8, 6))
plt.plot(x, y, label=f't-distribution df={df}')
plt.title("Student's t-Distribution")
plt.xlabel("x")
plt.ylabel("Probability Density")
plt.legend()
plt.grid(True)
plt.show()
```

### Q10: Write a Python program to calculate the two-sample t-test for independent samples, given two random samples of equal size and a null hypothesis that the population means are equal.

```python
import numpy as np
from scipy import stats

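# Sample data: two random samples of equal size.
# These values are simulated placeholders; substitute your own observations.
rng = np.random.default_rng(seed=42)
sample1 = rng.normal(loc=50, scale=5, size=30)
sample2 = rng.normal(loc=52, scale=5, size=30)

# Perform the two-sample t-test (H0: the population means are equal)
t_statistic, p_value = stats.ttest_ind(sample1, sample2)

print(f"T-statistic: {t_statistic}")
print(f"P-value: {p_value}")
```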


### Q11: What is Student’s t distribution? When to use the t-Distribution.

**Student’s t Distribution**:
- The Student’s t distribution is a family of distributions that are similar in shape to the standard normal distribution but have heavier tails. It is used when the sample size is small and the population standard deviation is unknown.

**When to Use**:
- Use the t-distribution when:
  - Sample size is small (typically \( n < 30 \)).
  - Population standard deviation is unknown.
  - You are estimating the mean of a normally distributed population.
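
One way to see why the choice matters is to compare critical values. The sketch below prints the two-sided 95% critical value of the standard normal next to those of the t-distribution for a few degrees of freedom (the particular values 5, 10, 30, and 100 are arbitrary choices for illustration).

```python
from scipy import stats

confidence = 0.95
upper_tail_quantile = 1 - (1 - confidence) / 2  # 0.975 for a two-sided 95% level

# Critical value from the standard normal distribution
z_critical = stats.norm.ppf(upper_tail_quantile)

# Critical values from the t-distribution for several degrees of freedom
t_critical = {df: stats.t.ppf(upper_tail_quantile, df) for df in (5, 10, 30, 100)}

print(f"normal:       {z_critical:.3f}")
for df, value in t_critical.items():
    print(f"t (df={df:>3}): {value:.3f}")
```

The t critical values are larger for small samples, reflecting the heavier tails, and approach the normal value as the degrees of freedom grow.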

### Q12: What is t-statistic? State the formula for t-statistic.

**t-Statistic**:
- The t-statistic is a standardized value that is used in hypothesis testing to determine how far the sample mean deviates from the null hypothesis mean in units of the standard error.

**Formula**:
\[
t = \frac{\bar{X} - \mu}{\frac{s}{\sqrt{n}}}
\]
where:
- \( \bar{X} \) is the sample mean,
- \( \mu \) is the population mean under the null hypothesis,
- \( s \) is the sample standard deviation,
- \( n \) is the sample size.
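
The formula translates directly into code. In this minimal sketch the function name `t_statistic` and the five measurements are hypothetical, introduced only for illustration.

```python
import math
import statistics


def t_statistic(sample, mu_0):
    """One-sample t-statistic: (x_bar - mu_0) / (s / sqrt(n))."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (x_bar - mu_0) / (s / math.sqrt(n))


# Hypothetical measurements and null-hypothesis mean
measurements = [5.1, 4.9, 5.3, 5.2, 4.8]
t_value = t_statistic(measurements, mu_0=5.0)
print(f"t = {t_value:.3f}")
```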

### Q13: A coffee shop owner wants to estimate the average daily revenue for their shop. They take a random sample of 50 days and find the sample mean revenue to be $500 with a standard deviation of $50. Estimate the population mean revenue with a 95% confidence interval.

**Confidence Interval Calculation**:
1. **Sample Mean (\(\bar{X}\))**: $500
2. **Sample Standard Deviation (\(s\))**: $50
3. **Sample Size (\(n\))**: 50
4. **Degrees of Freedom (\(df\))**: \( n - 1 = 49 \)
5. **Critical Value (\(t_{\alpha/2}\))**: For 95% confidence and 49 degrees of freedom, \( t_{\alpha/2} \approx 2.0096 \) (from t-distribution table).

**Confidence Interval**:
\[
\text{CI} = \bar{X} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}
\]
\[
\text{CI} = 500 \pm 2.0096 \cdot \frac{50}{\sqrt{50}}
\]
\[
\text{CI} = 500 \pm 2.0096 \cdot 7.071
\]
\[
\text{CI} = 500 \pm 14.2
\]

**95% Confidence Interval**: \([485.8, 514.2]\)
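
The hand calculation can be cross-checked with SciPy. This is a sketch that uses only the summary statistics given in the question; the variable names are illustrative.

```python
import math
from scipy import stats

# Summary statistics from the question
n = 50
x_bar = 500.0
s = 50.0
confidence = 0.95

# Critical value from the t-distribution with n - 1 degrees of freedom
t_critical = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)

# Margin of error and interval bounds
margin = t_critical * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"t critical: {t_critical:.4f}")
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```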

### Q14: A researcher hypothesizes that a new drug will decrease blood pressure by 10 mmHg. They conduct a clinical trial with 100 patients and find that the sample mean decrease in blood pressure is 8 mmHg with a standard deviation of 3 mmHg. Test the hypothesis with a significance level of 0.05.

**Hypothesis Testing**:
- **Null Hypothesis (\(H_0\))**: \(\mu = 10\) mmHg
- **Alternative Hypothesis (\(H_1\))**: \(\mu \neq 10\) mmHg

1. **Sample Mean (\(\bar{X}\))**: 8 mmHg
2. **Sample Standard Deviation (\(s\))**: 3 mmHg
3. **Sample Size (\(n\))**: 100

**t-Statistic Calculation**:
\[
t = \frac{\bar{X} - \mu}{\frac{s}{\sqrt{n}}} = \frac{8 - 10}{\frac{3}{\sqrt{100}}} = \frac{-2}{0.3} = -6.67
\]

**Critical Value**: For a two-tailed test at \(\alpha = 0.05\) with 99 degrees of freedom, the critical value \( t_{\alpha/2} \approx 1.984 \).

**Decision**:
- Since \( |t| = 6.67 \) is greater than 1.984, we reject the null hypothesis.

**Conclusion**: There is sufficient evidence to reject the claim that the drug decreases blood pressure by 10 mmHg on average. The sample mean decrease of 8 mmHg suggests that the true mean decrease is smaller than hypothesized.
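
The test statistic, critical value, and p-value can be reproduced from the summary statistics, as in this sketch (variable names are illustrative).

```python
import math
from scipy import stats

# Summary statistics from the question
x_bar, mu_0, s, n = 8.0, 10.0, 3.0, 100
alpha = 0.05

# One-sample t-statistic against the hypothesized mean decrease of 10 mmHg
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
df = n - 1

# Two-sided p-value and critical value
p_value = 2 * stats.t.sf(abs(t_stat), df)
t_critical = stats.t.ppf(1 - alpha / 2, df)
reject_null = abs(t_stat) > t_critical

print(f"t = {t_stat:.2f}, critical value = {t_critical:.3f}, p = {p_value:.2e}")
print("Reject H0" if reject_null else "Fail to reject H0")
```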

### Q15: An electronics company produces a certain type of product with a mean weight of 5 pounds and a standard deviation of 0.5 pounds. A random sample of 25 products is taken, and the sample mean weight is found to be 4.8 pounds. Test the hypothesis that the true mean weight of the products is less than 5 pounds with a significance level of 0.01.

**Hypothesis Testing**:
- **Null Hypothesis (\(H_0\))**: \(\mu = 5\) pounds
- **Alternative Hypothesis (\(H_1\))**: \(\mu < 5\) pounds

1. **Sample Mean (\(\bar{X}\))**: 4.8 pounds
2. **Population Mean (\(\mu\))**: 5 pounds
3. **Population Standard Deviation (\(\sigma\))**: 0.5 pounds
4. **Sample Size (\(n\))**: 25

**z-Statistic Calculation**:
\[
z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}}} = \frac{4.8 - 5}{\frac{0.5}{\sqrt{25}}} = \frac{-0.2}{0.1} = -2
\]

**Critical Value**: For a one-tailed test at \(\alpha = 0.01\), the critical value \( z_{\alpha} \approx -2.33 \).

**Decision**:
- Since \( z = -2 \) is greater than -2.33, we do not reject the null hypothesis.

**Conclusion**: There is not enough evidence to conclude that the true mean weight is less than 5 pounds.
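
Because the population standard deviation is given, a z-test applies. The sketch below reproduces the calculation and adds the one-sided p-value, using illustrative variable names.

```python
import math
from scipy import stats

# Values from the question
x_bar, mu_0, sigma, n = 4.8, 5.0, 0.5, 25
alpha = 0.01

# z-statistic using the known population standard deviation
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Left-tailed test: the p-value is the area to the left of z
p_value = stats.norm.cdf(z)
z_critical = stats.norm.ppf(alpha)  # negative, since the rejection region is the left tail
reject_null = z < z_critical

print(f"z = {z:.2f}, critical value = {z_critical:.3f}, p = {p_value:.4f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```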

### Q16: Two groups of students are given different study materials to prepare for a test. The first group (\(n_1 = 30\)) has a mean score of 80 with a standard deviation of 10, and the second group (\(n_2 = 40\)) has a mean score of 75 with a standard deviation of 8. Test the hypothesis that the population means for the two groups are equal with a significance level of 0.01.

**Hypothesis Testing**:
- **Null Hypothesis (\(H_0\))**: \(\mu_1 = \mu_2\)
- **Alternative Hypothesis (\(H_1\))**: \(\mu_1 \neq \mu_2\)

1. **Group 1**: Mean = 80, Standard Deviation = 10, \(n_1 = 30\)
2. **Group 2**: Mean = 75, Standard Deviation = 8, \(n_2 = 40\)

**t-Statistic Calculation**:
\[
t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{80 - 75}{\sqrt{\frac{10^2}{30} + \frac{8^2}{40}}}
\]
\[
t = \frac{5}{\sqrt{3.33 + 1.6}} = \frac{5}{2.22} \approx 2.25
\]

**Degrees of Freedom**: Using the Welch-Satterthwaite approximation for unequal variances:
\[
df \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(\frac{s_1^2}{n_1}\right)^2}{n_1 - 1} + \frac{\left(\frac{s_2^2}{n_2}\right)^2}{n_2 - 1}}
\]
\[
df \approx \frac{(3.33 + 1.6)^2}{\frac{3.33^2}{29} + \frac{1.6^2}{39}} \approx 54
\]

**Critical Value**: For a two-tailed test at \(\alpha = 0.01\) with \( df \approx 54 \), the critical value \( t_{\alpha/2} \approx \pm 2.67 \).

**Decision**:
- Since \( |t| = 2.25 \) is less than 2.67, we do not reject the null hypothesis.

**Conclusion**: There is not enough evidence to conclude that the population means for the two groups are different.
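
The arithmetic can be checked with SciPy's summary-statistics version of the test. This sketch assumes Welch's test (`equal_var=False`), matching the unequal-variance formula used above, and recomputes the degrees of freedom by hand for comparison.

```python
from scipy import stats

# Summary statistics from the question
mean1, s1, n1 = 80, 10, 30
mean2, s2, n2 = 75, 8, 40
alpha = 0.01

# Welch's t-test from summary statistics (no equal-variance assumption)
t_stat, p_value = stats.ttest_ind_from_stats(
    mean1, s1, n1, mean2, s2, n2, equal_var=False
)

# Welch-Satterthwaite degrees of freedom, computed explicitly
v1, v2 = s1**2 / n1, s2**2 / n2
welch_df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
t_critical = stats.t.ppf(1 - alpha / 2, welch_df)

print(f"t = {t_stat:.3f}, df = {welch_df:.1f}, critical value = {t_critical:.3f}")
print(f"p = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```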

### Q17: A marketing company wants to estimate the average number of ads watched by viewers during a TV program. They take a random sample of 50 viewers and find that the sample mean is 4 with a standard deviation of 1.5. Estimate the population mean with a 99% confidence interval.

**Confidence Interval Calculation**:
1. **Sample Mean (\(\bar{X}\))**: 4
2. **Sample Standard Deviation (\(s\))**: 1.5
3. **Sample Size (\(n\))**: 50
4. **Degrees of Freedom (\(df\))**: \( n - 1 = 49 \)
5. **Critical Value (\(t_{\alpha/2}\))**: For 99% confidence and 49 degrees of freedom, \( t_{\alpha/2} \approx 2.684 \) (from t-distribution table).

**Confidence Interval**:
\[
\text{CI} = \bar{X} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}
\]
\[
\text{CI} = 4 \pm 2.684 \cdot \frac{1.5}{\sqrt{50}}
\]
\[
\text{CI} = 4 \pm 2.684 \cdot 0.212
\]
\[
\text{CI} = 4 \pm 0.569
\]

**99% Confidence Interval**: \([3.431, 4.569]\)
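
As an alternative to looking up the critical value in a table, `scipy.stats.t.interval` returns the interval directly. The sketch below applies it to the summary statistics from the question; small differences in the last digit relative to a hand calculation can arise from table rounding.

```python
import math
from scipy import stats

# Summary statistics from the question
n = 50
x_bar = 4.0
s = 1.5

# 99% t-interval: confidence level, degrees of freedom, center, standard error
standard_error = s / math.sqrt(n)
lower, upper = stats.t.interval(0.99, n - 1, loc=x_bar, scale=standard_error)

print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```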