### **Q1. Calculate the 95% confidence interval for a sample of data with a mean of 50 and a standard deviation of 5 using Python. Interpret the results.**

```python
import scipy.stats as stats
import math

mean = 50
std_dev = 5
n = 30  # sample size is not given in the question; 30 is assumed here
confidence = 0.95

z_score = stats.norm.ppf(1 - (1 - confidence)/2)
margin_error = z_score * (std_dev / math.sqrt(n))
lower = mean - margin_error
upper = mean + margin_error
(lower, upper)
```

**Answer:**  
**95% Confidence Interval** = (48.21, 51.79)  
Assuming a sample size of n = 30 (the question does not give one), we are 95% confident that the **true population mean lies between 48.21 and 51.79**.
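
The interval above uses a z critical value, which treats the standard deviation as the known population value. If the 5 is instead a sample standard deviation, a t-based interval is the usual alternative. A minimal sketch, assuming the same n = 30 as in the code above:

```python
import math
from scipy import stats

mean, std_dev, n = 50, 5, 30  # n = 30 is an assumption, not given in the question
confidence = 0.95

# Critical value from Student's t distribution with n - 1 degrees of freedom
t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df=n - 1)
margin = t_crit * std_dev / math.sqrt(n)
t_interval = (mean - margin, mean + margin)
print(t_interval)
```

The t interval is slightly wider than the z interval because the t distribution has heavier tails at small sample sizes.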

---

### **Q2. Conduct a chi-square goodness of fit test to determine if the distribution of M&M colors matches the expected distribution.**

```python
from scipy.stats import chisquare

observed = [22, 19, 18, 10, 11, 20]  # example values
expected_pct = [0.2, 0.2, 0.2, 0.1, 0.1, 0.2]
total = sum(observed)
expected = [p * total for p in expected_pct]

stat, p = chisquare(f_obs=observed, f_exp=expected)
(stat, p)
```

**Answer:**  
Chi² = 0.55, p-value ≈ 0.99  
Since p > 0.05, we **fail to reject** the null hypothesis. The example counts are consistent with the expected color distribution.
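
As a cross-check, the goodness-of-fit statistic can be computed by hand as the sum of (observed − expected)² / expected over all categories, with degrees of freedom equal to the number of categories minus one. A minimal sketch using the same example counts (the variable names `chi2_manual` and `p_manual` are illustrative):

```python
import numpy as np
from scipy import stats

observed = np.array([22, 19, 18, 10, 11, 20])  # same example counts as above
expected_pct = np.array([0.2, 0.2, 0.2, 0.1, 0.1, 0.2])
expected = expected_pct * observed.sum()

# Pearson statistic: sum of (O - E)^2 / E over all categories
chi2_manual = ((observed - expected) ** 2 / expected).sum()
dof = len(observed) - 1
p_manual = stats.chi2.sf(chi2_manual, dof)  # right-tail probability
print(chi2_manual, p_manual)
```

These two values should agree with what `chisquare` returns for the same inputs.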

---

### **Q3. Use Python to calculate the chi-square statistic and p-value for this contingency table:**

| Outcome | Group A | Group B |
|---------|---------|---------|
| 1       |   20    |   15    |
| 2       |   10    |   25    |
| 3       |   15    |   20    |

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[20, 15], [10, 25], [15, 20]])
chi2, p, dof, expected = chi2_contingency(table)
(chi2, p)
```

**Answer:**  
Chi² = 5.83, p-value = 0.0541  
The p-value is just above 0.05, so we **fail to reject** the null hypothesis of independence at the 0.05 level, although the result is borderline.
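
To see where the statistic comes from, the expected counts under independence are (row total × column total) / grand total, and the degrees of freedom are (rows − 1) × (columns − 1). A short sketch for the table above (`chi2_manual` is an illustrative name):

```python
import numpy as np

table = np.array([[20, 15], [10, 25], [15, 20]])

row_totals = table.sum(axis=1, keepdims=True)  # shape (3, 1)
col_totals = table.sum(axis=0, keepdims=True)  # shape (1, 2)
expected = row_totals * col_totals / table.sum()

dof = (table.shape[0] - 1) * (table.shape[1] - 1)
chi2_manual = ((table - expected) ** 2 / expected).sum()
print(expected)
print(dof, chi2_manual)
```

Every row has the same total here, so each row gets the same expected counts.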

---

### **Q4. Calculate the 95% confidence interval for smoking prevalence (60 out of 500).**

```python
import statsmodels.stats.proportion as smp

count = 60
nobs = 500
confint = smp.proportion_confint(count, nobs, alpha=0.05, method='normal')
confint
```

**Answer:**  
CI = (0.0915, 0.1485)  
We are 95% confident that the true smoking prevalence lies between **9.15% and 14.85%**.
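
The `method='normal'` option corresponds to the normal-approximation (Wald) interval, p̂ ± z·√(p̂(1 − p̂)/n). A minimal sketch that reproduces it with SciPy only (`wald_ci` is an illustrative name):

```python
import math
from scipy import stats

count, nobs = 60, 500
p_hat = count / nobs                        # sample proportion
z = stats.norm.ppf(0.975)                   # two-sided 95% critical value
se = math.sqrt(p_hat * (1 - p_hat) / nobs)  # standard error of p_hat
wald_ci = (p_hat - z * se, p_hat + z * se)
print(wald_ci)
```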

---

### **Q5. Calculate the 90% confidence interval for a sample with mean=75, SD=12, n=30.**

```python
import math
import scipy.stats as stats

mean = 75
std_dev = 12
n = 30
confidence = 0.90

z = stats.norm.ppf(1 - (1 - confidence)/2)
me = z * (std_dev / math.sqrt(n))
(mean - me, mean + me)
```

**Answer:**  
90% CI = (71.40, 78.60)  
We are 90% confident that the population mean lies between **71.40 and 78.60**.

---

### **Q6. Use Python to plot the chi-square distribution with 10 degrees of freedom and shade area for chi² = 15.**

```python
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

df = 10
x = np.linspace(0, 30, 500)
y = stats.chi2.pdf(x, df)

plt.plot(x, y, label='Chi-Square PDF (df=10)')
plt.fill_between(x, y, where=(x >= 15), color='red', alpha=0.5)
plt.axvline(15, color='red', linestyle='--')
plt.title('Chi-Square Distribution with 10 df')
plt.xlabel('Chi-Square Value')
plt.ylabel('Probability Density')
plt.legend()
plt.grid()
plt.show()
```

**Answer:**  
The red shaded area is the right-tail probability P(χ² ≥ 15) for 10 degrees of freedom. This is the p-value a chi-square test would report for an observed statistic of **15**.
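
The size of the shaded region can also be computed directly with the survival function of the chi-square distribution. A small sketch (`tail_area` is an illustrative name):

```python
from scipy import stats

# Right-tail probability P(X >= 15) for a chi-square variable with 10 df
tail_area = stats.chi2.sf(15, df=10)
print(round(tail_area, 4))
```

The result is roughly 0.13, so a statistic of 15 with 10 degrees of freedom would not be significant at the 0.05 level.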

---

### **Q7. A sample of 1000 people: 520 prefer Coke. Calculate a 99% CI for proportion.**

```python
import statsmodels.stats.proportion as smp

confint = smp.proportion_confint(520, 1000, alpha=0.01, method='normal')
confint
```

**Answer:**  
99% CI = (0.4793, 0.5607)  
We are 99% confident that **between 47.93% and 56.07%** prefer Coke.

---

### **Q8. Coin flipped 100 times, got 45 tails. Is it biased? Use chi-square test.**

```python
from scipy.stats import chisquare

observed = [45, 55]  # [tails, heads]
expected = [50, 50]

stat, p = chisquare(f_obs=observed, f_exp=expected)
(stat, p)
```

**Answer:**  
Chi² = 1.0, p = 0.3173  
p > 0.05 ⇒ We **fail to reject** the null ⇒ No evidence of bias.
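
The chi-square test is an approximation. With only two outcomes, an exact binomial calculation is a natural cross-check. A minimal sketch, assuming a two-sided test of P(tails) = 0.5 (`p_exact` is an illustrative name):

```python
from scipy import stats

n_flips, tails = 100, 45

# Two-sided exact p-value under H0: P(tails) = 0.5.
# The null distribution is symmetric, so doubling the lower tail is valid here.
p_exact = min(1.0, 2 * stats.binom.cdf(tails, n_flips, 0.5))
print(p_exact)
```

The exact p-value is also well above 0.05, which supports the same conclusion.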

---

### **Q9. Chi-Square Test for Independence: Smoking vs Lung Cancer**

|                 | Cancer Yes | Cancer No |
|-----------------|------------|-----------|
| Smoker          |     60     |   140     |
| Non-smoker      |     30     |   170     |

```python
from scipy.stats import chi2_contingency

table = [[60, 140], [30, 170]]  # rows: smoker, non-smoker; columns: cancer yes, no
chi2, p, dof, expected = chi2_contingency(table)
(chi2, p)
```

**Answer:**  
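Chi² = 12.06, p = 0.0005 (with Yates' continuity correction, which `chi2_contingency` applies by default to 2×2 tables)  
Since p < 0.05, we **reject** the null hypothesis of independence: there is a statistically significant association between **smoking and lung cancer**.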
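
For 2×2 tables, `chi2_contingency` applies Yates' continuity correction unless `correction=False` is passed. A short sketch comparing the two (the variable names are illustrative):

```python
from scipy.stats import chi2_contingency

table = [[60, 140], [30, 170]]

# Default behaviour: Yates' continuity correction for 2x2 tables
chi2_yates, p_yates, _, _ = chi2_contingency(table)

# Plain Pearson chi-square, without the correction
chi2_plain, p_plain, _, _ = chi2_contingency(table, correction=False)

print(chi2_yates, p_yates)
print(chi2_plain, p_plain)
```

The uncorrected statistic is somewhat larger, but both versions give p well below 0.05.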

---

### **Q10. Chocolate preference in US vs UK (each 500 people)**

|              | Milk | Dark | White |
|--------------|------|------|-------|
| US           | 200  | 180  | 120   |
| UK           | 220  | 150  | 130   |

```python
from scipy.stats import chi2_contingency

table = [[200, 180, 120], [220, 150, 130]]  # rows: US, UK; columns: milk, dark, white
chi2, p, dof, expected = chi2_contingency(table)
(chi2, p)
```

**Answer:**  
Chi² = 4.08, p = 0.1300  
p > 0.05 (and therefore also > 0.01) ⇒ No significant difference in chocolate preference between **US and UK**.

---

### **Q11. Sample of 30 people: mean = 72, std = 10. Is population mean ≠ 70?**

```python
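import math
from scipy import stats

# Only summary statistics are given, so compute the one-sample t-test from them
# directly rather than from simulated data (which would change on every run).
sample_mean, sample_std, n, pop_mean = 72, 10, 30, 70
t_stat = (sample_mean - pop_mean) / (sample_std / math.sqrt(n))
p = 2 * stats.t.sf(abs(t_stat), df=n - 1)  # two-sided p-value
(t_stat, p)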
```

**Answer:**  
t-stat ≈ 1.10, p ≈ 0.28  
p > 0.05 ⇒ **Fail to reject** the null ⇒ Mean is not significantly different from 70.
