
# 📌 Inferential Statistics – Interview Q&A + Coding

---

### **Q1. What is Inferential Statistics? How is it different from Descriptive Statistics?**

**Answer:**

* **Descriptive Statistics** → Summarizes data (mean, median, variance, plots).
* **Inferential Statistics** → Draws conclusions about a **population** based on a **sample**, with a stated level of uncertainty.
* Inferential methods rely on **probability theory, hypothesis testing, and confidence intervals**.

---

### **Q2. What are the key concepts in Inferential Statistics?**

**Answer:**

1. **Population vs Sample**
2. **Sampling Distribution & CLT**
3. **Hypothesis Testing (Null vs Alternative)**
4. **P-values & Significance Levels (α)**
5. **Confidence Intervals**
6. **t-tests, z-tests, Chi-square, ANOVA**
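The sampling distribution and CLT item can be illustrated with a small simulation. This is a minimal sketch: the exponential population, the seed, and the sample sizes are illustrative assumptions, not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# A clearly non-normal population: exponential with mean 2.0 (assumed for illustration)
population_mean = 2.0
n = 50              # observations per sample
n_samples = 10_000  # number of repeated samples

# Draw many samples and record the mean of each one
samples = rng.exponential(scale=population_mean, size=(n_samples, n))
sample_means = samples.mean(axis=1)

# CLT: the sample means cluster around the population mean, with spread
# close to sigma / sqrt(n) (sigma equals the mean, 2.0, for an exponential)
print("Mean of sample means:", sample_means.mean())
print("Std of sample means:", sample_means.std(ddof=1))
print("Theoretical standard error:", population_mean / np.sqrt(n))
```

Even though individual observations are skewed, a histogram of `sample_means` would look roughly bell-shaped, which is what justifies the tests listed above.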

---

### **Q3. Demonstrate hypothesis testing (one-sample t-test).**

Suppose the average exam score is claimed to be **70**. A sample of students scored:
`[65, 68, 72, 71, 69, 70, 73, 74, 68, 66]`.
Test whether the true mean equals 70 (H₀: μ = 70 vs H₁: μ ≠ 70).

```python
from scipy import stats

# Sample of exam scores
data = [65, 68, 72, 71, 69, 70, 73, 74, 68, 66]

# One-sample t-test: H0 is that the population mean equals 70 (two-sided by default)
t_stat, p_val = stats.ttest_1samp(data, 70)

print("t-statistic:", t_stat)
print("p-value:", p_val)
```

✅ **Answer:**

* If **p < 0.05**, reject null → evidence that mean ≠ 70.
* Here, t ≈ −0.43 and p ≈ 0.68 → fail to reject null → no evidence against mean = 70.
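To show what `ttest_1samp` does under the hood, the statistic can be rebuilt from the formula t = (x̄ − μ₀) / (s / √n). This is a sketch for intuition; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

scores = np.array([65, 68, 72, 71, 69, 70, 73, 74, 68, 66])
mu0 = 70  # hypothesized mean under the null

n = len(scores)
sample_mean = scores.mean()
sample_sd = scores.std(ddof=1)      # sample standard deviation (n - 1 denominator)
std_error = sample_sd / np.sqrt(n)

# t = (sample mean - hypothesized mean) / standard error
t_manual = (sample_mean - mu0) / std_error

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

print("t-statistic:", t_manual)
print("p-value:", p_manual)
```

The printed values should match the output of `stats.ttest_1samp(scores, 70)`.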

---

### **Q4. Compare two groups (independent t-test).**

Test if there is a significant difference between:

* Group A (Drug) = `[85, 90, 88, 75, 95]`
* Group B (Placebo) = `[70, 65, 80, 72, 68]`

```python
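from scipy import stats

group_a = [85, 90, 88, 75, 95]  # drug
group_b = [70, 65, 80, 72, 68]  # placebo

# Independent two-sample t-test (assumes equal variances by default)
t_stat, p_val = stats.ttest_ind(group_a, group_b)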

print("t-statistic:", t_stat)
print("p-value:", p_val)
```

✅ **Answer:**

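* Here, t ≈ 3.73 and p ≈ 0.006, so p < 0.05 → statistically significant difference between drug and placebo groups.
* With only 5 observations per group, treat the result as suggestive rather than conclusive.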
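A common follow-up question is what happens if the two groups do not share the same variance. A hedged sketch of the comparison, using the `equal_var=False` option of `ttest_ind` (Welch's t-test) on the same data:

```python
from scipy import stats

group_a = [85, 90, 88, 75, 95]  # drug
group_b = [70, 65, 80, 72, 68]  # placebo

# Student's t-test (default): pools the two variances, assuming they are equal
t_pooled, p_pooled = stats.ttest_ind(group_a, group_b)

# Welch's t-test: does not assume equal variances
t_welch, p_welch = stats.ttest_ind(group_a, group_b, equal_var=False)

print("Pooled: t =", t_pooled, "p =", p_pooled)
print("Welch:  t =", t_welch, "p =", p_welch)
```

Because both groups have the same size, the two t-statistics coincide; only the degrees of freedom, and therefore the p-values, differ.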

---

### **Q5. Construct a 95% confidence interval for a sample mean.**

Sample: `[12, 15, 14, 10, 13, 17, 12, 16]`

```python
import numpy as np
import scipy.stats as st

data = np.array([12, 15, 14, 10, 13, 17, 12, 16])
mean = np.mean(data)
sem = st.sem(data)   # standard error of the mean (uses ddof=1)

# t-based interval with n - 1 degrees of freedom, centered on the sample mean
conf_interval = st.t.interval(0.95, len(data) - 1, loc=mean, scale=sem)
print("95% Confidence Interval:", conf_interval)
```

✅ **Answer:**
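For this sample the interval is approximately (11.68, 15.57). Informally, we are **95% confident** that the true mean lies in this range; more precisely, intervals built this way capture the true mean in about 95% of repeated samples.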
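The "95%" describes the long-run behaviour of the procedure, which a simulation can make concrete. This is a minimal sketch: the population mean and standard deviation below are hypothetical values chosen for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_mean, true_sd = 14.0, 2.5  # hypothetical population parameters
n, n_trials = 8, 5_000          # sample size and number of repeated samples

t_crit = stats.t.ppf(0.975, df=n - 1)  # two-sided 95% critical value

# Build one t-interval per simulated sample
samples = rng.normal(true_mean, true_sd, size=(n_trials, n))
means = samples.mean(axis=1)
sems = samples.std(axis=1, ddof=1) / np.sqrt(n)
lower = means - t_crit * sems
upper = means + t_crit * sems

# Fraction of intervals that contain the true mean
coverage = np.mean((lower <= true_mean) & (true_mean <= upper))
print("Fraction of intervals containing the true mean:", coverage)
```

The printed fraction should land close to 0.95, since each interval either contains the true mean or does not.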

---

### **Q6. When to use z-test vs t-test?**

**Answer:**

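* **z-test:** Population variance is known (or n is large, commonly n ≥ 30, so the sample variance is a reliable stand-in).
* **t-test:** Population variance is unknown and estimated from the sample; the usual choice for small samples (n < 30), and it converges to the z-test as n grows.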
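The practical difference shows up in the critical values. A short sketch comparing the two-sided 95% cutoffs (the sample sizes are arbitrary examples):

```python
from scipy import stats

# Two-sided 95% critical values: z is fixed, t depends on degrees of freedom
z_crit = stats.norm.ppf(0.975)
print(f"z critical value: {z_crit:.3f}")

for n in (5, 10, 30, 100, 1000):
    t_crit = stats.t.ppf(0.975, df=n - 1)
    print(f"n = {n:>4}: t critical value = {t_crit:.3f}")
```

The t cutoff is wider for small n, reflecting the extra uncertainty from estimating the variance, and it approaches the z cutoff as n increases.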

---

### **Q7. Perform a Chi-Square Test of Independence.**

Check if **Gender** and **Preference (Yes/No)** are related:

|        | Yes | No |
| ------ | --- | -- |
| Male   | 30  | 20 |
| Female | 25  | 25 |

```python
import numpy as np
from scipy.stats import chi2_contingency

data = np.array([[30, 20],
                 [25, 25]])

chi2, p, dof, expected = chi2_contingency(data)

print("Chi2 Statistic:", chi2)
print("p-value:", p)
print("Expected Values:\n", expected)
```

✅ **Answer:**

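* If p < 0.05 → reject null → Gender and Preference are associated.
* Otherwise → fail to reject null → no evidence of association (this does not prove independence).
* Here, p ≈ 0.42 (SciPy applies Yates' continuity correction to 2×2 tables by default) → fail to reject null.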
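To see where the statistic comes from, the expected counts and the Pearson chi-square can be computed by hand and compared with SciPy. This is a sketch: `correction=False` turns off the Yates continuity correction that `chi2_contingency` applies to 2×2 tables by default.

```python
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[30, 20],
                     [25, 25]])

# Expected count for each cell = (row total * column total) / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected_manual = row_totals * col_totals / observed.sum()

# Pearson chi-square statistic without continuity correction
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

chi2_yates, p_yates, _, _ = chi2_contingency(observed)                    # default
chi2_plain, p_plain, _, _ = chi2_contingency(observed, correction=False)  # uncorrected

print("Manual chi2:", chi2_manual)
print("SciPy (no correction):", chi2_plain, "p =", p_plain)
print("SciPy (Yates, default):", chi2_yates, "p =", p_yates)
```

The manual value matches the uncorrected SciPy result; the corrected statistic is smaller, so its p-value is larger.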

---

### **Q8. Explain Type I and Type II Errors.**

**Answer:**

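* **Type I Error:** Rejecting the null when it’s true (false positive). Its probability is α, the significance level.
* **Type II Error:** Failing to reject the null when it’s false (false negative). Its probability is β.
* Power of a test = 1 − β, the probability of correctly rejecting a false null.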
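Both error rates can be estimated by simulation. The following is a minimal sketch: the population means, standard deviation, sample size, and seed are hypothetical choices for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05
n, n_trials = 20, 5_000

# Case 1: the null is true (population mean really is 70)
null_true = rng.normal(loc=70, scale=5, size=(n_trials, n))
_, p_null = stats.ttest_1samp(null_true, 70, axis=1)
type1_rate = np.mean(p_null < alpha)  # share of false positives

# Case 2: the null is false (population mean is actually 73)
null_false = rng.normal(loc=73, scale=5, size=(n_trials, n))
_, p_alt = stats.ttest_1samp(null_false, 70, axis=1)
power = np.mean(p_alt < alpha)        # share of correct rejections
type2_rate = 1 - power                # share of false negatives

print("Type I error rate:", type1_rate)
print("Type II error rate:", type2_rate)
print("Power:", power)
```

The Type I rate should sit near α, while the Type II rate depends on the true effect size and the sample size.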

---

### **Q9. Explain ANOVA with coding.**

Compare means across 3 groups:

```python
from scipy import stats

group1 = [23, 20, 22, 21, 24]
group2 = [30, 28, 29, 32, 31]
group3 = [40, 42, 38, 41, 39]

# One-way ANOVA: H0 is that all group means are equal
f_stat, p_val = stats.f_oneway(group1, group2, group3)

print("F-statistic:", f_stat)
print("p-value:", p_val)
```

✅ **Answer:**

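* If p < 0.05 → at least one group mean is significantly different.
* Here, F ≈ 162.67 and p is far below 0.05 → reject null. ANOVA alone does not say which groups differ; that needs a post-hoc comparison.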
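The F-statistic is the ratio of between-group to within-group variability. A sketch of the calculation by hand on the same three groups (variable names are illustrative):

```python
import numpy as np
from scipy import stats

groups = [np.array([23, 20, 22, 21, 24]),
          np.array([30, 28, 29, 32, 31]),
          np.array([40, 42, 38, 41, 39])]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k, n_total = len(groups), len(all_values)

# Between-group sum of squares: distance of each group mean from the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: spread of values around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F = mean square between / mean square within
f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))
p_manual = stats.f.sf(f_manual, k - 1, n_total - k)

print("F-statistic:", f_manual)
print("p-value:", p_manual)
```

A large F means the group means are far apart relative to the noise inside each group, which is why the p-value is so small here.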

---

### **Q10. Quick one-liner definition**

👉 “Inferential statistics uses sample data to draw conclusions about populations using **confidence intervals, hypothesis testing, and probability-based models**.”

---
