# DA-Session-9-DPP – Complete Solutions (Jupyter Notebook)

This notebook contains **all questions (Q1–Q11)** solved in a **clean, exam-safe format**.
All mathematical expressions use **Jupyter-compatible LaTeX (`$$ $$`)** to avoid rendering issues.

## Question 1: Systematic Sampling (Mall Survey)

**Definitions:**  
- `N` = total number of customers  
- `n` = required sample size  
- `k` = sampling interval

**Formula:**

$$ k = \frac{N}{n} $$

**Steps:**
1. Arrange customers in the order of entry.
2. Choose a random starting point between 1 and `k`.
3. Select every `k`-th customer thereafter.

This method spreads the sample evenly across the order of entry and is simple to carry out.
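The steps above can be sketched in Python. This is a minimal illustration, not part of the original question: the population of 500 customers, the sample size of 50, and the helper name `systematic_sample` are all assumptions chosen for the example, and the interval is taken as the integer part of $N/n$.

```python
import random

def systematic_sample(population, n, seed=None):
    """Return a systematic sample of n items from an ordered population."""
    N = len(population)
    k = N // n                      # sampling interval (integer part of N / n)
    rng = random.Random(seed)
    start = rng.randint(1, k)       # random start between 1 and k (1-based)
    # Take every k-th item, converting the 1-based start to a 0-based index.
    return population[start - 1::k][:n]

customers = list(range(1, 501))     # hypothetical: 500 customers in entry order
sample = systematic_sample(customers, n=50, seed=0)
print(sample[:5])
```

With these assumed values the interval is `k = 10`, so consecutive sampled customers are always 10 positions apart.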

## Question 2: Stratified Sampling (School Grades)

The population is divided into **homogeneous subgroups (strata)** based on grades.

**Formula for overall mean:**

$$ \bar{X} = \sum (w_i \times \bar{X}_i) $$

where:  
- $w_i$ = proportion of students in grade *i* ($w_i = N_i / N$, so the weights sum to 1)  
- $\bar{X}_i$ = mean score of grade *i*

Sampling from every stratum guarantees that all grades are represented.
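As a quick illustration of the weighted-mean formula, the sketch below uses made-up strata. The grade names, sizes, and mean scores are hypothetical and do not come from the question.

```python
# Hypothetical strata: grade -> (number of students, mean score)
strata = {
    "Grade 9": (120, 72.0),
    "Grade 10": (100, 75.0),
    "Grade 11": (80, 78.0),
}

N = sum(size for size, _ in strata.values())

# Overall mean = sum of (stratum weight * stratum mean), with weight = N_i / N
overall_mean = sum((size / N) * mean for size, mean in strata.values())
print(round(overall_mean, 2))
```

Larger strata pull the overall mean towards their own mean, which is why the weights must be the population proportions rather than equal shares.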

## Question 3: Proportional Allocation

**Given:**  
- Total students, $N = 800$  
- Sample size, $n = 100$

**Formula:**

$$ n_i = \frac{N_i}{N} \times n $$

**Calculations:**

- Junior grades: $(400/800) \times 100 = 50$  
- Senior grades: $(250/800) \times 100 \approx 31$  
- Advanced grades: $(150/800) \times 100 \approx 19$
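The allocation can be checked with a few lines of Python. This sketch uses the stratum sizes from the calculation above and rounds each share to the nearest whole student; the variable names are illustrative.

```python
strata_sizes = {"Junior": 400, "Senior": 250, "Advanced": 150}
N = sum(strata_sizes.values())   # 800 students in total
n = 100                          # required sample size

# Exact proportional share for each stratum: n_i = (N_i / N) * n
exact = {grade: size / N * n for grade, size in strata_sizes.items()}

# Round to whole students
allocation = {grade: round(share) for grade, share in exact.items()}
print(exact)
print(allocation, sum(allocation.values()))
```

Here the rounded values happen to add up to exactly 100. In general, rounding can leave the total one or two off from $n$, in which case the allocation needs a manual adjustment.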

## Question 4: Systematic Sampling of Transactions

**Definitions:**  
- `a` = starting transaction number = 7  
- `k` = sampling interval = 20  
- `i` = index (0, 1, 2, ...)

**Formula:**

$$ T_i = a + i \times k $$

**First 10 sampled transactions:**

7, 27, 47, 67, 87, 107, 127, 147, 167, 187
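The same sequence can be generated directly from the formula, as a small check:

```python
a = 7    # starting transaction number
k = 20   # sampling interval

# T_i = a + i * k for i = 0, 1, ..., 9
sampled = [a + i * k for i in range(10)]
print(sampled)
```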

## Question 5: Advantage of Cluster Sampling

In cluster sampling, the population is divided into **clusters (neighborhoods)**, a random subset of clusters is selected, and everyone in the selected clusters is surveyed.

**Main Advantage:**
- Reduced cost and time
- Practical for geographically dispersed populations

## Question 6: Standard Error of the Mean

**Given:**  
- Population standard deviation, $\sigma = 3$  
- Sample size, $n = 36$

**Formula:**

$$ SE = \frac{\sigma}{\sqrt{n}} $$

**Calculation:**

$$ SE = \frac{3}{\sqrt{36}} = \frac{3}{6} = 0.5 $$

## Question 7: Standard Error (Light Bulb Lifespan)

**Given:**  
- $\sigma = 200$, $n = 50$

**Formula:**

$$ SE = \frac{\sigma}{\sqrt{n}} $$

**Calculation:**

$$ SE = \frac{200}{\sqrt{50}} \approx 28.28 $$
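The standard-error calculations in Questions 6 and 7 can be verified with a small helper. The function name `standard_error` is an illustrative choice, not something defined in the question.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean for a known population standard deviation."""
    return sigma / math.sqrt(n)

se_q6 = standard_error(3, 36)     # Question 6
se_q7 = standard_error(200, 50)   # Question 7
print(se_q6, round(se_q7, 2))
```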

## Question 8: Effect of Sample Size on Variability

**Formula:**

$$ SE = \frac{\sigma}{\sqrt{n}} $$

**Results (for $\sigma = 10$, the value implied by the figures below):**

- $n = 16 \Rightarrow SE = 2.5$  
- $n = 64 \Rightarrow SE = 1.25$  
- $n = 256 \Rightarrow SE = 0.625$

**Conclusion:** As sample size increases, the standard error decreases. Because $SE \propto 1/\sqrt{n}$, each fourfold increase in $n$ halves the standard error.
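The pattern in the results can be reproduced with the short sketch below. It assumes $\sigma = 10$, which is inferred from the listed values rather than stated in this section.

```python
import math

sigma = 10  # assumption: inferred from the listed results, not stated explicitly

# Standard error for each sample size
se_by_n = {n: sigma / math.sqrt(n) for n in (16, 64, 256)}
for n, se in se_by_n.items():
    print(f"n = {n:>3}: SE = {se}")
```

Each sample size is four times the previous one, and each standard error is half the previous one.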

## Question 9: Survey Sampling Method and Bias

**Sampling Method Used:** Voluntary response sampling

**Potential Biases:**
- Non-response bias
- Self-selection (voluntary response) bias: overrepresentation of motivated respondents

## Question 10: Sampling Distribution of the Mean

**Sample means:**
75, 78, 74, 76, 77, 75, 79, 76, 74, 77

**Mean of sampling distribution:**

$$ \bar{X} = \frac{\sum X}{n} = \frac{761}{10} = 76.1 $$

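**Standard deviation of means (sample formula, $n - 1$ denominator):**

$$ s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{24.9}{9}} \approx 1.66 \approx 1.7 $$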
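Both figures can be checked with Python's standard `statistics` module. The sketch below also prints the population-style standard deviation (denominator $n$) for comparison, since the two conventions give slightly different values; the value of about 1.7 quoted above matches the sample version.

```python
import statistics

sample_means = [75, 78, 74, 76, 77, 75, 79, 76, 74, 77]

mean_of_means = statistics.mean(sample_means)
sd_sample = statistics.stdev(sample_means)        # divides by n - 1
sd_population = statistics.pstdev(sample_means)   # divides by n

print(mean_of_means, round(sd_sample, 2), round(sd_population, 2))
```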

## Question 11: Probability Using Central Limit Theorem

**Given:**  
- Population mean, $\mu = 200$  
- Population standard deviation, $\sigma = 50$  
- Sample size, $n = 36$

**Step 1: Standard Error**

$$ SE = \frac{50}{\sqrt{36}} = \frac{50}{6} \approx 8.33 $$

**Step 2: Z-score**

$$ Z = \frac{210 - 200}{8.33} \approx 1.20 $$

**Step 3: Probability**

From Z-table:

$$ P(Z < 1.20) = 0.8849 $$

$$ P(\bar{X} > 210) = 1 - 0.8849 = 0.1151 \approx 0.115 $$

Thus, there is about an 11.5% probability that the sample mean exceeds 210 g.
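As a cross-check that avoids the Z-table lookup, the three steps can be reproduced with `statistics.NormalDist` from the Python standard library. This is a sketch using the given values; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 200, 50, 36
threshold = 210

se = sigma / math.sqrt(n)          # Step 1: standard error
z = (threshold - mu) / se          # Step 2: Z-score
p = 1 - NormalDist().cdf(z)        # Step 3: upper-tail probability

print(round(se, 2), round(z, 2), round(p, 3))
```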