# Question 1. Linear Discriminant Function (PAGE 08)

We aim to show that the linear discriminant function is given by  
$
\delta_k(x) = \frac{x \mu_k}{\sigma^2} - \frac{\mu_k^2}{2\sigma^2} + \ln(\pi_k).
$
Using the hint:  
$
\ln \left( \Pr(Y = k \mid X = x) \right) \approx \delta_k(x),
$
where $ \approx $ means equal up to terms that do not depend on the class $ k $.

## Step-by-Step Derivation

### 1. Bayes’ Theorem  
Using Bayes' Theorem, the posterior probability of class $ Y = k $ is:  
$
\Pr(Y = k \mid X = x) = \frac{f_k(x) \cdot \pi_k}{\sum_{j=1}^K f_j(x) \cdot \pi_j},
$
where  
- $ f_k(x) $: class-conditional density function (assumed to be normal),  
- $ \pi_k $: prior probability of class $ k $,  
- $ K $: total number of classes.

### 2. Normal (Gaussian) Density Function for $ f_k(x) $  
Assuming that $ X $ given $ Y = k $ is normally distributed, its density is:  
$
f_k(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left( - \frac{(x - \mu_k)^2}{2 \sigma^2} \right),
$
where $ \mu_k $ is the mean of class $ k $ and $ \sigma^2 $ is the variance shared by all classes.
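
Steps 1 and 2 can be checked numerically. The following is a minimal sketch, assuming two classes with made-up means, priors, and variance; the function names `normal_pdf` and `posterior` are illustrative and not part of the homework.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def posterior(x, mus, priors, sigma2):
    """Pr(Y = k | X = x) for every class k, via Bayes' theorem."""
    joint = [normal_pdf(x, mu, sigma2) * pi for mu, pi in zip(mus, priors)]
    total = sum(joint)  # denominator: the same for every class
    return [j / total for j in joint]

# Hypothetical values: x sits exactly halfway between the two means
probs = posterior(x=1.0, mus=[0.0, 2.0], priors=[0.5, 0.5], sigma2=1.0)
print(probs)
```

Because `x` is equidistant from both means and the priors are equal, the two posterior probabilities come out equal.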

### 3. Logarithm of Posterior Probability (Approximation)  
Take the natural logarithm of both sides:  
$
\ln \left( \Pr(Y = k \mid X = x) \right) = \ln \left( \frac{f_k(x) \cdot \pi_k}{\sum_{j=1}^K f_j(x) \cdot \pi_j} \right).
$
The denominator is the same for every class $ k $, so it does not affect which class has the largest posterior. For the discriminant function we therefore keep only the numerator:  
$
\ln \left( f_k(x) \cdot \pi_k \right) = \ln(f_k(x)) + \ln(\pi_k).
$

### 4. Simplify $ \ln(f_k(x)) $  
Substitute the normal density function $ f_k(x) $:  
$
\ln(f_k(x)) = \ln \left( \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left( - \frac{(x - \mu_k)^2}{2 \sigma^2} \right) \right).
$
Using $ \ln(ab) = \ln(a) + \ln(b) $ and $ \ln(\exp(z)) = z $, break this into separate terms:  
$
\ln(f_k(x)) = - \frac{1}{2} \ln(2 \pi \sigma^2) - \frac{(x - \mu_k)^2}{2 \sigma^2}.
$

### 5. Expand $ (x - \mu_k)^2 $  
Expand the quadratic term:  
$
(x - \mu_k)^2 = x^2 - 2x \mu_k + \mu_k^2.
$
Thus,  
$
\ln(f_k(x)) = - \frac{1}{2} \ln(2 \pi \sigma^2) - \frac{x^2 - 2x \mu_k + \mu_k^2}{2 \sigma^2}.
$

### 6. Simplify the Expression  
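The term $ - \frac{1}{2} \ln(2 \pi \sigma^2) $ does not depend on $ k $, so it is the same for every class and can be dropped when comparing classes. This leaves:  
$
\ln(f_k(x)) \approx - \frac{x^2}{2 \sigma^2} + \frac{x \mu_k}{\sigma^2} - \frac{\mu_k^2}{2 \sigma^2}.
$
The term $ - \frac{x^2}{2 \sigma^2} $ does not depend on $ k $ either, so drop it as well and include the term $ \ln(\pi_k) $:  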
$
\ln \left( f_k(x) \cdot \pi_k \right) \approx \frac{x \mu_k}{\sigma^2} - \frac{\mu_k^2}{2 \sigma^2} + \ln(\pi_k).
$

### 7. Define the Linear Discriminant Function  
Let $ \delta_k(x) $ represent the discriminant function:  
$
\delta_k(x) = \frac{x \mu_k}{\sigma^2} - \frac{\mu_k^2}{2 \sigma^2} + \ln(\pi_k).
$
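
The claim that only class-independent terms were dropped can be checked numerically. This is a minimal sketch with made-up parameters for three classes; `delta` and `log_posterior` are illustrative names. If the derivation holds, the gap between the exact log-posterior and $ \delta_k(x) $ should be the same for every class, so both pick the same class.

```python
import math

def delta(x, mu, prior, sigma2):
    """Linear discriminant: x*mu/sigma2 - mu^2/(2*sigma2) + ln(prior)."""
    return x * mu / sigma2 - mu ** 2 / (2 * sigma2) + math.log(prior)

def log_posterior(x, mus, priors, sigma2):
    """Exact ln Pr(Y = k | X = x) for each class under the Gaussian model."""
    log_joint = [
        -0.5 * math.log(2 * math.pi * sigma2)
        - (x - mu) ** 2 / (2 * sigma2)
        + math.log(pi)
        for mu, pi in zip(mus, priors)
    ]
    log_total = math.log(sum(math.exp(v) for v in log_joint))
    return [v - log_total for v in log_joint]

# Hypothetical parameters, chosen only for illustration
mus, priors, sigma2, x = [-1.0, 0.5, 3.0], [0.2, 0.5, 0.3], 1.5, 0.8

deltas = [delta(x, mu, pi, sigma2) for mu, pi in zip(mus, priors)]
log_post = log_posterior(x, mus, priors, sigma2)

# Difference between exact log-posterior and discriminant, per class
gaps = [lp - d for lp, d in zip(log_post, deltas)]
print(gaps)
```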

### Conclusion  
We have derived the linear discriminant function $ \delta_k(x) $, which depends linearly on $ x $ and differentiates between classes based on their means $ \mu_k $, prior probabilities $ \pi_k $, and the common variance $ \sigma^2 $.


# Question 2. Compare Two Classes (PAGE 09)

### 1. Discriminant Functions for Two Categories $ k $ and $ i $
From Linear Discriminant Analysis (LDA), we have derived in Question 1 of Homework 5 that the discriminant function for a given category $ k $ is:  
$
\delta_k(x) = \frac{x \mu_k}{\sigma^2} - \frac{\mu_k^2}{2 \sigma^2} + \ln(\pi_k),
$
and for category $ i $, it is:  
$
\delta_i(x) = \frac{x \mu_i}{\sigma^2} - \frac{\mu_i^2}{2 \sigma^2} + \ln(\pi_i).
$

### 2. Comparing $ \delta_k(x) $ and $ \delta_i(x) $
We assign $ x $ to category $ k $ if $ \delta_k(x) > \delta_i(x) $.  
Substituting the two discriminant functions, this inequality becomes:  $\frac{x \mu_k}{\sigma^2} - \frac{\mu_k^2}{2 \sigma^2} + \ln(\pi_k) > \frac{x \mu_i}{\sigma^2} - \frac{\mu_i^2}{2 \sigma^2} + \ln(\pi_i).$

### 3. Simplifying the Inequality
Rearrange the terms:  $\frac{x \mu_k}{\sigma^2} - \frac{x \mu_i}{\sigma^2} > \frac{\mu_k^2}{2 \sigma^2} - \frac{\mu_i^2}{2 \sigma^2} + \ln(\pi_i) - \ln(\pi_k).$  
Factor out common terms:  $\frac{x (\mu_k - \mu_i)}{\sigma^2} > \frac{\mu_k^2 - \mu_i^2}{2 \sigma^2} + \ln\left( \frac{\pi_i}{\pi_k} \right).$  
Multiply both sides by $ \sigma^2 $ (since $ \sigma^2 > 0 $, this does not change the direction of the inequality):  $x (\mu_k - \mu_i) > \frac{\mu_k^2 - \mu_i^2}{2} + \sigma^2 \ln\left( \frac{\pi_i}{\pi_k} \right).$

### 4. Case 1: Equal Priors ( $ \pi_k = \pi_i $ )
This is the assumption made in the lecture slides. If $ \pi_k = \pi_i $, then $ \ln\left( \frac{\pi_i}{\pi_k} \right) = 0 $, and the inequality simplifies to:  $x (\mu_k - \mu_i) > \frac{\mu_k^2 - \mu_i^2}{2}.$  
Multiply both sides by 2 to remove the fraction:  $2x (\mu_k - \mu_i) > \mu_k^2 - \mu_i^2.$  
This gives the decision rule:
- **If $ 2x (\mu_k - \mu_i) > \mu_k^2 - \mu_i^2 $**, assign $ x $ to group $ k $.  
- **If $ 2x (\mu_k - \mu_i) < \mu_k^2 - \mu_i^2 $**, assign $ x $ to group $ i $.
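
The equal-priors rule can be sketched in a few lines. The function name `assign_equal_priors` and the example means are assumptions made for illustration; a tie (both sides equal) is sent to group $ i $ here, which is an arbitrary choice since the rule above does not cover that case.

```python
def assign_equal_priors(x, mu_k, mu_i):
    """Return 'k' if 2x(mu_k - mu_i) > mu_k^2 - mu_i^2, otherwise 'i'."""
    lhs = 2 * x * (mu_k - mu_i)
    rhs = mu_k ** 2 - mu_i ** 2
    return "k" if lhs > rhs else "i"

# Hypothetical means 3 and 1: their midpoint is 2
print(assign_equal_priors(2.5, mu_k=3.0, mu_i=1.0))  # x above the midpoint
print(assign_equal_priors(1.5, mu_k=3.0, mu_i=1.0))  # x below the midpoint
```

Since $ \mu_k^2 - \mu_i^2 = (\mu_k - \mu_i)(\mu_k + \mu_i) $, for $ \mu_k > \mu_i $ the rule is equivalent to $ x > \frac{\mu_k + \mu_i}{2} $, so $ x $ goes to the class whose mean is closer.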

---

### 5. Case 2: Unequal Priors ( $ \pi_k \neq \pi_i $ )  
If the prior probabilities $ \pi_k $ and $ \pi_i $ are not equal, we cannot drop the $ \ln\left( \frac{\pi_i}{\pi_k} \right) $ term.  
Multiplying the inequality from step 3 by 2 gives:  $2x (\mu_k - \mu_i) > \mu_k^2 - \mu_i^2 + 2 \sigma^2 \ln\left( \frac{\pi_i}{\pi_k} \right).$  
If this inequality holds, assign $ x $ to category $ k $; otherwise, assign $ x $ to category $ i $.  
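
To see how the priors move the decision, here is a minimal sketch under made-up parameters. The names `assign` and `boundary` are illustrative; `boundary` solves the inequality for the value of $ x $ at which both sides are equal, which requires $ \mu_k \neq \mu_i $.

```python
import math

def assign(x, mu_k, mu_i, pi_k, pi_i, sigma2):
    """Return 'k' if the unequal-priors inequality holds, otherwise 'i'."""
    lhs = 2 * x * (mu_k - mu_i)
    rhs = mu_k ** 2 - mu_i ** 2 + 2 * sigma2 * math.log(pi_i / pi_k)
    return "k" if lhs > rhs else "i"

def boundary(mu_k, mu_i, pi_k, pi_i, sigma2):
    """Value of x where both sides are equal (assumes mu_k != mu_i)."""
    return (mu_k + mu_i) / 2 + sigma2 * math.log(pi_i / pi_k) / (mu_k - mu_i)

# Hypothetical setup: class k is four times as likely a priori as class i
print(boundary(3.0, 1.0, 0.8, 0.2, 1.0))     # lies below the midpoint 2
print(assign(1.5, 3.0, 1.0, 0.8, 0.2, 1.0))  # unequal priors
print(assign(1.5, 3.0, 1.0, 0.5, 0.5, 1.0))  # equal priors, same x
```

With the larger prior on class $ k $, the boundary shifts toward $ \mu_i $, so an observation at $ x = 1.5 $ that equal priors would give to $ i $ is assigned to $ k $ instead.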

---

### Conclusion  

The derived rule compares the two groups using their means $ \mu_k $, $ \mu_i $, their priors $ \pi_k $, $ \pi_i $, and the common variance $ \sigma^2 $. In simple terms:
- The left-hand side $ 2x (\mu_k - \mu_i) $ is the observation $ x $ scaled by twice the difference between the two means.  
- The right-hand side $ \mu_k^2 - \mu_i^2 + 2 \sigma^2 \ln\left( \frac{\pi_i}{\pi_k} \right) $ is the threshold, adjusted by the class priors. With equal priors and $ \mu_k > \mu_i $, the rule is equivalent to $ x > \frac{\mu_k + \mu_i}{2} $: $ x $ is assigned to $ k $ when it lies above the midpoint of the two means.

Thus, this formula helps decide which category $ x $ should be assigned to based on the comparison between the observation and this threshold.