# Problem 1 – Naive Bayes on Animal Dataset

### Name:
Xavier Soto

### UCF ID:
5601517

---

## Implementation

We classify an animal as **Mammal** or **Non-Mammal** from four categorical features:

1) Give Birth (yes/no)  
2) Can Fly (yes/no)  
3) Live in Water (no/sometimes/yes)  
4) Have Legs (yes/no)

We use **Naive Bayes** with Laplace smoothing ($\alpha=1$).  

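General formulas, where $N$ is the total number of training animals, $N_c$ is the number in class $c$, and $|V|$ is the number of distinct values the feature can take: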

- **Class prior:**
$$
P(C=c) = \frac{N_c}{N}
$$

- **Smoothed likelihood:**
$$
P(x=v \mid C=c) = \frac{\text{count}(x=v,\,C=c)+\alpha}{N_c+\alpha |V|}
$$

- **MAP rule:**
$$
\hat{c} = \arg\max_{c} \Big[ \log P(C=c) + \sum_i \log P(x_i \mid C=c) \Big]
$$

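The three formulas above translate fairly directly into code. The following is a minimal sketch rather than the implementation used for this assignment: the function names `smoothed_likelihood` and `map_class` are hypothetical, and the illustrative call uses the "Live in Water = sometimes" count for Mammal (0 out of 8, with $|V|=3$) from the tables in Step 2. Working with log probabilities, as in the MAP rule, avoids numerical underflow when many features are multiplied together.

```python
import math


def smoothed_likelihood(count, n_c, n_values, alpha=1.0):
    """Laplace-smoothed P(x=v | C=c) = (count + alpha) / (N_c + alpha * |V|)."""
    return (count + alpha) / (n_c + alpha * n_values)


def map_class(priors, likelihoods):
    """Return the class maximising log P(c) + sum_i log P(x_i | c).

    priors:      dict mapping class -> P(C=c)
    likelihoods: dict mapping class -> list of P(x_i | C=c), one per feature
    """
    def log_score(c):
        return math.log(priors[c]) + sum(math.log(p) for p in likelihoods[c])

    return max(priors, key=log_score)


# Illustrative call: raw count 0 out of 8 mammals, feature with 3 values.
p_sometimes_given_mammal = smoothed_likelihood(0, 8, 3)  # equals 1/11
```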
---

## Step 1: Priors

The dataset contains 20 animals, of which 8 are mammals and 12 are non-mammals:

$$
P(\text{Mammal}) = \tfrac{8}{20} = 0.4,
\quad
P(\text{NonMammal}) = \tfrac{12}{20} = 0.6
$$

---

## Step 2: Likelihood Tables (with Laplace smoothing)

### 1) Give Birth ($|V|=2$)

| Value | $P(\cdot \mid \text{Mammal})$ | $P(\cdot \mid \text{NonMammal})$ |
|---|---:|---:|
| yes | $(7+1)/(8+2) = 0.8$ | $(0+1)/(12+2) = 1/14$ |
| no  | $(1+1)/(8+2) = 0.2$ | $(12+1)/(12+2) = 13/14$ |

### 2) Can Fly ($|V|=2$)

| Value | $P(\cdot \mid \text{Mammal})$ | $P(\cdot \mid \text{NonMammal})$ |
|---|---:|---:|
| yes | $(1+1)/(8+2) = 0.2$ | $(3+1)/(12+2) = 4/14$ |
| no  | $(7+1)/(8+2) = 0.8$ | $(9+1)/(12+2) = 10/14$ |

### 3) Live in Water ($|V|=3$)

| Value | $P(\cdot \mid \text{Mammal})$ | $P(\cdot \mid \text{NonMammal})$ |
|---|---:|---:|
| yes       | $(4+1)/(8+3) = 5/11$ | $(3+1)/(12+3) = 4/15$ |
| sometimes | $(0+1)/(8+3) = 1/11$ | $(4+1)/(12+3) = 5/15$ |
| no        | $(4+1)/(8+3) = 5/11$ | $(5+1)/(12+3) = 6/15$ |

### 4) Have Legs ($|V|=2$)

| Value | $P(\cdot \mid \text{Mammal})$ | $P(\cdot \mid \text{NonMammal})$ |
|---|---:|---:|
| yes | $(5+1)/(8+2) = 0.6$ | $(9+1)/(12+2) = 10/14$ |
| no  | $(3+1)/(8+2) = 0.4$ | $(3+1)/(12+2) = 4/14$ |

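The four tables can be regenerated from the raw counts as a check on the arithmetic. This is a sketch under the assumption that the counts shown in the numerators above are correct; the names `COUNTS`, `CLASS_SIZES`, and `likelihood_table` are hypothetical. `fractions.Fraction` keeps the values exact (note that it reduces fractions, so $4/14$ is stored as $2/7$).

```python
from fractions import Fraction

ALPHA = 1  # Laplace smoothing constant
CLASS_SIZES = {"Mammal": 8, "NonMammal": 12}

# Raw counts count(x=v, C=c), copied from the numerators in the tables above.
COUNTS = {
    "give_birth": {
        "Mammal": {"yes": 7, "no": 1},
        "NonMammal": {"yes": 0, "no": 12},
    },
    "can_fly": {
        "Mammal": {"yes": 1, "no": 7},
        "NonMammal": {"yes": 3, "no": 9},
    },
    "live_in_water": {
        "Mammal": {"yes": 4, "sometimes": 0, "no": 4},
        "NonMammal": {"yes": 3, "sometimes": 4, "no": 5},
    },
    "have_legs": {
        "Mammal": {"yes": 5, "no": 3},
        "NonMammal": {"yes": 9, "no": 3},
    },
}


def likelihood_table(feature):
    """Return {class: {value: smoothed P(value | class)}} for one feature."""
    table = {}
    for cls, value_counts in COUNTS[feature].items():
        # Denominator N_c + alpha * |V|, where |V| is the number of values.
        denom = CLASS_SIZES[cls] + ALPHA * len(value_counts)
        table[cls] = {
            value: Fraction(n + ALPHA, denom) for value, n in value_counts.items()
        }
    return table


TABLES = {feature: likelihood_table(feature) for feature in COUNTS}
```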
---

## Step 3: Evidence and Posterior (Worked Example)

Let the test animal be:  
$x = (\text{give\_birth=no},\;\text{can\_fly=no},\;\text{live\_in\_water=sometimes},\;\text{have\_legs=yes})$

### Likelihoods
- Mammal:
$$
P(x \mid M) = \tfrac{2}{10}\cdot\tfrac{8}{10}\cdot\tfrac{1}{11}\cdot\tfrac{6}{10} = \tfrac{12}{1375} \approx 0.008727
$$

- Non-Mammal:
$$
P(x \mid N) = \tfrac{13}{14}\cdot\tfrac{10}{14}\cdot\tfrac{5}{15}\cdot\tfrac{10}{14}
= \tfrac{325}{2058} \approx 0.157920
$$

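The two products above can be checked with exact rational arithmetic. In this sketch the four factors per class are copied from the likelihood tables for the test animal; the variable names are hypothetical.

```python
from fractions import Fraction
from math import prod

# Per-feature smoothed likelihoods for
# x = (give_birth=no, can_fly=no, live_in_water=sometimes, have_legs=yes).
factors = {
    "Mammal": [Fraction(2, 10), Fraction(8, 10), Fraction(1, 11), Fraction(6, 10)],
    "NonMammal": [Fraction(13, 14), Fraction(10, 14), Fraction(5, 15), Fraction(10, 14)],
}

# Naive Bayes assumption: P(x | c) is the product of the per-feature terms.
likelihood = {cls: prod(fs) for cls, fs in factors.items()}

for cls, value in likelihood.items():
    print(f"P(x | {cls}) = {value} ~ {float(value):.6f}")
```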
### Evidence
$$
P(x) = P(x \mid M)P(M) + P(x \mid N)P(N)
$$
$$
P(x) \approx (0.008727)(0.4) + (0.157920)(0.6) \approx 0.09824
$$

### Posterior
$$
P(M \mid x) = \frac{P(x \mid M)P(M)}{P(x)} \approx 0.0355
$$
$$
P(N \mid x) = \frac{P(x \mid N)P(N)}{P(x)} \approx 0.9645
$$

Prediction:
$$
\hat{c} = \arg\max_{c} P(c \mid x) = \text{NonMammal}
$$

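The evidence, posterior, and prediction steps can be sketched in a few lines. This example assumes the priors from Step 1 and the class-conditional likelihoods computed above; the variable names are hypothetical. It also checks that the log-space MAP rule from the Implementation section picks the same class as the normalized posterior, which it must, since dividing by $P(x)$ and taking logs do not change the arg max.

```python
from fractions import Fraction
import math

prior = {"Mammal": Fraction(8, 20), "NonMammal": Fraction(12, 20)}
likelihood = {"Mammal": Fraction(12, 1375), "NonMammal": Fraction(325, 2058)}

# Joint term P(x | c) P(c) per class; the evidence P(x) is their sum.
joint = {cls: likelihood[cls] * prior[cls] for cls in prior}
evidence = sum(joint.values())

# Bayes' rule: P(c | x) = P(x | c) P(c) / P(x).
posterior = {cls: joint[cls] / evidence for cls in prior}
prediction = max(posterior, key=posterior.get)

# Log-space scores, as in the MAP rule; the arg max should agree.
log_score = {cls: math.log(prior[cls]) + math.log(likelihood[cls]) for cls in prior}
log_prediction = max(log_score, key=log_score.get)

print(f"P(x) ~ {float(evidence):.5f}")
for cls in prior:
    print(f"P({cls} | x) ~ {float(posterior[cls]):.4f}")
print("Prediction:", prediction)
```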
---

## Conclusion

- We computed **class priors** and **smoothed likelihoods**, then applied **Bayes’ rule** step by step.  
- Laplace smoothing prevented zero probabilities. For example, “Live in Water = sometimes” has a raw count of 0 for Mammal, which without smoothing would force $P(x \mid M) = 0$.  
- The evidence $P(x)$ was calculated as the sum of the class-conditional likelihoods weighted by the class priors.  
- With $P(\text{NonMammal} \mid x) \approx 0.9645$, the penguin-like test animal (does not give birth, cannot fly, sometimes lives in water, has legs) is classified as **Non-Mammal**.  
- The worked example illustrates that Naive Bayes is a simple, interpretable baseline for categorical features such as these. Its main limitation is the assumption that the features are conditionally independent given the class.
