# Generative Model

## Definition
In statistical classification, two main approaches exist:
- **Generative model**: Models the joint probability distribution $$P(X,Y)$$ over the observable variable $$X$$ and the target variable $$Y$$
- **Discriminative model**: Models the conditional probability $$P(Y|X=x)$$ of the target $$Y$$ given an observation $$x$$

## Key Differences
| Aspect | Generative Model | Discriminative Model |
|--------|------------------|----------------------|
| Models | $$ P(X,Y) $$ | $$ P(Y \mid X) $$ |
| Can generate data? | Yes | No |
| Examples | Naive Bayes, LDA | Logistic Regression, SVM |

## Mathematical Foundation
The product rule factors the joint distribution in two ways, which relates the quantities the two kinds of model work with:

$$
P(X,Y) = P(X|Y)P(Y) = P(Y|X)P(X)
$$

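Dividing by $$P(X)$$ gives Bayes' rule, so the $$P(X|Y)$$ and $$P(Y)$$ of a generative model yield the conditional that a discriminative model learns directly: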

$$
P(Y|X) = \frac{P(X|Y)P(Y)}{P(X)}
$$
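
As an illustration of Bayes' rule above, the following minimal sketch turns a class prior $$P(Y)$$ and a class-conditional $$P(X|Y)$$ into the posterior $$P(Y|X=x)$$. The function name `posterior` and all numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction


def posterior(prior, likelihood, x):
    """Return P(Y=y | X=x) for every class y via Bayes' rule.

    prior:      dict mapping y -> P(Y=y)
    likelihood: dict mapping (x, y) -> P(X=x | Y=y)
    """
    # Numerator of Bayes' rule: P(X=x | Y=y) * P(Y=y), i.e. the joint P(x, y).
    joint = {y: likelihood[(x, y)] * p_y for y, p_y in prior.items()}
    # Denominator: the evidence P(X=x), obtained by summing the joint over y.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("P(X=x) is zero; the posterior is undefined")
    return {y: p / evidence for y, p in joint.items()}


# Hypothetical prior and class-conditional, for illustration only.
prior = {0: Fraction(3, 4), 1: Fraction(1, 4)}
likelihood = {
    (1, 0): Fraction(1, 3), (2, 0): Fraction(2, 3),
    (1, 1): Fraction(1, 1), (2, 1): Fraction(0, 1),
}

post = posterior(prior, likelihood, x=1)
print(post)
```

With these assumed numbers both classes have joint mass $$1/4$$ at $$x=1$$, so the posterior is uniform even though the prior is not.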

## Deep Generative Models
Modern approaches combine generative modeling with deep neural networks. Each is listed with its characteristic equation:

1. **Variational Autoencoders (VAEs)**, trained by maximizing the evidence lower bound (ELBO):
   $$ \mathcal{L}(\theta,\phi) = \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] - D_{\mathrm{KL}}(q_\phi(z|x) \,\|\, p(z)) $$

2. **Generative Adversarial Networks (GANs)**, trained as a minimax game between a generator $$G$$ and a discriminator $$D$$:
   $$ \min_G \max_D V(D,G) = \mathbb{E}_{x\sim p_{data}}[\log D(x)] + \mathbb{E}_{z\sim p_z}[\log(1-D(G(z)))] $$

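3. **Diffusion Models**, built on a forward process that adds Gaussian noise step by step according to a variance schedule $$\beta_t$$: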
   $$ q(x_t|x_{t-1}) = \mathcal{N}(x_t; \sqrt{1-\beta_t}x_{t-1}, \beta_t\mathbf{I}) $$
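
The diffusion forward step $$q(x_t|x_{t-1})$$ above can be sketched in a few lines of standard-library Python. This is only an illustration of the sampling formula, not a full diffusion model: the helper name `forward_step`, the linear schedule `betas`, and the starting point `x0` are assumptions made for the example.

```python
import math
import random


def forward_step(x_prev, beta_t, rng=random):
    """Sample x_t ~ N(sqrt(1 - beta_t) * x_prev, beta_t * I), elementwise."""
    if not 0.0 <= beta_t <= 1.0:
        raise ValueError("beta_t must lie in [0, 1]")
    mean_scale = math.sqrt(1.0 - beta_t)  # shrinks the previous sample
    std = math.sqrt(beta_t)               # standard deviation of the added noise
    return [mean_scale * x + std * rng.gauss(0.0, 1.0) for x in x_prev]


# Hypothetical setup: a 2-dimensional point and a short linear schedule.
rng = random.Random(0)
x0 = [1.0, -2.0]
betas = [0.02 * (t + 1) for t in range(5)]

trajectory = [x0]
for beta_t in betas:
    # Each step depends only on the previous sample (a Markov chain).
    trajectory.append(forward_step(trajectory[-1], beta_t, rng))

print(trajectory[-1])
```

Setting `beta_t = 0` leaves the sample unchanged, while `beta_t = 1` replaces it with pure standard Gaussian noise, which matches the two extremes of the formula.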

## Examples
### Simple Classification Case
Suppose the data consist of four pairs, each observed once:
$$ (x,y) \in \{(1,0),(1,1),(2,0),(2,1)\} $$

The empirical joint distribution $$P(X,Y)$$, which a generative model estimates, is:
$$
\begin{array}{c|cc}
 & y=0 & y=1 \\
\hline
x=1 & 1/4 & 1/4 \\
x=2 & 1/4 & 1/4 \\
\end{array}
$$

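The conditional distribution $$P(Y|X)$$, which a discriminative model estimates, follows by dividing each row of the joint table by its marginal $$P(X=x) = 1/2$$: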
$$
\begin{array}{c|cc}
 & y=0 & y=1 \\
\hline
x=1 & 1/2 & 1/2 \\
x=2 & 1/2 & 1/2 \\
\end{array}
$$
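
Both tables can be reproduced by counting. The sketch below uses the four data points from this example; the variable names are illustrative, and exact fractions are used so the results match the tables entry for entry.

```python
from collections import Counter
from fractions import Fraction

# The four (x, y) pairs from the example above.
data = [(1, 0), (1, 1), (2, 0), (2, 1)]

n = len(data)
pair_counts = Counter(data)                 # how often each (x, y) occurs
x_counts = Counter(x for x, _ in data)      # how often each x occurs

# Empirical joint P(X=x, Y=y): pair count divided by the total count.
joint = {xy: Fraction(c, n) for xy, c in pair_counts.items()}

# Empirical conditional P(Y=y | X=x): pair count divided by the count of x.
conditional = {
    (x, y): Fraction(c, x_counts[x]) for (x, y), c in pair_counts.items()
}

for (x, y) in sorted(joint):
    print(f"x={x}, y={y}: joint={joint[(x, y)]}, conditional={conditional[(x, y)]}")
```

The joint entries sum to 1 over the whole table, whereas the conditional entries sum to 1 within each row $$x$$.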

## References
1. Ng & Jordan (2002) - "On Discriminative vs. Generative Classifiers"
2. Goodfellow et al. (2014) - "Generative Adversarial Networks"
3. Kingma & Welling (2013) - "Auto-Encoding Variational Bayes"