# Discriminative vs. generative models

## Latent variables

To explain or discover structure in data, we can:

- estimate the underlying pdf;
- model the pdf through unobservable random variables $h$ that induce the observable outcomes $x$.

> The variables $h$ are the __hidden causes__ behind the outcome generation.

Inferring the hidden variables $h$ from an outcome $x$ relies on the chain rule, which factorizes the joint $p(h, x)$ in two equivalent ways:

$$\large
p(h|x)p(x) = p(x|h)p(h)
$$

This means that, given $x$, we can discover $h$ by directly estimating $\large p(h|x)p(x)$.

Alternatively, we can estimate $h$ by maximizing the likelihood of the data, $\large p(x|h)$, or the unnormalized posterior, $\large p(x|h)p(h)$.

These two expressions can be represented as Bayesian networks as follows:

In [None]:
from visualization import graph

base = "h[style=filled,fillcolor=gray]\n"
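# Raw strings keep the LaTeX backslashes (\large, \arg, \max) from being read as escape sequences.
graph(f"{base}x->h", title=r"Direct modeling: $\large \arg\max_h p(h|x)p(x)$")
graph(f"{base}h->x", title=r"A-posteriori estimation maximizing: $\large \arg\max_h p(x|h)p(h)$")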

In the latter case $p(x)$ does not need to be estimated, which matters because $p(x)$ is generally the intractable term.
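A small numeric check can make this concrete. The sketch below assumes a made-up discrete joint distribution over two hidden states and three outcomes (the table values and variable names are illustrative only). It verifies that both factorizations give the same joint, and that the most probable $h$ for each $x$ is the same whether or not we divide by $p(x)$.

```python
import numpy as np

# Hypothetical joint distribution p(h, x): rows index h, columns index x.
# The numbers are made up for illustration and sum to 1.
joint = np.array([[0.10, 0.25, 0.05],
                  [0.30, 0.05, 0.25]])

p_h = joint.sum(axis=1)              # prior p(h)
p_x = joint.sum(axis=0)              # evidence p(x)
p_x_given_h = joint / p_h[:, None]   # likelihood p(x|h)
p_h_given_x = joint / p_x[None, :]   # posterior p(h|x)

# Both factorizations recover the same joint p(h, x).
lhs = p_h_given_x * p_x[None, :]
rhs = p_x_given_h * p_h[:, None]
print(np.allclose(lhs, rhs))

# For each x, the maximizing h is identical with or without dividing by p(x).
map_with_evidence = p_h_given_x.argmax(axis=0)
map_without_evidence = rhs.argmax(axis=0)
print(map_with_evidence, map_without_evidence)
```

Since $p(x)$ does not depend on $h$, dropping it rescales each column of the table without changing which row is largest.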

## Direct $p(h|x)$ estimation: discriminative models

Naive independence assumption (in the spirit of Naive Bayes, but applied to the $h_i$ given $x$): $\large p(h|x)=\prod_i{p(h_i|x)}$

Each (simpler) factor $p(h_i|x)$ can then be estimated separately (e.g. in tabular form).

Given an $x$, choose the combination of $h_i$ maximizing $p(h|x)$.

Practical example: classification, where $h$ is one-hot (exactly one $h_i$ is 1 and the rest are 0).

__This approach allows discrimination only__
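As a minimal sketch of the tabular approach, assume a single discrete feature $x$ and a class label $h$. The toy data and the names `x_train`, `h_train`, and `classify` are made up for illustration. Under those assumptions, the table $p(h|x)$ can be estimated by counting:

```python
import numpy as np

# Toy training set (made up): x takes values in {0, 1, 2}, h is a class in {0, 1}.
x_train = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 1])
h_train = np.array([0, 0, 1, 1, 1, 0, 1, 1, 1, 0])

n_x, n_h = 3, 2

# Count co-occurrences of (x, h), then normalize each row to get p(h | x).
counts = np.zeros((n_x, n_h))
np.add.at(counts, (x_train, h_train), 1)
p_h_given_x = counts / counts.sum(axis=1, keepdims=True)

def classify(x):
    """Return the class h maximizing the tabular estimate of p(h | x)."""
    return int(p_h_given_x[x].argmax())

predictions = [classify(x) for x in range(n_x)]
print(p_h_given_x)
print(predictions)
```

The table says which $h$ is most likely for a given $x$, but normalizing each row discards how the $x$ values themselves are distributed, so it cannot be used to generate new outcomes.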

## Maximum likelihood $p(x|h)$ estimation: generative models

PPCA: $\large x = Wh + \mu + \epsilon$  
where $h \sim \mathcal{N}(0, I)$ and $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$

The model defines a stochastic recipe for generating $x$ given $h$.

__This approach allows both discrimination and generation__
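The generative recipe can be sketched in a few lines of NumPy. This assumes the usual PPCA form, with a standard normal latent $h$ and additive isotropic Gaussian noise of standard deviation $\sigma$. The dimensions, the randomly drawn `W` and `mu`, and the helper `sample_x` are hypothetical choices for illustration, not fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k = 5, 2                      # observed and latent dimensions (arbitrary)
W = rng.normal(size=(d, k))      # hypothetical loading matrix
mu = rng.normal(size=d)          # hypothetical data mean
sigma = 0.1                      # hypothetical noise standard deviation

def sample_x(n):
    """Draw n samples of x = W h + mu + eps."""
    h = rng.normal(size=(n, k))              # h ~ N(0, I)
    eps = sigma * rng.normal(size=(n, d))    # eps ~ N(0, sigma^2 I)
    return h @ W.T + mu + eps

x = sample_x(200_000)

# The model implies E[x] = mu and Cov[x] = W W^T + sigma^2 I.
model_cov = W @ W.T + sigma**2 * np.eye(d)
mean_error = np.abs(x.mean(axis=0) - mu).max()
cov_error = np.abs(np.cov(x, rowvar=False) - model_cov).max()
print(mean_error, cov_error)
```

Both errors should shrink as the number of samples grows, which is a quick sanity check that the sampler matches the model.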

In [None]:
import numpy as np
import pandas as pd

# Load data (from https://github.com/daradecic/Python-Eigenfaces):
faces = pd.read_csv('data/face_data.csv')
faces = faces.drop('target',axis=1)
faces = np.array(faces)

In [None]:
from matplotlib import pyplot as plt

def plot_face(face):
    plt.figure()
    plt.imshow(face.reshape(64, 64), cmap='gray')
    plt.show()

plot_face(faces[0])

In [None]:
from sklearn.decomposition import PCA

# Computing the PCA of images:
pca = PCA(n_components=100).fit(faces)
latent = pca.transform(faces)

In [None]:
# Reconstructing a face from its latent code h
# (the parameter is named h so it does not shadow the global `latent` array):
def reconstruct_face(h):
    return pca.inverse_transform(h)

def plot_reconstructed_face(h):
    plot_face(reconstruct_face(h))

plot_reconstructed_face(latent[0])
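To make explicit what the reconstruction step does, the sketch below uses a small synthetic matrix as a stand-in for the face data (the names `X` and `toy_pca` are illustrative). With scikit-learn's default settings (`whiten=False`), `inverse_transform` is a linear map back to data space plus the training mean:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))          # synthetic stand-in for the face matrix

toy_pca = PCA(n_components=3).fit(X)
Z = toy_pca.transform(X)              # latent codes, shape (50, 3)

# Decoding by hand: project back with the components and add the mean.
X_manual = Z @ toy_pca.components_ + toy_pca.mean_
X_sklearn = toy_pca.inverse_transform(Z)

print(np.allclose(X_manual, X_sklearn))
print(np.mean((X - X_sklearn) ** 2))  # error from keeping 3 of 8 components
```

The reconstruction is lossy whenever fewer components are kept than the data dimension, which is why a reconstructed face looks like a smoothed version of the original.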

In [None]:
from visualization import interact_vector

# Synthesis of new faces:
def plot_new_face(v):
    # Mean and modified face:
    n = latent.shape[1]
    h = np.zeros((n,))
    f_mean = reconstruct_face(h)
    h[0:len(v)] = v
    f_mod = reconstruct_face(h)
    
    # Plotting both:
    _, axes = plt.subplots(1, 2, figsize=(8, 16))
    fs = [f_mean,f_mod]
    for i in range(2):
        axes[i].imshow(fs[i].reshape(64, 64), cmap='gray')
    plt.show()
    print("h:", h)

# Interactive face reconstruction:
interact_vector("v", 10, plot_new_face, min=-10, max=10, step=1);