# Conditional Mixture Models:
## Linear Regression

In [4]:
import numpy as np
import matplotlib.pyplot as plt
from numpy.random import randn, rand, seed

In [2]:
%config InlineBackend.figure_format = "retina"

Consider $K$ linear regression models $f_k: \mathbb{R}^M\to\mathbb{R}$, each governed by its own weight vector ${\bf w}_k$ and all sharing the same precision parameter $\beta$. To build a **mixture of linear regressions**, we introduce mixing coefficients $\{\pi_k\}_k$ (nonnegative and summing to one) and the mixture distribution

$$
    p(t\vert \boldsymbol\theta,\boldsymbol\phi) = \sum_{k=1}^K \pi_k \mathcal{N}(t\vert{\bf w}_k^T\boldsymbol\phi, \beta^{-1})
$$

where $\boldsymbol\theta=\big\{\{{\bf w}_k\}_k, \{\pi_k\}_k, \beta\big\}$ collects all adjustable parameters.
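
The mixture density can be evaluated directly for a single input. The following is a minimal sketch, not part of the original notebook: the function name `mixture_density` and the demo values (`W_demo`, `pi_demo`, `phi_demo`) are hypothetical, chosen only to illustrate the formula with $K=2$ components and $M=2$ features.

```python
import numpy as np

def mixture_density(t, phi, W, pi, beta):
    """Evaluate p(t | theta, phi) for a mixture of linear regressions.

    t: scalar target, phi: (M,) feature vector, W: (K, M) weight vectors,
    pi: (K,) mixing coefficients, beta: shared precision (scalar).
    """
    means = W @ phi  # component means w_k^T phi, shape (K,)
    # Gaussian density of each component, all with variance 1 / beta
    comp = np.sqrt(beta / (2 * np.pi)) * np.exp(-0.5 * beta * (t - means) ** 2)
    return float(pi @ comp)

# Hypothetical parameters: two lines with slopes +1 and -1, equal weights
W_demo = np.array([[0.0, 1.0], [0.0, -1.0]])
pi_demo = np.array([0.5, 0.5])
phi_demo = np.array([1.0, 0.5])  # (bias term, x = 0.5)
p_demo = mixture_density(0.5, phi_demo, W_demo, pi_demo, beta=4.0)
```

For a fixed input the result is a density in $t$, so it integrates to one over the real line.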

To find the values of $\boldsymbol\theta$, we use the EM algorithm, introducing latent variables ${\bf Z}=\{{\bf z}_n\}_n$ that indicate which component generated each observation. Each iteration then consists of computing the responsibilities

$$
\gamma_{nk} = \frac{\pi_k\mathcal{N}(t_n\vert{\bf w}_k^T\boldsymbol\phi_n, \beta^{-1})}{\sum_j\pi_j\mathcal{N}(t_n\vert{\bf w}_j^T\boldsymbol\phi_n, \beta^{-1})}
$$

for the **E-step**, and maximizing

$$
Q(\boldsymbol\theta, \boldsymbol\theta^{\text{old}}) = \sum_{n=1}^N\sum_{k=1}^K \gamma_{nk}\big[\log\pi_k + \log\mathcal{N}(t_n\vert{\bf w}_k^T\boldsymbol\phi_n, \beta^{-1})\big]
$$

w.r.t. each component of $\boldsymbol\theta$ for the **M-step**.
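
Before turning to the M-step updates, the E-step responsibilities can be sketched in NumPy. This is an illustrative sketch rather than the notebook's own implementation: the name `e_step` and the demo arrays are hypothetical, and the code assumes every $\pi_k > 0$. Because all components share the same $\beta$, the Gaussian normalizing constant cancels in the ratio, so only the squared residuals are needed.

```python
import numpy as np

def e_step(Phi, t, W, pi, beta):
    """Responsibilities gamma_{nk} for a mixture of linear regressions.

    Phi: (N, M) design matrix, t: (N,) targets, W: (K, M) weights,
    pi: (K,) mixing coefficients (assumed strictly positive), beta: scalar.
    """
    resid = t[:, None] - Phi @ W.T                # t_n - w_k^T phi_n, shape (N, K)
    log_w = np.log(pi) - 0.5 * beta * resid ** 2  # unnormalized log responsibilities
    log_w -= log_w.max(axis=1, keepdims=True)     # stabilize before exponentiating
    gamma = np.exp(log_w)
    return gamma / gamma.sum(axis=1, keepdims=True)

# Hypothetical data: three points and two lines with slopes +1 and -1
Phi_demo = np.array([[1.0, -1.0], [1.0, 0.0], [1.0, 1.0]])
t_demo = np.array([1.0, 0.0, 1.0])
W_demo = np.array([[0.0, 1.0], [0.0, -1.0]])
gamma_demo = e_step(Phi_demo, t_demo, W_demo, np.array([0.5, 0.5]), beta=4.0)
```

In this demo the middle point lies on both lines, so its responsibilities are split evenly, while the outer points are assigned almost entirely to the line that passes through them.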

The M-step yields the following update equations:

$$
\pi_k^\text{new} = \frac{1}{N}\sum_{n=1}^N\gamma_{nk}
$$

$$
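{\bf w}_k^\text{new} = \left(\boldsymbol\Phi^T R_k\boldsymbol\Phi\right)^{-1}\boldsymbol\Phi^T R_k {\bf t}
$$

where $\boldsymbol\Phi$ is the $N\times M$ design matrix with rows $\boldsymbol\phi_n^T$ and $R_k = \text{diag}(\gamma_{1k}, \ldots, \gamma_{Nk})$.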

$$
\frac{1}{\beta^\text{new}} = \frac{1}{N}\sum_{n=1}^N\sum_{k=1}^K \gamma_{nk}\left(t_n - {\bf w}_k^T\boldsymbol\phi_n\right)^2
$$
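
The three updates translate almost line by line into NumPy. The sketch below is an assumption-laden illustration, not the notebook's code: `m_step` and the synthetic demo data (`W_true`, `labels`, the noise level 0.05) are hypothetical. Each ${\bf w}_k$ is a weighted least-squares solution, obtained here with `np.linalg.solve` instead of forming an explicit inverse.

```python
import numpy as np

def m_step(Phi, t, gamma):
    """M-step updates given responsibilities gamma of shape (N, K)."""
    N, K = gamma.shape
    pi = gamma.mean(axis=0)                      # pi_k = (1/N) sum_n gamma_nk
    W = np.empty((K, Phi.shape[1]))
    for k in range(K):
        PhiT_R = Phi.T * gamma[:, k]             # Phi^T R_k, shape (M, N)
        W[k] = np.linalg.solve(PhiT_R @ Phi, PhiT_R @ t)
    resid = t[:, None] - Phi @ W.T               # residuals under the new weights
    beta = N / np.sum(gamma * resid ** 2)        # inverse of the weighted mean squared residual
    return pi, W, beta

# Hypothetical check: two noisy lines with known (hard) assignments
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 40)
Phi_demo = np.column_stack([np.ones_like(x), x])
labels = np.arange(40) % 2                       # alternate between the two components
W_true = np.array([[1.0, 2.0], [-1.0, -1.0]])
t_demo = np.sum(Phi_demo * W_true[labels], axis=1) + 0.05 * rng.standard_normal(40)
gamma_demo = np.eye(2)[labels]                   # one-hot responsibilities
pi_new, W_new, beta_new = m_step(Phi_demo, t_demo, gamma_demo)
```

With one-hot responsibilities each update reduces to an ordinary least-squares fit on the points assigned to that component, which gives a simple sanity check for the formulas.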

## An example

In [9]:
n_obs = 51
# Inputs: an evenly spaced grid on [-1, 1] plus uniform jitter in [0, 0.1)
xrange = np.linspace(-1, 1, n_obs) + rand(n_obs) * 0.1

In [16]:
n_vals = n_obs // 3

In [35]:
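# One-hot component assignments: one row per observation, one column per component
z = np.ones((n_obs, 2))
z[n_vals: 2 * n_vals, 0] = 0  # the middle third does not belong to the first component
z[:, 1] = z[:, 1] - z[:, 0]   # second column is the complement of the first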