---
title: Mixture Models
layout: collection
permalink: /Machine-Learning/Mixture-Models
collection: Machine-Learning
entries_layout: grid
mathjax: true
toc: true
categories:
  - study
tags:
  - mathematics
  - statistics
  - machine-learning 
---

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

sns.set_theme()
```

In mixture models we assume that the input examples come from different, potentially unobserved types (groups, clusters, etc.).
We assume that there are $m$ underlying types, where each type $z$ occurs with probability $ \mathbb{P}(z) $.
Examples of type $z$ are then distributed according to the conditional distribution $ \mathbb{P}(\mathbf{x}|z) $.
The observations $ \mathbf{x} $ thus come from a so-called mixture distribution, which is the sum of the conditional distributions, each weighted by the probability of its type:

$$
\mathbb{P}(\mathbf{x} ) = \sum_{j=1}^m  \mathbb{P}(z=j) \mathbb{P}(\mathbf{x} | z=j, \mathbf{\theta}_j )   
$$

A mixture of Gaussians model has the form

$$
\mathbb{P}(\mathbf{x} | \mathbf{\theta}  ) = \sum_{j=1}^m \pi_j \mathcal{N}(\mathbf{x} | \mathbf{\mu}_j, \Sigma_j  )    
$$

where $ \mathbf{\theta} = (\pi_1, \dots, \pi_m, \mathbf{\mu}_1, \dots, \mathbf{\mu}_m, \Sigma_1, \dots, \Sigma_m) $ collects all parameters. $\pi_j$ is the so-called mixing proportion, which can be seen as the probability that an observation comes from class $j$, that is, the probability $ \mathbb{P}(z=j) $ of the class itself.
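
As a minimal sketch of how this density can be evaluated, the snippet below sums the weighted component densities using `scipy.stats.multivariate_normal`. The helper name `gmm_pdf` and all parameter values are assumptions made up for illustration; they are not taken from any dataset.

```python
import numpy as np
from scipy import stats

# Hypothetical parameters of a two-component mixture in two dimensions.
pis = np.array([0.3, 0.7])                                  # mixing proportions, sum to 1
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]          # component means
sigmas = [np.eye(2), np.array([[1.0, 0.5], [0.5, 2.0]])]    # component covariances


def gmm_pdf(x, pis, mus, sigmas):
    """Evaluate sum_j pi_j * N(x | mu_j, Sigma_j) at the point(s) x."""
    return sum(
        pi * stats.multivariate_normal(mean=mu, cov=sigma).pdf(x)
        for pi, mu, sigma in zip(pis, mus, sigmas)
    )


# Density of the mixture at a single (arbitrarily chosen) point.
density = gmm_pdf(np.array([1.0, 1.0]), pis, mus, sigmas)
```

Because the mixing proportions sum to one and each component is a proper density, the weighted sum is itself a proper density.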

# Data generation

During data generation, class $j$ is first chosen with probability $ \mathbb{P}(z=j) $, and a sample point $ \mathbf{x} $ is then drawn from the conditional distribution $ \mathbb{P}(\mathbf{x} | z = j) $.
In a two-class system, each sample point $ \mathbf{x} $ could therefore have been generated in two ways, one per class. Given only the observations, we would like to recover the underlying distribution that generated them.
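
The two-step generative process can be sketched as follows for a one-dimensional, two-class Gaussian mixture. The mixing proportions, means, standard deviations, sample size, and seed are all hypothetical values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for reproducibility

# Hypothetical parameters of a two-class, one-dimensional Gaussian mixture.
pis = np.array([0.3, 0.7])    # P(z = j)
mus = np.array([-2.0, 3.0])   # mean of each class
sds = np.array([0.5, 1.0])    # standard deviation of each class

n = 1000

# Step 1: choose the class of every sample with probability P(z = j).
z = rng.choice(len(pis), size=n, p=pis)

# Step 2: draw each sample from the conditional distribution of its class.
x = rng.normal(loc=mus[z], scale=sds[z])
```

In a real problem only `x` would be observed; the class labels `z` are kept here solely because the data are simulated.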

# Latent Variable Models (LVM)

In the model $ \mathbb{P}(\mathbf{x} | z=j, \mathbf{\theta}) $ the class indicator variable $z$ is latent. This means that $z$ cannot be measured directly; it can only be inferred indirectly, through a mathematical model, from other variables that are directly observable.
Mixture models are thus an example of the large class of latent variable models (LVM).
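
One standard way to make this indirect inference concrete, assuming the parameters $ \mathbf{\theta} $ are known, is Bayes' rule. It combines the type probabilities and the conditional distributions of the mixture into a posterior probability for the latent class of an observed $ \mathbf{x} $:

$$
\mathbb{P}(z=j | \mathbf{x} ) = \frac{ \mathbb{P}(z=j) \mathbb{P}(\mathbf{x} | z=j, \mathbf{\theta}_j ) }{ \sum_{k=1}^m \mathbb{P}(z=k) \mathbb{P}(\mathbf{x} | z=k, \mathbf{\theta}_k ) }
$$

The denominator is exactly the mixture distribution $ \mathbb{P}(\mathbf{x}) $, so the posterior probabilities sum to one over the $m$ classes.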