# Audio source separation with magnitude priors: the BEADS model 

## Antoine Liutkus$^1$, Christian Rohlfing$^2$, Antoine Deleforge$^3$

$^1$ Zenith team, Inria, University of Montpellier, France<p>
$^2$ RWTH, Aachen University, Germany<p>
$^3$ Inria Rennes - Bretagne Atlantique, France<p>

<img src="figures/logos.svg" style="height:5em; margin-top:5em">

# Context

## Separation of complex random variables

# The source separation problem 
For each Time-Frequency bin, the mixture is the sum of sources $x=\sum_j s_j$
<img src="figures/nocode/fig_sources1.svg">

# The source separation problem 
For each Time-Frequency bin, the mixture is the sum of sources $x=\sum_j s_j$
<img src="figures/nocode/fig_sources2.svg">

# Typical separation pipeline

<img src="figures/source_separation_pipeline.svg" style="height:10em">

## In this talk
* __Filtering__ from magnitude estimates $b_j>0$ to separated signals $s_j\in\mathbb{C}$ 
* Tractable model for __complex variables $s_j$ with (approximately) known magnitude $b_j$__


## In the paper
* The multichannel case
* Evaluation for audio coding

The classical Gaussian model $s_j\sim\mathcal{N}\left(0, \frac{2}{\pi}b_j^2\right)$ matches the prior $\mathbb{E}\left[\left|s_j\right|\right]=b_j$

<img src="figures/nocode/fig_lgm.svg">

$\Rightarrow$ Probability density is highest at 0

With $\sigma_j^2=\frac{2}{\pi}b_j^2$, the mixture is Gaussian, $x\sim\mathcal{N}\left(0,\sum_j \sigma_j^2\right)$, and each source is recovered as: $s_j\mid x\sim \mathcal{N}\left(\frac{b^2_j}{\sum_k b_k^2} x, \sigma_j^2\left(1 - \frac{b_j^2}{\sum_k b_k^2}\right)\right)$
<img src="figures/nocode/fig_lgmdemo1.svg">

With $\sigma_j^2=\frac{2}{\pi}b_j^2$, the mixture is Gaussian, $x\sim\mathcal{N}\left(0,\sum_j \sigma_j^2\right)$, and each source is recovered as: $s_j\mid x\sim \mathcal{N}\left(\frac{b^2_j}{\sum_k b_k^2} x, \sigma_j^2\left(1 - \frac{b_j^2}{\sum_k b_k^2}\right)\right)$
<img src="figures/nocode/fig_lgmdemo2.svg">

With $\sigma_j^2=\frac{2}{\pi}b_j^2$, the mixture is Gaussian, $x\sim\mathcal{N}\left(0,\sum_j \sigma_j^2\right)$, and each source is recovered as: $s_j\mid x\sim \mathcal{N}\left(\frac{b^2_j}{\sum_k b_k^2} x, \sigma_j^2\left(1 - \frac{b_j^2}{\sum_k b_k^2}\right)\right)$
<img src="figures/nocode/fig_lgmdemo3.svg">

With $\sigma_j^2=\frac{2}{\pi}b_j^2$, the mixture is Gaussian, $x\sim\mathcal{N}\left(0,\sum_j \sigma_j^2\right)$, and each source is recovered as: $s_j\mid x\sim \mathcal{N}\left(\frac{b^2_j}{\sum_k b_k^2} x, \sigma_j^2\left(1 - \frac{b_j^2}{\sum_k b_k^2}\right)\right)$
<img src="figures/nocode/fig_lgmdemo4.svg">
$\Rightarrow$ Aligned estimated sources, magnitudes inconsistent with prior<p>
$\Rightarrow$ Uncertainty independent of the mixture
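
The two observations above can be checked numerically. The following is a minimal sketch, not the authors' code: the function name `lgm_posterior` is hypothetical, and the prior variance is assumed to be $\frac{2}{\pi}b_j^2$ as in the Gaussian model stated earlier.

```python
import numpy as np

def lgm_posterior(x, b):
    """Posterior mean and variance of each source under the Gaussian model.

    x: complex mixture value for one time-frequency bin.
    b: magnitude priors b_j, one per source.
    """
    b = np.asarray(b, dtype=float)
    var = 2.0 / np.pi * b**2        # prior variances (assumed scaling)
    gain = var / var.sum()          # equals b_j^2 / sum_k b_k^2
    mean = gain * x                 # posterior means, all aligned with x
    post_var = var * (1.0 - gain)   # posterior variances, x does not appear
    return mean, post_var

mean, post_var = lgm_posterior(x=2.0 - 1.0j, b=[1.0, 2.0])
print(np.angle(mean))   # every estimate has the phase of x
print(np.abs(mean))     # magnitudes generally differ from b
print(post_var)         # same values for any x
```

The gains are real and positive, so all estimates share the phase of the mixture, and `post_var` is computed without ever looking at `x`.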

Another classical solution is to use magnitude ratios:
$\hat{s}_j=\frac{b_j}{\sum b}x$
<img src="figures/nocode/fig_magdemo1.svg">

Another classical solution is to use magnitude ratios:
$\hat{s}_j=\frac{b_j}{\sum b}x$
<img src="figures/nocode/fig_magdemo2.svg">

Another classical solution is to use magnitude ratios:
$\hat{s}_j=\frac{b_j}{\sum b}x$
<img src="figures/nocode/fig_magdemo3.svg">

Another classical solution is to use magnitude ratios:
$\hat{s}_j=\frac{b_j}{\sum b}x$
<img src="figures/nocode/fig_magdemo4.svg">

$\Rightarrow$ Still estimating aligned sources rather than complying with the magnitude prior<p>
$\Rightarrow$ No tractable uncertainty 
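
As a small illustration of the first point, here is a sketch of the magnitude-ratio filter (the function name `magnitude_ratio_estimates` is hypothetical). The estimates only match the priors $b_j$ when $|x|=\sum_j b_j$, that is, when the sources really are in phase.

```python
import numpy as np

def magnitude_ratio_estimates(x, b):
    """Split the mixture x among the sources proportionally to b_j."""
    b = np.asarray(b, dtype=float)
    return b / b.sum() * x

x = 1.0 + 1.0j
b = [1.0, 2.0]
s_hat = magnitude_ratio_estimates(x, b)
print(np.abs(s_hat))    # |x| * b / sum(b), not b itself
print(np.angle(s_hat))  # all equal to the phase of x
```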

# An ideal model

## The donut-shaped distribution

## Objective
What do we want from a probabilistic model for a complex random variable with (approximately) known magnitude?
<img src="figures/nocode/fig_donut1.svg">

## Objective
What do we want from a probabilistic model for a complex random variable with (approximately) known magnitude?
<img src="figures/nocode/fig_donut2.svg">

## Objective
What do we want from a probabilistic model for a complex random variable with (approximately) known magnitude?
<img src="figures/nocode/fig_donut3.svg">

## The Donut distribution for modeling the sources
<img src="figures/nocode/fig_sourcesdonut.svg">

$\Rightarrow$ No model for the sum of donut variables<p>
$\Rightarrow$ No easy way to separate: $\mathbb{P}\left[s\mid x\right]$ is intractable

# Contribution
## **BEADS** Bayesian Expansion to Approximate the Donut Shape
<img src="figures/nocode/fig_beadsintro1.svg">
Sources distribution as a Gaussian Mixture Model: $P\left[s_j\right] = \sum_c \pi[c] \mathcal{N}\left(s_j\mid b_j \omega^c, \sigma_j\right)$<p>
$\Rightarrow$ Only two parameters: $b_j$ and $\sigma_j$

# Contribution
## **BEADS** Bayesian Expansion to Approximate the Donut Shape
<img src="figures/nocode/fig_beadsintro2.svg">
Sources distribution as a Gaussian Mixture Model: $P\left[s_j\right] = \sum_c \pi[c] \mathcal{N}\left(s_j\mid b_j \omega^c, \sigma_j\right)$<p>
$\Rightarrow$ Only two parameters: $b_j$ and $\sigma_j$

# Contribution
## **BEADS** Bayesian Expansion to Approximate the Donut Shape
<img src="figures/nocode/fig_beadsintro3.svg">
Sources distribution as a Gaussian Mixture Model: $P\left[s_j\right] = \sum_c \pi[c] \mathcal{N}\left(s_j\mid b_j \omega^c, \sigma_j\right)$<p>
$\Rightarrow$ Only two parameters: $b_j$ and $\sigma_j$

# Contribution
## **BEADS** Bayesian Expansion to Approximate the Donut Shape
<img src="figures/nocode/fig_beadsintro4.svg">
Sources distribution as a Gaussian Mixture Model: $P\left[s_j\right] = \sum_c \pi[c] \mathcal{N}\left(s_j\mid b_j \omega^c, \sigma_j\right)$<p>
$\Rightarrow$ Only two parameters: $b_j$ and $\sigma_j$

# Contribution
## **BEADS** Bayesian Expansion to Approximate the Donut Shape
<img src="figures/nocode/fig_beadsintro5.svg">
Sources distribution as a Gaussian Mixture Model: $P\left[s_j\right] = \sum_c \pi[c] \mathcal{N}\left(s_j\mid b_j \omega^c, \sigma_j\right)$<p>
$\Rightarrow$ Only two parameters: $b_j$ and $\sigma_j$
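
The prior can be sketched in a few lines. This is an illustration under stated assumptions, not the reference implementation: $\omega=e^{2i\pi/C}$ is taken to be the $C$-th root of unity, the weights are uniform ($\pi[c]=1/C$), $\sigma_j$ is read as the variance of a circular complex Gaussian, and all function names are hypothetical.

```python
import numpy as np

def beads_centers(b, C):
    """Centers b * omega**c of the C beads, omega = exp(2i*pi/C) (assumed)."""
    return b * np.exp(2j * np.pi * np.arange(C) / C)

def beads_pdf(s, b, sigma, C=8):
    """Prior density at s, with uniform weights and bead variance sigma."""
    mu = beads_centers(b, C)
    d2 = np.abs(np.asarray(s)[..., None] - mu) ** 2
    # Average of C circular complex Gaussian densities of variance sigma
    return np.mean(np.exp(-d2 / sigma) / (np.pi * sigma), axis=-1)

def beads_sample(b, sigma, C=8, size=1000, seed=None):
    """Draw samples: pick a bead uniformly, then add complex Gaussian noise."""
    rng = np.random.default_rng(seed)
    c = rng.integers(C, size=size)
    noise = np.sqrt(sigma / 2) * (
        rng.standard_normal(size) + 1j * rng.standard_normal(size)
    )
    return beads_centers(b, C)[c] + noise

samples = beads_sample(b=1.0, sigma=0.01, C=8, size=5000, seed=0)
print(np.abs(samples).mean())              # close to b
print(beads_pdf(0.0, b=1.0, sigma=0.01))   # almost no density at the origin
```

Unlike the Gaussian model, the density is concentrated on a ring of radius $b_j$ rather than at 0.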

## Summing beads random variables
BEADS model for the sources $\Rightarrow$ Gaussian Mixture Model for the mixture
<img src="figures/nocode/fig_beadssources1.svg">

## Summing beads random variables
BEADS model for the sources $\Rightarrow$ Gaussian Mixture Model for the mixture
<img src="figures/nocode/fig_beadssources2.svg">

## Summing beads random variables
BEADS model for the sources $\Rightarrow$ Gaussian Mixture Model for the mixture
<img src="figures/nocode/fig_beadssources3.svg">
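
To make the statement concrete, the components of the mixture GMM can be enumerated directly: one component per combination of beads, with the means added and the variances added. A minimal sketch, assuming uniform weights and $\omega=e^{2i\pi/C}$ (the function name `mixture_components` is hypothetical):

```python
import itertools
import numpy as np

def mixture_components(b, sigma, C=8):
    """Means, shared variance and weights of the GMM followed by x = sum_j s_j."""
    b = np.asarray(b, dtype=float)
    omega = np.exp(2j * np.pi * np.arange(C) / C)
    # All C**J ways to pick one bead per source, shape (C**J, J)
    combos = np.array(list(itertools.product(range(C), repeat=len(b))))
    means = (b * omega[combos]).sum(axis=1)   # sum of the chosen bead centers
    weights = np.full(len(combos), 1.0 / len(combos))
    variance = float(np.sum(sigma))           # variances of independent sources add
    return means, variance, weights

means, variance, weights = mixture_components(b=[1.0, 2.0], sigma=[0.05, 0.05], C=8)
print(len(means), variance)   # C**J components sharing one variance
```

The number of components grows as $C^J$, which is why the shared variance across beads matters for the computational cost.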

The sources are estimated through Bayes' theorem as $P\left[s\mid x\right]=\sum_c \pi(c\mid x)\mathcal{N}(s\mid \mu_{c\mid x}, \sigma_{\mid x})$
<img src="figures/nocode/fig_beadsdemo1.svg">

The sources are estimated through Bayes' theorem as $P\left[s\mid x\right]=\sum_c \pi(c\mid x)\mathcal{N}(s\mid \mu_{c\mid x}, \sigma_{\mid x})$
<img src="figures/nocode/fig_beadsdemo2.svg">
$\Rightarrow$ Posterior is tractable, estimates consistent with the magnitude prior<p>
$\Rightarrow$ Uncertainty is mixture-dependent<p>

The sources are estimated through Bayes' theorem as $P\left[s\mid x\right]=\sum_c \pi(c\mid x)\mathcal{N}(s\mid \mu_{c\mid x}, \sigma_{\mid x})$
<img src="figures/nocode/fig_beadsdemo3.svg">
$\Rightarrow$ Posterior is tractable, estimates consistent with the magnitude prior<p>
$\Rightarrow$ Uncertainty is mixture-dependent<p>

The sources are estimated through Bayes' theorem as $P\left[s\mid x\right]=\sum_c \pi(c\mid x)\mathcal{N}(s\mid \mu_{c\mid x}, \sigma_{\mid x})$
<img src="figures/nocode/fig_beadsdemo4.svg">
$\Rightarrow$ Posterior is tractable, estimates consistent with the magnitude prior<p>
$\Rightarrow$ Uncertainty is mixture-dependent<p>
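
The posterior above can be sketched for a single time-frequency bin as follows. This is an illustration under assumptions, not the authors' implementation: uniform prior weights, $\omega=e^{2i\pi/C}$, $\sigma_j$ read as a variance, brute-force enumeration of all $C^J$ bead combinations, and a hypothetical function name `beads_posterior`.

```python
import itertools
import numpy as np

def beads_posterior(x, b, sigma, C=8):
    """Posterior weights, component means/variances and mean estimates of the sources."""
    b = np.asarray(b, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    omega = np.exp(2j * np.pi * np.arange(C) / C)
    combos = np.array(list(itertools.product(range(C), repeat=len(b))))
    mu = b * omega[combos]             # (C**J, J) prior means per combination
    m = mu.sum(axis=1)                 # (C**J,) means of the mixture components
    v = sigma.sum()                    # variance shared by all mixture components
    # pi(c | x): uniform prior weights and the common normalisation cancel out
    log_w = -np.abs(x - m) ** 2 / v
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # Given a combination c, the model is Gaussian: Wiener-like correction
    gain = sigma / v
    mu_post = mu + gain * (x - m)[:, None]   # mu_{c|x}, shape (C**J, J)
    var_post = sigma * (1.0 - gain)          # sigma_{|x}, same for every c
    s_hat = w @ mu_post                      # posterior mean of each source
    return s_hat, w, mu_post, var_post

s_hat, w, mu_post, var_post = beads_posterior(
    x=3.0 + 0.0j, b=[1.0, 2.0], sigma=[0.01, 0.01], C=8
)
print(np.abs(s_hat))   # close to b when x is compatible with the priors
```

Each component variance `var_post` does not depend on $x$, but the weights `w` do, so the overall posterior spread changes with the mixture.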

# Conclusion: the BEADS model

## Core advantages
* Complex random variables with approximately known magnitudes
* A sum of BEADS sources is a GMM
* Separation reduces to standard GMM inference

## To go further
* Generalizes easily to multichannel
* Shared variances for the beads $\Rightarrow$ computational savings

## Source code for this presentation
[https://github.com/aliutkus/beads-presentation](https://github.com/aliutkus/beads-presentation)