This is a [Jupyter](http://jupyter.org) notebook.
Lectures about Python, useful both for beginners and experts, can be found at http://scipy-lectures.github.io.

Open the notebook by (1) copying this file into a directory, (2) in that directory typing 
jupyter-notebook
and (3) selecting the notebook.

# <font color= 'blue'>Blind Source Separation</font>
## Independent Component Analysis

***
A notebook by ***Shashwat Shukla*** and ***Dhruv Ilesh Shah***
***

In this tutorial we will learn how to solve the Cocktail Party Problem using Independent Component Analysis (ICA).
We will first take a look at Principal Component Analysis (PCA). The limitations of PCA will naturally lead to an understanding of what ICA does.

# Overview
## The Cocktail Party Problem (CPP)

So what is the Cocktail Party Problem? 
Imagine you are at a party where a lot of different conversations are happening in different parts of the room. As a listener in the room, you are receiving sound from all of these conversations at the same time. And yet, as humans, we possess the ability to identify different threads of conversation and to focus on any conversation of our choice. How do we do that? And how can we program a computer to do that?
So this is essentially the Cocktail Party Problem: given **m** sources (conversations at the party, for example) and some number of sound receivers, separate out the different signals. (We will talk about how many receivers we need later.)

We need to make some mathematical assumptions and also phrase the problem more formally.

## The data

So first of all, our signals here are the sounds coming from different sources. 
At every (uniformly spaced) discrete interval of time we record **m** samples, one at each of our **m** microphones.

Note the implicit assumptions that we have made here:

1) There are as many microphones as there are independent conversations (sources) going on in the room. This assumption allows us to come up with a method to retrieve all the **m** independent signals. We can say that our system is **critically determined** (and is not under- or over-determined). Henceforth, we shall only consider this case in the tutorial.

2) Each microphone records a reasonably distinct combination of the independent signals. This simply amounts to not keeping two microphones too close to each other. Because floating-point arithmetic has finite precision, nearly identical recordings make the separation numerically ill-conditioned, so it is always best to have easily distinguishable recordings.

How are we recording this data? We simply record the amplitude of the sound at each instant. Recording the pressure amplitude is a convenient thing to do (and is what a microphone does: a transducer converts the pressure amplitude to a voltage).
Note that we are recording the signals at discrete intervals of time (at a sampling rate assumed to be greater than the Nyquist rate, i.e. twice the highest frequency present in the signal) and will be working only in the time domain with these discrete signals.

One very important thing: We assume that the sound that any receiver records is a **linear combination** of sounds from the different sources. This is a reasonable assumption to make as pressure adds linearly. Each receiver will receive a different linear combination: If the first receiver is closer to a particular speaker than the second receiver, then the linear weight of this speaker will be proportionately higher for the first receiver.

![Cocktail Party Problem](Notebook/cocktail_1.png)


We further assume that each source is **statistically independent** with respect to all the other sources. We will look at a mathematical interpretation of statistical independence of two signals later. Within the context of the Cocktail Party parable an intuitive understanding of this assumption follows naturally, as the conversations happening in different parts of the room are independent of each other. Hence, knowing the signal at a particular instant from one source does not allow us to predict the value of the signal from any other source at that instant. They are independent variables.
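
As a rough numerical illustration of this assumption, the sketch below draws two signals independently and compares their sample correlation with that of two linear mixtures of them. The distributions, sample count and mixing weights are arbitrary choices made for this example only. Note also that zero correlation is a weaker property than statistical independence, so this is only a sanity check, not a test of independence.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000  # number of time samples (arbitrary choice for this sketch)

# Two hypothetical sources, drawn independently of each other
s1 = rng.uniform(-1.0, 1.0, size=N)
s2 = rng.laplace(size=N)

# Independent sources should have a sample correlation close to zero
corr_sources = np.corrcoef(s1, s2)[0, 1]

# Two "microphones" that each hear a different linear mix of the sources
x1 = 1.0 * s1 + 0.5 * s2
x2 = 0.5 * s1 + 1.0 * s2
corr_mixed = np.corrcoef(x1, x2)[0, 1]

print(f"correlation between sources:  {corr_sources:.3f}")
print(f"correlation between mixtures: {corr_mixed:.3f}")
```

The recorded mixtures are strongly correlated even though the underlying sources are not, which is why the recordings themselves cannot be treated as the independent signals.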

This is the key assumption in Blind Source Separation that allows us to solve the problem.

We are also making one vital assumption about the sources of the signals: that they are non-Gaussian. We will look at what that means and why it matters in the next section. 

## The math

We will index our microphones from **1** to **m**. 

The signal received by the microphone labelled **i** over the entire time of recording will be denoted by $x_{i}$. A particular sample of this recorded signal, recorded at the time index **j** will then be denoted by $x_{i}^{j}$. 

Hence, if the number of samples recorded over time is **N**, then $x_{i}$ can be seen as a row vector in **N**-dimensional space. Its **j**th element is given by $x_{i}^{j}$.

We had said that we have **m** microphones. Hence **i** in the above description ranges from **1** to **m**.

If we stack up these row vectors, we get an **m x N** matrix whose **i**th row contains the samples recorded by the **i**th microphone. A 'vertical slice' of this matrix, i.e. a column, contains all the samples recorded at a given instant of time, indexed by the corresponding microphone.

Let us call this data matrix **x**.
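
The layout of the data matrix can be sketched in NumPy as follows. The values of `m` and `N` and the random stand-in recordings are assumptions made for illustration; note that NumPy indices start at 0, whereas the text counts from 1.

```python
import numpy as np

m, N = 3, 1000  # hypothetical: 3 microphones, 1000 time samples
rng = np.random.default_rng(0)

# One row vector of N samples per microphone
# (random numbers standing in for real recordings)
recordings = [rng.standard_normal(N) for _ in range(m)]

# Stack the row vectors vertically to form the m x N data matrix x
x = np.vstack(recordings)

row_i = x[1]      # all N samples from the 2nd microphone (zero-based index 1)
col_j = x[:, 10]  # the m samples recorded at the 11th time index
sample = x[1, 10] # the single sample x_i^j for that microphone and time

print(x.shape, row_i.shape, col_j.shape)
```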

To reiterate, $x_{i}^{j}$ corresponds to the sound sample recorded by the **i**th microphone at the time (indexed by) **j**.

Let us now similarly define matrices corresponding to the sources that we wish to finally recover.
The indices for the independent sources also go from **1** to **m**.

Let $s_{i}$ denote the signal generated by the **i**th independent source that we wish to recover (the **i**th conversation in the room). It is defined as a row vector.

$s_{i}^{j}$ is then the **j**th time sample of this signal. 

Again, we vertically stack up these row vectors to get an **m x N** matrix denoted by **s**.

Now that we have defined our data and the signals that we wish to retrieve, we will describe the (assumed) relationship between the two. Note that we had assumed that the independent sources add **linearly** to give the recorded signals. 

This means that each $x_{i}$ is some linear combination of the vectors $s_{1}$ through $s_{m}$.

Putting it all together, we conclude that $x = As$ ; for some **m x m** matrix **A**, called the mixing matrix.
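
A minimal sketch of the mixing model $x = As$ for **m = 2**, assuming two made-up source waveforms and an arbitrary mixing matrix (neither comes from real data):

```python
import numpy as np

N = 2000
t = np.linspace(0.0, 1.0, N)

# Two hypothetical sources: a sine wave and a square wave, stacked as rows
s = np.vstack([
    np.sin(2 * np.pi * 5 * t),
    np.sign(np.sin(2 * np.pi * 3 * t)),
])

# Hypothetical 2 x 2 mixing matrix; row i holds the weights with which
# microphone i hears each source
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

# Each row of x is a linear combination of the rows of s
x = A @ s

print(x.shape)
```

Here the first microphone records `1.0 * s[0] + 0.5 * s[1]`, and the second records `0.3 * s[0] + 1.0 * s[1]`.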

Our objective is then to find an "un-mixing" matrix **W** that satisfies $s = Wx$.

If we know this **W**, as we already have **x**, we can calculate **s** by a direct multiplication. 
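
To make this concrete, the sketch below recovers the sources in the idealised case where the mixing matrix is already known, so that **W** can be obtained by inverting it. In the actual problem **A** is unknown; the sources and matrix here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m, N = 3, 500

# Hypothetical non-Gaussian sources, one per row
s = rng.laplace(size=(m, N))

# Hypothetical mixing matrix (strictly diagonally dominant, hence invertible)
A = np.array([[1.0, 0.4, 0.2],
              [0.3, 1.0, 0.5],
              [0.2, 0.6, 1.0]])

x = A @ s             # what the microphones record
W = np.linalg.inv(A)  # un-mixing matrix, available here only because A is known
s_recovered = W @ x   # direct multiplication recovers the sources

print(np.allclose(s_recovered, s))
```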

![The Problem](Notebook/cocktail_2.jpg)


## Outline of solution

Our objective is to find the matrix **W**. As we have assumed that the number of microphones is equal to the number of independent conversations, the matrix **A** is square. Assuming further that **A** is invertible (which holds when each microphone records a linearly independent combination of the sources), **W** is just the inverse of **A**.

Hence, it suffices to find **A**. We will employ [Singular Value Decomposition (SVD)](https://www.wikiwand.com/en/Singular_value_decomposition) on the matrix **A**.

That is, we write $A = UDV$ for orthogonal matrices **U**, **V** and a diagonal matrix **D**.
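
The decomposition can be checked numerically with `numpy.linalg.svd`, shown here on an arbitrary example matrix. NumPy returns the singular values as a 1-D array and returns the right-hand factor already in the form that multiplies on the right, so it plays the role of **V** in $A = UDV$.

```python
import numpy as np

# Arbitrary example mixing matrix (an assumption for this sketch)
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

# U: orthogonal, d: singular values (1-D array), V: orthogonal right factor
U, d, V = np.linalg.svd(A)
D = np.diag(d)  # build the diagonal matrix from the singular values

# Check the factorisation and the orthogonality of U and V
print(np.allclose(U @ D @ V, A))
print(np.allclose(U.T @ U, np.eye(2)), np.allclose(V @ V.T, np.eye(2)))
```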

We will then determine each of **U**, **D**, **V** by considering the covariance matrix of **x** and exploiting the independence of the source signals.
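
As a hedged preview of why the covariance matrix of **x** is informative: since $x = As$, the sample covariance of **x** equals $A \, \mathrm{Cov}(s) \, A^{T}$. If we additionally assume, for this sketch only, that the sources are uncorrelated with unit variance, then $\mathrm{Cov}(s) \approx I$ and so $\mathrm{Cov}(x) \approx AA^{T}$. The sources and mixing matrix below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
m, N = 2, 100_000

# Hypothetical independent sources scaled to unit variance
# (a Laplace distribution with scale 1/sqrt(2) has variance 1)
s = rng.laplace(scale=1.0 / np.sqrt(2.0), size=(m, N))

# Hypothetical mixing matrix
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
x = A @ s

# np.cov treats each row as one variable, giving m x m matrices
cov_s = np.cov(s)
cov_x = np.cov(x)

# Exact relation between the two sample covariances
print(np.allclose(cov_x, A @ cov_s @ A.T))
# Approximate relation, relying on the unit-variance assumption
print(np.allclose(cov_x, A @ A.T, atol=0.05))
```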

The details follow.
