### 01. Random Projection

In this lesson, we'll be talking about dimensionality reduction. The first method we'll look at in this lesson is random projection, which is a powerful dimensionality reduction method that is computationally more efficient than PCA.

It is commonly used in cases where a dataset has too many dimensions for PCA to be directly computed.

Let's say your application is running on a system with limited computational resources, or you just find that PCA is too taxing for your specific situation. Just like PCA, random projection takes a dataset. Let's say our dataset has d dimensions (columns), say 1,000, and a certain number of samples (rows), say n. Random projection produces a transformation of it with a much smaller number of columns, say 50, but the same number of samples, where each new column captures information from multiple original columns.

![01%20Random%20Projection.PNG](attachment:01%20Random%20Projection.PNG)

Let's look at an oversimplified example for reducing the dimensions of a dataset from two dimensions into one dimension. 

PCA here will try to maximize variance.  So, it finds the vector or direction that maximizes the variance so it loses the least amount of information when it projects the data from two dimensions to one. So, that line would be something like this, and these would be the projections, and so in one dimension, the dataset would look like this.

Random projection: the calculation that PCA does consumes a certain amount of resources, especially when we're talking about a lot of dimensions. Random projection just says: pick a line, any line, do a projection onto it, and that's our dataset. While that doesn't make a lot of sense in an oversimplified scenario like this one, going from two dimensions to one, it actually works really well in higher dimensions, and it does so efficiently.
 
 ![01%20PCA%20VS%20RP.PNG](attachment:01%20PCA%20VS%20RP.PNG)
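
The two-dimensional comparison can be sketched in NumPy. This is only an illustration under assumed data: the correlated Gaussian cloud, the seed, and the variable names are made up for the example and do not come from the lesson.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical correlated 2-D dataset (500 samples, 2 columns), centered.
X = rng.multivariate_normal(mean=[0, 0], cov=[[3.0, 1.5], [1.5, 1.0]], size=500)
X = X - X.mean(axis=0)

# PCA direction: eigenvector of the covariance matrix with the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
pca_dir = eigvecs[:, -1]  # eigh returns eigenvalues in ascending order

# Random projection direction: "pick a line, any line" (normalized to unit length).
rand_dir = rng.normal(size=2)
rand_dir /= np.linalg.norm(rand_dir)

# Project the 2-D points onto each line to get 1-D datasets.
X_pca = X @ pca_dir
X_rand = X @ rand_dir

var_pca = X_pca.var(ddof=1)
var_rand = X_rand.var(ddof=1)
print(f"variance kept by PCA line:    {var_pca:.3f}")
print(f"variance kept by random line: {var_rand:.3f}")
```

The PCA line never keeps less variance than the random one, but finding it required an eigendecomposition, while the random line cost almost nothing to produce.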

The basic premise of random projection is that we can reduce the number of dimensions in our dataset simply by multiplying it by a random matrix. The dataset has d columns and the random matrix maps them down to k columns, where k is either something we choose or something we compute; there is a way to compute a conservative estimate for k. The product is the resulting dataset. Just multiplying by a random matrix: that's all random projection is, in a way.

![RP%20multiply%20by%20Matrix.PNG](attachment:RP%20multiply%20by%20Matrix.PNG)
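
As a minimal sketch of that multiplication, assuming Gaussian entries for the random matrix (one common choice, not the only one) and using the d = 1,000 and k = 50 figures from earlier; the sample count of 200 and all variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, k = 200, 1000, 50          # samples, original dimensions, target dimensions
X = rng.normal(size=(n, d))      # stand-in dataset with n rows and d columns

# Random matrix with d rows and k columns. Scaling by 1/sqrt(k) keeps
# squared distances unchanged in expectation.
R = rng.normal(size=(d, k)) / np.sqrt(k)

# The whole "algorithm": a single matrix product.
X_new = X @ R

print(X.shape, "->", X_new.shape)
```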

Let's take a concrete example. Say this is our dataset: it has 12,000 dimensions (that's our d) and 1,500 rows or samples. If we give this to scikit-learn and ask it to do a random projection using its default values, it will return a dataset with roughly 6,200 dimensions and, of course, the same number of samples.

![01%20sklearn%20RP.PNG](attachment:01%20sklearn%20RP.PNG)
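
The target dimension in this example can be checked without building the full 1,500 x 12,000 matrix. scikit-learn exposes the helper `johnson_lindenstrauss_min_dim`, which computes a conservative k from the number of samples and epsilon. The sketch below assumes the default `eps=0.1` used by scikit-learn's random projection classes:

```python
from sklearn.random_projection import johnson_lindenstrauss_min_dim

# Conservative number of components for 1,500 samples at the default eps of 0.1.
k = int(johnson_lindenstrauss_min_dim(n_samples=1500, eps=0.1))
print(k)  # 6268
```

Note that k depends on the number of samples and on epsilon, not on the original 12,000 dimensions.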

So, how do we know that it works, and where does the k come from?

The theoretical underpinning for random projection is the Johnson-Lindenstrauss lemma, which states that a dataset of N points in a high-dimensional space (our dataset, with 12,000 dimensions, is pretty high) can be mapped, by multiplying by this random matrix, down to a space of much lower dimension, which is this narrower dataset. This is why it's really important for us.

It can be done in a way that preserves the distances between the points to a large degree. So, the distance between each pair of points in the dataset is preserved, in a certain way, after projection. That's really important because a lot of supervised and unsupervised learning algorithms really care about the distances between the points.

So, we have a level of guarantee: these distances will be distorted a little bit, but they are largely preserved.

![Johnson-lindenstrauss%20lemma.PNG](attachment:Johnson-lindenstrauss%20lemma.PNG)
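
The distance guarantee can be checked empirically. The sketch below assumes a synthetic dataset of 50 points in 10,000 dimensions and a loose epsilon of 0.5; these numbers are made up to keep the example fast. It compares squared pairwise distances before and after projection, which the lemma bounds between (1 - eps) and (1 + eps) times the original value.

```python
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.default_rng(0)
eps = 0.5

# Hypothetical high-dimensional dataset: 50 points, 10,000 dimensions.
X = rng.normal(size=(50, 10000))

# n_components is left at its default ("auto"), so k is derived from eps
# and the number of samples.
rp = GaussianRandomProjection(eps=eps, random_state=0)
X_new = rp.fit_transform(X)

# Squared Euclidean distance for every pair of points, before and after.
d_orig = pdist(X, metric="sqeuclidean")
d_proj = pdist(X_new, metric="sqeuclidean")
ratio = d_proj / d_orig

print(X.shape, "->", X_new.shape)
print(f"distortion ratio range: {ratio.min():.3f} to {ratio.max():.3f}")
```

The guarantee holds with high probability rather than with certainty, so a different seed could in principle produce a pair slightly outside the band.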

### 03. Random Projection in sklearn

![03%20RP%20in%20Sklearn.PNG](attachment:03%20RP%20in%20Sklearn.PNG)

Here, we can specify a value for epsilon, or force a specific size, meaning the number of components (resulting columns/features). If we don't specify anything, the number of components is chosen automatically: it is calculated from epsilon and the number of points (samples) in the dataset. So, it will choose based on the dataset.
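
Both options can be sketched with `SparseRandomProjection`. The dataset shape, the `eps=0.5` value, and the 50 components are assumptions picked for illustration; the screenshot above may use different settings.

```python
import numpy as np
from sklearn.random_projection import (
    SparseRandomProjection,
    johnson_lindenstrauss_min_dim,
)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5000))  # hypothetical dataset: 100 samples, 5,000 features

# Option 1: set epsilon and let the number of components be computed
# automatically from eps and the number of samples.
rp_auto = SparseRandomProjection(eps=0.5, random_state=0)
X_auto = rp_auto.fit_transform(X)

# Option 2: force a specific number of components.
rp_fixed = SparseRandomProjection(n_components=50, random_state=0)
X_fixed = rp_fixed.fit_transform(X)

print("auto: ", X_auto.shape, "n_components_ =", rp_auto.n_components_)
print("fixed:", X_fixed.shape)
```

With the automatic setting, scikit-learn raises an error if the computed number of components is larger than the original number of features, which is why the example uses a wide dataset.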

### 04. Independent Component Analysis (ICA)

Independent component analysis is a method similar to PCA and random projection in that it takes a set of features and produces a different set that is useful in some way. But it's different in that, while PCA works to maximize variance, ICA assumes that the features are mixtures of independent sources, and it tries to isolate these independent sources that are mixed in the dataset.
The classic example used to explain ICA is something called the cocktail party problem.

So, let's say that three of your friends go to an art show and there are three curtains. One of them opens up and there's a person playing a piano there. Then, after a little bit of time, another curtain is uncovered and there is a person playing the cello. The piano is still playing when the cello starts, but they're playing two different pieces. Then the third curtain is opened, and there's a TV turned on with a sine wave noise coming out of it.

So, here your friends decide that "Okay. This is kind of interesting. Let's take our phones out and record this." So, they record let's say six seconds of audio for this.

This friend was closer to the piano, so they have more piano in their recording; this person was closer to the TV, so they have more TV.

Is there a way for you to retrieve the original signals, the original datasets? The answer is yes. That's an example of what independent component analysis allows you to do. This type of problem is called blind source separation, and that's the problem that ICA solves.

![04%20ICA.PNG](attachment:04%20ICA.PNG)

### 05. FastICA Algorithm

Independent Component Analysis algorithm: this is going to be a very high-level view; we will not delve deep into the math.

Let's get a general idea of how it works and what assumptions are made when we want to use it.

![05%20FastICA.PNG](attachment:05%20FastICA.PNG)

So, the dataset that we have, we call X. X was generated by multiplying what we call a mixing matrix, A, by the source signals, S. We don't have A and we don't have S, but S is what we want to calculate in the end. So, if X = A S, we can say that S = W X, where W is the inverse of A. If A is the mixing matrix, we can call W the unmixing matrix, and X is the dataset that we have, the original recordings. So, in this formula, X is the input that we have, W is what we are trying to calculate, and S is the result.

![05%20FastICA%202.PNG](attachment:05%20FastICA%202.PNG)
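
The relationship between X, A, S, and W can be illustrated in NumPy. In this sketch the sources and the mixing matrix are made up and, unlike in a real ICA problem, A is known, so W is obtained by direct inversion rather than estimated:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical source signals: 3 sources (rows), 1,000 time steps (columns).
S = rng.uniform(-1, 1, size=(3, 1000))

# Hypothetical mixing matrix: each row says how much of each source
# one recording device picked up.
A = np.array([[1.0, 1.0, 1.0],
              [0.5, 2.0, 1.0],
              [1.5, 1.0, 2.0]])

# The recordings we actually observe: X = A S.
X = A @ S

# The unmixing matrix is the inverse of the mixing matrix: W = A^-1.
W = np.linalg.inv(A)

# Recover the sources: S = W X.
S_recovered = W @ X

print(np.allclose(S_recovered, S))
```

ICA has to approximate W from X alone, without ever seeing A or S.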

So, the ICA algorithm is all about approximating W, that is, finding the best W that we can multiply by X, the dataset, to produce the original signals.

The ICA algorithm is explained clearly in this paper called Independent Component Analysis: Algorithms and Applications.

![05%20FastICA%203.PNG](attachment:05%20FastICA%203.PNG)

It goes into the derivation of everything here. It shows a couple of ways to calculate a number of different parts of the algorithm, but here we just take a very high-level view of the algorithm called FastICA. This is one way to do it, and it is the one that's actually implemented in scikit-learn.

Paper: ["Independent component analysis: algorithms and applications"](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.322.679&rep=rep1&type=pdf) (pdf)

### 07. ICA in sklearn

![07%20ICA%20in%20sklearn.PNG](attachment:07%20ICA%20in%20sklearn.PNG)
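
A minimal sketch of `FastICA` on a synthetic version of the three-recordings scenario. The signals, noise level, and mixing matrix are assumptions made up for the example, not the lab data. scikit-learn expects samples as rows, so here the mixing is written as `X = S @ A.T` instead of X = A S.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)

# Three hypothetical independent sources, one per column.
s1 = np.sin(2 * t)                 # sine wave
s2 = np.sign(np.sin(3 * t))        # square wave
s3 = 2 * (t % 1) - 1               # sawtooth wave
S = np.c_[s1, s2, s3]
S += 0.2 * rng.normal(size=S.shape)  # a little noise on each source
S /= S.std(axis=0)                   # standardize each source

# Hypothetical mixing matrix: how strongly each recording picks up each source.
A = np.array([[1.0, 1.0, 1.0],
              [0.5, 2.0, 1.0],
              [1.5, 1.0, 2.0]])
X = S @ A.T                        # three mixed recordings, shape (2000, 3)

# Estimate the independent components from the recordings alone.
ica = FastICA(n_components=3, random_state=0)
S_est = ica.fit_transform(X)

# ICA recovers sources only up to order, sign, and scale, so compare using
# the absolute correlation between each true source and its best match.
corr = np.abs(np.corrcoef(S.T, S_est.T)[:3, 3:])
best_match = corr.max(axis=1)
print(best_match.round(3))
```

Because the order and sign of the recovered components are arbitrary, the components usually need to be inspected or matched before they can be labeled.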

### 08. [Lab] Independent Component Analysis

The lab is completed in a separate Jupyter notebook, in a folder inside the existing folder.

foldername : ica_lab-master 

filename :  Independent Component Analysis Lab.ipynb


### 10. ICA Applications

![10%20Application%201.PNG](attachment:10%20Application%201.PNG)

Paper: [Independent Component Analysis of Electroencephalographic Data](http://papers.nips.cc/paper/1091-independent-component-analysis-of-electroencephalographic-data.pdf) [PDF]

Paper: [Applying Independent Component Analysis to Factor Model in Finance](https://www.semanticscholar.org/paper/Applying-Independent-Component-Analysis-to-Factor-Cha-Chan/a34be08a20eba7523600203a32abb026a8dd85a3?p2df) [PDF]