
==========================<br>
FastICA on 2D point clouds<br>
==========================<br>
This example visually compares, in the feature space, the results of<br>
two different component analysis techniques:<br>
:ref:`ICA` vs :ref:`PCA`.<br>
Representing ICA in the feature space gives the view of 'geometric ICA':<br>
ICA is an algorithm that finds directions in the feature space<br>
corresponding to projections with high non-Gaussianity. These directions<br>
need not be orthogonal in the original feature space, but they are<br>
orthogonal in the whitened feature space, in which all directions<br>
correspond to the same variance.<br>
PCA, on the other hand, finds orthogonal directions in the raw feature<br>
space that correspond to directions accounting for maximum variance.<br>
Here we simulate independent sources using a highly non-Gaussian<br>
process: two Student's t distributions with a low number of degrees of<br>
freedom (top left figure). We mix them to create observations (top right<br>
figure).<br>
In this raw observation space, directions identified by PCA are<br>
represented by orange vectors and directions identified by ICA by red<br>
vectors. We represent the signal in the PCA space, after whitening by the<br>
variance corresponding to the PCA vectors (lower left). Running ICA<br>
corresponds to finding a rotation in this space to identify the<br>
directions of largest non-Gaussianity (lower right).
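
The "rotation in the whitened space" view can be sketched with NumPy alone. This is a brute-force illustration, not the FastICA algorithm: it whitens the mixed data, scans rotation angles, and keeps the angle that maximizes the summed absolute excess kurtosis. The helper names (`excess_kurtosis`, `rotate`) are hypothetical, and Laplace sources are assumed instead of Student's t so that the sample kurtosis is a stable measure of non-Gaussianity.

```python
import numpy as np


def excess_kurtosis(y):
    """Sample excess kurtosis of each column (close to 0 for Gaussian data)."""
    y = y - y.mean(axis=0)
    return (y ** 4).mean(axis=0) / (y ** 2).mean(axis=0) ** 2 - 3.0


def rotate(Z, theta):
    """Rotate the rows of a 2D point cloud by the angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return Z @ np.array([[c, -s], [s, c]])


rng = np.random.RandomState(0)
# Assumption: Laplace sources (finite kurtosis), not the example's Student's t
S = rng.laplace(size=(20000, 2))
A = np.array([[1.0, 1.0], [0.0, 2.0]])  # same mixing matrix as the example
X = S @ A.T

# Whiten: project on the principal axes and rescale each axis to unit variance
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
Z = (Xc @ eigvecs) / np.sqrt(eigvals)

# Scan rotations of the whitened cloud; a quarter turn covers every case
# because sources are only identifiable up to order and sign
thetas = np.linspace(0.0, np.pi / 2, 180, endpoint=False)
scores = [np.abs(excess_kurtosis(rotate(Z, t))).sum() for t in thetas]
best_theta = thetas[int(np.argmax(scores))]
S_est = rotate(Z, best_theta)

# Absolute correlation between true sources (rows) and estimates (columns)
corr = np.abs(np.corrcoef(S.T, S_est.T)[:2, 2:])
print(np.round(corr, 3))
```

Each true source should correlate strongly with exactly one estimated component, which reflects the fact that ICA recovers sources only up to permutation, sign and scale.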


In [None]:
print(__doc__)

Authors: Alexandre Gramfort, Gael Varoquaux<br>
License: BSD 3 clause

In [None]:
import numpy as np
import matplotlib.pyplot as plt

In [None]:
from sklearn.decomposition import PCA, FastICA

Generate sample data

In [None]:
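rng = np.random.RandomState(42)
# Two independent heavy-tailed sources: Student's t, 1.5 degrees of freedom
S = rng.standard_t(1.5, size=(20000, 2))
S[:, 0] *= 2.  # Give the first source a larger scale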
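
To get a feel for how non-Gaussian these sources are, one rough check is to compare the sample excess kurtosis of Student's t draws with that of Gaussian draws. The helper `excess_kurtosis` below is a hypothetical name introduced for this sketch. Note that with 1.5 degrees of freedom the population variance is infinite, so the sample value is only an unstable indicator of heavy tails, not an estimate of a finite quantity.

```python
import numpy as np


def excess_kurtosis(x):
    """Sample excess kurtosis of a 1D array (close to 0 for Gaussian data)."""
    x = x - x.mean()
    return (x ** 4).mean() / (x ** 2).mean() ** 2 - 3.0


rng = np.random.RandomState(42)
t_samples = rng.standard_t(1.5, size=20000)       # heavy-tailed
gauss_samples = rng.standard_normal(size=20000)   # reference

kurt_t = excess_kurtosis(t_samples)
kurt_gauss = excess_kurtosis(gauss_samples)
print("Student's t excess kurtosis:", kurt_t)
print("Gaussian excess kurtosis:   ", kurt_gauss)
```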

Mix data

In [None]:
A = np.array([[1, 1], [0, 2]])  # Mixing matrix

In [None]:
X = np.dot(S, A.T)  # Generate observations
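
As a small sanity check of the mixing model, the sketch below rebuilds the same `S`, `A` and `X` and looks at two properties: the columns of `A` (the directions along which each source appears in the observations) are not orthogonal, and when `A` is known its inverse recovers the sources. The names `angle_deg` and `S_back` are introduced here for illustration only.

```python
import numpy as np

rng = np.random.RandomState(42)
S = rng.standard_t(1.5, size=(20000, 2))
S[:, 0] *= 2.0
A = np.array([[1, 1], [0, 2]])  # mixing matrix
X = S @ A.T                     # each row is x = A s

# Column j of A is the direction of source j in the observation space
a0, a1 = A[:, 0], A[:, 1]
cos_angle = a0 @ a1 / (np.linalg.norm(a0) * np.linalg.norm(a1))
angle_deg = np.degrees(np.arccos(cos_angle))
print("Angle between mixing directions (degrees):", angle_deg)

# With the true mixing matrix known, unmixing is a plain matrix inverse
S_back = X @ np.linalg.inv(A).T
print("Sources recovered:", np.allclose(S_back, S, atol=1e-6))
```

Because the two mixing directions are not at a right angle, an orthogonal basis such as the one PCA returns cannot align with both of them at once.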

In [None]:
pca = PCA()
S_pca_ = pca.fit(X).transform(X)
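
The whitening step applied later to the PCA output (dividing each projected component by its standard deviation) can be sketched without scikit-learn. This is an illustration using a plain SVD, under the assumption that centering followed by SVD is an adequate stand-in for the PCA fit; it is not a description of scikit-learn internals.

```python
import numpy as np

rng = np.random.RandomState(42)
S = rng.standard_t(1.5, size=(20000, 2))
S[:, 0] *= 2.0
X = S @ np.array([[1, 1], [0, 2]]).T

# Principal directions are the right singular vectors of the centered data
Xc = X - X.mean(axis=0)
U, sing, Vt = np.linalg.svd(Xc, full_matrices=False)

# Project on the principal directions, then rescale each one to unit variance
scores = Xc @ Vt.T
whitened = scores / scores.std(axis=0)

print("Directions orthonormal:", np.allclose(Vt @ Vt.T, np.eye(2)))
print("Whitened covariance:")
print(np.cov(whitened, rowvar=False, ddof=0))
```

The whitened covariance is the identity matrix, so every direction in that space has the same variance and only a rotation remains to be found.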

In [None]:
ica = FastICA(random_state=rng)
S_ica_ = ica.fit(X).transform(X)  # Estimate the sources

In [None]:
S_ica_ /= S_ica_.std(axis=0)

Plot results

In [None]:
def plot_samples(S, axis_list=None):
    plt.scatter(S[:, 0], S[:, 1], s=2, marker='o', zorder=10,
                color='steelblue', alpha=0.5)
    if axis_list is not None:
        colors = ['orange', 'red']
        for color, axis in zip(colors, axis_list):
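            # Rescale a copy so the fitted model attributes are left untouched
            axis = axis / axis.std()
            x_axis, y_axis = axis
            # Trick to get legend to work
            plt.plot(0.1 * x_axis, 0.1 * y_axis, linewidth=2, color=color)
            # Draw both vectors of this basis from the origin
            plt.quiver((0, 0), (0, 0), x_axis, y_axis, zorder=11, width=0.01,
                       scale=6, color=color)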
    plt.hlines(0, -3, 3)
    plt.vlines(0, -3, 3)
    plt.xlim(-3, 3)
    plt.ylim(-3, 3)
    plt.xlabel('x')
    plt.ylabel('y')

In [None]:
plt.figure()
plt.subplot(2, 2, 1)
plot_samples(S / S.std())
plt.title('True Independent Sources')

In [None]:
axis_list = [pca.components_.T, ica.mixing_]
plt.subplot(2, 2, 2)
plot_samples(X / np.std(X), axis_list=axis_list)
legend = plt.legend(['PCA', 'ICA'], loc='upper right')
legend.set_zorder(100)

In [None]:
plt.title('Observations')

In [None]:
plt.subplot(2, 2, 3)
plot_samples(S_pca_ / np.std(S_pca_, axis=0))
plt.title('PCA recovered signals')

In [None]:
plt.subplot(2, 2, 4)
plot_samples(S_ica_ / np.std(S_ica_))
plt.title('ICA recovered signals')

In [None]:
plt.subplots_adjust(0.09, 0.04, 0.94, 0.94, 0.26, 0.36)
plt.show()