# Unmixing signals with ICA

Unmixing sound signals is an instance of the cocktail party problem, which you will use to get hands-on experience with independent component analysis (ICA). The **mixed** folder contains 5 mixed sound recordings (go check them out). Your goal is to unmix them.

In [90]:
import scipy.io.wavfile
import numpy as np

### Loading data from WAV files

In [91]:
dataset = []
for i in range(1,6):
    sample_rate, wav_data = scipy.io.wavfile.read('mixed/mix'+str(i)+'.wav')
    dataset.append(wav_data)

dataset = np.array(dataset).T
print(dataset.shape)
print(dataset[:10,:])

(53442, 5)
[[ 343 -546 -327 -275  612]
 [ 627 -840 -579 -124  890]
 [ 589 -725 -491 -115  989]
 [ 712 -887 -571  -24 1111]
 [ 589 -725 -491 -115  989]
 [ 268 -462 -146 -236  678]
 [ 107 -330   27 -296  522]
 [-214  -67  372 -416  211]
 [-214  -67  372 -416  211]
 [ 159 -206  -26 -233  445]]


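Normalizing data: each recording (column) is scaled by its peak absolute amplitude so that all values lie within [-0.99, 0.99].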

In [92]:
# Cast before abs() so the most negative integer sample cannot overflow
maxs = np.max(np.abs(dataset.astype(np.int64)), axis=0)
data_normalized = 0.99 * dataset / maxs
print(data_normalized[:10,:])

[[ 0.01046796 -0.01666328 -0.00997965 -0.00839268  0.01867752]
 [ 0.0191353  -0.02563581 -0.0176704  -0.00378433  0.02716175]
 [ 0.01797558 -0.02212614 -0.01498474 -0.00350966  0.03018311]
 [ 0.0217294  -0.02707019 -0.01742625 -0.00073245  0.03390641]
 [ 0.01797558 -0.02212614 -0.01498474 -0.00350966  0.03018311]
 [ 0.00817904 -0.01409969 -0.00445575 -0.00720244  0.02069176]
 [ 0.00326551 -0.01007121  0.00082401 -0.00903357  0.01593082]
 [-0.00653103 -0.00204476  0.011353   -0.01269583  0.00643947]
 [-0.00653103 -0.00204476  0.011353   -0.01269583  0.00643947]
 [ 0.00485249 -0.00628688 -0.00079349 -0.00711089  0.01358087]]
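
The per-column scaling above relies on NumPy broadcasting. A minimal sketch on made-up data (the array `toy` is hypothetical, not taken from the recordings) shows that each column ends up with a peak of exactly 0.99:

```python
import numpy as np

# Hypothetical toy data: 4 samples (rows) from 2 recordings (columns)
toy = np.array([[ 100, -20],
                [-400,  10],
                [ 200,  40],
                [  50,  -5]], dtype=np.int16)

# Peak absolute value of each column, computed in int64 to avoid overflow
peaks = np.max(np.abs(toy.astype(np.int64)), axis=0)

# Shapes (4, 2) / (2,) broadcast, so every column is divided by its own peak
toy_normalized = 0.99 * toy / peaks

print(peaks)
print(np.max(np.abs(toy_normalized), axis=0))
```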


In [93]:
print(data_normalized.shape)

(53442, 5)


### Implementing ICA

Initializing unmixing matrix $ W $.

In [94]:
W = np.identity(5)

In [95]:
g(W.dot(data_normalized.T)).shape

(5, 53442)

Implement the learning of the unmixing matrix $ W $ with ICA.
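
For reference, one common way to write the per-sample update for maximum-likelihood ICA is sketched below. It assumes a logistic sigmoid $ g $ as the source CDF; the learning rate $ \alpha $ and the sample notation $ x^{(i)} $ are symbols introduced here for illustration.

$$ W := W + \alpha \left( \left[ 1 - 2\, g\!\left(W x^{(i)}\right) \right] {x^{(i)}}^{T} + \left(W^{T}\right)^{-1} \right), \qquad g(z) = \frac{1}{1 + e^{-z}} $$

Here $ g $ is applied element-wise and $ x^{(i)} $ is a single 5-dimensional column of mixed samples, so the bracketed term is a 5×1 vector and the full gradient is a 5×5 matrix, the same shape as $ W $.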

In [121]:
# =============== TODO: Your code here ===============
# Implement learning unmixing matrix W with ICA. Do not forget to account for the dimensionality.
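def g(x):
    # Logistic sigmoid, used as the assumed CDF of each source
    return 1 / (1 + np.exp(-x))

W = np.identity(5)
alpha = 1e-2   # learning rate
eps = 5e-2     # stop once a full pass changes W by less than this
res = np.inf
it = 0
while res > eps:
    it += 1
    W_old = W.copy()
    # One pass of stochastic gradient ascent, one sample at a time
    for x in data_normalized:
        x = x.reshape(-1,1)   # column vector of shape (5, 1)
        grad = (1 - 2 * g(W.dot(x))).dot(x.T) + np.linalg.inv(W.T)
        W = W + alpha * grad
    res = np.linalg.norm(W - W_old)
    print('{}: ||W_i+1 - W_i|| = {:.3f}'.format(it, res))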
# ====================================================

1: ||W_i+1 - W_i|| = 47.959
2: ||W_i+1 - W_i|| = 13.114
3: ||W_i+1 - W_i|| = 8.086
4: ||W_i+1 - W_i|| = 5.951
5: ||W_i+1 - W_i|| = 4.816
6: ||W_i+1 - W_i|| = 4.078
7: ||W_i+1 - W_i|| = 3.505
8: ||W_i+1 - W_i|| = 3.028
9: ||W_i+1 - W_i|| = 2.623
10: ||W_i+1 - W_i|| = 2.278
11: ||W_i+1 - W_i|| = 1.981
12: ||W_i+1 - W_i|| = 1.727
13: ||W_i+1 - W_i|| = 1.509
14: ||W_i+1 - W_i|| = 1.323
15: ||W_i+1 - W_i|| = 1.165
16: ||W_i+1 - W_i|| = 1.031
17: ||W_i+1 - W_i|| = 0.917
18: ||W_i+1 - W_i|| = 0.820
19: ||W_i+1 - W_i|| = 0.738
20: ||W_i+1 - W_i|| = 0.667
21: ||W_i+1 - W_i|| = 0.607
22: ||W_i+1 - W_i|| = 0.556
23: ||W_i+1 - W_i|| = 0.511
24: ||W_i+1 - W_i|| = 0.473
25: ||W_i+1 - W_i|| = 0.440
26: ||W_i+1 - W_i|| = 0.410
27: ||W_i+1 - W_i|| = 0.384
28: ||W_i+1 - W_i|| = 0.361
29: ||W_i+1 - W_i|| = 0.341
30: ||W_i+1 - W_i|| = 0.322
31: ||W_i+1 - W_i|| = 0.305
32: ||W_i+1 - W_i|| = 0.290
33: ||W_i+1 - W_i|| = 0.275
34: ||W_i+1 - W_i|| = 0.262
35: ||W_i+1 - W_i|| = 0.250
36: ||W_i+1 - W_i|| = 0.239
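
To sanity-check the gradient rule without the audio files, it can help to run it on a small synthetic problem where the true mixing matrix is known. The sketch below is an illustration under stated assumptions, not the notebook's method: it uses two made-up Laplace-distributed sources, a hypothetical 2×2 mixing matrix `A`, and a batch variant of the update (the gradient averaged over all samples instead of applied per sample) so that it runs quickly and deterministically.

```python
import numpy as np

def g(x):
    # Logistic sigmoid, the same nonlinearity as in the notebook
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)      # fixed seed, only for repeatability
n = 5000
S = rng.laplace(size=(2, n))        # hypothetical heavy-tailed sources, one per row
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])          # hypothetical mixing matrix
X = A @ S                           # observed mixtures, shape (2, n)

W = np.identity(2)
alpha = 0.1                         # learning rate chosen for this toy problem
for _ in range(2000):
    # Same gradient as the per-sample rule, averaged over all n samples
    grad = (1 - 2 * g(W @ X)) @ X.T / n + np.linalg.inv(W.T)
    W = W + alpha * grad

# If separation worked, W @ A is close to a scaled permutation matrix:
# each row has one dominant entry and one entry near zero.
P = W @ A
abs_P = np.abs(P)
crosstalk = abs_P.min(axis=1) / abs_P.max(axis=1)
print(np.round(P, 2))
print(crosstalk)
```

ICA recovers sources only up to scale, sign and ordering, which is why the check looks at the ratio of the small to the large entry in each row of `W @ A` rather than comparing `W` with the inverse of `A` directly.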

### Unmixing sounds

Use the learned matrix $ W $ to unmix the sounds into separate sources. Make sure the resulting matrix of unmixed signals has one track per row (i.e. it should have 5 rows).

In [122]:
# =============== TODO: Your code here ===============
# Use learned matrix W to unmix the sounds into separate data sources.
unmixed = W.dot(data_normalized.transpose(1,0))
# ====================================================

Saving the unmixed sounds. Note that the tracks are written as floating-point WAV data, which some players do not support. If that is the case, you can use Winamp to play the unmixed sounds.

In [123]:
maxs = np.max(np.abs(unmixed), axis=1).reshape((5,1))
unmixed_normalized = 0.99 * unmixed / maxs

for i in range(unmixed_normalized.shape[0]):
    track = unmixed_normalized[i,:]
    scipy.io.wavfile.write('unmixed/unmixed'+str(i+1)+'.wav', sample_rate, track)
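
If a player rejects the floating-point files, one possible workaround is to convert each track to 16-bit integer PCM before writing it. The helper below, `to_int16`, is a hypothetical name introduced here; it is a minimal sketch that assumes the track has already been scaled into [-1, 1], as `unmixed_normalized` is.

```python
import numpy as np

def to_int16(track):
    """Convert a float track in [-1, 1] to 16-bit PCM samples."""
    clipped = np.clip(track, -1.0, 1.0)          # guard against out-of-range values
    return np.round(clipped * 32767).astype(np.int16)

# Made-up samples standing in for one normalized track
demo = np.array([-0.99, 0.0, 0.25, 0.99])
pcm = to_int16(demo)
print(pcm, pcm.dtype)
```

The result can then be passed to `scipy.io.wavfile.write` in place of the float track, for example `scipy.io.wavfile.write('unmixed/unmixed1.wav', sample_rate, to_int16(track))`, which writes an integer PCM file.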