# TMA4320 - Project 1 - Independent component analysis

This notebook is the first project in TMA4320 at the Norwegian University of Science and Technology. It describes the basic principles of independent component analysis (ICA) and uses it to decompose mixed sound signals into separate components.

## Preparations 

First, we need to import the data (sound files) that we will be working with and make them playable. The three cells below take care of this using the script "wav_file_loader.py"; the files are assumed to be placed in audio/.


In [1]:
'''This cell loads the files that are to be decomposed.'''
import numpy as np
from wav_file_loader import read_wavefiles

paths = ['audio/mix_1.wav', 'audio/mix_2.wav', 'audio/mix_3.wav']
data, sampling_rate = read_wavefiles(paths)
num_signals = data.shape[0]

In [2]:
"""This cell defines a function that normalizes the volume of the audio signals so that they are roughly equally loud"""
def normalize_audio(data):
    """Scale each row so that max(abs(data[i])) == 1."""
    abs_data = np.absolute(data)
    maximums = np.amax(abs_data, axis=1)
    # Divide each row by its own maximum (the column vector broadcasts over the rows):
    data = data / maximums.reshape((-1,1))
    return data

data = normalize_audio(data)

In [3]:
"""Here you can play the three loaded audio clips"""
import IPython.display as ipd

ipd.display(ipd.Audio(data[0,:], rate=sampling_rate))
ipd.display(ipd.Audio(data[1,:], rate=sampling_rate))
ipd.display(ipd.Audio(data[2,:], rate=sampling_rate))

## Mixing
You can ignore this section on mixing, including the code cell below, for now. If you later want to test the algorithm by mixing loaded independent signals yourself, the functions below may come in handy.

In [4]:
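def normalize_rowsums(A):
    """Divide each row in A by its sum.
    
    The sum of each row in the result is 1.0."""
    the_sum = np.sum(A, axis=1)
    # reshape((-1,1)) works for any number of rows, not only three
    A = A / the_sum.reshape((-1,1))
    return A

def random_mixing_matrix(signals, observations):
    """Creates a random mixing matrix of shape (observations, signals).
    
    Each element is drawn from [0.25, 1.25) and each row is then scaled to sum to 1,
    so every element is a small positive number, not too close to 0.
    With three signals the elements lie in (1/11, 5/7).
    """
    
    A = 0.25 + np.random.rand(observations, signals)
    return normalize_rowsums(A)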


In [5]:
A = random_mixing_matrix(num_signals, num_signals)
data_mixed = normalize_audio(A @ data)
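
To test the algorithm on signals whose sources are known, the mixing step above can also be applied to synthetic data. The following is a minimal sketch, not part of the project code: the three test sources (`sources`), the seed, and the sample count `N` are assumptions made for illustration, and the mixing matrix is built the same way as in `random_mixing_matrix`.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Three hypothetical, statistically independent test sources (d x N).
N = 2000
t = np.linspace(0.0, 1.0, N)
sources = np.vstack([
    np.sin(2 * np.pi * 5 * t),           # sine wave
    np.sign(np.sin(2 * np.pi * 3 * t)),  # square wave
    rng.uniform(-1.0, 1.0, N),           # uniform noise
])

# Random mixing matrix with entries in [0.25, 1.25), rows scaled to sum to 1.
A = 0.25 + rng.random((3, 3))
A = A / A.sum(axis=1, keepdims=True)

# Each observation (row of mixed) is a weighted average of the sources.
mixed = A @ sources
```

Because each row of `A` sums to 1, the mixed signals stay in the same amplitude range as the sources.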

In [6]:
import IPython.display as ipd

ipd.display(ipd.Audio(data_mixed[0,:], rate=sampling_rate))
ipd.display(ipd.Audio(data_mixed[1,:], rate=sampling_rate))
ipd.display(ipd.Audio(data_mixed[2,:], rate=sampling_rate))

## Preprocessing
In the following cell you write the functions <font color='blue'> center_rows </font> and <font color='blue'> whiten_rows </font> that perform the preprocessing steps described in the project description. In each case, the variable Z is a $d\times N$ array of mixed signals.

In [7]:
def center_rows(Z):
    """
    Ensures each row has zero mean.
    Takes a d x N matrix and subtracts from each row the mean value of that row.
    """
    
    # keepdims=True keeps row_means as a d x 1 column, so it broadcasts across the columns of Z
    row_means = np.mean(Z, axis=1, keepdims=True)
    Zc = Z - row_means
    return Zc


def whiten_rows(Z):
    """
    Return the whitened version of Z and the transform matrix, Zw and T, where Zw = T @ Z.
    """
    
    # Covariance matrix, C
    C = np.cov(Z)
    
    # The following two statements compute T (inverse square root of C).
    U, S, _ = np.linalg.svd(C, full_matrices=False)
    T  = U @ np.diag(1 / np.sqrt(S)) @ U.T
    
    # Finally the whitened version of Z, Zw
    Zw = np.matmul(T,Z)
    
    return Zw, T
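
A quick way to check a whitening routine is to confirm that the covariance matrix of its output is the identity. The sketch below is illustrative only: `whiten_rows_sketch` is a hypothetical stand-in for `whiten_rows` that uses `np.linalg.eigh` instead of the SVD (for a symmetric positive definite matrix both give the same factorization), and the test data are an assumption.

```python
import numpy as np

def whiten_rows_sketch(Z):
    """Whiten centered rows: return Zw = T @ Z with T = C^(-1/2)."""
    C = np.cov(Z)
    # Eigendecomposition of the symmetric matrix C = V diag(eigvals) V^T
    eigvals, eigvecs = np.linalg.eigh(C)
    T = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return T @ Z, T

# Hypothetical correlated test data: a fixed mixing of Gaussian noise.
rng = np.random.default_rng(1)
mixing = np.array([[1.0, 0.5, 0.2], [0.3, 1.0, 0.4], [0.1, 0.6, 1.0]])
Z = mixing @ rng.standard_normal((3, 5000))
Z = Z - Z.mean(axis=1, keepdims=True)  # center the rows first

Zw, T = whiten_rows_sketch(Z)
cov_after = np.cov(Zw)  # should be close to the 3 x 3 identity
```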

## The main iteration - maximizing non-Gaussianity

In [8]:
def normalize_rownorms(Z):
    """
    Divide each row in Z by its Euclidean norm.
    
    The norm of each row in the output is equal to one.
    """
     
    e_norms = np.linalg.norm(Z, axis=1)
    Z_trans = Z.transpose()
    Z_norm_trans = Z_trans / e_norms
    Z_norm = Z_norm_trans.transpose()
    return Z_norm

In [9]:
def decorrelate_weights(W):
    """
    This is the orthogonalization step (or decorrelation step). The dxd input matrix W is projected onto an 
    orthogonal matrix by the transformation Wd = (WW^T)^{-1/2} W as described in the note. The single output 
    argument is the projected W-matrix (Wd).
    
    Uses a similar technique for computing the inverse square root as in the whitening step.
    """
    
    # Matrix product of W and its transposed
    WW_T = np.matmul(W,W.transpose())
    
    # Computes T, the inverse square root of WW_T
    U, S, _ = np.linalg.svd(WW_T, full_matrices=False)
    T = U @ np.diag(1 / np.sqrt(S)) @ U.transpose()
    
    # Finally computes the decorrelated matrix Wd
    Wd = np.matmul(T, W)

    return Wd
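
The result of the orthogonalization step can be verified by checking that $W_d W_d^T = I$. The following sketch does this for the same arbitrary test matrix that is used in the test cells of this notebook; `decorrelate_sketch` is a hypothetical, self-contained copy of the transformation above, included only so the example runs on its own.

```python
import numpy as np

def decorrelate_sketch(W):
    """Symmetric orthogonalization: Wd = (W W^T)^(-1/2) W."""
    # W W^T is symmetric positive definite when W is invertible,
    # so its SVD has the form U diag(S) U^T.
    U, S, _ = np.linalg.svd(W @ W.T)
    return U @ np.diag(1.0 / np.sqrt(S)) @ U.T @ W

W = np.array([[1.0, 4.0, 5.0], [6.0, 1.0, 3.0], [4.0, 9.0, 5.0]])
Wd = decorrelate_sketch(W)

# For an orthogonal matrix the Gram matrix is the identity.
gram = Wd @ Wd.T
```

Since the rows of `Wd` are orthonormal, each row also has Euclidean norm one.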


In [29]:
def update_W(W, Zcw):
    """
    Calculates W_k+1 from W_k.
    The input is W=W_k (d x d) as well as the centered, whitened data Zcw (dxN, known as tilde{x} in the note).
    Output is the new W (W_{k+1}).
    
    This function does the two iteration steps in the note: the optimisation step and the 
    orthogonalisation (decorrelation) step. The second step, the orthogonalisation, is provided by the 
    function decorrelate_weights which is called below.
    
    Uses either kurtosis or negentropy.
    """
    
    # Optimization step:
    s_k = np.dot(W, Zcw)
    
    # Kurtosis-based nonlinearity g(u) = 4u^3 and its derivative as lambda functions
    kurtosis = lambda u: 4*(u**3)
    kurtosis_d = lambda u: 12*(u**2)
    
    # Negentropy-based nonlinearity g(u) = u*exp(-u^2/2) and its derivative as lambda functions
    negentropy = lambda u: u * np.exp(-((u**2)/2))
    negentropy_d = lambda u: -np.exp(-((u**2)/2)) * (u**2 - 1)
    
    # Applies the kurtosis function and its derivative to each element of s_k
    G = kurtosis(s_k)
    G_d = kurtosis_d(s_k)
    
    # Alternative: applies the negentropy function and its derivative to each element of s_k
    #G = negentropy(s_k)
    #G_d = negentropy_d(s_k)
    
    # Number of samples N (columns of G_d)
    N = G_d.shape[1]
    
    # Computes W_k+1, here called W_p
    W_p = (1/N) * np.dot(G, Zcw.transpose()) - np.matmul(np.diag(np.average(G_d, axis=1)), W)
    
    # Normalizes rows of W_p (W_k+1)
    W_pn = normalize_rownorms(W_p)
    
    #########################################################################################
    
    # Orthogonalization step:
    W_pnd = decorrelate_weights(W_pn)

    return W_pnd
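
Written out row by row, the optimization step in `update_W` above computes the following, where $w_i^T$ is row $i$ of $W$, $\tilde{x}_n$ is column $n$ of `Zcw`, and $g$ is the chosen nonlinearity. The notation is an assumption made here and may differ from the one used in the project note.

$$
w_i^{+} = \frac{1}{N}\sum_{n=1}^{N} \tilde{x}_n \, g\!\left(w_i^T \tilde{x}_n\right) - \left(\frac{1}{N}\sum_{n=1}^{N} g'\!\left(w_i^T \tilde{x}_n\right)\right) w_i, \qquad i = 1, \dots, d
$$

$$
\text{kurtosis: } g(u) = 4u^3,\; g'(u) = 12u^2 \qquad\qquad \text{negentropy: } g(u) = u\,e^{-u^2/2},\; g'(u) = (1-u^2)\,e^{-u^2/2}
$$

Each row $w_i^{+}$ is then scaled to unit length before the orthogonalization step is applied to the whole matrix.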

In [30]:
def measure_of_convergence(W1, W2):
    """
    This function computes an error estimate for the maximisation iteration: the convergence
    criterion given in the note. 
    Input: W1 is the previous iterate, and W2 is the one just computed.
    Output: The quantity delta defined in the note.
    Typical numpy-functions to use: numpy.sum, numpy.absolute, numpy.amax.
    """
    
    # Absolute inner product between corresponding rows of W2 and W1
    a_s = np.absolute(np.sum(np.multiply(W2, W1), axis=1))
    # Largest deviation from 1 over all rows
    delta = np.amax(np.absolute(1-a_s))
    return delta
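
The behaviour of this convergence measure is easiest to see on small matrices with unit-norm rows. The sketch below is illustrative: `delta_sketch` is a hypothetical stand-alone version of the measure as implemented above, and the test matrices are assumptions. It shows that the measure is zero when the rows are unchanged or only change sign, and positive when a row is rotated.

```python
import numpy as np

def delta_sketch(W_old, W_new):
    """max_i |1 - |<w_i_new, w_i_old>||, assuming rows have unit norm."""
    inner = np.abs(np.sum(W_new * W_old, axis=1))
    return np.max(np.abs(1.0 - inner))

W = np.eye(3)
flipped = np.diag([1.0, -1.0, 1.0])  # same rows up to sign

theta = 0.1  # small rotation of the first two rows
rotated = np.array([
    [np.cos(theta), np.sin(theta), 0.0],
    [-np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])

delta_same = delta_sketch(W, W)           # rows unchanged
delta_flipped = delta_sketch(W, flipped)  # sign change only
delta_rotated = delta_sketch(W, rotated)  # equals 1 - cos(theta)
```

Ignoring the sign is deliberate: a row and its negative extract the same component, only with opposite polarity.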

In [31]:
import warnings

tol_default = 1e-10

def fast_ICA(Z, signals_to_find, tol=tol_default, max_iter=100):
    """ This is the function that organises all the work.
    
    Input: Z is the unprocessed data
           signals_to_find: in our case, always d the number of sources
           tol is the tolerance, default value 1.0e-10
           max_iter abort after max_iter iterations if not converged, (to avoid infinite loop)
    Output: Z_ica, the separated signals (dxN matrix, approximating the sources)
            W The final converged W-matrix (dxd)
            Also some other variables of interest can be returned if desired
    """
    
    # center the rows of Z
    Z_cent = center_rows(Z)

    # whiten the centered rows
    Z_cent_wit, T = whiten_rows(Z_cent)
    
    # Put W_0 = W to a random initial value and normalise the rows to length 1
    M = Z_cent.shape[0]
    W_0 = np.random.rand(M, M)
    W_0 = normalize_rownorms(W_0)
    
    # Initialise some variables to prepare for the while-loop (such as delta)
    delta = tol + 1
    number_of_iter = 0

    # while delta>tol and number_of_iterations < max_iter:
    #      do an iteration to get a new W-iterate 
    #      Compute the error estimate to update delta

    while delta > tol and number_of_iter < max_iter:
        W_p = update_W(W_0, Z_cent_wit)
        delta = measure_of_convergence(W_0, W_p)
        W_0 = W_p
        number_of_iter += 1
        
    # Clean up, check if converged or max_iter attained
    if delta > tol:
        warnings.warn(f"fast_ICA did not converge within {max_iter} iterations (delta = {delta})")
    else:
        print("Final delta: ", delta)
        print("Number of iterations: ", number_of_iter)
        
    # W_0 holds the latest iterate (and is defined even if the loop never ran)
    Z_ica = np.matmul(W_0, Z_cent_wit)
    return Z_ica, W_0
    


In [34]:
# Runs the algorithm and produces the separated audio signals!
# Remember to run the other cells first
Z_ica, W_ica = fast_ICA(data, 3)

Final delta:  4.668820885456171e-12
Number of iterations:  10


In [35]:
"""Here you can play the three loaded audio clips"""
import IPython.display as ipd

ipd.display(ipd.Audio(data[0,:], rate=sampling_rate))
ipd.display(ipd.Audio(data[1,:], rate=sampling_rate))
ipd.display(ipd.Audio(data[2,:], rate=sampling_rate))

In [36]:
"""Here you can play the three audio clips after they have been through the separation"""
import IPython.display as ipd

ipd.display(ipd.Audio(Z_ica[0,:], rate=sampling_rate))
ipd.display(ipd.Audio(Z_ica[1,:], rate=sampling_rate))
ipd.display(ipd.Audio(Z_ica[2,:], rate=sampling_rate))
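
The clips above are compared by ear. When the true sources are known, for example after mixing them yourself, the comparison can also be made numerically. ICA recovers sources only up to order, sign and scale, so the sketch below matches each source to the estimate with the largest absolute correlation. The function `match_components` and the test data are hypothetical and not part of the project code.

```python
import numpy as np

def match_components(sources, estimates):
    """For each source row, find the most correlated estimate row.

    Returns the matching indices and the d x d matrix of absolute correlations.
    """
    d = sources.shape[0]
    # np.corrcoef stacks the rows of both arrays; the upper-right block
    # holds the source-versus-estimate correlations.
    corr = np.corrcoef(sources, estimates)[:d, d:]
    abs_corr = np.abs(corr)
    return np.argmax(abs_corr, axis=1), abs_corr

rng = np.random.default_rng(2)
sources = rng.uniform(-1.0, 1.0, (3, 1000))

# Hypothetical "perfect" ICA output: permuted, rescaled and sign-flipped.
estimates = np.vstack([-2.0 * sources[2], 0.5 * sources[0], 3.0 * sources[1]])

best, abs_corr = match_components(sources, estimates)
```

Here `best[i]` is the row of `estimates` that corresponds to source `i`; an absolute correlation close to 1 indicates a good separation.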

<br>
<br>
<br>
<br>
<h1>The cells below contain code that was used to test the code along the way</h1>

In [None]:
#Test above functions

#Test of center_rows:
Z = np.array([[1,0,1,0],[0,1,1,1],[3,0,1,-3]]) # arbitrary test matrix; its rows stay linearly independent after centering, so the covariance matrix is invertible
Zc = center_rows(Z)

print('Zc: \n', Zc, '\n\n')
print('Vector with mean of rows in Zc: \n', np.mean(Zc, axis=1), '\n')

#Test of whiten_rows:
Zcw, T = whiten_rows(Zc)

print('Zcw: \n', Zcw, '\n\n')
print('T: \n', T)

Z_d_w, T = whiten_rows(data)
print(Z_d_w.shape)

In [None]:
#Test of normalize_rownorms:
Z_norm = normalize_rownorms(Z)

print('Z_norm: \n',Z_norm, '\n')
print('Vector with row norms: \n', np.linalg.norm(Z_norm, axis=1))

In [None]:
#Test of decorrelate_weights:
W = np.array([[1,4,5],[6,1,3],[4,9,5]]) # An arbitrary dxd matrix
Wd = decorrelate_weights(W)

print('Wd: \n', Wd, '\n\n')
print(np.sum(Wd, axis=0), '\n')
print('cov-matrix of Wd: \n', np.cov(Wd), '\n')
print('cov-matrix of W: \n', np.cov(W))

In [None]:
#Test of update_W
W_p = update_W(W, Zcw)

print(W_p)

In [None]:
#Test of measure_of_convergence
W1 = np.array([[4,5,9],[80,2,5],[2,2,1]])
W2 = np.array([[6,3,7], [9,3,6], [5,3,1]])
print(measure_of_convergence(W1, W2))
print(np.multiply(W2, W1))
print(np.sum(np.multiply(W2, W1), axis = 1))