# MA2501 - template for code in project 2
This is a suggestion for how you can structure the Python code.

## Preparations
We must first import some data. The code below assumes that the sound files containing the signals are located at `audio/mix_1.wav`, `audio/mix_2.wav`, and `audio/mix_3.wav`. You will find these files in the archive `Project2.zip`, together with the Python file `wav_file_loader.py`, which is imported in the code below.

In [1]:
"""In this cell we load three sound clips which we will use to test the algorithm"""
import numpy as np
from wav_file_loader import read_wavefiles

paths = ['audio/mix_1.wav', 'audio/mix_2.wav', 'audio/mix_3.wav']
data, sampling_rate = read_wavefiles(paths)
num_signals = data.shape[0]



In [8]:
"""In this cell we define a function which normalizes the sound signal volume, 
so they get approximately the same strength"""
def normalize_audio(data):
    """Scale each row s.t. max(abs(data[i])) == 1."""
    abs_data = np.absolute(data)
    # Keep the reduced axis so the result has shape (num_rows, 1):
    maximums = np.amax(abs_data, axis=1, keepdims=True)
    # Divide each row by its own maximum (broadcast over the columns):
    data = data / maximums
    return data

data = normalize_audio(data)

In [13]:
"""Here you can play the three different sound clips"""
import IPython.display as ipd

ipd.display(ipd.Audio(data[0,:], rate=sampling_rate))
ipd.display(ipd.Audio(data[1,:], rate=sampling_rate))
ipd.display(ipd.Audio(data[2,:], rate=sampling_rate))

## Mixing
For now you can ignore this part about mixing, including the cell below. Later you can test the algorithm by mixing your own sound clips using the functions below.

In [10]:
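def normalize_rowsums(A):
    """Divide each row in A by its sum.
    
    The sum of each row in the result is 1.0."""
    # Keep the reduced axis so the division broadcasts row by row,
    # whatever the number of rows:
    the_sum = np.sum(A, axis=1, keepdims=True)
    A = A / the_sum
    return A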

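def random_mixing_matrix(signals, observations):
    """Create a random mixing matrix of shape (observations, signals).
    
    Entries are drawn from [0.25, 1.25) and each row is then scaled to sum to 1,
    so every element is a small positive number, not too close to 0.
    With three signals, each element lies in the interval (1/11, 5/7).
    """
    A = 0.25 + np.random.rand(observations, signals)
    return normalize_rowsums(A)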


In [11]:
A = random_mixing_matrix(num_signals, num_signals)
data_mixed = normalize_audio(A @ data)

In [12]:
import IPython.display as ipd

ipd.display(ipd.Audio(data_mixed[0,:], rate=sampling_rate))
ipd.display(ipd.Audio(data_mixed[1,:], rate=sampling_rate))
ipd.display(ipd.Audio(data_mixed[2,:], rate=sampling_rate))

## Preprocessing
In the following cell you should write the functions <font color='blue'> center_rows </font> and <font color='blue'> whiten_rows </font>, which perform the preprocessing described in the project description. The variable Z is a $d\times N$ array of mixed signals.

In [7]:
def center_rows(Z):
    """Ensures each row has zero mean.
    
    Takes a matrix of arbitrary shape and subtracts from each row the mean value of that row."""
    
    # Your code goes here. Return the dxN-matrix Zc, in which each row has zero mean,
    # together with mus, the row means that were subtracted.
    return Zc, mus

def whiten_rows(Z):
    """Return the whitened version Zw of Z and the matrix T of the transform, where Zw = T @ Z.
    
    """
    # Your code goes here.
    # Hint: The covariance matrix can be obtained with the function cov in numpy; call it C.
    # The following two statements compute T (the inverse square root of C).
    # U, S, _ = np.linalg.svd(C, full_matrices=False)
    # T = U @ np.diag(1 / np.sqrt(S)) @ U.T
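
The preprocessing hinted at above can be sketched as follows. This is only one possible way to fill in the two functions, not the reference solution; the names `center_rows_sketch`, `whiten_rows_sketch`, and the synthetic `Z_demo` are made up for illustration, and the project description remains the authority on what is required.

```python
import numpy as np

def center_rows_sketch(Z):
    """Subtract from each row its mean; return the centred data and the means."""
    mus = np.mean(Z, axis=1, keepdims=True)  # shape (d, 1)
    return Z - mus, mus

def whiten_rows_sketch(Zc):
    """Return (Zw, T) with Zw = T @ Zc, where T is the inverse square root of cov(Zc)."""
    C = np.cov(Zc)  # (d, d) covariance matrix; each row of Zc is one variable
    U, S, _ = np.linalg.svd(C, full_matrices=False)
    T = U @ np.diag(1.0 / np.sqrt(S)) @ U.T
    return T @ Zc, T

# Hypothetical test data: three uniform signals, mixed and shifted away from zero.
rng = np.random.default_rng(0)
A_demo = np.array([[1.0, 0.5, 0.2],
                   [0.3, 1.0, 0.4],
                   [0.2, 0.6, 1.0]])
Z_demo = A_demo @ rng.uniform(-1.0, 1.0, size=(3, 2000)) + 5.0

Zc_demo, mus_demo = center_rows_sketch(Z_demo)
Zw_demo, T_demo = whiten_rows_sketch(Zc_demo)
```

A quick sanity check is that `np.cov(Zw_demo)` is close to the identity matrix and that every row of `Zc_demo` has mean close to zero.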
    

## Main iteration - maximization of non-gaussianity

In [9]:
def normalize_rownorms(Z):
    """Divide each row in Z by its Euclidean norm.
    
    The norm of each row in the output is equal to one.
    
    Your code goes below. You need to compute the Euclidean norm of each row of the matrix Z
    and then scale each row by this norm.
    """
    




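Row normalisation can be sketched with `np.linalg.norm` and broadcasting. The name `normalize_rownorms_sketch` and the small example matrix are hypothetical; this is one possible approach rather than the expected solution.

```python
import numpy as np

def normalize_rownorms_sketch(Z):
    """Scale each row of Z to unit Euclidean norm."""
    # keepdims=True gives shape (d, 1), so the division broadcasts row by row.
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z / norms

# The rows (3, 4) and (0, 2) have norms 5 and 2 respectively.
W_demo = normalize_rownorms_sketch(np.array([[3.0, 4.0],
                                             [0.0, 2.0]]))
# W_demo is [[0.6, 0.8], [0.0, 1.0]]
```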
In [5]:
def decorrelate_weights(W):
    """This is the orthogonalization (or decorrelation) step. The dxd input matrix W is projected onto an
    orthogonal matrix by the transformation Wd = (WW^T)^{-1/2} W as described in the note. The single output
    argument is the projected W-matrix (Wd).
    Hint: Use a similar technique for the inverse square root as in the whitening step.
    
    Your code goes here      """
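
A minimal sketch of the orthogonalisation, assuming the transformation is $W_d = (WW^T)^{-1/2}W$ and reusing the SVD-based inverse square root from the whitening hint. The name `decorrelate_weights_sketch` and the example matrix are hypothetical.

```python
import numpy as np

def decorrelate_weights_sketch(W):
    """Project W onto an orthogonal matrix via Wd = (W W^T)^(-1/2) W."""
    # W @ W.T is symmetric positive definite when W is invertible,
    # so its SVD yields the inverse square root directly.
    U, S, _ = np.linalg.svd(W @ W.T, full_matrices=False)
    inv_sqrt = U @ np.diag(1.0 / np.sqrt(S)) @ U.T
    return inv_sqrt @ W

# Hypothetical invertible input (its determinant is 7).
W_demo = np.array([[2.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0],
                   [1.0, 0.0, 3.0]])
Wd_demo = decorrelate_weights_sketch(W_demo)
```

If the step works, `Wd_demo @ Wd_demo.T` is the identity matrix up to rounding errors.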
    



In [10]:
def update_W(W, Zcw):
    """Calculates W_k+1 from W_k.
    So the input is W=W_k (d x d) as well as the centered, whitened data Zcw (dxN known as tilde{x} in the note)
    Output is the new W (W_{k+1}).
    
    This function does the two iteration steps in the note: The optimisation step and the 
    orthogonalisation (decorrelation) step. The first step you need to code, the orthogonalisation is already
    provided by the function decorrelate_weights that needs to be called.
    You can use the kurtosis version, i.e. G(u)=4*(u**3) and its derivative. Don't include the while-loop in 
    this function
    """
    

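One way to sketch a single iteration is shown below. It assumes that the optimisation step in the note is the standard FastICA fixed-point update, $W^+ = \frac{1}{N} g(W\tilde{x})\tilde{x}^T - \mathrm{diag}(\overline{g'(W\tilde{x})})\,W$, with the kurtosis choice $g(u) = 4u^3$ and $g'(u) = 12u^2$; check this against the formula in the note before relying on it. All names ending in `_sketch` or `_demo`, and the helper `symmetric_decorrelation`, are hypothetical.

```python
import numpy as np

def symmetric_decorrelation(W):
    """Orthogonalisation step: (W W^T)^(-1/2) W."""
    U, S, _ = np.linalg.svd(W @ W.T, full_matrices=False)
    return U @ np.diag(1.0 / np.sqrt(S)) @ U.T @ W

def update_W_sketch(W, Zcw):
    """One iteration: optimisation step followed by orthogonalisation."""
    N = Zcw.shape[1]
    Y = W @ Zcw                 # current source estimates, shape (d, N)
    g = 4.0 * Y**3              # kurtosis-based nonlinearity (assumed form)
    g_prime = 12.0 * Y**2       # its derivative
    # Assumed fixed-point update: row i is mean(g(y_i) * z) - mean(g'(y_i)) * w_i.
    W_plus = (g @ Zcw.T) / N - np.diag(g_prime.mean(axis=1)) @ W
    return symmetric_decorrelation(W_plus)

# Hypothetical data: uniform samples with approximately unit variance per row.
rng = np.random.default_rng(1)
Z_demo = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(3, 5000))
W0_demo = symmetric_decorrelation(rng.standard_normal((3, 3)))
W1_demo = update_W_sketch(W0_demo, Z_demo)
```

The while-loop is deliberately left out, so the function can be called once per iteration from the driver routine.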
In [11]:
def measure_of_convergence(W1, W2):
    """Compute an error estimate for the maximisation iteration, i.e. the convergence
    criterion given in the note.
    Input: W1 is the previous iterate, and W2 is the one just computed.
    Output: The quantity delta defined in the note.
    Typical numpy-functions to use: numpy.sum, numpy.absolute, numpy.amax.
    
    Your code goes here:
    """
     

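The definition of delta is given in the note and is not repeated here. The sketch below assumes a common FastICA criterion, $\delta = \max_i \left(1 - |\langle w_i^{(k)}, w_i^{(k+1)}\rangle|\right)$, which uses exactly the three hinted functions; if the note defines delta differently, follow the note. The function name and the example matrices are hypothetical.

```python
import numpy as np

def measure_of_convergence_sketch(W1, W2):
    """Assumed criterion: max over rows of 1 - |<row of W1, row of W2>|."""
    row_dots = np.sum(W1 * W2, axis=1)  # inner product of corresponding rows
    return np.amax(1.0 - np.absolute(row_dots))

I2 = np.eye(2)
rot90 = np.array([[0.0, 1.0],
                  [-1.0, 0.0]])

delta_same = measure_of_convergence_sketch(I2, I2)      # identical iterates
delta_flip = measure_of_convergence_sketch(I2, -I2)     # rows differ only in sign
delta_rot = measure_of_convergence_sketch(I2, rot90)    # rows are orthogonal
```

With unit-norm rows, this quantity is 0 when corresponding rows agree up to sign and 1 when they are orthogonal, so here `delta_same` and `delta_flip` are 0 and `delta_rot` is 1.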
In [12]:
import warnings


def fast_ICA(Z, signals_to_find, tol=1e-10, max_iter=100):
    """ This is the function that organises all the work.
    
    Input: Z: the unprocessed data
           signals_to_find: in our case always d, the number of sources
           tol: the tolerance, default value 1.0e-10
           max_iter: abort after max_iter iterations if not converged (to avoid an infinite loop)
    Output: Z_ica: the separated signals (dxN matrix, approximating the sources)
            W: the final converged W-matrix (dxd)
            Some other variables of interest can also be returned if desired
    """
    # center the rows of Z
    # whiten the centered rows
    # Put W_0 = W to a random initial value and normalise the rows to length 1
    # Initialise some variables to prepare for the while-loop (such as delta)
    # while delta>tol and number_of_iterations < max_iter:
    #      do an iteration to get a new W-iterate 
    #      Compute the error estimate to update delta
    # Clean up, check if converged or max_iter attained
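
The outline in the comments can be turned into a compact driver, sketched here as a standalone function. It rests on two assumptions that should be checked against the note: the standard kurtosis-based FastICA update, and delta taken as the largest value of 1 - |row inner product| between consecutive iterates. The names `fast_ica_sketch`, `inv_sqrt`, the `seed` parameter, and the synthetic test signals are all made up for this illustration.

```python
import warnings
import numpy as np

def inv_sqrt(C):
    """Inverse square root of a symmetric positive definite matrix via SVD."""
    U, S, _ = np.linalg.svd(C, full_matrices=False)
    return U @ np.diag(1.0 / np.sqrt(S)) @ U.T

def fast_ica_sketch(Z, tol=1e-10, max_iter=100, seed=0):
    d, N = Z.shape
    Zc = Z - Z.mean(axis=1, keepdims=True)          # center the rows
    Zcw = inv_sqrt(np.cov(Zc)) @ Zc                 # whiten the centered rows
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((d, d))                 # random W_0 ...
    W /= np.linalg.norm(W, axis=1, keepdims=True)   # ... with unit-length rows
    delta, n_iter = np.inf, 0
    while delta > tol and n_iter < max_iter:
        Y = W @ Zcw
        # Assumed optimisation step with g(u) = 4u^3 and g'(u) = 12u^2.
        W_plus = (4.0 * Y**3) @ Zcw.T / N - np.diag((12.0 * Y**2).mean(axis=1)) @ W
        W_new = inv_sqrt(W_plus @ W_plus.T) @ W_plus  # orthogonalisation step
        # Assumed convergence measure between consecutive iterates.
        delta = np.amax(1.0 - np.absolute(np.sum(W_new * W, axis=1)))
        W, n_iter = W_new, n_iter + 1
    if delta > tol:
        warnings.warn("fast_ica_sketch stopped at max_iter without converging")
    return W @ Zcw, W, n_iter

# Hypothetical sources: a sine wave, a square wave, and uniform noise.
t = np.linspace(0.0, 1.0, 4000)
sources = np.vstack([np.sin(2 * np.pi * 5 * t),
                     np.sign(np.sin(2 * np.pi * 3 * t)),
                     np.random.default_rng(2).uniform(-1.0, 1.0, t.size)])
A_demo = np.array([[0.5, 0.3, 0.2],
                   [0.2, 0.5, 0.3],
                   [0.3, 0.2, 0.5]])
Z_ica_demo, W_demo, iters_demo = fast_ica_sketch(A_demo @ sources)
```

The recovered rows of `Z_ica_demo` approximate the sources only up to order, sign, and scale, so compare them by listening or plotting rather than by direct subtraction.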
    
