# Correlation
Correlation or “Co-Relation” is a measure of similarity/relationship between two signals.
If $x[n]$ and $h[n]$ are two discrete-time signals, then the correlation of $x[n]$ with respect to $h[n]$ is given as:
$$ r[i] = \sum_{j=0}^{M-1}{x[j]h[j-i]} $$

We can say that *"Correlation, mathematically, is just convolution with the second sequence time-reversed"*.
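As a quick sanity check of this statement, the sketch below evaluates the sum from the definition directly and compares it with a convolution against the time-reversed second sequence. The helper `direct_correlation` and the arrays `x_demo` and `h_demo` are illustrative names that are not part of the notebook, and NumPy's `np.convolve` is used here instead of the `Convolve` class from the previous notebook.

```python
import numpy as np


def direct_correlation(x, h):
    """Evaluate r[i] = sum_j x[j] * h[j - i] for every lag i where the signals overlap."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    r = []
    # Lags run from -(len(h) - 1) up to len(x) - 1, as in 'full' mode.
    for i in range(-(len(h) - 1), len(x)):
        total = 0.0
        for j in range(len(x)):
            if 0 <= j - i < len(h):
                total += x[j] * h[j - i]
        r.append(total)
    return np.array(r)


# Illustrative signals (hypothetical, chosen only for this check).
x_demo = np.array([1.0, 2.0, 3.0])
h_demo = np.array([0.0, 1.0, 0.5])

r_direct = direct_correlation(x_demo, h_demo)
r_conv = np.convolve(x_demo, h_demo[::-1])      # convolution with h time-reversed
r_numpy = np.correlate(x_demo, h_demo, 'full')  # NumPy's own correlation

print(r_direct)
print(r_conv)
print(r_numpy)
```

All three arrays should agree, which is why the `Correlation` class can be built on top of a convolution routine.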

## Exercise:
In this notebook you will implement a `Correlation` class with some methods that will help you identify similarity between signals. To implement your `Correlation` class you will use the `Convolve` class developed in the previous notebook. After that, you will use your `Correlation` class to see how Barker codes work.

In [None]:
import sys
sys.path.insert(0, '../../../')

import numpy as np
import matplotlib.pyplot as plt

from Common import common_plots
from Common import convolution
cplots = common_plots.Plot()
convolve = convolution.Convolve()

### 1. Create a `Correlation` class
First you will create a `Correlation` class that will have the following methods:
1. `correlation` which calculates the correlation of two signals, $x[n]$ and $h[n]$
2. `auto_corr` which calculates the auto correlation of a given signal $x[n]$
3. `norm_correlation` which calculates the normalized correlation of two signals, $x[n]$ and $h[n]$
4. `norm_auto_corr` which calculates the normalized auto correlation of a given signal $x[n]$
5. `delay`, an auxiliary function that calculates the time delay of $x[n]$ with respect to $h[n]$ based on the correlation between both signals.
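For reference, the quantities returned by the first four methods can be written compactly as follows. This only restates the docstrings of the class skeleton; $*$ denotes convolution, the maxima are taken over all lags, and the hat notation for the normalized versions is an assumption introduced here for readability:

$$ r_{xh}[n] = x[n] * h[-n], \qquad r_{xx}[n] = x[n] * x[-n] $$

$$ \hat{r}_{xh}[n] = \frac{r_{xh}[n]}{\sqrt{\max_n r_{xx}[n] \, \max_n r_{hh}[n]}}, \qquad \hat{r}_{xx}[n] = \frac{r_{xx}[n]}{\max_n r_{xx}[n]} $$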

You will have to use the `Convolve` class and be able to select between the three types of convolutions: `conv1d`, `convolve_input_algorithm`, and `convolve_output_algorithm`.

A good resource to check is this [link](http://host.uniroma3.it/laboratori/sp4te/teaching/sp4bme/documents/LectureCorrelation.pdf).

In [None]:
from Common import convolution


class Correlation():
    
    def __init__(self):
        self.convolve = convolution.Convolve()
        pass
    
    
    def correlation(self, x, h, algorithm='output'):
        """ 
        Function that finds the correlation of an input signal x with an impulse response h.
        Parameters: 
        x (numpy array): Array of numbers representing the input signal to be correlated.
        h (numpy array): Array of numbers representing the impulse response of a filter or signal.
        algorithm (string): String that selects the algorithm to use for finding the convolution.
                            Can be `fast` if `conv1d` function is used, `input` if `convolve_input_algorithm`
                            is used, and `output` if `convolve_output_algorithm` is used. Default value is
                            `output`.

        Returns: 
        numpy array: Returns correlation r_xh[n]=x[n]*h[-n].

        """
        
        #SOLVE IN HERE
        
        if (algorithm == 'fast'):
            pass
            
        elif (algorithm == 'input'):
            pass
            
        elif (algorithm == 'output'):
            pass
            
        pass
    
    
    def auto_corr(self, x, algorithm='output'):
        """ 
        Function that finds the auto correlation of an input signal x.
        Parameters: 
        x (numpy array): Array of numbers representing the input signal to be auto correlated.
        algorithm (string): String that selects the algorithm to use for finding the convolution.
                            Can be `fast` if `conv1d` function is used, `input` if `convolve_input_algorithm`
                            is used, and `output` if `convolve_output_algorithm` is used. Default value is
                            `output`.

        Returns: 
        numpy array: Returns auto correlation r_xx[n]=x[n]*x[-n].

        """
        
        #SOLVE IN HERE
        pass
    
    
    def norm_correlation(self, x, h, algorithm='output'):
        """ 
        Function that finds the normalized correlation of an input signal x with an impulse response h.
        Parameters: 
        x (numpy array): Array of numbers representing the input signal to be correlated.
        h (numpy array): Array of numbers representing the impulse response of a filter or signal.
        algorithm (string): String that selects the algorithm to use for finding the convolution.
                            Can be `fast` if `conv1d` function is used, `input` if `convolve_input_algorithm`
                            is used, and `output` if `convolve_output_algorithm` is used. Default value is
                            `output`.

        Returns: 
        numpy array: Returns normalized correlation y[n]=r_xh[n]/(sqrt(max(r_xx[n])*max(r_hh[n]))).

        """
        
        #SOLVE IN HERE
        pass
    
    
    def norm_auto_corr(self, x, algorithm='output'):
        """ 
        Function that finds the normalized auto correlation of an input signal x.
        Parameters: 
        x (numpy array): Array of numbers representing the input signal to be auto correlated.
        algorithm (string): String that selects the algorithm to use for finding the convolution.
                            Can be `fast` if `conv1d` function is used, `input` if `convolve_input_algorithm`
                            is used, and `output` if `convolve_output_algorithm` is used. Default value is
                            `output`.

        Returns: 
        numpy array: Returns normalized auto correlation y[n]=r_xx[n]/max(r_xx[n]).

        """
        
        #SOLVE IN HERE
        pass

    
    def delay(self):
        """ 
        Function that finds the lag between a signal x[n] with respect to the filter or signal h[n].
        Before invoking this function, self.correlation() must be invoked.
        Parameters: 
        None

        Returns: 
        numpy value: Returns negative difference between maximum correlation index and (filter length - 1).


        """
        
        #SOLVE IN HERE
        pass
        

Now it is time to test your class. In order to do so, you will compare the correlation between $a[n]$, $b[n]$, and $c[n]$, which are given as:

In [None]:
a = np.array([[1, 2, 3, 4, 3, 2, 1]]).T
b = np.array([[4, 8, 12, 16, 12, 8, 4]]).T
c = np.array([[8, 8, 8, 8, 8, 8, 8]]).T

Now instantiate your `Correlation` class as an object named `corr`.

In [None]:
#SOLVE IN HERE

Test your `fast`, `output`, and `input` implementations for the `correlation` method:

In [None]:
#SOLVE IN HERE

The expected result should be:

In [None]:
print(np.correlate(a.reshape(-1),b.reshape(-1), 'full'))

Test your `fast`, `output`, and `input` implementations for the `norm_correlation` method:

In [None]:
#SOLVE IN HERE

The expected result should be:

In [None]:
r_aa = np.correlate(a.reshape(-1),a.reshape(-1), 'full')
r_bb = np.correlate(b.reshape(-1),b.reshape(-1), 'full')
r_ab = np.correlate(a.reshape(-1),b.reshape(-1), 'full')
print(r_ab/np.sqrt(r_aa.max()*r_bb.max()))

As you can see, all three algorithms produce the same results, which is what we expected.

For a fair comparison between signals of different amplitudes, it is better to use a normalized correlation. To do so, we will use our `norm_correlation` method.

We will use the `norm_correlation` method to compare $a[n]$ with respect to $b[n]$, and $a[n]$ with respect to $c[n]$ as follows:

In [None]:
#SOLVE IN HERE

In [None]:
plt.rcParams["figure.figsize"] = (13,5)

plt.subplot(1,2,1)
cplots.plot_single(norm_corr_a_b.T, 'Normalized Correlation between a[n] and b[n]')
plt.subplot(1,2,2)
cplots.plot_single(norm_corr_a_c.T, 'Normalized Correlation between a[n] and c[n]')
plt.tight_layout(pad=3.0)

As we can see, the correlation between $a[n]$ and $b[n]$ is slightly higher than the one between $a[n]$ and $c[n]$. That is because $b[n]$ is just a scaled version of $a[n]$, whereas $c[n]$ is a train of constant-amplitude pulses.
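To put numbers on this comparison, the following sketch computes the peak of the normalized correlation with NumPy only. The helper `normalized_peak` and the `*_demo` arrays are hypothetical names used for illustration, and the normalization follows the formula given in the `norm_correlation` docstring.

```python
import numpy as np


def normalized_peak(x, h):
    """Peak of r_xh / sqrt(max(r_xx) * max(r_hh)), using np.correlate in 'full' mode."""
    r_xh = np.correlate(x, h, 'full')
    r_xx = np.correlate(x, x, 'full')
    r_hh = np.correlate(h, h, 'full')
    return (r_xh / np.sqrt(r_xx.max() * r_hh.max())).max()


# Same values as a[n], b[n], and c[n], stored as 1-D arrays for np.correlate.
a_demo = np.array([1, 2, 3, 4, 3, 2, 1])
b_demo = np.array([4, 8, 12, 16, 12, 8, 4])
c_demo = np.array([8, 8, 8, 8, 8, 8, 8])

peak_ab = normalized_peak(a_demo, b_demo)
peak_ac = normalized_peak(a_demo, c_demo)
print(peak_ab, peak_ac)
```

The peak for the scaled copy is 1, while the peak for the constant signal is roughly 0.91, so the normalization removes the effect of amplitude and leaves only the similarity in shape.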

Now let's see how we can use the `delay` method you've developed. For this we have the signals $x[n]$, $y[n]$, and $z[n]$:

In [None]:
x = np.array([[1, 2, 3, 4, 3, 2, 1]]).T 
y = np.array([[0, 0, 0, 1, 2, 3, 4, 3, 2, 1]]).T
z = np.array([[1, 2, 3, 4, 3, 2, 1]]).T

In [None]:
plt.rcParams["figure.figsize"] = (15,5)

plt.subplot(1,3,1)
cplots.plot_single(x.T, 'x[n]')
plt.ylim((-1,5))
plt.subplot(1,3,2)
cplots.plot_single(y.T, 'y[n]')
plt.ylim((-1,5))
plt.subplot(1,3,3)
cplots.plot_single(z.T, 'z[n]')
plt.ylim((-1,5));

From the figures you can see the following:
1. $x[n-3]=y[n]$
2. $x[n]=z[n]$
3. $y[n+3]=z[n]$

Using the `delay` function we can find this lag value:

In [None]:
corr.correlation(x,y)
print('x[n-{}] = y[n]'.format(corr.delay()))

In [None]:
corr.correlation(x,z)
print('x[n-{}] = z[n]'.format(corr.delay()))

In [None]:
corr.correlation(y,z)
print('y[n-{}] = z[n]'.format(corr.delay()))

As you can see, the `delay` function returns a **positive value for a right shift** and a **negative value for a left shift**.
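The same sign convention can be reproduced with NumPy alone. This is a minimal sketch that follows the rule stated in the `delay` docstring (negative difference between the index of the correlation peak and the filter length minus one); `lag_from_correlation` and the `*_demo` arrays are illustrative names, not part of the `Correlation` class.

```python
import numpy as np


def lag_from_correlation(x, h):
    """Return -(argmax(r_xh) - (len(h) - 1)) for the 'full' correlation of x with h."""
    r_xh = np.correlate(x, h, 'full')
    return -(int(np.argmax(r_xh)) - (len(h) - 1))


# Same values as x[n], y[n], and z[n], stored as 1-D arrays.
x_demo = np.array([1, 2, 3, 4, 3, 2, 1])
y_demo = np.array([0, 0, 0, 1, 2, 3, 4, 3, 2, 1])
z_demo = np.array([1, 2, 3, 4, 3, 2, 1])

lag_xy = lag_from_correlation(x_demo, y_demo)
lag_xz = lag_from_correlation(x_demo, z_demo)
lag_yz = lag_from_correlation(y_demo, z_demo)
print(lag_xy, lag_xz, lag_yz)
```

The three lags come out as 3, 0, and -3, matching the relations $x[n-3]=y[n]$, $x[n]=z[n]$, and $y[n+3]=z[n]$ listed earlier.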

### 2. Barker Code
Now it is time to see an application of correlation in a real-life example. In this case, we use Barker codes. Barker codes are binary sequences of 2 to 13 symbols with a distinctive auto-correlation function: away from the peak, its magnitude never exceeds 1. For the 13-symbol code used here, the points adjacent to the peak are equal to zero. This is very useful in a radar system, since any spurious response can be misinterpreted as a target. A Barker-coded pulse typically uses binary phase modulation. By inserting a Barker code between two BPSK data blocks, it is possible to detect the end of one block and the start of the next. In this part you will test your `Correlation` class and see how we can use the Barker code to detect the start of a BPSK data block.
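The auto-correlation property can be checked directly on the 13-symbol Barker code used later in this notebook. The sketch below relies on `np.correlate` rather than on your `Correlation` class, and the variable names are illustrative.

```python
import numpy as np

# 13-symbol Barker code (same values as `barker_code` later in the notebook).
barker_13 = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1])

acf = np.correlate(barker_13, barker_13, 'full')

peak_index = len(barker_13) - 1           # zero-lag position in 'full' mode
peak = acf[peak_index]
sidelobes = np.delete(acf, peak_index)    # every value except the main peak

print('peak:', peak)
print('largest sidelobe magnitude:', np.abs(sidelobes).max())
print('neighbours of the peak:', acf[peak_index - 1], acf[peak_index + 1])
```

The peak equals the code length (13), every sidelobe has magnitude at most 1, and the two values next to the peak are zero, which is what makes the code easy to locate inside a longer stream.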

First, we create an auxiliary function called `generate_bpsk_data` whose purpose is to create some dummy BPSK data.

In [None]:
def generate_bpsk_data(size=100, threshold=50):
    """ 
        Function that generates a bpsk block code of variable size where the percentage of samples being 
        equal to -1 is given by threshold, and the percentage of samples being equal to 1 is given by 
        (100 - threshold).
        
        Parameters: 
        size (int): Size of the bpsk random block generated.
        threshold (int): Percentage of samples being equal to -1.

        Returns: 
        numpy array: Returns bpsk block code of variable size with values equal to either -1 or 1.

        """
    data = (np.random.rand(size,1)*100).astype('int')
    data[data<threshold]=-1
    data[data>=threshold]=1
    return data

We use our `generate_bpsk_data` function and create a data block of [64-symbols][13-symbols][128-symbols]. The 64-symbols and 128-symbols represent data being transmitted, the 13-symbols represent the Barker code inserted.

In [None]:
np.random.seed(123)

#SOLVE IN HERE
blk_1_len = None
blk_2_len = None

block_1 = generate_bpsk_data(blk_1_len, blk_1_len/2)
block_2 = generate_bpsk_data(blk_2_len,blk_2_len/2)
barker_code = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]).reshape(-1,1)

temp = np.append(block_1, barker_code, axis=0)
block = np.append(temp, block_2, axis=0)

cplots.plot_single(block.T)

Now let's use our `correlation` or `norm_correlation` method to detect the inserted Barker code:

In [None]:
barker_corr = corr.correlation(block, barker_code).argmax()
print('Barker correlation found at {}'.format(barker_corr))
print('{} symbols + {} Barker symbols = {}'.format(blk_1_len, 
                                                   barker_code.shape[0],blk_1_len+ barker_code.shape[0]))

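Because positions are counted from $0$, the correlation peak falls on the last symbol of the Barker code, one position before the sum printed above, so the Barker correlation correctly locates the start of the new BPSK data block.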
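The index arithmetic behind this result can be illustrated with a NumPy-only sketch. It assumes block lengths of 64 and 128 symbols, as described earlier, and draws the dummy data with `np.random.default_rng` instead of `generate_bpsk_data`, so the random symbols differ from the ones in the notebook; all names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(123)

# Hypothetical stand-ins for the two BPSK data blocks (values are -1 or 1).
data_1 = rng.choice([-1, 1], size=64)
data_2 = rng.choice([-1, 1], size=128)
barker_13 = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1])

frame = np.concatenate([data_1, barker_13, data_2])
r = np.correlate(frame, barker_13, 'full')

# In 'full' mode, the index where the code lines up exactly with its copy in
# the frame is the index of the LAST Barker symbol in the frame.
aligned_index = len(data_1) + len(barker_13) - 1

print('correlation at the aligned index:', r[aligned_index])
print('maximum correlation:', r.max())
print('next data block starts at:', aligned_index + 1)
```

At the aligned index the correlation reaches 13, the largest value a 13-symbol code of $\pm 1$ can produce against $\pm 1$ data, so the new data block starts one position after that index.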

In [None]:
cplots.plot_single(corr.correlation(block, barker_code).T)

By using the `delay` function we can find the starting position too. (Remember that a left shift gives a negative value.)

In [None]:
print('Start of new bpsk block at: {}'.format(barker_code.shape[0]-corr.delay()))

## Exercise: Create your own Correlation class

Finally you will save your `Correlation` class into a file called `correlation.py` in the `Common` folder.