# TASK:

   You have been provided with an example EEG dataset from one of our mother-infant hyperscanning tasks.  In this task, we were interested in assessing social learning. Mothers were given two novel and ambiguous objects to teach their infants about. 
    
   For one object, mothers were asked to demonstrate strong liking by, for example, saying  **This one is very nice, I really like it**   whilst smiling.  For the other object, they were asked to demonstrate strong dislike/disgust, for example, saying  **This one is nasty, I don’t like it**  whilst frowning. 

   Infants were then given a choice of the two objects and we recorded which one infants selected to play with first. This task was repeated several times with different pairs of objects.

   In the example data from one mother-infant pair, the infant completed 8 trials in total and their object selection for each trial is given in the ‘Response’ variable, where 1 indicates selection of the positively-modelled object and 2 indicates selection of the negatively-modelled object.

   The respective EEG signals from mother and infant during the object demonstration phase for each trial are given in the variables ‘EEG_mother’ and ‘EEG_infant’ respectively. The first column contains data collected during the positive object demonstration and the second column contains data collected during the negative object demonstration. 

   Rows indicate separate trials. Each cell contains n samples x 32 channels. The sampling rate is 200 Hz and n varies from trial to trial because only periods when the infant is attentively looking at her mother have been included.  
   
   The names and 3D locations of the 32 channels are identical for mother and infant, and are provided as ‘Channames’ and ‘Chanlocs’ respectively. No artifact rejection has been performed on the data.

* Analyse the data in a way that would allow you to assess neural predictors of positive versus negative social learning by infants. You may conduct as little or as much analysis as you wish, using any technique you prefer. However, please submit executable Matlab code that you have written for this analysis, and a brief 1-page description of your approach and rationale.

* What potential caveats/pitfalls are there for analysing such a dataset? 

# Approach and Rationale.

After some thought and a quick overview of the data, I decided to carry out a Fourier analysis in the hope of finding frequency signatures that I could then use to build a simple classifier. In other words, if some mechanism associated with positive/negative social learning is at work, it is reasonable to investigate, **as a first approximation**, whether it is expressed in a particular frequency range, in which case the Fourier components in that range would be over-expressed. This analysis focuses only on the infant's response signatures.

The analysis was carried out as follows:

1. Visual inspection of all the responses to detect possible artifacts in the data.
2. Automated removal of artifacts.
3. Spectral decomposition and identification of possible response signatures.
4. Classification.

## Data preprocessing.

A quick inspection of the data revealed a number of unbounded and drift-driven signals across all the responses. To remove such behaviour systematically, I wrote a small filter that decomposes each signal into its Fourier components and then sets to zero all coefficients above 45 Hz and the first 10 coefficients next to zero frequency.

Figure 1 below illustrates this process.


 
<img src="Banner.png" alt="Drawing" style="width: 90%;"/>
**Figure 1. Artifact removal and spectral decomposition.** (First column from the left) Example of an original signal (top), left power spectrum (middle) and full spectrum (bottom).
(Second and third columns, left to right) Lowest-frequency contribution to the signal (which is to be removed), and a high-frequency component where the coefficients are zero; note that the reconstructed signal is identical to the original (difference = 1.5e-5).    
(Fourth column) Applied filter, removing the 20 lowest components and all frequencies above 45 Hz (which removes line noise), and the inverse Fourier transformed signal that is used for the analysis.
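The filtering step described in this section can be sketched as follows. This is a minimal Python/NumPy illustration and not the Matlab code used for the analysis. The function name `fourier_filter` and its parameters are hypothetical, and reading "the first coefficients next to zero" as the lowest-frequency bins including the DC term is an assumption.

```python
import numpy as np

def fourier_filter(signal, fs=200.0, n_low=10, f_max=45.0):
    """Zero the n_low lowest Fourier coefficients and everything above f_max Hz.

    Assumption: the n_low lowest coefficients include the DC term.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.shape[0]
    coeffs = np.fft.rfft(signal)              # one-sided spectrum of a real signal
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)    # frequency of each coefficient in Hz
    coeffs[:n_low] = 0.0                      # slow drifts and offsets
    coeffs[freqs > f_max] = 0.0               # high frequencies, including line noise
    return np.fft.irfft(coeffs, n=n)          # back to the time domain

# Synthetic 4 s example: a 10 Hz rhythm plus a slow 0.5 Hz drift and 50 Hz line noise.
t = np.arange(800) / 200.0
rhythm = np.sin(2 * np.pi * 10 * t)
raw = rhythm + 5 * np.sin(2 * np.pi * 0.5 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)
clean = fourier_filter(raw)
```

In this synthetic case the drift and line-noise components fall exactly on discarded bins, so `clean` recovers the 10 Hz rhythm. Real EEG segments will not be this tidy, and the number of low coefficients to remove depends on the segment length, which is why `n_low` is left as a parameter.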



# Results.

After detrending all the channels and ensuring that the amplitudes remained within a range of 150, I computed the spectra and averaged them over every channel per trial, obtaining 16 spectra: 8 for the positive case and 8 for the negative case (see Figures 2 and 3).
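As a rough illustration of the channel-averaging step, the sketch below computes one averaged power spectrum for a single trial stored as an n samples x 32 channels array. It is written in Python/NumPy rather than Matlab, the function name `mean_power_spectrum` is hypothetical, and the linear detrend and the power normalisation are assumptions, since the text does not specify either.

```python
import numpy as np

def mean_power_spectrum(trial, fs=200.0):
    """Return (freqs, power) with power averaged over channels.

    trial: array of shape (n_samples, n_channels).
    Assumption: 'detrending' means removing a least-squares straight line per channel.
    """
    trial = np.asarray(trial, dtype=float)
    n = trial.shape[0]
    t = np.arange(n) / fs
    coef = np.polyfit(t, trial, 1)                 # slope and offset for every channel
    detrended = trial - (np.outer(t, coef[0]) + coef[1])
    spectrum = np.fft.rfft(detrended, axis=0)      # FFT along the time axis
    power = np.abs(spectrum) ** 2 / n              # simple periodogram scaling (assumed)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, power.mean(axis=1)               # average across channels

# Synthetic trial: 2 s of a 12 Hz rhythm on 32 channels, with an offset and a drift.
t = np.arange(400) / 200.0
amplitudes = np.linspace(1.0, 2.0, 32)
trial = np.sin(2 * np.pi * 12 * t)[:, None] * amplitudes + 3.0 + 0.5 * t[:, None]
freqs, power = mean_power_spectrum(trial)
peak_hz = freqs[np.argmax(power)]
```

Applying such a function to each of the 8 x 2 cells would give the 16 spectra mentioned above.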

After computing the averaged spectra, I noticed a minimum at 20 Hz that is present in every pair. I do not know whether this is a well-known effect; however, the signatures I had in mind when devising this simple procedure seem to be present in the data to some extent. Figure 2 shows the spectra of trials 1 and 2, which gave negative and positive object selection respectively.

<img src="AVPW1.png" alt="Drawing" style="width: 100%;"/>
**Figure 2** Averaged power spectra for the infant data in trial one. Note the minimum at 20 Hz, which is present in every trial.

<img src="AVPW2.png" alt="Drawing" style="width: 90%;"/>
**Figure 3** Averaged power spectra for the infant data in trial three, where the result is that of positive object selection.

Although there are a great many ways of quantifying the information and differences in power spectra, I came up with a very simple one that is intuitive in terms of power contributions. Let $P_i(\theta)$ represent the averaged spectrum for infant or adult ($i=A,I$). Summing over the spectral region of interest and dividing by the sum over the full spectral range, we get $$z_i(a,b)=\frac{\int_{a}^{b}P_i(\theta)\,d\theta}{\int_{0}^{\infty}P_i(\theta)\,d\theta}$$

which tells us the fractional contribution of the Fourier components within the range $a$ to $b$. We could then use this in a classifier such as $$R=1+\frac{1}{1+e^{\frac{-(z_i(a,b)-z_o)}{\gamma}}}$$ so that $R$ takes values between 1 and 2, the same range as the coding of the 'Response' variable.

In my case I used the region above 20 Hz, computing $z_i(20,45)$ with $\gamma=100$ and $z_o=0.1$.
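The two formulas can be sketched in discrete form as below. This is an illustrative Python version, not the Matlab implementation. The function names are hypothetical, and the integrals are replaced by sums over a uniformly spaced frequency grid, with the available spectrum standing in for the infinite upper limit. How $R$ is converted into a predicted label is not stated in the text, so thresholding at 1.5 is an assumption.

```python
import math

def band_fraction(freqs, power, a, b):
    """Discrete z(a, b): power inside [a, b] Hz divided by the total power."""
    total = sum(power)
    in_band = sum(p for f, p in zip(freqs, power) if a <= f <= b)
    return in_band / total

def logistic_response(z, z0=0.1, gamma=100.0):
    """R = 1 + 1 / (1 + exp(-(z - z0) / gamma)), a value strictly between 1 and 2."""
    return 1.0 + 1.0 / (1.0 + math.exp(-(z - z0) / gamma))

def predicted_label(z, z0=0.1, gamma=100.0):
    """Assumed decision rule: label 2 if R exceeds 1.5, otherwise label 1."""
    return 2 if logistic_response(z, z0, gamma) > 1.5 else 1

# Toy spectrum on a 10 Hz grid (made-up numbers, for illustration only).
freqs = [0, 10, 20, 30, 40, 50]
power = [1, 1, 1, 1, 1, 5]
z = band_fraction(freqs, power, 20, 45)   # (1 + 1 + 1) / 10
label = predicted_label(z)
```

Because the logistic function is monotonic, $R > 1.5$ exactly when $z_i(a,b) > z_o$ for any positive $\gamma$, so under this assumed rule $\gamma$ only sets how sharply $R$ moves between 1 and 2.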

This predictor fails in one out of the eight trials, suggesting that despite being rather crude, this way of processing is perhaps capturing some of the social learning mechanisms encoded in the infant's signals. 


# Caveats and Pitfalls.



# Code Description.

The approach taken to complete this task consisted of three stages:

1. Load and exploration.
2. Artifact removal.
3. Spectral classification.

All the code discussed here was developed to complete this task with minimal invocation of toolboxes.

## Load and exploration.

This module is contained in the script **eeglx.m**. Executing this script opens the file and extracts the signals for each trial, storing them in arrays that are then processed and analysed. I used the sampling frequency and the length of each signal to construct the respective time domains, as shown in the image above the end of this section.

<img src="Ffull.png" alt="Drawing" style="width: 90%;"/>
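Since the number of samples differs from trial to trial, each trial needs its own time axis. A minimal Python/NumPy sketch of that construction is given below. It does not reproduce **eeglx.m**; the name `time_axis` and the list of zero-filled arrays standing in for the trials are hypothetical.

```python
import numpy as np

FS = 200.0  # sampling rate in Hz, as stated in the task description

def time_axis(n_samples, fs=FS):
    """Time stamps in seconds for a trial with n_samples samples."""
    return np.arange(n_samples) / fs

# Hypothetical stand-in for a few trials of different lengths (n samples x 32 channels).
trials = [np.zeros((n, 32)) for n in (400, 650, 1000)]
axes = [time_axis(trial.shape[0]) for trial in trials]
durations = [trial.shape[0] / FS for trial in trials]   # seconds of data per trial
```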

<!-- img src="F2.png" alt="Drawing" style="width: 20%;"/ -->

# Pitfalls.

# Higher order stats.