# Preprocessing 
# Get synchronization times

This notebook takes an MCD file that includes the analog signal and recovers the synchronization times of all stimuli. This is a very important step for analyzing the responses of the recorded cells with precision. 

The code creates several files: all synchronization times, the repeated-frame times, and a CSV file with all relevant information about the stimuli. This last file contains a list of events, where an event is a sequence of images shown consecutively. Each event has a start and end time, the number of frames, the total duration of the event, the repeated frames, and the time between events.

In [1]:
import sys
sys.path.append('../src')
from lib.utils import checkDirectory

from os import path, mkdir
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]:

# experiment_name = '2018-01-25'
experiment_name = '2018-04-18'
# experiment_name = '2016-08-11'
# experiment_name = '2016-04-11'
# experiment_name = '2016-09-14'
# experiment_name = 'nn'

## Get synchronization signal
The MCD file is read by a Matlab script. The script takes 5 parameters:   
* **mcd_file**: path of the MCD source file
* **output_folder**: directory for the outputs
* **mcd_channel**: number of the analog channel in the MCD file
* **sampling_rate**: sampling rate of the recording
* **real_fps**: the real number of frames per second on screen (look at the log file)

The analog signal only registers the start of each frame, so the end time of a sequence of images is extrapolated to get the correct time in the output file.
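
The extrapolation described above can be sketched in Python. This is only an illustration under stated assumptions, not the actual logic of the Matlab `syncAnalyzer` script: the helper name `extrapolate_frame_ends` is hypothetical, and it assumes that each frame ends where the next one starts and that the last frame lasts one rounded frame period.

```python
import numpy as np

def extrapolate_frame_ends(frame_starts, sampling_rate, real_fps):
    """Return (start, end) sample pairs for a run of consecutive frames.

    Assumption: a frame ends where the next one starts. The last frame has
    no successor, so its end is extrapolated by one frame period.
    """
    frame_starts = np.asarray(frame_starts, dtype=np.int64)
    frame_period = int(round(sampling_rate / real_fps))  # samples per frame
    frame_ends = np.empty_like(frame_starts)
    frame_ends[:-1] = frame_starts[1:]                    # end = next start
    frame_ends[-1] = frame_starts[-1] + frame_period      # extrapolated end
    return np.column_stack((frame_starts, frame_ends))

# Toy frame starts (in samples), using the values of this notebook
pairs = extrapolate_frame_ends([1000, 1335, 1669],
                               sampling_rate=20000, real_fps=59.7596)
print(pairs)
```

With a 20 kHz sampling rate and 59.7596 FPS, one frame period is about 335 samples, so the last frame of the toy sequence is assigned an end 335 samples after its start.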

Add image of analog signal...

For more information, look [here](#).

In [3]:
mcd_file = '../data/raw_data/'+experiment_name+'/'+experiment_name+'_analog.mcd'
output_folder = '../data/sync/'+experiment_name+'/'
mcd_channel = 1
sampling_rate = 20000
real_fps = 59.7596
checkDirectory(output_folder)

command_matlab = "\"cd ../src/syncAnalyzer/; syncAnalyzer('../{}','{}','../{}',{},{},{}); quit\""\
                .format( mcd_file, experiment_name, output_folder, mcd_channel,
                        sampling_rate,real_fps)
print('running...\n ' + command_matlab)
!matlab -nodesktop -nodisplay -nosplash -r $command_matlab

running...
 "cd ../src/syncAnalyzer/; syncAnalyzer('../../data/raw_data/2018-04-18/2018-04-18_analog.mcd','2018-04-18','../../data/sync/2018-04-18/',1,20000,59.7596); quit"
[?1h=
                            < M A T L A B (R) >
                  Copyright 1984-2017 The MathWorks, Inc.
                   R2017a (9.2.0.538062) 64-bit (glnxa64)
                             February 23, 2017

 
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
 
Neuroshare information:

entity = 

  struct with fields:

    EntityLabel: 'anlg0001 0254 0000       A3'
     EntityType: 2
      ItemCount: 128182000


total_duration =

   128182000

../../data/sync/2018-04-18/start_end_frames_2018-04-18.txt   saved.
../../data/sync/2018-04-18/repeated_frames_2018-04-18.txt   saved
../../data/sync/2018-04-18/total_duration_2018-04-18.txt    saved
OK

## Create event list
From the sync start_end file created above, we compute the time difference between consecutive frame starts. Wherever that difference is larger than one frame period, a new event begins.

![Events](https://raw.githubusercontent.com/creyesp/SamplingInterface/master/Doc/img/img%20examples/analog_signal_sequence_img.png)

In [4]:
# load data
source_folder = '../data/sync/'+experiment_name+'/'
start_end_file = source_folder+'start_end_frames_'+experiment_name+'.txt' 
total_duration_file = source_folder+'total_duration_'+experiment_name+'.txt' 

start_end = np.loadtxt( start_end_file , dtype='int32')-1
total_duration = np.loadtxt( total_duration_file , dtype='int32')-1

start_frame, end_frame = start_end[:,0], start_end[:,1]
diff_time = np.diff( np.concatenate(
            ( np.array([0]), start_frame, np.array([total_duration]) ) ))

dis_frame = np.ceil(sampling_rate/real_fps)  # max distance between consecutive frames, in samples
filter_event = diff_time > dis_frame
end_event_pos = np.where(filter_event)[0]

# Select start and end time for each event
start_event = start_frame[ end_event_pos[:-1] ]
end_event = end_frame[ end_event_pos[1:] - 1 ]
n_frames = end_event_pos[1:] - end_event_pos[:-1]

# Add boundary conditions (period before the first event and after the last one)
start_event_full = np.concatenate(( np.array([0]), start_event, end_event[-1:] ))
end_event_full = np.concatenate(( start_event[:1], end_event, np.array([total_duration]) ))
n_frames_full = np.concatenate(( np.array([0]), n_frames, np.array([0]) ))

# Build a DataFrame with the list of events
event_list = pd.DataFrame({'start_event': start_event_full,'end_event':end_event_full,
                               'n_frames':n_frames_full})

event_list['event_duration'] = (event_list['end_event']-
                                    event_list['start_event'])/sampling_rate

inter_event_duration = event_list[['start_event','end_event']].values / sampling_rate
event_list['inter_event_duration'] = pd.Series(inter_event_duration[1:,0] - inter_event_duration[:-1,1])
event_list.loc[len(event_list)-1,'inter_event_duration'] = 0
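
The gap-thresholding idea used in the cell above can be checked on toy data. The following is a minimal sketch, not a replacement for the cell: the helper `split_into_events` and the toy values are hypothetical, and boundary handling (the periods before the first and after the last event) is left out.

```python
import numpy as np

def split_into_events(frame_starts, max_gap):
    """Group frame start times into events separated by gaps > max_gap."""
    frame_starts = np.asarray(frame_starts)
    # A new event begins at every frame that follows a gap longer than max_gap
    breaks = np.flatnonzero(np.diff(frame_starts) > max_gap) + 1
    return np.split(frame_starts, breaks)

# Toy frame starts in samples; 335 samples is about one frame at 20 kHz and ~59.76 FPS
toy_starts = [100, 435, 770, 5000, 5335, 9000]
events = split_into_events(toy_starts, max_gap=335)
for k, ev in enumerate(events):
    print(k, len(ev), ev[0], ev[-1])
```

The toy sequence has two gaps longer than one frame period, so it is split into three events of 3, 2 and 1 frames.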

## Add repeated frames to the event list
Sometimes the screen repeats a frame: the next frame was not ready in time, so the last frame is shown again. This can be critical in some cases because synchronization is lost. If you know exactly which frame was repeated, you can correct the stimulus or discard the event.

<img src="https://raw.githubusercontent.com/creyesp/SamplingInterface/master/Doc/img/img%20examples/repeated_frame.png" alt="drawing" width="400px"/>

In [5]:
# Load data
repeated_file = source_folder+'repeated_frames_'+experiment_name+'.txt' 
repeated_frame_time = np.loadtxt(repeated_file, dtype='int32')

event_list['repeated_frames'] = ''
event_list['#repeated_frames'] = 0
events = event_list[['start_event','end_event']].values
for kidx, (kstart, kend) in enumerate(events):
    # Repeated frames whose time falls inside this event (bounds inclusive)
    filter_rep = (repeated_frame_time >= kstart) & (repeated_frame_time <= kend)
    event_list.loc[kidx,'repeated_frames'] = str(repeated_frame_time[filter_rep])
    event_list.loc[kidx,'#repeated_frames'] = len(repeated_frame_time[filter_rep])

# Reorder columns
event_list = event_list[['n_frames','start_event','end_event',
                                 'event_duration','inter_event_duration',
                                 '#repeated_frames','repeated_frames']]
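
For recordings with many events, the per-event count of repeated frames can also be computed without a Python loop. This is a hedged sketch using `np.searchsorted`; the function name `count_repeats_per_event` and the toy values are hypothetical. It keeps both event bounds inclusive, as in the loop above.

```python
import numpy as np

def count_repeats_per_event(repeated_times, starts, ends):
    """Count repeated-frame times inside each [start, end] interval."""
    repeated_times = np.sort(np.asarray(repeated_times))
    # Index of the first repeated frame >= start
    first = np.searchsorted(repeated_times, starts, side='left')
    # Index just past the last repeated frame <= end
    last = np.searchsorted(repeated_times, ends, side='right')
    return last - first

counts = count_repeats_per_event([150, 160, 420],
                                 starts=[100, 300, 500],
                                 ends=[200, 450, 600])
print(counts)
```

Because both bounds are inclusive, a repeated frame that falls exactly on a boundary shared by two events is counted in both.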

In [6]:
filter_frame = event_list['n_frames'] > -1
event_list[filter_frame]

| | n_frames | start_event | end_event | event_duration | inter_event_duration | #repeated_frames | repeated_frames |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 7632254 | 381.6127 | 0.0 | 0 | [] |
| 1 | 17800 | 7632254 | 13590110 | 297.8928 | 6.5959 | 8900 | [ 7632255 7632925 7633594 ... 13588104 13588... |
| 2 | 2100 | 13722028 | 14424921 | 35.14465 | 3.99985 | 0 | [] |
| 3 | 2100 | 14504918 | 15207810 | 35.1446 | 3.99985 | 0 | [] |
| 4 | 2100 | 15287807 | 15990700 | 35.14465 | 3.9998 | 0 | [] |
| 5 | 2100 | 16070696 | 16773589 | 35.14465 | 3.99985 | 0 | [] |
| 6 | 2100 | 16853586 | 17556478 | 35.1446 | 3.99985 | 0 | [] |
| 7 | 2101 | 17636475 | 18339702 | 35.16135 | 3.99985 | 1 | [17673294] |
| 8 | 2100 | 18419699 | 19122592 | 35.14465 | 3.9998 | 0 | [] |
| 9 | 2100 | 19202588 | 19905481 | 35.14465 | 3.9998 | 0 | [] |


**Note:** The times in event_duration differ slightly from the theoretical times because the nominal FPS of the projector differs from the real FPS.   

For example, the projector nominally shows 60 FPS, but the real FPS is 59.7523. Please check it in the log file. 
* 2100 images at 60 fps = 35 s
* 2100 images at 59.7523 fps = 35.145 s

The differences can be significant!
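
The arithmetic in the note can be reproduced with a few lines of Python. The FPS values and frame count come from the example above; the conversion of the drift to samples assumes the 20 kHz sampling rate used in this notebook.

```python
sampling_rate = 20000   # Hz, as set earlier in this notebook
n_frames = 2100
nominal_fps = 60.0
real_fps = 59.7523

nominal_duration = n_frames / nominal_fps      # seconds at the nominal rate
real_duration = n_frames / real_fps            # seconds at the real rate
drift_seconds = real_duration - nominal_duration
drift_samples = drift_seconds * sampling_rate  # drift in recording samples
drift_frames = drift_seconds * real_fps        # drift in stimulus frames

print(f"nominal: {nominal_duration:.3f} s, real: {real_duration:.3f} s")
print(f"drift: {drift_seconds * 1000:.1f} ms, "
      f"{drift_samples:.0f} samples, {drift_frames:.2f} frames")
```

By the end of a 2100-frame event the accumulated drift is about 0.145 s, which is more than 8 stimulus frames.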

## Save to csv file

In [7]:
output_folder = '../data/sync/'+experiment_name+'/event_list/'
checkDirectory(output_folder)
event_list['protocol_name']= ''
event_list['repetition_name']= ''
event_list.to_csv(output_folder+'event_list_'+experiment_name+'_.csv',index=False)

## Create separated sync files
For each event we can create a separate sync file using the event information in the CSV. Before running this step, check the CSV file, remove all false positives, and fill in the event names so that the events are easier to identify. 

Then save a definitive version by removing the trailing underscore from the file name: '*_.csv' -> '*.csv'
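
If you prefer to do the renaming from Python once the manual review is finished, a minimal sketch could look like this. The helper `finalize_event_list` is hypothetical and not part of this repository; it copies the draft rather than renaming it, so the unedited `*_.csv` stays available.

```python
from pathlib import Path
import shutil

def finalize_event_list(draft_csv):
    """Copy '<name>_.csv' to '<name>.csv' and return the new path."""
    draft_csv = Path(draft_csv)
    suffix = '_.csv'
    if not draft_csv.name.endswith(suffix):
        raise ValueError(
            "expected a draft file ending in '_.csv', got " + draft_csv.name)
    # Drop the trailing underscore from the file name
    final_csv = draft_csv.with_name(draft_csv.name[:-len(suffix)] + '.csv')
    shutil.copyfile(draft_csv, final_csv)
    return final_csv
```

For example, `finalize_event_list('../data/sync/2018-04-18/event_list/event_list_2018-04-18_.csv')` would write `event_list_2018-04-18.csv` in the same folder.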

In [70]:
source_folder = '../data/sync/'+experiment_name+'/'
output_folder = '../data/sync/'+experiment_name+'/event_list/times/'
checkDirectory(output_folder)

event_file = source_folder+'event_list/event_list_'+experiment_name+'.csv'
start_end_file = source_folder+'start_end_frames_'+experiment_name+'.txt' 

# Load data
event_list_def = pd.read_csv(event_file)
start_end = np.loadtxt( start_end_file , dtype='int32')-1

# Frames not yet assigned to an event
pointer = start_end
for kidx, (kstart,kend) in enumerate(event_list_def[['start_event','end_event']].values):
    # Frames whose start time falls in [kstart, kend)
    filter_frame = (pointer[:,0] >= kstart) & (pointer[:,0] < kend)
    start_end_event = pointer[filter_frame]
    np.savetxt(output_folder+'{:03d}.txt'.format(kidx),start_end_event,fmt='%d')
    print(output_folder+'{:03d}.txt'.format(kidx))
    # Drop the frames just written so later events only scan the remainder
    pointer = pointer[~filter_frame]
    

../data/sync/2018-01-25/event_list/times/000.txt
../data/sync/2018-01-25/event_list/times/001.txt
../data/sync/2018-01-25/event_list/times/002.txt
../data/sync/2018-01-25/event_list/times/003.txt
../data/sync/2018-01-25/event_list/times/004.txt
../data/sync/2018-01-25/event_list/times/005.txt
../data/sync/2018-01-25/event_list/times/006.txt
../data/sync/2018-01-25/event_list/times/007.txt
../data/sync/2018-01-25/event_list/times/008.txt
../data/sync/2018-01-25/event_list/times/009.txt
../data/sync/2018-01-25/event_list/times/010.txt
../data/sync/2018-01-25/event_list/times/011.txt
../data/sync/2018-01-25/event_list/times/012.txt
../data/sync/2018-01-25/event_list/times/013.txt
../data/sync/2018-01-25/event_list/times/014.txt
../data/sync/2018-01-25/event_list/times/015.txt
../data/sync/2018-01-25/event_list/times/016.txt
../data/sync/2018-01-25/event_list/times/017.txt
../data/sync/2018-01-25/event_list/times/018.txt
../data/sync/2018-01-25/event_list/times/019.txt
../data/sync/2018-01