<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#NB-description" data-toc-modified-id="NB-description-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>NB description</a></span></li><li><span><a href="#The-dotsPositions.csv-data" data-toc-modified-id="The-dotsPositions.csv-data-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>The dotsPositions.csv data</a></span></li><li><span><a href="#Write-a-dotsDB-HDF5-file" data-toc-modified-id="Write-a-dotsDB-HDF5-file-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Write a dotsDB HDF5 file</a></span><ul class="toc-item"><li><span><a href="#Mapping-dots-to-trials" data-toc-modified-id="Mapping-dots-to-trials-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Mapping dots to trials</a></span></li></ul></li></ul></div>

# NB description
date: 11 Nov 2019  
This notebook contains code that:
- builds an HDF5 dotsDB database from the dotsPositions.csv files of the Pilot run (summer 2019)
- reads trials back from the HDF5 database in order to compute motion energy
- plots reverse kernels

In [1]:
import sys
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import pprint
import seaborn as sns
import h5py     

# add location of custom modules to path
sys.path.insert(0,'../modules/')
sys.path.insert(0,'../modules/dots_db/dotsDB/')

# custom modules
import dotsDB as ddb
import motionenergy as kiani_me
import stimulus as stim
import ME_functions as my_me

# The dotsPositions.csv data
The first step is to find which .csv files we have. Below, I inspect a few of them, then concatenate them all into a single pandas DataFrame and dump it to file, retaining only the active dots.

In [2]:
# !find /home/adrian/SingleCP_DotsReversal/raw -name "*dotsPositions.csv" -print

In [3]:
DOTS_DATA = '/home/adrian/SingleCP_DotsReversal/processed/dots_pilot_summer_2019.csv'

In [4]:
def inspect_csv(df):
    """df is a pandas.DataFrame"""
    print(df.head())
    print(len(df))
    print(np.unique(df['taskID']))
    try:
        print(np.unique(df['pilotID']))
    except KeyError:
        print(np.unique(df['subject']))

In [5]:
# a = pd.read_csv('/home/adrian/SingleCP_DotsReversal/raw/2019_06_25_13_24/2019_06_25_13_24_dotsPositions.csv')
# inspect_csv(a)

# b = pd.read_csv('/home/adrian/SingleCP_DotsReversal/raw/2019_07_03_15_03/2019_07_03_15_03_dotsPositions.csv')
# inspect_csv(b)

# c = pd.read_csv('/home/adrian/SingleCP_DotsReversal/raw/2019_07_10_17_19/2019_07_10_17_19_dotsPositions.csv')
# inspect_csv(c)

In [6]:
# files = [
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_06_25_13_24/2019_06_25_13_24_dotsPositions.csv',
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_07_03_15_03/2019_07_03_15_03_dotsPositions.csv',
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_06_24_13_31/2019_06_24_13_31_dotsPositions.csv',
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_06_24_13_06/2019_06_24_13_06_dotsPositions.csv',
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_07_17_17_17/2019_07_17_17_17_dotsPositions.csv',
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_07_10_12_18/2019_07_10_12_18_dotsPositions.csv',
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_06_20_13_27/2019_06_20_13_27_dotsPositions.csv',
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_06_24_12_38/2019_06_24_12_38_dotsPositions.csv',
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_07_10_17_19/2019_07_10_17_19_dotsPositions.csv',
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_06_20_12_54/2019_06_20_12_54_dotsPositions.csv',
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_06_21_13_08/2019_06_21_13_08_dotsPositions.csv',
#     '/home/adrian/SingleCP_DotsReversal/raw/2019_07_12_11_11/2019_07_12_11_11_dotsPositions.csv'
# ]

In [7]:
# pandas = [pd.read_csv(f) for f in files]
# total = pd.concat(pandas)
# inspect_csv(total)

In [8]:
# len(files)

In [9]:
# final = total.loc[total['isActive'] == 1,:]
# inspect_csv(final)

In [10]:
# write_to_file = False
# if write_to_file:
#     final.to_csv('dots_pilot_summer_2019.csv', index=False)

# Write a dotsDB HDF5 file
Now that all the dotsPositions.csv data is collected into a single global .csv file, I want to dump it all into an HDF5 database.

Several actions need to be implemented.
1. For each trial in the dotsPositions.csv data, I need to know: _coherence_, _viewing duration_, _presenceCP_, _direction_, _subject_, _block_ (_probCP_). For this, I will assume that the `trialEnd` (from FIRA) and `seqDumpTime` (from dotsPositions) timestamps are in the same unit.
2. I need to decide how to organize my dotsDB hierarchically. An example group path is `subj15/probCP0.1/coh0/ansleft/CPno/VD100`
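The hierarchy in item 2 can be sketched as a small path-building function. This is only an illustration: `build_group_path` is a hypothetical helper (not part of dotsDB), and it assumes that direction 180 corresponds to `left`, that viewing duration is given in seconds and written in milliseconds, and that the `ans` level encodes the motion direction.

```python
def build_group_path(subject, prob_cp, coherence, direction, has_cp, viewing_duration):
    """Hypothetical helper: assemble a dotsDB-style group path from trial parameters."""
    # Assumption: direction 180 maps to 'left' and 0 maps to 'right'.
    answer = {0: 'right', 180: 'left'}[int(direction)]
    cp = 'yes' if has_cp else 'no'
    # Assumption: viewing duration is in seconds and is written in milliseconds.
    vd_ms = int(round(viewing_duration * 1000))
    return 'subj{}/probCP{}/coh{}/ans{}/CP{}/VD{}'.format(
        subject, prob_cp, int(coherence), answer, cp, vd_ms
    )


example_path = build_group_path(15, 0.1, 0.0, 180, False, 0.1)
print(example_path)
```

With these inputs the function reproduces the example path given in item 2. Keeping the path construction in one place would make it easier to change the ordering of the levels later.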

In [11]:
# # Needs to be re-written
# def snowDots2DotsDBNormalizedFrames(df):
#     """
#     Casts a pandas.DataFrame corresponding to the .csv file outputted by some snow-dots programs into a list of 
#     lists of frames, as dotsDB accepts them. The main mapping is the swapping of x and y.
    
#     In dotsDB, x represents vertical position, y horizontal, and (0,0) is top left corner.
#     In snow-dots, x represents horizontal position, y vertical, and (0,0) is top left corner.
      
#     :param df: dataframe with columns "xpos", "ypos", "isActive", "isCoherent", "frameIdx", 
#          "seqDumpTime", "pilotID", "taskID", "trialCount" 
#     :type df: pandas.DataFrame
#     """
#     list_of_trials = []
# #     num_trials = np.max(df["trialCount"])
#     for tr in range(num_trials):
#         list_of_frames = []
# #         trial_data = df[df["trialCount"] == (tr+1)]
        
#         num_frames = np.max(trial_data["frameIdx"])
#         assert not np.isnan(num_frames), f'trial {tr+1}, num_frames is {num_frames}'
        
#         for fr in range(num_frames):
#             frame_data = trial_data[(trial_data["frameIdx"] == (fr+1)) & (trial_data["isActive"] == 1)]
#             list_of_frames.append(np.array(frame_data[['ypos','xpos']]))  # here I swap xpos with ypos
        
#         list_of_trials.append(list_of_frames)
#     return list_of_trials

In [12]:
t = pd.read_csv('/home/adrian/SingleCP_DotsReversal/processed/all_valid_data.csv')
d = pd.read_csv(DOTS_DATA)

In [13]:
inspect_csv(d)

       xpos      ypos  isActive  isCoherent  frameIdx   seqDumpTime  pilotID  \
0  0.996977  0.013053         1           1         1  1.204152e+06        2   
1  0.828417  0.618607         1           1         1  1.204152e+06        2   
2  0.378624  0.591975         1           1         1  1.204152e+06        2   
3  0.748699  0.621605         1           1         1  1.204152e+06        2   
4  0.675888  0.318813         1           1         1  1.204152e+06        2   

   taskID  
0       1  
1       1  
2       1  
3       1  
4       1  
1906784
[ 1  2  3  4  5  6  7  8 10 11 12 13 14]
[1 2 3 4 5]


In [14]:
inspect_csv(t)

  subject              date  taskID  trialIndex     trialStart       trialEnd  \
0      S1  2019_06_20_12_54       2           1  769481.078453  769485.302769   
1      S1  2019_06_20_12_54       2           2  769491.408107  769495.878215   
2      S1  2019_06_20_12_54       2           3  769495.881308  769501.918366   
3      S1  2019_06_20_12_54       2           4  769501.921422  769505.751018   
4      S1  2019_06_20_12_54       2           5  769505.754041  769510.374374   

      dirRT  dirChoice  dirCorrect  cpRT  ...  targetOn    dotsOn   dotsOff  \
0  0.584855        1.0         1.0   NaN  ...  1.532990  2.092313  2.516041   
1  0.433302        0.0         1.0   NaN  ...  0.745712  2.491474  2.913933   
2  0.583730        1.0         1.0   NaN  ...  1.169542  3.931527  4.338307   
3  0.480400        0.0         1.0   NaN  ...  1.179547  1.806666  2.225525   
4  1.230994        1.0         1.0   NaN  ...  1.206468  1.850535  2.274264   

   dirChoiceTime  cpChoiceTime  CPresp

In [15]:
ts = np.unique(d['seqDumpTime'])

In [16]:
np.size(ts)

5976

In [17]:
ts.shape

(5976,)

In [18]:
dots_dump_time = np.min(ts)

In [19]:
np.max(ts)

1205490.81387667

In [20]:
trials = np.unique(t['trialEnd'])

In [21]:
trials.shape

(11351,)

In [22]:
np.min(trials)

386.872299287

In [23]:
np.max(trials)

1209099.79360426

## Mapping dots to trials
Having the dump times of the dots and of the trials, our first task is to recover which trial each dot comes from.

Let's start simple, with a handful of dump times.

In [24]:
def get_trial_params(df):
    """coherence, viewing duration, presenceCP, direction, subject, block (probCP)"""
    print(
        'coh {}, VD {}, CP {}, dir {}, {}, {}, probCP {}'.format(
            df['coherence'].values[0],
            df['viewingDuration'].values[0],
            df['presenceCP'].values[0],
            df['initDirection'].values[0],
            df['subject'].values[0],
            df['block'].values[0],
            df['probCP'].values[0]
        )
    )

In [25]:
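def get_trial_from_dots_ts(dot_ts, trials_ts, trials_df):
    """Return the rows of trials_df whose trialEnd is the earliest one strictly after dot_ts."""
    # raises ValueError if no trialEnd is later than dot_ts (np.min of an empty array)
    trial_dump_time = np.min(trials_ts[trials_ts > dot_ts])
    return trials_df[trials_df['trialEnd'] == trial_dump_time]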
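The lookup above handles one dump time per call. A vectorized variant can be sketched with `np.searchsorted`, assuming the array of trial end times is sorted (which holds for the output of `np.unique`). `next_trial_end` is a hypothetical name, and the toy values below are made up for illustration.

```python
import numpy as np


def next_trial_end(dots_ts, trials_ts):
    """Hypothetical helper: first trial end time strictly after each dump time (NaN if none)."""
    trials_ts = np.asarray(trials_ts, dtype=float)  # assumed sorted in ascending order
    dots_ts = np.asarray(dots_ts, dtype=float)
    # side='right' returns the index of the first element strictly greater than the query
    idx = np.searchsorted(trials_ts, dots_ts, side='right')
    out = np.full(dots_ts.shape, np.nan)
    found = idx < trials_ts.size  # False where no trial ends after the dump time
    out[found] = trials_ts[idx[found]]
    return out


toy_trials = np.array([10.0, 20.0, 30.0])
toy_dots = np.array([5.0, 10.0, 25.0, 35.0])
matched = next_trial_end(toy_dots, toy_trials)
print(matched)
```

Unlike the `np.min` version, this sketch returns NaN instead of raising when a dump time has no later trial, so unmatched dots can be filtered out afterwards.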

In [26]:
# note: ts[-i] picks ts[0] when i == 0, then the last nine dump times in reverse order
for i in range(10):
    get_trial_params(
        get_trial_from_dots_ts(ts[-i], trials, t)
    )

coh 65.0, VD 0.4, CP 0, dir 0, S1, Quest, probCP nan
coh 0.0, VD 0.2, CP 0, dir 180, S2, Block2, probCP 0.0
coh 22.0, VD 0.4, CP 0, dir 180, S2, Block2, probCP 0.0
coh 22.0, VD 0.4, CP 0, dir 0, S2, Block2, probCP 0.0
coh 22.0, VD 0.3, CP 0, dir 180, S2, Block2, probCP 0.0
coh 22.0, VD 0.3, CP 0, dir 180, S2, Block2, probCP 0.0
coh 0.0, VD 0.2, CP 0, dir 180, S2, Block2, probCP 0.0
coh 0.0, VD 0.3, CP 0, dir 0, S2, Block2, probCP 0.0
coh 0.0, VD 0.3, CP 0, dir 180, S2, Block2, probCP 0.0
coh 22.0, VD 0.1, CP 0, dir 0, S2, Block2, probCP 0.0


So far so good: for a given `seqDumpTime` value, I am able to recover the trial's parameters. All that remains is to add the corresponding columns to the dots dataframe (and remove the `isActive` one).
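One possible way to add those columns is `pandas.merge_asof` with a forward, strict match, which pairs each `seqDumpTime` with the first later `trialEnd`. This is a minimal sketch on made-up toy tables, not the final implementation; it assumes that `trialEnd` values are unique across subjects and sessions, and that both tables are sorted on their time columns.

```python
import pandas as pd

# Toy stand-ins for the dots and trials tables (all values are made up).
toy_dots = pd.DataFrame({
    'seqDumpTime': [5.0, 5.0, 25.0],
    'xpos': [0.1, 0.2, 0.3],
    'isActive': [1, 1, 1],
})
toy_trials = pd.DataFrame({
    'trialEnd': [10.0, 20.0, 30.0],
    'coherence': [0.0, 22.0, 65.0],
    'subject': ['S1', 'S1', 'S2'],
})

# merge_asof requires both frames to be sorted on their merge keys.
annotated = pd.merge_asof(
    toy_dots.sort_values('seqDumpTime'),
    toy_trials.sort_values('trialEnd'),
    left_on='seqDumpTime',
    right_on='trialEnd',
    direction='forward',        # match the next trialEnd after the dump time
    allow_exact_matches=False,  # strictly later, as in get_trial_from_dots_ts
).drop(columns='isActive')

print(annotated)
```

Dots whose dump time has no later `trialEnd` would end up with NaN in the trial columns, so they are easy to spot and drop before writing to the database.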