In [11]:
IRdisplay::display_html(file='code_hiding.html')

In [24]:
# load packages and define constants
library(data.table) # see https://cran.r-project.org/web/packages/data.table/vignettes/datatable-intro.html for reference
library(ggplot2)
source("R_functions.r")

# folder/file-specific constants
PILOT_NUMBER <- 5
DATA_FOLDER <- "~/programing/data/psychophys/"
FIRA_TAG <- "FIRA"
FRAMES_TAG <- "framesInfo"
DOTS_TAG <- "dotsPositions"

# key-specific constants 
TRIALS <- "trials"
FRAMES <- "frames"
DOTS <- "dots"

In [25]:
# load csv files into data.tables
tb <- list(
    loadPilotCSV(PILOT_NUMBER, DATA_FOLDER, FIRA_TAG),
    loadPilotCSV(PILOT_NUMBER, DATA_FOLDER, FRAMES_TAG),
    loadPilotCSV(PILOT_NUMBER, DATA_FOLDER, DOTS_TAG))
names(tb) <- c(TRIALS, FRAMES, DOTS)
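
`loadPilotCSV` is defined in `R_functions.r`, which is not shown in this notebook. Here is a minimal sketch of what such a helper might look like, assuming the `pilot#_<tag>.csv` file naming used in the workflow description. The name `loadPilotCSVSketch` and its body are hypothetical, not the repository's code.

```r
library(data.table)

# Hypothetical stand-in for loadPilotCSV (the real one lives in R_functions.r).
# ASSUMPTION: files are named pilot<number>_<tag>.csv inside dataFolder.
loadPilotCSVSketch <- function(pilotNumber, dataFolder, tag) {
    fileName <- paste0("pilot", pilotNumber, "_", tag, ".csv")
    fread(file = file.path(dataFolder, fileName))
}

# Usage with a throwaway file written to a temporary folder
tmpFolder <- tempdir()
fwrite(data.table(trialIndex = c(2L, 1L), choice = c(1L, 0L)),
       file.path(tmpFolder, "pilot5_FIRA.csv"))
demo <- loadPilotCSVSketch(5, tmpFolder, "FIRA")
```
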

# Notebook's goals

Make sure the workflow from running the task to analyzing the data works.


## Workflow description
1. The task is run with the repo [SingleCP_DotsReversal_Task](https://github.com/TheGoldLab/SingleCP_DotsReversal_Task/) (the appropriate branch must be chosen).
  1. A `.mat` data file is produced.
  2. I usually rename this file manually to `pilot_#.mat` and upload it to PennBox (Data/Psychophysics/Radillo_SingleCP_DotsReversal/).
2. The data is analyzed with the repo [SingleCP_DotsReversal_DataAnalysis](https://github.com/aernesto/SingleCP_DotsReversal_DataAnalysis) (again, with the appropriate branch).
  1. The first step is to convert the data from the `.mat` file into CSV format:
    - `pilot#_framesInfo.csv` produced with [this script](https://github.com/aernesto/SingleCP_DotsReversal_DataAnalysis/blob/25d37b8a9cb2fb768359dd30be4452aed60b9c62/MATLAB_scripts/explore_data_file.m) as of 01/28/2019. [Fields description](https://github.com/aernesto/SingleCP_DotsReversal_DataAnalysis/wiki/Fields-description-of-*framesInfo.csv-file).
    - `pilot#_FIRA.csv` produced with [this script](https://github.com/aernesto/SingleCP_DotsReversal_DataAnalysis/blob/25d37b8a9cb2fb768359dd30be4452aed60b9c62/MATLAB_scripts/explore_data_file.m) as of 01/28/2019. [Fields description](https://github.com/aernesto/SingleCP_DotsReversal_DataAnalysis/wiki/Fields-Description-of-*FIRA.csv-files).
    - `pilot#_dotsPositions.csv` produced with same script as above, as of 01/28/2019. [Fields description](https://github.com/aernesto/SingleCP_DotsReversal_DataAnalysis/wiki/Fields-descriptions-for-*dotsPositions.csv-file).

**TO-DO**
- Write detailed explanation of each column in each `.csv` file (i.e. update the [Wiki](https://github.com/aernesto/SingleCP_DotsReversal_DataAnalysis/wiki))
- Make sure data from the three `.csv` files is consistent

## Specific questions
1. How many frames are skipped on each trial, and where in the trial do they occur?
2. Does this number match the offset in viewing duration per trial?
3. Compute reverse kernels with the number of coherent dots as a proxy for motion energy

# Exploring the data
## The `*FIRA.csv` file (trials level)

In [26]:
str(tb[[TRIALS]])
unique(tb[[TRIALS]][,viewingDuration])

Classes ‘data.table’ and 'data.frame':	18 obs. of  25 variables:
 $ taskID         : int  2 2 2 2 2 2 2 2 2 2 ...
 $ trialIndex     : int  13 11 2 17 16 5 1 10 8 12 ...
 $ trialStart     : num  35828 35834 35839 35845 35850 ...
 $ trialEnd       : num  35834 35839 35845 35850 35856 ...
 $ RT             : num  1.26 1.18 1.5 1.48 1.27 ...
 $ choice         : int  1 1 0 1 1 1 1 0 1 0 ...
 $ correct        : int  1 1 0 1 0 0 0 1 0 1 ...
 $ initDirection  : int  0 0 0 0 0 180 180 180 180 180 ...
 $ endDirection   : int  0 0 0 0 180 180 180 180 180 180 ...
 $ presenceCP     : int  0 0 0 0 1 0 0 0 0 0 ...
 $ coherence      : num  6.4 25.6 6.4 25.6 12.8 25.6 6.4 12.8 6.4 25.6 ...
 $ viewingDuration: num  0.3 0.2 0.1 0.3 0.3 0.1 0.1 0.2 0.2 0.2 ...
 $ probCP         : num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
 $ timeCP         : num  0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 ...
 $ randSeedBase   : int  1385 3014 365 7065 6090 8825 8015 5250 2485 4081 ...
 $ fixationOn     : num  71658 71

## The `*framesInfo.csv` file (frames level)

In [33]:
str(tb[[FRAMES]])
unique(tb[[FRAMES]]$trialIndex)

Classes ‘data.table’ and 'data.frame':	302 obs. of  6 variables:
 $ frameTotCount: int  3 4 5 6 7 8 9 10 11 12 ...
 $ onsetTime    : num  35830 35831 35832 35832 35832 ...
 $ onsetFrame   : int  375 493 495 500 502 503 504 506 507 507 ...
 $ swapTime     : num  35829 35831 35831 35832 35832 ...
 $ isTight      : int  0 0 0 0 0 1 1 0 1 1 ...
 $ trialIndex   : int  13 13 13 13 13 13 13 13 13 13 ...
 - attr(*, ".internal.selfref")=<externalptr> 


In [47]:
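# note: .N counts rows (frames) per trial, so numUniqueOnset overcounts when onsetFrame values repeat
tb[[FRAMES]][order(onsetFrame),.(minOnset=min(onsetFrame), maxOnset=max(onsetFrame), numUniqueOnset=.N), by=trialIndex]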

trialIndex,minOnset,maxOnset,numUniqueOnset
13,375,515,18
11,677,824,16
2,976,1126,11
17,1296,1465,23
16,1640,1794,24
5,1948,2092,11
1,2263,2410,10
10,2570,2724,17
8,2896,3024,17
12,3184,3314,17


In [55]:
tb[[FRAMES]][trialIndex==1, ]

frameTotCount,onsetTime,onsetFrame,swapTime,isTight,trialIndex
106,35860.95,2263,35860.94,0,1
107,35862.7,2368,35862.7,0,1
108,35863.29,2403,35863.28,0,1
109,35863.3,2404,35863.3,1,1
110,35863.32,2405,35863.31,1,1
111,35863.34,2406,35863.33,1,1
112,35863.35,2407,35863.35,1,1
113,35863.37,2408,35863.38,1,1
114,35863.39,2409,35863.4,1,1
115,35863.4,2410,35863.41,1,1
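
The jumps in `onsetFrame` for trial 1 (2263, then 2368, then 2403) suggest one way to approach the first specific question. The sketch below assumes that `onsetFrame` counts screen refreshes, so that a gap larger than 1 between consecutive rows means refreshes went by without a new frame being drawn. This interpretation is an assumption and is not confirmed by the fields description. The toy table copies the first five rows above.

```r
library(data.table)

# Toy data: the first five rows of the trial 1 table above
frames <- data.table(
    trialIndex    = 1L,
    frameTotCount = 106:110,
    onsetFrame    = c(2263L, 2368L, 2403L, 2404L, 2405L))

# Gap between consecutive onsetFrame values within each trial, minus 1.
# ASSUMPTION: onsetFrame counts screen refreshes, so a positive value means
# that many refreshes passed without a new frame.
setorder(frames, trialIndex, frameTotCount)
frames[, skippedBefore := onsetFrame - shift(onsetFrame) - 1L, by = trialIndex]

# Per-trial total, plus the frames that were preceded by a gap
skipSummary <- frames[!is.na(skippedBefore),
    .(totalSkipped = sum(skippedBefore),
      gapBeforeFrame = list(frameTotCount[skippedBefore > 0])),
    by = trialIndex]
```

On this toy input the gaps sit before frames 107 and 108, which is where the large jumps in `onsetFrame` occur.
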


## The `*dotsPositions.csv` file (dots level)

In [56]:
tb[[DOTS]][trialCount == 7,.N,by=frameIdx]

frameIdx,N
1,182
2,182
3,182
4,182
5,182
6,182


In [28]:
str(tb[[DOTS]])
#tb[[DOTS]][,.(.N),by=.(frameIdx,trialCount)]

Classes ‘data.table’ and 'data.frame':	41678 obs. of  6 variables:
 $ xpos      : num  0.763 0.57 0.962 0.2 0.223 ...
 $ ypos      : num  0.322 0.387 0.931 0.445 0.578 ...
 $ isActive  : int  1 0 0 1 0 0 1 0 0 1 ...
 $ isCoherent: int  1 0 0 1 0 0 1 0 0 1 ...
 $ frameIdx  : int  1 1 1 1 1 1 1 1 1 1 ...
 $ trialCount: int  1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, ".internal.selfref")=<externalptr> 
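
For the third specific question, the number of coherent dots per frame can be computed directly from this table. Below is a minimal sketch on toy data with the same column names. Whether inactive dots should be excluded (the `isActive == 1` filter) is an assumption.

```r
library(data.table)

# Toy stand-in for tb[[DOTS]]: 2 frames of 1 trial, 3 dots each
dots <- data.table(
    isActive   = c(1L, 0L, 1L, 1L, 1L, 0L),
    isCoherent = c(1L, 0L, 0L, 1L, 1L, 0L),
    frameIdx   = c(1L, 1L, 1L, 2L, 2L, 2L),
    trialCount = 1L)

# Count coherent dots in each frame of each trial.
# ASSUMPTION: only active dots are relevant, hence the isActive filter.
coherentCount <- dots[isActive == 1L,
    .(numCoherent = sum(isCoherent)),
    by = .(trialCount, frameIdx)]
```
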


# Checking that the three datasets are consistent
As we can see, the frame index appears in both `tb[[FRAMES]]` (as `frameTotCount`) and `tb[[DOTS]]` (as `frameIdx`), and the trial index appears in both `tb[[DOTS]]` (as `trialCount`) and `tb[[TRIALS]]` (as `trialIndex`). Let's check whether these fields match in terms of their unique values in each dataset.
## Exploring match between frames and dots levels

In [29]:
length(unique(tb[[FRAMES]][,frameTotCount]))

In [30]:
unique(tb[[DOTS]][,frameIdx])
length(unique(tb[[DOTS]][,frameIdx]))

## Exploring match between `DOTS` and `TRIALS` levels

In [31]:
unique(tb[[DOTS]][,trialCount])

In [32]:
unique(tb[[TRIALS]][,trialIndex])
length(unique(tb[[TRIALS]][,trialIndex]))
min(unique(tb[[TRIALS]][,trialIndex]))
max(unique(tb[[TRIALS]][,trialIndex]))
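
Comparing the printed vectors by eye gets harder as the number of trials grows. Here is a small sketch of the same check with base R set operations, using toy vectors rather than the real columns. That `trialCount` and `trialIndex` are meant to take the same values is an assumption.

```r
# Toy stand-ins for unique(tb[[DOTS]][, trialCount]) and
# unique(tb[[TRIALS]][, trialIndex])
dotsTrials   <- c(1L, 2L, 3L, 7L)
trialsTrials <- c(13L, 11L, 2L, 3L, 1L)

# TRUE only if both vectors contain the same set of values
sameSet <- setequal(dotsTrials, trialsTrials)

# Values present on one side only
onlyInDots   <- setdiff(dotsTrials, trialsTrials)
onlyInTrials <- setdiff(trialsTrials, dotsTrials)
```
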

## Exploring match between `FRAMES` and `TRIALS` levels

In [41]:
setkey(tb[[FRAMES]], trialIndex)
setkey(tb[[TRIALS]], trialIndex)

# Full OUTER JOIN (see https://rstudio-pubs-static.s3.amazonaws.com/52230_5ae0d25125b544caab32f75f0360e775.html)
frameCount <- merge(
    tb[[FRAMES]][order(trialIndex),.(numFramesInFRAMES=.N),by=trialIndex],
    tb[[TRIALS]][order(trialIndex),.(numFramesInTRIALS=numFrames),by=trialIndex],
    all=TRUE)
frameCount[,.(trialIndex, numFramesInFRAMES, numFramesInTRIALS, match=numFramesInFRAMES == numFramesInTRIALS)]

trialIndex,numFramesInFRAMES,numFramesInTRIALS,match
1,10,10,True
2,11,11,True
3,12,11,False
4,11,11,True
5,11,11,True
6,11,11,True
7,17,17,True
8,17,17,True
9,17,17,True
10,17,17,True


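Any `False` in the right-most column above signals an *issue*: the number of rows for that trial in `tb[[FRAMES]]` differs from the `numFrames` value recorded in `tb[[TRIALS]]`. Among the rows shown, trial 3 is the only mismatch (12 frames versus 11).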
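
With more trials, the rows to inspect can be filtered rather than read off the table. Below is a sketch on a toy version of `frameCount`. Because the merge is a full outer join, a trial missing from either table shows up as `NA`; treating that as an issue too is an assumption.

```r
library(data.table)

# Toy version of frameCount; trial 4 is missing from the FRAMES side
frameCount <- data.table(
    trialIndex        = 1:4,
    numFramesInFRAMES = c(10L, 11L, 12L, NA),
    numFramesInTRIALS = c(10L, 11L, 11L, 11L))

# Keep rows where the counts differ or where one side is missing
issues <- frameCount[is.na(numFramesInFRAMES) | is.na(numFramesInTRIALS) |
                     numFramesInFRAMES != numFramesInTRIALS]
```
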