# ICSS Data analysis script

## README
A template for analyzing ICSS data, from raw output files to plots. The motivation is reproducible research: standardizing the steps performed for data analysis and statistical inference.

### OUTLINE:
0. Prerequisite: CSV files of ICSS data generated using the program written by Steve Cabilio
1. Preprocessing
2. 

---

### Author(s): Suman K. Guha

## 1. ICSS data preprocessing 
A prerequisite of the analysis is to convert the MedPC-generated data to Theta0 (T0), M50, and Max Rate using Steve Cabilio's program. **Note:** We use the program's CSV output because CSV is an open format and easier to work with.

A typical file generated by the analysis program looks like:

    Source:,C:\ICSS\EC-IVSA\RF\SG13\20190117.XLS

    Data_File,Box,Subject,Group,Experiment,End Date,End Time,MPC,Comment
    C:\ICSS\EC-IVSA\RF\SG13\20190117.XLS,1,SG13,RF,EC-IVSA,2019-01-17,11:57:23,ICSS4_01,Generated by macro C:\MED-PC IV\Macro\2019_RF_ICSS+IVSA_G1-PM_1 1/17/2019 10:50:26 AM

    Pulse1w,uAmp1,Pulse1Pulse2,Pulse2w,uAmp2,StimDur,StimDelay,TimeOut,TrialDur,ITI,NumPrimes,InterPrime,PrimeDelay,FR
    100,0,100,100,170,500,0,0.5,50,10,5,500,5500,1

    ,Pass1,Pass2,Pass3,Pass4
    M50,72.22,78.28,58.25,74.78
    T0,37.61,50.42,49.50,63.00
    MaxR,275,177,241,212
    Slope,210.71,201.21,740.44,618.61

    ,Pass1,Pass2,Pass3,Pass4
    M50 Mean,72.22,78.28,58.25,74.78
    T0 Mean,37.61,50.42,49.50,63.00
    MaxR Mean,275.00,177.00,241.00,212.00
    Slope Mean,210.71,201.21,740.44,618.61

    ,Pass1,Pass2,Pass3,Pass4
    M50 %Baseline,100.00,108.39,80.65,103.54
    T0 %Baseline,100.00,134.08,131.62,167.52
    MaxR %Baseline,100.00,64.36,87.64,77.09
    Slope %Baseline,100.00,95.49,351.40,293.58

After the preprocessing step, the same data looks like:

    Date,Subject,Experiment,Pass,T0,M50,MaxRate
    2019-01-17,SG13,RF,1,37.61,72.22,275
    2019-01-17,SG13,RF,2,50.42,78.28,177
    2019-01-17,SG13,RF,3,49.50,58.25,241
    2019-01-17,SG13,RF,4,63.00,74.78,212
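
The `preprocessICSSFiles` program itself is not shown in this notebook, so the following is only a minimal Python sketch of the kind of conversion illustrated above, not the actual implementation. The function name `parse_ana_csv` and the abbreviated `SAMPLE` text are hypothetical. The sketch assumes that the per-pass values sit in rows labelled exactly `M50`, `T0`, and `MaxR`, and that the `Experiment` column of the output is taken from the `Group` field of the raw file, which is what the two examples above suggest (`RF` in both).

```python
import csv
import io

# Abbreviated excerpt of a raw analysis file (the Comment field is shortened).
SAMPLE = r"""Source:,C:\ICSS\EC-IVSA\RF\SG13\20190117.XLS

Data_File,Box,Subject,Group,Experiment,End Date,End Time,MPC,Comment
C:\ICSS\EC-IVSA\RF\SG13\20190117.XLS,1,SG13,RF,EC-IVSA,2019-01-17,11:57:23,ICSS4_01,Generated by macro

,Pass1,Pass2,Pass3,Pass4
M50,72.22,78.28,58.25,74.78
T0,37.61,50.42,49.50,63.00
MaxR,275,177,241,212
Slope,210.71,201.21,740.44,618.61
"""


def parse_ana_csv(text):
    """Turn one raw analysis file into a list of per-pass records."""
    rows = list(csv.reader(io.StringIO(text)))
    meta = {}
    values = {}
    for i, row in enumerate(rows):
        if not row:
            continue  # blank separator line
        if row[0] == "Data_File" and i + 1 < len(rows):
            # header row followed by a single row of session metadata
            meta = dict(zip(row, rows[i + 1]))
        elif row[0] in ("M50", "T0", "MaxR"):
            # exact match, so "M50 Mean" and "M50 %Baseline" are ignored
            values[row[0]] = row[1:]

    records = []
    for index in range(len(values["T0"])):
        records.append({
            "Date": meta["End Date"],
            "Subject": meta["Subject"],
            "Experiment": meta["Group"],  # assumption: output uses the Group field
            "Pass": index + 1,
            "T0": float(values["T0"][index]),
            "M50": float(values["M50"][index]),
            "MaxRate": int(float(values["MaxR"][index])),
        })
    return records


if __name__ == "__main__":
    for record in parse_ana_csv(SAMPLE):
        print(record)
```

Each returned dictionary corresponds to one row of the preprocessed table, so writing the list out with `csv.DictWriter` would reproduce the long format shown above.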
    
In the next step we batch process this conversion by looping over all the files. For that to work, the files must follow a rigid directory structure. A typical structure _**should**_ be
   
    Cohort01
    |
    |---Subject01
    |   |
    |   |---ANA_<YYYYMMDD>.CSV
    |
    |---Subject02
        |
        |---ANA_<YYYYMMDD>.CSV
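
Because the batch step depends on this layout, it can help to check a cohort directory before running anything. The following is a hedged Python sketch, not part of the original pipeline: the function name `list_ana_files` is hypothetical, and it assumes that file names follow the `ANA_<YYYYMMDD>.CSV` pattern exactly, with an eight-digit date and an upper-case extension.

```python
import re
from pathlib import Path

# Assumed naming convention: ANA_<YYYYMMDD>.CSV
ANA_PATTERN = re.compile(r"^ANA_(\d{8})\.CSV$")


def list_ana_files(cohort_dir):
    """Map each subject directory name to its sorted list of ANA file names."""
    cohort = Path(cohort_dir)
    found = {}
    for subject_dir in sorted(p for p in cohort.iterdir() if p.is_dir()):
        found[subject_dir.name] = sorted(
            f.name
            for f in subject_dir.iterdir()
            if f.is_file() and ANA_PATTERN.match(f.name)
        )
    return found
```

Calling `list_ana_files("</path/to/cohort/data>")` returns one entry per subject. A subject with an empty list has no files matching the convention and would be skipped silently by the batch loop.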

For the preprocessing part, we extract the values for each animal on each day. Once that is done, we combine them and store them in a single table.

In [None]:
%%capture
%%bash
# We move to the folder where the cohort data is stored; stop if it does not exist
# cd </path/to/cohort/data>
cd /Users/sumanguha/Dropbox\ \(Partners\ HealthCare\)/Projects/R01_2017_OxycSA-NASh-Glutamate/_data_R01_2017/_data_R01_2017_ICSS/_ana_files/Cohort01 || exit 1

# entering each subject/animal directory and running the program on every ANA CSV file
# (globs are used instead of parsing `ls`, so names containing spaces are handled correctly)
for dirName in */
do
    cd "$dirName" || continue
    for fileName in ANA*.CSV
    do
        # skip the unexpanded pattern when a directory has no matching files
        [ -e "$fileName" ] || continue
        preprocessICSSFiles --file "$fileName"
    done
    cd ..
done

In [None]:
# loading the interface that lets us talk to the R statistical programming language
%load_ext rpy2.ipython

In [71]:
%%capture
%%R
# loading required libraries for R
library(tidyverse)
library(lubridate)

# 1. loading data: set the path to the cohort directory that holds the preprocessed files
dataDir <- "~/Dropbox (Partners HealthCare)/Projects/R01_2017_OxycSA-NASh-Glutamate/_data_R01_2017/_data_R01_2017_ICSS/_ana_files/Cohort01/"
# the pattern is a regular expression, so the dot is escaped to match it literally
fileList <- list.files(path = dataDir, pattern = "preprocessed\\.csv", recursive = TRUE)
## generating combined data table
data <- fileList %>% map(~ read_csv(file.path(dataDir, .))) %>% reduce(rbind)
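
When files are combined this way, a column ends up as character in the combined table if it was read as text from any one file. The printed table in this notebook shows `T0` and `M50` typed as `<chr>`, which suggests that at least one preprocessed file contains a non-numeric entry in those columns. The following Python sketch is one way to locate such entries and is an illustration, not part of the original analysis. The function name `find_non_numeric` is hypothetical, and the sketch assumes that the preprocessed file names end in `preprocessed.csv` and use the column names shown earlier.

```python
import csv
from pathlib import Path


def find_non_numeric(data_dir, columns=("T0", "M50", "MaxRate")):
    """Return (file, line number, column, value) for entries that are not numbers."""
    problems = []
    for path in sorted(Path(data_dir).rglob("*preprocessed.csv")):
        with path.open(newline="") as handle:
            # the header is line 1, so data rows start at line 2
            for line_no, row in enumerate(csv.DictReader(handle), start=2):
                for column in columns:
                    value = row.get(column)
                    try:
                        float(value)
                    except (TypeError, ValueError):
                        # TypeError covers a missing field, ValueError covers text
                        problems.append((str(path), line_no, column, value))
    return problems
```

An empty result means every checked value parses as a number. Otherwise each tuple points to a file and line that can be inspected by hand before deciding how to treat it.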

In [66]:
%%R
data %>% print

# A tibble: 812 x 7
   Date       Subject Experiment  Pass T0    M50   MaxRate
   <date>     <chr>   <chr>      <dbl> <chr> <chr>   <dbl>
 1 2019-01-17 SG13    RF             1 37.61 72.22     275
 2 2019-01-17 SG13    RF             2 50.42 78.28     177
 3 2019-01-17 SG13    RF             3 49.5  58.25     241
 4 2019-01-17 SG13    RF             4 63    74.78     212
 5 2019-01-18 SG13    RF             1 33.77 55.55     201
 6 2019-01-18 SG13    RF             2 59.77 69.08     165
 7 2019-01-18 SG13    RF             3 62.66 77.56     168
 8 2019-01-18 SG13    RF             4 57.85 81.47     158
 9 2019-01-22 SG13    RF             1 38.03 46.48     282
10 2019-01-22 SG13    RF             2 33.38 52.09     340
# ... with 802 more rows