## Tutorial

This tutorial shows how to use the Python module to list files and download sessions from the lab's shared Google data folder. It covers how to:

1. List subjects
1. List sessions
1. List all files for a subject
1. Retrieve a specific experiment
1. Download all files of a type

#### To install the package:

1. Install and configure ``rclone`` to access the lab's shared data folder.
   * With Anaconda, run ``conda install rclone -c conda-forge`` and then ``rclone config``.
1. Edit the ``$HOME/.labdatatools`` file, which is created the first time you try to import the module.
   * Add the name of the shared drive ([churchland_data]) to ``rclone/drive``.
   * Add or replace the name of the local data folder in ``paths``.
   
   
### Expected data organization

Data are organized in folders on the server; the first three folder levels are structured as:

1. __SUBJECT__ (XX000)
    2. __SESSION__ typically the DATE_TIME of the session (20201230_181455)
        3. __DATATYPE__ identifier of the type of data (a session can have multiple datatypes: ephys, behavior...)
            4. ...
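
The folder levels above can be illustrated with a short, standalone sketch that splits a remote path into its subject, session, and datatype. This is not part of ``labdatatools``; the helper ``split_data_path``, the ``DataPath`` tuple, and the file name ``recording.bin`` are hypothetical names used only for illustration.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class DataPath(NamedTuple):
    """First three folder levels of a path, plus whatever lies below them."""
    subject: str
    session: str
    datatype: str
    remainder: str


def split_data_path(path: str) -> DataPath:
    """Split 'SUBJECT/SESSION/DATATYPE/...' into its first three levels."""
    parts = PurePosixPath(path).parts
    if len(parts) < 3:
        raise ValueError(
            f"expected at least SUBJECT/SESSION/DATATYPE, got {path!r}")
    subject, session, datatype = parts[:3]
    # Everything below the datatype folder is kept as one relative path.
    return DataPath(subject, session, datatype, "/".join(parts[3:]))


# Hypothetical file stored under the example folders named above.
example = split_data_path("XX000/20201230_181455/ephys/recording.bin")
print(example.subject, example.session, example.datatype)
```

A path with fewer than three levels raises ``ValueError``, since it cannot name a datatype folder.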

In [6]:
# import the module
from labdatatools import *

#### List all subjects in the data server

In [2]:
# list subjects and print a summary
subjects = rclone_list_subjects()

print('There are data from {0} subjects in the server:'.format(len(subjects)))
print('\t'+','.join(subjects))

There are data from 126 subjects in the server:
	AC006,AC007,AC008,AC009,AC010,AC011,AC012,ACh001(825),AK000,AK001,AK002,AK003,AK004,AK005,AK006,AK007,JC015,JC016,JC017,JC018,JC019,JC020,JC021,JC022,JC023,JC024,JC025,JC026,JC027,JC028,JC029,JC030,JC031,JC032,JC033,JC034,JC035,JC036,JC037,JC038,JC039,JC040,JC041,JC042,JC043,JC044,JC045,JC046,JC047,JC048,JC049,JC050,JC051,JC052,JC053,JC054,JC055,JC056,JC058,JC059,JC060,JR002,JR006,JR007,JR008,JR009,JR010,JR011,JR012,JR013,JR015,JR016,JR018,JR019,JR020,JR021,JR022,JR024,JR028,JR030,JR031,JR033,JR034,LO007,LO008,LO009,LO010,LO011,LO012,LO013,LO014,LO015,LO016,LO017,LO018,LO019,LO020,LO021,LO022,LO023,LO025,LO026,MR01,MR02,MT006,No77,No79,No9031,No9032,RatEphys,UCLA004,UCLA005,UCLA006,UCLA007,UCLA008,UCLA009,UCLA010,UCLA011,UCLA012,UCLA013,cy03,cy06,cy08,cy11,dummy,freeMovBlack


#### List sessions for a specific subject

In [2]:
# list sessions for a subject (get only the names)
subject = 'JC046'
sessionnames = rclone_list_sessions(subject = subject)
print('There are {1} sessions for subject {0}.'.format(subject,len(sessionnames)))

There are 138 sessions for subject JC046.
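
Because session names are typically a DATE_TIME stamp, they can be converted to ``datetime`` objects for sorting or filtering by date. The sketch below is a standalone helper, not a ``labdatatools`` function; the name ``parse_session_datetime`` is hypothetical, and it assumes the ``YYYYMMDD_HHMMSS`` layout, optionally preceded by a subject prefix.

```python
from datetime import datetime
from typing import Optional


def parse_session_datetime(session: str) -> Optional[datetime]:
    """Return the timestamp encoded in a session name, or None if absent."""
    # Keep only the last two underscore-separated fields, so names with a
    # subject prefix (SUBJECT_DATE_TIME) are handled like plain DATE_TIME.
    fields = session.split("_")
    if len(fields) < 2:
        return None
    try:
        return datetime.strptime("_".join(fields[-2:]), "%Y%m%d_%H%M%S")
    except ValueError:
        # The name does not follow the DATE_TIME convention.
        return None


names = ["20210202_164855", "20210201_182320", "not_a_session"]
# Drop names without a timestamp, then order the rest chronologically.
dated = sorted(n for n in names if parse_session_datetime(n) is not None)
print(dated)
```

Names that do not follow the convention return ``None`` instead of raising, so they can be filtered out before sorting.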


#### List files for a subject

In [2]:
# list files for subject
subject = 'JC027'
files = rclone_list_files(subject = subject)
print('''There are {0} files for subject {1} in {2} sessions and {3} different experiment 
data kinds: {4}'''.format(
    len(files),
    subject,
    len(files.session.unique()),
    len(files.datatype.unique()),', '.join(files.datatype.unique())))

There are 442 files for subject JC027 in 81 sessions and 3 different experiment 
data kinds: SpatialSparrow, two_photon, suite2p


#### Find the sessions with 2P experiments

In [3]:
# filter sessions with 2P recordings
sessions = files[files.datatype == 'two_photon'].session.unique()

print('''
    Sessions with 2P for {1}:
\t{0}
'''.format(', \n\t'.join(sessions),subject))


    Sessions with 2P for JC027:
	20210202_164855, 
	20210202_111053, 
	20210201_183530, 
	20210201_182320



#### Fetch all mat files from a subject to disk


In [None]:
# download all .mat files for the subject to the local data folder
tparse = rclone_get_data(subject = subject, includes=['*.mat'])
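
To get a feel for what a wildcard such as ``*.mat`` selects before downloading, the patterns can be tried locally against a list of file names. The sketch below uses Python's ``fnmatch`` on bare file names as an approximation only: ``rclone`` applies its own filter rules, which may differ in detail. The helper ``preview_matches`` and the file name ``session_data.mat`` are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def preview_matches(filenames: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Return the file names that match at least one glob pattern."""
    patterns = list(patterns)
    return [name for name in filenames
            if any(fnmatchcase(name, pattern) for pattern in patterns)]


names = [
    "JC059_20210719_164509_DropletsTask.triallog.h5",
    "DropletsTask.yaml",
    "session_data.mat",  # hypothetical file name
]
print(preview_matches(names, ["*.mat"]))
print(preview_matches(names, ["*Droplets*"]))
```

``fnmatchcase`` is used instead of ``fnmatch`` so that matching is case-sensitive on every platform.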

In [35]:
rclone_list_files(include=['*Droplets*'])

| | filename | filesize | filepath | dirname | session | datatype |
|---|---|---|---|---|---|---|
| 0 | JC059_20210719_164509_DropletsTask.triallog.h5 | 1556152 | JC059/20210719_164509/DropletsTask/JC059_20210... | JC059/20210719_164509/DropletsTask | 20210719_164509 | DropletsTask |
| 1 | DropletsTask.yaml | 1052 | JC059/20210719_164509/DropletsTask/DropletsTas... | JC059/20210719_164509/DropletsTask | 20210719_164509 | DropletsTask |
| 2 | JC059_20210719_121235_DropletsTask.triallog.h5 | 1465564 | JC059/20210719_121235/DropletsTask/JC059_20210... | JC059/20210719_121235/DropletsTask | 20210719_121235 | DropletsTask |
| 3 | DropletsTask.yaml | 1067 | JC059/20210719_121235/DropletsTask/DropletsTas... | JC059/20210719_121235/DropletsTask | 20210719_121235 | DropletsTask |
| 4 | JC059_20210716_115028_DropletsTask.triallog.h5 | 1420586 | JC059/20210716_115028/DropletsTask/JC059_20210... | JC059/20210716_115028/DropletsTask | 20210716_115028 | DropletsTask |
| ... | ... | ... | ... | ... | ... | ... |
| 499 | JC054_20210524_202522_DropletsTask.triallog.h5 | 1076416 | JC054/JC054_20210524_202522/DropletsTask/JC054... | JC054/JC054_20210524_202522/DropletsTask | JC054_20210524_202522 | DropletsTask |
| 500 | JC055_20210524_201340_DropletsTask.triallog.h5 | 1078224 | JC055/JC055_20210524_201340/DropletsTask/JC055... | JC055/JC055_20210524_201340/DropletsTask | JC055_20210524_201340 | DropletsTask |
| 501 | JC056_20210524_200658_DropletsTask.triallog.h5 | 1075696 | JC056/JC056_20210524_200658/DropletsTask/JC056... | JC056/JC056_20210524_200658/DropletsTask | JC056_20210524_200658 | DropletsTask |
| 502 | JC055_20210524_192349_DropletsTask.triallog.h5 | 1075120 | JC055/JC055_20210524_192349/DropletsTask/JC055... | JC055/JC055_20210524_192349/DropletsTask | JC055_20210524_192349 | DropletsTask |


In [None]:
# list all files in the database
allfiles = rclone_list_files()