# Using [International Brain Laboratory](https://www.internationalbrainlab.com/) behavior data for an example analysis

Christopher S Krasniak, Cold Spring Harbor Laboratory, 2020-01-22

To encourage access to and use of the IBL data released with the [bioRxiv](https://www.biorxiv.org/content/10.1101/2020.01.17.909838v2) paper detailing the IBL's standardized training, the Outreach Working Group of the IBL created this tutorial. It is intended specifically as a resource for teaching the use of Python for the analysis of neuroscience and psychology data. Many simple data analysis questions can be explored with this dataset; a few examples are in the accompanying document DOCUMENT. We hope these questions will help future neuroscientists and psychologists explore this dataset and perhaps make their own discoveries as they learn to use Python for data analysis.

To proceed with the tutorial, make sure you have completed the installation steps in the [README](https://github.com/cskrasniak/behavior_analysis_demo/blob/master/README.md).

What follows is an example of how to access the IBL data and perform a simple analysis to answer a simple question. The data used in this tutorial are from mice that have been trained on a basic visual detection task; please read the [paper](https://www.biorxiv.org/content/10.1101/2020.01.17.909838v2) to understand the dataset you will be working with.

## Question: Who performs more trials, male or female mice?

### Import packages
The first step, as with any Python code, is to import the packages we will need to work with the data. The set below is a good starting point for working with IBL data; you may need more or fewer for specific questions.

```python
import numpy as np
import pandas as pd
import sys
import matplotlib.pyplot as plt
import seaborn as sns
import datajoint as dj
import os
import matplotlib as mpl
from ibl_pipeline import subject, behavior, acquisition
from paper_behavior_functions import query_sessions
```
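If one of the imports above fails, it can help to find out which packages are missing before going back to the README. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the `REQUIRED` list are illustrative and not part of the IBL code.

```python
from importlib.util import find_spec

# Top-level packages used by the import cell above.
REQUIRED = [
    "numpy",
    "pandas",
    "matplotlib",
    "seaborn",
    "datajoint",
    "ibl_pipeline",
    "paper_behavior_functions",
]


def missing_packages(names):
    """Return the names that cannot be found on the current import path.

    Hypothetical helper for this tutorial; it only looks packages up,
    it does not import them.
    """
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not found, see the README installation steps:", ", ".join(missing))
else:
    print("All required packages were found.")
```

Anything reported as missing most likely points to an installation step that was skipped or to the wrong Python environment being active.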

### Fetch the data we'll need
Now that everything is set up, the next step is to retrieve the data from the database. To do that we will use [DataJoint](https://docs.datajoint.io/python/) to run queries that return the data we are looking for; DataJoint queries are run against a MySQL database. Read more about DataJoint in the link above, and about the IBL Data Architecture [here](https://www.biorxiv.org/content/10.1101/827873v1).
Included in the _simple_anlaysis_demo_ folder is the list of universally unique identifiers (UUIDs) of the mice we will use to answer our question, and we already imported a function, `query_sessions`, to retrieve their sessions.

```python
query_sessions(criterion='trained')
```

| subject_uuid | session_start_time | lab_name | subject_project | session_uuid | task_protocol | subject_nickname | institution_short | training_status |
|---|---|---|---|---|---|---|---|---|
| 95241a9c-481b-443c-83a7-462165f729ec | 2019-02-10 09:49:50 | angelakilab | ibl_neuropixel_brainwide_01 | 2823cc5f-dee0-4f98-850b-ebd6d9a7bcd6 | _iblrig_tasks_trainingChoiceWorld3.5.3 | IBL-T3 | NYU | in_training |
| 95241a9c-481b-443c-83a7-462165f729ec | 2019-02-11 10:35:17 | angelakilab | ibl_neuropixel_brainwide_01 | ff2b357e-6f14-4745-b6d4-41f3008433c2 | _iblrig_tasks_trainingChoiceWorld3.5.3 | IBL-T3 | NYU | in_training |
| 95241a9c-481b-443c-83a7-462165f729ec | 2019-02-12 08:25:22 | angelakilab | ibl_neuropixel_brainwide_01 | f6501177-1c46-427e-b8c9-a7136afd543a | _iblrig_tasks_trainingChoiceWorld3.5.3 | IBL-T3 | NYU | in_training |
| 95241a9c-481b-443c-83a7-462165f729ec | 2019-02-13 09:27:32 | angelakilab | ibl_neuropixel_brainwide_01 | cf02a0c2-bc07-4b17-83b4-64a72ddccb49 | _iblrig_tasks_trainingChoiceWorld3.5.3 | IBL-T3 | NYU | in_training |
| 95241a9c-481b-443c-83a7-462165f729ec | 2019-02-14 08:55:50 | angelakilab | ibl_neuropixel_brainwide_01 | 7bbd0d31-6802-4990-b072-8e0827f7429c | _iblrig_tasks_trainingChoiceWorld3.5.3 | IBL-T3 | NYU | in_training |
| 95241a9c-481b-443c-83a7-462165f729ec | 2019-02-15 09:46:23 | angelakilab | ibl_neuropixel_brainwide_01 | 76d7eed4-f203-40cd-b2ed-0198eb86027d | _iblrig_tasks_trainingChoiceWorld3.5.3 | IBL-T3 | NYU | in_training |
| 95241a9c-481b-443c-83a7-462165f729ec | 2019-02-18 08:17:07 | angelakilab | ibl_neuropixel_brainwide_01 | 8806b436-444c-4d46-bb02-bea6db05af5b | _iblrig_tasks_trainingChoiceWorld3.5.3 | IBL-T3 | NYU | in_training |
| 95241a9c-481b-443c-83a7-462165f729ec | 2019-02-19 08:02:33 | angelakilab | ibl_neuropixel_brainwide_01 | eea4b47d-62c8-4998-b9dd-1734990d87e8 | _iblrig_tasks_trainingChoiceWorld3.5.3 | IBL-T3 | NYU | in_training |
| 95241a9c-481b-443c-83a7-462165f729ec | 2019-02-19 15:36:09 | angelakilab | ibl_neuropixel_brainwide_01 | 62f203e8-6c54-47bc-9200-8d3ce6b0248b | _iblrig_tasks_trainingChoiceWorld3.7.2 | IBL-T3 | NYU | in_training |
| 95241a9c-481b-443c-83a7-462165f729ec | 2019-02-20 10:46:10 | angelakilab | ibl_neuropixel_brainwide_01 | 34dd7859-72d5-41a4-9995-00b7bcf00cf2 | _iblrig_tasks_trainingChoiceWorld3.7.2 | IBL-T3 | NYU | in_training |
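To get a feel for what a session table like this contains, it can be loaded into a pandas DataFrame and summarized. The sketch below does not touch the database: it copies four of the columns from the ten rows shown above into a string by hand, so the variable names (`csv_text`, `sessions`, and the three summaries) are illustrative assumptions rather than part of the IBL pipeline.

```python
import io

import pandas as pd

# Rows copied by hand from the query output above (subset of columns).
csv_text = """subject_nickname,lab_name,session_start_time,task_protocol
IBL-T3,angelakilab,2019-02-10 09:49:50,_iblrig_tasks_trainingChoiceWorld3.5.3
IBL-T3,angelakilab,2019-02-11 10:35:17,_iblrig_tasks_trainingChoiceWorld3.5.3
IBL-T3,angelakilab,2019-02-12 08:25:22,_iblrig_tasks_trainingChoiceWorld3.5.3
IBL-T3,angelakilab,2019-02-13 09:27:32,_iblrig_tasks_trainingChoiceWorld3.5.3
IBL-T3,angelakilab,2019-02-14 08:55:50,_iblrig_tasks_trainingChoiceWorld3.5.3
IBL-T3,angelakilab,2019-02-15 09:46:23,_iblrig_tasks_trainingChoiceWorld3.5.3
IBL-T3,angelakilab,2019-02-18 08:17:07,_iblrig_tasks_trainingChoiceWorld3.5.3
IBL-T3,angelakilab,2019-02-19 08:02:33,_iblrig_tasks_trainingChoiceWorld3.5.3
IBL-T3,angelakilab,2019-02-19 15:36:09,_iblrig_tasks_trainingChoiceWorld3.7.2
IBL-T3,angelakilab,2019-02-20 10:46:10,_iblrig_tasks_trainingChoiceWorld3.7.2
"""

# Parse the start times as datetimes so we can group by calendar day.
sessions = pd.read_csv(io.StringIO(csv_text), parse_dates=["session_start_time"])

# How many sessions does each mouse have in this excerpt?
sessions_per_mouse = sessions.groupby("subject_nickname").size()

# How many sessions were run under each version of the task protocol?
sessions_per_protocol = sessions["task_protocol"].value_counts()

# How many sessions were run on each day? More than one is possible.
sessions_per_day = sessions.groupby(sessions["session_start_time"].dt.date).size()

print(sessions_per_mouse)
print(sessions_per_protocol)
print(sessions_per_day[sessions_per_day > 1])
```

Even this small excerpt shows two things worth keeping in mind for later analyses: the task protocol version changes partway through training, and a mouse can have more than one session on the same day.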


To inspire original questions for students, the following line can be run to find what the available data types are for analysis.