In this tutorial we will use ONE to load IBL behavioural data and perform some simple analysis on it. In particular we will cover the following concepts:
- Using ONE to search for and download data
- Loading the downloaded data and performing simple analysis of behavioural performance

This tutorial assumes that you have set up the unified ibl environment and authorised access to IBL data through ONE. If not, please follow the previous steps of this tutorial.

Let's get started by importing ONE and setting up a connection

In [None]:
from oneibl.one import ONE
one = ONE()

We want to look at behavioural data for a subject in a given lab. Let's see which labs we can choose from

In [None]:
one.list(None, 'labs')

We will choose the cortex lab. To find which subjects are available we will use the one.alyx.rest command. For more information about this useful command please see 

In [None]:
subj_info = one.alyx.rest('subjects', 'list', lab='cortexlab')

Let's see how many subjects have been assigned to the cortex lab and also examine the content of the first item in subj_info

In [None]:
print(len(subj_info))
subj_info[0]

We can see that there are a total of (insert here) subjects in the cortex lab. Each entry in the list subj_info is a dictionary that contains details about one subject, including the nickname, whether the subject is alive or dead, and the gender of the subject. We are interested in finding out the possible subject nicknames so that we can refine our search. We can quickly iterate over all items in the subj_info list and extract the subject nicknames

In [None]:
subject_names = [subj['nickname'] for subj in subj_info]
print(subject_names)
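
The same list-comprehension pattern can also filter subjects on another field. The sketch below runs on mock records rather than a live query; the key names 'alive' and 'sex' are assumptions made for illustration and should be checked against the keys you actually see in subj_info[0].

```python
# Mock records standing in for the dictionaries returned by the REST query.
# The keys 'alive' and 'sex' are assumptions made for this sketch.
subj_info_mock = [
    {'nickname': 'subject_A', 'alive': True, 'sex': 'F'},
    {'nickname': 'subject_B', 'alive': False, 'sex': 'M'},
    {'nickname': 'subject_C', 'alive': True, 'sex': 'M'},
]

# Keep only the nicknames of subjects flagged as alive
alive_names = [subj['nickname'] for subj in subj_info_mock if subj['alive']]
print(alive_names)
```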

Let's choose subject KS022 for further analysis and find all the sessions for this subject using the one.search command
>**NOTE** We restrict by task_protocol to find only the sessions that have training data. We could also restrict by 'biased' or 'ephys' to search only for sessions where the subject was in phase 2 of training or was performing the task during recordings, respectively

In [None]:
eids, sess_info = one.search(subject='KS022', task_protocol='training', details=True)

By returning the session information for each eid we can extract the date and order our experiment ids by date (or training day). Let's first look at the content of one of the entries in sess_info

In [None]:
sess_info[0]

We can see that this contains information about the session, including the date and the subject. We can quickly collect the dates for all the sessions

In [None]:
session_date = [sess['start_time'] for sess in sess_info]

The dates are returned with the most recent session first. For convenience, let's reverse the list so that the first training session day is at index 0. For consistency we must reverse the list of eids as well

In [None]:
session_date.reverse()
eids.reverse()
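
Reversing relies on the search returning sessions in a known order. A more defensive option is to sort explicitly by the start time. The sketch below uses mock eids and mock session details, and assumes that 'start_time' is an ISO 8601 string; check this against the output of sess_info[0] before reusing it.

```python
from datetime import datetime

# Mock eids and session details; real values would come from one.search(..., details=True)
eids_mock = ['eid-c', 'eid-a', 'eid-b']
sess_info_mock = [
    {'start_time': '2020-01-17T10:02:11'},
    {'start_time': '2020-01-15T09:30:00'},
    {'start_time': '2020-01-16T11:45:32'},
]

# Parse the timestamps (assumed to be ISO 8601 strings)
start_times = [datetime.fromisoformat(sess['start_time']) for sess in sess_info_mock]

# Indices that order the sessions from oldest to newest
order = sorted(range(len(start_times)), key=lambda i: start_times[i])

# Apply the same ordering to both lists so they stay aligned
eids_sorted = [eids_mock[i] for i in order]
dates_sorted = [start_times[i].date().isoformat() for i in order]
print(eids_sorted)
print(dates_sorted)
```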

We will start by looking at data for the first training day. Let's list what datasets are available

In [None]:
eid_day1 = eids[0]
one.list(eid=eid_day1)

In this tutorial we are interested in the trials dataset, which contains information about the performance of the subject on the task. We can define a list of all the dataset types we want to load, for example trials.choice and trials.contrastLeft, and download these

In [None]:
d_types = ['trials.choice',
           'trials.contrastLeft']
_ = one.load(eid=eid_day1, dataset_types=d_types)

Alternatively we can take advantage of the ALF file format and download all files that have the prefix trials. 
>**NOTE** This would be called loading all attributes associated with the trials object. See here for more information on the ALF file naming convention that the IBL uses

For this we will use a slightly different loading function one.load_object

In [None]:
d_object = 'trials'
_ = one.load_object(eid=eid_day1, obj=d_object)

We can find the path where the data has been downloaded using

In [None]:
data_path = one.path_from_eid(eid_day1)
data_path

ibllib contains a useful set of functions in the alf module that can be used to read in ALF objects. Let's import this module and load all the data associated with the trials object.

In [None]:
import alf.io as aio
from pathlib import Path

alf_path = Path(data_path, 'alf')

trials_day1 = aio.load_object(alf_path, '_ibl_trials')
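
To make the object and attribute idea concrete, the sketch below groups file names by object, assuming each name has exactly three dot-separated parts (object, attribute, extension). The file names are hypothetical examples, and this is only an illustration of the naming pattern, not how alf.io parses files.

```python
from collections import defaultdict

# Hypothetical file names following the object.attribute.extension pattern
file_names = [
    '_ibl_trials.choice.npy',
    '_ibl_trials.contrastLeft.npy',
    '_ibl_wheel.position.npy',
]

# Collect the attributes that belong to each object
attributes_by_object = defaultdict(list)
for name in file_names:
    # Assumes exactly three dot-separated parts in every name
    obj, attribute, extension = name.split('.')
    attributes_by_object[obj].append(attribute)

print(dict(attributes_by_object))
```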

>**NOTE** By using trials_day1 = one.load_object(eid=eid_day1, obj=d_object) we could have loaded the trials object into memory directly after downloading the files. Here we have chosen to read in the data after downloading in order to introduce useful functions such as one.path_from_eid and alf.io.load_object

Let's look at the content of the trials object

In [None]:
print(trials_day1.keys())

Find how many trials there were in the session by inspecting the length of one of the attributes. 

In [None]:
n_trials_day1 = len(trials_day1.choice)
print(n_trials_day1)

>**NOTE** We chose the choice attribute of the trials object to find the number of trials, but we could have looked at the length of any of the attributes and got the same result. This is another consequence of the ALF file format: all attributes associated with a given object have the same number of rows.
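
A quick way to confirm the equal-length property is to compare the lengths of all attributes. The sketch below uses a plain dictionary of mock arrays in place of the loaded trials object.

```python
import numpy as np

# Mock stand-in for a trials object: every attribute has one row per trial
trials_mock = {
    'choice': np.array([1, -1, 1, 1]),
    'contrastLeft': np.array([1.0, np.nan, 0.5, np.nan]),
    'feedbackType': np.array([1, -1, -1, 1]),
}

# Record the number of rows of each attribute
lengths = {key: len(value) for key, value in trials_mock.items()}

# True when all attributes share the same number of rows
all_same = len(set(lengths.values())) == 1
print(lengths, all_same)
```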

Next, let's look at the visual stimulus contrasts that were presented to the subject on day 1. For this we will inspect the trials.contrastLeft dataset

In [None]:
trials_day1.contrastLeft

We have three values: 1, which indicates a 100 % contrast; 0.5, which indicates a 50 % contrast; and a whole load of NaNs.

If we inspect trials.contrastRight we will find that all the indices that contain NaNs in trials.contrastLeft are filled in trials.contrastRight. The opposite is also true: all indices with NaN values in trials.contrastRight are filled in trials.contrastLeft. A NaN simply indicates that the contrast was shown on the opposite side
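
This complementarity can be checked directly. The sketch below uses small mock arrays in place of trials.contrastLeft and trials.contrastRight and tests that exactly one side holds a value on every trial.

```python
import numpy as np

# Mock contrast arrays: NaN marks trials where the stimulus was on the other side
contrast_left = np.array([1.0, np.nan, 0.5, np.nan])
contrast_right = np.array([np.nan, 0.5, np.nan, 1.0])

# Boolean masks of the trials where each side has a value
left_shown = ~np.isnan(contrast_left)
right_shown = ~np.isnan(contrast_right)

# Exclusive OR: True only where exactly one side has a value
exactly_one_side = np.logical_xor(left_shown, right_shown)
print(exactly_one_side.all())
```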

Let's combine trials.contrastLeft and trials.contrastRight into a new dataset called trials.contrast. By convention in the IBL, contrasts that appear on the left are negative while those on the right are positive. Let's follow this convention when forming our new dataset

In [None]:
import numpy as np

trials_day1.contrast = np.empty((n_trials_day1))
contrastRight_idx = np.where(~np.isnan(trials_day1.contrastRight))[0]
contrastLeft_idx = np.where(~np.isnan(trials_day1.contrastLeft))[0]

trials_day1.contrast[contrastRight_idx] = trials_day1.contrastRight[contrastRight_idx]
trials_day1.contrast[contrastLeft_idx] = -1 * trials_day1.contrastLeft[contrastLeft_idx]
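
As an alternative to index-based assignment, the same signed contrast can be built in one step with np.where. This is a sketch on mock arrays, assuming that every trial has a value on exactly one side.

```python
import numpy as np

# Mock contrast arrays standing in for trials.contrastLeft / trials.contrastRight
contrast_left = np.array([1.0, np.nan, 0.5, np.nan])
contrast_right = np.array([np.nan, 0.5, np.nan, 1.0])

# Where the left side is NaN take the right contrast (positive),
# otherwise take the negated left contrast
signed_contrast = np.where(np.isnan(contrast_left), contrast_right, -contrast_left)
print(signed_contrast)
```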


We can inspect how many trials of each contrast were presented to the subject

In [None]:
contrasts, n_contrasts = np.unique(trials_day1.contrast, return_counts=True)
print(contrasts)
print(n_contrasts)

Finally, let's look at how the mouse performed. This information is stored in the feedbackType attribute of the trials object. A feedback of +1 means the mouse got the trial correct, whereas a feedback of -1 means the mouse got the trial wrong. Let's double check that these are the only values we see in trials.feedbackType

In [None]:
np.unique(trials_day1.feedbackType)

We can easily compute how well the mouse performed

In [None]:
correct = np.sum(trials_day1.feedbackType == 1) / n_trials_day1
incorrect = np.sum(trials_day1.feedbackType == -1) / n_trials_day1
print(correct * 100)
print(incorrect * 100)
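
Whether a percentage like this can be told apart from chance depends on the number of trials. One rough way to judge it is a normal-approximation confidence interval around the fraction correct. The sketch below uses mock feedback values; the 1.96 multiplier gives an approximate 95 % interval and is a choice made for this illustration, not part of the IBL analysis.

```python
import numpy as np

# Mock feedback values: +1 for correct trials, -1 for incorrect trials
feedback_mock = np.array([1, -1, 1, 1, -1, -1, 1, -1, 1, -1])
n = feedback_mock.size

# Fraction of correct trials
p_correct = np.mean(feedback_mock == 1)

# Standard error of a proportion (normal approximation)
se = np.sqrt(p_correct * (1 - p_correct) / n)

# Approximate 95 % confidence interval
ci_low, ci_high = p_correct - 1.96 * se, p_correct + 1.96 * se

# If 0.5 lies inside the interval, performance is not distinguishable from chance
chance_within_ci = ci_low <= 0.5 <= ci_high
print(p_correct, ci_low, ci_high, chance_within_ci)
```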

As expected, on the first day of training the mouse performed at chance level and was probably just guessing. Let's break down the performance at each contrast level and create a simple plot

In [None]:
import matplotlib.pyplot as plt

contrast_performance = np.empty((contrasts.size))
for ic, c in enumerate(contrasts):
    contrast_idx = np.where(trials_day1.contrast == c)[0]
    contrast_performance[ic] = np.sum(trials_day1.feedbackType[contrast_idx] == 1) / contrast_idx.shape[0]

  
plt.plot(contrasts * 100, contrast_performance * 100)
plt.scatter(contrasts * 100, contrast_performance * 100)
plt.ylim([0,100])
plt.xticks([*(contrasts * 100)])
plt.xlabel('Stimulus Contrast (%)')
plt.ylabel('Performance (%)')
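
Since the per-contrast computation will be needed again for other sessions, it can be wrapped in a small function. The name performance_by_contrast is hypothetical (it is not part of ibllib), and the sketch is shown on mock arrays.

```python
import numpy as np


def performance_by_contrast(contrast, feedback_type):
    """Return the unique contrasts and the fraction of correct trials at each."""
    contrasts = np.unique(contrast)
    # For each contrast level, average the boolean "was correct" over its trials
    performance = np.array(
        [np.mean(feedback_type[contrast == c] == 1) for c in contrasts]
    )
    return contrasts, performance


# Mock signed contrasts and feedback values for six trials
contrast_mock = np.array([-1.0, -1.0, -0.5, 0.5, 1.0, 1.0])
feedback_mock = np.array([1, -1, 1, 1, 1, 1])

levels, perf = performance_by_contrast(contrast_mock, feedback_mock)
print(levels)
print(perf)
```

With real data the call would take the contrast and feedbackType attributes of a trials object in place of the mock arrays.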


As the mouse learns the task we expect its performance to improve. Let's repeat the steps above and see how the same mouse performed on day 20 of training

In [None]:
eid_day20 = eids[14]
trials_day20 = one.load_object(eid=eid_day20, obj=d_object)
n_trials_day20 = len(trials_day20.choice)

trials_day20.contrast = np.empty((n_trials_day20))
contrastRight_idx = np.where(~np.isnan(trials_day20.contrastRight))[0]
contrastLeft_idx = np.where(~np.isnan(trials_day20.contrastLeft))[0]

trials_day20.contrast[contrastRight_idx] = trials_day20.contrastRight[contrastRight_idx]
trials_day20.contrast[contrastLeft_idx] = -1 * trials_day20.contrastLeft[contrastLeft_idx]

contrasts, n_contrasts = np.unique(trials_day20.contrast, return_counts=True)
print(contrasts)
print(n_contrasts)

Notice how on day 20 the mouse has trials not only with 100 % and 50 % visual stimulus contrast but also with additional, harder contrasts. This follows the IBL training protocol, where harder contrasts are introduced as the mouse becomes more expert at the task.

In [None]:
correct = np.sum(trials_day20.feedbackType == 1) / n_trials_day20
incorrect = np.sum(trials_day20.feedbackType == -1) / n_trials_day20
print(correct * 100)
print(incorrect * 100)

The performance has vastly improved compared to day 1. We can see the mouse is no longer performing at chance level. Once again let's break this down further into the performance on individual contrasts

In [None]:
contrast_performance = np.empty((contrasts.size))
for ic, c in enumerate(contrasts):
    contrast_idx = np.where(trials_day20.contrast == c)[0]
    contrast_performance[ic] = np.sum(trials_day20.feedbackType[contrast_idx] == 1) / contrast_idx.shape[0]

  
plt.plot(contrasts * 100, contrast_performance * 100)
plt.scatter(contrasts * 100, contrast_performance * 100)
plt.ylim([0,100])
plt.xticks([*(contrasts * 100)])
plt.xlabel('Stimulus Contrast (%)')
plt.ylabel('Performance (%)')
plt.gca().spines['right'].set_color('none')
plt.gca().spines['top'].set_color('none')
plt.xticks(rotation=90)

If we change the y axis to show the proportion of rightward choices, this follows the shape of the psychometric curve that is used in IBL training

In [None]:
contrasts_performance_rightward = 1
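
A minimal sketch of how the proportion of rightward choices per contrast could be computed is given below, on mock data. Which value of trials.choice corresponds to a rightward choice is an assumption here (taken as -1) and should be checked against the IBL documentation before applying this to real data.

```python
import numpy as np

# ASSUMPTION: the choice value that denotes a rightward choice
RIGHTWARD = -1

# Mock signed contrasts (negative = left, positive = right) and choices
contrast_mock = np.array([-1.0, -1.0, -0.5, -0.5, 0.5, 0.5, 1.0, 1.0])
choice_mock = np.array([1, 1, 1, -1, -1, 1, -1, -1])

# Fraction of trials at each contrast level on which the choice was rightward
levels = np.unique(contrast_mock)
prop_right = np.array(
    [np.mean(choice_mock[contrast_mock == c] == RIGHTWARD) for c in levels]
)
print(levels)
print(prop_right)
```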

What other interesting plots can we do with this data?
- Try plotting reaction time as a function of contrast
- Try plotting the intervals between trials. Do you notice that toward the end of the session the mouse becomes less engaged?
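
As a starting point for the first suggestion, the sketch below computes a median reaction time per contrast on mock data. It assumes that reaction time can be taken as the response time minus the stimulus onset time, and that these correspond to trials attributes named response_times and stimOn_times; both are assumptions to verify with print(trials_day1.keys()).

```python
import numpy as np

# Mock data standing in for trials.contrast, trials.stimOn_times and
# trials.response_times (the last two attribute names are assumptions)
contrast_mock = np.array([-1.0, -0.5, 0.5, 1.0, 1.0, -0.5])
stim_on_mock = np.array([1.0, 5.0, 9.0, 13.0, 17.0, 21.0])
response_mock = np.array([1.4, 6.0, 9.8, 13.3, 17.5, 22.2])

# Reaction time per trial, in the same units as the input times
reaction_time = response_mock - stim_on_mock

# Median reaction time at each contrast level
levels = np.unique(contrast_mock)
median_rt = np.array(
    [np.median(reaction_time[contrast_mock == c]) for c in levels]
)
print(levels)
print(median_rt)
```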

You should now be familiar with the basics of how to load data with ONE and how to analyse the output from the task. Let's now extend our understanding of ONE and the ALF format and combine electrophysiology data with behavioural data. Alternatively, you can replicate everything in this tutorial using the datajoint approach.