

<h1 align="center">Navigating the Allen Brain Observatory</h1> 
<h3 align="center">CSHL Neural Data Analysis</h3>
<h3 align="center">Tuesday July 23, 2019</h3> 

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import os

### Brain Observatory Setup

In [None]:
from allensdk.core.brain_observatory_cache import BrainObservatoryCache
drive_path = '/data/allen-brain-observatory/visual-coding-2p'
manifest_file = os.path.join(drive_path,'manifest.json')

boc = BrainObservatoryCache(manifest_file=manifest_file)

`manifest_file` is a path to the manifest file.  This needs to reflect where you are storing and accessing the data. If you leave this out, a manifest file will be created in your working directory, and data will be downloaded to this location.

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

<h1> Part 1: Getting oriented to the dataset</h1>

</div>

The Brain Observatory Cache allows us to understand the dimensions of the data in the dataset - the conditions under which each experiment was acquired. It also allows us to access those data, once we select what we want to use.

Let's take a look at the **imaging depths**, **cre lines**, **areas**, and **stimuli** available in the Brain Observatory dataset.

Start by getting all the areas, which we call "targeted structures".

In [None]:
boc.get_all_targeted_structures()

Use similar functions to get all imaging depths, all cre lines, all reporter lines, all stimuli, and all session types.

### Other boc functions
These "get all X" functions return the unique values for key experiment parameters. We can use these parameters to find experiments of interest and use other boc functions to get those data.

### 1.1 Experiment containers & sessions

The experiment container describes a set of 3 imaging sessions performed for the same field of view (i.e., the same targeted area and imaging depth in the same mouse, targeting the same set of neurons). Each experiment container has a unique ID number.

> Choose a visual area and Cre line from the lists above

In [None]:
visual_area = 'VISp'
cre_line ='Cux2-CreERT2'

In [None]:
exps = boc.get_experiment_containers(targeted_structures=[visual_area], cre_lines=[cre_line])

<b>get_experiment_containers</b> returns a list of experiment containers that meet the conditions we have specified. If we don't pass any parameters, it returns all experiment containers.
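
As a minimal sketch of what working with such a list looks like, the cell below builds a few made-up container records and filters them by hand. The ids, depths, and the exact set of keys are assumptions for illustration; the real records come from `boc.get_experiment_containers` and may carry more fields.

```python
# Synthetic stand-ins for container records; every value below is made up.
example_containers = [
    {'id': 1, 'targeted_structure': 'VISp', 'cre_line': 'Cux2-CreERT2', 'imaging_depth': 175},
    {'id': 2, 'targeted_structure': 'VISp', 'cre_line': 'Cux2-CreERT2', 'imaging_depth': 275},
    {'id': 3, 'targeted_structure': 'VISl', 'cre_line': 'Cux2-CreERT2', 'imaging_depth': 175},
]

# Keep only the containers that match one area and one Cre line.
matching = [
    c for c in example_containers
    if c['targeted_structure'] == 'VISp' and c['cre_line'] == 'Cux2-CreERT2'
]

# Count the matches and list the distinct imaging depths among them.
n_containers = len(matching)
depths = sorted({c['imaging_depth'] for c in matching})
print(n_containers, depths)
```

With the real `exps` list, `len(exps)` answers the counting question in the same way.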

How many experiment containers are there for the area and Cre line you chose?

What information do we get from this list? Make a dataframe of this list, to compare the information for each container.

In [None]:
pd.DataFrame(exps)

> Let's look at one experiment container, imaged from Cux2, in VISp, from imaging depth 175 um.

In [None]:
experiment_container_id = 511510736

In [None]:
sessions = boc.get_ophys_experiments(experiment_container_ids=[experiment_container_id])

<b>get_ophys_experiments</b> returns a list of imaging sessions for the conditions that we specified (in this case we passed a single experiment container id). If we don't pass any parameters, it returns all imaging sessions. What other keywords can we use to select imaging sessions?

In [None]:
pd.DataFrame(sessions)

!['Diagram of containers'](http://alleninstitute.github.io/AllenSDK/_static/container_session_layout.png)

> Let's get the id for the imaging session in this container that contains natural scenes

In [None]:
session_id = boc.get_ophys_experiments(experiment_container_ids=[experiment_container_id], stimuli=['natural_scenes'])[0]['id']

In [None]:
print(session_id)

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

<h1> Part 2: Getting data for an experiment</h1>


</div>

The Ophys Experiment data object gives us access to everything in the NWB file for a single imaging session.

In [None]:
data_set = boc.get_ophys_experiment_data(ophys_experiment_id=session_id)

### 2.1 Maximum projection
This is the maximum-intensity projection of the full motion-corrected movie. It shows all of the cells imaged during the session.

In [None]:
max_projection = data_set.get_max_projection()

In [None]:
fig = plt.figure(figsize=(6,6))
plt.imshow(max_projection, cmap='gray')

### 2.2 ROI Masks
These are all of the segmented masks for cell bodies in this experiment.

In [None]:
rois = data_set.get_roi_mask_array()

What is the shape of this array? How many neurons are in this experiment?

Plot the masks for all the ROIs.
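
One way to approach this, sketched on a small synthetic array rather than the real `rois`: assume the mask array is laid out as (cells, rows, columns) with nonzero pixels inside each ROI. The array size and ROI positions below are made up.

```python
import numpy as np

# Synthetic mask stack: 3 made-up ROIs on an 8x8 field of view.
example_rois = np.zeros((3, 8, 8), dtype=np.uint8)
example_rois[0, 1:3, 1:3] = 1
example_rois[1, 4:6, 2:5] = 1
example_rois[2, 5:8, 5:8] = 1

# The first axis indexes cells, so its length is the number of neurons.
n_cells = example_rois.shape[0]

# Collapse the cell axis to get a single image containing every ROI.
all_masks = example_rois.max(axis=0)
covered = int(all_masks.sum())
print(n_cells, all_masks.shape, covered)
```

The collapsed image can then be shown with `plt.imshow(all_masks)`; summing over the first axis instead of taking the maximum would highlight pixels where ROIs overlap.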

### 2.3 DF/F Traces
There are a number of accessible traces in the NWB file, including raw fluorescence, neuropil-corrected traces, demixed traces, and DF/F traces.

In [None]:
ts, dff = data_set.get_dff_traces()

In [None]:
dff.shape

Let's look at the first neuron.

In [None]:
plt.figure(figsize=(10,4))

plt.plot(dff[0,:])

plt.ylabel("DFF (%)", fontsize=16)

Let's look at the first 50 cells. 

In [None]:
fig = plt.figure(figsize=(10,8))
for i in range(50):
    plt.plot(dff[i,:]+(i*2), color='gray')

It looks like different cells are active at different times. What could that be about?

### 2.4 Stimulus epochs

Several stimuli are shown during each imaging session, interleaved with each other. The stimulus epoch table provides information about these interleaved stimulus epochs.
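
To make the idea concrete, here is a small sketch with a synthetic epoch table. It assumes the table has `stimulus`, `start`, and `end` columns, and that `start` and `end` are frame indices into the traces (which is how the overlay plot in this section treats them). All values are made up.

```python
import numpy as np
import pandas as pd

# Synthetic epoch table; stimulus names and frame numbers are made up.
example_epochs = pd.DataFrame({
    'stimulus': ['static_gratings', 'natural_scenes', 'spontaneous', 'static_gratings'],
    'start': [0, 100, 200, 250],
    'end': [90, 190, 240, 340],
})

# Synthetic trace for one cell that is active only during natural scenes.
example_trace = np.zeros(350)
example_trace[100:190] = 1.0

# Epoch length in frames, and the mean of the trace within each epoch.
example_epochs['n_frames'] = example_epochs['end'] - example_epochs['start']
example_epochs['mean_dff'] = [
    example_trace[start:end].mean()
    for start, end in zip(example_epochs['start'], example_epochs['end'])
]

# Total number of frames devoted to each stimulus across its epochs.
frames_per_stimulus = example_epochs.groupby('stimulus')['n_frames'].sum()
print(example_epochs)
print(frames_per_stimulus)
```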


Get the stimulus epoch table from data_set and print the table

In [None]:
stim_epoch = data_set.get_stimulus_epoch_table()
stim_epoch

Overlay stimulus epochs on the DFF traces.  

In [None]:
fig = plt.figure(figsize=(14,8))
for i in range(50):
    plt.plot(dff[i,:]+(i*2), color='gray')
    
#for each stimulus, shade the plot when the stimulus is presented
colors = ['blue','orange','green','red']
for c,stim_name in enumerate(stim_epoch.stimulus.unique()):
    stim = stim_epoch[stim_epoch.stimulus==stim_name]
    for j in range(len(stim)):
        plt.axvspan(xmin=stim.start.iloc[j], xmax=stim.end.iloc[j], color=colors[c], alpha=0.1)

### 2.5 Running speed

This is the running speed of the animal on the rotating disk during the entire session.

In [None]:
dxcm, tsd = data_set.get_running_speed()

Plot the running speed. Label the units (they are cm/s).

Add the running speed to the neural activity and stimulus epoch figure.

In [None]:
fig = plt.figure(figsize=(14,10))
for i in range(50):
    plt.plot(dff[i,:]+(i*2), color='gray')
plt.plot((0.2*dxcm)-20)
    
#for each stimulus, shade the plot when the stimulus is presented
colors = ['blue','orange','green','red']
for c,stim_name in enumerate(stim_epoch.stimulus.unique()):
    stim = stim_epoch[stim_epoch.stimulus==stim_name]
    for j in range(len(stim)):
        plt.axvspan(xmin=stim.start.iloc[j], xmax=stim.end.iloc[j], color=colors[c], alpha=0.1)

### Interesting things

There are some interesting neurons here. Plot the dff trace for neuron 49 with the stimulus epochs and the running trace. Repeat for neuron 4, and for neuron 35. What is interesting about these neurons?

### 2.5b Extracted events
As of the October 2018 data release, we are providing access to events extracted from the DFF traces using the L0 method developed by Sean Jewell and Daniela Witten. These are not stored in the NWB file, so they are not accessed through the data_set object, but they are available through the boc.

In [None]:
events = boc.get_ophys_experiment_events(ophys_experiment_id=session_id)

In [None]:
events.shape

Plot the events and the dff trace for one neuron (say neuron 1). Zoom in to a relevant portion.

Remake our plot of neural activity, stimulus, and running using events.
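
A possible way to pick a "relevant portion" is to look for the frames where events are nonzero and zoom to a window around them. The sketch below uses a synthetic event trace and assumes events are stored as zeros with nonzero magnitudes at event frames; the values and the padding are made up.

```python
import numpy as np

# Synthetic event trace for one cell: zero everywhere except three events.
example_events = np.zeros(20)
example_events[[3, 4, 12]] = [0.5, 0.2, 1.1]

# Frames at which an event occurs.
event_frames = np.flatnonzero(example_events)

# A zoom window that pads the first and last event by a couple of frames.
pad = 2
window_start = max(int(event_frames[0]) - pad, 0)
window_stop = min(int(event_frames[-1]) + pad + 1, example_events.size)
window = slice(window_start, window_stop)
print(event_frames, window)
```

The same `window` can be used to index both the events and the dff trace for that cell, for example `plt.plot(dff[1, window])`, so the two are compared over identical frames.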

### 2.6 Stimulus Table
For each stimulus there is a stimulus table with information about the condition and timing of each trial. 

In [None]:
natural_scene_table = data_set.get_stimulus_table('natural_scenes')

In [None]:
natural_scene_table.head()

Get the stimulus table for static gratings. Print the top of this dataframe. What are the parameters for this stimulus?
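
Once a stimulus table is in hand, its `start` and `end` columns can be used to cut a trace into trials. The sketch below does this on synthetic data; it assumes a table with `frame`, `start`, and `end` columns like the natural scenes table above, with `start` and `end` as frame indices into the trace. All values are made up.

```python
import numpy as np
import pandas as pd

# Synthetic stimulus table: image 5 is shown twice, image 7 once.
example_table = pd.DataFrame({
    'frame': [5, 7, 5],
    'start': [10, 20, 30],
    'end': [17, 27, 37],
})

# Synthetic trace for one cell with a different response on each trial.
example_trace = np.zeros(40)
example_trace[10:17] = 1.0
example_trace[20:27] = 5.0
example_trace[30:37] = 3.0

# Mean of the trace during each trial.
example_table['mean_dff'] = [
    example_trace[start:end].mean()
    for start, end in zip(example_table['start'], example_table['end'])
]

# Average over repeated presentations of the same image.
mean_by_image = example_table.groupby('frame')['mean_dff'].mean()
print(mean_by_image)
```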

### 2.7 Stimulus Template

The images and movies presented during the session are also included in the NWB file as the stimulus template. Stimuli that are generated programmatically (e.g., drifting and static gratings) do not have a stimulus template. There are tools in the SDK to recreate these stimuli.

In [None]:
natural_scene_template = data_set.get_stimulus_template('natural_scenes')

In [None]:
natural_scene_template.shape

Look at the scene presented for the first trial

In [None]:
scene_number = natural_scene_table.frame.loc[0]
plt.imshow(natural_scene_template[scene_number,:,:], cmap='gray')

Plot the times when this image is presented, overlaid on the activity of the first 50 neurons.

In [None]:
fig = plt.figure(figsize=(10,8))
for i in range(50):
    plt.plot(dff[i,:]+(i*2), color='gray')
    
#shade traces with the time of each presentation of the above scene
stim_subset = natural_scene_table[natural_scene_table.frame==scene_number]
for j in range(len(stim_subset)):
    plt.axvspan(xmin=stim_subset.start.iloc[j], xmax=stim_subset.end.iloc[j], color='red', alpha=0.4)

### 2.8 Metadata
This includes metadata about the experiment, some of which we used to select this experiment, some of which is only included here.

In [None]:
data_set.get_metadata()

Metadata includes: age, sex, device & device_name, genotype, start_time.  Note: start_time is not the time the experiment was collected, but rather the time the NWB file was created.  We are hoping to fix this soon.  :(

### 2.9 Cell ids and indices

Each cell in the dataset has a unique id, called the cell specimen id. To find the cells in this session, get the cell specimen ids. This id can be used to find experiments and sessions, as we'll do later today.

In [None]:
cell_ids = data_set.get_cell_specimen_ids()

In [None]:
cell_ids

Within each individual session, a cell id is associated with an index. This index maps into the dff or event traces. Pick one cell id from the list above and find the index for that cell. Look for the cell specimen indices.
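
The id-to-index mapping itself is just a position lookup, which can be sketched with a synthetic id array. The ids below are made up; on the real data, `data_set.get_cell_specimen_indices` performs this lookup for a list of ids.

```python
import numpy as np

# Synthetic cell specimen ids, in the order the cells appear in the traces.
example_cell_ids = np.array([517400001, 517400017, 517400042])
target_id = 517400017

# Position of the target id in the array; this is the row to use in dff.
matches = np.flatnonzero(example_cell_ids == target_id)
cell_index = int(matches[0])
print(cell_index)
```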

### 2.10 Cell Specimen Table

For every cell in the entire observatory dataset, there are precomputed metrics for the different stimuli. These metrics can be useful for identifying neurons you want to use for further analysis. 
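
As a rough sketch of how such a table can be used to pick cells, the example below filters a synthetic table by container and by a response p-value. The column names `experiment_container_id` and `p_ns`, the 0.05 threshold, and all values are assumptions for illustration; check `cell_specimens.keys()` for the columns actually available.

```python
import pandas as pd

# Synthetic cell specimen table; ids and p-values are made up.
example_specimens = pd.DataFrame({
    'cell_specimen_id': [101, 102, 103, 104],
    'experiment_container_id': [1, 1, 2, 2],
    'p_ns': [0.001, 0.20, 0.03, float('nan')],
})

# Cells from one container whose p-value falls below the chosen threshold.
# Missing values (NaN) compare as False, so those cells are excluded.
in_container = example_specimens['experiment_container_id'] == 1
is_responsive = example_specimens['p_ns'] < 0.05
responsive = example_specimens[in_container & is_responsive]
responsive_ids = responsive['cell_specimen_id'].tolist()
print(responsive_ids)
```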

In [None]:
cell_specimens = pd.DataFrame(boc.get_cell_specimens())

In [None]:
cell_specimens.head()

In [None]:
cell_specimens.keys()

### 2.11 Motion correction
This returns a dataframe of the motion correction applied to each frame of the movie for both x and y.

In [None]:
motion = data_set.get_motion_correction()

In [None]:
motion.head()

In [None]:
plt.figure(figsize=(10,4))
plt.plot(motion.x_motion)
plt.plot(motion.y_motion)
plt.ylabel("Pixels")

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">

<h2> 3. (BONUS!) Accessing calcium movies.</h2>

If you are accessing the data on AWS, you can find the files for the calcium data (more precisely, the motion-corrected calcium movies). Here is an example piece of code that shows the beginning of the experiment we've been looking at.

</div>

In [None]:
import h5py
from matplotlib import animation, rc
from IPython.display import HTML

In [None]:
raw_data_dir = '/data/allen-brain-observatory/visual-coding-2p/ophys_movies/'

In [None]:
def get_raw_data_path(session_id):
    return os.path.join(raw_data_dir, 'ophys_experiment_'+str(session_id)+'.h5')

exp_path = get_raw_data_path(session_id)

In [None]:
raw_data = h5py.File(exp_path, 'r')
raw_data['data']

In [None]:
fig, ax = plt.subplots(figsize=(10,10))

im = ax.imshow(raw_data['data'][0])
ax.axis('off')

def init():
    im.set_data(raw_data['data'][0])
    return (im,)

def animate(i):
    im.set_data(raw_data['data'][i])
    return (im,)

anim = animation.FuncAnimation(fig, animate, init_func=init, frames=30, interval=1000./30, blit=True)