<img src="../PythonBootcamp/support_files/cropped-SummerWorkshop_Header.png">  

<h1 align="center">Introduction to the Allen Brain Observatory</h1> 
<h3 align="center">August 24, 2016</h3> 

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<p>This notebook documents some classes and functions in the AllenSDK that help manipulate files and data structures in the Allen Brain Observatory. 
</div>


In [None]:
# please make sure your drive_path is set, so that the notebook can find the data files on the hard drive

# OS X
drive_path = '/Volumes/Brain2016'

# Windows (a good guess)
# drive_path = 'e:/'

# Linux (will vary; the following is possibly what Ubuntu will do)
# drive_path = '/media/Brain2016/'

In [None]:
# We need to import these modules to get started
import numpy as np
import pandas as pd
import os
import sys


import matplotlib.pyplot as plt
%matplotlib inline

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<p>The main entry point is the `BrainObservatoryCache` class.  This class is responsible for downloading any requested data or metadata as needed and storing it in well-known locations.  For this workshop, all of the data has been preloaded onto the hard drives you have received.

<p>We begin by importing the `BrainObservatoryCache` class and instantiating it.

<p>`manifest_path` is a path to the manifest file.  We will use the manifest file preloaded onto your Workshop hard drives.  Make sure that `drive_path` is set correctly for your platform.  (See the first cell in this notebook.)
</div>
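
A quick sanity check on `drive_path` can save some confusion before the cache is created. The sketch below only tests whether a manifest file sits where this notebook expects it; `find_manifest` is a hypothetical helper written for this illustration, not part of the AllenSDK, and the path used in the demo is deliberately one that should not exist.

```python
import os

def find_manifest(drive_path):
    """Return the expected manifest path, or None if no file is there."""
    # Same layout the notebook uses: <drive_path>/BrainObservatory/manifest.json
    candidate = os.path.join(drive_path, 'BrainObservatory', 'manifest.json')
    return candidate if os.path.isfile(candidate) else None

# Demonstrate the failure case with a made-up path
missing = find_manifest(os.path.join('no_such_drive_for_demo', 'nowhere'))
print('manifest found: ' + str(missing is not None))
```

If the helper returns `None` for your own `drive_path`, revisit the first cell of the notebook before going further.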


In [None]:
from allensdk.core.brain_observatory_cache import BrainObservatoryCache

manifest_path = os.path.join(drive_path,'BrainObservatory','manifest.json')
boc = BrainObservatoryCache(manifest_file=manifest_path)

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 1.1:**  Get information about what's in the dataset from BrainObservatoryCache

<p>The following methods for BrainObservatoryCache retrieve the available depths, cre lines, areas, and stimuli.  Notice that these parameters outline the 'data cube'.
</div>

In [None]:
# Get a list of all targeted areas
targeted_structures = boc.get_all_targeted_structures()
print('all targeted structures: ' + str(targeted_structures))

# Get a list of all imaging depths
depths = boc.get_all_imaging_depths()
print('all imaging depths: ' + str(depths))

# Get a list of all cre driver lines
cre_lines = boc.get_all_cre_lines()
print('all cre lines: ' + str(cre_lines))

# Get a list of all stimuli
stims = boc.get_all_stimuli()
print('all stimuli: ' + str(stims))


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 1.2:**  Use tab completion in Jupyter to see what other methods the BrainObservatoryCache has.
</div>

In [None]:
# Hit the 'tab' key with the cursor just after the '.'
boc.

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<h2>Experiment containers</h2>
<p>An experiment container describes a set of 3 experiment sessions performed at the same location (targeted area and imaging depth) in the same mouse, all targeting the same set of cells. Each experiment container has a unique ID number.
</div>

In [None]:
expt_cont_list = boc.get_experiment_containers()

print("There are " + str(len(expt_cont_list)) + " experiment containers.")

In [None]:
# example experiment_container_ids to use in this notebook
expt_list = [511510699, 511510664, 511510797, 511507650, 511510917, 
             511510675, 511510911, 511510860, 511510658, 511498500]

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 1.3:** Pick an experiment container.  For this session you need one experiment container, which you'll use for the remainder of the tutorial.  Execute the following cells to pick one.
</div>

In [None]:
# pick a random experiment container
expt_index = np.random.randint(0,len(expt_list))
# get expt_container_id for that index
expt_container_id = expt_list[expt_index]

print("YOU GET AN EXPERIMENT CONTAINER!! EVERYONE GETS AN EXPERIMENT CONTAINER!!!")
print('expt_container_id = ' + str(expt_container_id))

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 1.3.1:** Find out the location and Cre line of this experiment container
</div>

In [None]:
print("Experiment container " + str(expt_container_id) + " is ")
print(boc.get_experiment_containers(ids=[expt_container_id]))

<div style="background: #FFF0F0; border-radius: 3px; padding: 10px;">
**Poll** Report your experiment container's targeted structure here:
[Response](https://www.polleverywhere.com/multiple_choice_polls/xolRy5TQVGAhjdU)
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 1.4:** Get information about all of the experiment <strong>sessions</strong> in your experiment <strong>container</strong>.  This is accomplished with the `get_ophys_experiments` method.  
</div>

In [None]:
expt_session_info = boc.get_ophys_experiments(experiment_container_ids=[expt_container_id])
print(expt_session_info)

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">

<p>`get_experiment_containers` returns a list of dictionaries that contain information about experiment containers.

<p>`get_ophys_experiments` returns a list of dictionaries that contain information about experiment sessions.  Here we are using keyword arguments to return just those experiment sessions that belong to our experiment container.
</div>
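
Because both methods return plain lists of dictionaries, ordinary Python is enough to inspect them. The sketch below uses invented records rather than real query results: the keys mirror the kind of fields a session dictionary carries, but every value is made up for illustration.

```python
from collections import Counter

# Toy records shaped like session dictionaries; all values are invented
sessions = [
    {'id': 1, 'experiment_container_id': 10, 'targeted_structure': 'VISp',
     'imaging_depth': 175, 'session_type': 'three_session_A'},
    {'id': 2, 'experiment_container_id': 10, 'targeted_structure': 'VISp',
     'imaging_depth': 175, 'session_type': 'three_session_B'},
    {'id': 3, 'experiment_container_id': 10, 'targeted_structure': 'VISp',
     'imaging_depth': 175, 'session_type': 'three_session_C'},
    {'id': 4, 'experiment_container_id': 20, 'targeted_structure': 'VISl',
     'imaging_depth': 275, 'session_type': 'three_session_A'},
]

# Keep only the sessions that belong to one container
in_container = [s for s in sessions if s['experiment_container_id'] == 10]
print(len(in_container))

# Count sessions per targeted structure
per_area = Counter(s['targeted_structure'] for s in sessions)
print(dict(per_area))
```

Filtering with keyword arguments, as in the cell above, does the same job on the real data and is usually more convenient than filtering by hand.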

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 1.5:**  Turn it into a DataFrame for easy access
</div>

In [None]:
expt_session_frame = pd.DataFrame(expt_session_info)
expt_session_frame

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 1.1:**  Find all experiment sessions from a given area, depth, cre line, or specific stimulus. How many of each are there? (Hint:  use the `help` function to see the other optional arguments for `get_ophys_experiments` or `get_experiment_containers`.)
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 1.2:**  Make a pandas table from all experiment sessions.  Perform Exercise 1.1 using this table.
</div>
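
As a warm-up for these exercises, here is a minimal sketch of counting and filtering with pandas. The table is a toy built from invented records, so the counts say nothing about the real dataset; only the pattern (boolean indexing, `value_counts`, `groupby`) carries over.

```python
import pandas as pd

# Toy session records; all values are invented for illustration
toy_sessions = pd.DataFrame([
    {'id': 1, 'targeted_structure': 'VISp', 'imaging_depth': 175, 'cre_line': 'Cux2-CreERT2'},
    {'id': 2, 'targeted_structure': 'VISp', 'imaging_depth': 275, 'cre_line': 'Rorb-IRES2-Cre'},
    {'id': 3, 'targeted_structure': 'VISl', 'imaging_depth': 175, 'cre_line': 'Cux2-CreERT2'},
    {'id': 4, 'targeted_structure': 'VISp', 'imaging_depth': 175, 'cre_line': 'Cux2-CreERT2'},
])

# Boolean indexing: rows from one area at one depth
mask = (toy_sessions['targeted_structure'] == 'VISp') & (toy_sessions['imaging_depth'] == 175)
visp_175 = toy_sessions[mask]

# Number of sessions for each area
area_counts = toy_sessions['targeted_structure'].value_counts()

# Number of sessions for each (area, depth) combination
area_depth_counts = toy_sessions.groupby(['targeted_structure', 'imaging_depth']).size()

print(area_counts)
print(area_depth_counts)
```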

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 1.3:**  Find the experiment id for Session A from your experiment container.  Save this as `session_id`.
</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<h2>Cell Specimens</h2>


<p>`get_cell_specimens` is a method of the BrainObservatoryCache that provides important pre-computed characteristics of all the cells in the data set.  
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 1.6:**  Make a pandas table from the information returned by `get_cell_specimens` for just the cells in your experiment container. How many cells are in this container?
</div>

In [None]:
cell_specimens_df = pd.DataFrame(boc.get_cell_specimens(experiment_container_ids=[expt_container_id]))
cell_specimens_df.head()

In [None]:
print(len(cell_specimens_df))

<div style="background: #FFF0F0; border-radius: 3px; padding: 10px;">
**Poll** How many cells are in your experiment container? Answer here: [response](https://www.polleverywhere.com/free_text_polls/cXVrJnkXpPZA6ce)
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 1.4:**  Filter the table from the previous task to find all cells in your experiment container id that have `dsi_dg` (Direction Selectivity Index for Drifting Gratings)  &lt; 1.0.
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 1.5:**  Find the cell in your filtered dataframe that has the largest `dsi_dg` that is less than 1.0.  Save the cell_specimen_id of this cell to a variable called `cell_specimen_id`.
</div>
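
The filter-then-pick-the-maximum pattern in the last two exercises can be sketched on a toy table. The IDs and `dsi_dg` values below are invented; one value is deliberately missing to show that a comparison against NaN evaluates to False, so that row drops out of the filter.

```python
import pandas as pd

# Toy cell table; IDs and dsi_dg values are made up
toy_cells = pd.DataFrame({
    'cell_specimen_id': [101, 102, 103, 104],
    'dsi_dg': [0.35, 1.20, 0.92, float('nan')],
})

# Boolean indexing keeps rows where the comparison is True (NaN compares False)
selective = toy_cells[toy_cells['dsi_dg'] < 1.0]

# idxmax returns the row label of the largest remaining value
best_row = selective['dsi_dg'].idxmax()
best_id = int(selective.loc[best_row, 'cell_specimen_id'])
print(best_id)
```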

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 1.6:**  Find the preferred direction and temporal frequency for the cell you have identified in `cell_specimen_id`.  Save these to `ori` and `tf`.
</div>

<div style="background: #FFF0F0; border-radius: 3px; padding: 10px;">
**Poll** What is the preferred temporal frequency of your cell? Respond [here](https://www.polleverywhere.com/multiple_choice_polls/cn8I6mSMithCcKI)
</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<p>We will be using the cell you have recorded in `cell_specimen_id` for much of the remainder of this notebook.

<h2>The Data Object</h2>
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 2.1:**  Create a data_set object for this experiment session.

<p>The data_set object contains methods and information for a single experiment session (one of the 3 in the experiment container).
</div>

In [None]:
data_set = boc.get_ophys_experiment_data(ophys_experiment_id = session_id)

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 2.2:** Use either `dir` or tab-completion to find out what methods the new `data_set` object has.
</div>

In [None]:
data_set.

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">

<p>Using the methods you find, perform the following exercises.
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 2.1:** Get the metadata for your data set. How old was the mouse in this experiment?  Was it male or female?  
</div>

<div style="background: #FFF0F0; border-radius: 3px; padding: 10px;">
**Poll** What was the sex of the mouse in your experiment? Respond [here](https://www.polleverywhere.com/multiple_choice_polls/ZuaLuwbeWqAjTfl)
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 2.3:** Get the max projection image for your data set.  
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 2.4:** Let's find the cell you recorded in `cell_specimen_id` in `data_set`.  `cell_specimen_id` is a unique cell identifier that is used across all sessions in which that cell appears.  Within each individual session, each cell also has an index specific to that session.  There are two methods of data_set that allow you to map back and forth between these two identifiers.  Find them and use one of them to save the session-specific index for your cell to `cell_index`.
</div>
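
Conceptually, the two identifiers are related the way a value and its position in a list are. The sketch below illustrates that idea with invented IDs and two hypothetical helper functions; it does not use or name the actual `data_set` methods, which are left for you to find.

```python
# Toy list: the position in the list plays the role of the per-session index.
# The IDs are invented for illustration.
session_cell_ids = [517400001, 517400007, 517400003]

def id_to_index(cell_id, ids):
    """Hypothetical helper: position of a cell ID within one session."""
    return ids.index(cell_id)

def index_to_id(index, ids):
    """Hypothetical helper: cell ID stored at a given session index."""
    return ids[index]

toy_index = id_to_index(517400007, session_cell_ids)
print(toy_index)
print(index_to_id(toy_index, session_cell_ids))
```

The same cell can sit at a different index in another session, which is why the session-independent ID is the one to store when comparing across sessions.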

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 2.5:** Get the roi mask for your cell.  (Hint:  There are two methods that return roi masks.  In one of them masks are returned as lists of python objects.  What methods do they have?  What is the type of this object?)  What is the size and shape of the mask?
</div>
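
If you end up with the mask as a 2D boolean array, NumPy answers the size and shape questions directly. This is a sketch on a tiny invented mask; the dimensions of the real field of view are not assumed here.

```python
import numpy as np

# Toy 8 x 10 field of view with a 3 x 2 block of ROI pixels
toy_mask = np.zeros((8, 10), dtype=bool)
toy_mask[2:5, 4:6] = True

# Shape of the full image the mask lives in
print(toy_mask.shape)

# Number of pixels that belong to the ROI
n_pixels = int(toy_mask.sum())

# Bounding box of the ROI as (row_min, row_max, col_min, col_max)
rows, cols = np.nonzero(toy_mask)
bbox = (int(rows.min()), int(rows.max()), int(cols.min()), int(cols.max()))
print(n_pixels)
print(bbox)
```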

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 2.6:** Plot the mask overlaid on the max projection.  (Hint:  imshow has an optional parameter called `alpha`.)
</div>

<div style="background: #FFF0F0; border-radius: 3px; padding: 10px;">
**Poll** Are you having [fun?](https://www.polleverywhere.com/multiple_choice_polls/8SUrezEJfhoK7FY)
</div>

# Traces

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 2.7:**  What kinds of traces can you extract from the data object?  Retrieve the "corrected fluorescence" traces.  What is the shape of this object?  These methods return a tuple of length two.  The first value is the array of time stamps for the acquisition frames; the second is an array of shape (number_of_cells, time_points).  How many cells are in your data set?  Plot the "corrected fluorescence" trace for the cell whose index you saved in `cell_index`.
</div>
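
The (time stamps, traces) layout described above can be sketched with synthetic arrays. Everything below is invented, including the 30 Hz frame rate used to build the time axis; the point is only how to unpack the tuple and pull out one cell's row.

```python
import numpy as np

# Synthetic stand-in for a returned (timestamps, traces) tuple
n_cells, n_frames = 4, 300
toy_timestamps = np.arange(n_frames) / 30.0   # assumed 30 Hz, toy value only
rng = np.random.RandomState(0)
toy_traces = rng.rand(n_cells, n_frames)
result = (toy_timestamps, toy_traces)

# Unpack the tuple: first the time axis, then the (cells, time_points) array
t, f = result
print(f.shape)

# One row per cell, so a single cell's trace is one row of the array
toy_cell_index = 2
cell_trace = f[toy_cell_index]
print(cell_trace.shape)
```

With real data, `plt.plot(t, cell_trace)` would then draw that cell's trace against time.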

# Stimuli

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.1:**  What stimuli were shown in this session? Use a method of the data_set object to find out.
</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<p>The stimulus table stores the timing information for each stimulus condition that was presented.
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.2:** Use a method of the data_set object to get the stimulus_table for drifting gratings.  (Use help to find the necessary arguments for the method.)  What kind of object is this?  How many stimulus conditions are there?  How many orientations?  How many temporal frequencies?  How many trials of each condition were shown?  How long was each presentation?  (Hint:  use boolean indexing.)

<p><strong>Important hint</strong>: trial start and end times are given in acquisition frames, which count each frame acquired by the two-photon microscope, not in seconds.  This is the same index used for the fluorescence traces.
</div>
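
The questions in Exercise 3.2 map onto a handful of pandas operations. The sketch below runs them on a toy stimulus table: the column names follow the ones this notebook refers to (`orientation`, `temporal_frequency`, `start`, `end`), but the rows are invented and far fewer than in a real session.

```python
import pandas as pd

# Toy stimulus table; start and end are acquisition frame indices (made up)
toy_stim_table = pd.DataFrame({
    'orientation':        [0.0, 90.0, 0.0, 90.0, 0.0, 90.0],
    'temporal_frequency': [1.0, 1.0, 2.0, 2.0, 1.0, 1.0],
    'start':              [100, 190, 280, 370, 460, 550],
    'end':                [160, 250, 340, 430, 520, 610],
})

# How many distinct values of each parameter were shown?
n_orientations = toy_stim_table['orientation'].nunique()
n_tfs = toy_stim_table['temporal_frequency'].nunique()

# How many trials were shown for each condition?
trials_per_condition = toy_stim_table.groupby(
    ['orientation', 'temporal_frequency']).size()

# Presentation length, in acquisition frames rather than seconds
duration_frames = toy_stim_table['end'] - toy_stim_table['start']

print(trials_per_condition)
print(duration_frames.unique())
```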

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.3:** Plot the fluorescence trace for the cell in `cell_index` for a few trials using the start and end times of the trials.
</div>
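
Since start and end are frame indices, cutting out a trial is plain array slicing, and the time stamps give the conversion to seconds. A minimal sketch with a synthetic trace follows; the 30 Hz rate and the frame numbers are assumptions made for the toy.

```python
import numpy as np

# Synthetic trace sampled at an assumed 30 frames per second
toy_timestamps = np.arange(1000) / 30.0
toy_trace = np.sin(toy_timestamps)

# Made-up trial boundaries, expressed in acquisition frames
start, end = 100, 160

# Slice both arrays with the same frame indices
trial_t = toy_timestamps[start:end]
trial_f = toy_trace[start:end]

# Trial length in seconds, read off the time stamps
duration_s = toy_timestamps[end] - toy_timestamps[start]
print(len(trial_f))
print(round(duration_s, 3))
```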

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.4:** Create a plot that shows when the drifting gratings were displayed.  (Hint:  `axvspan` is a method of matplotlib Axes objects that fills in the background between two x positions.  See the following example.)
</div>

In [None]:
fig,ax = plt.subplots(1)

t = np.linspace(0,2.0*np.pi,1000)
ax.plot(t,np.cos(t))
ax.axvspan(xmin=np.pi/4,xmax=7*np.pi/4,color='g',alpha=0.3)

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<h2>Analysis</h2>


<p>The analysis objects summarize this trial data and provide convenient DataFrame objects.  
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 3.1:**  Import the `DriftingGratings` object and instantiate it with `data_set`.
</div>

In [None]:
from allensdk.brain_observatory.drifting_gratings import DriftingGratings

dg = DriftingGratings(data_set)

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">

<p>sweep_response is a DataFrame that contains the dF/F response of each cell during each stimulus trial. It shares its index with stim_table. Each entry of the table holds a time series that extends from 1 second prior to the start of the trial to 1 second after the end of the trial. The sweep_response table is organized with one column per cell and one row per sweep.

<p>mean_sweep_response provides the mean dF/F for each trial.
</div>
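
To make the relationship between the two tables concrete, here is one plausible way to reduce a padded sweep to a single number. It is a sketch only: the frame rate, the sweep length, and the choice to average over exactly the stimulus window are assumptions for this toy, not a statement of how the SDK computes `mean_sweep_response`.

```python
import numpy as np

frames_per_second = 30   # assumed frame rate for the toy
stim_frames = 60         # assumed stimulus length: 2 s at 30 frames per second

# Toy sweep: 1 s of baseline at 0, 2 s of response at 1, 1 s of tail at 0
toy_sweep = np.concatenate([np.zeros(frames_per_second),
                            np.ones(stim_frames),
                            np.zeros(frames_per_second)])

# Skip the 1 s of padding before the trial, keep only the stimulus window
stim_window = toy_sweep[frames_per_second:frames_per_second + stim_frames]
mean_response = float(stim_window.mean())
print(mean_response)
```

Averaging the whole padded sweep instead would dilute the response with baseline frames, which is why the window matters.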

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.5:** Get the sweep_response for this stimulus and data set.  What type of object is this?  What data does it contain?
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.6:** Get the mean_sweep_response for this stimulus and data set.  How does this object differ from sweep_response?
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.7:** Find the trials for a specific stimulus condition
(ex: temporal_frequency = 2 and orientation = 90).  Use the stimulus table and boolean indexing.
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.8:** Use the trials you've found and the sweep_response table to plot the response across trials.  (Extra credit for highlighting the interval over which the stimulus is 'on'.)
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.9:**  Compute and plot the mean response over trials for the preferred condition for your selected cell.
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.10:**  Repeat this process using `mean_sweep_response` in order to compute a single numerical value for the response to the preferred orientation and temporal frequency.
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.11:**  Generate a matrix of response values over all direction and temporal frequency conditions by repeating the previous calculation for each condition.  Plot a heat map of the mean response across all stimulus conditions.  Plot direction and temporal frequency tuning curves by averaging over each.
</div>
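
One compact way to build the condition matrix is a pandas pivot table, sketched here on invented per-trial values. The column name `mean_response` is a placeholder chosen for this example, and the toy uses only two directions and two temporal frequencies.

```python
import pandas as pd

# Toy per-trial responses for one cell; all numbers are made up
toy_trials = pd.DataFrame({
    'orientation':        [0, 0, 90, 90, 0, 0, 90, 90],
    'temporal_frequency': [1, 2, 1, 2, 1, 2, 1, 2],
    'mean_response':      [0.1, 0.3, 0.5, 0.9, 0.3, 0.5, 0.7, 1.1],
})

# Rows are directions, columns are temporal frequencies, entries are trial means
response_matrix = toy_trials.pivot_table(index='orientation',
                                         columns='temporal_frequency',
                                         values='mean_response',
                                         aggfunc='mean')

# Tuning curves: average the matrix over the other stimulus dimension
direction_tuning = response_matrix.mean(axis=1)
tf_tuning = response_matrix.mean(axis=0)

print(response_matrix)
```

Passing `response_matrix.values` to `plt.imshow` would then give the heat map, and the two tuning Series can be plotted directly.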

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 3.2:**  The easy way!  We did this as a pedagogic exercise so that you could learn about the sdk and the data.  Should you need it, this matrix has been computed already and is available in the `response` attribute for `DriftingGratings`.
</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<h2>Other Stimulus Types</h2>
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 3.3:**  There are analysis objects for the other stimulus types.  You saw above that Session A contains responses for drifting gratings, natural movies, and spontaneous activity.  Instantiate the Natural Movie object and see what methods and attributes are available.
</div>

In [None]:
from allensdk.brain_observatory.natural_movie import NaturalMovie 

nm1 = NaturalMovie(data_set, 'natural_movie_one')  # second argument names the movie to analyze; check the SDK docs for the other movie names

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.12:**  There are also objects for StaticGratings, NaturalScenes, and LocallySparseNoise.  For each of these, use what you've learned to find an experiment with each of these stimulus types, instantiate the analysis object, and explore the stimulus tables and available attributes.
</div>

In [None]:
from allensdk.brain_observatory.static_gratings import StaticGratings
from allensdk.brain_observatory.natural_scenes import NaturalScenes
from allensdk.brain_observatory.locally_sparse_noise import LocallySparseNoise

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.13:**  Load the stimulus template for the LocallySparseNoise stimulus. Plot the first frame of the stimulus.
</div>

In [None]:
session_id_C = expt_session_frame[expt_session_frame.session_type=='three_session_C'].id.values[0]
data_set_C = boc.get_ophys_experiment_data(ophys_experiment_id = session_id_C)

In [None]:
lsn_template = data_set_C.get_stimulus_template('locally_sparse_noise')

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Exercise 3.14:**  Find all of the frames with a white square located at x=0,y=0. How many frames are there?
</div>
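
Finding frames by the value of a single pixel is a one-line NumPy comparison once the template is an array of shape (frames, rows, columns). The sketch below uses a tiny invented template; the pixel codes (127 for gray, 255 for white, 0 for black) and the axis order are assumptions to check against `lsn_template` itself.

```python
import numpy as np

# Assumed pixel codes for this toy; verify against the real template
GRAY, WHITE, BLACK = 127, 255, 0

# Toy template: 5 frames of a 4 x 6 grid, gray everywhere to start
toy_template = np.full((5, 4, 6), GRAY, dtype=np.uint8)
toy_template[1, 0, 0] = WHITE
toy_template[3, 0, 0] = WHITE
toy_template[4, 0, 0] = BLACK   # black at the same location, should not match
toy_template[2, 1, 3] = WHITE   # white elsewhere, should not match

# Compare the pixel at (0, 0) in every frame, then collect matching frame numbers
white_frames = np.nonzero(toy_template[:, 0, 0] == WHITE)[0]
print(white_frames)
print(len(white_frames))
```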

### Other Exercises or Homework

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Homework 1:**  Compute Receptive Fields for the ON and OFF responses using the Locally Sparse Noise stimulus.  (If you're having trouble, try testing your code on this cell_specimen_id:   )
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Homework 2:**  Pick an image from the Natural Scenes. Find all of the cells for which this is the preferred image. Determine the spatial frequency tuning of those cells. Does it differ from the population as a whole? Does it differ across areas, Cre lines, or layers?
</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<h2>Project Ideas</h2>

<p>Here are some ideas to get you started in thinking about your projects.
</div>

<ol>
<li> How do cells' responses differ across regions, layers, and Cre lines?  How best can these differences be captured?
<li> What is the distribution of feature responses?  How does preferred orientation, say, vary across regions, layers, and Cre lines?
<li> Do the responses to one type of stimulus allow us to predict the responses to a different type?  Are grating responses consistent with natural image responses?
<li> Can you distinguish "simple" and "complex" cells in the dataset?  What is the right model or metric to use?
<li> Characterize the cross correlations (both "noise" and "signal" correlations) in the data set.  Can you model this variability?  
<li> Develop models of stimulus response that control for running speed or include temporal dynamics.
<li> How well can you identify the stimulus category given the activity of a set of neurons within an experiment, i.e. can you "decode" the stimulus?  What is the best way to do this?  What features are necessary?  Can you identify cells that carry "more" information about stimuli?
<li> What population metrics are useful for describing the data?  Can you model the population activity?
<li> What is the best way to visualize the activity of many cells in an experiment?  Is there a useful dimensional reduction that can help you?
</ol>