<img src="../resources/cropped-SummerWorkshop_Header.png">  

<h1 align="center">Brain Observatory - Visual Behavior </h1> 
<h2 align="center">Summer Workshop on the Dynamic Brain </h2> 
<h3 align="center">Monday, August 26, 2019</h3> 

<img src="../resources/visual_behavior_experiment.png" height="400" width="1200">  


This notebook will introduce you to the Visual Behavior Brain Observatory dataset. This dataset uses 2-photon calcium imaging (also called optical physiology or ophys) to measure neural activity in mice performing a visual change detection task. One aim of this dataset is to ask: how is sensory coding influenced by expectation, engagement, and experience?

The change detection task consists of a series of image presentations. Each image flash is 250ms followed by 500ms of gray screen. The task for the mouse is to lick in a 750ms response window following a change in image identity. On each trial, a change time is scheduled. On go trials, a change in image identity occurs. On catch trials, no image change occurs (aka 'sham change'), and we measure false alarm rates in the same 750ms response window. Correct responses are rewarded and licks outside the response window result in a timeout.
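
The timing arithmetic in this paragraph can be sketched with a few lines of NumPy. This is only an illustration: the variable names, the change time, and the lick times are made up, and the response window is assumed to span the full 750ms after the change, as described above.

```python
import numpy as np

# Timing values taken from the task description (seconds)
FLASH_DURATION = 0.25    # image on screen
GRAY_DURATION = 0.50     # gray screen between images
RESPONSE_WINDOW = 0.75   # window after a change in which a lick counts

# One flash cycle is image + gray screen
flash_period = FLASH_DURATION + GRAY_DURATION

# Onset times of the first 8 flashes, assuming the first flash starts at t = 0
flash_onsets = np.arange(8) * flash_period

# Hypothetical change time: the change happens at the onset of the 5th flash
change_time = flash_onsets[4]

# Hypothetical lick times for this trial
lick_times = np.array([1.9, 3.2, 3.4])

# A lick counts as a response if it falls inside the window after the change
latency = lick_times - change_time
in_window = (latency >= 0) & (latency < RESPONSE_WINDOW)

print('flash period (s):', flash_period)
print('change time (s):', change_time)
print('licks inside the response window:', lick_times[in_window])
```

On a catch trial the same window would be applied around the scheduled (sham) change time, and a lick inside it would count as a false alarm.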

There are 8 natural scene images shown in each behavioral session. Mice learn the task with one set of 8 natural scenes which become highly familiar with experience. During the imaging phase of the experiment, mice perform the task with the familiar image set, as well as another set of 8 images that are experienced for the first time under the microscope. This allows us to ask how training history and visual experience influence sensory responses. 

There are 2 types of sessions during the imaging portion of the experiment - active behavior and passive viewing. During the passive viewing sessions, the task is run in open loop mode with the lick spout retracted, after the mouse has been given its daily allocation of water. This allows us to ask how representations differ when the mouse is actively engaged in the task and motivated to earn water rewards compared to when it is sated and not receiving reward feedback.

During imaging sessions, 5% of non-change image flashes are randomly omitted from the otherwise regular sequence of stimulus presentations. This allows us to ask whether expectation signals are present in the visual cortex. 
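
To get a feel for what a 5% omission rate looks like, here is a small simulation. It is a sketch under stated assumptions, not the actual stimulus code: each non-change flash is omitted independently with probability 0.05, and the number of flashes and the placement of change flashes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the example is repeatable

n_flashes = 4800

# Hypothetical change flashes: every 10th flash is an image change
is_change = np.zeros(n_flashes, dtype=bool)
is_change[::10] = True

# Assumption: each non-change flash is omitted independently with probability 0.05
OMISSION_PROBABILITY = 0.05
omitted = (rng.random(n_flashes) < OMISSION_PROBABILITY) & ~is_change

print('fraction of non-change flashes omitted:', omitted[~is_change].mean())
print('change flashes omitted:', omitted[is_change].sum())
```

The real experiment may apply additional constraints on which flashes can be omitted, so treat this only as a picture of the overall rate.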

The dataset consists of recordings from excitatory (Slc17a7-IRES2-Cre;CaMK2-tTA;Ai93(GCaMP6f)) and VIP inhibitory (VIP-IRES-Cre;Ai162(GCaMP6f)) neurons in V1. Excitatory cells were sampled at 2 depths: 175um (L2/3) and 375um (L5). VIP cells were sampled at 175um depth.

In this notebook, we will describe the core components of each experimental session and the tools for accessing and analyzing the data.

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

<p>Let's get started

</div>

In [None]:
# you will need these libraries for computation & data manipulation
import os
import numpy as np
import pandas as pd

# matplotlib is a standard python visualization package
import matplotlib.pyplot as plt
%matplotlib inline

# seaborn is another library for statistical data visualization
# seaborn style & context settings make plots pretty & legible
import seaborn as sns
sns.set_context('notebook', font_scale=1.5, rc={'lines.markeredgewidth': 2})
sns.set_style('white')
sns.set_palette('deep');

# Import allensdk modules for loading and interacting with the data
from allensdk.brain_observatory.behavior.swdb import behavior_project_cache as bpc


<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<p>The first thing we will do is use the `allensdk` to load a cache for the visual behavior dataset, which contains a manifest describing the dimensions of the dataset and methods for loading the data from particular sessions. You can inspect the manifest contained in the cache to identify experiments of interest and their metadata. 

<p>Make sure you have access to the `visual_behavior_cache.json` file, which tells the cache object where to find the data. </div>

In [None]:
# AWS paths

# Mac/Linux paths
# cache_json = {'manifest_path': '/allen/programs/braintv/workgroups/nc-ophys/visual_behavior/SWDB_2019/visual_behavior_data_manifest.csv',
#               'nwb_base_dir': '/allen/programs/braintv/workgroups/nc-ophys/visual_behavior/SWDB_2019/nwb_files',
#               'analysis_files_base_dir': '/allen/programs/braintv/workgroups/nc-ophys/visual_behavior/SWDB_2019/analysis_files',
#               'analysis_files_metadata_path': '/allen/programs/braintv/workgroups/nc-ophys/visual_behavior/SWDB_2019/analysis_files_metadata.json'
# }
# Windows paths
cache_json = {'manifest_path': r'\\allen\programs\braintv\workgroups\nc-ophys\visual_behavior\SWDB_2019\visual_behavior_data_manifest.csv',
              'nwb_base_dir': r'\\allen\programs\braintv\workgroups\nc-ophys\visual_behavior\SWDB_2019\nwb_files',
              'analysis_files_base_dir': r'\\allen\programs\braintv\workgroups\nc-ophys\visual_behavior\SWDB_2019\analysis_files',
              'analysis_files_metadata_path': r'\\allen\programs\braintv\workgroups\nc-ophys\visual_behavior\SWDB_2019\analysis_files_metadata.json'
             }

cache = bpc.BehaviorProjectCache(cache_json)

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 1.1:**  Get information about what's in the dataset 

<p>Read in 'visual_behavior_data_manifest.csv' using the cache object and explore the columns to see the available visual areas, cre lines, and session types. 

</div>

In [None]:
# get the manifest of all experiment sessions for this dataset
manifest = cache.manifest
manifest.head(10)

In [None]:
# what are the dimensions of this dataset? 
print('targeted structures:', manifest.targeted_structure.unique())
print('\ncre_lines:', manifest.full_genotype.unique())
print('\nstage_types:', manifest.stage_name.unique())

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 1.2:**  Everyone gets an experiment! 

<p>Get your experiment ID and assign it to a variable called `experiment_id`

<p>What is the `targeted_structure`, `imaging_depth`, `full_genotype`, and `stage_name` for your `experiment_id`? 

</div>

In [None]:
# get a random experiment
# np.random.randint excludes the upper bound, so the index is always valid
experiment_index = np.random.randint(low=0, high=len(manifest))
experiment_id = manifest.ophys_experiment_id.values[experiment_index]

In [None]:
# get the metadata for this experiment from the manifest


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 1.3:**  What is in an experiment container? 

<p>The experiment container describes a set of imaging sessions performed at the same location (targeted structure and imaging depth) in the same mouse that targets the same set of cells. All the sessions in an experiment container have a common `experiment_container_id`.

<p>Get the `experiment_container_id` for your `experiment_id` and find out what other sessions were recorded at that same location. 

<p>Do all experiment containers have the same number of sessions associated with them? Hint: use pandas groupby
</div>
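
If you have not used pandas `groupby` before, the pattern looks like this on a toy table. The IDs and stage names below are made up, and the toy table only borrows column names from the manifest. It is a sketch of the technique, not the answer for the real dataset.

```python
import pandas as pd

# Toy manifest; the IDs and stage names are made up for illustration
toy_manifest = pd.DataFrame({
    'ophys_experiment_id': [101, 102, 103, 201, 202],
    'experiment_container_id': [1, 1, 1, 2, 2],
    'stage_name': ['A_active', 'A_passive', 'B_active', 'A_active', 'B_active'],
})

# Look up the container for one experiment
experiment_id = 102
row = toy_manifest[toy_manifest.ophys_experiment_id == experiment_id]
container_id = row.experiment_container_id.values[0]

# All sessions recorded at the same location
same_container = toy_manifest[toy_manifest.experiment_container_id == container_id]
print(same_container)

# Number of sessions in each container
sessions_per_container = toy_manifest.groupby('experiment_container_id').size()
print(sessions_per_container)
```

`groupby(...).size()` returns one count per group, so containers with different numbers of sessions are easy to spot.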

In [None]:
# get the container ID for this experiment


In [None]:
# what other sessions are in this container?


In [None]:
# Get number of sessions in each container


<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<h2>The Behavior OPhys Session object</h2>
<p>The BehaviorOphysSession class in allensdk.brain_observatory.behavior.behavior_ophys_session provides an interface to all of the data for a single experimental session from the Visual Behavior pipeline, aligned to a common time clock.

<p>We package each session's data into a Neurodata Without Borders 2.0 (NWB) file. The BehaviorOphysSession will load data from the NWB file for a given session.
    
<p>You can load a BehaviorOphysSession object easily using the 'get_session' method of the cache object. 

<p>Use help to see what functions are contained in the session object. 


</div>

In [None]:
# get a session from the cache
session = cache.get_session(experiment_id)

In [None]:
help(session)

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 2.1:**  What is an experiment session? 

<p>Use tab completion to see what is in the dataset object for an experiment session

<p>What is in the `metadata` attribute? What is in the 'task_parameters' attribute?

</div>


In [None]:
# get session metadata


In [None]:
# get session task parameters


<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<h2>Optical physiology data - max projection, roi masks, and fluorescence traces</h2>

<p>Let's use the session object to access neuron fluorescence timeseries, roi masks, and metadata. An ROI mask is used to define the boundary of each cell in the fluorescence data. The timeseries extracted from each ROI is one cell's activity.

</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 2.1: max intensity projection and ROI masks** 
    
<p>Get the maximum intensity projection image using the `max_projection` attribute for your dataset and display it. 
    
<p>Get the 'segmentation_mask_image' and display it next to the max projection. 
</div>

In [None]:
# plot the max intensity projection and the segmentation mask


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 2.2: Get dF/F traces and ophys timestamps**

<p>Get the fluorescence traces and ophys timestamps. How are they formatted?

<p>`dff_traces` is a dataframe with 'cell_specimen_id' as the index and 'cell_roi_id' and 'dff' as columns. 
    
<p>'cell_roi_id' is the unique identifier for each cell within a session. 'cell_specimen_id' is the unified cell identifier after cells are matched across sessions within a container. Cells found in multiple sessions will have the same 'cell_specimen_id' in all the sessions in which they were found.  
    
<p>The 'dff' column contains the baseline normalized fluorescence traces, also called dF/F traces, for each cell in the session. 
    
<p>`timestamps_ophys` is an array of timestamps for each 2P imaging frame. 
    
<p>Check that the length of one of the dF/F traces is the same length as the ophys timestamps.

</div>
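
The table layout described above, with a whole trace stored in each row, can be unfamiliar. The toy example below mimics it with random numbers. The IDs, the number of frames, and the frame rate of 31 frames per second are made up for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_cells, n_frames = 3, 1000

# Toy timestamps: one per imaging frame, at a made-up rate of 31 frames per second
timestamps_ophys = np.arange(n_frames) / 31.0

# Toy table in the layout described above: one row per cell,
# with the whole dF/F trace stored as an array in the 'dff' column
dff_traces = pd.DataFrame(
    {
        'cell_roi_id': [11, 12, 13],
        'dff': [rng.normal(size=n_frames) for _ in range(n_cells)],
    },
    index=pd.Index([1001, 1002, 1003], name='cell_specimen_id'),
)

# Select one cell's trace by its cell_specimen_id
one_trace = dff_traces.loc[1002, 'dff']
print('trace shape:', one_trace.shape)
print('timestamps shape:', timestamps_ophys.shape)

# Stack the per-cell arrays into a cells x timepoints matrix
dff_matrix = np.vstack(dff_traces.dff.values)
print('matrix shape:', dff_matrix.shape)
```

Because each trace has one value per imaging frame, its length should match the length of the timestamps array.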

In [None]:
# get traces and timestamps


In [None]:
# get shape of traces and timestamps


In [None]:
# shape of one cell's trace


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 2.3: Plot the dF/F trace for a cell**

<p>Plot the dF/F trace for one cell by indexing into the `dff_traces` dataframe. Use `timestamps_ophys` to plot the x-axis in seconds. 
    
<p>Try plotting the trace for a few different cells.

</div>

In [None]:
# plot the dF/F trace for one cell using ophys timestamps for x-axis values


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 2.4: Plot a heatmap of all cell traces in this session**

<p>Extract the dff_traces from the dataframe into an array. What is the shape?

<p>Use the matplotlib plotting function pcolormesh to plot the matrix as a heatmap. 

</div>

In [None]:
# turn dff_traces into an array of cells x timepoints
dff_traces_array = np.vstack(dff_traces.dff.values)
print('shape of dff_traces_array:',dff_traces_array.shape)

In [None]:
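# plot a heatmap of all traces 
fig, ax = plt.subplots(figsize=(20,5))
cax = ax.pcolormesh(dff_traces_array, cmap='magma', vmin=0, vmax=np.percentile(dff_traces_array, 99))
# put a tick at every 10th cell on the y-axis
ax.set_yticks(np.arange(0, len(dff_traces_array), 10))
ax.set_ylabel('cells')
ax.set_xlabel('time (sec)')
# one tick every 600*31 frames, i.e. roughly every 600 seconds at ~31 frames per second
xticks = np.arange(0, len(timestamps_ophys), 600*31)
ax.set_xticks(xticks)
# label each tick with the time of that frame so ticks and labels always match in number
ax.set_xticklabels(np.round(timestamps_ophys[xticks]).astype(int))
cb = plt.colorbar(cax, pad=0.015, label='dF/F')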

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<h2>Behavior timeseries and events - running, licks, and rewards </h2>
<p>As the mouse performs the behavioral task, it is free to run on a disk. The task is a go/no-go style task with licking as the behavioral response. When a mouse correctly licks the water spout, a reward is delivered. 

<p>Running, licks and rewards are measured at the stimulus frame display rate and share timestamps with the stimulus. </div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 3.1: Get running speed trace and timestamps** 

<p>Get the `running_speed` attribute of the dataset object. What does it contain? 

<p>Running speed shares timestamps with the visual stimulus. Compare the values of running timestamps from `running_speed` with the values in the dataset attribute `stimulus_timestamps`. 
</div>

In [None]:
# get running speed


In [None]:
# what are the values of running speed timestamps?


In [None]:
# what are the values of stimulus timestamps?


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 3.2: Plot running speed**

<p>Plot the values for running speed with time in seconds on the x-axis. 
    
<p>Running speed is measured in cm/s. Label the axes appropriately.
        
</div>

In [None]:
# plot running speed with timestamps on x-axis


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 3.4: Rewards and licks**
    
<p>Get the 'rewards' attribute of the session object. How is it formatted? 

<p>Get the 'licks' attribute of the session object. How is it formatted? 

<p>What is the relationship between running, licking and rewards? 
</div>

In [None]:
# Get information about rewards


In [None]:
# Get information about licks


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 3.5: Plot licking, reward times, and running trace on the same figure**
    
<p>1) Plot `running_speed` as above, but set xlims to focus on a 30 second portion of the behavior session, from x=600 to x=630. 

<p>2) Plot `rewards` as points (not a line), at y = -10. Note that `rewards` is a dataframe, with timestamps as the index. Use the values of the index to get the times of all rewards to plot along the x-axis.

<p>Hint: You will need to create an array of len(session.rewards.index.values) filled with -10 to use as y-axis values to plot. np.repeat() is a convenient function for this.

<p>3) Plot the `licks` times using plt.vlines() with ymin=-10 and ymax=-5. 

<p>4) Bonus: Create a legend to label licks, rewards, and running. 

<p>What is the relationship between running, licking and rewards? 
</div>
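
The `np.repeat()` hint and the window selection can be sketched without any session data. The event times below are made up, and the array names are hypothetical stand-ins for the reward and lick times you extract from the session.

```python
import numpy as np

# Made-up event times in seconds, standing in for the reward and lick times
reward_times = np.array([595.2, 604.1, 611.6, 627.3, 640.8])
lick_times = np.array([603.9, 604.0, 604.2, 611.5, 611.7, 627.2, 633.0])

# Keep only the events inside the 30 second window of interest
t_start, t_stop = 600, 630
rewards_in_window = reward_times[(reward_times >= t_start) & (reward_times <= t_stop)]
licks_in_window = lick_times[(lick_times >= t_start) & (lick_times <= t_stop)]

# One y-value of -10 per reward, so rewards can be drawn as points below the trace
reward_y = np.repeat(-10, len(rewards_in_window))

print('reward x-values:', rewards_in_window)
print('reward y-values:', reward_y)
print('number of licks in window:', len(licks_in_window))
```

These arrays could then be passed to `plt.plot(rewards_in_window, reward_y, 'o')` and `plt.vlines(licks_in_window, ymin=-10, ymax=-5)`. Setting the x-limits on the full data works just as well as filtering first.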

In [None]:
# plot running speed, rewards, and licks


<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<h2>Visual stimuli </h2>
    
<p>The timing of visual stimuli can be accessed through the 'stimulus_presentations' table. This includes the timing of omitted stimuli - in other words, the time where the image would have been presented if it were not omitted.  
    
<p>The images shown during the session are included in the 'stimulus_templates'. </div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 4.1: Get the stimulus table**

<p>Get the `stimulus_presentations` attribute to identify the times of stimulus presentations. How many stimulus flashes were there? 

<p>What other data is included for each stimulus flash in this table? What could it be used for?
    
</div>

In [None]:
# get the stimulus presentations table


In [None]:
# how many stimulus presentations were there? 


In [None]:
# what are the keys of the stimulus presentations table?


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 4.2: Plot visual stimulus presentations with behavior events**

<p>1) Copy and paste your code from Task 3.5

<p>2) Plot stimulus presentations using the `start_time` and `stop_time` columns with plt.axvspan(). Set alpha=0.3 & facecolor='gray'.

<p>Hint: Loop through each row of the stimulus table using the pandas method 'iterrows' to plot all stimulus flashes
    
<p>3) Bonus: Plot stimulus presentations corresponding to image changes using the 'changes' column. Set facecolor='blue' to distinguish from non-change flashes. 

</div>

In [None]:
# plot running, rewards, licks and stimuli using axvspan to delineate periods where a stimulus was shown


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 4.3: Get visual stimulus templates**

<p>Get the `stimulus_templates` from the session object. What is the shape?

<p>The first dimension of `stimulus_templates` corresponds to the `image_index` in `stimulus_presentations`.
    
<p>Plot an image from 'stimulus_templates' using its 'image_index'. Show the name of the image in the title by finding the 'image_name' that corresponds to that 'image_index' in the 'stimulus_presentations' table.
    
</div>
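
The relationship between the first dimension of the image stack and the `image_index` column can be sketched on toy data. The array size and image names below are made up. The toy variables reuse the names `stimulus_templates` and `stimulus_presentations` only for readability.

```python
import numpy as np
import pandas as pd

# Toy image stack: 3 images of 4 x 5 pixels; the first axis is the image index
stimulus_templates = np.zeros((3, 4, 5))

# Toy presentations table; the image names are made up
stimulus_presentations = pd.DataFrame({
    'image_index': [0, 2, 2, 1, 0],
    'image_name': ['im_a', 'im_c', 'im_c', 'im_b', 'im_a'],
})

# Pick one image out of the stack by its index
image_index = 2
image = stimulus_templates[image_index]

# Find the name that goes with this index in the presentations table
matches = stimulus_presentations[stimulus_presentations.image_index == image_index]
image_name = matches.image_name.values[0]

print('templates shape:', stimulus_templates.shape)
print('image shape:', image.shape)
print('image name:', image_name)
```

With real data, `image` could be shown with `plt.imshow(image, cmap='gray')` and `image_name` passed to `plt.title()`.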

In [None]:
# get the stimulus templates and print the shape


In [None]:
# plot a stimulus using its image index

# show the image name for that image index using the stimulus presentations table and show it in the title


<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<h2>Behavior trials data</h2>
    
<p>The `trials` dataframe organizes behavior events (including licking and rewards), stimulus information (what stimulus was shown before and after the scheduled change time) and metadata (such as whether the trial was a 'go' trial or a 'catch' trial) for each behavioral trial. 

<p>This structure is convenient for data exploration and analysis.
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 5.1: Explore the trials table**

<p>1) Get the `trials` attribute of the `session` object. What are the columns of this dataframe? What are the rows?

<p>2) How many go trials were there? How many catch trials? What is the ratio of go to catch trials?

<p>3) What images were shown in this behavior session? Use the pandas 'unique' method to get the unique images from the trials table. 
</div>

In [None]:
# get the trials table 


In [None]:
# how many go trials were there? 


In [None]:
# how many catch trials were there?


In [None]:
# what images were shown? 


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 5.2: Get the hit and false alarm rates for this session**

<p>The hit rate is the fraction of go trials with a lick in the response window.
    
<p>The false alarm rate is the fraction of catch trials with a lick in the response window.

<p>1) Select all the 'go' trials by filtering the dataframe by `go`. Get the fraction of 'go' trials where 'hit' = True. 

<p>2) Repeat for 'catch' trials.

</div>
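
Here is the same calculation on a toy table, so you can check your logic before running it on the session. The `go`, `catch`, and `hit` column names follow the text above. The `false_alarm` column name is an assumption, and all values are made up.

```python
import pandas as pd

# Toy trials table; the 'false_alarm' column name is an assumption
trials = pd.DataFrame({
    'go':          [True, True, True, True, False, False],
    'catch':       [False, False, False, False, True, True],
    'hit':         [True, True, True, False, False, False],
    'false_alarm': [False, False, False, False, True, False],
})

# Hit rate: fraction of go trials with a hit
go_trials = trials[trials.go]
hit_rate = go_trials.hit.mean()

# False alarm rate: fraction of catch trials with a false alarm
catch_trials = trials[trials.catch]
false_alarm_rate = catch_trials.false_alarm.mean()

print('hit rate:', hit_rate)
print('false alarm rate:', false_alarm_rate)
```

Taking the mean of a boolean column gives the fraction of `True` values, which is why no explicit counting is needed.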

In [None]:
# compute the hit rate for go trials


In [None]:
# compute the false alarm rate for catch trials


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 5.3: Plot hit rate across images for go trials**

<p>1) Loop through the image names in `session.trials.change_image_name.unique()`

<p>2) Quantify the fraction of 'go' trials with a `hit` for each image

<p>3) Sort the hit rates using np.sort() and plot the sorted hit rate by image
    
<p>4) Get the sorted indices using np.argsort() and apply this ordering to the image names to plot on the x-axis
</div>
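
The sort-and-reorder steps above can be sketched on a toy set of go trials. The image names and outcomes are made up. The point is only to show how `np.argsort()` keeps the labels aligned with the sorted values.

```python
import numpy as np
import pandas as pd

# Toy go trials; image names and outcomes are made up
go_trials = pd.DataFrame({
    'change_image_name': ['im_a', 'im_a', 'im_b', 'im_b', 'im_c', 'im_c', 'im_c', 'im_c'],
    'hit':               [True,  True,  False, True,  False, False, False, True],
})

image_names = np.asarray(go_trials.change_image_name.unique())

# Fraction of go trials with a hit, computed separately for each image
hit_rates = np.array([
    go_trials[go_trials.change_image_name == name].hit.mean()
    for name in image_names
])

# argsort gives the ordering that sorts the hit rates in ascending order
order = np.argsort(hit_rates)
sorted_hit_rates = hit_rates[order]      # same values as np.sort(hit_rates)
sorted_image_names = image_names[order]  # labels reordered the same way

print(sorted_image_names)
print(sorted_hit_rates)
```

For plotting, `sorted_hit_rates` would go on the y-axis and `sorted_image_names` would be used as the x-tick labels.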

In [None]:
# get the hit rate for each image


In [None]:
# sort the hit rates in ascending order and sort the image labels in the same order


In [None]:
# plot hit rate by image with image names on the x-axis


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 5.4: Plot a lick raster**

<p>Provide the `trials` dataframe to the function below to plot a lick raster.

<p>Is the mouse performing the task consistently across the whole session?
</div>

In [None]:
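def make_lick_raster(trials):
    """Plot licks and rewards for each non-aborted trial, aligned to the change time."""
    # drop aborted trials and renumber the remaining rows from 0
    trials = trials[trials.aborted==False]
    trials = trials.reset_index()
    fig,ax = plt.subplots(figsize=(5,10))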
    for trial_index, trial_data in trials.iterrows(): 
        # get times relative to change time
        lick_times = [(t - trial_data.change_time) for t in trial_data.lick_times]
        reward_time = [(t - trial_data.change_time) for t in [trial_data.reward_time]]
        # plot reward times
        if len(reward_time) > 0:
            ax.plot(reward_time[0], trial_index + 0.5, '.', color='b', label='reward', markersize=6)
        # plot lick times
        ax.vlines(lick_times, trial_index, trial_index + 1, color='k', linewidth=1)
        # put a line at the change time
        ax.vlines(0, trial_index, trial_index + 1, color=[.5, .5, .5], linewidth=1)
    # gray bar for response window
    ax.axvspan(0.15, 0.75, facecolor='gray', alpha=.3, edgecolor='none')
    ax.grid(False)
    ax.set_ylim(0, len(trials))
    ax.set_xlim([-1, 4])
    ax.set_ylabel('trials')
    ax.set_xlabel('time (sec)')
    ax.set_title('lick raster')
    # put the first trial at the top of the plot
    ax.invert_yaxis()

In [None]:
# plot the lick raster for this session using the provided function
make_lick_raster(session.trials)

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
<h2>The Trial Response and Flash Response dataframes organize cell responses by behavior trials and stimulus flashes </h2>
    
<p> These two dataframes do the work of temporal alignment for you to create a convenient data structure for analysis. 
  
<p> The `trial_response_df` extracts cell responses for each behavioral trial in a [-4,8] second window around the change time.
    
<p> The `flash_response_df` extracts cell responses for each stimulus presentation in a [-0.5, 0.75] second window around each flash. 
    
<p> Both dataframes take the mean response for each cell in a 500ms window after the change time for trials, or after the stimulus onset time for stimulus presentations.
    
<p> These dataframes also include a p_value comparing the response for each cell on each trial to a shuffled distribution from the spontaneous activity epochs. 

</div>
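
To make the alignment step concrete, here is a minimal sketch of cutting a window out of a trace around an event and averaging the 500ms after it. This is not the SDK's implementation, which may differ in its details. The helper `get_window`, the frame rate, and the synthetic trace are all assumptions made for illustration.

```python
import numpy as np

FRAME_RATE = 31.0  # assumed imaging rate in frames per second
timestamps = np.arange(0, 60, 1 / FRAME_RATE)

# Synthetic dF/F trace: zero everywhere except a response after the event at t = 30 s
event_time = 30.0
dff = np.zeros_like(timestamps)
dff[(timestamps >= event_time) & (timestamps < event_time + 0.5)] = 1.0

def get_window(trace, times, event, window):
    """Return the trace segment and event-relative times inside `window` (seconds)."""
    relative = times - event
    mask = (relative >= window[0]) & (relative < window[1])
    return trace[mask], relative[mask]

# Segment around the event, as in a [-4, 8] second trial window
segment, segment_times = get_window(dff, timestamps, event_time, (-4, 8))

# Mean response in the 500 ms after the event
response, _ = get_window(dff, timestamps, event_time, (0, 0.5))
mean_response = response.mean()

# Baseline for comparison: the 500 ms before the event
baseline, _ = get_window(dff, timestamps, event_time, (-0.5, 0))

print('segment length in frames:', len(segment))
print('mean response:', mean_response)
print('mean baseline:', baseline.mean())
```

The same helper applied with a (-0.5, 0.75) window around each stimulus onset would give the kind of segment described for the flash responses.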

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 6.1:** Load and explore the Trial Response Dataframe. 

<p>1) Get the `trial_response_df` attribute of the session object. What are the columns? What are the rows? How does it differ from the `trials` table? 
    
<p>The `dff_trace` column contains a portion of each cell's dF/F trace from 4 seconds before the `change_time` to 8 seconds after the 'change_time' for each trial. There are also `dff_trace_timestamps` for the same window. 

<p> For each trial, the `mean_response` of each cell is computed for a 500ms window after the `change_time`.

<p>2) Assign `trial_response_df` to a variable named `tr` for convenient use in later exercises.
    
</div>

In [None]:
# get the trial response dataframe and assign it to 'tr'


In [None]:
# what is in the trial response dataframe?


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 6.2:** Plot the population average trace for go trials. 

<p>Select go trials from the 'trial_response_df', take the mean across all cells, all trials, and plot it
    
</div>
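
Averaging traces that are stored one array per row takes a small extra step: stack them into a 2D array first. The toy table below has the shape this task assumes, with one row per cell per trial. The IDs and the random traces are made up.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n_timepoints = 50

# Toy trial response table: one row per cell per trial,
# with the trace for that cell and trial stored as an array
rows = []
for trial_id, go in [(0, True), (1, False), (2, True)]:
    for cell_specimen_id in [1001, 1002]:
        rows.append({
            'trial_id': trial_id,
            'cell_specimen_id': cell_specimen_id,
            'go': go,
            'dff_trace': rng.normal(size=n_timepoints),
        })
tr = pd.DataFrame(rows)

# Select go trials, stack their traces into a 2D array, and average over rows
go_traces = np.vstack(tr[tr.go].dff_trace.values)
mean_trace = go_traces.mean(axis=0)

print('stacked shape (rows x timepoints):', go_traces.shape)
print('mean trace shape:', mean_trace.shape)
```

Averaging over `axis=0` collapses cells and trials together and leaves one value per timepoint, which is what you plot.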

In [None]:
# plot the mean trace across all cells for go trials


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 6.3: Plot the population response across cells for one change trial**

<p>1) Select one 'trial_id' where 'go' = True and filter the trial_response_df to get the data for just that trial. How many rows are in this subset of data? Is it the same length as the number of unique cells?

<p>2) Get the 'mean_response' for all cells,  sort in order of response magnitude and plot it. 
    
</div>
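
A toy version of this selection is shown below. The table is made up and keeps only the columns needed for the exercise. The real dataframe has many more.

```python
import numpy as np
import pandas as pd

# Toy trial response table: 3 cells x 2 trials, values made up
tr = pd.DataFrame({
    'trial_id': [0, 0, 0, 1, 1, 1],
    'cell_specimen_id': [1001, 1002, 1003, 1001, 1002, 1003],
    'go': [True, True, True, False, False, False],
    'mean_response': [0.30, 0.05, 0.12, 0.01, 0.02, 0.00],
})

# Pick the first go trial and keep only its rows (one row per cell)
go_trial_ids = tr[tr.go].trial_id.unique()
one_trial = tr[tr.trial_id == go_trial_ids[0]]

# There should be one row per unique cell
n_cells = tr.cell_specimen_id.nunique()
print('rows for this trial:', len(one_trial), '| unique cells:', n_cells)

# Sort the responses from smallest to largest for plotting
sorted_responses = np.sort(one_trial.mean_response.values)
print(sorted_responses)
```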

In [None]:
# get data for all cells for a single go trial


In [None]:
# plot the mean response across cells for this trial, with cells on x-axis and mean dF/F on y-axis


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 6.4: Explore the flash response dataframe**

<p>What is in the `flash_response_df` attribute of the session object? What are the columns? What are the rows?  How is it different from the `stimulus_presentations` table?

<p>The `flash_response_df` contains the cell responses for individual stimulus presentations, aka flashes. It contains the `mean_response` of every cell in a 500ms window after every stimulus onset, for all stimulus presentations during the behavior session.  

</div>

In [None]:
# get the flash response dataframe and assign it to 'fr'


In [None]:
# what's in the flash response dataframe?


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 6.5: Plot the mean response to each image across all flashes**

<p>1) Select one image and plot the mean response across all cells using the 'dff_trace' column. Put the 'image_name' in the title. 
    
<p>2) Plot the mean response to all images on the same figure. 

</div>
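
One way to get a mean trace per image is to group the rows by `image_name` and average the stacked traces within each group. The sketch below uses a toy table with constant traces so the expected means are easy to verify. The names and values are made up.

```python
import numpy as np
import pandas as pd

# Toy flash response table: each row is one cell's trace for one flash.
# Traces are constant so the expected means are easy to verify by eye.
fr = pd.DataFrame({
    'image_name': ['im_a', 'im_a', 'im_b', 'im_b'],
    'cell_specimen_id': [1001, 1002, 1001, 1002],
    'dff_trace': [np.full(5, 1.0), np.full(5, 3.0), np.full(5, 0.0), np.full(5, 0.5)],
})

# Average the traces within each image
mean_trace_by_image = {
    image_name: np.vstack(group.dff_trace.values).mean(axis=0)
    for image_name, group in fr.groupby('image_name')
}

for image_name, mean_trace in mean_trace_by_image.items():
    print(image_name, mean_trace)
```

To look at a single cell instead of the population, filter the table by `cell_specimen_id` before grouping.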

In [None]:
# plot the mean trace across all cells for one image


In [None]:
# plot the mean trace across cells separately for each image


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<p>**Task 6.6: Plot one cell's response to each image across all flashes**
    
<p>Create the same plot for a single cell rather than the whole population. 
    
<p>What does it look like for different cells?
    
</div>

In [None]:
# plot the mean trace across images for one cell
