![Image](resources/banner.jpg)

<h1 align="center">Allen Brain Observatory Visual Coding Two-Photon </h1> 
<h2 align="center"> Day 1, Morning Session. SWDB 2024 </h2> 

<h3 align="center">Monday, August 19, 2024</h3> 

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
The Allen Brain Observatory Visual Coding Two-Photon dataset is a large-scale survey of physiological activity in mouse visual cortex in response to a variety of visual stimuli under passive viewing conditions.  The animals are head-fixed but free to run on a disc.  Single-plane two-photon calcium imaging is performed in different areas and layers with transgenically targeted cell lines.  This notebook is a brief introduction to get you started with this dataset and to point you to resources for further exploration.

</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
***What kind of questions can you answer with this dataset?***

This dataset contains recordings of activity in response to a variety of natural and artificial visual stimuli.  This makes it suitable for a variety of coding questions.

- How are stimuli and features from the external world encoded in neural responses?  
- How do the encoding properties differ across areas and layers?  In different cell lines?
- Can you build predictive models of response from stimuli?
- How are running activity and pupil size related to cortical activity?
- How can information about the stimuli and/or the animal's state be extracted from neural activity?  Can you decode stimuli?
- Do neurons coordinate their activity?  Do they act in ensembles?  
- Is there any spatial aspect to neural information?

These are just some of the questions that might be addressed from this type of data.  

***Why two-photon calcium imaging?***

- You get a relatively large number of cells across an area.
- 2D spatial arrangement within a layer.
- Chronic recording across days, and thus more measurements or stimuli.

***Why NOT two-photon calcium imaging?***

- Indirect measure of activity.  One must decide how to extract "activity" from the calcium signal, and what that means.
- Time scale of calcium is slow; you get relatively poor temporal resolution.
- For the indicator and resolution at which these recordings were made, single and low spike count activity is often not observed.

</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
**Databook**

The databook is a resource for more in-depth information and examples for the Allen Brain Observatory Visual Coding Two-photon dataset.  You can find the pages for this data set here:  https://allenswdb.github.io/physiology/ophys/visual-coding/vc2p-background.html

![Image](resources/databook_vc2p.png)

</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
***Remember the tools you have!***

- Use the databook as a reference; this notebook contains only a small portion of what is in the databook!
- Use the `help` function to look up function arguments.
- Use `dir` to see the data and functions in an object.
- Use tab completion in Jupyter.

</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
Using the Python objects we'll show you below, you can extract information about this dataset, such as how many recordings were made from a given area or Cre line.

For each targeted area, layer, and Cre line, each mouse is recorded for three sessions (see more on this below).  There is a datafile for each session that includes (not exhaustive):

- Various fluorescence traces from different stages of the processing pipeline.
- Running activity of the mouse
- Pupil size and eye tracking (for some sessions)
- Stimulus presentation timing and templates
- Max projection images and ROI masks for each cell
- Extracted event traces from a deconvolution algorithm (in a separate file)

</div>

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import platform, os

%matplotlib inline

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
The following cell sets up a path variable so that this notebook will work on the cloud or using data accessed locally, e.g. from your hard drive.

</div>

In [None]:
# Set file location based on platform. 
platstring = platform.platform()
if ('Darwin' in platstring) or ('macOS' in platstring):
    # macOS 
    data_root = "/Volumes/Brain2024/"
elif 'Windows'  in platstring:
    # Windows (replace with the drive letter of USB drive)
    data_root = "E:/"
elif ('amzn' in platstring):
    # then on Code Ocean
    data_root = "/data/"
else:
    # then your own linux platform
    # EDIT location where you mounted hard drive
    data_root = "/media/$USERNAME/Brain2024/"

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
This dataset is accessed via the `allensdk` python package.  It requires instantiating a `BrainObservatoryCache` object that we usually call `boc`.  You'll access all of the data for this dataset using this object.

</div>

In [None]:
from allensdk.core.brain_observatory_cache import BrainObservatoryCache
manifest_file = os.path.join(data_root,'allen-brain-observatory/visual-coding-2p/manifest.json')
boc = BrainObservatoryCache(manifest_file=manifest_file)
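
Before building the cache, it can help to confirm that the manifest path actually exists, since a wrong `data_root` is easy to miss.  The following is a minimal sketch using only the standard library.  The helper name `find_manifest` is hypothetical (it is not part of `allensdk`), and the temporary directory merely stands in for a mounted drive.

```python
import os
import tempfile

# Relative location of the manifest, as used in the cell above
MANIFEST_RELPATH = "allen-brain-observatory/visual-coding-2p/manifest.json"

def find_manifest(data_root, relpath=MANIFEST_RELPATH):
    """Return (manifest_path, exists) for a data root (hypothetical helper)."""
    manifest_path = os.path.join(data_root, relpath)
    return manifest_path, os.path.isfile(manifest_path)

# Demo: a temporary directory stands in for the mounted data drive
with tempfile.TemporaryDirectory() as fake_root:
    path, found_before = find_manifest(fake_root)
    os.makedirs(os.path.dirname(path))
    with open(path, "w") as f:
        f.write("{}")
    _, found_after = find_manifest(fake_root)

print(found_before, found_after)
```

With the real data, `find_manifest(data_root)` reporting that the file is missing would suggest that the drive is not mounted where the notebook expects it.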

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
***Plotting responses and stimulus epochs***

To give you an overview of how to access and use this dataset, we are going to demonstrate accessing data for a session and plotting traces overlaid with stimulus epochs.

</div>

![Image](resources/vc2p.png)

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
You can get general information, such as the available areas or stimuli using queries that often start with `get_`.  Use introspection or see the databook for other possibilities.

</div>
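
One way to do this kind of introspection is to filter the output of `dir` for names that start with `get_`.  This is a minimal sketch: `list_getters` is a hypothetical helper, and `FakeCache` is a made-up stand-in so the example runs without the dataset.

```python
def list_getters(obj, prefix="get_"):
    """Return the names of callable attributes of obj that start with prefix."""
    return sorted(
        name for name in dir(obj)
        if name.startswith(prefix) and callable(getattr(obj, name))
    )

class FakeCache:
    """Made-up stand-in for a cache object, for illustration only."""
    manifest = "manifest.json"

    def get_all_stimuli(self):
        return []

    def get_all_targeted_structures(self):
        return []

    def other_method(self):
        return None

getters = list_getters(FakeCache())
print(getters)
```

Once `boc` has been created, `list_getters(boc)` should list its query methods in the same way.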

In [None]:
boc.get_all_targeted_structures()

In [None]:
boc.get_all_stimuli()

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
The experiments are arranged in *containers*, which are a set of recording sessions that include a complete set of stimuli.  In this dataset, there are three sessions per container.  

![Image](resources/VC2p-sessions.png)

</div>

In [None]:
experiment_containers = boc.get_experiment_containers()

In [None]:
pd.DataFrame(experiment_containers)
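
Because the container records are plain dictionaries, they can be tallied without any extra tooling.  The sketch below assumes each record has `targeted_structure` and `cre_line` keys, as in the table above.  The three sample records are made up for illustration and `count_containers` is a hypothetical helper.

```python
from collections import Counter

def count_containers(containers, keys=("targeted_structure", "cre_line")):
    """Count container records grouped by the given keys."""
    return Counter(tuple(c[k] for k in keys) for c in containers)

# Made-up records standing in for the list returned by boc.get_experiment_containers()
example_containers = [
    {"id": 1, "targeted_structure": "VISp", "cre_line": "Cux2-CreERT2"},
    {"id": 2, "targeted_structure": "VISp", "cre_line": "Cux2-CreERT2"},
    {"id": 3, "targeted_structure": "VISl", "cre_line": "Rorb-IRES2-Cre"},
]

counts = count_containers(example_containers)
for (area, cre_line), n in sorted(counts.items()):
    print(area, cre_line, n)
```

Passing `keys=("targeted_structure",)` would count containers per area only.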

In [None]:
sessions = boc.get_ophys_experiments(experiment_container_ids=[511510911])

In [None]:
pd.DataFrame(sessions)

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
Note:  `id` in the experiment containers table is the *container id*.  `id` in the sessions table is the *session id*.

</div>
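
To go from a container to the session id for a particular session type, the session records can be turned into a small lookup.  This is a sketch under the assumption that each record has `id` and `session_type` keys.  The ids below are made up and `sessions_by_type` is a hypothetical helper.

```python
def sessions_by_type(sessions):
    """Map session_type -> session id for the session records of one container."""
    return {s["session_type"]: s["id"] for s in sessions}

# Made-up records standing in for the list returned by boc.get_ophys_experiments(...)
example_sessions = [
    {"id": 101, "experiment_container_id": 1, "session_type": "three_session_A"},
    {"id": 102, "experiment_container_id": 1, "session_type": "three_session_B"},
    {"id": 103, "experiment_container_id": 1, "session_type": "three_session_C"},
]

lookup = sessions_by_type(example_sessions)
print(lookup["three_session_B"])
```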

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
![Image](resources/stim_container.png)

</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
The `get_ophys_experiment_data` method will instantiate an object that contains the actual data for a single session.  If you do not have the data properly mounted (either on Code Ocean or via your hard drive) you will get a warning that the data is being downloaded here.

</div>

In [None]:
session_id = 508356957
session_data = boc.get_ophys_experiment_data(session_id)

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
The processed DF/F traces for that session can be returned via the following function.  This returns a tuple containing time stamps and a numpy array of shape (neurons, acquisition frames).  

</div>

In [None]:
t, dff = session_data.get_dff_traces()
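
Since the columns of the trace array correspond to acquisition frames, the time stamps also give the imaging frame rate.  This is a minimal sketch on synthetic time stamps.  The 30 Hz spacing is chosen only for illustration, and `estimate_frame_rate` is a hypothetical helper.

```python
from statistics import median

def estimate_frame_rate(timestamps):
    """Estimate frames per second from the median interval between timestamps."""
    intervals = [later - earlier
                 for earlier, later in zip(timestamps[:-1], timestamps[1:])]
    return 1.0 / median(intervals)

# Synthetic timestamps sampled every 1/30 s, for illustration only
example_t = [i / 30.0 for i in range(300)]
rate = estimate_frame_rate(example_t)
print(round(rate, 2))
```

Applied to the real time stamps this would be `estimate_frame_rate(t)`.  The median is used so that an occasional irregular interval does not skew the estimate.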

In [None]:
n = 10

fig, ax = plt.subplots(figsize=(15,5))
ax.plot(t, dff[n])
ax.set_xlabel('time (s)')
ax.set_ylabel('DF/F (arbitrary units)')
ax.set_title('DF/F trace for cell index {}'.format(n))

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

***Task 1***

Choose different cell indices and remake the plot above.  Can you find cells with interesting responses?

</div>

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
In our analyses, we often use "extracted events", which are fluorescence traces deconvolved using an algorithm from Daniela Witten and Sean Jewell.  These are not in the `session_data` object but are accessed via a function from `boc`.

</div>

In [None]:
events = boc.get_ophys_experiment_events(ophys_experiment_id=session_id)
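
A quick way to get a feel for event traces is to count how many frames carry an event for each cell.  The sketch below assumes that an event trace is zero wherever no event was detected and positive otherwise.  The two short traces are made up and `count_event_frames` is a hypothetical helper.

```python
def count_event_frames(trace):
    """Count the samples in one trace that have a positive event magnitude."""
    return sum(1 for value in trace if value > 0)

# Made-up event traces for two cells (rows) over five frames (columns)
example_events = [
    [0.0, 0.0, 0.3, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

event_counts = [count_event_frames(trace) for trace in example_events]
print(event_counts)
```

The same loop over the rows of `events` would give one count per cell in the session.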

In [None]:
n = 10

fig, ax = plt.subplots(figsize=(15,5))
ax.plot(t, events[n])
ax.set_xlabel('time (s)')
ax.set_ylabel('Extracted events (arbitrary units)')
ax.set_title('Extracted events for cell index {}'.format(n))

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
Information about when each stimulus type is shown is contained in the `stimulus_epoch_table`.  `start` and `end` denote the *acquisition frame* on which that stimulus epoch began or ended.

</div>

In [None]:
stim_epoch = session_data.get_stimulus_epoch_table()

In [None]:
stim_epoch_table = pd.DataFrame(stim_epoch)
stim_epoch_table
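
Because `start` and `end` are frame indices rather than times, they need to be looked up in the time stamps before they can be used on a time axis.  This is a minimal sketch with made-up epochs and time stamps.  `epoch_intervals` is a hypothetical helper, and the clamping step is a defensive assumption rather than something the dataset is known to require.

```python
def epoch_intervals(epochs, timestamps):
    """Convert (stimulus, start_frame, end_frame) rows to (stimulus, t_start, t_end)."""
    last = len(timestamps) - 1
    intervals = []
    for stimulus, start, end in epochs:
        # Clamp defensively in case a frame index runs past the last timestamp
        start, end = min(int(start), last), min(int(end), last)
        intervals.append((stimulus, timestamps[start], timestamps[end]))
    return intervals

# Made-up timestamps (0.0, 0.5, ..., 4.5 s) and epochs, for illustration only
example_t = [i * 0.5 for i in range(10)]
example_epochs = [
    ("drifting_gratings", 0, 3),
    ("spontaneous", 4, 6),
    ("natural_movie_one", 7, 12),  # end deliberately beyond the last frame
]

intervals = epoch_intervals(example_epochs, example_t)
for stimulus, t_start, t_end in intervals:
    print(stimulus, t_start, t_end)
```

With the table above, the rows could be passed as `stim_epoch_table[['stimulus', 'start', 'end']].itertuples(index=False)`, assuming the columns carry those names.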

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
We will use the function `ax.axvspan` to shade the temporal window during which a single stimulus epoch occurred.  First let's grab the `start` and `end` frames for the epoch during which `natural_movie_one` was shown.

</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

***Task 2***

Remake the plot of extracted events vs. time.  Use the function `axvspan` and the `stimulus_epoch_table` to shade the region during which `natural_movie_one` was shown.

</div>

In [None]:
start = stim_epoch_table[stim_epoch_table.stimulus=='natural_movie_one'].start.iloc[0]
end = stim_epoch_table[stim_epoch_table.stimulus=='natural_movie_one'].end.iloc[0]
start, end

In [None]:
n = 10

fig, ax = plt.subplots(figsize=(15,5))
ax.plot(t, events[n])
ax.axvspan(xmin=t[start], xmax=t[end], color='r', alpha=0.1)
ax.set_xlabel('time (s)')
ax.set_ylabel('Extracted events (arbitrary units)')
ax.set_title('Extracted events for cell index {}'.format(n))

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

***Task 3***

A.  Remake the previous plot.  Add all of the stimulus epochs, with a unique color for each stimulus type.  (Hint:  define a list of colors beforehand.)

B.  Add all of the other traces in the experiment, using a vertical offset so they don't overlap.

</div>

In [None]:
colors = ['blue','orange','green','red']
n = 10

fig, ax = plt.subplots(figsize=(15,5))
ax.plot(t, events[n])

for c,stim_name in enumerate(stim_epoch.stimulus.unique()):
    stim = stim_epoch[stim_epoch.stimulus==stim_name]
    for j in range(len(stim)):
        ax.axvspan(xmin=t[stim.start.iloc[j]], xmax=t[stim.end.iloc[j]], color=colors[c], alpha=0.1)


ax.set_xlabel('time (s)')
ax.set_ylabel('Extracted events (arbitrary units)')
ax.set_title('Extracted events for cell index {}'.format(n))


In [None]:
fig, ax = plt.subplots(figsize=(14,8))

# Plot the DF/F traces of the first 50 cells (or all of them, if fewer),
# offsetting each trace vertically so they don't overlap
for i in range(min(50, dff.shape[0])):
    ax.plot(t, dff[i,:]+(i*2), color='gray')
    
# Shade the plot wherever each stimulus is presented
colors = ['blue','orange','green','red']
for c,stim_name in enumerate(stim_epoch.stimulus.unique()):
    stim = stim_epoch[stim_epoch.stimulus==stim_name]
    for j in range(len(stim)):
        ax.axvspan(xmin=t[stim.start.iloc[j]], xmax=t[stim.end.iloc[j]], color=colors[c], alpha=0.1)
        
ax.set_xlabel("time (s)")
ax.set_ylabel("DF/F + vertical offset (arbitrary units)")
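
Indexing a fixed list of colors by position works as long as there are no more stimulus types than colors.  A slightly more defensive variant, sketched here under that assumption, maps each stimulus name to a color and cycles through the palette if it runs out.  `assign_colors` is a hypothetical helper and the stimulus names are only examples.

```python
from itertools import cycle

def assign_colors(stimulus_names, palette=("blue", "orange", "green", "red")):
    """Map each distinct stimulus name to a color, reusing colors if needed."""
    colors = {}
    palette_cycle = cycle(palette)
    for name in stimulus_names:
        if name not in colors:
            colors[name] = next(palette_cycle)
    return colors

# Example stimulus names with one repeat and more distinct names than colors
example_names = [
    "locally_sparse_noise_4deg",
    "spontaneous",
    "natural_movie_one",
    "spontaneous",
    "locally_sparse_noise_8deg",
    "natural_movie_two",
]

color_map = assign_colors(example_names)
for name, color in color_map.items():
    print(name, color)
```

In the shading loop, `color=color_map[stim_name]` could then replace `color=colors[c]`.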

<div style="border-left: 3px solid #000; padding: 1px; padding-left: 10px; background: #F0FAFF; ">
   
***Explore further***

- Above we retrieved the targeted structures and the available stimuli using methods like `boc.get_all_targeted_structures`. Using similar methods, what areas and depths are available?  What other dimensions of the data can be acquired this way?

- How can you retrieve a list of only those experiment containers from a particular area or Cre line?

- How would you retrieve all sessions that included a particular stimulus, say natural scenes?

- What other traces are available in the session data?  What do they represent?

- How would you retrieve the running speed of the animal?  Add a trace of the running speed to the plot above.

- What is image #48 in the Natural Scenes stimulus?

:::{admonition} Hint
:class: dropdown
Remember to check the [Databook](https://allenswdb.github.io/physiology/ophys/visual-coding/vc2p-background.html)!
:::

</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

***Homework 1***

Make a plot similar to the above with shading for the times when an individual frame of natural scenes was presented.  

:::{admonition} Hint
:class: dropdown
The stimulus epoch table shows you when classes of stimuli are on the screen.  There is a similar data structure called the stimulus table that is specific to each data set.  
:::

</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

***Homework 2***

Make a plot of the max projection image for a single session.  Compute the average activity across the session for each cell and shade the ROIs in the max projection image according to that activity.

</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

***Homework 3***

Make a DataFrame of the number of sessions per Cre line and area.

:::{admonition} Hint
:class: dropdown
Remember to check the [Databook](https://allenswdb.github.io/physiology/ophys/visual-coding/vc2p-background.html)!
:::

</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

***Homework 4***

List these Cre lines in order of how many neurons they have per session (on average):
    Rbp4-Cre_KL100
    Cux2-CreERT2
    Rorb-IRES2-Cre
    Vip-IRES-Cre
    
</div>

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

***Homework 5***

As described above, the same set of neurons is targeted for each of the three sessions in a container, but not all cells appear in every session (see the graphic below).  Identify the cells that are common across all three sessions in a container and remake the plot above with just those cells.  


:::{admonition} Hint
:class: dropdown
The indices for the dff traces are specific to each session.  To connect cells across sessions you will need to know what a `cell_specimen_id` is.  Remember to check the [Databook](https://allenswdb.github.io/physiology/ophys/visual-coding/vc2p-background.html)!
:::

</div>


![Image](resources/cell_specimens.png)