<img src="../../resources/cropped-SummerWorkshop_Header.png">  

<h1 align="center">Exercise 2.4: Much ado about nothing</h1>
<h2 align="center">Differentiating periods of no stimulus based on behavioral context</h2>
<h2 align="center">Summer Workshop on the Dynamic Brain</h2> 

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">

Astute observers will have noticed that, during the visual behavior task, there are some "omission" trials.

Where most trials include an image, these trials simply omit the expected image in favor of a blank screen. If you recall the results from the workshop, it was very easy to decode these omission stimuli from the responses in visual cortex. This makes a lot of sense: omitting a stimulus provides a fundamentally different visual input to the system, so it's no surprise that the visual system responds very differently.

Omission trials were not the only time the animal saw a blank screen during these recordings. Each recording includes an unstimulated period of "spontaneous activity," which is just another way of saying that the animal sat in the dark on the rig for a while during the recording.

Importantly, this means that there were two periods in the recording where the animal saw nothing on its screen: in the first, the animal was expecting to see a stimulus that was withheld; in the other, the animal had no such prior. This dichotomy gives us the opportunity to answer an exciting question about the visual system:

<b> Does this behavioral context/expectation matter for how null-stimuli are represented by the visual system?</b>




<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
In this problem set, we will frame this question as a <b>decoding</b> problem. If there is a difference between these two behavioral epochs, then we should be able to reliably decode which epoch a given snippet of neural activity came from. Just in case you want to go wild with looking at precise timing in your decoding, we will be using the spiking data from the 'visual behavior' dataset.

Here, we will walk through:


<p> 1) Getting data from the omission trials and formatting it into a decoder-friendly design matrix
    
<p> 2) Wrangling data from the "spontaneous activity" epoch into a decoder-friendly format. (Hint: we are going to grab snippets that look like trials, even though there is no trial structure here!)
    
<p> 3) Building a decoder for the two epochs and evaluating its performance using cross-validation
        
</div>

In [1]:
# Import some basic packages.
import os
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Import the AllenSDK 
from allensdk.brain_observatory.behavior.behavior_project_cache.\
    behavior_neuropixels_project_cache \
    import VisualBehaviorNeuropixelsProjectCache

%matplotlib inline

In [2]:
# Find some data!
import platform
platstring = platform.platform()

if 'Darwin' in platstring:
    # macOS 
    data_root = "/Volumes/Brain2024/"
elif 'Windows' in platstring:
    # Windows (replace with the drive letter of your USB drive)
    data_root = "E:/"
elif 'amzn' in platstring:
    # then on CodeOcean
    data_root = "/data/"
else:
    # then your own linux platform
    # EDIT location where you mounted hard drive
    data_root = "/media/$USERNAME/Brain2024/"

First, we need to access the data. This bit should look very similar to this afternoon's workshop.

In [3]:
cache = VisualBehaviorNeuropixelsProjectCache.from_local_cache(cache_dir=data_root, use_static_cache=True)

Grab data from a session

In [4]:
session = cache.get_ecephys_session(
           ecephys_session_id=1065437523) # Feeling brave? Try a different number...



Get the stimulus presentations.

In [5]:
stimulus_presentations = session.stimulus_presentations
stimulus_presentations.head(-5)

stimulus_presentations_id,stimulus_block,image_name,duration,start_time,end_time,start_frame,end_frame,is_change,is_image_novel,omitted,...,rewarded,is_sham_change,temporal_frequency,orientation,position_y,stimulus_index,active,spatial_frequency,position_x,contrast
0,0,im036_r,0.250188,28.131464,28.381652,60,75,False,False,False,...,False,False,,,,-99,True,,,
1,0,im036_r,0.250188,28.882028,29.132216,105,120,False,False,False,...,False,False,,,,-99,True,,,
2,0,im036_r,0.250232,29.632680,29.882912,150,165,False,False,False,...,False,False,,,,-99,True,,,
3,0,im036_r,0.250186,30.383329,30.633515,195,210,False,False,False,...,False,False,,,,-99,True,,,
4,0,im036_r,0.250229,31.133886,31.384115,240,255,False,False,False,...,False,False,,,,-99,True,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13381,5,im047_r,0.250210,8776.735046,8776.985256,522677,522692,False,False,False,...,False,False,,,,-99,False,,,
13382,5,im047_r,0.250207,8777.485673,8777.735881,522722,522737,False,False,False,...,False,False,,,,-99,False,,,
13383,5,im047_r,0.250208,8778.236296,8778.486503,522767,522782,False,False,False,...,False,False,,,,-99,False,,,
13384,5,im047_r,0.250208,8778.986918,8779.237126,522812,522827,False,False,False,...,False,False,,,,-99,False,,,


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">


<b>
1) Getting data from the omission trials and formatting it into a decoder-friendly design matrix
</b>
    

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b> (1a) Get the start and end times of omission stimuli </b>

Using the stimulus table, get the start and end times of the omission trials.

Hint: omission trials have "omitted" as their image name (the `image_name` column).
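
A minimal sketch of this filtering step, run on a small made-up table rather than the real `stimulus_presentations`. It assumes, following the hint, that omissions are the rows whose `image_name` is "omitted"; the toy values are invented, and it is worth checking that `end_time` is populated for these rows in your own session.

```python
import pandas as pd

# Toy stand-in for session.stimulus_presentations (values are made up)
toy_presentations = pd.DataFrame({
    "image_name": ["im036_r", "omitted", "im047_r", "omitted"],
    "start_time": [28.13, 28.88, 29.63, 30.38],
    "end_time": [28.38, 29.13, 29.88, 30.63],
})

# Keep only the rows where the image was omitted
omissions = toy_presentations[toy_presentations["image_name"] == "omitted"]

# Plain NumPy arrays are easiest to index by position later on
omission_starts = omissions["start_time"].to_numpy()
omission_ends = omissions["end_time"].to_numpy()
print(omission_starts, omission_ends)
```

Converting to NumPy arrays avoids surprises from the table's non-consecutive row labels when you loop over trials by position.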

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b> (1b) Filter spike trains for an area</b>

VISp (V1) is always a favorite place to start.
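
One way to sketch the area filter, assuming you have a units table with a `structure_acronym` column and a dictionary mapping unit IDs to spike-time arrays. In the AllenSDK these would typically come from merging `session.get_units()` with `session.get_channels()` on `peak_channel_id`, and from `session.spike_times`; the toy table, IDs, and spike times below are invented for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-ins (IDs, areas and spike times are made up)
toy_units = pd.DataFrame(
    {"structure_acronym": ["VISp", "LGd", "VISp", "CA1"]},
    index=pd.Index([101, 102, 103, 104], name="unit_id"),
)
toy_spike_times = {
    101: np.array([0.1, 0.5, 0.9]),
    102: np.array([0.2, 0.4]),
    103: np.array([0.3]),
    104: np.array([0.6, 0.7]),
}

# Select the unit IDs recorded in the area of interest
area = "VISp"
area_unit_ids = toy_units.index[toy_units["structure_acronym"] == area]

# Keep only the spike trains belonging to those units
area_spikes = {uid: toy_spike_times[uid] for uid in area_unit_ids}
print(sorted(area_spikes))
```

Keeping the unit IDs in a fixed order matters later: every row of the design matrix must list the units in the same column order.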

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b> (1c) Use these timestamps to build a matrix with the spike rates for each omission </b>

Your matrix will be of size '# of omissions' x '# units'

In [8]:
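# Function for getting spike rates (spikes/s) in a set of time windows
def get_trial_spike_rates(spikes, startTimes, endTimes):
    # Convert to arrays so positional indexing also works for pandas Series
    startTimes = np.asarray(startTimes)
    endTimes = np.asarray(endTimes)
    rates = np.zeros(len(startTimes))
    for i, (start, end) in enumerate(zip(startTimes, endTimes)):
        # spikes must be sorted; searchsorted finds the window edges
        startInd = np.searchsorted(spikes, start)
        endInd = np.searchsorted(spikes, end)
        rates[i] = (endInd - startInd) / (end - start)

    return rates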
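
To assemble the '# of omissions' x '# units' matrix, one option is to fill one column per unit. The sketch below applies the same `np.searchsorted` idea to all windows at once, on synthetic spike trains; the helper name `rate_matrix` and all data are hypothetical.

```python
import numpy as np

def rate_matrix(spike_trains, start_times, end_times):
    """Return a (n_trials, n_units) matrix of firing rates in spikes/s.

    spike_trains: list of sorted 1-D arrays of spike times, one per unit.
    """
    start_times = np.asarray(start_times, dtype=float)
    end_times = np.asarray(end_times, dtype=float)
    rates = np.zeros((len(start_times), len(spike_trains)))
    for u, spikes in enumerate(spike_trains):
        # Index of the first spike at or after each window edge
        first = np.searchsorted(spikes, start_times)
        last = np.searchsorted(spikes, end_times)
        rates[:, u] = (last - first) / (end_times - start_times)
    return rates

# Synthetic example: 2 units, 2 windows of 0.5 s each
trains = [np.array([0.1, 0.2, 0.7]), np.array([0.6, 0.8, 0.9])]
omission_rates = rate_matrix(trains, [0.0, 0.5], [0.5, 1.0])
print(omission_rates)
```

Each row is one window and each column one unit, which is the layout most `sklearn` estimators expect.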


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b> (2) Wrangling data from the "spontaneous activity" epoch into a decoder-friendly format. </b>


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b> (2a) Find the start and stop times of spontaneous activity </b>

They are stored in the stimulus_presentations table with `stimulus_name == 'spontaneous'`. In this case, let's just use the longest interval.
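
A small sketch of picking the longest spontaneous interval, using a made-up table with the same column names as the stimulus table. The stimulus names other than "spontaneous" and all times are invented.

```python
import pandas as pd

# Toy stand-in for the stimulus table (values are made up)
toy_presentations = pd.DataFrame({
    "stimulus_name": ["images", "spontaneous", "gabor", "spontaneous"],
    "start_time": [0.0, 100.0, 110.0, 200.0],
    "end_time": [100.0, 110.0, 200.0, 500.0],
})

spont = toy_presentations[toy_presentations["stimulus_name"] == "spontaneous"]

# Pick the row with the largest duration
durations = spont["end_time"] - spont["start_time"]
longest = spont.loc[durations.idxmax()]
spont_start, spont_end = longest["start_time"], longest["end_time"]
print(spont_start, spont_end)
```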


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b> (2b) Build a trials matrix (similar to the one above) for spontaneous activity </b>

But wait! Spontaneous activity doesn't have a trial structure. 

For now, our solution will be to randomly grab population activity during time intervals equivalent to those of omission activity. 

Create a `spont_spike_rates` matrix equivalent to the one we built above, but with trials randomly extracted from the spontaneous block.
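
One possible way to draw the pseudo-trials, sketched with made-up numbers: sample a random start time inside the spontaneous block for each omission and reuse that omission's duration. The block bounds, the number of trials, and the 0.25 s duration are assumptions for illustration, and windows drawn this way may overlap.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the draw is repeatable

# Assumed inputs (all values are made up for illustration)
spont_start, spont_end = 200.0, 500.0   # bounds of the spontaneous block (s)
omission_durations = np.full(50, 0.25)  # one duration per omission trial (s)

# Draw one random start per pseudo-trial, leaving room for the window to fit
latest_start = spont_end - omission_durations
spont_starts = rng.uniform(spont_start, latest_start)
spont_ends = spont_starts + omission_durations
print(spont_starts[:3], spont_ends[:3])
```

The resulting start and end times can be passed to the same spike-rate function used for the omission windows, so both matrices are built identically.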


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b> (2c) Combine your rates into a decoder-friendly format </b>

Stack your rates into a single `X` matrix. Generate a `y` vector that contains a 0 for omission trials and a 1 for spontaneous activity.
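
The stacking step can be sketched as follows. The two rate matrices here are random placeholders with invented shapes (the name `omission_rates` is also an assumption); in the exercise they would be the matrices built in steps 1 and 2.

```python
import numpy as np

# Placeholder rate matrices (random numbers, shapes chosen for illustration)
rng = np.random.default_rng(0)
omission_rates = rng.poisson(5, size=(50, 20)).astype(float)  # trials x units
spont_spike_rates = rng.poisson(4, size=(50, 20)).astype(float)

# Rows are trials from both epochs; columns are the same units in the same order
X = np.vstack([omission_rates, spont_spike_rates])

# Labels: 0 for omission trials, 1 for spontaneous snippets
y = np.concatenate([np.zeros(len(omission_rates)), np.ones(len(spont_spike_rates))])
print(X.shape, y.shape)
```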

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b>
<p> 3) Building a decoder for the two epochs and evaluating its performance using cross-validation
</b>


<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b>
<p> 3a) Import what you need for a decoder and cross validation
</b>

You can use a linear Support Vector Machine like we did in this morning's workshop if you like, but feel free to try out some of the other decoders in `sklearn`. Once your code is working, you might even think about comparing linear and non-linear decoders :)

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b>
<p> 3b) Train your decoder and evaluate with cross validation.  
</b>

How did we do?



In [1]:
# Hint:
from sklearn.model_selection import StratifiedKFold
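
A hedged sketch of one way to put the pieces together, using a linear SVM and stratified 5-fold cross-validation on synthetic data. The feature scaling step, the number of folds, and the toy `X` and `y` are choices made for illustration, not part of the exercise.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for X and y (two classes with different means)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (60, 20)), rng.normal(1.0, 1.0, (60, 20))])
y = np.concatenate([np.zeros(60), np.ones(60)])

# Scaling happens inside the pipeline, so it is refit on each training fold
decoder = make_pipeline(StandardScaler(), LinearSVC(max_iter=10000))

# Stratified folds keep the class balance the same in every split
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(decoder, X, y, cv=cv)
print(f"Mean accuracy: {scores.mean():.2f} (chance is 0.50 for balanced classes)")
```

Putting the scaler inside the pipeline keeps information from the held-out fold out of the training step, which is easy to get wrong when scaling `X` up front.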

<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b>
<p> 3c) What can we actually learn from this?
</b>

<p>
How well did your decoder do? 
<p>
Given that the visual stimulus was the same between conditions, is this what you expected? What does this tell you about information in V1? Do you see any caveats that need to be controlled for?

<p>
For starters, try comparing the average firing rate of each cell in each condition. You can use `np.mean(rate_matrix,axis=0)` for this, or you can be really fancy and fit linear regression models (with a single binary regressor for condition, the fitted slope is just the difference in means, so here they turn out to be the same!).
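
The comparison of per-unit mean rates might look like the sketch below. The rate matrices are random placeholders standing in for the ones built earlier, so the printed numbers say nothing about the real data.

```python
import numpy as np

# Placeholder rate matrices (random numbers standing in for real data)
rng = np.random.default_rng(0)
omission_rates = rng.poisson(5, size=(50, 20)).astype(float)
spont_spike_rates = rng.poisson(4, size=(50, 20)).astype(float)

# Mean rate of each unit within each condition
mean_omission = np.mean(omission_rates, axis=0)
mean_spont = np.mean(spont_spike_rates, axis=0)

# Positive values mean the unit fired more during omissions
rate_difference = mean_omission - mean_spont
n_higher = int(np.sum(rate_difference > 0))
print(f"{n_higher} of {rate_difference.size} units fire more during omissions")
```

If most units shift in the same direction, the decoder may be picking up a global change in firing rate rather than a pattern specific to expectation, which is one caveat worth controlling for.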



<div style="background: #DFF0D8; border-radius: 3px; padding: 10px;">
<b>
<p> 4) Explore!!  
</b>

Now that you have a basic decoder framework for this question, feel free to have some fun asking your own questions. Who knows, you might end up with some fun project ideas.

Some starter ideas include (but are not limited to):

- What happens if you select different brain areas? Are they the same? Think about comparing e.g. a cortical sensory area (like VISp) to thalamic (LGd) and "higher order" areas. What differences do you see?
- How important is the timescale of this difference? Do your results change in the later or earlier parts of the omission trials?
- What if we had separated passive and active phases of the task? Or times when the mouse disengages? Would this change some answers?
- What if you include/exclude particular cell types from your analysis?


