# Session 1 Exercises

## Part 1: Spike Counts

Imagine you've just finished recording from a V1 neuron. Across 60 experimental trials, you presented either a vertical or horizontal Gabor patch (30 each) and recorded the corresponding number of spikes (counts below). Now you want to ask some questions of your data.

In [None]:
## Spike counts: vertical Gabor patch.
spikes_v = [39, 36, 38, 32, 28, 33, 28, 29, 30, 31, 22, 37, 26, 22, 37, 
            34, 26, 30, 32, 34, 30, 26, 30, 26, 32, 30, 28, 21, 35, 41]

## Spike counts: horizontal Gabor patch.
spikes_h = [28, 19, 15, 19, 25, 27, 19, 19, 28, 18, 19, 24, 14, 24, 16, 
            11, 24, 16, 21, 22, 18, 24, 24, 20, 15, 26, 20, 17, 21, 26]

#### Exercise 1: Indexing

a) Return the spike count corresponding to the 17th presentation of the vertical stimulus.

b) Return the spike count corresponding to the fifth-from-last presentation of the horizontal stimulus.

c) Return the spike counts from every fourth presentation of the horizontal stimulus. 

d) Return the spike counts corresponding to the 3rd, 8th, and 10th-from-last presentations of the vertical stimulus. (Hint: use a list comprehension.)

e) Return the 2nd largest recorded spike count in response to the vertical stimulus. (Hint: use the built-in `sorted` function or the list `.sort()` method.)
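The indexing tools these questions rely on can be sketched on toy data. The lists `letters` and `values` below are made up for illustration and are not the exercise data.

```python
# Toy data, not the spike counts from the exercise.
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
values = [4, 9, 1, 7]

third = letters[2]                 # 3rd element: indexing starts at 0
second_from_last = letters[-2]     # negative indices count from the end
every_third = letters[::3]         # slicing takes start:stop:step

# Several arbitrary positions at once, via a list comprehension.
picked = [letters[i] for i in [0, 3, 5]]

# sorted() returns a new list in ascending order and leaves `values` untouched.
ranked = sorted(values)
second_largest = ranked[-2]
```

Note that the Nth item in everyday counting sits at index N-1, whereas the Nth-from-last item sits at index -N.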

#### Exercise 2: Built-in Functions

a) Compute and store in separate variables the max spike count from each stimulus condition. Then, using an `if/else` statement, write some code which prints out which condition had the larger max count.

b) Compute and store in separate variables the total spike counts from each stimulus condition. Then, using an `if/else` statement, write some code which prints out which condition had the smaller total count.
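As a minimal sketch of the pattern, the following compares two toy lists. The names `cond_a`, `cond_b`, and `winner` are hypothetical, and the values are not the exercise data.

```python
# Toy lists, not the spike counts from the exercise.
cond_a = [3, 8, 5]
cond_b = [7, 2, 6]

# Built-in functions that summarize a list.
max_a, max_b = max(cond_a), max(cond_b)
total_a, total_b = sum(cond_a), sum(cond_b)

# Branch on the comparison and report the result.
if max_a > max_b:
    winner = 'a'
else:
    winner = 'b'
print('Condition with the larger max:', winner)
```

A plain `if/else` treats a tie as a win for the `else` branch; an `elif` can handle that case separately if it matters.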

#### Exercise 3: Basic Scripting
a) Create a new copy of each list that contains only spike counts greater than or equal to 25. How many counts are in each new list?

b) Using a `for` loop, create a new list containing only the intersection of the two lists (i.e. containing only spike counts present in both lists).

c) Do the same as above, now using only list comprehensions.
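One possible shape for these operations is sketched below on toy lists. The names `left` and `right` are hypothetical. The sketch also assumes that each shared value should appear only once in the intersection, which is an interpretation rather than something the exercise states.

```python
# Toy lists, not the spike counts from the exercise.
left = [1, 4, 6, 9, 4]
right = [4, 5, 9, 10]

# Filtering: keep only the values that pass a condition.
left_big = [x for x in left if x >= 5]

# Intersection with a for loop. The second test skips values that were
# already collected, so the repeated 4 in `left` is stored only once.
shared = []
for x in left:
    if x in right and x not in shared:
        shared.append(x)

# A comprehension-based alternative: de-duplicate with set() first,
# then sort because sets have no guaranteed order.
shared_comp = sorted([x for x in set(left) if x in right])
```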

#### Exercise 4: Custom Functions
a) Copy the **mean** function from today's lecture. Which condition shows the greater number of spikes on average? 

b) Write a function that computes the **median** of a list. Which condition shows the greater median number of spikes?

c) Write a function that computes the **standard deviation** of a list. Which condition shows the greater standard deviation in the number of spikes? As a reminder, the formula for the standard deviation is:

$$ s = \sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_i - \bar{x})^2 }$$ 
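To see the formula in action, the sketch below applies it step by step to four toy values and compares the result with `statistics.stdev` from the standard library, which uses the same $N-1$ denominator. The variable names are illustrative only.

```python
import math
import statistics

data = [2, 4, 4, 6]    # toy values with mean 4

# Apply the formula directly: the squared deviations are 4, 0, 0, 4.
n = len(data)
mean = sum(data) / n
by_hand = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# Reference value from the standard library (sample standard deviation).
reference = statistics.stdev(data)
print(by_hand, reference)
```

A comparison like this is a convenient way to test your own function. Be aware that `np.std` divides by $N$ unless `ddof=1` is passed, so it will not match this formula by default.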

## Part 2: Challenges

#### Exercise 1

Write a function that checks if the inputted argument is even, odd, a float, or not a number (NaN).
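The building blocks for such a function can be sketched as follows. How "not a number" should be interpreted (a non-numeric input such as a string, or the floating-point value NaN) is left open by the exercise, so both checks are shown. All variable names are illustrative.

```python
import math

# Type checks: isinstance asks whether a value is of a given type.
int_check = isinstance(4, int)                    # True
float_check = isinstance(4.0, float)              # True
str_is_number = isinstance('4', (int, float))     # False: a string is not numeric

# Parity: the remainder after dividing by 2 is 0 for even integers.
remainder = 7 % 2                                 # 1, so 7 is odd

# The floating-point NaN is itself a float, so it needs its own test.
nan_value = float('nan')
nan_is_float = isinstance(nan_value, float)       # True
nan_detected = math.isnan(nan_value)              # True
```

One subtlety: `isinstance(True, int)` is also `True`, because `bool` is a subclass of `int`.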

#### Exercise 2
Starting with the list below, construct a `while` loop that returns the [Fibonacci numbers](https://en.wikipedia.org/wiki/Fibonacci_number) and terminates only when the most recent number is greater than 5000.

#### Exercise 3
Define a function that checks if an inputted integer is prime. Test your function on the following numbers: 
>1411, 1147, 2327, 2683, 33233

Only one of the numbers above is prime.

As an added challenge, if the tested integer is not prime, have the function return a number the integer is divisible by.

## Part 3: Basic NumPy Operations
a) Generate an array of numbers 0-24. Reshape to a 5x5 matrix.

b) Extract the diagonal of this matrix.

c) Multiply the matrix by an identity matrix of the same shape, using matrix multiplication rather than elementwise multiplication. Confirm that the result is identical to the original.

Hint: Use the `np.all` function to confirm that all elements are equal.

d) Return the indices of the matrix where the elements are greater than 15.

e) Using `np.where`, set all elements of the matrix greater than 15 to 1, else 0.


f) Set all elements of the matrix greater than 15 to 2, less than 5 to 1, else 0.

Hint: a call to `np.where` can be passed as an argument to another `np.where`.

g) Define a demean function.

h) Apply the demean function across each row of the matrix.
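The NumPy calls involved in this part can be sketched on smaller arrays. The shapes, thresholds, and names below (`sq`, `mat`, `three_way`, `row_demeaned`) are chosen for illustration and differ from the 5x5 matrix in the exercise.

```python
import numpy as np

# Diagonal and identity, on a 3x3 example.
sq = np.arange(9).reshape(3, 3)
diag = np.diag(sq)                         # main diagonal
same = np.all(sq @ np.eye(3) == sq)        # @ is the matrix product

# Conditions, on a 3x4 example.
mat = np.arange(12).reshape(3, 4)

# With one argument, np.where returns the indices where the condition
# holds: one array of row indices and one of column indices.
rows, cols = np.where(mat > 8)

# With three arguments, it picks between two values elementwise.
binary = np.where(mat > 8, 1, 0)

# Three-way recoding: nest a second np.where in the "else" slot.
three_way = np.where(mat > 8, 2, np.where(mat < 3, 1, 0))

def demean(x):
    """Subtract the mean of x from every element of x."""
    return x - x.mean()

# axis=1 hands demean one row at a time.
row_demeaned = np.apply_along_axis(demean, 1, mat)
```

Broadcasting offers an alternative to `np.apply_along_axis`: `mat - mat.mean(axis=1, keepdims=True)` gives the same result without calling a function per row.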

## Part 4: Example Dataset

In this exercise, you will be using NumPy to manipulate and analyze simulated recording data. 

Imagine you have recorded the response of several neurons to some input stimuli. A week later, when you have to make a figure, you cannot remember exactly what you did. Luckily for you, you have the data and a description of each variable: 

| Variable  | Description |
|:---------:|-------------------------------------------------------------------|
| spikes  | Binary matrix, (n_trials, n_samples).<br>_1_ = spike was detected.<br>_0_ = no spike was detected. |
| times   | The start time of each sample. |
| neurons | An array denoting the neuron identity for a trial. |
| stimuli | An array denoting the stimulus used for a trial. |
| sfreq   | The sampling frequency of the recording. |

Work on the exercises below. By the end, you should be able to answer some questions about this dataset.

In [None]:
import numpy as np

## Load compressed file and extract data.
npz = np.load('spikes.npz')    
spikes = npz['spikes']               # Raw spike data.
times  = np.round(npz['times'],3)    # Recording time info.
neurons = npz['neurons']             # Neuron identifier
stimuli = npz['stimuli']             # Stimulus identifier
sfreq = float(npz['sfreq'])          # Sampling frequency

#### Exercise 1: Interrogating the Data

a) What is the shape of the raw spiking data? 

b) How many trials are there per neuron?

c) How many trials are there per stimulus?

d) How long is each trial of recording? (Hint: this dataset is stimulus-locked, meaning that 0 corresponds to the onset of the stimulus.)

e) How long is each sample of recording? That is, what is the time step?
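The kinds of queries these questions call for can be sketched on synthetic stand-ins. Every value below, including the sampling frequency, the neuron labels, and the array sizes, is made up; the contents of `spikes.npz` will differ.

```python
import numpy as np

# Synthetic stand-ins that play the same roles as the real variables.
sfreq = 10.0                                   # samples per second
times = np.arange(-10, 20) / sfreq             # -1.0, -0.9, ..., 1.9
neurons = np.array(['A', 'A', 'B', 'B', 'B', 'A'])
rng = np.random.default_rng(0)
spikes = rng.integers(0, 2, size=(neurons.size, times.size))   # 0s and 1s

# Shape of the data.
n_trials, n_samples = spikes.shape

# Unique labels and how many trials carry each one.
labels, trials_per_label = np.unique(neurons, return_counts=True)

# Time step, from the sampling frequency or from the time stamps.
time_step = 1 / sfreq
step_from_times = np.diff(times).mean()

# Because `times` holds sample start times, the last sample adds one
# more step beyond times.max().
duration = n_samples * time_step
```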

#### Exercise 2: Averaging the Raw Data

To get started, you want to eyeball the data and see if there are any noticeable differences between the spiking patterns of the neurons.

a) For each unique neuron, compute the average number of spikes per time step and store the result in a new variable.

b) Using the starter code below, plot each of the raw traces. (Don't worry much about the plotting syntax. We'll talk more about visualization next week.) 

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

## Initialize canvas.
fig, ax = plt.subplots(1,1,figsize=(12,4))

## Insert the averaged timeseries below. 
ax.plot(times, _NEURON_A_AVG_HERE_, color='blue')
ax.plot(times, _NEURON_B_AVG_HERE_, color='orange')
ax.set(xlim=(times.min(), times.max()))

plt.tight_layout()

c) Based on the plot above, what can you infer? Are the data messy? Does it look like there is a difference in spike counts between the neurons?

#### Exercise 3: Binning the Spike Data

To get a more reliable estimate of spiking activity, neuroscientists often bin spike count data in order to measure the number of spikes per some time bin. Over the following steps, you will compute for each trial the spike count per **100ms time bin.**

a) How many 100ms bins will fit the length of each trial? How many samples will go into each bin?

b) Make a new variable, `counts`, that is an empty NumPy matrix (e.g. a matrix of zeros). The matrix should be shape (n_trials, n_bins). We will eventually store the binned spike counts in this variable.

c) Make a new variable, `bin_starts`, that contains the start time of each bin (i.e. -5.0, -4.9, -4.8, ..., 9.9). The final array should have shape (n_bins,).

(Hint: at least two of the array generating functions discussed earlier can do this.)

d) Now comes the hard part. Here we will write a for loop that will iteratively count the number of spikes in each bin for all trials.

To get you started, the following is some pseudo-code for how you might solve the problem. However, feel free to ignore if you have your own solution!

- **Top-level:** `for` loop iterating over the _start_ of each time bin and its corresponding index. (Hint: which `for` loop function passes both the member of a list and its index?)
    - _Step 1:_ Find the indices of the time points in `times` which belong to the current bin. (Hint: Google `np.logical_and`.)
    - _Step 2:_ Index into `spikes` using the indices found in Step 1. Sum within each trial.
    - _Step 3:_ Store the resulting array from Step 2 in the corresponding column of `counts`.
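The pseudo-code above can be sketched on synthetic data as follows. This is one possible approach under stated assumptions, not a reference solution: the data are random, the recording is 1 s long at 100 Hz, and the bin width is 0.25 s rather than the 100 ms used in the exercise.

```python
import numpy as np

# Synthetic data: 4 trials, 1 s of recording at 100 Hz (made-up values).
sfreq = 100.0
times = np.arange(0, 100) / sfreq              # 0.00, 0.01, ..., 0.99
rng = np.random.default_rng(1)
spikes = rng.integers(0, 2, size=(4, times.size))

bin_width = 0.25                               # seconds
n_bins = 4
bin_starts = np.arange(n_bins) * bin_width     # 0.0, 0.25, 0.5, 0.75
alt_starts = np.linspace(0.0, 0.75, n_bins)    # same values via linspace
counts = np.zeros((spikes.shape[0], n_bins))

for i, start in enumerate(bin_starts):
    # Step 1: samples whose start time falls in [start, start + bin_width).
    in_bin = np.logical_and(times >= start, times < start + bin_width)
    # Step 2: keep those columns of spikes and sum within each trial.
    per_trial = spikes[:, in_bin].sum(axis=1)
    # Step 3: store the per-trial sums in column i of counts.
    counts[:, i] = per_trial
```

A useful sanity check is that `counts.sum()` equals `spikes.sum()`, meaning every sample landed in exactly one bin. With a width such as 0.1, which has no exact binary representation, comparisons at bin edges can be affected by floating-point error, so rounding `bin_starts` to a few decimals may help.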

#### Exercise 4: Averaging the Count Data

a) For each unique neuron, compute the average (mean) spike count per time bin and store the result in a new variable.

b) Using the starter code below, plot each of the rate traces. 

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

## Initialize canvas.
fig, ax = plt.subplots(1,1,figsize=(12,4))

## Insert the timeseries here.
ax.plot(bin_starts, _NEURON_A_COUNTS_HERE_, color='blue')
ax.plot(bin_starts, _NEURON_B_COUNTS_HERE_, color='orange')
ax.set(xlim=(times.min(), times.max()))

plt.tight_layout()

c) Based on the plot above, what can you infer? Are the data cleaner? Does this change your interpretation from earlier?

#### Exercise 5: Lazy Data Analysis

a) If we want to know whether a neuron is more sensitive to a stimulus, or just more active overall, a good test is to measure the spike counts during some baseline period.

For each neuron, compute the average spike count in the pre-stimulus period, from -5 s to 0 s. Is one neuron more active than the other overall?

b) Find the bin at which spike counts peak for each neuron. (Hint: Google `np.argmax`.)

c) The recordings above actually used two stimuli. Compute the spike rate for each combination of neuron and stimulus, then copy the visualization code above to plot the results. (A list of additional Matplotlib colors can be found [here](https://matplotlib.org/2.0.2/api/colors_api.html).)

d) Does it look like there's a difference between the stimuli?
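The selection and summary steps in this exercise can be sketched on a small made-up array. The labels `'A'`, `'B'`, `1`, and `2` and all counts below are assumptions for illustration; the labels stored in the real `neurons` and `stimuli` arrays may differ.

```python
import numpy as np

# Made-up binned counts: 6 trials x 5 bins.
bin_starts = np.array([-0.2, -0.1, 0.0, 0.1, 0.2])
counts = np.array([[1, 0, 2, 5, 3],
                   [0, 1, 3, 6, 2],
                   [1, 1, 1, 2, 4],
                   [0, 2, 2, 3, 5],
                   [1, 0, 4, 7, 1],
                   [2, 1, 0, 1, 6]])
neurons = np.array(['A', 'A', 'B', 'B', 'A', 'B'])
stimuli = np.array([1, 2, 1, 2, 1, 2])

# Mean trace for one neuron: select its trials, then average over trials.
mean_a = counts[neurons == 'A'].mean(axis=0)

# Baseline: average only the bins that start before stimulus onset.
baseline_a = mean_a[bin_starts < 0].mean()

# Peak: np.argmax returns the position of the maximum, not its value.
peak_bin_a = np.argmax(mean_a)
peak_time_a = bin_starts[peak_bin_a]

# Neuron and stimulus together: combine boolean masks with &.
mask = (neurons == 'A') & (stimuli == 1)
mean_a_stim1 = counts[mask].mean(axis=0)
```

The parentheses around each comparison are required, because `&` binds more tightly than `==` in Python.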