# Week 2: numpy and matplotlib

## Developing an exercise

I will be running the first few exercises, but subsequent exercises will be run by one of you. If you haven't already, please sign up for a slot on our shared Google doc. If preferred, you may work in small groups of two to three people. You are expected to be in regular contact with the course staff as you develop the exercise so that we can guide you and ensure the exercise is of appropriate scope and level for the class. Be sure to contact us early. You are expected to have a concept ready at least two weeks before your presentation slot and a full draft of the exercise complete a full week before your presentation slot.

Although we try to ensure that the exercise is of appropriate length for the class, we may not finish all the problems before the end of class. Please finish the exercise outside of class if needed.

*Grading for registered students is based on your presentation, so you must sign up for a presentation if you're registered for the course.* 

Today's notebook provides a model for presenting an analysis method to the class. We briefly cover the essential motivation for the experiment and details of the data so that everyone understands how the analysis is useful. We then build up the analysis step-by-step using a sequence of code snippets and exercises.

## Notes/followup from week 1

### Pointer/reminder about Jupyter notebooks

Jupyter notebooks organize your code and text into a sequence of cells. Each cell can be run independently of the others, but the order of execution matters. Take a look at the two cells below. Run the one named `Cell 1`. Why do we get an error?

Now, run `Cell 2` and then go back and run `Cell 1` again. Why does it work now?

What happens if we run `Cell 1` again? Why has the output changed?

In [None]:
# Cell 1
a = a * 2
print('a is now', a)

In [None]:
# Cell 2
a = 1

### Iteration and dictionaries

Dictionaries also provide fancy ways for looping:

In [None]:
d = {'dog': 'woof', 'cat': 'meow', 'cow': 'moo', 'ferret': 'dook'}
# iterate through keys
for k in d:
    print(k)

In [None]:
# iterate through values
for v in d.values():
    print(v)

In [None]:
# iterate through both
for k,v in d.items():
    print("The", k,"goes", v)

We'll have some practice with for loops and iterables below.

## Introduction: Getting started with numpy

Today we dip our toes in the waters of numerical processing.
`numpy` ("Numerical Python") is a core library used in pretty much any analysis written in Python. It implements a number of basic linear algebra routines, with the idea that mathematical operations can be applied to time series and arrays of numbers in much the same way as to single variables in standard programming. Conceptually, numpy borrows substantially from the numerical core of Matlab, which itself is built on the venerated LINPACK library.

`matplotlib` is a widely used plotting library, which also has its roots in Matlab. Many other plotting libraries exist, and many make prettier pictures. However, matplotlib can usually get the job done. 

To illustrate the use of basic numpy and matplotlib functions, we'll perform basic analysis of neural spike data recorded from auditory cortex. There are a few quirks around plotting auditory tuning curves that take advantage of optional features in matplotlib, such as tweaking the details of the x axis. While you may not use these specific options all the time, they provide examples of how you can often find little features to make your plots prettier and easier to read.

### Auditory neurophysiology experiments

An auditory neurophysiology experiment consists of recording neural activity during the presentation of one or more sounds. A goal of these studies is to understand what aspects of the sound input, $x(t)$, evoke changes in neural activity, $y(t)$.

<img src="images/auditory_ephys_cartoon.png" width="400"/>

The function (or "encoding model") $h$ that describes the relationship between $x$ and $y$ can be very complex, but today things are simple. In the current dataset, single-neuron spiking activity is recorded from primary auditory cortex (A1) of a ferret during presentation of a sequence of pure tones of varying frequency. We'd like to know how spiking activity varies as a function of the tone frequency.

### Tuning curves

Here's an example frequency tuning curve, which shows the average spike rate evoked by tones of different frequencies for a single neuron in auditory cortex. The tuning curve provides a clear and simple characterization of the range of tone frequencies that drive a neural response, that is, a neuron's receptive field.

<img src="images/tuning_curve_example.png" width="400"/>

### Spike rasters

Here's a spike raster for the same neuron, which gives more information about the dynamics of spontaneous and sound-evoked activity. Ticks in each row show the time of spikes during a single tone presentation. This plot is complex in that data from multiple different tones are stacked on top of each other. Note that responses to different tones can vary in time, sometimes occurring only transiently after tone onset and sometimes sustained over the duration of the tone.

<img src="images/raster_example.png"/>

### Today's learning objectives

* basics of numpy for vector/array computation
* simple plotting routines using matplotlib
* measure stimulus-evoked firing rate in single neuron data recorded from auditory cortex
* compute and plot a frequency tuning curve
* plot a spike raster, aligned to stimulus onset times


### Function list

There is a lot you can do with `numpy` and `matplotlib`. We'll introduce a bunch of basic functions from both libraries. It's a lot to digest, but many of these functions will keep showing up through the course.

Core Python statements and functions
* `for`
* `range`
* `enumerate`
* `zip`

`numpy` functions
* `np.array`
* `np.shape` (or `arr.shape` property)
* `np.sum`, `np.mean`, `np.median`, `np.max`, `np.min`
* `np.zeros`, `np.ones`, `np.arange`
* `np.loadtxt`
* `np.round`
* `np.unique`
* `np.append`

`matplotlib` functions
* `plt.plot`
* `plt.xlabel`, `plt.ylabel`, `plt.title`
* `plt.axhline`, `plt.axvline`
* `plt.legend`
* `plt.rcParams`
* `plt.xlim`, `plt.ylim`
* `plt.xticks`, `plt.yticks`


## Introduction to numpy arrays and matrices

### Lists vs. arrays

We start by importing the numpy and matplotlib libraries, which is typically the first part of any Python script. `np` and `plt` are conventional shorthands for `numpy` and `matplotlib.pyplot`, which make subsequent code more compact. The line `%matplotlib inline` may not be necessary for your system, but is necessary to get plots to display in some Jupyter Notebook installations.

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

You should already be familiar with the Python `list` data structure. Remember also that the output of the last line of a Jupyter Notebook cell is displayed when the cell is executed.

In [None]:
l = [1, 2, 3]
l

A numpy `array` looks a lot like a list, though the declaration is a bit more complicated. Pass a list into `np.array` to turn it into an array.

In [None]:
arr = np.array([1, 2, 3])
arr

So what's the difference between a list and an array?  Say you wanted to add 1 to every entry in a list. It might be nice if you could do this, but if you try to execute it, you'll get an error.  Take a moment to see if you can make sense of this error message.

In [None]:
l+1

Instead, the + operation is used to concatenate lists:

In [None]:
l + [1]

With an array, however, entry-wise addition is the native behavior.

In [None]:
arr+1

Notice also that * (multiplication) has a very different effect on lists vs. arrays:

In [None]:
l*2

In [None]:
arr*2

There are many other differences between lists and arrays that will become clear below.
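
For comparison, here is a small sketch of how you would add 1 to every entry of a plain list. You need a loop or a list comprehension, whereas the array does it in one operation. The variable names are just for illustration.

```python
import numpy as np

l = [1, 2, 3]

# With a list, we have to visit each entry ourselves
l_plus_one = [x + 1 for x in l]

# With an array, the addition is applied to every entry at once
arr_plus_one = np.array(l) + 1

print(l_plus_one)
print(arr_plus_one)
```

Both approaches give the same numbers, but the array version is shorter and, for large data sets, usually much faster.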

`numpy` commands like `ones`, `zeros` and `arange` are convenient for generating pre-populated arrays.

In [None]:
np.ones(10)  # all ones

In [None]:
np.zeros(5)  # all zeros

In [None]:
np.arange(6)   # array with values 0...5

### Basic array operations

Often you want to compute statistics on the numbers in an array. Let's start with a slightly bigger example.

In [None]:
a = np.array([1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144])

There are a bunch of numpy functions to do the basics.

In [None]:
np.sum(a)

In [None]:
np.mean(a)

In [None]:
np.median(a)

In [None]:
np.max(a)

In [None]:
np.min(a)

Note that many of these functions are replicated as methods on the array object:

In [None]:
a.mean()

But, unfortunately, not all of them:

In [None]:
a.median()

What methods are available for an array? Quite a few!

In [None]:
dir(a)
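
The output of `dir` is long because it includes many special names that start with underscores. As a rough sketch, you can filter those out with a list comprehension and then check whether a particular method exists. The name `public_names` is made up for this example.

```python
import numpy as np

a = np.array([1, 2, 3, 5, 8])

# Keep only names that don't start with an underscore
public_names = [name for name in dir(a) if not name.startswith('_')]

print(len(public_names), "public attributes and methods")
print("has mean:", 'mean' in public_names)
print("has median:", 'median' in public_names)
```

This confirms what we saw above: `mean` is available as a method, but for the median you need the function `np.median(a)`.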

### 1-D versus 2-D arrays

The `shape` property is often useful.

In [None]:
arr.shape

Arrays are not limited to being 1-dimensional. In linear algebra, a 1-d array is called a vector. A 2-d array is called a matrix. A dimension is referred to as an `axis` in numpy.

In [None]:
arr2 = np.array([[1,2,3],[4,5,6]])
arr2.shape

In [None]:
# take mean only along axis 1 (i.e., 2nd dimension of the array)
arr2.mean(axis=1)

Important note: Defining big matrices by hand is cumbersome and often requires a confusing number of square brackets. These examples are illustrative, but not typical of how you create them in real data analysis settings, as you'll see below.
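
As one illustration of building a matrix without typing every bracket, the sketch below generates the same 2x3 matrix from `np.arange` and the array method `reshape` (a method we haven't covered and won't need today). It also shows the mean along each axis.

```python
import numpy as np

# Values 1..6 arranged into 2 rows and 3 columns
arr2 = np.arange(1, 7).reshape(2, 3)
print(arr2)

# axis=0 averages down the rows, giving one value per column
col_means = arr2.mean(axis=0)
# axis=1 averages across the columns, giving one value per row
row_means = arr2.mean(axis=1)

print("column means:", col_means)
print("row means:", row_means)
```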

One confusing detail about numpy data structures is that an array of length N is not *exactly* the same thing as an Nx1 or 1xN matrix. For today, it shouldn't be an issue, but it's something to be aware of.

In [None]:
arr1xN = np.array([[1,2,3]])
arrNx1 = np.array([[1],[2],[3]])
arr.shape, arr1xN.shape, arrNx1.shape

Pro tip: A very convenient feature of numpy is the behavior called broadcasting. Look what happens if I add a 1x3 matrix to a 2x3 matrix. Dimensions of length 1 are automatically tiled (or "broadcast") to match dimensions in the other matrix.

In [None]:
print('arr1xN:\n', arr1xN)
print('arr2:\n', arr2)
print('arr1xN + arr2:\n', arr1xN + arr2)

Note that broadcasting fails when the dimensions with length >1 don't match:

In [None]:
arrNx1 + arr2

Bonus exercise. Can you create and add a 1x3 and a 4x1 array? How big is the result?

In [None]:
#Answer
a1 = np.array([[1,2,3]])
a2 = np.array([[1],[2],[3],[4]])
a12 = a1+a2
a12.shape
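
If you want to check what shape broadcasting will produce before doing any arithmetic, one option is `np.broadcast`, which applies the same rules without adding anything. The sketch below also catches the `ValueError` that numpy raises when shapes are incompatible. The variable names are made up for this example.

```python
import numpy as np

row = np.array([[1, 2, 3]])                # shape (1, 3)
col = np.array([[1], [2], [3], [4]])       # shape (4, 1)
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)

# Shape that row + col would have, without computing the sum
predicted_shape = np.broadcast(row, col).shape
print("row + col will have shape", predicted_shape)

# (4, 1) and (2, 3) disagree on axis 0 (4 vs. 2), so this fails
try:
    col + matrix
    broadcast_ok = True
except ValueError as err:
    broadcast_ok = False
    print("broadcast failed:", err)
```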

To generate a 2D array of zeros, pass `np.zeros` a tuple or list of dimensions.

In [None]:
np.zeros((4,4))

### Slicing arrays

Arrays can be sliced just like lists.

In [None]:
print("First 2 values of a:", a[:2])

In [None]:
print("Last 3 values of a:", a[-3:])

But there's more! You can also pull out any subset of values indexed by a list:

In [None]:
i_want = [1, 2, 4, 10]
a[i_want]

In [None]:
i_want_repeats=[1,1,1,1,2,2,2,2]
a[i_want_repeats]

For n-dim arrays, you can provide an index for each dimension.

In [None]:
a12[[2,3],:2]

If you omit the last dimension, it's like passing a `:`

In [None]:
a12[[2,3]]

### Exercise - Changing array dimensions

What is the size of `a12[2]`? How many dimensions does it have?

In [None]:
# Answer
print("a12[2] shape:", a12[2].shape)
print("number of dimensions after slicing out a single value from axis 0:", a12[2].ndim)

What if I index using a list containing a single value? Notice the shape and number of dimensions now:

In [None]:
a12[[2],:]

In [None]:
a12[[2],:].shape

### Boolean operations on arrays

It's easy to select (or "slice") rows, columns and submatrices from matrices, the same way as you might from a list.

In [None]:
arr2[:,0]

In [None]:
arr2[1,:]

In [None]:
arr2[:, 0:2]

But say you want to select entries from a vector based on whether they meet some criterion. Let's start with this array:

In [None]:
a = np.array([1,10,2,9,3,8,4,7,5,6])

We can easily find each entry with a value > 5:

In [None]:
a>5

Logical operations return an array of boolean values (False, True), which are equivalent to (0, 1). So we can count the number of True values easily:

In [None]:
np.sum(a>5)

And it's very easy to select these values:

In [None]:
a[a>5]

### Exercise - How many matches?

How many values of `a` are <4?  Can you perform this operation in a single line of code?

In [None]:
# Answer
(a<4).sum()

### Exercise - Operate on a subset

What's the mean value of entries >4?

In [None]:
# Answer
(a[a>4]).mean()
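
Boolean arrays can also be combined. The sketch below selects values inside a range by joining two comparisons with `&` (entry-wise "and"). The parentheses around each comparison are required, because `&` binds more tightly than `>` or `<=`. Python's ordinary `and` keyword doesn't work here, since it expects a single True/False value rather than an array.

```python
import numpy as np

a = np.array([1, 10, 2, 9, 3, 8, 4, 7, 5, 6])

# True only where BOTH conditions hold
in_range = (a > 3) & (a <= 8)

selected = a[in_range]
n_selected = in_range.sum()

print("values in range:", selected)
print("how many:", n_selected)
```

We'll use exactly this pattern below to find spikes that fall between a start time and a stop time.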

## Analyzing spike data in numpy

### Load data from csv files

Loading data into python (or any analysis software) can sometimes be the most confusing part of an analysis. Today, we'll try to make that as simple as possible. We'll load from an experiment where we recorded single-unit neural activity while presenting a set of tones.

In [None]:
spikes1 = np.loadtxt('data/CRD016c-40-1.csv', delimiter=',')
cell1_label = 'CRD016c-40-1'

`spikes1` is simply an array of times (in seconds) when spikes were recorded from a neuron. `cell1_label` is used to keep track of the identity of the neuron and the experiment in which it was recorded.

In [None]:
spikes1

In [None]:
print("Total number of spikes:", len(spikes1))

In [None]:
print("First spike time:", spikes1.min(), "sec")
print("Last spike time:", spikes1.max(), "sec")

Sometimes numpy carries numbers out to way more decimal places than you want to read. The `round` function is useful for tidying things up.

### Exercise - Getting help on a function

We want to round a number to 3 decimal places. I've told you that `np.round` can do it, but how does it work? The `?` operation is a special way of displaying help for a function. Try asking for help, then see if you can use that information to round `x` to 3 decimal places:

In [None]:
np.round?

In [None]:
x = spikes1.min()
# Answer
np.round(x, 3)
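
Note that the second argument of `np.round` counts decimal places, which is not the same thing as significant digits. The sketch below shows the difference using two made-up numbers. The string-formatting trick with `.3g` is one possible way to get significant digits in plain Python and is not something `np.round` does for you.

```python
import numpy as np

x = 123.456789
small = 0.000123456

# Rounding to 3 decimal places
decimals_x = np.round(x, 3)
decimals_small = np.round(small, 3)
print(decimals_x, decimals_small)

# Rounding to 3 significant digits via string formatting
sig3_x = float(f"{x:.3g}")
sig3_small = float(f"{small:.3g}")
print(sig3_x, sig3_small)
```

For spike times measured in seconds, rounding to 3 decimal places means rounding to the nearest millisecond.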

### Load stimulus information

So we have spike data. Now we need to learn when the stimuli were presented. That information is stored in a second file.

In [None]:
stim = np.loadtxt('data/stim.csv', delimiter=',')
stim = np.round(stim, 3)
stim.shape

`stim` has 251 rows and 3 columns. Each row provides information about a different stimulus event. The first column tells you the frequency of each sound that was presented.

In [None]:
stim[:,0]

The second and third columns indicate the times when the sound started and stopped.

In [None]:
for i in range(15):
    print(i, stim[i,0], "Hz", stim[i,1], "-", stim[i,2], "sec")

### Exercise - How long were the stimuli?

Compute the mean difference between each stop and start time.

In [None]:
# Answer
(stim[:,2]-stim[:,1]).mean()

One stimulus was 933 Hz. Let's find all those events.

In [None]:
f = 933
b = (stim[:,0]==f)
b

Notice that `b` is the same size as `stim[:,0]`, but `True` only when the value is 933.

### Exercise - How many matching stimulus events?

How many times was the 933 Hz stimulus presented?  Use the counting operation we used on boolean vectors above.

In [None]:
# Answer
f=933
np.sum(stim[:,0]==f)  # or b.sum()

You can get a quick view of when 933 Hz was played by looking at the matching submatrix of `stim`

In [None]:
stim[b,:]

Let's pull out those times in smaller vectors with logical names.

In [None]:
f = 933
b = (stim[:,0]==f)
start_times = stim[b,1]
stop_times = stim[b,2]

print("Times of", f, "Hz onset:", start_times, "sec")

### Computing sound-evoked spike rate
Now the analysis begins!  We want to count how many spikes the neuron produced when each sound was played. Let's start with the first presentation of the 933 Hz tone.

In [None]:
s = start_times[0]
e = stop_times[0]
print("start:",s,"stop:",e)
spikes1[(spikes1>s) & (spikes1<=e)]

We can do the same thing for the second presentation of the 933 Hz tone.

In [None]:
s = start_times[1]
e = stop_times[1]
spikes1[(spikes1>s) & (spikes1<=e)]

For now, we don't care exactly when the spikes happened, but rather how many there were. Note that when we compute a `sum`, True is equal to 1 and False is equal to 0.

In [None]:
np.sum((spikes1>s) & (spikes1<=e))

Let's count how many spikes occurred during each presentation of 933 Hz.

In [None]:
spike_counts = np.zeros(len(start_times))
for i in range(len(start_times)):
    s = start_times[i]
    e = stop_times[i]
    spike_counts[i] = np.sum((spikes1>s) & (spikes1<=e))
    print("Trial", i, start_times[i] ,"-", stop_times[i], "sec:", spike_counts[i], "spikes")
    
mean_spikes = np.round(spike_counts.mean(),2)
print("Mean spike count:", mean_spikes)

It's often useful to compute spike rate (spikes per second) rather than simply counting them.

In [None]:
stim_duration = np.round(np.mean(stop_times-start_times),4)
spike_rate = np.round(mean_spikes/stim_duration,4)

print(mean_spikes, "spikes during", stim_duration, "sec translate into a firing rate of", spike_rate, "spikes/s")
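
Written as a formula, the calculation above is the mean spike count across trials divided by the stimulus duration. The symbols are introduced just for this note: $n_i$ is the spike count on trial $i$, $N$ is the number of trials, $T$ is the stimulus duration in seconds, and $r$ is the firing rate.

$$ r = \frac{1}{T} \cdot \frac{1}{N} \sum_{i=1}^{N} n_i $$

Because $T$ is the same on every trial, this is equal to the average of the single-trial rates $n_i / T$.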

We're doing a lot of for-looping and indexing. Recall that `enumerate` provides a more "pythonic" way of iterating through lists of data:

In [None]:
data = np.array([1,2,3,4,5])
for i,x in enumerate(data):
    print(i, x**2)

What if you have two lists, like `start_times` and `stop_times`?  There's the `zip` function that lets you iterate through two or more lists at once.

In [None]:
for s,e in zip(start_times,stop_times):
    print(f"stop-start: {e}-{s} = {np.round(e-s,3)}")

And if you want to get really crazy, you can nest `zip` inside `enumerate`.

In [None]:
for i, (s,e) in enumerate(zip(start_times,stop_times)):
    print(f"Trial {i}. stop-start: {e}-{s} = {np.round(e-s,3)}")

So here's a more "pythonic" way to calculate mean spike rate evoked by 933 Hz. Note that the code is organized to follow the conceptual steps of calculating the mean response on each trial, and then averaging across trials.

In [None]:
trial_count = len(start_times)
single_trial_rates = np.zeros(trial_count)
for i, (s,e) in enumerate(zip(start_times,stop_times)):
    # calculate mean spike rate on each trial
    single_trial_rates[i] = np.sum((spikes1>s) & (spikes1<=e)) / stim_duration

# average across trials
spike_rate = np.mean(single_trial_rates)
spike_rate

### Exercise - Find times for a different tone frequency
Now, say we want to find out when a different tone was played, say 2462 Hz. Redefine `start_times` and `stop_times` for those tones. (Hint: you can do this by very slightly modifying code from above)

In [None]:
# specify the new frequency in a variable
f = 2462
# Answer
b = (stim[:,0]==f)
start_times = stim[b,1]
stop_times = stim[b,2]

Then you can run the same code as above, but with the new values of `f`, `start_times` and `stop_times`.

In [None]:
trial_count = len(start_times)
single_trial_rates = np.zeros(trial_count)
for i, (s,e) in enumerate(zip(start_times,stop_times)):
    # calculate mean spike rate on each trial
    single_trial_rates[i] = np.sum((spikes1>s) & (spikes1<=e)) / stim_duration

# average across trials
spike_rate = np.mean(single_trial_rates)
print(f, "Hz stim: evoked rate is",np.round(spike_rate,2), "spk/s")

To compute a tuning curve, we want to compute the evoked spike rate for each stimulus. First we compute a list of all unique stimuli that were presented, using the `unique` function.

In [None]:
f_list = np.unique(stim[:,0])
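
As an aside, `np.unique` can also report how many times each value occurred if you pass `return_counts=True`. The sketch below uses a short made-up frequency column rather than the real `stim` matrix.

```python
import numpy as np

# Made-up stimulus frequencies for illustration (not the real data)
freqs = np.array([933., 2462., 933., 500., 2462., 933.])

# Sorted unique values, plus the number of occurrences of each
unique_freqs, counts = np.unique(freqs, return_counts=True)

for f, n in zip(unique_freqs, counts):
    print(f, "Hz was presented", n, "times")
```

This gives the same answer as summing a boolean vector for each frequency, but for all frequencies in one call.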

Then "wrap" our spike rate computation in a for loop.

In [None]:
spike_rates = np.zeros(f_list.shape)
for j, f in enumerate(f_list):   # outer loop iterates through each unique frequency
    b = (stim[:,0]==f)
    start_times = stim[b,1]
    stop_times = stim[b,2]

    trial_count = len(start_times)
    single_trial_rates = np.zeros(trial_count)
    for i, (s,e) in enumerate(zip(start_times,stop_times)):
        # calculate mean spike rate on each trial
        single_trial_rates[i] = np.sum((spikes1>s) & (spikes1<=e)) / stim_duration
    
    # average across trials
    spike_rates[j] = np.mean(single_trial_rates)
    print(f, "Hz stim: evoked rate is",np.round(spike_rates[j],2), "spk/s")

Note that we've conveniently stored all the spike rates in an array, `spike_rates`. That will be handy for our plots!

In [None]:
spike_rates
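
Since we keep reusing the same few lines to compute an evoked rate, one option is to wrap them in a function. This is only a sketch: the function name `mean_evoked_rate` and the tiny data set are made up for illustration, and the logic simply mirrors the loop used above.

```python
import numpy as np

def mean_evoked_rate(spikes, starts, stops, duration):
    """Average spike rate (spikes/sec) across trials."""
    single_trial_rates = np.zeros(len(starts))
    for i, (s, e) in enumerate(zip(starts, stops)):
        # count spikes in the window (s, e] and convert to a rate
        single_trial_rates[i] = np.sum((spikes > s) & (spikes <= e)) / duration
    return np.mean(single_trial_rates)

# Made-up spike times and two 0.3-sec trials (not the real data)
fake_spikes = np.array([0.05, 0.12, 0.20, 0.95, 1.10, 1.15, 1.25, 2.5])
fake_starts = np.array([0.0, 1.0])
fake_stops = np.array([0.3, 1.3])

rate = mean_evoked_rate(fake_spikes, fake_starts, fake_stops, duration=0.3)
print("mean evoked rate:", np.round(rate, 2), "spk/s")
```

With a function like this, the body of the tuning curve loop shrinks to a single call per frequency, and the same function works for a second neuron without copying code.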

## Plotting tuning curves

Tuning curves are a lot easier to understand if we plot them. So let's see what we've got!

We've already imported the matplotlib library into `plt`. Now we use the `plot()` function to generate a tuning curve. `plot` has lots of optional parameters to make things pretty, but the simple starting point is `plt.plot(x, y)`, which plots vector `y` as a function of vector `x`. Note that `x` and `y` have to have the same length.

In [None]:
plt.plot(f_list, spike_rates)

### Labeling axes and other annotations

This graph may make sense, but it's ALWAYS a good idea to label things. The `xlabel` and `ylabel` functions are easy. Also, the `marker='o'` option for `plot` puts dots at the actual data points. And one random note, the semicolon (`;`) on the last line prevents the extra output from displaying.

In [None]:
plt.plot(f_list, spike_rates, marker='o');
plt.xlabel('Frequency (Hz)')
plt.ylabel('Mean evoked spikes/sec')
plt.title(cell1_label + ' frequency tuning curve');

### Adding a second line to a plot

One other thing that helps in interpreting tuning curves is to know the spontaneous firing rate of the neuron. We can measure that from the average spike rate during the 0.1 sec preceding every stimulus (since we happen to know that no tone was ever played during that time window).

In [None]:
all_start_times = stim[:,1]
all_start_times[:10]

We define `pre_start_times` as 0.1 sec before each tone onset time, and then use the same strategy we used above to measure evoked rate. Now, however, we can average over all stimulus events to get a single spontaneous rate.

In [None]:
pre_duration = 0.1
pre_start_times = all_start_times - pre_duration
total_trials = len(pre_start_times)
spont_single_rates = np.zeros(total_trials)
for i,(s,e) in enumerate(zip(pre_start_times,all_start_times)):
    spont_single_rates[i] = np.sum((spikes1>s) & (spikes1<=e))/pre_duration
spont_rate = np.mean(spont_single_rates)
np.round(spont_rate,3)

We can use `axhline` to plot the spont rate as a horizontal line. Also, we associate a `label` with each line, which in turn is used to generate a `legend`. The format string `'o-'` is shorthand for circular markers connected by a solid line.

In [None]:
plt.plot(f_list, spike_rates, 'o-', label='evoked rate')
plt.axhline(y=spont_rate, linestyle='--', label='spont rate')
plt.xlabel('Sound frequency (Hz)')
plt.ylabel('Mean evoked spike rate')
plt.title(f'{cell1_label} frequency tuning curve')
plt.legend();

### Analyzing tuning for a second neuron

Let's load data from a second single unit, which was recorded simultaneously with the first one. This unit was recorded from electrode number 50, while the previous one (CRD016c-40-1) was recorded from electrode 40.

In [None]:
spikes2 = np.loadtxt('data/CRD016c-50-1.csv', delimiter=',')
cell2_label = 'CRD016c-50-1'

Since this cell was recorded simultaneously, we can use the same `stim` matrix to guide analysis.

In [None]:
f=933
b = (stim[:,0]==f)
start_times = stim[b,1]
stop_times = stim[b,2]
trial_count=len(start_times)
single_trial_rates = np.zeros(trial_count)
for i, (s,e) in enumerate(zip(start_times,stop_times)):
    # calculate mean spike rate on each trial
    single_trial_rates[i] = np.sum((spikes2>s) & (spikes2<=e)) / stim_duration

# average across trials
spike_rate2 = np.mean(single_trial_rates)
print(f"{cell2_label}, {f} Hz stimulus: evoked spike rate is {np.round(spike_rate2,2)} spk/s")

Let's compute a tuning curve for neuron 2. We can copy and paste code from above, but let's save the result to a new vector, `spike_rates2`, so that we don't overwrite the tuning curve for the first neuron.

In [None]:
f_list = np.unique(stim[:,0])

# NOTE: initialize spike_rates2 here
spike_rates2 = np.zeros(f_list.shape)

for j,f in enumerate(f_list):
    b = (stim[:,0]==f)
    start_times = stim[b,1]
    stop_times = stim[b,2]
    
    trial_count = len(start_times)
    single_trial_rates = np.zeros(trial_count)
    for i, (s,e) in enumerate(zip(start_times,stop_times)):
        # calculate mean spike rate on each trial
        single_trial_rates[i] = np.sum((spikes2>s) & (spikes2<=e)) / stim_duration
    
    # average across trials
    spike_rates2[j] = np.mean(single_trial_rates)
    print(f"{cell2_label} {f} Hz stimulus: evoked spike rate is {np.round(spike_rates2[j],2)} spk/s")

### Exercise - Compute spontaneous rate for a second neuron

Same logic as for computing `spike_rates2`. Compute `spont_rate2` by copying code from above and modifying it to work for neuron 2 without overwriting results for neuron 1.

In [None]:
# We don't need to initialize these variables that describe the stimuli, 
# but here they are for reference
all_start_times = stim[:,1]
pre_duration = 0.1
pre_start_times = all_start_times - pre_duration
total_trials = len(pre_start_times)

# Answer
spont_single_rates = np.zeros(total_trials)
for i,(s,e) in enumerate(zip(pre_start_times,all_start_times)):
    spont_single_rates[i] = np.sum((spikes2>s) & (spikes2<=e))/pre_duration
spont_rate2 = np.mean(spont_single_rates)

np.round(spont_rate2,3)

Now we copy and paste the plotting code to plot a tuning curve for neuron 2.

In [None]:
plt.plot(f_list, spike_rates2, 'o-', label='evoked rate')
plt.axhline(y=spont_rate2, linestyle='--', label='spont rate')
plt.xlabel('Sound frequency (Hz)')
plt.ylabel('Mean evoked spike rate')
plt.title(f'{cell2_label} frequency tuning curve')
plt.legend();

### Exercise - Overlay the tuning curves

We can see that this neuron has a similar best frequency near 1000 Hz, but a different shape. To help distinguish the cells, it would be nice to plot the two tuning curves on top of each other. Call the plot command twice to plot the two curves in the same graph. Don't worry about spont rate for the moment. (Bonus: can you use a different marker style to label points for neuron 2?)

In [None]:
# Answer
plt.plot(f_list, spike_rates, 'o-', label=cell1_label)
plt.plot(f_list, spike_rates2, 's-', label=cell2_label)
plt.legend()
plt.xlabel('Frequency (Hz)')
plt.ylabel('Mean spike rate');

Notice that `matplotlib` automatically plots the second curve in a different color, which is helpful for reading the graph.

For completeness, we want to add spontaneous rate for each neuron. That's easy, right?

In [None]:
plt.plot(f_list, spike_rates, 'o-', label=cell1_label)
plt.plot(f_list, spike_rates2, 's-', label=cell2_label)
plt.legend()
plt.xlabel('Frequency (Hz)')
plt.ylabel('Mean spike rate');

plt.axhline(spont_rate, linestyle='--')
plt.axhline(spont_rate2, linestyle='--')

### Exercise - Change plot color

This is basically working, but notice that the `axhline` plots are not using appropriate colors. We can manually control plot colors with the `color` option. Replot the two tuning curves from above, but color the horizontal lines to match the evoked activity. Colors can be specified many different ways. The most comprehensive is to specify the RGB (red-green-blue) HEX code (https://matplotlib.org/stable/tutorials/colors/colors.html). To help, we'll get a list of the automatic color cycle from `matplotlib`. You'll want spont rate for neuron 1 plotted in `color_set[0]` and spont rate for neuron 2 in `color_set[1]`. The `plt.rcParams` dictionary gives you access to many, many parameters that control the look and feel of matplotlib plots. Here, we're just using it to find out the default plot colors (which can be changed if you want!).

In [None]:
color_set = plt.rcParams['axes.prop_cycle'].by_key()['color']
color_set

In [None]:
# Code copied from above
plt.plot(f_list, spike_rates, 'o-', label=cell1_label)
plt.plot(f_list, spike_rates2, 's-', label=cell2_label)
plt.legend()
plt.xlabel('Frequency (Hz)')
plt.ylabel('Mean spike rate');

# Now plot the horizontal lines in matching colors.

# Answer
plt.axhline(spont_rate, linestyle='--', color=color_set[0])
plt.axhline(spont_rate2, linestyle='--', color=color_set[1])
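
An alternative to looking up the color cycle is to ask each plotted line what color it was given. `plt.plot` returns a list of line objects, and each one has a `get_color()` method. The sketch below uses made-up tuning data and labels, so the numbers are for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up tuning data for two hypothetical cells (not the recorded neurons)
freqs = np.array([500, 1000, 2000, 4000])
cells = [
    ('cell A', np.array([2.0, 12.0, 6.0, 1.0]), 1.5),
    ('cell B', np.array([1.0, 8.0, 9.0, 3.0]), 0.5),
]

plt.figure()
line_pairs = []
for label, rates, spont in cells:
    # plt.plot returns a list with one line object per curve
    line = plt.plot(freqs, rates, 'o-', label=label)[0]
    # reuse whatever color matplotlib assigned to the evoked-rate curve
    hline = plt.axhline(spont, linestyle='--', color=line.get_color())
    line_pairs.append((line, hline))

plt.xlabel('Frequency (Hz)')
plt.ylabel('Mean spike rate')
plt.legend();
```

This approach keeps the colors matched even if you change the default color cycle or reorder the curves.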

## Generating raster plots

Tuning curves are great, but they hide a lot of details about the dynamics of neural activity. Raster plots provide a richer characterization of the response.

### Extract spike times

So far, we have been simply counting spikes that fell between stimulus onset and offset. Now, we'd like to plot when those spikes occurred on each trial. Remember how to get spike times?

In [None]:
f = 933
b = (stim[:,0]==f)
start_times = stim[b,1]
stop_times = stim[b,2]

s = start_times[0]
e = stop_times[0]

spike_times = spikes1[(spikes1>s) & (spikes1<e)]
spike_times

For a raster, we want to know when spikes occurred on each trial, relative to the time of stimulus onset. So we need to subtract the stimulus onset time. Notice that all the spike times fall within the range 0-0.3 sec, since the stimulus was 0.3 sec long.

In [None]:
spike_times = spikes1[(spikes1>s) & (spikes1<e)] - s
spike_times

### Plotting a raster

How do we generate a useful plot?  This definitely doesn't make a ton of sense:

In [None]:
plt.plot(spike_times);

Instead, for a raster, we want to plot a row of dots, with time indicated by their position on the x-axis. Position on the y-axis is a constant, indicating that they all occurred during the same trial. The `ones` function is quite handy here.

To plot dots without connecting lines, we provide some extra parameters to `plot`. 

In [None]:
y = np.ones(spike_times.shape)
plt.plot(spike_times, y, linestyle='', marker='.', color='black')
plt.xlabel('Time from stimulus onset (s)')

This is such a common way to plot data, that there's a shortcut in `plot`, where you can simply pass the parameter `'k.'` after `x` and `y`.

In [None]:
plt.plot(spike_times, y, 'k.')
plt.xlabel('Time from stimulus onset (s)')

Kind of making sense? You can see things better if you plot the raster for multiple trials. 

In [None]:
spike_counts = np.zeros(len(start_times))
plt.figure()
for i in range(len(start_times)):
    s = start_times[i]
    e = stop_times[i]
    
    spike_times = spikes1[(spikes1>s) & (spikes1<e)] - start_times[i]
    plt.plot(spike_times,np.ones(spike_times.shape)*i,'k.')
plt.xlabel('Time from stimulus onset (s)')
plt.ylabel('Stimulus repetition');

And it's helpful to include some silent periods pre- and post-stimulus.

In [None]:
spike_counts = np.zeros(len(start_times))
plt.figure()
for i in range(len(start_times)):
    s = start_times[i]-0.1
    e = stop_times[i]+0.1
    
    spike_times = spikes1[(spikes1>s) & (spikes1<e)] - start_times[i]
    plt.plot(spike_times,np.ones(spike_times.shape)*i,'k.')
plt.xlabel('Time from stimulus onset (s)');

Wait a sec... Not much changed from the previous plot. Why is that? Turns out this neuron has a very low spontaneous rate, and there were no spikes at all before stimulus onset. Use the `xlim` function to make sure the x-axis displays properly.

In [None]:
spike_counts = np.zeros(len(start_times))
plt.figure()
for i in range(len(start_times)):
    s = start_times[i]-0.1
    e = stop_times[i]+0.1
    
    spike_times = spikes1[(spikes1>s) & (spikes1<e)] - start_times[i]
    plt.plot(spike_times,np.ones(spike_times.shape)*i,'k.')
    
plt.xlabel('Time from stimulus onset (s)')
plt.xlim([-0.1, 0.4]);

### Exercise - Overlay vertical line

It's helpful to mark when the sound turns on and off. Use the `axvline` function (very similar to `axhline`) to mark the stimulus onset (0) and offset (0.3) times. To make the lines stand out, use light green dashed lines (`color='lightgreen', linestyle='--'`). If you need help, try `plt.axvline?`. Notice that this gives you A LOT of information. If you carefully read a few of these help displays, however, it gets easier to parse out the information you need.

In [None]:
plt.axvline?

In [None]:
spike_counts = np.zeros(len(start_times))
plt.figure()
for i in range(len(start_times)):
    s = start_times[i]-0.1
    e = stop_times[i]+0.1
    
    spike_times = spikes1[(spikes1>s) & (spikes1<e)] - start_times[i]
    plt.plot(spike_times,np.ones(spike_times.shape)*i,'k.')
    
plt.xlabel('Time from stimulus onset (s)')
plt.xlim([-0.1, 0.4])

# Now add code to plot the vertical lines

# Answer
plt.axvline(0, color='lightgreen', linestyle='--')
plt.axvline(0.3, color='lightgreen', linestyle='--');

### Stacking rasters for multiple stimuli

Now let's get fancy. We want a single raster plot to show responses to all the different stimuli. Let's loop through all the different stimulus frequencies, similar to the tuning curve calculation above, and stack the rasters on top of each other:

In [None]:
plt.figure()
row = 0
for j,f in enumerate(f_list):  # outer loop -- stimuli
    b = (stim[:,0]==f)
    start_times = stim[b,1]
    stop_times = stim[b,2]
    for i in range(len(start_times)):  # inner loop -- repetitions
        s = start_times[i]-0.1
        e = stop_times[i]+0.1

        spike_times = spikes1[(spikes1>s) & (spikes1<e)] - start_times[i]
        row += 1
        plt.plot(spike_times,np.ones(spike_times.shape)*row, 'k.')
        
plt.xlabel('Time from stimulus onset (s)')
plt.xlim([-0.1, 0.4]);
plt.axvline(0, color='lightgreen', linestyle='--')
plt.axvline(0.3, color='lightgreen', linestyle='--');

Make sense? This code can be hard to work with in the long run, though, because extraction of the spike times is mixed up with the plotting. It would be more modular if we could break the code into separate parts. First, figure out where all the black dots should be plotted, then in a separate step, plot all the points at once.

We want to collect information about spiking across all the stimulus events into one big matrix with one row per spike. Each row has two values, indicating the trial number and the time of that spike within the trial. There can of course be more than one spike per trial:

$$ R = \begin{bmatrix}
1 & t_1\\
1 & t_2\\
\vdots & \vdots\\
251 & t_N
\end{bmatrix} $$

A useful function for building a big matrix like this is `append`.

### Exercise - Appending matrices

Say I have two matrices: `a` is 4 x 2 and `b` is 3 x 2. I want to append `b` to `a` along the first axis (`axis=0`), stacking the rows of `b` below the rows of `a`, so that the result, `c`, is 7 x 2. Figure out how to use `np.append` to do this.

In [None]:
np.append?

In [None]:
a = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
b = np.array([[9, 10], [11, 12], [13, 14]])

# Answer
c = np.append(a, b, axis=0)
c
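
It's worth seeing what happens if you forget the `axis` argument. As a quick sketch using the same two matrices: without `axis`, `np.append` flattens both inputs into one long vector, which is rarely what you want when building a matrix.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])   # 4 x 2
b = np.array([[9, 10], [11, 12], [13, 14]])       # 3 x 2

flat = np.append(a, b)             # no axis: both inputs are flattened first
stacked = np.append(a, b, axis=0)  # axis=0: rows of b go below rows of a

print(flat.shape)     # (14,)
print(stacked.shape)  # (7, 2)

# np.vstack gives the same result as appending along axis 0
print(np.array_equal(stacked, np.vstack((a, b))))  # True
```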

Say we want to append two 3x1 matrices to get a 3x2 matrix. Can you fix the error? 

In [None]:
a = np.zeros(3)
b = np.array([[1],[2],[3]])

c = np.append(a,b,axis=1)
c

In [None]:
# Answer
a = np.zeros((3,1))
b = np.array([[1],[2],[3]])

c = np.append(a,b,axis=1)
c
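
Why did the first version fail? `np.zeros(3)` is one-dimensional (shape `(3,)`), while `b` is two-dimensional (shape `(3, 1)`), and `np.append` with an `axis` needs both inputs to have the same number of dimensions. The sketch below shows a few ways to get a proper column; `np.column_stack` is offered as one alternative, not the only one.

```python
import numpy as np

a = np.zeros(3)                    # shape (3,): one-dimensional
b = np.array([[1], [2], [3]])      # shape (3, 1): two-dimensional
print(a.ndim, b.ndim)              # 1 2

# Two ways to turn the existing vector into a 3 x 1 column
col1 = a.reshape(-1, 1)
col2 = a[:, np.newaxis]

c = np.append(col1, b, axis=1)
print(c.shape)                     # (3, 2)

# np.column_stack treats 1-D inputs as columns, so no reshape is needed
c2 = np.column_stack((a, b))
print(np.array_equal(c, c2))       # True
```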

Now we put it all together to generate the big `data` matrix of spikes.

In [None]:
# pre-define the matrix with appropriate number of columns
row = 0
data = np.zeros((0,2))
first_trial_row=np.zeros(f_list.shape)

for j,f in enumerate(f_list):  # go through each unique stimulus
    # we're also keeping track of each row when a new stimulus occurs (see below)
    first_trial_row[j] = row
    b = (stim[:,0]==f)
    start_times = stim[b,1]
    stop_times = stim[b,2]
    for i in range(len(start_times)):  # go through each presentation of that stimulus
        s = start_times[i]-0.1
        e = stop_times[i]+0.1

        row += 1
        spike_times = spikes1[(spikes1>s) & (spikes1<e)] - start_times[i]
        this_data = np.zeros((len(spike_times),2))
        this_data[:,0] = row
        this_data[:,1] = spike_times
        data = np.append(data, this_data, axis=0)
data.shape
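
One thing to be aware of: `np.append` returns a brand new array each time, so the whole matrix is copied on every trial. That is fine for a dataset of this size. For much larger recordings, one common alternative is to collect the per-trial pieces in a Python list and join them once at the end. Here is a minimal sketch of that pattern, using made-up spike and stimulus times rather than the course data (the names `spikes`, `chunks`, and `data_alt` are just for this example):

```python
import numpy as np

# Hypothetical stand-ins for the notebook's variables (made-up values)
spikes = np.array([0.05, 0.95, 1.02, 1.10, 2.50, 3.01, 3.20])
start_times = np.array([1.0, 3.0])
stop_times = start_times + 0.3

chunks = []  # one (n_spikes x 2) array per trial
for row, (t0, t1) in enumerate(zip(start_times, stop_times), start=1):
    in_window = (spikes > t0 - 0.1) & (spikes < t1 + 0.1)
    rel_times = spikes[in_window] - t0
    # first column: trial number, second column: time relative to onset
    chunks.append(np.column_stack((np.full(rel_times.shape, row), rel_times)))

data_alt = np.concatenate(chunks, axis=0)  # single copy at the end
print(data_alt.shape)  # (5, 2)
```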

Now plotting is straightforward:

In [None]:
plt.plot(data[:,1], data[:,0], 'k.')
plt.axvline(0, color='lightgreen', linestyle='--')
plt.axvline(0.3, color='lightgreen', linestyle='--');
plt.xlabel('Time from sound onset (s)')
plt.title(cell1_label + ' raster');

One more thing. We want the axis labels to mean something. We're going to use the `yticks` command to hack the way the y axis is labeled, using the `first_trial_row` vector to label the first row where a particular stimulus was presented.

In [None]:
first_trial_row

`yticks` lets you specify where to place each tick on the y axis and what label to display there:

In [None]:
plt.plot(data[:,1], data[:,0], 'k.')
plt.axvline(0, color='lightgreen', linestyle='--')
plt.axvline(0.3, color='lightgreen', linestyle='--');
plt.yticks(first_trial_row, f_list.astype(int))
plt.ylabel('Sound frequency (Hz)')
plt.xlabel('Time from sound onset (s)')
plt.title(cell1_label + ' raster');

### Exercise - Plot raster for a new cell

Generate a raster for cell #2. You should be able to accomplish this largely by copying and modifying code from above.

In [None]:
#Answer
# pre-define the matrix with appropriate number of columns
row = 0
data = np.zeros((0,2))
first_trial_row=np.zeros(f_list.shape)

for j,f in enumerate(f_list):
    # we're also keeping track of each row when a new stimulus occurs
    first_trial_row[j] = row
    b = (stim[:,0]==f)
    start_times = stim[b,1]
    stop_times = stim[b,2]
    for i in range(len(start_times)):
        s = start_times[i]-0.1
        e = stop_times[i]+0.1

        row += 1
        spike_times = spikes2[(spikes2>s) & (spikes2<e)] - start_times[i]
        this_data = np.zeros((len(spike_times),2))
        this_data[:,0] = row
        this_data[:,1] = spike_times
        data = np.append(data, this_data, axis=0)

plt.plot(data[:,1], data[:,0], 'k.')
plt.axvline(0, color='lightgreen', linestyle='--')
plt.axvline(0.3, color='lightgreen', linestyle='--');
plt.xlabel('Time from sound onset (s)')
plt.yticks(first_trial_row, f_list.astype(int))
plt.ylabel('Sound frequency (Hz)')
plt.title(cell2_label + ' raster');
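
Copying and pasting works, but notice that the only thing that changed between the two cells was `spikes1` versus `spikes2`. One way to avoid the duplication is to wrap the matrix-building step in a function. The sketch below is a hypothetical helper (the name `raster_matrix` and the `pad` argument are not part of the notebook); it assumes `stim` has columns for frequency, start time, and stop time, as above, and is checked here on made-up numbers.

```python
import numpy as np

def raster_matrix(spikes, stim, pad=0.1):
    """Build the (trial row, spike time) matrix for one cell.

    Hypothetical helper mirroring the loop above. `stim` is assumed to
    have columns [frequency, start time, stop time].
    """
    f_list = np.unique(stim[:, 0])
    first_trial_row = np.zeros(f_list.shape)
    data = np.zeros((0, 2))
    row = 0
    for j, f in enumerate(f_list):
        first_trial_row[j] = row
        b = (stim[:, 0] == f)
        for t0, t1 in zip(stim[b, 1], stim[b, 2]):
            row += 1
            rel = spikes[(spikes > t0 - pad) & (spikes < t1 + pad)] - t0
            this_data = np.column_stack((np.full(rel.shape, row), rel))
            data = np.append(data, this_data, axis=0)
    return data, first_trial_row, f_list

# Toy check with made-up numbers (not the course data)
toy_stim = np.array([[1000., 1.0, 1.3],
                     [ 500., 2.0, 2.3],
                     [1000., 3.0, 3.3]])
toy_spikes = np.array([1.05, 2.10, 2.20, 3.25])
toy_data, toy_first_row, toy_f_list = raster_matrix(toy_spikes, toy_stim)
print(toy_data.shape)   # (4, 2)
print(toy_first_row)
```

With the notebook's variables, this would be called as `raster_matrix(spikes2, stim)`, and the plotting code stays exactly the same.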

### Bonus. Tweaking the x axis for log-spaced tones.

Remember this tuning curve?

In [None]:
f_list = np.unique(stim[:,0])
spike_rates = np.zeros_like(f_list)
for j,f in enumerate(f_list):
    b = (stim[:,0]==f)
    start_times = stim[b,1]
    stop_times = stim[b,2]
    
    trial_count = len(start_times)
    single_trial_rates = np.zeros(trial_count)
    for i, (s,e) in enumerate(zip(start_times,stop_times)):
        # calculate mean spike rate on each trial
        single_trial_rates[i] = np.sum((spikes1>s) & (spikes1<=e)) / stim_duration

    # mean across trials
    spike_rates[j] = np.mean(single_trial_rates)

all_start_times = stim[:,1]
pre_duration = 0.1
pre_start_times = all_start_times - pre_duration

spont_single_rates = np.zeros(total_trials)
for i,(s,e) in enumerate(zip(pre_start_times,all_start_times)):
    spont_single_rates[i] = np.sum((spikes1>s) & (spikes1<=e))/pre_duration
spont_rate = np.mean(spont_single_rates)

plt.plot(f_list, spike_rates, 'o-', label='evoked rate')
plt.axhline(y=spont_rate, linestyle='--', label='spont rate')
plt.xlabel('Sound frequency (Hz)')
plt.ylabel('Mean evoked spike rate')
plt.title(cell1_label + ' frequency tuning curve')
plt.legend();

It's pretty clear now that tones near 1000 Hz evoke strong responses, compared to lower or higher frequencies. But there are a few things we can do to make this easier to read.

First, the auditory system works on a logarithmic scale. That is, the difference between 500 and 1000 Hz is perceived as similar to the difference between 1000 and 2000 Hz. The experiment played tones with logarithmic spacing to match the perceptual space, and you can see that the dots are more closely spaced on the x axis at lower sound frequencies. There are a few ways to plot with even spacing. One option is to use the `semilogx` function.

In [None]:
plt.semilogx(f_list, spike_rates, 'o-')
plt.xlabel('Frequency (Hz)')
plt.ylabel('Mean evoked spikes/sec')
plt.title(cell1_label + ' frequency tuning curve');

Notice that the labels on the x axis are weirdly spaced and hard to read. An alternative is to manually overwrite the x axis labels. If we only pass a single vector to `plot`, the x values will be inferred to be 0 ... `len(spike_rates)-1`. Then we can manually relabel the x axis with `xticks`.

In [None]:
plt.plot(spike_rates, 'o-')
xticks = np.arange(0,len(f_list),2)
plt.xticks(xticks,f_list[xticks].astype(int))
plt.xlabel('Frequency (Hz)')
plt.ylabel('Mean evoked spikes/sec')
plt.title(cell1_label + ' frequency tuning curve');
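
Plotting against the index only gives a faithful picture if the tones really are evenly spaced on a log scale. A quick way to check is to convert the frequencies to octaves with `np.log2` and look at the step sizes. The sketch below uses a made-up list of half-octave steps starting at 500 Hz (`f_demo` is hypothetical, not the experiment's actual tone list):

```python
import numpy as np

# Hypothetical log-spaced tone list: half-octave steps starting at 500 Hz
f_demo = 500 * 2.0 ** (np.arange(7) / 2)

# Distance of each tone from the lowest one, in octaves
octaves = np.log2(f_demo / f_demo[0])
steps = np.diff(octaves)

print(np.round(f_demo).astype(int))
print(np.allclose(steps, 0.5))   # True: every step is half an octave
```

Running `np.diff(np.log2(f_list))` on the real tone list should give roughly constant values if the index-based plot is to be trusted.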