# Basics of spike train analysis

We will get you started on analysing spike trains.

1\. Loading and plotting spike trains

2\. Calculating firing rates

3\. Interspike interval

Let's first import the packages we are going to use, and set up some plotting parameters.

In [None]:
%matplotlib inline
%config InlineBackend.rc={'figure.figsize': (12, 6), 'font.size': 14 }
import matplotlib.pyplot as plt
import numpy as np

## 1. Loading and plotting spike trains

Action potentials are very short. We can therefore ignore their duration and assign each action potential a single time of occurrence: the spike time.

The set of spike times of a single neuron is called a spike train.

We will begin by loading the spike train of an example neuron. We will then try to visualize it.

### Loading and plotting a single spike train

Let's first take a look at the data file. Open `data_spike_trains` and then `example_spikes.txt` via the left panel.

NumPy offers a very easy way to load such a file into memory.

In [None]:
example_spike_times = np.loadtxt("data_spike_trains/example_spikes.txt")
print(example_spike_times)

Note how the comment at the start of the file was ignored.
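
By default, `np.loadtxt` treats everything after a `#` as a comment. The sketch below illustrates this with a made-up file held in memory (the contents are hypothetical and are not those of `example_spikes.txt`).

```python
import io

import numpy as np

# Hypothetical file content, standing in for a real spike time file
fake_file = io.StringIO(
    "# spike times in seconds\n"
    "0.012\n"
    "0.250\n"
    "1.500\n"
)

# Lines starting with '#' are skipped because comments='#' is the default
spike_times = np.loadtxt(fake_file)

print(spike_times)
print(spike_times.shape)  # (3,)
```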

**Exercise:**

Print the shape of `example_spike_times`. What does the result mean? Print the time of the 10th spike in milliseconds.

In [None]:
# Start your code here
# FIXME
print(example_spike_times.shape)
print(example_spike_times[9] * 1000)

**Expected output:**
```
(5391,)
1634.5
```

Let's start visualizing our data with matplotlib. For creating raster plots, matplotlib offers the function `eventplot`.

In [None]:
plt.eventplot(example_spike_times);

The semicolon only suppresses the output of the cell. It's not required.

There are too many lines in a small space. We can restrict the view to a specific part using `xlim`.

In [None]:
plt.eventplot(example_spike_times)
plt.xlim((0, 10));

We can use more Matplotlib functions to make our plot prettier and more informative.

In [None]:
plt.eventplot(example_spike_times, colors="black", linewidths=0.5)
plt.xlim((0, 10))
plt.xlabel("Time [s]")
plt.title("Raster plot");

### Intermezzo: Lists and arrays

If we want to work with multiple neurons, we need data structures that can hold all that data, e.g. all spike trains. We could create one variable for each spike train, but that will get very tedious if we have many neurons. Instead, we want one variable for all spike trains.

Let's first remind ourselves of the pros and cons of lists and arrays:

|Lists|Arrays|
|--|--|
|can contain multiple datatypes|can only contain one datatype|
|indexing by slicing|indexing by slicing and boolean arrays|
|only few methods/functions|fast and powerful computations using numpy|
| |numpy functions work on/return arrays anyway|
|only 1-dimensional|multiple dimensions possible|
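
Some of the differences in the table can be seen in a few lines. This is a minimal sketch with made-up spike times; the variable names are only for illustration.

```python
import numpy as np

times_list = [0.5, 1.2, 3.4, 7.8]   # made-up spike times in seconds
times_array = np.array(times_list)

# Arithmetic on an array acts on every element at once.
# (times_list * 1000 would instead repeat the list 1000 times.)
times_ms = times_array * 1000

# A comparison yields a boolean array that can be used as an index
early = times_array[times_array < 2]

# The same operations on a list need a loop or a comprehension
times_ms_list = [t * 1000 for t in times_list]
early_list = [t for t in times_list if t < 2]

print(times_ms)
print(early)
```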

2-dimensional arrays are a great choice for e.g. parameters (mean firing rate etc.)! The first dimension (rows) can denote the cell, the second dimension (columns) can denote its parameter.

To optimize memory allocation we often want to initialize an empty array before we begin our analysis. Let's say we have 20 cells and we want to calculate 3 different parameters for each cell.

In [None]:
parameter_storage = np.empty((20, 3))

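During our analysis, we can store the results in this 2D array. Until we do, its entries are arbitrary values, because `np.empty` only reserves the memory and does not set it to zero.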
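
As a sketch of how such a storage array might be filled, the example below uses four made-up spike trains and three illustrative parameters. Both the data and the choice of parameters are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up spike trains for 4 cells in a 10 s recording
duration = 10.0
fake_trains = [np.sort(rng.uniform(0, duration, size=n)) for n in (12, 30, 7, 21)]

# One row per cell, one column per parameter
params = np.empty((len(fake_trains), 3))

for row, train in enumerate(fake_trains):
    params[row, 0] = len(train)             # spike count
    params[row, 1] = len(train) / duration  # mean firing rate in Hz
    params[row, 2] = train[0]               # time of the first spike in s

print(params[:, 1])  # mean firing rate of every cell
```

Selecting a column, as in the last line, gives one parameter for all cells; selecting a row gives all parameters of one cell.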

For spike trains, however, 2D arrays are not suited. This is because 2D arrays must be rectangular, i.e. each row has the same number of columns. Not all cells have the same number of spikes in their spike train, though.

Instead, we could use arrays of arrays, lists of lists, or lists of arrays. All are fine, but we recommend lists of arrays.

This means: All spike train data is stored in a list. The elements of the list are numpy arrays, one for each cell. Each array contains the spike train of that cell. 
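
A minimal sketch of such a list of arrays, using three made-up spike trains of different lengths:

```python
import numpy as np

# Three made-up spike trains with different numbers of spikes
train_a = np.array([0.1, 0.4, 0.9])
train_b = np.array([0.2, 1.5])
train_c = np.array([0.05, 0.3, 0.6, 1.1, 1.8])

trains = [train_a, train_b, train_c]

# The list index selects the cell, the array index selects the spike
second_spike_of_third_cell = trains[2][1]

# Per-cell quantities are easy to collect with a comprehension
spike_counts = [len(train) for train in trains]

print(second_spike_of_third_cell)
print(spike_counts)
```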

### Loading and plotting multiple spike trains

If we want to compare neurons, we need to load multiple spike trains. We can do this by calling `loadtxt` for each file separately, and then putting everything in a list.

In [None]:
spikes_1 = np.loadtxt("data_spike_trains/18_SP_C101.txt")
spikes_2 = np.loadtxt("data_spike_trains/18_SP_C203.txt")
spikes_3 = np.loadtxt("data_spike_trains/18_SP_C603.txt")

spikes_list = [spikes_1, spikes_2, spikes_3]
print(spikes_list)

This gets very tedious if we want to load dozens of spike trains, and repeated code is hard to maintain.

But we know a better solution: for-loops!

In [None]:
path_list = ["data_spike_trains/18_SP_C101.txt", "data_spike_trains/18_SP_C203.txt", "data_spike_trains/18_SP_C603.txt"]

spikes_list = []
for i in range(len(path_list)):
    spikes_list = spikes_list + [np.loadtxt(path_list[i])]

print(spikes_list)

**Exercise:**

Write a function called `load_spike_trains_to_list`. As an argument, this function should take a list of filepaths called `list_of_paths`. It should return a list containing each neuron's spike times called `list_of_spikes`.

_Followup exercise:_ You can also loop over a list directly, by doing `for element in list:`. Use that to make your code even nicer.

_Followup exercise 2:_ Can you do the same with a list comprehension?

In [None]:
# Start your code here
# FIXME
def load_spike_trains_to_list(list_of_paths):
    
    list_of_spikes = []
    for i in range(len(list_of_paths)):
        list_of_spikes = list_of_spikes + [np.loadtxt(list_of_paths[i])]
    
    return list_of_spikes


def load_spike_trains_to_list(list_of_paths):
    
    list_of_spikes = []
    for path in list_of_paths:
        list_of_spikes = list_of_spikes + [np.loadtxt(path)]
    
    return list_of_spikes


def load_spike_trains_to_list(list_of_paths):
    
    list_of_spikes = [np.loadtxt(path) for path in list_of_paths]
    
    return list_of_spikes


# End your code here
# The next line is to test your function
print(load_spike_trains_to_list(path_list)[1])

**Expected output:**
```
[  0.5766   2.8239   4.5523 ... 481.387  482.4371 482.4677]
```

How does one get a list of all the paths efficiently?

We can get the paths of all files we need using `Path` from the package `pathlib`.

In [None]:
from pathlib import Path
all_paths = sorted(Path("data_spike_trains").glob("18_SP_C*"))
print(all_paths)

The paths are printed as `PosixPath` objects, which is what `pathlib` uses on Linux and macOS. On Windows, you'd get a list of `WindowsPath` objects instead. Luckily, `loadtxt` works with both.
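
The combination of `glob`, `sorted`, and `loadtxt` can be tried out without the course data. The sketch below writes a few made-up files into a temporary folder; the file names and contents are hypothetical.

```python
import tempfile
from pathlib import Path

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)

    # Two made-up spike train files and one unrelated file
    np.savetxt(folder / "cell_b.txt", [0.3, 0.9])
    np.savetxt(folder / "cell_a.txt", [0.1, 0.5, 0.7])
    (folder / "notes.md").write_text("not a spike train")

    # glob gives no guarantee about order, so sort for reproducibility
    paths = sorted(folder.glob("cell_*.txt"))
    names = [p.name for p in paths]

    # loadtxt accepts the Path objects directly
    trains = [np.loadtxt(p) for p in paths]

print(names)
print([len(t) for t in trains])
```

The pattern only matches the two `cell_*.txt` files, so the unrelated file is never passed to `loadtxt`.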

In [None]:
all_spike_trains = load_spike_trains_to_list(all_paths)

Now we have all our spike trains loaded in memory, in the list `all_spike_trains`. We can now use `eventplot` to plot all the spike trains in a raster plot.

In [None]:
plt.eventplot(all_spike_trains, colors="black", linelengths=0.8, linewidths=0.5)

plt.xlim((0,32))
plt.title("Raster plot")
plt.xlabel("Time [s]")
plt.ylabel("Neuron ID");

**Exercise:**

Let's focus on some neurons that have similar spike trains. Specifically, plot the first 32 seconds of the spike trains of the 1st, 12th, 13th, 14th, and 19th neurons in `all_spike_trains`, but not the rest.

In [None]:
# Start your code here
# FIXME
ids = [0, 11, 12, 13, 18]
some_spike_trains = [all_spike_trains[i] for i in ids]

plt.eventplot(some_spike_trains, colors="black", linelengths=0.8, linewidths=0.5)

plt.xlim((0,32))
plt.title("Raster plot")
plt.xlabel("Time [s]")
plt.ylabel("Neuron ID");

## 2. Calculating firing rates

The generation of spikes is inherently noisy. Therefore, it is often more informative to look at the spike rate, i.e. an average over time.

In this chapter, we will focus on the first 8 seconds of recording of the five neurons in the plot above. Let's first extract that data.

In [None]:
ids = [0, 11, 12, 13, 18]
short_spike_trains = []
for i in ids:
    current_spike_train = all_spike_trains[i]
    current_spike_train = current_spike_train[current_spike_train < 8]
    short_spike_trains = short_spike_trains + [current_spike_train]
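
The boolean mask used above extends to windows that do not start at zero. The following is a sketch only; `spikes_in_window` is a hypothetical helper and the spike times are made up.

```python
import numpy as np

def spikes_in_window(spike_train, start, stop):
    """Return the spike times t with start <= t < stop."""
    mask = (spike_train >= start) & (spike_train < stop)
    return spike_train[mask]

fake_train = np.array([0.5, 2.0, 7.9, 8.0, 12.3])  # made-up times in seconds

print(spikes_in_window(fake_train, 0, 8))
print(spikes_in_window(fake_train, 2, 10))
```

The parentheses around each comparison are needed because `&` binds more tightly than `>=` and `<`.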

Numpy provides the function `histogram` to calculate histograms. We'll start by analyzing the first of the five neurons.

In [None]:
spikes_per_bin, bin_edges = np.histogram(short_spike_trains[0], bins=50, range=(0, 8))

You can specify the number of bins via the parameter `bins` and the extent of the histogram via `range`. The function returns the number of occurrences (spikes) in each bin and the edges of the bins.

We would rather know the centers of the bins, but these are easy to calculate. Note that there is one more bin edge than there are bins.

In [None]:
bin_size = bin_edges[1] - bin_edges[0]
bin_centers = bin_edges[:-1] + bin_size/2
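
A case small enough to check by hand can make the relation between edges and centers concrete. This sketch uses made-up spike times and computes the centers as midpoints of neighbouring edges, an alternative that also works when the bins do not all have the same width.

```python
import numpy as np

fake_train = np.array([0.1, 0.2, 0.9, 1.6, 1.7, 1.9])  # made-up times in seconds

counts, edges = np.histogram(fake_train, bins=4, range=(0, 2))

# Midpoint of each pair of neighbouring edges
centers = (edges[:-1] + edges[1:]) / 2

print(edges)    # 5 edges: 0, 0.5, 1, 1.5, 2
print(centers)  # 4 centers: 0.25, 0.75, 1.25, 1.75
print(counts)   # 4 counts: 2, 1, 0, 3
```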

Now we can plot the histogram as a curve.

In [None]:
plt.plot(bin_centers, spikes_per_bin, lw=5)

plt.title("Spike count histogram")
plt.xlabel("Time [s]")
plt.ylabel("Number of spikes");

The number of spikes on the y-axis depends, of course, on the size of the bins (try it by changing the parameter `bins` in `np.histogram`). The spike rate / firing rate would be more convenient.

In [None]:
firing_rate = spikes_per_bin / bin_size

plt.plot(bin_centers, firing_rate, lw=5)

plt.title("Firing rate histogram")
plt.xlabel("Time [s]")
plt.ylabel("Firing rate [Hz]");

Compared to the eventplot, we can now see the dynamics of the spike train much better.
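
To see why dividing by the bin size is convenient, consider a made-up, perfectly regular spike train firing at 20 spikes per second. In this sketch the spike count per bin changes with the number of bins, while the firing rate does not.

```python
import numpy as np

# Made-up, perfectly regular spike train: 20 spikes per second for 8 s
fake_train = 0.025 + 0.05 * np.arange(160)

first_bin = {}
for n_bins in (8, 16, 32):
    counts, edges = np.histogram(fake_train, bins=n_bins, range=(0, 8))
    bin_size = edges[1] - edges[0]
    rates = counts / bin_size
    # Remember count and rate of the first bin for each choice of n_bins
    first_bin[n_bins] = (counts[0], rates[0])
    print(n_bins, counts[0], rates[0])
```

Real spike trains are not this regular, so their rate estimates still fluctuate with the bin size, but the units stay comparable.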

In the next three exercises, your goal will be to create a plot that shows the firing rates of all five neurons, so that we can compare them better.

**Exercise:**

Write a function called `firing_rate_histogram` that calculates the firing rate histogram from a spike train. As an argument, this function should take a spike train called `spike_train`. It should return two variables: First, the firing rate in each bin of the histogram, called `hist_firing_rate`. Second, the centers of the corresponding bins, called `hist_bin_centers`. The histogram should have 50 bins in the range from 0 to 8.

_Hint:_ You can return multiple results from a function by writing `return result_1, result_2`.

In [None]:
# Start your code here
# FIXME
def firing_rate_histogram(spike_train):
    spikes_per_bin, bin_edges = np.histogram(spike_train, bins=50, range=(0, 8))
    bin_size = bin_edges[1] - bin_edges[0]
    hist_bin_centers = bin_edges[:-1] + bin_size/2
    hist_firing_rate = spikes_per_bin / bin_size
    return hist_firing_rate, hist_bin_centers

# End your code here
# The next lines are to test your function
hist, bins = firing_rate_histogram(short_spike_trains[1])
print(hist[30], bins[30])

**Expected output:**

```
93.75 4.88
```

**Exercise:**

Write another function called `multi_frate_histograms` that calculates a histogram for each of multiple spike trains. As an argument, this function should take a list of spike trains called `list_of_spike_trains`. It should return two variables: First, a list called `multi_hist_firing_rate` that contains for each given spike train the corresponding firing rates in a histogram. Second, the bin centers of these histograms, called `hist_bin_centers`.

_Hint:_ Make use of a for-loop and the function you wrote above. Remember that the bin centers of all the histograms are the same, so you don't have to create a list for them. A single 1D array will suffice.

In [None]:
# Start your code here
# FIXME
def multi_frate_histograms(list_of_spike_trains):
    multi_hist_firing_rate = []
    for spike_train in list_of_spike_trains:
        hist_firing_rate, hist_bin_centers = firing_rate_histogram(spike_train)
        multi_hist_firing_rate = multi_hist_firing_rate + [hist_firing_rate]
    return multi_hist_firing_rate, hist_bin_centers

# End your code here
# The next lines are to test your function
hist, bins = multi_frate_histograms(short_spike_trains)
print(hist[2][30], bins[30])

**Expected output:**

```
56.25 4.88
```

**Exercise:**

Finally, create one plot that shows the histograms of the five neurons from the end of the last chapter (use `short_spike_trains`).

_Hint:_ Use your function from above. You can plot multiple graphs in one plot by simply calling `plt.plot` multiple times.

In [None]:
# Start your code here
# FIXME
frate_histogram_list, bin_centers = multi_frate_histograms(short_spike_trains)

for histogram in frate_histogram_list:
    plt.plot(bin_centers, histogram, lw=5);

plt.title("Firing rate histogram")
plt.xlabel("Time [s]")
plt.ylabel("Firing rate [Hz]");

## 3. Interspike interval

Spike generation does not depend only on the stimulus; a spike train also has internal statistics of its own.

The interspike interval (ISI) is defined as the time between two consecutive spikes. The ISI histogram is a histogram of all these intervals of a spike train.
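
The definition can be checked by hand on a very short, made-up spike train. This sketch takes the differences between neighbouring spike times with `np.diff`; note that n spikes yield n - 1 intervals.

```python
import numpy as np

fake_train = np.array([0.010, 0.014, 0.030, 0.035, 0.100])  # made-up times in seconds

# isi[k] = fake_train[k + 1] - fake_train[k]
isi = np.diff(fake_train)
isi_ms = isi * 1000

print(len(fake_train), len(isi))
print(np.round(isi_ms, 3))
```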

**Exercise:**

Calculate the ISI histogram for the spike train `example_spike_times` that we loaded in the beginning. Choose a reasonable number of bins and focus on ISIs below 30 ms. Then plot the histogram.

_Hint:_ numpy provides a function `diff` that will come in handy. Open the documentation (`shift + tab`) or google it to find out how it works.

In [None]:
# Start your code here
# FIXME
isi = np.diff(example_spike_times)
isi_hist, bin_edges = np.histogram(isi, bins=50, range=(0, 0.03))
bin_size = bin_edges[1] - bin_edges[0]
bin_centers = bin_edges[:-1] + bin_size/2

plt.plot(bin_centers, isi_hist)
plt.title("Interspike interval histogram")
plt.xlabel("Interspike interval [s]")
plt.ylabel("Number of occurrences");

What does this mean? Why are there so few intervals below ~3 ms?

So far, when using `np.histogram`, we have specified the number of bins of the histogram. This is the easiest way of using it, but we might also want to directly control the size of the bins.

**Exercise:**

Create the same plot as above, but now explicitly set the size of the bins to 0.1 ms. You can do so by creating the bin edges manually and passing them as the parameter `bins` to `np.histogram`. Take a look at the documentation!

_Hint:_ To create the bin edges, you can use `np.arange` with the `step` parameter.

In [None]:
# Start your code here
# FIXME
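bin_size = 0.0001  # 0.1 ms, expressed in seconds
# Stopping half a step past 0.03 guards against float rounding adding or dropping the last edge
bin_edges = np.arange(0, 0.03 + bin_size / 2, bin_size)
bin_centers = bin_edges[:-1] + bin_size/2
isi_hist, bin_edges = np.histogram(isi, bins=bin_edges)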

plt.plot(bin_centers, isi_hist)
plt.title("Interspike interval histogram")
plt.xlabel("Interspike interval [s]")
plt.ylabel("Number of occurrences");
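
With a non-integer step, the number of elements returned by `np.arange` can be off by one because of floating-point rounding, and NumPy's documentation suggests `np.linspace` for such cases. The sketch below is one possible alternative for building the bin edges, not the intended solution of the exercise; the variable names are only for illustration.

```python
import numpy as np

bin_size = 0.0001  # 0.1 ms, in seconds
upper = 0.03       # 30 ms, in seconds

# Fix the number of bins as an integer first, then place n_bins + 1 edges
n_bins = int(round(upper / bin_size))
edges = np.linspace(0, upper, n_bins + 1)

print(n_bins, len(edges))
print(edges[0], edges[-1])
```

Because `linspace` takes the number of points rather than a step, the first and last edges are exactly the requested limits.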

**Exercise:**

Try plotting a 'real' histogram using matplotlib's `hist` function. Whether you specify the number of bins or the bin edges is up to you.

In [None]:
# Start your code here
# FIXME
plt.hist(isi, bins=50, range=(0, 0.03))
plt.title("Interspike interval histogram")
plt.xlabel("Interspike interval [s]")
plt.ylabel("Number of occurrences");