### Bootcamp 2019 Day 5, Python *continued*

Mike Morais (4$^{th}$ year PNI, Pillow Lab)

---

Utilizing StackExchange, adapting starter code, and managing long multi-step analyses

# Neural decoding of location using hippocampal place cells

<img src="placecell.png" width=750>

Nancy Kanwisher's intro to place cells: ```https://youtu.be/km4203tZXnY?t=227```

<img src="encodedecode.png" width=700>

Simulating experiments can be a helpful way to ensure complicated analyses are working properly. If you're analyzing real data and the results look strange, is it because your code is wrong, or because your hypotheses are wrong? If you simulate data consistent with your hypotheses, then the only "free variable" is your analysis $-$ your results will be correct if and only if your code works as intended. This sort of piloting of your analyses would be a valuable asset in your future PhD work.

If you get truly stuck, I've imported functions from ```placecell_solutions.py``` that solve any of the steps of the analysis. I'll tell you which function you need, but it's your job to figure out how to use it! _Bonus points if you can code anything more cleverly than I did!_

$\longrightarrow$ _This problemset was based on Paul Miller's supplementary exercises from_ An Introductory Course in Computational Neuroscience.

- <font color='#d5573b'>Import the typical set of packages. Also import the starter code I've provided in ```decoding_utils```. If you get _truly_ stuck, I've also provided ```decoding_solutions``` which solves each subproblem.</font>
- <font color='#d5573b'>Use magic commands to autosave regularly and autoreload at least the ```decoding_utils``` package.</font>
- <font color='#d5573b'>Set the random number generator seed to $1$, so that we all get the same answers.</font>

## 1 $-$ Visualizing place fields of sample hippocampal "neurons"

We will assume our simulated hippocampal "neurons" have place fields shaped like spherical Gaussians. This is a 2D version of the "bell curve" in our figure above. Our neurons will be defined by three parameters: a center at which the firing rate of the neuron will be highest, a maximum firing rate, and a width with which that response will trail off away from that center.

Let's generate place fields for four neurons predefined below by these three parameters:

- <font color='#d5573b'>Define arrays of $x$ and $y$ coordinates from $0$ to $100$ (including $100$).</font>
- <font color='#d5573b'>Use the function ```field_2D``` in ```decoding_utils``` to generate place fields on these $x$ and $y$ coordinates for the four neurons with parameters defined below. _Hint: Check the inputs to_ ```field_2D```. _You'll need to make 2-D grids of $x$-$y$ coordinates._</font>
- <font color='#d5573b'>Plot all four place fields. Fix all four figures to have the same limits, so that we can see the difference between different rates.</font>

In [None]:
## Define the place fields of four cells (DETERMINISTIC)
N = 4
centers = np.array([[30, 30], [40, 70], [60, 10], [70, 65]]) ## (x,y)
widths = np.array([[15, 15, 20, 20]]).T                      ## Standard deviation of spherical Gaussians
rmax = np.array([[10, 15, 20, 15]]).T                        ## Maximum rate at center of place-field
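
To make the three parameters concrete, here is a minimal sketch of a spherical Gaussian place field evaluated on a 2-D coordinate grid. The helper `gaussian_field` is hypothetical and is not the provided ```field_2D```, whose inputs and scaling may differ; the sketch assumes the field peaks at exactly the maximum rate and that the width is a standard deviation.

```python
import numpy as np

def gaussian_field(X, Y, center, width, peak_rate):
    """Spherical 2-D Gaussian place field on grids X, Y (hypothetical helper)."""
    dist_sq = (X - center[0]) ** 2 + (Y - center[1]) ** 2
    return peak_rate * np.exp(-dist_sq / (2.0 * width ** 2))

# 1-D coordinate arrays from 0 to 100 inclusive, then 2-D grids of coordinates.
x = np.arange(0, 101)
y = np.arange(0, 101)
X, Y = np.meshgrid(x, y)   # X[i, j] = x[j] and Y[i, j] = y[i]

# Parameters of the first and third neurons from the cell above.
field_1 = gaussian_field(X, Y, center=(30, 30), width=15, peak_rate=10)
field_3 = gaussian_field(X, Y, center=(60, 10), width=20, peak_rate=20)
```

Note that with the default `np.meshgrid` indexing, rows correspond to $y$ and columns to $x$, so the peak of `field_3` sits at `field_3[10, 60]`.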


## 2 $-$ ENCODING $-$ Simulating "animal" trajectory through the environment

We will assume our simulated animal walks through the 2D environment as a random walk. A random walk describes random motion in which the next step is independent of the one before it. Specifically, we want to simulate a random walk on a grid, which has nine possible moves _at each time increment_:

$$\nwarrow\quad\uparrow\quad\nearrow$$
$$\leftarrow\quad\circ\quad\rightarrow$$
$$\swarrow\quad\downarrow\quad\searrow$$

On a bounded grid, we want to have boundary conditions that prevent a move off the grid, e.g. on the rightmost edge of a grid, we'd only have these moves:

$$\nwarrow\quad\uparrow$$
$$\leftarrow\quad\circ$$
$$\swarrow\quad\downarrow$$

We moreover want to simulate neural responses along this random walk. Hippocampal place fields tell us what the (average) neural response will be at any given location. After we simulate the movement of the animal, we can measure what the average neural response of our hippocampal neurons should be. This is called the _neural code_ of that population. At each time step, each neuron may or may not spike; we only know what the spike rate should be _on average_, but fortunately this is enough to simulate the responses.

Given that this is a random walk with noisy neural responses, our solution should involve a lot of random numbers.

- <font color='#d5573b'>Simulate a bounded random walk on the $x$-$y$ coordinate grid we've defined above, for 1800 sec, with a time increment of 0.02 seconds. This will include...</font>
> <font color='#d5573b'>Simulating "left, stay, right" and "up, stay, down" independently, each with probabilities $[0.25, 0.50, 0.25]$. What are the probabilities of moving in each of the eight directions (+ staying)?</font>
>
> <font color='#d5573b'>Including a boundary condition that negates any movements outside the grid. _Challenge: This is a bad way to implement the condition; fix it._</font>

- <font color='#d5573b'>At each location in the random walk, simulate random spiking activity for each neuron based on its place field simulated above. _Hint: At location $(30, 30)$ in the grid, neuron 1 is firing on average 10 spikes/sec. So, on a given time step of 0.02 sec, what fraction of the time will it spike?_ For those familiar, this is one way to simulate a Poisson process.</font>

- <font color='#d5573b'>Plot the random walk and the spike trains (the latter is often called a raster plot). _Because we have 90000 time points, these plots will look terrible. How can we make them look better / more readable?_</font>
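
One possible sketch of the simulation steps above, not the reference solution. The function names, the starting position, and the use of a local random generator are all assumptions made for illustration. The boundary rule here clips each coordinate separately, so only the component of a move that would leave the grid is cancelled.

```python
import numpy as np

rng = np.random.default_rng(1)   # assumption: a local generator, not the global seed

def simulate_walk(n_steps, grid_max=100, start=(50, 50), rng=rng):
    """Bounded random walk on the integer grid [0, grid_max] x [0, grid_max]."""
    # Independent x and y moves: -1, 0, +1 with probabilities 0.25, 0.50, 0.25.
    steps = rng.choice([-1, 0, 1], size=(n_steps, 2), p=[0.25, 0.50, 0.25])
    pos = np.empty((n_steps, 2), dtype=int)
    current = np.array(start)
    for t in range(n_steps):
        # Clipping cancels the part of a move that would leave the grid.
        current = np.clip(current + steps[t], 0, grid_max)
        pos[t] = current
    return pos

def simulate_spikes(rates, dt, rng=rng):
    """Bernoulli spikes: each time bin spikes with probability rate * dt."""
    return rng.random(rates.shape) < rates * dt

dt = 0.02
n_steps = round(1800 / dt)       # 90000 time increments
pos = simulate_walk(n_steps)     # (n_steps, 2) array of (x, y) positions
```

Given a hypothetical array `fields` of shape `(N, 101, 101)` holding the place fields, the rates along the walk would be `fields[:, pos[:, 1], pos[:, 0]]`, which can be passed to `simulate_spikes` together with `dt`.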


## 3 $-$ ANALYSIS $-$ Computing empirical spike fields and optimizing parametric functional fits

Now begins the data analysis stage, where experimental neuroscientists might actually start _in vivo_. Suppose the random walk and spike trains were the locations and neural responses of a real animal, and that we _didn't_ know the true place fields. We want to estimate them from these data. Where do we start?

Every time a spike occurs, we can know _where_ the animal was when it occurred. Intuitively, a general location that elicits more spikes is more likely to be near the center of the place field. So if we want to estimate the place field for each neuron, we can start by computing for each a _spike field_, which counts spikes at each location in the grid and reports the proportion of the time a neuron spikes at each location.

The resulting spike field is very chunky and noisy, but we assume that place fields should be generally smooth. To finish, we'll fit a smooth function to these spike fields and use these in our downstream analyses.



- <font color='#d5573b'>Fit spike fields to each of the neurons based on the spiking data and the positions in the grid. The spike field should report the proportion of time a neuron spiked when the animal was in each location, which depends on...</font>
> <font color='#d5573b'>Computing the number of time increments spent in each location on the grid. _Hint: scan through the positions at each time increment, and add 1 to a grid of zeros for each position._</font>
>
> <font color='#d5573b'>Computing the number of spikes in each location on the grid. _Hint: scan through the positions only at the time increments of spikes, and add 1 to another grid of zeros._ </font>

- <font color='#d5573b'>Use the function ```fit_2DGaussian``` in ```decoding_utils``` to fit a smooth profile on top of each neuron's empirical spike field. Don't worry so much about how this is done.</font>
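
The two counting hints above can be sketched for a single neuron as follows. This is an illustration under assumptions: `spike_field` is a hypothetical helper, positions are taken to be integer $(x, y)$ pairs, and locations the animal never visited are simply left at zero.

```python
import numpy as np

def spike_field(pos, spikes, grid_size=101):
    """Fraction of visits to each grid cell on which one neuron spiked.

    pos:    (T, 2) integer array of (x, y) positions
    spikes: (T,) boolean array, True where the neuron spiked
    """
    occupancy = np.zeros((grid_size, grid_size))
    counts = np.zeros((grid_size, grid_size))
    # np.add.at accumulates correctly when the same cell appears many times.
    np.add.at(occupancy, (pos[:, 1], pos[:, 0]), 1)          # rows = y, columns = x
    np.add.at(counts, (pos[spikes, 1], pos[spikes, 0]), 1)
    # Leave never-visited cells at zero instead of dividing by zero.
    return np.divide(counts, occupancy,
                     out=np.zeros_like(counts), where=occupancy > 0)
```

Using `np.add.at` replaces the explicit scan over time increments; a plain `occupancy[rows, cols] += 1` would count repeated visits to the same cell only once.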


## 4 $-$ ANALYSIS $-$ Estimating place fields

The spike fields we've estimated are actually a (posterior) probability of spiking at each location -- not a place field. Hopefully it's clear, though, that these two things should be related. A neuron that spikes with consistently higher probability should have a higher firing rate, so conversion from spike field to place field is reasonably straightforward. **Don't worry about the underlying details too much; the methods themselves are pretty quick.**

We can estimate the placefields using a (re)normalization in the following way:
$$\textsf{PlaceField}(\text{location}) = \frac{\textsf{SpikeField}(\text{location})\Pr[\text{spike}]}{ Z\cdot\Delta t\cdot \Pr[\text{location}]}$$
where $Z$ is the sum total of all probabilities in the spike field.

- <font color='#d5573b'>Compute the average probability of a spike for each neuron and compute the probability of being in any location, assuming that the probability of being in any location is equal (uniform).</font>

- <font color='#d5573b'>Use the normalization above to estimate the placefields from the spikefields.</font>
- <font color='#d5573b'>Plot the true placefield, the empirical spikefield, and the estimated placefield for each of the four neurons (in one figure).</font>
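
As a sanity check on the normalization formula, here is a minimal sketch that applies it to one neuron's spike field. The function name and argument names are hypothetical; the terms map one-to-one onto $\textsf{SpikeField}$, $\Pr[\text{spike}]$, $Z$, $\Delta t$, and $\Pr[\text{location}]$.

```python
import numpy as np

def estimate_place_field(spike_field, p_spike, dt, p_location):
    """Apply the (re)normalization to one neuron's spike field (illustrative sketch).

    spike_field: 2-D array of per-location spike proportions
    p_spike:     average probability that this neuron spikes in a time bin
    dt:          time increment in seconds
    p_location:  probability of occupying any one location (uniform assumption)
    """
    Z = spike_field.sum()                      # sum total of the spike field
    return spike_field * p_spike / (Z * dt * p_location)

# Toy check: a neuron that spikes in 20% of bins everywhere on a 10 x 10 grid.
toy_field = np.full((10, 10), 0.2)
toy_rates = estimate_place_field(toy_field, p_spike=0.2, dt=0.02,
                                 p_location=1.0 / toy_field.size)
```

In the toy check, a spike probability of $0.2$ per $0.02$-second bin should correspond to a rate of $10$ spikes/sec at every location, which is what the formula returns.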


## 5 $-$ DECODING $-$  Decoding position from estimated place fields

Our analysis of place cell encoding used positions and spikes to estimate the place fields. With those place field estimates, we can start our analysis of decoding to estimate the position. _Why bother re-estimating position when we have it already?_ If our model is correct, and our assumptions are valid, our decoding should be a faithful reconstruction of the data. As a result, "decoding" analyses are ubiquitous throughout neuroscience.

In this case of hippocampal neurons, strong decoding means our place cells do in fact represent meaningful location information. To decode, we'll leverage an iterative procedure called Bayesian updating. Importantly, this means that we're making location estimates _with uncertainty_, so we're actually estimating probabilities that the animal is in any location. Suppose we have estimates for some time $t$, and want to estimate time $t+1$. There are two sources of updated information at play here:

1. What is the probability that we'll be at location $(x^*,y^*)$ at time $t+1$? It's the probability that we were at location $(x^*-1,y^*)$ previously and moved right, plus the probability that we were at location $(x^*+1,y^*)$ previously and moved left, and so on. Using our assumption of a random walk, we know that we can update each location's probability using the nine possible movements that arrive at that location, and repeat this procedure for each location.

2. Suppose at time $t+1$, neurons $1$ and $2$ spiked, but neurons $3$ and $4$ didn't. Then the movement is likely to be "towards" the place fields of the neurons that spiked and "away from" the place fields of the neurons that didn't spike. This can be done by multiplying our estimate by the spike field (or the "no-spike" field) of each spiking (or not-spiking) neuron.

At every time step, we can _update_ our estimate of the animal's location for the next time step, and repeat.

- <font color='#d5573b'>Implement the Bayesian updating procedure described above. You can use the function ```bayesianupdate``` in ```decoding_utils``` to do the update operations in steps $1$ and $2$.</font>

- <font color='#d5573b'>With each map of estimated location, compute the maximum-probability location as a "point estimate" of the animal's location.</font>
- <font color='#d5573b'>Perform a few "scores" to measure how well your decoder estimates the true locations over time. Try to correlate the arrays of estimated location and true location (```np.corrcoef```) and/or try to compute the squared error $($estimated location $-$ true location$)^2$.</font>
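
The two update steps and the point estimate can be sketched as below. This is not the provided ```bayesianupdate``` function, whose interface may differ. It assumes the spike fields are per-bin spike probabilities, takes the "no-spike" field to be one minus the spike field, starts from a uniform prior, and ignores the boundary rule of the walk by zero-padding the edges and renormalizing.

```python
import numpy as np

MOVE_P = np.array([0.25, 0.50, 0.25])
KERNEL = np.outer(MOVE_P, MOVE_P)          # probabilities of the nine moves

def predict(prob):
    """Step 1: spread probability over the nine possible moves (zero-padded edges)."""
    padded = np.pad(prob, 1)
    out = np.zeros_like(prob)
    H, W = prob.shape
    for i in range(3):
        for j in range(3):
            out += KERNEL[i, j] * padded[i:i + H, j:j + W]
    return out

def update(prob, spike_prob, spiked):
    """Step 2: weight by each neuron's spike or no-spike field, then renormalize.

    spike_prob: (N, H, W) per-bin spike probabilities; spiked: (N,) booleans.
    """
    for field, s in zip(spike_prob, spiked):
        prob = prob * (field if s else 1.0 - field)
    return prob / prob.sum()

def decode(spike_prob, spikes):
    """Return the maximum-probability (x, y) estimate at each time step."""
    N, H, W = spike_prob.shape
    prob = np.full((H, W), 1.0 / (H * W))      # start from a uniform prior
    estimates = np.empty((spikes.shape[1], 2), dtype=int)
    for t in range(spikes.shape[1]):
        prob = update(predict(prob), spike_prob, spikes[:, t])
        row, col = np.unravel_index(prob.argmax(), prob.shape)
        estimates[t] = (col, row)              # (x, y) = (column, row)
    return estimates
```

With `estimates` and a true position array `pos` of the same shape, one score is `np.corrcoef(estimates[:, 0], pos[:, 0])[0, 1]` for the $x$ coordinate, and another is the mean squared error `((estimates - pos) ** 2).mean()`.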


## 6 $-$ Plotting

- <font color='#d5573b'>Plot each of the $x$ and $y$ positions over time in a plot, and overlay the estimated positions.</font>

## 7 $-$ Analysis with larger "neural" populations (Challenge)

Generalize the whole pipeline for random populations of different numbers of neurons. Can you find any features of your simulated populations that predict better or worse decoding, e.g. number of neurons, separation of placefield centers, high or low placefield widths? 

This could be a great opportunity to abstract your previous code into functions, and make sure it's as flexible as necessary.
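
As a starting point, a random population could be drawn with a small helper like the one below. The function name and the parameter ranges are assumptions chosen only to bracket the four hand-picked neurons from part 1; the returned arrays have the same shapes as `centers`, `widths`, and `rmax` there.

```python
import numpy as np

def random_population(n_neurons, grid_max=100, rng=None):
    """Draw random place-field parameters (hypothetical helper, arbitrary ranges)."""
    rng = np.random.default_rng() if rng is None else rng
    centers = rng.uniform(0, grid_max, size=(n_neurons, 2))   # (x, y) centers
    widths = rng.uniform(10, 25, size=(n_neurons, 1))         # standard deviations
    rmax = rng.uniform(5, 20, size=(n_neurons, 1))            # peak rates (spikes/sec)
    return centers, widths, rmax

centers_big, widths_big, rmax_big = random_population(50, rng=np.random.default_rng(1))
```

Passing an explicit generator keeps each simulated population reproducible, which helps when comparing decoding scores across population sizes.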