### **Biological Signals Analysis 2023**
### **Week 7 Exercise**
### **Poisson Processes Continued**

### **Review:**

##### Convolutions

We previously met the convolution operator:

$
    (f*g)(t) \equiv \int_0^t f(t-\tau)g(\tau)d\tau
$

and its discrete counterpart, shown here together with its commutative property:

$
    (f*g)[n] \equiv \sum_{m=-\infty}^{\infty} f[m]g[n-m] =  \sum_{m=-\infty}^\infty f[n-m]g[m]
$

The important points being:
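- Convolution is the natural "product" of two signals when one acts as a filter on the other: one function is flipped, slid across the other, and their overlap is summed at every shift. Element-wise multiplication does not capture this.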
- The operation itself requires pretty basic calculus.
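
As a quick illustration of the discrete definition and its commutative property, here is a minimal sketch that evaluates the sum directly and compares it with NumPy's `np.convolve`. The two sequences and the helper name `discrete_convolution` are arbitrary choices made for this example.

```python
import numpy as np

# Two short example sequences (values chosen arbitrarily for illustration).
f = np.array([1.0, 2.0, 3.0])
g = np.array([0.5, 0.5])


def discrete_convolution(f, g):
    """Directly evaluate (f*g)[n] = sum_m f[m] g[n-m] for finite sequences."""
    out = np.zeros(len(f) + len(g) - 1)
    for n in range(len(out)):
        for m in range(len(f)):
            # Terms with n-m outside g's support contribute zero.
            if 0 <= n - m < len(g):
                out[n] += f[m] * g[n - m]
    return out


fg = discrete_convolution(f, g)
gf = discrete_convolution(g, f)

print(fg)                  # direct sum, f*g
print(gf)                  # direct sum, g*f (same values: commutativity)
print(np.convolve(f, g))   # NumPy's built-in implementation
```

Since `g` is a two-point average, the result is a smoothed copy of `f`, which is the same idea we use when averaging spike trains.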

### **Poisson Process** 

#### Introduction


The reason we mentioned convolution in the first place was that we were looking for a formal way to average our spike sequence:

$
\rho (t) = \sum_{i=1}^{n} \delta (t-t_i)
$

and we saw that convolving a spike train with a running window produces $r(t)$, the time-dependent spike average. Looking back at the above equation, the spike train is a simple sum of delta functions at different times, which makes it a natural input to an LTI system: the response to the whole train is the sum of the responses to the individual, time-shifted deltas. This fact will justify much of the math that we'll do, but we'll see that in a moment.
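
The running-window average can be sketched numerically as follows. The spike times, the bin width `dt`, and the 100 ms rectangular window are all assumed values picked for illustration, not values from the course data.

```python
import numpy as np

dt = 0.001                       # bin width in seconds (assumed)
duration = 1.0                   # trial length in seconds (assumed)
n_bins = int(round(duration / dt))

# Hypothetical spike times t_i, in seconds.
spike_times = np.array([0.10, 0.12, 0.15, 0.50, 0.52, 0.90])

# Discretized rho(t): each delta function becomes one bin of height 1/dt,
# so that sum(rho) * dt equals the number of spikes.
rho = np.zeros(n_bins)
idx = np.round(spike_times / dt).astype(int)
rho[idx] = 1.0 / dt

# Rectangular running window, 100 ms wide, normalized to unit area.
window_width = 0.1
n_window = int(round(window_width / dt))
window = np.ones(n_window) / (n_window * dt)

# r(t) = (rho * window)(t); the factor dt approximates the integral.
rate = np.convolve(rho, window, mode="same") * dt

print(rate.max())        # peak firing-rate estimate in spikes per second
print(rate.sum() * dt)   # integral of r(t), i.e. the total spike count
```

A wider window gives a smoother but less time-resolved estimate of $r(t)$; a narrower one does the opposite.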

Our goal is to characterize and analyze spike trains. Saying that implies that we're not analyzing sub-threshold activity, and that we're treating all spikes coming from a single neuron as equals. These are reasonable assumptions, but we should state them explicitly. We're also relying on spikes and spike timings, which we know are generally unreliable: repeated identical trials will not create identical responses in individual neurons.

We'd like to have a model for the **probability** that a specific spike, or spike train, will appear in response to a given stimulus. While not allowing us to predict *every spike*, this model can be a huge step forward in our understanding of neural encoding and decoding. Assuming we have the probability density function (PDF) of spike times $p(t_1,\ldots,t_n)$, multiplying it by one small time window $\Delta t$ per spike gives the probability $P(t_1,\ldots,t_n)$ of a specific sequence of $n$ spikes in which spike $i$ happened between times $t_i$ and $t_i + \Delta t$:

$
P(t_1,\ldots,t_n) = p(t_1,\ldots,t_n) (\Delta t)^n
$

Unfortunately, we cannot assume that we know $p(t_1,\ldots,t_n)$ for every possible spike train: there are far too many possible spike sequences to estimate this function directly from data. If we instead rely on a statistical model to describe **all** possible spike trains, we can use the model's assumptions to constrain our estimate. We'll define the firing rate $r(t)$ such that $r(t)\Delta t$ is the probability that the neuron fires in a short interval $\Delta t$ around $t$. On its own this is still insufficient to characterize the full problem, since the probability of a spike may also depend on previous spikes, meaning that we need the entire spike history in order to determine the next one. In mathematical terms:

\begin{equation}
    P(t) = P(t_n|t_1, \ldots, t_{n-1})
\end{equation}

This is a hard nut to crack: such a history-dependent probability is very hard to estimate and to simulate. We can simplify it by assuming that the probability to spike at time $t$ depends only on the previous spike:

\begin{equation}
    P(t_n|t_{n-1})
\end{equation}

This is called a renewal process, and it's a bit easier to handle, as we'll see later. However, if the spike probability is independent of all previous spikes, then $r(t)$ alone is sufficient to characterize the spike train, and we have ourselves a Poisson process. This is also called the "independent spike hypothesis".

In summary, the chance to see a spike in a short interval around time $t$ depends only on $r(t)$. The probability to receive a specific spike train in which spikes happened at $t_1, t_2, \ldots, t_n$ is $P(t_1, t_2, \ldots, t_n)$. The probability density function corresponding to $P$ is $p$, which means that the chance to see a *specific* spike train is $p(t_1, \ldots, t_n)(\Delta t)^n$. Our model of this spike train is given by the expression $\rho(t) = \sum_{i=1}^{n}{\delta (t - t_i)}$, and we convolve this spike train with some window function to estimate $r(t)$.

#### Homogeneous Poisson Process

To make our lives even easier, we'll assume for now that $r(t)$ is actually a time-independent constant $r$. This is called a *homogeneous Poisson process*. What is the probability $P(t_1, \ldots, t_n)$ of some specific spike train under the assumptions of a homogeneous Poisson process?

First we'll ask ourselves what this probability depends on. We'll start with a "wider", more general function $P_T(n)$, which models the probability that any sequence of $n$ spikes occurs within a trial of duration $T$. To compute it, we divide the total measurement time $T$ into $M$ bins of length $\Delta t = T/M$. Assuming $\Delta t$ is small enough so that we never have two spikes within a single bin, we can think of three different factors that affect this quantity.

- The number of ways of placing $n$ spikes into $M$ bins. Combinatorics tells us that this equals $\binom{M}{n}$.
- The probability that each of the $n$ chosen bins contains a spike. The chance to receive one spike in a time bin is $r\Delta t$, and since the bins are independent the chance to receive $n$ of these is $(r \Delta t)^{n}$.
- The probability of not having a spike in the remaining time bins. We're left with $M - n$ bins, each empty with probability $1 - r\Delta t$, so this probability is $(1 - r \Delta t)^{M-n}$.

Multiplying them all together results in the following expression:

\begin{equation}
P_T(n) = \lim_{\Delta t \rightarrow 0} \binom{M}{n} (r \Delta t)^n (1 - r
\Delta t)^{M - n}
\end{equation}

where we take the limit $\Delta t \rightarrow 0$ so that the assumption of at most one spike per bin becomes exact. To continue, note that $M = T/\Delta t$ becomes large as $\Delta t$ shrinks, which makes $M - n \approx M$ and also simplifies part of the binomial coefficient, finally resulting in:

\begin{equation}
\begin{aligned}
\frac{M!}{(M-n)!} \approx M^n  = \left( \frac{T}{\Delta t} \right)^n \ & ; \
\lim_{\Delta t \rightarrow 0} (1 - r \Delta t)^{M-n} = e^{-rT} \\
 \Downarrow & \\
P_T(n) = & \frac{(rT)^n}{n!}e^{-rT}
\label{eq:ProbT}
\end{aligned}
\end{equation}

\begin{figure}
 \begin{centering}
\includegraphics[scale=0.6]{ImagesExtraMaterial/poisson_distrib_for_spike_train.png}
\par\end{centering} \caption{\textbf{A)} Probability to generate $n$ spikes in a
time period $T$ for different numbers of spikes $n$. \textbf{B)} Probability to
find $n$ spikes when $rT=10$. The plotted line is a Gaussian distribution with mean and variance equal
to 10. From Dayan \& Abbott.}
  \label{fig:poi_proc}
\end{figure}

This is exactly the Poisson distribution. Figure \ref{fig:poi_proc} shows typical values for $P_T(n)$. Once we have this function written down, we return to the probability $P(t_1,\ldots,t_n)$ of the specific spike train we actually measured. We won't prove it, but the full relationship between these two quantities is given by:

\begin{equation}
P(t_1,\ldots,t_n) = n! P_T(n) \left( \frac{\Delta t}{T} \right)^n
\label{eq:spike_train}
\end{equation}
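
The limit that turns the binomial expression into the Poisson distribution can be checked numerically. The sketch below uses assumed values ($r = 10$ Hz, $T = 1$ s, $n = 8$) and hypothetical helper names; it compares the binned expression for an increasing number of bins $M$ with the closed form $P_T(n)$.

```python
import math


def binomial_count_prob(n, r, T, M):
    """P(n spikes) when T is split into M bins, each spiking with prob r*T/M."""
    p = r * T / M
    return math.comb(M, n) * p**n * (1 - p) ** (M - n)


def poisson_count_prob(n, r, T):
    """Limiting expression P_T(n) = (rT)^n / n! * exp(-rT)."""
    return (r * T) ** n / math.factorial(n) * math.exp(-r * T)


# Assumed example values: rate in Hz, duration in seconds, spike count.
r, T, n = 10.0, 1.0, 8

for M in (20, 100, 1000, 100000):
    # As M grows (dt shrinks), the binomial value approaches the Poisson one.
    print(M, binomial_count_prob(n, r, T, M), poisson_count_prob(n, r, T))
```

For small $M$ the two values differ noticeably, because the "at most one spike per bin" assumption is still a poor approximation there.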

#### Inhomogeneous Poisson Process

The inhomogeneous case, in which $r = r(t)$, implies that different spike sequences are no longer equally likely, even when they contain exactly the same number of spikes $n$. In Dayan & Abbott and in Anan's presentation you can see how to derive that the probability density for an $n$-spike train is

\begin{equation}
p(t_1,\ldots,t_n)=\exp\left( -\int_{0}^{T}r(t)dt \right)
\prod_{i=1}^{n}r(t_i)
\end{equation}
where $t_1,\ldots,t_n$ are ordered. This equation still assumes that each spike is generated independently of all other spikes.
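
To get a feeling for this density, the following sketch evaluates its logarithm, $\log p = -\int_0^T r(t)\,dt + \sum_i \log r(t_i)$, for a given rate function. The sinusoidal rate `example_rate`, the spike times, and the trapezoidal integration grid are assumptions made only for this example.

```python
import numpy as np


def log_density(spike_times, rate_fn, T, n_grid=10001):
    """Log of p(t_1..t_n) = exp(-int_0^T r(t) dt) * prod_i r(t_i)."""
    t = np.linspace(0.0, T, n_grid)
    r = rate_fn(t)
    # Trapezoidal approximation of the integral of r(t) over [0, T].
    integral = np.sum((r[1:] + r[:-1]) * np.diff(t)) / 2.0
    # Sum of log-rates evaluated at the spike times.
    log_rates = np.log(rate_fn(np.asarray(spike_times, dtype=float)))
    return -integral + np.sum(log_rates)


def example_rate(t):
    """Hypothetical firing rate in Hz: 20 Hz baseline with a 1 Hz modulation."""
    return 20.0 + 10.0 * np.sin(2.0 * np.pi * t)


# Hypothetical ordered spike times, in seconds.
spikes = [0.05, 0.20, 0.31, 0.70]
print(log_density(spikes, example_rate, T=1.0))
```

Working with the logarithm avoids numerical underflow when the spike train contains many spikes. With a constant rate the expression reduces to $-rT + n\log r$, the homogeneous case.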

### **Properties of signals and spike trains**

In this part we'll explore some statistical properties of the distribution $P_T(n)$, namely its mean and variance. We've already seen in exercise 1 that the variance of the Poisson distribution is equal to its parameter, here $rT$:

\begin{equation}
\sigma_n^2 = \left<n^2\right> - \left<n\right>^2 = rT
\end{equation}
where the angle brackets $\left< \cdot \right>$ denote the expected value. The mean of such a process is also $\left<n\right>=rT$, making the variance and mean equal.

#### Fano factor and coefficient of variation

The ratio of these two quantities (variance to mean) is called the *Fano factor*. It can be used to describe a real, measured spike train when we wish to compare it to a true (idealized) homogeneous Poisson process, for which it is equal to one. Figure \ref{fig:fanofac} shows the distribution that can help us calculate the Fano factor.

\begin{equation}
\text{Fano factor} \equiv F = \frac{\sigma_n^2}{\left< n \right>}
\end{equation}

If the underlying neural process that generated the spikes was Poisson-like, we expect the Fano factor to be close to one.
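
The Fano factor of a simulated homogeneous Poisson process can be estimated as in the sketch below. The rate, trial duration, bin width, number of trials, and random seed are assumed values; the spike trains are generated with the same "one independent coin flip per bin" picture used in the derivation of $P_T(n)$.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily

# Assumed simulation parameters.
r = 20.0         # firing rate in Hz
T = 2.0          # trial duration in seconds
dt = 0.001       # bin width in seconds
n_trials = 2000
n_bins = int(round(T / dt))

# Each bin holds a spike with probability r*dt, independently of all others.
spikes = rng.random((n_trials, n_bins)) < r * dt

# Spike count n of every trial.
counts = spikes.sum(axis=1)

# Fano factor: variance of the counts divided by their mean.
fano = counts.var() / counts.mean()

print(counts.mean())   # should be close to r*T
print(fano)            # should be close to 1
```

Because time is discretized, the expected value here is $1 - r\Delta t$ rather than exactly one; the difference vanishes as $\Delta t \rightarrow 0$.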

\begin{figure}
\begin{centering}
\includegraphics[scale=1]{ImagesExtraMaterial/fano_factor.png}
\par\end{centering} \caption{Fano factors of spike trains aggregated from multiple neurons are usually positioned around a value of 1. From Izhar Bar-Gad lectures.}
\label{fig:fanofac}
\end{figure}

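A second important quantity is the *coefficient of variation*, which describes the distribution of interspike intervals rather than the spike counts of equation \ref{eq:ProbT}. For a homogeneous Poisson process the interspike intervals $\tau$ follow the exponential density $p(\tau) = r e^{-r\tau}$: setting $n = 0$ and $T = \tau$ in equation \ref{eq:ProbT} gives the probability $e^{-r\tau}$ of no spike for a time $\tau$, and the probability of a spike in the following $\Delta t$ is $r\Delta t$. To calculate the coefficient of variation we first need the mean of this distribution. The mean interspike interval is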

\begin{equation}
\begin{aligned}
\left<\tau\right> & = \int_0^\infty \tau p(\tau) d\tau \\
& = \int_0^\infty \tau r e^{-r\tau} d\tau \\
& = r \int_0^\infty \tau e^{-r\tau} d \tau \text{\ \ \ (Gamma function)}\\
& = r \left[ \frac{1}{r^2} \right] = \frac{1}{r}
\end{aligned}
\end{equation}
and the variance of the interspike intervals is

\begin{equation}
\sigma_\tau^2 = \int_0^\infty \tau^2 r e^{-r\tau} d\tau - \left< \tau \right>^2
= \frac{1}{r^2}
\end{equation}

The ratio of the standard deviation to the expected value is the *coefficient of variation*, and it is again equal to one for a true homogeneous Poisson process. It's useful because the standard deviation of the data should always be judged in the context of its mean, and dividing by the mean makes it a dimensionless quantity. When this coefficient is bigger than one, the neuron is usually bursty. When it's lower than one, we're usually dealing with a regular neuron, more deterministic in character (i.e. one that integrates its input and fires).

\begin{equation}
C_V = \frac{\sigma_\tau}{\left< \tau \right>}
\end{equation}

For renewal processes, in which the next spike depends only on the previous one, the Fano factor approaches $C_V^2$ over long time intervals. An example of the coefficient of variation is given in figure \ref{fig:Cv}.

\begin{figure}
\begin{centering}
\includegraphics[scale=1]{ImagesExtraMaterial/coefficient_of_var.png}
\par\end{centering} \caption{Different probability density functions lead to
different coefficients of variation. From Izhar Bar-Gad lectures.}
\label{fig:Cv}
\end{figure}
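
As a small numerical illustration of $C_V$, the sketch below compares exponentially distributed interspike intervals (the homogeneous Poisson case) with nearly constant intervals that mimic a regular neuron. The rate, the jitter size, the sample size, and the seed are assumed values, and the Gaussian jitter is only a convenient stand-in for a regular firing pattern.

```python
import numpy as np


def coefficient_of_variation(isis):
    """C_V = standard deviation of the ISIs divided by their mean."""
    isis = np.asarray(isis, dtype=float)
    return isis.std() / isis.mean()


rng = np.random.default_rng(1)   # fixed seed, chosen arbitrarily
r = 10.0                         # assumed firing rate in Hz
n_intervals = 50_000

# Homogeneous Poisson process: exponential ISIs with mean 1/r.
poisson_isis = rng.exponential(scale=1.0 / r, size=n_intervals)

# "Regular" neuron: ISIs of 1/r seconds with a small 5 ms Gaussian jitter.
regular_isis = 1.0 / r + rng.normal(0.0, 0.005, size=n_intervals)

cv_poisson = coefficient_of_variation(poisson_isis)
cv_regular = coefficient_of_variation(regular_isis)

print(cv_poisson)   # close to 1
print(cv_regular)   # well below 1
```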

\end{document}

#### Modeling Interspike Intervals Using Exponential Decay

Suppose a spike occurs at time $t_i$. In the context of a homogeneous Poisson process, we consider the probability of generating the next spike within a small interval $\Delta t$, after a time $\tau$ has elapsed since the last spike. This scenario can be dissected into two probabilities:

1. **The Probability of No Spike Occurring for a Time $\tau$**: This is essentially the probability that the neuron remains 'quiet' for a duration of $\tau$, before potentially firing the next spike. Given the memoryless property of the Poisson process, the probability of not firing for any given small interval is independent of history. For a homogeneous Poisson process with rate $r$, the probability of no spike in a small interval $\Delta t$ is approximately $1 - r\Delta t$.

2. **The Probability of Generating a Spike in the Following Small Interval $\Delta t$**: Immediately after the interval $\tau$, the chance of firing a spike in the next infinitesimally small time slice $\Delta t$ is $r\Delta t$, assuming $r$ is the rate of the process (i.e., average number of spikes per unit time).

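Because the bins are independent, the probability of no spike in any of the $\tau/\Delta t$ bins that make up the interval $\tau$ is $(1 - r\Delta t)^{\tau/\Delta t}$, which tends to $e^{-r\tau}$ as $\Delta t \rightarrow 0$. The probability that no spike occurs for a time $\tau$ and that a spike then occurs in the next interval $\Delta t$ is therefore $e^{-r\tau}\, r\Delta t$, so the interspike intervals have the exponentially decaying density $p(\tau) = r e^{-r\tau}$. Long silences become exponentially less likely, but the chance of firing in the next $\Delta t$ always remains $r\Delta t$ regardless of how long the neuron has been quiet; this is the memoryless property of the process.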
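
The exponential form can be checked by multiplying the per-bin "no spike" probabilities directly. In this sketch the rate, the silent period $\tau$, and the list of bin widths are assumed values, and `no_spike_probability` is a hypothetical helper name.

```python
import math


def no_spike_probability(r, tau, dt):
    """Probability of no spike in tau/dt consecutive bins of width dt."""
    n_bins = int(round(tau / dt))
    # Each bin is empty with probability 1 - r*dt, independently of the rest.
    return (1.0 - r * dt) ** n_bins


# Assumed example values: rate in Hz and silent period in seconds.
r, tau = 10.0, 0.2

for dt in (1e-2, 1e-3, 1e-4, 1e-5):
    # The binned product approaches exp(-r*tau) as dt shrinks.
    print(dt, no_spike_probability(r, tau, dt), math.exp(-r * tau))
```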

**Calculating the Average Interspike Interval**

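The average interspike interval (ISI) can be calculated by leveraging the exponential distribution that arises from the Poisson process's properties. The probability that the waiting time until the next spike exceeds $t$, i.e. the survival function of the exponential distribution, is given by: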

$ P(T > t) = e^{-rt} $

where $T$ is the time until the next event (spike), not the trial duration used earlier, $t$ is a specific time interval, and $r$ is the rate of the process. The mean of the exponential distribution, which represents the average time until the next spike, is given by:

$ \mu = \frac{1}{r} $

Thus, the average interspike interval in a homogeneous Poisson process is the reciprocal of the rate of the process; for example, a neuron firing at $r = 20$ Hz has a mean interspike interval of $50$ ms. This gives us a simple model for the timing of spikes in neuronal spike trains.
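
To see the $1/r$ result emerge from data, the sketch below simulates one long homogeneous Poisson spike train bin by bin and measures its interspike intervals. The rate, bin width, total duration, and seed are assumed values chosen so that the simulation runs quickly.

```python
import numpy as np

rng = np.random.default_rng(2)   # fixed seed, chosen arbitrarily

# Assumed simulation parameters.
r = 20.0       # firing rate in Hz
dt = 0.0005    # bin width in seconds
T = 500.0      # total simulated time in seconds
n_bins = int(round(T / dt))

# One independent coin flip per bin, with spike probability r*dt.
spikes = rng.random(n_bins) < r * dt

# Convert bin indices to spike times, then to interspike intervals.
spike_times = np.flatnonzero(spikes) * dt
isis = np.diff(spike_times)

mean_isi = isis.mean()

# Empirical fraction of intervals longer than t, to compare with exp(-r*t).
t = 0.05
empirical_survival = np.mean(isis > t)

print(mean_isi, 1.0 / r)
print(empirical_survival, np.exp(-r * t))
```

The agreement improves with longer simulations and smaller bins, since both reduce the sampling noise and the discretization error.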