## Abstract

The reactivation of neuronal activity patterns outside the context in which they originally occurred is thought to play an important role in mediating memory, but the biophysical mechanisms underlying reactivation are not well understood. Especially mysterious is the short-term replay of sequential activity patterns that occurred in the recent past, since spike-timing-dependent plasticity, the biophysical mechanism normally invoked to bias model networks towards sequence production, is not thought to have a substantial effect on such short timescales. Here we propose a model in which short-term memory for sequences is maintained by persistent activity that is triggered by the original sequence activation and that directly increases the effective excitability of the neural ensembles involved in the sequence, thus increasing the probability that sequences involving those ensembles are replayed. We show how such a phenomenon can be implemented in a simple dynamical model, and we show that the number of sequences that a randomly connected recurrent network can replay grows polynomially in the number of ensembles in the network (with degree equal to sequence length). In a simplified probabilistic model we then show how the decodability of past stimulus sequences from future neural replay sequences (in the absence of the stimulus) increases as the network connectivity becomes reflective of the stimulus transitions. Finally, we discuss the computational principles underlying our model in terms of attractors and the generation of spatiotemporal patterns from spatially defined information. Our model provides a low-complexity, biologically plausible alternative to other sequence reactivation models and makes the prediction that reactivation of arbitrary sequences in biological neural networks will be much less common than reactivation of sequences already preferentially embedded in the network.

## Introduction

The replay of neuronal activity patterns outside of the behavioral context in which they originally occurred is thought to play an important role in the retention and recall of memories (Gelbard-Sagiv 2008, Dupret 2010, Carr 2011). However, while neuronal pattern replay has been observed across a diverse set of brain areas, including mammalian hippocampus (Skaggs 1996, Nadasdy 1999, Louie 2001, Davidson 2009), ventral striatum (Wimmer 2016), ventral tegmental area (Valdes 2015), prefrontal cortex (Euston 2007), and visual cortex (Ji 2007, Han 2008, Eagleman 2012), as well as in RA in songbirds (Dave 2000), the biophysical mechanisms underlying this phenomenon are poorly understood. This is because in order for replay to occur a neural activity pattern must be temporarily "tagged" upon its initial activation such that its probability of activating again at a later time is preferentially increased, but how such tagging might occur is not clear. This appears especially problematic in the replay of spatiotemporal patterns or sequences, in which not only specific sets of neurons, but also the directed associations between them, must be remembered and reproduced. Most puzzling of all is the observed replay of sequential activity patterns nearly immediately after their original activation (Han 2008, Davidson 2009, Eagleman 2012), since the effects of canonical sequence-reinforcing plasticity mechanisms, such as spike-timing-dependent plasticity (STDP) (Bi 2001), are generally quite weak (Markram 1997, Bi 2001) and would not be expected to significantly alter neural network dynamics over short time courses. Indeed, network models using STDP as their primary learning rule typically require a large number of stimulus presentations (i.e., at least dozens) before the network can generate stereotyped sequences (Fiete 2010, Klampfl 2013, Huang 2015).
(Some effort has been made to identify precise regimes under which STDP may have faster effects [Yger 2015], but in this case STDP did not specifically bias the generation of sequences.)

The alternative to mediation of short-term sequential replay by synaptic plasticity is mediation by temporary changes in network state. That is, one might imagine a portion of the network collapsing onto an *attractor* that maintains a representation from which the original sequence can later be reconstructed. Indeed, such attractor networks are perhaps the most common model for short-term memory in neural systems (Barak 2014, Chaudhuri 2016). While attractors can in general have relatively arbitrary time-varying structure that nonetheless contains information about previous activity patterns (Maass 2002), for the purpose of parsimony we focus on a specific subset of attractor models that have the following property: activation of an ensemble of neurons moves a portion of the network to an attractor state such that the excitability of the activated ensemble is increased for as long as the network remains in that attractor state. We later show that neural ensembles with this property can easily be built using a standard "rate-based" model network (Wilson 1972) with loosely tuned connectivity, and that the attractors that emerge from sets of these ensembles are of a compositional nature, which greatly increases the set of possible stable states that can be temporarily maintained.

In addition to parsimony, this choice of model class is motivated by a growing body of experimental work suggesting that neurons in many brain regions can indeed show short-term, activation-triggered increases in excitability. Specifically, it has been shown during a variety of working memory tasks that certain neurons exhibit an increased response to the second presentation of a stimulus relative to the first (reviewed in Tartaglia 2015). For example, many stimulus-selective neurons in primate inferotemporal cortex (IT) responded more strongly to a stimulus when the stimulus matched a target that had been shown a few seconds earlier, relative to when a different target had been shown (Miller 1994), with similar results later obtained in MT (Liu 2011) and V4 (Hayden 2013). A more recent experiment, in which functional magnetic resonance imaging (fMRI) recordings were made from human volunteers who were shown sequences of faces, found that many face-responsive voxels, which are a proxy for large populations of neurons (Huettel 2004), demonstrated a consistent enhancement of their responses to repeated faces (de Gardelle 2012).

Here we explore the capacity for an attractor network whose connectivity produces activation-triggered lingering hyperexcitability to demonstrate preferential replay of recent, stimulus-driven activity sequences. We first demonstrate sequential replay in a rate-based network model with a simple tree-like connectivity structure. Next we perform a graph theoretical analysis of the capacity for randomly connected recurrent networks to exhibit short-term sequential replay and show that for a fixed sequence length $L$, on average $\sim O(N^L)$ sequences can be replayed in a network of $N$ ensembles. Using a simplified dynamical model we then show that the decodability of past stimulus sequences from future neural replay sequences increases when the internal network connectivity is reflective of the transition probabilities among stimulus elements. Finally, we discuss how our results suggest the utility and biological feasibility of large sets of stable, compositionally constructed attractor states and how the spatial information maintained in these attractors combined with the connectivity structure of the neural network can yield a rich space of replayable sequential activity patterns.

## Results

### Activation-triggered hyperexcitability in a network of leaky integrate-and-fire neurons yields stimulus-specific sequence replay

We first provide a proof-of-principle demonstration of how a network of leaky integrate-and-fire (LIF) neurons can yield sequence replay. The network consists of three neuron types: primary neurons, memory neurons, and inhibitory neurons, with sequence activation corresponding to sequential spiking activity in primary neurons. All neurons have identical membrane time constants, resting potentials, threshold potentials, reset potentials, and refractory periods. All input to a neuron, either external or from other neurons in the network, arises through synaptic activation, which leads to transient conductance changes and deflections of the neuron's membrane potential. Primary and memory neurons send only excitatory (glutamatergic) projections to other neurons, while the inhibitory neuron sends only inhibitory (GABAergic) projections. The network consists of 9 primary neurons $(P1 - P9)$, 9 memory neurons $(M1 - M9)$, and 1 inhibitory neuron $(I1)$.

The ability of the network to replay sequences arises as a consequence of its synaptic connectivity (**Figure 1A**). Each primary neuron projects to and receives a projection from the inhibitory neuron, which also sends an inhibitory connection to itself. Each primary neuron also sends an excitatory projection to and receives an excitatory projection from one corresponding memory neuron, and each memory neuron sends an excitatory projection to itself. The memory neuron self-connection is strong enough to endow the neuron with bistability between a non-spiking downstate and a persistently spiking upstate. The connections among the primary neurons in each ensemble are arranged in a tree-like structure with a branch point at neuron $P3$.

**Figure 1B** shows an example of sequence replay in the LIF network just described. A sequence of primary neurons $[P1, P2, P3, P4, P5, P6]$ is first activated by sequentially driving each neuron with an external excitatory input for X seconds, superimposed upon a global background of noisy inhibitory inputs to the primary neurons. After the initial sequence, a global background of noisy excitatory inputs is added to the inhibitory background. Subsequently, a "trigger" excitatory input is applied to the first primary neuron in the sequence, causing it to spike and the rest of the neurons in the sequence to follow. At $t = ...$ a short pulse of GABAergic inputs is applied to the memory neurons, resetting the network to its original state. After resetting, a different sequence $[P1, P2, P3, P7, P8, P9]$ is activated by an external stimulus and subsequently replayed in response to the same excitatory trigger to primary neuron $P1$. Sequence replay, however, can also arise without a trigger, as exemplified by the spontaneous sequence activation at time $t = ...$. Importantly, the two sequences begin with the same subsequence $[P1, P2, P3]$, so the identity of the replayed sequence is determined not just by its first elements, but also by the activation of its later elements in the recent past. Thus, the network can maintain information about a stimulus-elicited activation sequence in a transient attractor state, i.e., persistent activation of a subset of memory neurons (**Figure 1C**), such that the specific sequence being maintained is replayed upon activation of the first neuron in the sequence. Notably, the timescale of the sequence being replayed is determined by the intrinsic network dynamics, not by the stimulus. Since in this case the stimulus timescale is slower than the network timescale, the replay is compressed, as is seen frequently in the brain (Davidson 2009).

**Figure 1D** shows the voltage traces of primary neuron $P4$, memory neuron $M4$ and the inhibitory neuron $I1$ during the initial stimulus-driven activation of the sequence $[P1, P2, P3, P4, P5, P6]$: $M4$ is moved to its upstate by the multi-spike activity of $P4$. **Figure 1E** shows the voltage traces of primary neurons $P3$, $P4$, and $P7$, memory neurons $M4$ and $M7$, and the inhibitory neuron $I1$ during reactivation of the original stimulus-driven sequence $[P1, P2, P3, P4, P5, P6]$. Since $M4$ but not $M7$ has been moved to its upstate by the original activation sequence, $P4$ receives increased persistent excitatory input. Thus, even though both $P4$ and $P7$ receive the same excitatory input from $P3$ following $P3$'s spike, only $P4$ is able to cross its threshold due to the additional excitatory input it receives from $M4$.

fig_1(...)

We next show how the behavior of the network can be modulated by global "control" inputs. For instance, an inhibitory input applied homogeneously to all memory units prevents them from switching to their upstates following initial sequence activation, thus preventing subsequent sequence replay (**Figure 2A**). The network can also be moved to a spontaneously active state by applying global noisy excitatory background input, in which case both allowable sequences activate spontaneously (**Figure 2B**).

fig_2(...)

### A simplified network model

Activation-triggered hyperexcitability and the generation and replay of sequences can be implemented in many different networks as long as there is lateral inhibition among the primary units, bistability in the memory units, and a bias for sequences corresponding to paths through the subnetwork of primary units. The specific details of the model, however, can vary greatly. For instance, the timescale of replay could be determined primarily by spike-frequency adaptation (ref for SFA), as opposed to inhibitory feedback delays, which would have the effect of dilating the sequence replay timescale. In order to focus on the fundamental computational features that are invariant to the specific details of the neural implementation, we consider a substitute model that has simplified dynamics but shares the key properties of the LIF network: strong inhibitory feedback and activation-triggered hyperexcitability. Specifically, we consider a discrete-time network of interconnected ensembles with binary activations.

In the simplified model we simultaneously implement binary activation and inhibitory feedback by imposing a winner-take-all (WTA) rule on the network, such that exactly one ensemble is active at each time step. WTA dynamics are not only implicit in sequence generation (since a sequence is generally defined as an ordered set of individual elements) and computationally powerful (Maass 2002) but are also observed in neocortex under the control of attention (Lee 1999) and can arise through biologically plausible network mechanisms involving lateral inhibition (Coultrip 1992). To simulate noise in the network, however, we generalize the WTA rule to a probabilistic one: ensembles with stronger inputs are more likely to "win", but will not necessarily do so. Specifically, an ensemble's chance of winning at a given time step is increased if (1) the ensemble is receiving a positive stimulus, (2) it is downstream of the previously active ensemble with a strong connection weight, or (3) it is in a hyperexcitable state. Finally, we implement activation-triggered hyperexcitability by moving an ensemble to its hyperexcitable state following its activation and allowing it to remain there for an extended number of time steps, after which its excitability returns to baseline. A summary of the model dynamics is depicted in **Figure 3A**, and the model is described in more detail in the *Methods* section.
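The update rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model specified in the *Methods* section: the softmax form of the probabilistic WTA rule, the additive combination of the three input sources, and all names and default values (`wta_step`, `stim_gain`, `w_gain`, `hx_gain`, `beta`, `hx_duration`) are hypothetical choices made for the example.

```python
import numpy as np

def wta_step(prev_active, hyper_timer, W, stim, rng,
             stim_gain=5.0, w_gain=2.0, hx_gain=1.0,
             beta=2.0, hx_duration=10):
    """One time step of an illustrative probabilistic winner-take-all network.

    prev_active : index of the ensemble active on the previous step, or None
    hyper_timer : int array; remaining time steps of hyperexcitability per ensemble
    W           : W[i, j] is the connection weight from ensemble j to ensemble i
    stim        : external stimulus to each ensemble
    rng         : numpy random Generator
    """
    n = len(hyper_timer)
    # (1) external stimulus
    drive = stim_gain * np.asarray(stim, dtype=float)
    # (2) input from the previously active ensemble
    if prev_active is not None:
        drive = drive + w_gain * W[:, prev_active]
    # (3) extra drive for ensembles currently in the hyperexcitable state
    drive = drive + hx_gain * (np.asarray(hyper_timer) > 0)

    # Softmax (an assumption): stronger drive makes winning more likely,
    # but never certain; beta sets the noise level.
    logits = beta * drive
    p = np.exp(logits - logits.max())
    p /= p.sum()
    winner = int(rng.choice(n, p=p))

    # Hyperexcitability decays back to baseline, except for the winner,
    # whose timer is (re)started.
    new_timer = np.maximum(np.asarray(hyper_timer) - 1, 0)
    new_timer[winner] = hx_duration
    return winner, new_timer
```

Calling `wta_step` repeatedly, first with a stimulus applied to each ensemble of a sequence in turn and then with a stimulus applied only to the first ensemble, gives the drive-then-trigger protocol used in the text. Large values of `beta` approach a deterministic WTA rule, and small values make the winner nearly random.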

### General computational properties of hyperexcitability-mediated sequence replay

To begin, we consider a network of ensembles organized in a "feed-forward" arrangement (the network is not strictly feed-forward because of the implicit recurrent inhibition) (**Figure 3B**). In a later section we will examine sequence replay in random recurrent networks. As in **Figure 1B**, but in discrete time, we can stimulate a sequence of ensembles, which moves them into their hyperexcitable state, and then stimulate just the first ensemble, which causes the rest of the ensembles in the previously stimulated sequence to follow (**Figure 3C**). If no trigger is provided, we observe the spontaneous replay of subsequences of the original sequence, beginning at a randomly chosen hyperexcitable ensemble and continuing until the end of the sequence (**Figure 3D**). Thus, the simplified model captures hyperexcitability-mediated sequence replay, the key property of the LIF network.

Parameter dependence in **Figure 3E**?

fig_3(...)

#### The capacity for reactivating multiple sequences depends on sequence overlap

**Figure 4A** shows an example of multiple sequences being reactivated in the simplified network model. From time steps 0 to 5 the activation sequence $[00, 11, 22, 33, 44]$ is driven by a strong stimulus sequentially applied to those ensembles. Then, from time steps 5 to 10 the activation sequence $[40, 51, 62, 73, 84]$ is driven by a new stimulus sequence. Subsequently, strong trigger stimuli are alternately applied to ensembles $00$ and $40$ every six time steps, which causes their respective sequences to reactivate with high probability, since there is no overlap in the ensembles or the connections that compose the two sequences. However, if the second driven sequence is instead $[40, 31, 22, 13, 04]$, which overlaps with the first sequence via ensemble $22$, the trigger stimuli applied to either $00$ or $40$ sometimes elicit the correct original sequences and sometimes elicit a combination of the two (**Figure 4B**). For example, the trigger stimulus to $00$ at time step $X$ causes activation of the sequence $[00, 11, 22, 13, 04]$. Thus, overlapping sequences lead to interference at the time of sequence replay.

#### Previous sequence activation influences sequential pattern completion in the presence of weak or distributed stimuli

One of the advantages of implementing a working memory task such as sequence replay in a neural network model is that it allows memory-based modulation of the computations performed by the network. One important computation that is intrinsically performed by our network is sequential pattern completion, that is, the mapping of an uncertain stimulus to a specific sequential activation pattern. Importantly, the way in which this computation is performed is dependent on the hyperexcitability states of the ensembles in the network. We demonstrate this feature of our model by considering the response of the network to a distributed stimulus sequence (time steps 5 - 10), each element of which consists of a strong stimulus distributed over two ensembles (10 and 20, 11 and 21, 12 and 22, etc.), after the network was previously driven (time steps 0 - 5) by one of two "focused" stimulus sequences (each of which stimulates only one ensemble per time step). As shown in **Figure 4C**, when the distributed stimulus is preceded by the focused stimulus and resulting activation sequence $[00, 11, 12, 13, 04]$, the network response to the distributed stimulus recruits ensembles $11$, $12$, and $13$. When the focused stimulus yields the activation sequence $[30, 21, 22, 23, 34]$, however, the network response to the distributed stimulus recruits ensembles $21$, $22$, and $23$. When viewed under the lens of pattern completion, the pattern completed by the network in response to the distributed stimulus is thus biased towards previous activation sequences, which is in alignment with the recent result that human perceptual decisions about a noisy visual stimulus were biased toward the stimulus that had been reported on the previous trial (St. John-Saaltink 2016).

#### Overriding of weakly probable sequences by accidental replay of highly probable sequences

### Hyperexcitability-mediated sequence replay in random recurrent networks

Since sequence replay depends on the connectivity of the network of ensembles, it is clear that not all sequential activation patterns can be replayed. Notably, in our results up until now we have only considered activation sequences that align with paths through networks with specially constructed connectivities. A reasonable question to ask is what capacity a network with random recurrent connectivity among its ensembles has for sequence replay.

To answer this question, we first define a sequence of ensembles within a network to be (uniquely) replayable if (1) the sequence traces out a path through the network and (2) no ensemble projects to more than one other ensemble in the sequence (except for the last ensemble, which can have unlimited projections). Intuitively, this means that starting at the initial ensemble there is only one network path through all the other ensembles in the sequence. We can then define the capacity $C_L(W)$ of a network with connectivity matrix $W$ to replay ensemble activation sequences of length $L$ as the number of replayable sequences of length $L$ that exist in the network. **Figure 5A** shows an example of a replayable and a nonreplayable sequence in a toy network. Notably, both an unconnected network and (for $L \geq 3$) a fully connected network have a replay capacity of zero: the unconnected network because there are no paths, and the fully connected network because every ensemble projects to all the others, so no set of ensembles specifies a unique sequence.
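For small networks the definition above can be checked directly by brute force. The following sketch is an illustration only: the function names `is_replayable` and `replay_capacity` are hypothetical, sequences are restricted to simple ones (no repeated ensembles), and the matrix convention (a nonzero `W[j, i]` means that ensemble `i` projects to ensemble `j`) is an assumption made for the example.

```python
from itertools import permutations
import numpy as np

def is_replayable(seq, W):
    """Check the two replayability conditions for a sequence of ensemble indices.

    Assumed convention: W[j, i] != 0 means ensemble i projects to ensemble j.
    """
    seq = list(seq)
    if len(set(seq)) != len(seq):
        return False  # only simple sequences are considered here
    # The last ensemble is exempt from condition (2), so skip it.
    for k, src in enumerate(seq[:-1]):
        # (1) the sequence must trace out a path through the network
        if not W[seq[k + 1], src]:
            return False
        # (2) src may project to at most one *other* ensemble in the sequence
        targets = [t for t in seq if t != src and W[t, src]]
        if len(targets) > 1:
            return False
    return True

def replay_capacity(W, L):
    """Count replayable simple sequences of length L (cost grows as N!/(N-L)!)."""
    n = W.shape[0]
    return sum(is_replayable(s, W) for s in permutations(range(n), L))

# Toy example: a three-ensemble chain 0 -> 1 -> 2
W_chain = np.zeros((3, 3))
W_chain[1, 0] = W_chain[2, 1] = 1
print(replay_capacity(W_chain, 3))
```

Because the enumeration visits every ordered selection of $L$ ensembles, this approach is only practical for toy networks, which is one reason to work with the expected capacity of a random network instead.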

#### Random recurrent networks of hyperexcitable ensembles can replay a large number of sequences

While determining the sequence replay capacity $C_L(W)$ of a network with an arbitrary connectivity matrix $W$ is in general quite challenging, due to the combinatorial nature of the path counting problem, the analysis is greatly simplified if we instead calculate the expected replay capacity of a random network. If ensemble labels are randomly assigned during the generation of the network, then the expected replay capacity (see Appendix A for derivation) for simple sequences is given by:

$$E\left[C_L(W)\right] = \cfrac{N!}{(N-L)!} p\left(\textrm{path exists and is replayable}\right)$$
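As a worked illustration of the probability term, consider one specific and simple case. Assume (this is an assumption of the example, not necessarily the setting of Appendix A) that each directed connection between two distinct ensembles exists independently with probability $q$. For a given ordered set of $L$ distinct ensembles, each of the first $L-1$ ensembles must project to its successor (probability $q$) and must not project to any of the remaining $L-2$ ensembles in the sequence (probability $(1-q)^{L-2}$), while the last ensemble is unconstrained. Since these conditions involve distinct connections, they are independent, giving:

$$p\left(\textrm{path exists and is replayable}\right) = \left[q(1-q)^{L-2}\right]^{L-1}$$

$$E\left[C_L(W)\right] = \cfrac{N!}{(N-L)!} \, q^{L-1}(1-q)^{(L-1)(L-2)}$$

Under this assumption the expression vanishes at $q = 0$ and, for $L \geq 3$, at $q = 1$, matching the unconnected and fully connected cases. For fixed $q$ and $L$ the prefactor $N!/(N-L)!$ grows as $N^L$.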

#### Stimulus decoding from replay improves when stimulus transitions are reflected in network connectivity

Simple decoding...

Replay off, but same connectivity -- improved sequence decoding

Simplified STDP allows replay to embed new stim sequences into connectivity

## Discussion

#### comparison to other mechanisms

* STDP (Szatmary and Izhikevich, VelizCuba)
* gating neurons, Conde-Sousa
* phase-based/slow ADP (Lisman 1995, Koene 2003)
* echo-state network trained to output delayed inputs (Jaeger 2014)

#### notes about this mechanism

* mechanism is not necessarily network implementation that we used, since predicts memory unit will have maxed out firing rates - other work (Crowe 2010) suggests that memory may indeed be mediated by dynamical sequences of neurons that have the same tuning; not hard to come up with simple model in which such a sequence could also increase excitability, but worth investigating how to do this without two separate populations (since growing body of evidence suggests that sensory memory can be decoded with same decoder as sensory perception)
* we discounted STDP earlier, but there are many short-term plasticity mechanisms -- these generally are not dependent on pre- and post-synaptic activity, though, and they often also have the effect of increasing excitability in groups of connected cells, so may have similar effective functional role (+ cholinergic-modulated after depolarization (Lisman 1995))
* repetition *suppression* also seen in data (tendency for neurons to respond more weakly to second stimulus presentation, generally thought to be because of adaptation) - thought to be indicative of predictive coding (since it's seen in tandem with enhancement, thought that prediction and prediction errors are both represented in the brain)
* one can also consider networks whose activation depends not just on last active ensemble, but on last two or three active ensembles, by introducing delay units or timescales
* reverse replay through specially connected networks and refractoriness (reverse replay could not be achieved even by strong short-term STDP)
* other replay model: gamma-locked phase buffer, order indicated by different levels of hyperexcitability
* assumption of stabilized receptive fields

#### computational implications

* one of the fundamental questions in theory of memory is how to bridge the single-neuron/synapse dynamic timescales (~ 100s of ms at max) with the timescales of synaptic plasticity (minutes to hours); this provides a way of doing so through persistent activity that has high memory capacity
* this is really an extremely simple way to have a lot of stable attractor states by way of compositionality (one of the original motivations for liquid state machines was the "unreasonable" number of attractors required for high memory capacity)
    * we've mentioned that the maxed out firing rates are not necessarily accurate, though; a fascinating open question is how compositional attractor states can be achieved for more Hopfield-like attractors (Rishi's paper?)
* also directly changes sequence probabilities, since no extra decoder is needed (as is normally needed to translate the state of an LSM, e.g., back into something meaningful)
* also allows investigation of interaction in neural networks between spatial representations and spatiotemporal sequences
* supports general framework for reconstruction of spatiotemporal pattern from static spatial representation and network connectivity
* relationship to replay of experienced paths vs novel trajectories through visited areas

#### relationship to other computations

* predictive coding over short timescales
* these networks, especially the random ones, cannot reproduce arbitrary sequences; however, if you allow multiple neurons to be tuned to one sequence element, can arrange architecture to get serial recall for arbitrary sequences, much like Botvinick 2007