# Context, Recurrence, and Sleep
Tamas's topic is sleep-dependent memory consolidation. His approach starts with the hypothesis that a variety of phenomena tied to sleep-dependent memory consolidation can be explained in terms of replay, and in turn that replay within a memory system can be modeled as recurrence within an instance-based architecture.

Previous models like McClelland's REMERGE already include recurrent similarity mechanisms for modeling memory effects, and recurrence within instance-based architectures isn't a novel idea. But in the work of his I've seen so far, he's at least qualitatively accounting for various effects across different tasks using MINERVA w/ recurrence. He's also reported having a wealth of EEG and behavioral data (maybe fMRI, too) that he hopes to relate to his models. And he's using a Bayesian technique to fit and evaluate the model that I think could be helpful to learn about.

But what do we have to add to this project? Beyond our experience developing and evaluating instance-based models and accounting for sequences of recall events, I think there's an interesting body of research exploring the possibility that temporal features have celebrity status when it comes to sleep-dependent consolidation. Howard and Kahana's 2005 TCM paper proposes a retrieved-context account of transitive inference, one of the tasks central to this literature, and Kumaran & McClelland note in their REMERGE paper that it offers a qualitatively distinct account of performance from the recurrence-based account they prefer. Lerner & Gluck (2019) report an extensive meta-analysis of research on sleep-dependent memory consolidation and find that sleep affects explicit detection of hidden temporal rules much more reliably than it does detection of "stationary" rules.

Since it composes the premises of retrieved context theory into an instance-based architecture, InstanceCMR is maybe uniquely suited for integrating TCM's account of relation extraction with recurrence-based accounts of how sleep influences memory. A composite account of sleep-dependent memory consolidation might be able to explain variance across experimental results in the literature more effectively than a more focused model. But I dunno!

## Tasks
In general, the focus in this research is on how sleep affects people's ability to extract hidden regularities from recently encoded stimuli. Across tasks though, there's a common thread:

> In these studies, participants are asked to perform a simple task, which can be easily accomplished by following a given set of instructions; however, unknown to participants, the stimuli in the task embed some hidden regularities that, if discovered (either implicitly or explicitly), can lead to a marked improvement in performance.

Across many influential studies using these tasks, sleep -- particularly slow-wave sleep (SWS) -- was found to "facilitate such incidental discovery more than simple time passed in wake".

Tamas focuses on three tasks: transitive inference, paired associate learning, and serial reaction time. Let's review these tasks and the main sleep effects, then circle back to paradigms across the broader literature examined in Lerner & Gluck's (2019) review.

## Transitive Inference: Sleep Facilitates Inference of Indirect Associations

Transitive inference is the ability to derive a relation like "Mary is taller than Kate" from the premises "Mary is taller than Ann" and "Ann is taller than Kate".

One highly cited study examining this capacity (Ellenbogen et al., 2007) uses this approach:

> On each trial, subjects were presented with a pair of abstract images and asked to choose between them, after which they received feedback for their choice. Through trial and error, subjects needed to discover which image in each pair should be preferred over the other. The images were chosen from 6 stimuli with a hidden rule governing the preferences hierarchy: A > B > C > D > E > F. Only adjacent pairs were presented during training (e.g., AB, BC, CD), in random order. At test, subjects needed to once again choose, without further feedback, the preferred stimuli from the learned pairs, but also from unlearned "inference" pairs (e.g., BD, CE, BE) for which the correct answer follows the same hierarchy rule (e.g., B > E). Results showed that sleep facilitates performance for the inference pairs in an implicit way (i.e., more correct answers than in the wake condition), but does not benefit explicit recognition of the hidden rule (feedback condition).

Another study had subjects learn associations between images in sets A and B, and between images in sets B and C, and then tested for indirect associations (A and C). More SWS during a nap correlated with stronger indirect associations.

Tamas modeled this by measuring cosine similarities between relevant echo representations.
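
To make that concrete for myself, here's a minimal sketch of comparing echoes by cosine similarity in a MINERVA-style memory. It is not Tamas's implementation: the one-hot item codes, the "pair trace = sum of its two item vectors" encoding, the use of cosine as the probe-trace similarity, and the cubic activation are all assumptions I picked to keep the toy deterministic.

```python
import math

ITEMS = "ABCDEF"  # hypothetical six-item hierarchy A > B > ... > F


def one_hot(item):
    """Localist item code (an assumption; real models use distributed features)."""
    return [1.0 if other == item else 0.0 for other in ITEMS]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def echo(probe, traces, power=3):
    """Sum of stored traces, each weighted by its similarity to the probe raised to `power`."""
    out = [0.0] * len(probe)
    for trace in traces:
        activation = cosine(probe, trace) ** power
        out = [o + activation * t for o, t in zip(out, trace)]
    return out


# Store only the adjacent premise pairs (AB, BC, CD, DE, EF).
traces = [
    [a + b for a, b in zip(one_hot(x), one_hot(y))]
    for x, y in zip(ITEMS, ITEMS[1:])
]

echoes = {item: echo(one_hot(item), traces) for item in ITEMS}
sim_bd = cosine(echoes["B"], echoes["D"])  # B and D share the neighbor C
sim_be = cosine(echoes["B"], echoes["E"])  # B and E share no neighbor
print(f"B-D echo similarity: {sim_bd:.3f}, B-E echo similarity: {sim_be:.3f}")
```

In this toy, a single retrieval links B and D (through their shared neighbor C) but leaves B and E unrelated, which is the gap a recurrent mechanism would have to close.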

![image.png](attachment:cc8fe445-0e3c-4e6d-8e5e-5c783a320e8e.png)

## Paired Associate: Sleep Protects Against Interference Between Memory Traces

Participants learn 60 paired associates (A-B) in two phases: first, study only; second, anticipation-plus-study, where a computer presents the A words and the participant gets feedback on their answer. After sleep (or just a delay), some participants do another study phase where they learn 20 A-C pairs, inducing retroactive interference for the A-B pairs. Then, after a short ten-minute delay, participants perform cued recall of the associated B and C words for each A cue.

Interference is less harmful for recall of A-B and A-C pairs after sleep!

![image.png](attachment:9de26210-40c1-4abe-935c-5620790916b3.png)

## Serial Reaction Time Task

Participants learn two 12-item sequences of button presses (A and B). On each trial, a visual cue appeared with a tone in one of four locations, corresponding to keys in the same configuration, and participants pressed the matching key as quickly as possible while minimizing errors. Cousins et al. replayed the tones associated with one learned sequence during slow-wave sleep for one group, and during wake for another. After waking, participants with the sleep-based reactivation demonstrated greater explicit knowledge (p = 0.005) and more improved procedural skill (p = 0.04) for the cued sequence relative to the uncued sequence.

To model reaction time, they took an idea from the iterative resonance model. The exponent applied to activation is set to 1 at time 1 and then increased with each timestep, which exaggerates differences in the similarity of the probe to each trace. Reaction time is the number of timesteps it takes for the model to retrieve a trace that exceeds its decision threshold.
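
Here's a rough sketch of how I read that procedure. The details are my assumptions, not theirs: I take similarities to lie in (0, 1], raise them to an exponent equal to the timestep, and compare the best trace's share of total activation against a threshold (without some normalization like this, raising values below 1 to larger powers would only shrink them). The function name and default threshold are made up for illustration.

```python
def steps_to_threshold(similarities, threshold=0.8, max_steps=50):
    """Count timesteps until one trace dominates retrieval.

    similarities: probe-to-trace similarities, assumed to lie in (0, 1].
    Returns the first timestep at which the best trace's share of total
    activation reaches `threshold`, or None if that never happens.
    """
    for step in range(1, max_steps + 1):
        # The exponent starts at 1 and grows by one per timestep.
        activations = [s ** step for s in similarities]
        total = sum(activations)
        if total == 0:
            return None
        if max(activations) / total >= threshold:
            return step
    return None


easy = steps_to_threshold([0.9, 0.6, 0.5])  # one clear best match
hard = steps_to_threshold([0.9, 0.8, 0.5])  # a close competitor
print(easy, hard)
```

The closer the runner-up trace is to the best match, the more timesteps the best match needs to dominate, so the step count behaves like a reaction time.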

This could prove an interesting way to model reaction times in free recall!

![image.png](attachment:1b6e7e11-be32-4d0c-a23c-995221d6c0cd.png)

### Number reduction task
Not a focus of Andrei's modeling, but apparently the most cited paradigm used to suggest that sleep inspires insight, according to Lerner & Gluck.

> Subjects perform computations on a series of digit pairs in succession. For each pair (comprised of the digits 1, 4, and 9), they need to produce a third digit based on a simple pre-taught rule. In each trial, eight digits are presented, and subjects are required to go over them serially by first applying the rule to the first two digits; then applying the rule to their response together with the third digit; then to their new response and the fourth digit, and so on. Subjects thus produce a total of seven digits one after the other throughout each trial by continually employing the rule, with the final digit considered the ultimate answer for that trial. Subjects are told, however, that if they happen to realize what the last digit will be before having gone through all seven computations, they can respond with that answer immediately and end the trial early. Indeed, unrevealed to the subjects, there is a hidden rule that governs the required responses and which, if detected, allows the subjects to predict the last digit prematurely: The inputs are organized such that, for a given trial, the last three required responses always mirror the preceding three responses (e.g., 4, 9, 4, 1, 1, 4, 9). If subjects recognize this regularity, they can predict the final answer for the trial as soon as they compute the second response and thus considerably reduce their RT for that trial. Studying the effects of sleep on performance, it is regularly found that sleep dramatically increases the probability of subjects explicitly discovering the hidden rule (evident by both a large decrease in RTs and in stating the relation between the 2nd and 7th response in a follow-up questionnaire [9,31]), with some studies linking the effect specifically to SWS [12,31]. Implicit effects of sleep (i.e., gradual reduction of RTs to each of the three predictable responses before the insight occurs), in contrast, are rarely found.
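
A small worked example helps me see the hidden regularity. The quoted passage doesn't spell out the pre-taught rule, so this sketch assumes the "same/different" rule usually described for this task (two identical digits give that digit; two different digits give the remaining third digit). The stimulus string below is one I constructed to reproduce the example responses in the quote, not one taken from a study.

```python
DIGITS = {1, 4, 9}


def apply_rule(a, b):
    """Assumed pre-taught rule: same digits -> that digit; different -> the third digit."""
    if a == b:
        return a
    (third,) = DIGITS - {a, b}
    return third


def responses(stimulus):
    """Apply the rule serially: first to digits 1 and 2, then to each response and the next digit."""
    out = [apply_rule(stimulus[0], stimulus[1])]
    for digit in stimulus[2:]:
        out.append(apply_rule(out[-1], digit))
    return out


def follows_hidden_rule(resp):
    """Hidden regularity: responses 5-7 mirror responses 2-4."""
    return resp[4:7] == resp[1:4][::-1]


stimulus = [1, 9, 1, 1, 9, 1, 9, 1]  # eight digits, hand-constructed
resp = responses(stimulus)
print(resp, follows_hidden_rule(resp), resp[-1] == resp[1])
```

Because of the mirroring, the final answer always equals the second response, which is why noticing the rule lets subjects end the trial early.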

### Other tasks

- **Artificial grammar learning**. Subjects are exposed to sentences made of gibberish words or syllables. A hidden grammatical rule governs these sentences and restricts the order of words such that not all possible combinations are allowed (e.g., 3-word sentences in which the first word always determines the identity of the third word). Learning effects are generally found implicitly but not explicitly.

- **Statistical learning**. Weather Prediction Task. Subjects are presented with abstract images and asked to learn, by trial and error, whether they predict Sun or Rain. Various combinations of 1, 2 or 3 images (out of a possible 4) are displayed on different trials, with a complex and probabilistic relation linking each combination to the correct answer. Subjects can improve performance above chance even without fully realizing the complex rule, by developing simple strategies that take only some of the images into consideration.

- **Information-integration**. Subjects are exposed to a set of stimuli differing on two dimensions, both of which could be visual (e.g., a grating pattern differing on orientation and frequency), or one auditory and one visual (e.g., location of an image and an accompanying tone). The stimuli are divided into two groups based on a linear decision bound in the 2D stimulus space, such that information from both dimensions must be taken into consideration simultaneously for optimal performance. One study [30] found sleep enhances categorization performance. A second study [51] found sleep did not enhance performance immediately, but enhanced the effects of retraining on the same rule following sleep. An explicit test of rule knowledge (using a generation task) showed no sleep effects. A third study [29] found no facilitatory effects of sleep at all.

- **Generalization of categorical learning**. Subjects learn to classify a group of exemplars into two or more categories based on instructions or through trial and error, and are subsequently tested on their knowledge of the categories when required to classify new exemplars, or the never-seen category prototypes, without feedback. Stimuli vary greatly, and results were highly polarized.

## Patterns Across Literature
- Overall, findings tended to be replicated across studies that used the same task. 
- No simple effects of experimental design.
- Sleep facilitates explicit detection of temporal, not stationary, regularities (NRT, SRTT -- not transitive inference?)
- A facilitatory effect of sleep on implicit detection of a hidden regularity was common to all paradigms.

The common theme for both the NRT and SRTT is the use of a hidden regularity with a temporal (or sequential) nature: event x at time t predicts event y happening a few seconds later. The only other task to show a sleep-related effect on explicit detection of a hidden rule, the surveillance task, employed a similar type of regularity.

(Knocks the idea that temporal contextual relations might be more explicit, as TCM has already been applied to model transitive inference.)

## Proposed Accounts

### Temporal Scaffolding Hypothesis
When regularities have a temporal nature that depends on information occurring over several seconds or more, the typical timescale of Hebbian mechanisms (approximately 50–200 ms; [73]) may not be sufficient to create the necessary associations in real time. But one critical feature of memory replay in the hippocampus during SWS is that it does not occur at the same rate as the original experience; in fact, it is time-compressed, by a factor of up to 20 relative to the original speed. Replay could therefore bring events that were seconds apart close enough together for Hebbian mechanisms to associate them.
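
Just to check the arithmetic behind this idea, here's a back-of-the-envelope sketch. The 200 ms ceiling and the 20x compression come from the figures above; the 3-second gap between events is a made-up example, and the function names are mine.

```python
HEBBIAN_MAX_MS = 200      # rough upper end of the Hebbian timescale cited above
COMPRESSION_FACTOR = 20   # upper end of reported replay compression


def replayed_gap_ms(gap_s, compression=COMPRESSION_FACTOR):
    """Gap between two events (in ms) once replay compresses it."""
    return gap_s * 1000 / compression


def within_hebbian_window(gap_ms, max_ms=HEBBIAN_MAX_MS):
    return gap_ms <= max_ms


awake_gap_ms = 3 * 1000             # hypothetical: event y follows event x by 3 s
asleep_gap_ms = replayed_gap_ms(3)  # the same gap during compressed replay

print(within_hebbian_window(awake_gap_ms))   # too far apart in real time
print(within_hebbian_window(asleep_gap_ms))  # close enough during replay
```

Under these assumptions, only gaps of up to about 4 seconds would fit inside the window even at maximal compression.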

It is, however, important to note that since the evidence for time-compressed memory replay almost exclusively relies on rodent research, the temporal scaffolding model remains speculative until further corroboration from human studies.

### Active System Consolidation
Hippocampal memories are reactivated during SWS in coordination with cortical activity (memory replay). This account suggests that memory reactivation supports the transformation of hippocampally-dependent episodic memories into cortically-dependent semantic ones [18]. Through this process, regularities embedded within the encoded memories are slowly extracted, avoiding catastrophic interference, and then distributed within existing knowledge structures for long-term memory storage [18,19].

### Signal-Boosting
Based on data suggesting that SWS leads to a net reduction in synaptic strength within the hippocampus and cortex, sleep may act to maintain stable levels of synaptic strength (known as synaptic homeostasis) by reducing and even eliminating excessive connectivity created during wake. Such homeostasis has the potential to improve signal to noise ratio and maintain the important common aspects of memories while reducing the salience of their less relevant idiosyncratic features, thus creating generalized representations of the individual experiences.

> This article proposes a mechanism by which the reactivation of newly learned memories during sleep could actively underpin both schema formation and the addition of new knowledge to existing schemata. Under this model, the overlapping replay of related memories selectively strengthens shared elements. Repeated reactivation of memories in different combinations progressively builds schematic representations of the relationships between stimuli. We argue that this selective strengthening forms the basis of cognitive abstraction, and explain how it facilitates insight and false memory formation.

## Computational Mechanisms?

### Recurrent Similarity Computation
REMERGE proposes recurrent similarity computation, where two-way excitatory connections between the feature and conjunctive layers of a neural network enable recurrent reactivation that facilitates efficient discovery of higher-order relationships. It was originally developed to account for generalization.

Cited limitations: no learning algorithm is proposed for how the conjunctive layer is formed, and it's not clear how to model tasks where a priori conjunctive nodes aren't obvious.

![image.png](attachment:2deed4cf-98b5-4003-b1c8-f0a3d760ea15.png)

Tamas et al. show the same mechanisms work fine when implemented in MINERVA, where the echo retrieved by a probe is iteratively used as a probe to "sharpen" the echo, progressively retrieving a smaller and smaller subset of memories in a way that improves retrieval. How? Protects against interference? Do we store echoes as traces or what?
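
Here's a toy version of the "echo becomes the next probe" loop, so I can poke at what recurrence does. It's a sketch under my own assumptions (one-hot items, pair traces formed by summing item vectors, cosine similarity, cubic activation, no storing of echoes as new traces), not a reconstruction of their model.

```python
import math

ITEMS = "ABCDEF"  # hypothetical hierarchy; only adjacent pairs are stored


def one_hot(item):
    return [1.0 if other == item else 0.0 for other in ITEMS]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def echo(probe, traces, power=3):
    """One retrieval: traces summed with weights cosine(probe, trace) ** power."""
    out = [0.0] * len(probe)
    for trace in traces:
        activation = cosine(probe, trace) ** power
        out = [o + activation * t for o, t in zip(out, trace)]
    return out


def recurrent_echo(probe, traces, steps, power=3):
    """Feed each echo back in as the next probe; steps=1 is a single retrieval."""
    current = probe
    for _ in range(steps):
        current = echo(current, traces, power)
    return current


traces = [
    [a + b for a, b in zip(one_hot(x), one_hot(y))]
    for x, y in zip(ITEMS, ITEMS[1:])
]

b, e = one_hot("B"), one_hot("E")
sim_single = cosine(recurrent_echo(b, traces, 1), recurrent_echo(e, traces, 1))
sim_recurrent = cosine(recurrent_echo(b, traces, 3), recurrent_echo(e, traces, 3))
print(f"B-E similarity, 1 step: {sim_single:.3f}; 3 steps: {sim_recurrent:.3f}")
```

In this toy, B and E look unrelated after one retrieval but pick up overlap once the echo is fed back in, since each pass reaches traces one more link along the chain. Whether the loop sharpens or spreads the echo presumably depends on choices like the exponent, which I haven't explored here.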

![image.png](attachment:d49e6f05-b881-4ed7-9a1b-bf9b01d2744e.png)

Pulls replay and "sharpening" together into one neat mechanism!

### TCM - Items are tagged w/ contextual codes/associations
TCM argues that contextual states, rather than item–item associations, act as the primary cues for recall of items. As such, items are retrieved as a function of their similarity to the current state of context, which in turn is influenced by both the items themselves and a general tendency to drift over time. **Common features due to temporal co-occurrence enable generalization.**
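
A quick sketch of the drift idea, using the standard TCM-style update where the new context is a weighted blend of the old context and the current item's input, rescaled to stay unit length. The one-hot item inputs, the start-of-list context unit, and beta = 0.5 are simplifying assumptions of mine.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def drift(context, c_in, beta):
    """Update context toward c_in; rho is chosen so the result stays unit length."""
    overlap = dot(context, c_in)
    rho = math.sqrt(1 + beta ** 2 * (overlap ** 2 - 1)) - beta * overlap
    return [rho * c + beta * x for c, x in zip(context, c_in)]


n_items = 5
dim = n_items + 1          # one unit per item plus a start-of-list unit
context = [0.0] * dim
context[-1] = 1.0          # hypothetical start-of-list context

contexts = []              # context state associated with each studied item
for i in range(n_items):
    c_in = [0.0] * dim
    c_in[i] = 1.0          # each item contributes its own (orthogonal) input
    context = drift(context, c_in, beta=0.5)
    contexts.append(context)

lag1 = dot(contexts[0], contexts[1])  # neighboring study positions
lag3 = dot(contexts[0], contexts[3])  # positions three apart
print(f"context similarity at lag 1: {lag1:.3f}, at lag 3: {lag3:.3f}")
```

Items studied close together end up tagged with similar contexts even if they never appear together, which is the kind of shared feature that could support generalization across indirectly related items.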

Don't think Howard & Kahana evaluate this idea very much, at least not in their 2005 paper, but it's definitely very reasonable given our work on CMR and other work I've seen on recognition. 

But then what does this have to do with sleep? I suspect composing two qualitatively distinct mechanisms within one model could prove more robust than models encoding just one, and maybe obtain flexibility to account for discrepancies between task outcomes reviewed in Lerner & Gluck's meta-analysis (we could suppose one mechanism might prove more influential in some tasks than others). 

