# SYDE 556/750 --- Assignment 5

## Due Date: Dec 4, 2023


**Student ID: 20823934**

*Note:* Please include your numerical student ID only, do *not* include your name.

*Note:* Unlike assignments 1-4, for this assignment the full instructions (including some hints) are in this file.  The cells you need to fill out are marked with a "writing hand" symbol. Of course, you can add new cells in between the instructions, but please leave the instructions intact to facilitate marking.

- This assignment is worth 30 marks (30% of the final grade). The number of marks for each question is indicated in brackets to the left of each question.

- Clearly label any plot you produce, including the axes. Provide a legend if there are multiple lines in the same plot.

- You won’t be judged on the quality of your code.

- All questions use the nengo default of Leaky Integrate-and-Fire neurons with the default parameter settings (`tau_rc=0.02` and `tau_ref=0.002`).

- Make sure to execute the Jupyter command “Restart Kernel and Run All Cells” before submitting your solutions. You will lose marks if your code fails to run or produces results that differ significantly from what you’ve submitted.

- Rename the completed notebook to `syde556_assignment_05_<STUDENT ID>.ipynb` and submit it via email to the TA (Ben Masters <bmasters@uwaterloo.ca>). The deadline is at 23:59 EST on Dec 4, 2023.

- There is a late penalty of one mark per day late. Please contact celiasmith@uwaterloo.ca if there are extenuating circumstances.

- **For this assignment, you must use [Nengo](https://www.nengo.ai/getting-started/).** Feel free to look through the examples folder and/or the tutorials on the Nengo website before doing this assignment.



In [None]:
# Import numpy and matplotlib
import numpy as np
import matplotlib.pyplot as plt

import nengo

# Fix the numpy random seed for reproducible results
np.random.seed(18945)

# Some formatting options
%config InlineBackend.figure_formats = ['svg']

# 1. Building an Accumulate-to-Threshold Decision Making Model

One standard account for how brains make simple decision-making tasks is that they gradually accumulate evidence for or against something, and when that evidence hits some threshold, a decision is made.  This sort of model is used to account for the fact that people take longer to make decisions when the evidence is weak.

If you want more background on this, https://www.jneurosci.org/content/34/42/13870 gives a decent overview, but this diagram shows a high-level overview:

![](https://www.jneurosci.org/content/jneuro/34/42/13870/F1.large.jpg)

We're going to make a model of this process. It will make its choice based on a single input value, which gives some evidence as to which choice should be made.  It will indicate a choice by outputting either a 1 or a -1.  If that input evidence is positive, it will be more likely to make the first choice (outputting a 1), and if the input evidence is negative it will be more likely to make the second choice (outputting a -1).

*TIP: The Nengo GUI built-in tutorials 10 through 18 may be useful to give you an overview of different recurrent systems and different ways of modifying ```Ensembles```.*



**a) Accumulation. [2 marks]** Start by building a recurrent system that can add up evidence over time (the accumulator or integrator).  This is a neural ```Ensemble``` that holds a single dimension, and uses a small number of neurons (50).  Provide it with one input ```Node``` that has a constant value of ```[0.1]``` and connect that input into the ```Ensemble``` with a ```Connection```.  Now make a ```Connection``` from the ```Ensemble``` back to itself that computes the identity function.  Since this ```Connection``` is accumulating evidence over time, we want it to be fairly stable, so set ```synapse=0.1``` on this ```Connection``` (leave the other `Connection` at its default value).  This means that the neurotransmitter being used will spread out over 100ms, rather than the default 5ms.

If you run the above system with the constant positive input of 0.1 as noted above, the value stored in the accumulator should gradually increase until it hits 1 (this should take about 1 second of simulated time).  If you change the input to be -0.1, it should gradually decrease until it hits -1.

Make a single plot that shows the behaviour of the model for four different inputs: 0.2, 0.1, -0.1, and -0.2.  For each input, run the model for 2 seconds (`sim.run(2)`) and plot the value stored in the accumulator `Ensemble`.  Use a `Probe` synapse of 0.01 to get the stored value.

In [None]:
def q1a(const_inputs=[0.2, 0.1, -0.1, -0.2], run_time=2, seed=0):
    np.random.seed(seed)
    
    for const_input in const_inputs:
        model = nengo.Network()
        with model:
            a = nengo.Ensemble(n_neurons=50, dimensions=1)  # Single dimension and small number of neurons
            input = nengo.Node(const_input)
            
            nengo.Connection(pre=input, post=a)             # Input connection
            nengo.Connection(pre=a, post=a, synapse=0.1)    # Recurrent connection
                    
            a_probe = nengo.Probe(a, synapse=0.01)          # Probe the value of the ensemble

        # Run the simulation
        with nengo.Simulator(model) as sim:
            sim.run(run_time)

        # Plot the results
        plt.plot(sim.trange(), sim.data[a_probe], label=f"Input = {const_input}")
        
    plt.title("Accumulator for Various Constant Inputs")
    plt.xlabel("Time (s)")
    plt.ylabel("Value")
    plt.legend()
    plt.show()
    
q1a()

**b) Accumulator Discussion. [1 mark]** What is the mathematical computation being performed here (i.e. what is the relationship between the input and the output)?  Why does the value stop increasing (or decreasing) when it hits +1 (or -1)?

The ensemble acts as an integrator. The recurrent connection feeds the decoded value back through a synapse with $\tau = 0.1$, and the input connection is not scaled by $\tau$, so the dynamics are approximately $\frac{\mathrm{d}x(t)}{\mathrm{d}t} = \frac{u}{\tau} = 10u$ rather than $\frac{\mathrm{d}x(t)}{\mathrm{d}t} = u$. For an input of 0.1 the slope is about 1 per second, so the output reaches 1 after roughly 1 second; for an input of 0.2 the slope doubles and the output reaches 1 after roughly 0.5 seconds. The value stops increasing or decreasing at a magnitude of about 1 because the ensemble has a radius of 1: the neurons are tuned to represent values in $[-1, 1]$ and saturate beyond that range, so the feedback can no longer grow. Increasing the radius allows the output to saturate at a higher value.
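
The ramp-and-saturate behaviour described above can be reproduced with a rate-level sketch that involves no neurons. It assumes the idealized dynamics $\tau \dot{x} = -x + \mathrm{clip}(x) + u$, where clipping to the radius is a crude stand-in for neural saturation; the function name `time_to_reach` and the Euler step `dt` are illustrative choices, not part of the Nengo model.

```python
def time_to_reach(u, level=1.0, tau=0.1, radius=1.0, dt=0.001, t_max=2.0):
    """Euler-integrate tau*dx/dt = -x + clip(x, -radius, radius) + u.

    Returns the first time |x| reaches `level`, or None if it never does.
    The clip is an idealized stand-in for neurons saturating at the radius.
    """
    x = 0.0
    for k in range(int(round(t_max / dt))):
        feedback = max(-radius, min(radius, x))  # decoded value fed back
        x += dt * (-x + feedback + u) / tau
        if abs(x) >= level:
            return (k + 1) * dt
    return None

for u in (0.2, 0.1, -0.1, -0.2):
    print(f"u = {u:+.1f}: |x| reaches 1 at about t = {time_to_reach(u):.2f} s")
```

In this idealization, doubling the input halves the time needed to reach a given level, and the value settles just above the radius instead of growing without bound.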

**c) Adding random noise to the neurons. [1 mark]** Next, we can add randomness to the neurons.  In standard (non-neural) accumulator models, there is a "random-walk" component that randomly varies the value being accumulated.  We can model this by adding random noise into the ```Ensemble```, which means adding random current to each of the neurons.  The command for this is:

```python
acc.noise = nengo.processes.WhiteSignal(period=10, high=100, rms=1)
```

(where ```acc``` is whatever name you gave your accumulator ```Ensemble```.)

The strength of this noise is set by the ```rms=1``` parameter.  Generate the same plot as in part (a) but with the noise `rms=1`.  Also generate the same plot for `rms=3`, `rms=5`, and `rms=10`.  What happens to the resulting output?

In [None]:
def q1c(const_inputs=[0.2, 0.1, -0.1, -0.2], rms=1, run_time=2, seed=0):
    np.random.seed(seed)
    
    for const_input in const_inputs:
        model = nengo.Network()
        with model:
            a = nengo.Ensemble(n_neurons=50, dimensions=1)  # Single dimension and small number of neurons
            a.noise = nengo.processes.WhiteSignal(period=10, high=100, rms=rms)   # Add noise
            input = nengo.Node(const_input)
            
            nengo.Connection(pre=input, post=a)             # Input connection
            nengo.Connection(pre=a, post=a, synapse=0.1)    # Recurrent connection
                    
            a_probe = nengo.Probe(a, synapse=0.01)          # Probe the value of the ensemble

        # Run the simulation
        with nengo.Simulator(model) as sim:
            sim.run(run_time)

        # Plot the results
        plt.plot(sim.trange(), sim.data[a_probe], label=f"Input = {const_input}")
        
    plt.title(f"Accumulator for RMS = {rms}")
    plt.xlabel("Time (s)")
    plt.ylabel("Value")
    plt.legend()
    plt.show()
    
q1c(rms=1)
q1c(rms=3)
q1c(rms=5)
q1c(rms=10)

The following discussion focuses on $input = 0.1$, but a similar logic can be applied to the other input values.

When `rms = 1`, the signal still reaches a value of 1 near `t = 1s` and saturates thereafter. As `rms` increases, the amplitude of the high-frequency noise also increases and the signal no longer saturates within 2 seconds. At `rms = 10`, it is difficult to tell what the integrator output is supposed to represent. Increasing `rms` lowers the quality of the output because the noisy neurons distort the recurrent connection and lead to inaccurate integration. In general, increasing `rms` prevents the output from saturating and makes it vary widely between random seeds.
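
For comparison with the standard (non-neural) accumulator mentioned in part (c), a random-walk version can be sketched directly. It rests on several assumptions: the noise here is added to the accumulated value itself (scaled by a hypothetical `sigma`), whereas in the Nengo model the noise is current injected into each neuron, so the two noise levels are not directly comparable. The threshold of 0.9 and the function name are also choices made for this illustration.

```python
import numpy as np

def first_passage_times(u, sigma, n_trials=200, threshold=0.9,
                        tau=0.1, dt=0.001, t_max=2.0, seed=0):
    """Simulate x[k+1] = x[k] + dt*u/tau + sigma*sqrt(dt)*noise for many trials.

    Returns the time at which each trial first has |x| >= threshold
    (NaN for trials that never cross within t_max).
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(t_max / dt))
    noise = rng.standard_normal((n_trials, n_steps))
    steps = dt * u / tau + sigma * np.sqrt(dt) * noise
    x = np.cumsum(steps, axis=1)                 # accumulated evidence per trial
    crossed = np.abs(x) >= threshold
    first = np.argmax(crossed, axis=1)           # index of the first crossing
    times = (first + 1) * dt
    times[~crossed.any(axis=1)] = np.nan         # trials with no crossing
    return times

for sigma in (0.0, 0.5, 1.0):
    t = first_passage_times(u=0.1, sigma=sigma)
    print(f"sigma = {sigma}: mean = {np.nanmean(t):.2f} s, std = {np.nanstd(t):.2f} s")
```

Without noise every trial crosses the threshold at the same time; with noise the crossing times spread out, which mirrors the seed-to-seed variability seen in the plots.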

**e) Adding decision-making. [2 marks]** To complete the basic model, we want to determine when this accumulator passes some threshold.  If the value becomes large enough, we should make one choice (+1), and if it becomes small enough we should make the other choice (-1).  To achieve this, make a new output ```Ensemble``` that is also one-dimensional and has 50 neurons.  Form a ```Connection``` from the accumulator to this new ```Ensemble``` that computes the following function:

```python
def choice(x):
    if x[0] > 0.9:
        return 1
    elif x[0] < -0.9:
        return -1
    else: 
        return 0
```

This new output should now stay at zero until the accumulator value gets large enough, and then quickly move to +1 or -1.

Build this model and plot the output of both the accumulator `Ensemble` and the decision-making `Ensemble`.  Use a noise `rms=3` and for both `Probe`s use a synapse of 0.01.  Do this for all four input values (0.2, 0.1, -0.1, and -0.2).

How well does the system perform?  Does it make decisions faster when there is stronger evidence?  What differences are there (if any) between the computation we are asking the system to perform and the actual result?

*TIP: try running the model a few times to see the variability in the output*

In [None]:
def q1e(const_inputs=[0.2, 0.1, -0.1, -0.2], run_time=2, seed=0):
    np.random.seed(seed)
    
    def choice(x):
        if x[0] > 0.9:
            return 1
        elif x[0] < -0.9:
            return -1
        else: 
            return 0
    
    fig, axs = plt.subplots(1, 2, figsize=(12, 5), sharey=True)
    
    for const_input in const_inputs:
        model = nengo.Network()
        with model:
            input = nengo.Node(const_input)
            a = nengo.Ensemble(n_neurons=50, dimensions=1)
            a.noise = nengo.processes.WhiteSignal(period=10, high=100, rms=3)   # Use rms=3
            b = nengo.Ensemble(n_neurons=50, dimensions=1)  # New output Ensemble that is also one-dimensional with 50 neurons
            
            nengo.Connection(pre=input, post=a)
            nengo.Connection(pre=a, post=a, synapse=0.1)
            nengo.Connection(pre=a, post=b, synapse=0.1, function=choice)  # Connect the accumulator to the new Ensemble
                    
            a_probe = nengo.Probe(a, synapse=0.01)  # For both probes use a synapse of 0.01
            b_probe = nengo.Probe(b, synapse=0.01)

        # Run the simulation
        with nengo.Simulator(model) as sim:
            sim.run(run_time)

        # Plot the results
        axs[0].plot(sim.trange(), sim.data[a_probe], label=f"Input = {const_input}")
        axs[1].plot(sim.trange(), sim.data[b_probe], label=f"Input = {const_input}")
        
    print(f"Seed = {seed}")
    axs[0].set_title("Accumulator")
    axs[0].set_xlabel("Time (s)")
    axs[0].set_ylabel("Value")
    axs[0].legend()
    
    axs[1].set_title("Decision Maker")
    axs[1].set_xlabel("Time (s)")
    axs[1].legend()
    plt.show()
    
q1e(seed=0)
q1e(seed=4)

The system performs well in that it's able to make the decision for `+1` or `-1` around the time that the accumulator reaches the threshold value of `+0.9` and `-0.9`. However, the exact time the decision is made and the decision value vary greatly depending on the random seed. For `seed = 0`, the late thresholds have higher values than the early thresholds. However for `seed = 4` the accumulator for `input = 0.1` hovers around 0.9, and the noise causes the value to be even lower sometimes. As a result, the decision output hovers around 0.5 as the accumulator can't consistently stay above `0.9`.

The system does make decisions faster when there is stronger evidence. As seen in the first graph (`seed = 0`), the first set of decisions settles at around 0.75s, while the second set settles at around 1.5s. This shows that evidence twice as strong elicits a response twice as fast. 

Aside from the noisy output, there is one major difference between what we're asking the system to compute and its actual output. The `choice()` function is supposed to output a value of exactly `+1` or `-1` in response to the accumulator value, but the actual outputs in the graph above are closer to `1.1`, `0.8`, `-0.9`, and `-1`. Although it's easy for the reader to know what the value is supposed to be, such errors can propagate through the computation if additional decisions depend on these outputs. This highlights the importance of having highly distinguishable outputs when noise is involved.

**f) Combining Ensembles. [2 marks]** An alternative implementation would be to combine the two separate 1-dimensional `Ensembles` into one 2-dimensional `Ensemble`.  The Connections are made similarly as in the original model, but they need to target the particular dimensions involved using the ```ens[0]``` and ```ens[1]``` syntax.  Try building the model this way and plot the results.  Do this for a single `Ensemble` with 100 neurons (the same number as the total number of neurons in the original model) and with 500 neurons.  Also, be sure to increase the `radius` as would be appropriate in order to produce values like what we had in the original model, where the accumulator might be storing a 1 and the output might be a 1.

How does combining Ensembles in this way change the performance of the system?  

When the Ensembles are combined together in this way, what are we changing about the biological claims about the model?  In particular, how might we determine whether the real biological system has these as separate `Ensembles` or combined together?

In [None]:
def q1f(const_inputs=[0.2, 0.1, -0.1, -0.2], n_neurons=100, run_time=2, seed=0):
    np.random.seed(seed)
    
    def choice(x):
        if x[0] > 0.9:
            return 1
        elif x[0] < -0.9:
            return -1
        else: 
            return 0
    
    fig, axs = plt.subplots(1, 2, figsize=(12, 5), sharey=True)
    
    for const_input in const_inputs:
        model = nengo.Network()
        with model:
            input = nengo.Node(const_input)
            a = nengo.Ensemble(n_neurons=n_neurons, dimensions=2, radius=2)
            a.noise = nengo.processes.WhiteSignal(period=10, high=100, rms=3)
            
            # Change all a to a[0] and all b to a[1]
            nengo.Connection(pre=input, post=a[0])
            nengo.Connection(pre=a[0], post=a[0], synapse=0.1)                  # Accumulator
            nengo.Connection(pre=a[0], post=a[1], synapse=0.1, function=choice) # Choice
                    
            a_probe_acc = nengo.Probe(a[0], synapse=0.01)
            a_probe_dec = nengo.Probe(a[1], synapse=0.01)

        # Run the simulation
        with nengo.Simulator(model) as sim:
            sim.run(run_time)

        # Plot the results
        axs[0].plot(sim.trange(), sim.data[a_probe_acc], label=f"Input = {const_input}")
        axs[1].plot(sim.trange(), sim.data[a_probe_dec], label=f"Input = {const_input}")
        
    print(f"Number of neurons = {n_neurons}")
    axs[0].set_title("Accumulator")
    axs[0].set_xlabel("Time (s)")
    axs[0].set_ylabel("Value")
    axs[0].legend()
    
    axs[1].set_title("Decision Maker")
    axs[1].set_xlabel("Time (s)")
    axs[1].legend()
    plt.show()
    
q1f(n_neurons=100)
q1f(n_neurons=500)

Compared to before, the accumulator still grows at a similar speed but saturates at 2 instead of 1 since the radius is 2. When the input has magnitude `0.2`, the accumulator still reaches a magnitude of 1 around `0.5s`. When the input has magnitude `0.1`, the accumulator still reaches a magnitude of 1 near `1.5s`.
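
The radius choice can be checked with a short calculation. A 2-dimensional ensemble represents points inside a circle of the given radius, so holding an accumulator value of 1 and a decision value of 1 at the same time requires a radius of at least $\sqrt{2} \approx 1.41$. The helper name `min_radius` below is illustrative only.

```python
import math

def min_radius(*component_maxima):
    """Smallest radius whose hypersphere contains the corner point
    where every dimension is at its maximum magnitude."""
    return math.sqrt(sum(m * m for m in component_maxima))

needed = min_radius(1.0, 1.0)   # accumulator at 1 and decision at 1
print(f"minimum radius: {needed:.3f}")
print(f"radius of 2 is sufficient: {needed <= 2.0}")
```

A radius of 2 therefore leaves headroom, at the cost of spreading the same neurons over a larger area of the state space.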

However, the decision output is a lot worse than before. Previously, the decision-maker remained at 0 until the target value was reached, then quickly went to 1 and stayed there. Now, the decision-maker begins increasing or decreasing before the threshold has been reached, and also takes more time to reach its final value. This means it has less temporal accuracy due to dimensional cross-talk and representation overlap. Fortunately, the decision output remains within the range of $\pm 1$ as intended. These observations are similar for different random seeds.

When the ensemble is implemented this way, we're claiming that decisions made by biological neurons are a graded response influenced by the strength of the input at each time step, instead of a binary output based on a set threshold. We can determine whether biological systems have separate or combined ensembles by seeing whether the decision begins changing as soon as the accumulator has some value, or only after a certain threshold is reached. In the brain, having separate ensembles for decision-making indicates localized, specialized processing (Q1e), while using the same ensemble to both process inputs and make decisions indicates distributed processing (Q1f).

**g) Improving Representation [2 marks].** Returning to the original implementation from section (e) (with 2 separate Ensembles), we can improve the performance by adjusting the tuning curves of the second `Ensemble`.  Do this by setting `intercepts = nengo.dists.Uniform(0.4, 0.9)`.  This randomly chooses the x-intercepts of the neurons uniformly between 0.4 and 0.9, rather than the default of -1 to 1.  Generate the same plot as in part (e).

How does this affect the performance of the model?  (Try running the model a few times to see the variability in performance). 

Why does the output stay at exactly zero up until the decision is made (rather than randomly jittering around zero, as in the previous models)?  

Why couldn't we use this approach in the case from part (f) where the `Ensembles` are combined?

In [None]:
from nengo.utils.ensemble import tuning_curves

def q1g(const_inputs=[0.2, 0.1, -0.1, -0.2], run_time=2, seed=0):
    np.random.seed(seed)
    
    def choice(x):
        if x[0] > 0.9:
            return 1
        elif x[0] < -0.9:
            return -1
        else: 
            return 0
    
    fig, axs = plt.subplots(1, 2, figsize=(12, 5), sharey=True)
    
    for const_input in const_inputs:
        model = nengo.Network()
        with model:
            input = nengo.Node(const_input)
            a = nengo.Ensemble(n_neurons=50, dimensions=1)
            a.noise = nengo.processes.WhiteSignal(period=10, high=100, rms=3)
            b = nengo.Ensemble(n_neurons=50, 
                               dimensions=1,
                               intercepts=nengo.dists.Uniform(0.4, 0.9))
            
            nengo.Connection(pre=input, post=a)
            nengo.Connection(pre=a, post=a, synapse=0.1)
            nengo.Connection(pre=a, post=b, synapse=0.1, function=choice)
                    
            a_probe = nengo.Probe(a, synapse=0.01)
            b_probe = nengo.Probe(b, synapse=0.01)

        # Run the simulation
        with nengo.Simulator(model) as sim:
            sim.run(run_time)
            tuning_b, activities = tuning_curves(b, sim)

        # Plot the results
        axs[0].plot(sim.trange(), sim.data[a_probe], label=f"Input = {const_input}")
        axs[1].plot(sim.trange(), sim.data[b_probe], label=f"Input = {const_input}")
        
    print(f"Seed = {seed}")
    axs[0].set_title("Accumulator")
    axs[0].set_xlabel("Time (s)")
    axs[0].set_ylabel("Value")
    axs[0].legend()
    
    axs[1].set_title("Decision Maker")
    axs[1].set_xlabel("Time (s)")
    axs[1].legend()
    plt.show()
    
    plt.figure()
    plt.plot(tuning_b, activities)
    plt.title("Tuning Curves")
    plt.xlabel("Input x")
    plt.ylabel("Firing rate (Hz)")
    plt.show()
    
q1g(seed=0)
q1g(seed=4)

This change affects the decisions in that the output is exactly 0 with no noise until the threshold is reached, at which point it jumps up to the final value with even more noise than before. The decisions are qualitatively similar to before because the accumulator remains the same, so the time at which the decision is made and the accumulator saturation values are the same. These relationships are seen in the graphs above.

The output stays at exactly 0 until the decision is made because the x-intercepts of the tuning curves are between 0.4 and 0.9 instead of -1 to 1 as before. The tuning curve graphs show that no neuron begins to fire until the input is above 0.4 or below -0.4. For values between -0.4 and 0.4 the decision ensemble is therefore silent, and its decoded output is exactly 0. Previously, the output jittered around 0 because the decoders had to represent 0 using active, noisy neurons.
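
The silent region can be illustrated with idealized rectified-linear tuning curves in place of LIF neurons. This is a sketch under assumptions: each hypothetical neuron has an encoder of $\pm 1$, an intercept drawn uniformly from $[0.4, 0.9]$, and an arbitrary gain, and it is active only when `encoder * x` exceeds its intercept.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 50
encoders = rng.choice([-1.0, 1.0], size=n_neurons)   # preferred direction of each neuron
intercepts = rng.uniform(0.4, 0.9, size=n_neurons)   # mirrors Uniform(0.4, 0.9)
gains = rng.uniform(50.0, 100.0, size=n_neurons)     # arbitrary illustrative gains

def rates(x):
    """Rectified-linear activity of every neuron for a scalar input x."""
    return gains * np.maximum(0.0, encoders * x - intercepts)

for x in (0.0, 0.3, -0.3, 1.0, -1.0):
    active = np.count_nonzero(rates(x))
    print(f"x = {x:+.1f}: {active} of {n_neurons} neurons active")
```

With no neuron active for $|x| < 0.4$, any linear decoding of the activities is exactly zero there, regardless of the decoder values.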

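We can't use this approach with the combined ensemble because the two dimensions are represented by the same neurons. Intercepts are set per neuron along its encoder in the joint 2-dimensional space, not per represented dimension, so limiting the range for the decision dimension limits the range of the accumulator dimension as well. This causes improper integration and an inability to represent small accumulator values.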

# 2. Temporal Representation

In class, we discussed the Legendre Memory Unit (LMU), a method for storing input information over time.  This allows us to make connections where the function being computed is a function of the input over some window in time, rather than having to be a function of the current input.

In this question, we will use this to build a model that can distinguish a 1Hz sine wave from a 2Hz sine wave.  Notice that it is impossible to perform this task without having information over time; if I just give you a single number at any given point in time, you can't tell whether it's from a 1Hz sine wave or a 2Hz sine wave.  So we need some method to store the previous input information, and that's what the LMU does.

**a) Representing Information over Time. [2 marks]** The core of the LMU is to compute the differential equation ${dx \over dt} = Ax + Bu$ where $A$ and $B$ are carefully chosen using the following math:

```python
A = np.zeros((q, q))
B = np.zeros((q, 1))
for i in range(q):
    B[i] = (-1.)**i * (2*i+1)
    for j in range(q):
        A[i,j] = (2*i+1)*(-1 if i<j else (-1.)**(i-j+1)) 
A = A / theta
B = B / theta        
```

Implement this in Nengo.  Use `theta=0.5` and `q=6`.  The model should consist of a single `Ensemble` that is `q`-dimensional. Use 1000 neurons in this `Ensemble`.  Use `synapse=0.1` on both the recurrent `Connection` and on the input `Connection`.

For the input, give a 1Hz sine wave for the first 2 seconds, and a 2Hz sine wave for the second 2 seconds.  This can be done with:

```python
stim = nengo.Node(lambda t: np.sin(2*np.pi*t) if t<2 else np.sin(2*np.pi*t*2))
```

Run the simulation for 4 seconds.  Plot `x` over the 4 seconds using a `Probe` with `synapse=0.01`.  `x` should be 6-dimensional, and there should be a noticeable change between its value before `t=2` and after `t=2`.
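
One common way to realize ${dx \over dt} = Ax + Bu$ with a recurrent connection that is filtered by a first-order low-pass synapse of time constant $\tau$ is to use the transform $A' = \tau A + I$ on the recurrent connection and $B' = \tau B$ on the input connection. The sketch below builds the matrices for `theta=0.5` and `q=6` and applies that mapping with $\tau = 0.1$; the function names are illustrative, and the mapping assumes both connections use the same synapse.

```python
import numpy as np

def lmu_matrices(q, theta):
    """A and B from the formulas above, scaled by the window length theta."""
    A = np.zeros((q, q))
    B = np.zeros((q, 1))
    for i in range(q):
        B[i] = (-1.0) ** i * (2 * i + 1)
        for j in range(q):
            A[i, j] = (2 * i + 1) * (-1 if i < j else (-1.0) ** (i - j + 1))
    return A / theta, B / theta

def recurrent_transforms(A, B, tau):
    """Map dx/dt = Ax + Bu onto a low-pass synapse with time constant tau."""
    return tau * A + np.eye(A.shape[0]), tau * B

A, B = lmu_matrices(q=6, theta=0.5)
A_prime, B_prime = recurrent_transforms(A, B, tau=0.1)
print(A_prime.shape, B_prime.shape)
print(A_prime[0, 0], B_prime[0, 0])
```

The identity term in $A'$ cancels the leak that the synapse itself introduces, which is the same reason the integrator in question 1 uses an identity feedback connection.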

In [None]:
def q2a(theta=0.5, q=6, n_neurons=1000, run_time=4, seed=0, plot=True):
    np.random.seed(seed)
    
    A = np.zeros((q, q))
    B = np.zeros((q, 1))
    for i in range(q):
        B[i] = (-1.)**i * (2*i+1)
        for j in range(q):
            A[i,j] = (2*i+1)*(-1 if i<j else (-1.)**(i-j+1)) 
    A = A / theta
    B = B / theta
    
    tau = 0.1
    A_prime = tau * A + np.eye(q)
    B_prime = tau * B
    
    if plot:
        plt.figure(figsize=(12, 5))
        
    model = nengo.Network()
    with model:
        input = nengo.Node(lambda t: np.sin(2*np.pi*t) if t<2 else np.sin(2*np.pi*t*2))
        lmu = nengo.Ensemble(n_neurons=n_neurons, dimensions=q)    # q-dimensional ensemble with 1000 neurons
        
        nengo.Connection(pre=input, post=lmu, transform=B_prime, synapse=0.1) # Input connection
        nengo.Connection(pre=lmu, post=lmu, transform=A_prime, synapse=0.1)   # Recurrent connection
    
        lmu_probe = nengo.Probe(lmu, synapse=0.01)

    # Run the simulation
    with nengo.Simulator(model) as sim:
        sim.run(run_time)
    
    # Build the classification target from this run's own probe data
    n_steps = len(sim.data[lmu_probe])
    target_1st_half = np.ones((n_steps//2, 1))                  # 1 from t=0 to t=2
    target_2nd_half = np.ones((n_steps - n_steps//2, 1)) * -1   # -1 from t=2 to t=4
    target = np.vstack((target_1st_half, target_2nd_half))

    if plot:
        for i in range(q):
            plt.plot(sim.trange(), sim.data[lmu_probe][:, i], label=f"q = {i}")
            
        plt.title("LMU for Sine Waves")
        plt.xlabel("Time (s)")
        plt.ylabel("Value")
        plt.legend()
        plt.show()
    
    return A_prime, B_prime, sim.data[lmu_probe], target
    
_ = q2a()

**b) Computing the function. [2 marks]** We now want to compute our desired function, which is "output a 1 if we have a 1Hz sine wave and a -1 if we have a 2Hz sine wave".  To do this, we need to make a `Connection` from the LMU `Ensemble` out to a new `Ensemble` that will be our category.  Have it be 1-dimensional with 50 neurons.

Normally in Nengo, when we define a `Connection` we specify a Python function that we want to approximate.  Nengo will then choose a bunch of random `x` values, call the function to determine what the output should be for each one, and use that to solve for the decoders.  However, in this case, we already have that set of `x` values!  That's exactly the data you plotted in part (a).  For the `x` values from t=0 to t=2.0 we want an output of 1.  For the `x` values from t=2.0 to t=4.0, we want an output of -1.  So, to specify these target values, we make a matrix of size `(4000,1)` (4000 for the 4000 time steps that you have `x` values for, and 1 for the output being 1-dimensional).  Set the first 2000 values to 1 and the second 2000 values to -1.

Now that you have your `x` values and the corresponding `target` values, you can tell Nengo to use them when you make the `Connection` like this:

```python
nengo.Connection(a, b, eval_points=x_values, function=target)
```

That will tell Nengo just to use the values you're giving it, rather than randomly sampling `x` and calling a function to get the target values.

Build this model and plot the resulting category (with a `Probe` with `synapse=0.01`).  The output should be near 1 for the first 2 seconds, and near -1 for the second 2 seconds.  (Important note: it will not be perfect at this task!)

In [None]:
def q2b(theta=0.5, q=6, n_neurons=1000, run_time=4, seed=0, plot=True):
    np.random.seed(seed)
    
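    # Pass theta, run_time and seed through so they are not silently ignored
    A_prime_2a, B_prime_2a, lmu_data_2a, target_q2 = q2a(theta=theta, q=q, n_neurons=n_neurons, run_time=run_time, seed=seed, plot=False)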
    
    model = nengo.Network()
    with model:
        input = nengo.Node(lambda t: np.sin(2*np.pi*t) if t<2 else np.sin(2*np.pi*t*2))
        lmu = nengo.Ensemble(n_neurons=n_neurons, dimensions=q)
        cat = nengo.Ensemble(n_neurons=50, dimensions=1)
        
        nengo.Connection(pre=input, post=lmu, transform=B_prime_2a, synapse=0.1)
        nengo.Connection(pre=lmu, post=lmu, transform=A_prime_2a, synapse=0.1)
        
    with model:
        nengo.Connection(lmu, cat, eval_points=lmu_data_2a, function=target_q2)
        cat_probe = nengo.Probe(cat, synapse=0.01)

    with nengo.Simulator(model) as sim:
        sim.run(run_time)
        
    if plot:
        plt.figure(figsize=(12, 5))
        plt.plot(sim.trange(), sim.data[cat_probe])
        plt.title("Category Output for Sine Wave Input")
        plt.xlabel("Time (s)")
        plt.ylabel("Value")
        plt.show()
    
    rmse = np.sqrt(np.mean((sim.data[cat_probe] - target_q2)**2))
    
    return rmse, sim.trange(), sim.data[cat_probe]
    
_, _, _ = q2b()

**c) Adjusting the input. [2 marks]** Repeat part b) but with an input that is a 2Hz sine wave for the first 2 seconds, and a 1Hz sine wave for the second 2 seconds (i.e. the opposite order as in part (b)). How well does this perform?  Describe the similarities and differences.  One particular difference you should notice is that the model may make the wrong classification for the first 0.25 seconds.  Why is this happening?  What could you change to fix this?

In [None]:
def q2c(theta=0.5, q=6, n_neurons=1000, run_time=4, seed=0, plot=True):
    np.random.seed(seed)
    
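    # Pass theta, run_time and seed through so they are not silently ignored
    A_prime_2a, B_prime_2a, lmu_data_2a, target_q2 = q2a(theta=theta, q=q, n_neurons=n_neurons, run_time=run_time, seed=seed, plot=False)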
    
    model = nengo.Network()
    with model:
        input = nengo.Node(lambda t: np.sin(2*np.pi*t*2) if t<2 else np.sin(2*np.pi*t))
        lmu = nengo.Ensemble(n_neurons=n_neurons, dimensions=q)
        cat = nengo.Ensemble(n_neurons=50, dimensions=1)
        
        nengo.Connection(pre=input, post=lmu, transform=B_prime_2a, synapse=0.1)
        nengo.Connection(pre=lmu, post=lmu, transform=A_prime_2a, synapse=0.1)
        
    with model:
        nengo.Connection(lmu, cat, eval_points=lmu_data_2a, function=target_q2)
        cat_probe = nengo.Probe(cat, synapse=0.01)

    with nengo.Simulator(model) as sim:
        sim.run(run_time)
        
    if plot:
        plt.figure(figsize=(12, 5))
        plt.plot(sim.trange(), sim.data[cat_probe])
        plt.title("Category Output for Sine Wave Input")
        plt.xlabel("Time (s)")
        plt.ylabel("Value")
        plt.show()
    
    # The input order is flipped in this part, so the expected labels are flipped too
    rmse = np.sqrt(np.mean((sim.data[cat_probe] - (-target_q2))**2))
    
    return rmse, sim.trange(), sim.data[cat_probe]
    
_ = q2c()

Similarities between the two graphs:
1. Both graphs output 1 when the input is a 1 Hz signal and output -1 when the input is 2 Hz. Therefore, the classifier performs fairly well.
2. Both graphs dip down to 0 one second into the 1 Hz prediction. For the first graph, this happens at `t=1s` while for the second it happens at `t=3s`. This dip does not occur for the 2 Hz prediction.
3. Aside from the dip, the amount of noise throughout the graphs is fairly consistent.
4. Both outputs start from 0 then jump to the final value.

Differences between the two graphs:
1. Graph 1 immediately jumps from 0 to 1 at `t=0`, which is the expected output. Graph 2 jumps from 0 to 1 then down to -1. 
2. Graph 1 has a smooth transition from 1 to -1 at `t=2`, while graph 2 has a big spike at `t=2` then finally transitions to 1 around `t=2.25`. 

In other words, graph 2 makes the wrong prediction for the first 0.25 seconds after a transition. This happens because a 1 Hz sine wave is 1 at `t=0.25s` while a 2 Hz sine wave is 0 at `t=0.25s`. At `t=0`, the classifier can't distinguish between these two waves, but by `t=0.25s` the difference is large enough to make the correct classification. The wrong predictions could be avoided by computing fresh `x_values` and `targets` from the flipped inputs.
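
The point about the first quarter second can be checked numerically. The snippet below simply evaluates the two input signals at a few early times; it says nothing about the LMU state itself, which additionally depends on the synaptic filtering. The helper name `sine` is an illustrative choice.

```python
import math

def sine(freq_hz, t):
    """Value of a sine wave of the given frequency (Hz) at time t (seconds)."""
    return math.sin(2 * math.pi * freq_hz * t)

for t in (0.0, 0.05, 0.125, 0.25):
    one, two = sine(1, t), sine(2, t)
    print(f"t = {t:.3f} s: 1 Hz -> {one:+.3f}, 2 Hz -> {two:+.3f}")
```

Both signals start at 0 and rise together, and only by a quarter of a second has the 2 Hz wave returned to 0 while the 1 Hz wave is at its peak.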

**d) Adjusting the number of neurons. [2 marks]** Repeat part b) but adjust the number of neurons in the `Ensemble` computing the differential equation.  Try 50, 100, 200, 500, 1000, 2000, and 5000.  How does the model behaviour change?  Why does this happen?  In addition to looking at the actual results for each run, also plot the RMSE in the classification as you adjust the number of neurons.  
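
A small helper for the RMSE requested here might look like the following. It assumes the classifier output and the target are arrays of the same shape, for example `(4000, 1)`; the name `rmse` is a choice made for this sketch.

```python
import numpy as np

def rmse(output, target):
    """Root-mean-square error between two equally shaped arrays."""
    output = np.asarray(output, dtype=float)
    target = np.asarray(target, dtype=float)
    if output.shape != target.shape:
        raise ValueError(f"shape mismatch: {output.shape} vs {target.shape}")
    return float(np.sqrt(np.mean((output - target) ** 2)))

# Target of +1 for the first 2000 steps and -1 for the last 2000 steps
target = np.vstack([np.ones((2000, 1)), -np.ones((2000, 1))])
print(rmse(target, target))    # a perfect classifier
print(rmse(-target, target))   # every sample assigned to the wrong class
```

The explicit shape check guards against NumPy broadcasting a `(4000,)` array against a `(4000, 1)` array, which would silently produce a `(4000, 4000)` difference and a misleading error value.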

In [None]:
def q2d():
    n_neuronss = [50, 100, 200, 500, 1000, 2000, 5000]
    results = []
    for n_neurons in n_neuronss:
        rmse, t_range, output = q2b(n_neurons=n_neurons, plot=False) # Repeat part b) with different number of neurons
        results.append((rmse, t_range, output))
        
    # Plot the results
    plt.figure(figsize=(12, 5))
    for i, (rmse, t_range, output) in enumerate(results):
        plt.plot(t_range, output, label=f"n_neurons = {n_neuronss[i]}")
    plt.title("Category Output for Sine Wave Input")
    plt.xlabel("Time (s)")
    plt.ylabel("Value")
    plt.legend()
    plt.show()
    
    # Plot the RMSE vs number of neurons
    rmses = [rmse for rmse, _, _ in results]
    plt.plot(n_neuronss, rmses)
    plt.title("RMSE vs Number of Neurons")
    plt.xlabel("Number of Neurons")
    plt.ylabel("RMSE")
    plt.show()
    
q2d()

In [None]:
_ = q2a(n_neurons=50)
_ = q2a(n_neurons=5000)

As the number of neurons increases from 50 to 5000, the output becomes increasingly accurate, as demonstrated by the consistently decreasing RMSE. The amount of output noise decreases significantly (aside from the dip at `t=1`) and the two classes become clearly separable. This happens because increasing the number of neurons allows more neurons to be allocated to each dimension, reducing the noise in the representation, as seen in the two LMU graphs.
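
As a loose analogy only (this is not the Nengo model), the effect of adding neurons can be compared to averaging more independent noisy readings of the same value. The sketch below assumes each "unit" contributes an independent Gaussian-noise estimate, which real spiking neurons do not do exactly; `rmse_of_average` and its parameters are hypothetical names chosen for illustration.

```python
import math
import random

def rmse_of_average(n_units, n_trials=300, noise_std=1.0, seed=0):
    """RMSE of the mean of n_units independent noisy readings of a true value of 0."""
    rng = random.Random(seed)
    total_squared_error = 0.0
    for _ in range(n_trials):
        # Average n_units noisy readings; the true value is 0, so the
        # average itself is the error of this trial.
        estimate = sum(rng.gauss(0.0, noise_std) for _ in range(n_units)) / n_units
        total_squared_error += estimate ** 2
    return math.sqrt(total_squared_error / n_trials)

for n in (50, 200, 1000):
    print(n, round(rmse_of_average(n), 4), "1/sqrt(n):", round(1 / math.sqrt(n), 4))
```

In this toy setting the RMSE shrinks roughly as `1/sqrt(n)`, which is consistent in spirit with the falling RMSE curve above, although the actual rate in the spiking network may differ.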

**e) Adjusting the q value. [2 marks]** Repeat part b) (returning to 1000 neurons) but adjust the value of `q`.  Try 1, 2, 4, 8, 16, 32, and 64. How does the model behaviour change?  Why does this happen? In addition to looking at the actual results for each run, also plot the RMSE in the classification as you adjust the number of q values.  

In [None]:
def q2e():
    qs = [1, 2, 4, 8, 16, 32, 64]
    results = []
    for q in qs:
        rmse, t_range, output = q2b(q=q, plot=False) # Repeat part b) with different q values
        results.append((rmse, t_range, output))
        
    # Plot the results
    plt.figure(figsize=(12, 5))
    for i, (rmse, t_range, output) in enumerate(results):
        plt.plot(t_range, output, label=f"q = {qs[i]}")
    plt.title("Category Output for Sine Wave Input")
    plt.xlabel("Time (s)")
    plt.ylabel("Value")
    plt.legend()
    plt.show()
    
    # Plot the RMSE vs q
    rmses = [rmse for rmse, _, _ in results]
    plt.plot(qs, rmses)
    plt.title("RMSE vs q")
    plt.xlabel("q")
    plt.ylabel("RMSE")
    plt.show()
    
q2e()

In [None]:
_ = q2a(q=4)
_ = q2a(q=64)

At `q=1`, there aren't enough dimensions to represent the input signal, so most of the output stays near 0 and the RMSE is relatively high. At `q=4`, the model classifies the input signal with the lowest error. Above `q=4`, the RMSE increases because the fixed pool of 1000 neurons is shared across too many dimensions and the additional LMU dimensions are represented too noisily, so they add error rather than useful information. As seen in the graph for `q=64`, the noise amplitude is so high that it drowns out the useful dimensions, and the model only ever predicts -1.
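
The neuron budget behind this argument can be made concrete with simple arithmetic. The sketch below divides the 1000 neurons from part e) evenly across the `q` represented dimensions. This is a rough heuristic: neurons in a Nengo ensemble are not literally partitioned by dimension, and `neurons_per_dimension` is a hypothetical helper.

```python
N_NEURONS = 1000  # ensemble size used in part e)

def neurons_per_dimension(n_neurons, q):
    """Rough neuron budget per represented dimension (a heuristic, not a Nengo quantity)."""
    return n_neurons / q

for q in (1, 2, 4, 8, 16, 32, 64):
    budget = neurons_per_dimension(N_NEURONS, q)
    print(f"q={q:2d}: about {budget:6.1f} neurons per dimension")
```

Going from `q=4` to `q=64` cuts this budget by a factor of 16, which gives a sense of why the higher-order dimensions become so noisy.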

# 3. Online Learning

Normally when building models with the Neural Engineering Framework, we compute the connection weights at the beginning and then leave them fixed while running the model.  But, we can also apply online learning rules to adjust the connection weights over time.  This has the effect of changing the function being computed.  One general learning rule is the PES rule, where you provide an extra input that indicates whether the output value should be increased or decreased.  This is generally called an error signal.

**a) Basic online learning. [2 marks]** Build a network that will learn the identity function.  You will need three `Ensembles`, one for the input, one for the output, and one for the error. Each one is 1-dimensional and uses 200 neurons.  For the input, use Nengo to randomly generate a 2Hz band-limited white noise signal as follows:

```python
stim = nengo.Node(nengo.processes.WhiteSignal(period=100, high=2, rms=0.3))
```

When making the learning connection, initialize it to compute the zero function and to use the PES learning rule as follows:
```python
def initialization(x):
    return 0
c = nengo.Connection(pre, post, function=initialization, learning_rule_type=nengo.PES(learning_rate=1e-4))
```

The error `Ensemble` should compute the difference between the output value and the desired output value.  For this initial question, we want the output value to be the same as the input value (i.e. we are learning the identity function).  Then connect the error `Ensemble` to the learning rule as follows:

```python
nengo.Connection(error, c.learning_rule)
```

(Note: for this question, leave the `synapse` values on the `Connections` at their default values)

Run the model for 10 seconds and plot the input value and the resulting output value (using a `Probe` with `synapse=0.01`).  The output should match the input fairly well after the first few seconds.

In [None]:
def q3a(run_time=10, learning_rate=1e-4, seed=0, plot=True):
    np.random.seed(seed)
    
    model = nengo.Network()
    with model:
        # Input, output, and error ensembles that are 1-dimensional with 200 neurons
        input_ens = nengo.Ensemble(n_neurons=200, dimensions=1)
        output_ens = nengo.Ensemble(n_neurons=200, dimensions=1)
        error_ens = nengo.Ensemble(n_neurons=200, dimensions=1)
        
        stim = nengo.Node(nengo.processes.WhiteSignal(period=100, high=2, rms=0.3))
        nengo.Connection(stim, input_ens)

        def initialization(x):
            return 0
        c = nengo.Connection(input_ens, output_ens, function=initialization, learning_rule_type=nengo.PES(learning_rate=learning_rate))
        
        nengo.Connection(output_ens, error_ens)
        nengo.Connection(input_ens, error_ens, transform=-1)
        nengo.Connection(error_ens, c.learning_rule)
        
        input_probe = nengo.Probe(input_ens, synapse=0.01)
        output_probe = nengo.Probe(output_ens, synapse=0.01)
    
    with nengo.Simulator(model) as sim:
        sim.run(run_time)

    if plot:
        plt.figure(figsize=(12, 5))
        plt.plot(sim.trange(), sim.data[input_probe], label="Input")
        plt.plot(sim.trange(), sim.data[output_probe], label="Output")
        plt.title(f"PES Learning for {run_time} Seconds")
        plt.xlabel("Time (s)")
        plt.ylabel("Value")
        plt.legend()
        plt.show()
        
    return sim.trange(), sim.data[input_probe], sim.data[output_probe]
    
_ = q3a()

**b) Error calculation. [1 mark]**  What would happen if you reversed the sign of the error calculation (i.e. if you did `target - output` rather than `output - target`?  Why does that happen?

With `error = output - target`, the learning rule forms a negative feedback loop: minimizing the error pushes the output toward the input. When the calculation is reversed, the loop becomes positive feedback and any difference between the input and output is amplified. As a result, the output quickly grows to about 1.5 or falls to about -1.5, depending on the random seed. It does not grow beyond these values because the neurons saturate.
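
The effect of the sign can be illustrated with a scalar caricature of the learning rule. The sketch below is not Nengo's PES implementation (which adjusts decoders using neural activities); it assumes a single weight `w` with output `w * x`, a target of `x`, and an update that moves the weight against the error signal. `train_weight` and its input values are made up for illustration.

```python
def train_weight(error_sign, learning_rate=0.1, n_steps=200):
    """Scalar delta-rule learner for output = w * x with target = x.

    error_sign=+1 uses error = output - target (as in part a);
    error_sign=-1 uses the reversed error = target - output.
    """
    w = 0.0  # start from the zero function, like `initialization`
    inputs = [0.5, -0.8, 0.3, -0.4, 0.9]  # arbitrary fixed input values
    for step in range(n_steps):
        x = inputs[step % len(inputs)]
        output = w * x
        error = error_sign * (output - x)
        w -= learning_rate * error * x  # move the weight against the error signal
    return w

print("error = output - target:", train_weight(+1))  # converges toward w = 1
print("error = target - output:", train_weight(-1))  # runs away from w = 1
```

With the original sign, each update shrinks the distance between `w` and 1; with the reversed sign, each update enlarges it. The toy weight grows without bound, whereas in the spiking network the growth is capped by neuron saturation.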

**c) Computing metrics. [1 mark]**  Break your data up into 2-second chunks and compute the Root-Mean-Squared-Error between the target value (the stimulus itself) and the output from the model for each chunk.  Since the simulation is 10 seconds long, you should have 5 RMSE measures (one for the first 2 seconds, one for the second 2 seconds, one for the third 2 seconds, and so on).  Repeat the simulation 10 times and plot the average for each of these values.  The result should show that the model gets better over time, but does not reach 0 error.  

In [None]:
def q3c(num_iterations=10, learning_rate=1e-4, run_time=10):
    divisions = run_time // 2
    rmse_values = np.zeros(divisions)
    
    for i in range(num_iterations):
        # Use `input_values` rather than `input` to avoid shadowing the built-in
        trange, input_values, output = q3a(run_time=run_time, learning_rate=learning_rate, seed=i, plot=False)
        
        for d in range(divisions):
            # Convert the chunk boundaries (in seconds) to sample indices
            start = d * 2
            end = start + 2
            start_idx = int(start / run_time * len(trange))
            end_idx = int(end / run_time * len(trange))
            
            input_chunk = input_values[start_idx:end_idx]
            output_chunk = output[start_idx:end_idx]
            
            rmse = np.sqrt(np.mean((input_chunk - output_chunk) ** 2))
            rmse_values[d] += rmse
            
    rmse_values /= num_iterations
    time_labels = np.arange(2, run_time + 1, 2)
        
    plt.figure(figsize=(12, 5))
    plt.plot(time_labels, rmse_values)
    plt.title(f"Average RMSE over {num_iterations} Iterations")
    plt.xlabel("Time (s)")
    plt.ylabel("RMSE")
    plt.show()

q3c()
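
The chunking step can also be written without explicit index arithmetic. The sketch below is an alternative, not the code used to produce the plot above: it assumes Nengo's default time step of `dt=0.001` s and uses a synthetic signal whose error decays over time in place of simulation data. `chunked_rmse` is a hypothetical helper.

```python
import numpy as np

def chunked_rmse(target, output, dt=0.001, chunk_seconds=2.0):
    """RMSE between target and output over consecutive fixed-length chunks."""
    target = np.asarray(target, dtype=float).ravel()
    output = np.asarray(output, dtype=float).ravel()
    samples_per_chunk = int(round(chunk_seconds / dt))
    n_chunks = len(target) // samples_per_chunk
    usable = n_chunks * samples_per_chunk  # drop any incomplete trailing chunk
    squared_error = (target[:usable] - output[:usable]) ** 2
    # One row per chunk, then take the mean of each row
    return np.sqrt(squared_error.reshape(n_chunks, samples_per_chunk).mean(axis=1))

# Synthetic stand-in for a 10 s run: the "output" error decays with time.
t = np.arange(10000) * 0.001
target = np.sin(2 * np.pi * t)
output = target + np.exp(-t / 3)
print(chunked_rmse(target, output))
```

For a 10 second run this returns five values, one per 2 second chunk, which can then be averaged across repeated simulations.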

**d) Increasing learning time. [2 marks]**  Repeat part (c), but run the model for 100 seconds instead of 10 seconds.  How do the results change?

In [None]:
q3c(run_time=100)

In part (c), the error decreased consistently and appeared to be monotonic, so it was plausible that it would keep approaching 0. However, when the simulation is run for 100 seconds in part (d), the error stops decreasing around `t=15` and stays near 0.025 for the remainder of the run. This shows that the error gets close to 0 but cannot be eliminated completely.

**e) Learning rates. [2 marks]**  Repeat part (d), but decrease the learning rate to `1e-5`.  How do the results change?  How do they compare to part (c)?

In [None]:
q3c(run_time=100, learning_rate=1e-5)

These results differ from parts (c) and (d) because, with one tenth of the learning rate, learning takes roughly ten times as long. In (c), the error is around 0.06 at `t=10`, whereas (e) needs around 80 seconds to reach the same RMSE. The fast initial convergence seen in (d) is also missing in (e). Lastly, the fluctuations in the error during learning are much larger in (e) than in (d). In conclusion, a learning rate of `1e-5` is too low for this ensemble to learn the representation effectively.
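
The roughly tenfold slowdown is what a simple error-decay model predicts. The sketch below assumes the error shrinks by a factor of `(1 - learning_rate)` on every update, which is a simplification: in Nengo the effective per-step rate also depends on the neural activities and the time step. `steps_to_reach` is a hypothetical helper.

```python
def steps_to_reach(fraction, learning_rate):
    """Count updates until an error that shrinks by (1 - learning_rate) per
    step falls below `fraction` of its starting value."""
    error, steps = 1.0, 0
    while error > fraction:
        error *= 1.0 - learning_rate
        steps += 1
    return steps

fast = steps_to_reach(0.1, 1e-4)  # learning rate from parts (c) and (d)
slow = steps_to_reach(0.1, 1e-5)  # learning rate from part (e)
print(fast, slow, round(slow / fast, 2))
```

In this simplified model the number of updates needed scales almost exactly inversely with the learning rate, in line with the slower convergence observed in (e).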

**f) Improving performance. [1 mark]**  If you wanted to make the learned result even more accurate, how would you do this?  What would you change about the model and learning process?

There are several improvements that can be made to the learning process:
1. Increasing the number of neurons. As seen before, more neurons mean less error and more accurate representations.
2. Tuning the learning rate. Since a learning rate of `1e-4` performs much better than `1e-5`, some other learning rate may perform even better.
3. Tuning the ensemble parameters. A different radius, set of intercepts, or range of maximum firing rates may help the ensemble represent the desired signal.
4. Adjusting the synaptic filter time constants. A different time constant may filter the signals more effectively.

**g) Learning other functions. [1 mark]** Repeat part (a), but have the system learn a function where the input is a scalar $x$, but the output is the vector $[x^2, -x]$.  This will involve changing the dimensionality of some of the `Ensembles` and adding a `function=` to be computed on the `Connection` from the `stim` to the `error`.

In [None]:
def q3f(run_time=10, learning_rate=1e-4, seed=0):
    np.random.seed(seed)
    
    model = nengo.Network()
    with model:
        input_ens = nengo.Ensemble(n_neurons=200, dimensions=1)
        output_ens = nengo.Ensemble(n_neurons=200, dimensions=2)    # 2-dimensional output
        error_ens = nengo.Ensemble(n_neurons=200, dimensions=2)     # 2-dimensional error
        
        stim = nengo.Node(nengo.processes.WhiteSignal(period=100, high=2, rms=0.3))
        nengo.Connection(stim, input_ens)

        # Initialize the 2-dimensional output to 0
        def initialization(x):
            return np.zeros(2)

        # Compute the 2-dimensional target
        def target_function(x):
            return [x[0]**2, -x[0]]

        c = nengo.Connection(input_ens, output_ens, function=initialization, learning_rule_type=nengo.PES(learning_rate=learning_rate))
        
        nengo.Connection(output_ens, error_ens)
        nengo.Connection(stim, error_ens, function=target_function, transform=-1)
        nengo.Connection(error_ens, c.learning_rule)
        
        input_probe = nengo.Probe(input_ens, synapse=0.01)
        output_probe = nengo.Probe(output_ens, synapse=0.01)
    
    with nengo.Simulator(model) as sim:
        sim.run(run_time)

    plt.figure(figsize=(12, 5))
    plt.plot(sim.trange(), sim.data[input_probe], label="Input")
    plt.plot(sim.trange(), sim.data[output_probe][:, 0], label="Output (x^2)")
    plt.plot(sim.trange(), sim.data[output_probe][:, 1], label="Output (-x)")
    plt.title("PES Learning of [x^2, -x]")
    plt.xlabel("Time (s)")
    plt.ylabel("Value")
    plt.legend()
    plt.show()

# Run the modified function
q3f()