# EE 304 - Neuromorphics: Brains in Silicon


##  Neuromorphic Architectures

####  The story thus far

We have implemented two of a neuron's elements:

- An exponentially decaying **synapse**
    - a log-domain lowpass filter
- A quadratic integrate-and-fire **soma**
    - a log-domain lowpass filter plus a current mirror

In this lecture, we will start looking at architectures for interconnecting these elements in a neuromorphic chip to implement spiking neural networks. 

#### Dedicating versus sharing hardware elements

When we implement networks of spiking neurons in silicon, we may choose to:

- <b> Dedicate </b> each hardware element to a single neuronal element
    - the neural elements are: 
        - Axon
        - Synapse
        - Dendrite
        - Soma
- <b> Share </b> it among several neural elements
    - the resulting architectures are: 
        - Fully Dedicated
        - Shared Axon
        - Shared Synapse
        - Shared Dendrite 

#### Resulting architectures have different scaling properties

For instance, in the Shared-Axon Architecture:

- Instead of carrying a silicon neuron's spikes on a metal wire dedicated to it
- We transmit its spike train on a bus
- Along with spike trains of other silicon neurons
- This cuts the number of wires from $N$ to $\log_2(N)$

### The Four Neuronal Elements 

<img src="files/lecture10/TwoNeuronNetwork.png" width="720">

In all, a neuron has four elements:

- <b>Axon</b>: Communicates its spikes to other neurons
- <b>Synapse</b>: Converts a spike to a graded potential
- <b>Dendrite</b>: Summates these graded potentials 
- <b>Soma</b>: Converts these summated graded potentials to a spike train

### The Four Architectures 

As we map a spiking neural network's neuronal elements onto our silicon chip's hardware elements, we must choose whether to dedicate or share them.

We may make choices, progressively, for each of the four types of neuronal elements.

This leads to four distinct architectures:

- <b>Fully-Dedicated</b>: Each hardware element is dedicated to a single neuronal element
- <b>Shared-Axon</b>: $\log_2(N)$ metal wires are shared by $N$ axons 
- <b>Shared-Synapse</b>: A single lowpass-filter is shared by all of a neuron's $N$ synapses
- <b>Shared-Dendrite</b>: A single resistive-mesh is shared by all $N$ neurons' dendrites

As we move down the line, sharing more and more types of elements, the hardware savings compound:

<img src="files/lecture10/ArchitecturesSynRAMScaling.png" width="720">
$A$ is the number of neurons a dendritic arbor spans

- Synapse-circuit count drops from $N^2$ to $N/A$
- The number of RAM words needed to store weights drops from $N^2$ to $N^2/A$
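
To get a feel for these numbers, the counts quoted in this lecture can be tabulated for a concrete network. This is only a sketch: the function names are hypothetical, and the values $N = 1024$ and $A = 16$ are assumed for illustration (they are not taken from the figure).

```python
def synapse_circuits(n, a):
    """Synapse-circuit counts quoted in the lecture, for n neurons
    and a dendritic arbor spanning a neurons."""
    return {
        "Fully-Dedicated": n * n,   # one circuit per connection
        "Shared-Synapse": n,        # one circuit per neuron
        "Shared-Dendrite": n // a,  # one circuit per a neurons
    }

def ram_words(n, a):
    """RAM words for weights in the architectures that use a RAM."""
    return {
        "Shared-Synapse": n * n,
        "Shared-Dendrite": (n * n) // a,
    }

# Illustrative values (assumed)
N, A = 1024, 16
for name, count in synapse_circuits(N, A).items():
    print(f"{name:16s} synapse circuits: {count}")
for name, words in ram_words(N, A).items():
    print(f"{name:16s} RAM words:        {words}")
```

With these assumed values, the synapse-circuit count falls from about a million to 64, while the RAM shrinks only by the factor $A$.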


### Fully-Dedicated Architecture

<img src="files/lecture10/FullyDedicated.png" width="360">

Requires $N^2$ synapse circuits to fully connect $N$ neurons:

- No hardware elements are shared
    - Correspondence between hardware elements and neural elements is one-to-one 

### Shared-Axon Architecture

<img src="files/lecture10/SharedAxon.png" width="360">

$\log_2(N)$ wires are shared by $N$ axons:

- Communicates spikes as addresses 
- Each neuron is assigned a unique address
- This representation is called an **address-event**  
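
The address-event idea can be sketched in a few lines of Python. All names here (`bus_width`, `encode`, `decode`, `merge_spike_trains`) are hypothetical, and the sketch assumes spikes never collide on the bus, so it says nothing about how simultaneous spikes would be serialized in hardware.

```python
def bus_width(num_neurons):
    """Wires needed to give each neuron a unique binary address."""
    return max(1, (num_neurons - 1).bit_length())

def encode(neuron_id, width):
    """A spike from neuron_id becomes a tuple of bits (MSB first) on the bus."""
    return tuple((neuron_id >> b) & 1 for b in reversed(range(width)))

def decode(bits):
    """Recover the neuron's address from the bits on the bus."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value

def merge_spike_trains(trains):
    """Merge per-neuron spike times into one time-ordered stream
    of (time, address) events, as carried by the shared bus."""
    return sorted((t, nid) for nid, times in trains.items() for t in times)

width = bus_width(1024)  # 10 wires instead of 1024
events = merge_spike_trains({0: [0.3, 0.1], 2: [0.2]})
for t, nid in events:
    print(t, encode(nid, width))
```

The receiver only needs `decode` to know which neuron spiked; the spike's time is implicit in when the address appears on the bus.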


### Shared-Synapse Architecture

<img src="files/lecture10/SharedSynapse.png" width="440">

Uses only $N$ synapse circuits to fully connect $N$ neurons:

- A single circuit models each neuron’s $N$ synapses
- All spikes destined for these synapses are routed to it
- Requires an external RAM to store weights


### Superposable Synapse Circuit for Shared Synapse

<img src="files/lecture10/SuperposableSynapse.svg" width="840">

One synapse circuit replaces three:
- It receives their three input spike trains, combined into a single spike-train
- It **linearly superposes** the three outputs they would have produced 
- It assumes that all three synapses have the **same weight**
    - All spikes are weighted equally

This works because the circuit's input-output relationship is described by

$$ 
    v(t) = \int_{-\infty}^{t} \!\! h(t-s)\sum_i \delta(s-t_i)\, ds 
         = \sum_i h(t-t_i) 
         \;\; {\rm with} \;\; 
    h(t) = {1 \over \tau}u(t)e^{-t/\tau} 
$$
- A convolution with a kernel that **decays exponentially** with time
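
The superposition property can be checked numerically. The sketch below is illustrative only: the time constant, spike times, and function names are assumed, and all spikes are weighted equally, as on this slide. It compares three separate synapses against one shared synapse fed the merged spike train, and against a single state variable that decays between spikes and jumps by $1/\tau$ at each spike.

```python
import math

TAU = 0.010  # synaptic time constant in seconds (assumed value)

def h(t, tau=TAU):
    """Exponentially decaying kernel: (1/tau) u(t) exp(-t/tau)."""
    return math.exp(-t / tau) / tau if t >= 0 else 0.0

def synapse_output(spike_times, t, tau=TAU):
    """v(t) = sum_i h(t - t_i) for one synapse circuit."""
    return sum(h(t - ti, tau) for ti in spike_times)

def shared_synapse_state(spike_times, t, tau=TAU):
    """Same v(t), computed with a single decaying state variable."""
    v, last = 0.0, None
    for ti in sorted(spike_times):
        if ti > t:
            break
        if last is not None:
            v *= math.exp(-(ti - last) / tau)  # decay since previous spike
        v += 1.0 / tau                         # each spike adds h(0)
        last = ti
    if last is not None:
        v *= math.exp(-(t - last) / tau)       # decay up to the probe time
    return v

# Three input spike trains (assumed times, in seconds)
trains = [[0.001, 0.020], [0.005], [0.012, 0.030]]
merged = sorted(ti for train in trains for ti in train)

t_probe = 0.035
separate = sum(synapse_output(train, t_probe) for train in trains)
shared = synapse_output(merged, t_probe)
single_state = shared_synapse_state(merged, t_probe)
print(separate, shared, single_state)
```

All three values agree to within floating-point rounding, which is why one lowpass filter can stand in for several equally weighted synapses.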

### Shared-Dendrite

<img src="files/lecture10/SharedDendrite.png" width="480">

Ratio of shared-synapse circuits to neurons drops below one-to-one:

- A resistive mesh models exponential decay along dendrites
- A single, shared, resistive mesh models all $N$ neurons’ dendritic trees
- This mesh is implemented with transistors
- Neurons receive input from neighboring shared-synapse circuits

The RAM’s size and bandwidth are cut as well

### Diffusor Circuit: Transistor-Based Resistive Mesh

<img src="files/lecture10/diffusor.svg" width="960">

This transistor-based implementation is perfectly linear:
- In the subthreshold current-domain 
- Over three to four decades (see plot)

In the continuum limit, its current-in-current-out relationship is given by 
$$ I(x) = \int_{-\infty}^{+\infty} \!\! k(x-u)\,I^*(u)\,du \;\; {\rm with} \;\; k(x) = \frac{1}{2L}e^{-|x|/L}  $$
- A convolution with a kernel that **decays exponentially** with distance
- The space-constant $L \approx \exp\left(\frac{\kappa(V_c - V_r)}{2U_T}\right)$

We can define $L$ as the dendritic **arbor's radius**
- Over this distance, the current received decays by a factor of $e$ 
- Neurons within this distance receive significant input 
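
The role of $L$ as an arbor radius can be illustrated with a discrete version of the convolution. This is a rough sketch under stated assumptions: a one-dimensional grid, a kernel taken to be symmetric in distance ($e^{-|x|/L}$), an assumed $L$ of 5 grid points, and hypothetical function names. It models the continuum formula, not the transistor circuit.

```python
import math

def kernel(x, L):
    """Exponentially decaying kernel, symmetric in distance (assumed)."""
    return math.exp(-abs(x) / L) / (2.0 * L)

def diffuse(inputs, L, dx=1.0):
    """Discrete approximation of I(x) = integral of k(x - u) I*(u) du."""
    n = len(inputs)
    return [
        sum(kernel((i - j) * dx, L) * inputs[j] * dx for j in range(n))
        for i in range(n)
    ]

# A single input current injected at the center of a 41-node line
L = 5.0
inputs = [0.0] * 41
inputs[20] = 1.0
out = diffuse(inputs, L)

# Ratio of the current received L nodes away to the current at the center
ratio = out[25] / out[20]
print(ratio, math.exp(-1))
```

A neuron $L$ nodes from the input receives $1/e$ of what the neuron at the input site receives; neurons much farther away receive very little.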

### Fully-Shared

<img src="files/lecture10/FullyShared.png" width="540">

A single, shared arithmetic unit models all elements of the neural network:

- Each time a spike occurs, it updates the membrane voltages of that neuron’s targets
- It retrieves the old value as well as the synaptic weights from RAM
- If the new value exceeds threshold, it issues a spike
    - Adds it to the address-event queue
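
The update loop described above can be sketched in software. Everything named here is hypothetical (`run_fully_shared`, the nested-dictionary stand-in for the RAM, the threshold of 1.0), and the sketch makes two simplifying assumptions that the slide does not state: membrane values reset to zero after a spike, and they do not leak between events.

```python
from collections import deque

def run_fully_shared(weights, initial_spikes, threshold=1.0, max_events=1000):
    """Event-driven update by a single shared arithmetic unit.

    weights: dict source -> {target: weight}, standing in for the weight RAM.
    Returns the spikes processed, in order, and the final membrane values.
    """
    membrane = {}                  # stands in for the RAM holding membrane values
    queue = deque(initial_spikes)  # address-event queue
    emitted = []
    while queue and len(emitted) < max_events:
        source = queue.popleft()
        emitted.append(source)
        # Update every target of the neuron that spiked
        for target, w in weights.get(source, {}).items():
            new_value = membrane.get(target, 0.0) + w  # old value plus weight
            if new_value > threshold:
                membrane[target] = 0.0  # reset after spiking (assumption)
                queue.append(target)    # issue a spike as an address-event
            else:
                membrane[target] = new_value
    return emitted, membrane

# Tiny three-neuron example with assumed weights
weights = {0: {1: 0.6, 2: 1.2}, 2: {1: 0.6}}
emitted, membrane = run_fully_shared(weights, initial_spikes=[0])
print(emitted)   # order in which spikes were processed
print(membrane)
```

The `max_events` cap is only a guard against runaway activity in recurrent networks; it is not part of the architecture.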

### In Summary

<img src="files/lecture10/ArchitecturesSynRAMScaling.png" width="720">
$A$ is the number of neurons a dendritic arbor spans

