# EE 304 - Neuromorphics: Brains in Silicon


##  Silicon Circuit Heterogeneity

<img src="files/lecture7/complete-lpf.png" width="520">
The story thus far:

- We started with a **dimensionless model** of the leaky membrane
    - a first-order linear low-pass filter
- We represented the **dimensionless variables as currents**
    - these current levels (fA to nA) are chosen to keep the gate voltage subthreshold 
    - and are normalized by a "unit current" ($I_{\rm n}$)
- We used a capacitor (C) to represent the membrane's capacitance
- We derived an expression for the current charging or discharging the capacitor, $I_C$
    - in terms of the input ($I_u$) and output ($I_v$) currents
- We found that **$I_C$ is proportional to $I_u/I_v - 1$**
    - The proportionality constant ($I_{\rm lk}$) is given by $C U_T/\kappa\tau$
    - where $\tau$ is the desired time-constant 
- To divide and multiply, we exploited the subthreshold MOS transistor's exponential current-voltage relation:
    - **Multiply**: Apply sum of two gate voltages to third transistor's gate
    - **Divide**: Apply difference of two gate voltages to third transistor's gate
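
The multiply and divide rules above can be checked numerically. This is a minimal sketch assuming the ideal subthreshold relation $I = I_0 e^{\kappa V_{\rm G}/U_T}$ with the source grounded; the values of `I_0`, `KAPPA`, and `U_T` and the helper names are illustrative assumptions, not values from the lecture.

```python
import math

# Illustrative constants (assumed for this sketch)
I_0 = 1e-15      # current scale, in amperes
KAPPA = 0.7      # gate coupling coefficient
U_T = 0.0256     # thermal voltage near room temperature, in volts


def gate_voltage(current):
    """Gate voltage at which a grounded-source transistor passes `current`."""
    return (U_T / KAPPA) * math.log(current / I_0)


def drain_current(v_gate):
    """Ideal subthreshold drain current with the source grounded."""
    return I_0 * math.exp(KAPPA * v_gate / U_T)


I_a, I_b = 2e-13, 5e-13          # two sub-pA currents
V_a, V_b = gate_voltage(I_a), gate_voltage(I_b)

# Sum of gate voltages -> product of the currents (scaled by 1/I_0)
I_product = drain_current(V_a + V_b)
# Difference of gate voltages -> ratio of the currents (scaled by I_0)
I_ratio = drain_current(V_a - V_b)

print(I_product, I_a * I_b / I_0)
print(I_ratio, I_0 * I_a / I_b)
```

The factors of $I_0$ appear because each gate voltage encodes its current relative to $I_0$.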

### How similar is one silicon circuit to another?

<img src="files/lecture7/TransistorDopants50nm.png" width="800">

We will now start to look at the fabrication process--it is far from perfect! 

- Fabricated copies of the circuit we designed are very different from each other
- For the same input current, their output currents can differ by a factor of 10!
- This heterogeneity arises from a chain of three effects:
    - **Dopant fluctuations**: Some transistors receive more or fewer dopant atoms than others
    - **Threshold mismatch**: These fluctuations result in a threshold-voltage distribution
    - **Subthreshold operation**: The exponential current-voltage relationship amplifies threshold-voltage variation
- Fabricated copies become **more similar** if we use
    - **More silicon area**: Larger transistor channels average out dopant fluctuations  
    - **More electrical power**: Current-voltage dependence is less steep above threshold

### How similar is one transistor to another?

<img src="files/lecture7/ThresholdDistribution90nm.png" width="500">

Much work has gone into characterizing transistor mismatch and understanding how it arises.

- Threshold voltages are **normally distributed**
    - Arises because the number of dopant atoms in the channel is normally distributed 
    - Since these dopants are charged, their fluctuations result in charge fluctuations
        - $\Delta Q = q\Delta N$, where $\Delta N$ is the fluctuation in counts and $q$ is the electronic charge
        - $\Delta V_{\rm thr} = \Delta Q/C_{\rm ox}$, where $C_{\rm ox}$ is the gate-to-channel capacitance 
- This is what we would expect if
    - Each dopant atom chooses its location independently
    - *And* the mean dopant-atom count is large
        - It would be *Poisson* rather than *Normal* if the mean is small (a few dozen)
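
To see how the Poisson count distribution approaches a normal one as the mean grows, the sketch below compares the Poisson probabilities with a normal density of matching mean and variance. The two mean counts (20 and 2000) and the function names are illustrative assumptions.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of finding exactly k dopant atoms."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def normal_pdf(x, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def poisson_normal_gap(mean):
    """Half the summed absolute difference between the Poisson pmf and the
    matched normal density (variance = mean), evaluated at integer counts."""
    k_max = int(mean + 10 * math.sqrt(mean))
    return 0.5 * sum(
        abs(poisson_pmf(k, mean) - normal_pdf(k, mean, mean))
        for k in range(k_max + 1)
    )


for mean_count in (20, 2000):
    print(mean_count, poisson_normal_gap(mean_count))
```

The gap shrinks as the mean count grows, which is why a normal model is reasonable for channels containing many dopant atoms.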

### How does transistor mismatch vary with channel area?

<img src="files/lecture7/ThresholdSigma-v-Area90nm.png" width="500">

- From the previous slide, the threshold-voltage variation is
    - $\Delta V_{\rm thr} = q\Delta N/C_{\rm ox}$
- Hence, its variance can be expressed in terms of the variance in dopant-atom counts $\sigma_{\rm N}^2$:  
    - $\sigma_{\rm V}^2 = q^2\sigma_{\rm N}^2/C_{\rm ox}^2$
- Independence implies that 
    - $\sigma_{\rm N}^2 = \mu_{\rm N} \propto W \times L$ 
- And we also have
    - $C_{\rm ox} \propto W \times L$
- Consequently
    - $\sigma_{\rm V}^2 \propto 1/(W \times L)$
- The **standard deviation is inversely proportional to the square-root of the area**
    - To reduce the standard deviation by a factor of two, the area must increase by a factor of four!
- The trend breaks down for transistor channels with very large area 
    - Probably because the devices, arranged in dense arrays, are further apart
- It is also expected to break down for transistors with very small areas
    - Perimeter-to-area ratio increases
    - Edge effects become more significant
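
The scaling argument above can be turned into a few lines of arithmetic. The dopant density and capacitance per unit area below are assumed round numbers chosen only to illustrate the trend; they are not process data from the lecture.

```python
import math

Q_E = 1.602e-19          # electronic charge, in coulombs
DOPANTS_PER_UM2 = 1.0e4  # assumed mean dopant count per square micron of channel
COX_PER_UM2 = 1.0e-14    # assumed gate-to-channel capacitance per square micron, in farads


def sigma_vthr(width_um, length_um):
    """Threshold-voltage standard deviation (volts) for a W x L channel."""
    area = width_um * length_um
    sigma_n = math.sqrt(DOPANTS_PER_UM2 * area)  # sigma_N^2 = mu_N, proportional to W * L
    c_ox = COX_PER_UM2 * area                    # C_ox proportional to W * L
    return Q_E * sigma_n / c_ox


for side_um in (0.5, 1.0, 2.0, 4.0):
    print(f"{side_um} x {side_um} um: sigma_V = {1e3 * sigma_vthr(side_um, side_um):.2f} mV")
```

Doubling both $W$ and $L$ quadruples the area and halves $\sigma_{\rm V}$, as the inverse-square-root law predicts.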

### How do threshold-voltage variations translate to current variations?

<img src="files/lecture7/NormalV-to-LognormalI.png" width="800">

Normally distributed threshold variations translate to lognormally distributed current variations

- Pass the threshold voltage distribution through an exponential
- You will end up turning a normal distribution into a lognormal one
    - That is, the log of the currents is normally distributed
- The expressions given relate:
    - the lognormal distribution's mean and standard deviation ($\mu_{\rm I}$ and $\sigma_{\rm I}$) 
    - to the normal distribution's mean and standard deviation ($\mu_{\rm V}$ and $\sigma_{\rm V}$)
- Note that:
    - $\mu_{\rm I}$ depends on $\mu_{\rm V}$ as well as $\sigma_{\rm V}$
    - $\sigma_{\rm I}$ depends on $\sigma_{\rm V}$ as well as $\mu_{\rm V}$
    - $\sigma_{\rm I}/\mu_{\rm I}$ depends only on $\sigma_{\rm V}$ 
- That is, current mismatch expressed as a fraction of the mean current depends only on voltage mismatch
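
These dependencies can be checked by sampling. The sketch below uses the standard lognormal identities with assumed symbols $m$ and $s$ for the mean and standard deviation of $\ln I$ (in the lecture's setting, $s$ would correspond to $\sigma_{\rm V}$ expressed in units of $U_T/\kappa$): the mean is $e^{m + s^2/2}$ and the ratio of standard deviation to mean is $\sqrt{e^{s^2} - 1}$.

```python
import math
import random

random.seed(0)


def lognormal_stats(m, s, n=100_000):
    """Sample mean and standard deviation of I = exp(x), with x ~ Normal(m, s)."""
    currents = [math.exp(random.gauss(m, s)) for _ in range(n)]
    mean_i = sum(currents) / n
    std_i = math.sqrt(sum((c - mean_i) ** 2 for c in currents) / n)
    return mean_i, std_i


s = 0.5  # assumed spread of ln(I); dimensionless
predicted_cv = math.sqrt(math.exp(s ** 2) - 1.0)

for m in (0.0, 2.0):
    mean_i, std_i = lognormal_stats(m, s)
    predicted_mean = math.exp(m + s ** 2 / 2.0)
    print(f"m={m}: mean={mean_i:.3f} (predicted {predicted_mean:.3f}), "
          f"std/mean={std_i / mean_i:.3f} (predicted {predicted_cv:.3f})")
```

Changing $m$ rescales both the mean and the standard deviation, but their ratio stays put because it depends on $s$ alone.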

### Including threshold mismatch in the transistor model

<img src="files/lecture7/id-vs-vg+mismatch.png" width="1200">

We extend our expression for the transistor's subthreshold drain current to include mismatch as follows

 - We draw an offset voltage, $\Delta V_{\rm G}$, from a normal distribution with mean zero and standard deviation $\sigma_{\rm V_{\rm th}}$:
 
 $p(\Delta V_{\rm G}) = \frac{1}{\sigma_{\rm V_{\rm th}}\sqrt{2\pi}}\exp\left(-\frac{\Delta V_{\rm G}^2}{2\sigma_{\rm V_{\rm th}}^2}\right)$

- We add this to the gate voltage, $V_{\rm G}$, and obtain the current with mismatch

 $I_{\rm D}(V_{\rm G}, V_{\rm S}) = I_0 e^{\{\kappa (V_{\rm G} + \Delta V_{\rm G}) - V_{\rm S}\}/U_T}$

- This expression can be written in terms of a gain factor $\Lambda$

 $I_{\rm D}(V_{\rm G}, V_{\rm S}) = \Lambda I_0 e^{(\kappa V_{\rm G} - V_{\rm S})/U_T}$ where $\Lambda = e^{\kappa\Delta V_{\rm G}/U_T}$

- That is to say, this transistor passes $\Lambda$ times more current than one without mismatch
- This gain $\Lambda$ is lognormally distributed
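
As a quick check of the gain-factor form, the sketch below draws one offset $\Delta V_{\rm G}$ and confirms that the mismatched transistor passes $\Lambda$ times the nominal current at any gate voltage. The constants, including the 10 mV value for $\sigma_{\rm V_{\rm th}}$, are assumptions made for illustration.

```python
import math
import random

# Illustrative constants (assumed for this sketch)
I_0 = 1e-15        # current scale, in amperes
KAPPA = 0.7        # gate coupling coefficient
U_T = 0.0256       # thermal voltage, in volts
SIGMA_VTH = 0.010  # assumed threshold-voltage standard deviation, in volts


def drain_current(v_g, v_s, delta_v_g=0.0):
    """Subthreshold drain current with an optional gate-voltage offset."""
    return I_0 * math.exp((KAPPA * (v_g + delta_v_g) - v_s) / U_T)


random.seed(1)
delta_v_g = random.gauss(0.0, SIGMA_VTH)    # this transistor's offset
gain = math.exp(KAPPA * delta_v_g / U_T)    # its gain factor, Lambda

for v_g in (0.2, 0.3, 0.4):
    ratio = drain_current(v_g, 0.0, delta_v_g) / drain_current(v_g, 0.0)
    print(f"V_G = {v_g} V: I_D / I_D(nominal) = {ratio:.4f}, Lambda = {gain:.4f}")
```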

### The distribution of transistor current gain ($\Lambda$)

<img src="files/lecture7/LambdaDist.png" width="1000">

We have $\Lambda = e^{\kappa\Delta V_{\rm G}/U_T}$ and we know that $\Delta V_{\rm G}$ is normally distributed. It follows that:

- $\Lambda$ is drawn from the lognormal distribution:

 $p(\Lambda)d\Lambda = \frac{1}{\sigma_{\rm V_{\rm th}}\sqrt{2\pi}}\exp\left(-\frac{(\ln(\Lambda)U_T/\kappa)^2}{2\sigma_{\rm V_{\rm th}}^2}\right)d\Delta V_{\rm G}$

    - We solved $\Lambda = e^{\kappa\Delta V_{\rm G}/U_T}$ for $\Delta V_{\rm G}$ in terms of $\Lambda$
    - And replaced $\Delta V_{\rm G}$ in the normal distribution with the resulting expression
    - Multiplying the two densities by $d\Lambda$ and $d\Delta V_{\rm G}$, respectively, equates the probabilities of drawing from corresponding ranges
    
- Since $d\Lambda/d\Delta V_{\rm G}=(\kappa/U_T)\exp(\kappa\Delta V_{\rm G}/U_T)=(\kappa/U_T)\Lambda$, we have

 $p(\Lambda) = \frac{1}{\Lambda(\kappa\sigma_{\rm V_{\rm th}}/U_T)\sqrt{2\pi}}\exp\left(-\frac{(\ln(\Lambda)U_T/\kappa)^2}{2\sigma_{\rm V_{\rm th}}^2}\right)$

- We can rewrite this more simply by expressing $\sigma_{\rm V_{\rm th}}$ in units of $U_T/\kappa$

- Naming this dimensionless standard deviation $\sigma_{U_T} = \kappa\sigma_{\rm V_{\rm th}}/U_T$, we have

 $p(\Lambda) = \frac{1}{\Lambda\sigma_{U_T}\sqrt{2\pi}}\exp\left(-\frac{(\ln\Lambda)^2}{2\sigma_{U_T}^2}\right)$
 
- Equivalently, $\ln\Lambda$ is normally distributed with mean zero and standard deviation $\sigma_{U_T}$
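
The final density can be sanity-checked numerically: it should integrate to one, and half of its mass should lie below $\Lambda = 1$ (the median, since $\ln\Lambda$ has mean zero). The sketch below uses a plain midpoint rule; the value $\sigma_{U_T} = 0.22$ and the integration limits are illustrative choices.

```python
import math


def lambda_pdf(lam, sigma_ut):
    """Lognormal density of the gain factor, with sigma in units of U_T / kappa."""
    return math.exp(-math.log(lam) ** 2 / (2 * sigma_ut ** 2)) / (
        lam * sigma_ut * math.sqrt(2 * math.pi)
    )


def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (i + 0.5) * width) for i in range(steps))


SIGMA_UT = 0.22
total = integrate(lambda lam: lambda_pdf(lam, SIGMA_UT), 0.0, 10.0)
below_one = integrate(lambda lam: lambda_pdf(lam, SIGMA_UT), 0.0, 1.0)
print(total, below_one)
```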

### Why include transistor mismatch in dimensionless models?

<img src="files/lecture7/mismatch-response-curve.png" width="1200">

It will enable us to code up dimensionless neuron models in Nengo whose heterogeneity precisely mirrors the heterogeneity in our silicon implementation of these models.

What's our motivation for doing this?

- With this precise correspondence in place, we can figure out how transistor sizes in our silicon circuit impact the performance of our NEF-configured networks of silicon spiking neurons 

- We expect there to be an optimum level of mismatch: 
    - **Too little mismatch**: Bad because the neurons' responses to input current are too similar (left, $\sigma_{U_T}=0.05$)
    - **Too much mismatch**: Bad because a lot of the neurons either never fire or fire all the time (right, $\sigma_{U_T}=0.40$)
    - **An intermediate level**: Most of the neurons start firing at input values between $-1$ and $+1$ (middle, $\sigma_{U_T}=0.22$)
    
- And, therefore, we expect there to be an optimum transistor size
    - Since the mismatch's standard deviation is inversely proportional to the square root of the transistor's area

### Including transistor mismatch in leaky membrane model I

<img src="files/lecture7/complete-lpf.png" width="520">

- In synthesizing the silicon circuit design, we left out mismatch. That is, we assumed that: 

    $I_{\rm D} = I_0 e^{(\kappa V_{\rm G}-V_{\rm S})/U_T}$
              
- Observe, however, that the model with mismatch can be written as:

    $I_{\rm D}/\Lambda = I_0 e^{(\kappa V_{\rm G}-V_{\rm S})/U_T}$
 
- Therefore, our derivation still holds if:
    - The $i^{\rm th}$ transistor's current is divided by its current gain 
    - That is, $I_i$ is replaced with $I_i/\Lambda_i$--the corresponding current in a zero-mismatch transistor
    - Transistors are numbered as shown in the figure

- In particular, our previous result

    $(I_u/I_v)I_{\rm lk} = I_0 e^{\kappa(V_u + V_{\rm lk} -V_v)/U_T}$
 
- becomes
    \begin{equation*}
    \frac{(I_u/\Lambda_2)(I_{\rm lk}/\Lambda_1)}{(I_v/\Lambda_4)} = I_0 e^{\kappa(V_u + V_{\rm lk} -V_v)/U_T}
    \end{equation*}
- And we know that 
    
    $I_0 e^{\kappa(V_u + V_{\rm lk} -V_v)/U_T} = I_3/\Lambda_3$
    
- Hence, we have
\begin{equation*}
    \frac{I_3}{\Lambda_3} = \frac{(I_u/\Lambda_2)(I_{\rm lk}/\Lambda_1)}{(I_v/\Lambda_4)} \iff\
    I_3 = \frac{\Lambda_3\Lambda_4}{\Lambda_1\Lambda_2}\frac{I_u I_{\rm lk}}{I_v}
\end{equation*}
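
The prefactor $\Lambda_3\Lambda_4/(\Lambda_1\Lambda_2)$ is itself lognormal. Assuming the four gains are independent and share the same $\sigma_{U_T}$ (an assumption made for this sketch), its logarithm is a sum of four zero-mean normal terms, so its standard deviation should be $2\sigma_{U_T}$. The sampling check below illustrates this.

```python
import math
import random

random.seed(2)
SIGMA_UT = 0.22  # illustrative dimensionless mismatch


def sample_gain(sigma_ut):
    """Draw one transistor's gain factor Lambda."""
    return math.exp(random.gauss(0.0, sigma_ut))


def combined_gain(sigma_ut):
    """Draw Lambda_3 * Lambda_4 / (Lambda_1 * Lambda_2) for four independent transistors."""
    lam1, lam2, lam3, lam4 = (sample_gain(sigma_ut) for _ in range(4))
    return (lam3 * lam4) / (lam1 * lam2)


log_gains = [math.log(combined_gain(SIGMA_UT)) for _ in range(50_000)]
mean_log = sum(log_gains) / len(log_gains)
std_log = math.sqrt(sum((x - mean_log) ** 2 for x in log_gains) / len(log_gains))
print(mean_log, std_log, 2 * SIGMA_UT)
```

Under these assumptions, the circuit as a whole is twice as heterogeneous, in log units, as any single transistor.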

### Including transistor mismatch in leaky membrane model II

<img src="files/lecture7/complete-lpf-I3.png" width="520">

Equating the rate of change of the capacitor's charge to the net current supplied, we have 

\begin{eqnarray*}
C\frac{dV_v}{dI_v}\frac{dI_v}{dt} & = & \frac{\Lambda_3\Lambda_4}{\Lambda_1\Lambda_2}\frac{I_u I_{\rm lk}}{I_v} - I_{\rm lk}\\
\iff \frac{C U_T}{\kappa I_v}\frac{dI_v}{dt} & = & \frac{\Lambda_3\Lambda_4}{\Lambda_1\Lambda_2}\frac{I_u I_{\rm lk}}{I_v} - I_{\rm lk}\\
\iff \frac{C U_T}{\kappa I_{\rm lk}}\frac{dI_v}{dt} & = & \frac{\Lambda_3\Lambda_4}{\Lambda_1\Lambda_2}I_u - I_v
\end{eqnarray*}

Converting to a dimensionless model by normalizing with $I_{\rm n}$ and setting $\tau = C U_T/\kappa I_{\rm lk}$ yields

\begin{eqnarray*}
\tau\frac{dv}{dt} & = & \frac{\Lambda_3\Lambda_4}{\Lambda_1\Lambda_2}u - v
\end{eqnarray*}

Note that we assumed that $I_{\rm lk}$ has no mismatch, which is not actually the case. 
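
To see what the gain factor does to the filter's response, the sketch below integrates the dimensionless equation with forward Euler for a step input. The time constant, gain, and input level are assumed values, and `simulate_lpf` is a hypothetical helper rather than part of Nengo.

```python
def simulate_lpf(u, gain, tau, t_end, dt):
    """Forward-Euler integration of tau * dv/dt = gain * u - v, starting from v = 0."""
    v = 0.0
    trace = [v]
    for _ in range(round(t_end / dt)):
        v += (dt / tau) * (gain * u - v)
        trace.append(v)
    return trace


TAU = 0.01      # assumed time constant, in seconds
GAIN = 1.3      # assumed value of Lambda_3 * Lambda_4 / (Lambda_1 * Lambda_2)
U_STEP = 0.5    # dimensionless step input

trace = simulate_lpf(U_STEP, GAIN, TAU, t_end=10 * TAU, dt=TAU / 1000)
print(trace[1000], trace[-1])  # v after one time constant, and after ten
```

The output settles at `GAIN * U_STEP` rather than `U_STEP`: mismatch scales the filter's steady-state gain but leaves its time constant unchanged.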
