A system of five linear equations has the following expression in matrix notation:
$$
\begin{bmatrix} 
y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 
\end{bmatrix} = 

\begin{bmatrix} 
    w_{1,1} & w_{1,2} & w_{1,3} & w_{1,4} & w_{1,5} \\ 
    w_{2,1} & w_{2,2} & w_{2,3} & w_{2,4} & w_{2,5} \\ 
    w_{3,1} & w_{3,2} & w_{3,3} & w_{3,4} & w_{3,5} \\ 
    w_{4,1} & w_{4,2} & w_{4,3} & w_{4,4} & w_{4,5} \\ 
    w_{5,1} & w_{5,2} & w_{5,3} & w_{5,4} & w_{5,5} \\ 
\end{bmatrix} 

\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} +

\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \end{bmatrix}
$$

We can simplify the above expression to:
$$
\mathbf{y} = \mathbf{W}\mathbf{x} + \mathbf{b}
$$
where:
* $\mathbf{x}$ is a column vector of inputs, with shape (5, 1).
* $\mathbf{y}$ is a column vector of outputs, with shape (5, 1).
* $\mathbf{W}$ is a matrix of coefficients (weights), with shape (5, 5).
* $\mathbf{b}$ is a column vector of biases, with shape (5, 1).

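As a quick numerical check of the matrix form above, the sketch below builds a random $\mathbf{W}$, $\mathbf{x}$, and $\mathbf{b}$ with NumPy and confirms that $\mathbf{W}\mathbf{x} + \mathbf{b}$ matches the five equations evaluated one row at a time. The seed and the random values are arbitrary choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is reproducible

W = rng.standard_normal((5, 5))  # coefficient matrix, shape (5, 5)
x = rng.standard_normal((5, 1))  # input column vector, shape (5, 1)
b = np.zeros((5, 1))             # bias column vector, shape (5, 1)

y = W @ x + b                    # matrix form of the five equations

# The same result, written out one equation at a time:
# y_i = sum_j w_{i,j} * x_j + b_i
y_rows = np.array([[sum(W[i, j] * x[j, 0] for j in range(5)) + b[i, 0]]
                   for i in range(5)])

print(y.shape, np.allclose(y, y_rows))
```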

In [4]:
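import torch
import torch.nn as nn

class LinearLayer(nn.Module):
    def __init__(self, input_size, output_size):
        super().__init__()
        # Register weights and biases as learnable torch tensors
        # (NumPy arrays cannot be passed to torch.mm).
        self.weights = nn.Parameter(torch.randn(input_size, output_size))
        # One bias per output feature.
        self.bias = nn.Parameter(torch.zeros(output_size))

    def forward(self, x):
        # x has shape (batch, input_size); this is the row-vector
        # (transposed) form of y = Wx + b.
        return torch.mm(x, self.weights) + self.bias

layer = LinearLayer(5, 5)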

## 3. Conversion of a Deep Neural Network to a Memristive Deep Neural Network

Within MemTorch, `memtorch.mn.Module.patch_model` can be used to convert DNNs to MDNNs. Prior to conversion, a memristive device model must be defined and characterized in part (prior to the introduction of other non-ideal device characteristics).

In the cell below:
* A reference (base) memristor model from `memtorch.bh.memristor` is defined.
* Optional reference memristor keyword arguments are set.
* A `memtorch.bh.memristor.Memristor` object is instantiated.
* The hysteresis loop of the instantiated memristor object is generated/plotted.
* The bipolar switching behaviour of the instantiated memristor object is generated/plotted.

In [None]:
reference_memristor = memtorch.bh.memristor.VTEAM
reference_memristor_params = {'time_series_resolution': 1e-10}
memristor = reference_memristor(**reference_memristor_params)
memristor.plot_hysteresis_loop()
memristor.plot_bipolar_switching_behaviour()

In the cell below, the trained DNN from Section 2 is converted to an equivalent MDNN, where all convolutional layers are replaced with memristive-equivalent layers. While only *Conv2d* layers are converted for demonstration purposes, we note that MemTorch currently supports conversion of *Conv1d*, *Conv2d*, *Conv3d*, and *Linear* layers. Specifically:
* `memtorch.map.Parameter.naive_map` is used to convert the weights within all `torch.nn.Conv2d` layers to equivalent conductance values, to be programmed to the two memristive devices used to represent each weight (positive and negative, respectively).
* `tile_shape` is set to (128, 128), so that modular crossbar tiles of size 128x128 are used to represent weights.
* `ADC_resolution` is set to 8 to set the bit width of all emulated Analogue to Digital Converters (ADC).
* `ADC_overflow_rate` is used to set the initial overflow rate of each ADC.
* `quant_method` is used to set the quantization method used (linear, by default).
* `transistor` is set to `True`, so a 1T1R arrangement is simulated.
* `programming_routine` is set to `None` to skip device-level simulation of the programming routine.



We note that if `transistor` is `False`, `programming_routine` must not be `None`. In that case, device-level simulation is performed for each device using `memtorch.bh.crossbar.gen_programming_signal` and `memtorch.bh.memristor.Memristor.simulate`, which use finite differences to model internal device dynamics. As `scheme` is not defined, a double-column parameter representation scheme is adopted. Finally, `max_input_voltage` is set to 0.3, meaning inputs to each layer are encoded between -0.3 V and +0.3 V.

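To make the double-column scheme and the input encoding more concrete, the following is a minimal sketch, not MemTorch's implementation, of one way a weight matrix could be mapped linearly to a pair of conductance matrices whose difference is proportional to the weights, together with a simple scaling of inputs to the ±0.3 V range. The function names `double_column_map` and `scale_inputs`, and the `r_on` and `r_off` values, are hypothetical and chosen only for illustration. The final line checks that the weights can be recovered from the differential conductance.

```python
import numpy as np

def double_column_map(weights, r_on=1e3, r_off=1e4):
    """Map weights to (g_pos, g_neg) with g_pos - g_neg = scale * weights.

    Assumes `weights` contains at least one non-zero entry.
    """
    g_on, g_off = 1.0 / r_on, 1.0 / r_off                 # conductance bounds (S)
    scale = (g_on - g_off) / np.abs(weights).max()
    g_pos = g_off + scale * np.clip(weights, 0.0, None)   # carries positive weights
    g_neg = g_off + scale * np.clip(-weights, 0.0, None)  # carries negative weights
    return g_pos, g_neg, scale

def scale_inputs(x, max_input_voltage=0.3):
    """Linearly scale inputs so the largest magnitude maps to max_input_voltage."""
    return max_input_voltage * x / np.abs(x).max()

rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 3))
g_pos, g_neg, scale = double_column_map(weights)
voltages = scale_inputs(rng.standard_normal(4))

# The differential conductance recovers the weights up to the scale factor.
recovered = (g_pos - g_neg) / scale
print(np.allclose(recovered, weights))
```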
In [None]:
import copy
from memtorch.mn.Module import patch_model
from memtorch.map.Input import naive_scale
from memtorch.map.Parameter import naive_map


model = Net().to(device)
model.load_state_dict(torch.load('trained_model.pt'), strict=False)
patched_model = patch_model(copy.deepcopy(model),
                          memristor_model=reference_memristor,
                          memristor_model_params=reference_memristor_params,
                          module_parameters_to_patch=[torch.nn.Conv2d],
                          mapping_routine=naive_map,
                          transistor=True,
                          programming_routine=None,
                          tile_shape=(128, 128),
                          max_input_voltage=0.3,
                          scaling_routine=naive_scale,
                          ADC_resolution=8,
                          ADC_overflow_rate=0.,
                          quant_method='linear')

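To illustrate what a linear quantization method with an 8-bit `ADC_resolution` implies, the sketch below uniformly quantizes an array to at most $2^8 = 256$ levels spanning its own range. This is only an illustration under stated assumptions: the helper `linear_quantize` is hypothetical, and MemTorch's ADC model (including how the overflow rate determines the quantization range) may differ.

```python
import numpy as np

def linear_quantize(values, bits=8):
    """Uniformly quantize `values` to at most 2**bits levels over [min, max]."""
    levels = 2 ** bits
    lo, hi = values.min(), values.max()
    if hi == lo:                      # constant input: nothing to quantize
        return values.copy()
    step = (hi - lo) / (levels - 1)   # spacing between adjacent levels
    return lo + np.round((values - lo) / step) * step

rng = np.random.default_rng(0)
signal = rng.standard_normal(10_000)
quantized = linear_quantize(signal, bits=8)

# The number of distinct output values never exceeds 2**8.
print(np.unique(quantized).size)
```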
In the cell below, all patched `torch.nn.Conv2d` layers are tuned using linear regression. A randomly generated tensor of size (8, `self.in_channels`, 32, 32) is propagated through each memristive layer and its legacy counterpart (accessible using `layer.forward_legacy`). `sklearn.linear_model.LinearRegression` is then used to determine the coefficient and intercept of the linear relationship between the two sets of outputs. These define the `transform_output` lambda function, which maps the output of each memristive layer to its equivalent legacy representation.

In [None]:
patched_model.tune_()

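The tuning step can be pictured with the following sketch, which assumes synthetic stand-ins for the memristive and legacy outputs and uses `numpy.polyfit` in place of `sklearn.linear_model.LinearRegression` to fit a single coefficient and intercept. The variable names and the synthetic scale, offset, and noise level are hypothetical; the actual procedure is the one performed by `tune_`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the output of layer.forward_legacy on a random input.
legacy_out = rng.standard_normal(1000)
# Stand-in for the raw memristive output: scaled, offset, and slightly noisy.
memristive_out = 0.02 * legacy_out + 0.001 + 1e-5 * rng.standard_normal(1000)

# Fit legacy_out ~ coef * memristive_out + intercept (degree-1 least squares).
coef, intercept = np.polyfit(memristive_out, legacy_out, deg=1)

# Maps raw memristive outputs back to the legacy output range.
transform_output = lambda out: coef * out + intercept

print(np.allclose(transform_output(memristive_out), legacy_out, atol=1e-2))
```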
Finally, in the cell below, the converted and tuned MDNN is benchmarked using the MNIST test data set.
*Note: This cell may take a considerable amount of time to run.*

In [None]:
print(test(patched_model, test_loader))

## 4. Modeling Non-Ideal Device Characteristics


Non-ideal device characteristics can either be encapsulated within device specific memristive models, or introduced to base (generic) models after conversion, using `memtorch.bh.nonideality.NonIdeality.apply_nonidealities`. Currently, the following non-ideal device characteristics are supported:
* `memtorch.bh.nonideality.DeviceFaults`
* `memtorch.bh.nonideality.Endurance` and `memtorch.bh.nonideality.Retention`
* `memtorch.bh.nonideality.FiniteConductanceStates`
* `memtorch.bh.nonideality.NonLinear`

Stochastic parameters, used to model process variances, can be defined using `memtorch.bh.StochasticParameter`. The introduction of each type of non-ideal device characteristic is demonstrated below.


### 4.1 Modeling Device Faults

Memristive devices are susceptible to failure, either by failing to electroform at a pristine state, or by becoming stuck at high or low resistance states. MemTorch incorporates a specific function for accounting for device failure, `memtorch.bh.nonideality.DeviceFaults`.

In the cell below:
* The original patched model is copied using `copy.deepcopy`.
* `lrs_proportion` is set to 0.25, so that 25% of devices are assumed to fail to a low resistance state.
* `hrs_proportion` is set to 0.10, so that 10% of devices are assumed to fail to a high resistance state.
* `electroform_proportion` is set to 0, so that no devices are assumed to fail to electroform.

It is assumed that the total proportion of devices set to a high resistance state is equal to the proportion of devices that fail to electroform at pristine states plus the proportion of devices stuck at a high resistance state.



In [None]:
from memtorch.bh.nonideality.NonIdeality import apply_nonidealities

patched_model_ = apply_nonidealities(copy.deepcopy(patched_model),
                                  non_idealities=[memtorch.bh.nonideality.NonIdeality.DeviceFaults],
                                  lrs_proportion=0.25,
                                  hrs_proportion=0.10,
                                  electroform_proportion=0)

In [None]:
print(test(patched_model_, test_loader))

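The effect of stuck-at faults described in this subsection can be pictured with the sketch below, which forces randomly chosen entries of a conductance matrix to the on or off conductance. This is a simplified illustration and not MemTorch's implementation: the helper `apply_stuck_faults` and the conductance bounds are hypothetical.

```python
import numpy as np

def apply_stuck_faults(conductance, g_on, g_off, lrs_proportion, hrs_proportion, rng):
    """Return a copy with randomly chosen devices stuck at g_on (LRS) or g_off (HRS)."""
    faulty = conductance.copy()
    flat = faulty.reshape(-1)                 # view onto the copy
    order = rng.permutation(flat.size)        # random, non-overlapping device indices
    n_lrs = int(round(lrs_proportion * flat.size))
    n_hrs = int(round(hrs_proportion * flat.size))
    flat[order[:n_lrs]] = g_on                # stuck at low resistance (high conductance)
    flat[order[n_lrs:n_lrs + n_hrs]] = g_off  # stuck at high resistance (low conductance)
    return faulty

rng = np.random.default_rng(0)
g_on, g_off = 1e-3, 1e-4                      # assumed conductance bounds (S)
conductance = rng.uniform(2e-4, 9e-4, size=(128, 128))
faulty = apply_stuck_faults(conductance, g_on, g_off, 0.25, 0.10, rng)

# Fractions of devices stuck at each state.
print((faulty == g_on).mean(), (faulty == g_off).mean())
```

The two printed fractions should be close to the requested proportions of 0.25 and 0.10, up to rounding to a whole number of devices.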
### 4.2 Modeling Device Endurance and Retention

Memristive devices possess non-ideal endurance and retention properties, which should be accounted for. MemTorch incorporates specific functions for accounting for device endurance and retention characteristics, `memtorch.bh.nonideality.Endurance`, and `memtorch.bh.nonideality.Retention`, respectively.

All endurance and retention models are defined in `memtorch.bh.nonideality.endurance_retention_models`.

In the cell below:
* The original patched model is copied using `copy.deepcopy`.
* `x`, the number of SET-RESET cycles, is set to 10,000.
* Endurance characteristics are accounted for using `memtorch.bh.nonideality.NonIdeality.Endurance` and `memtorch.bh.nonideality.endurance_retention_models.model_endurance_retention`.
* `operation_mode` within `endurance_model_kwargs` is set to `sudden`, so that sudden failure is modeled, and various other model arguments are set.


In [None]:
from memtorch.bh.nonideality.NonIdeality import apply_nonidealities

patched_model_ = apply_nonidealities(copy.deepcopy(patched_model),
                                  non_idealities=[memtorch.bh.nonideality.NonIdeality.Endurance],
                                  x=1e4,
                                  endurance_model=memtorch.bh.nonideality.endurance_retention_models.model_endurance_retention,
                                  endurance_model_kwargs={
                                        "operation_mode": memtorch.bh.nonideality.endurance_retention_models.OperationMode.sudden,
                                        "p_lrs": [1, 0, 0, 0],
                                        "stable_resistance_lrs": 100,
                                        "p_hrs": [1, 0, 0, 0],
                                        "stable_resistance_hrs": 1000,
                                        "cell_size": 10,
                                        "temperature": 350,
                                  })

In [None]:
print(test(patched_model_, test_loader))

In the cell below:
* The original patched model is copied using `copy.deepcopy`.
* `time`, the retention time, is set to be equal to 1,000s.
* Retention characteristics are accounted for using `memtorch.bh.nonideality.NonIdeality.Retention` and `memtorch.bh.nonideality.endurance_retention_models.model_conductance_drift`.
* `initial_time` within `retention_model_kwargs`, the initial time, is set to be equal to 1s.
* `drift_coefficient` within `retention_model_kwargs` is set to be equal to 0.1.

In [None]:
from memtorch.bh.nonideality.NonIdeality import apply_nonidealities

patched_model_ = apply_nonidealities(copy.deepcopy(patched_model),
                                  non_idealities=[memtorch.bh.nonideality.NonIdeality.Retention],
                                  time=1e3,
                                  retention_model=memtorch.bh.nonideality.endurance_retention_models.model_conductance_drift,
                                  retention_model_kwargs={
                                        "initial_time": 1,
                                        "drift_coefficient": 0.1,
                                  })

In [None]:
print(test(patched_model_, test_loader))

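A power-law expression, $G(t) = G_0 \, (t / t_0)^{-\nu}$, is commonly used to describe conductance drift. Assuming that form (the exact expression used by `model_conductance_drift` may differ), the sketch below evaluates the drift factor for the parameters used above. The function name `drifted_conductance` and the initial conductance value are hypothetical.

```python
def drifted_conductance(g0, time, initial_time=1.0, drift_coefficient=0.1):
    """Power-law drift: G(t) = G0 * (t / t0) ** (-nu). Assumed form, for illustration."""
    return g0 * (time / initial_time) ** (-drift_coefficient)

g0 = 1e-3  # assumed initial conductance (S)
g_after = drifted_conductance(g0, time=1e3, initial_time=1.0, drift_coefficient=0.1)

# Ratio of drifted to initial conductance.
print(g_after / g0)
```

Under this assumed model, the conductance falls to roughly half of its initial value after 1,000 s, since $1000^{-0.1} = 10^{-0.3} \approx 0.50$.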
### 4.3 Modeling a Finite Number of Conductance States

Realistic memristive devices are non-ideal and have a finite number of stable, discrete, electrically switchable conductance states, bounded by a low-conductance semiconducting state and a high-conductance metallic state. MemTorch incorporates a specific function for accounting for devices with a finite number of conductance states, `memtorch.bh.nonideality.FiniteConductanceStates`.

In the cell below:
* The original patched model is copied using `copy.deepcopy`.
* A finite number of conductance states are accounted for using `memtorch.bh.nonideality.NonIdeality.FiniteConductanceStates`.
* `conductance_states` is set to be equal to 5, to model 5 evenly-distributed conductance states.

In [None]:
from memtorch.bh.nonideality.NonIdeality import apply_nonidealities

patched_model_ = apply_nonidealities(copy.deepcopy(patched_model),
                                  non_idealities=[memtorch.bh.nonideality.NonIdeality.FiniteConductanceStates],
                                  conductance_states=5)

In [None]:
print(test(patched_model_, test_loader))

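As a rough picture of what a finite number of evenly distributed conductance states means, the sketch below snaps each entry of a conductance matrix to the nearest of 5 evenly spaced levels. The helper `snap_to_states` and the conductance bounds are hypothetical, and MemTorch's own implementation may differ.

```python
import numpy as np

def snap_to_states(conductance, g_off, g_on, conductance_states=5):
    """Snap each conductance to the nearest of N evenly spaced states."""
    states = np.linspace(g_off, g_on, conductance_states)
    # Index of the closest state for every device.
    nearest = np.abs(conductance[..., None] - states).argmin(axis=-1)
    return states[nearest], states

rng = np.random.default_rng(0)
g_off, g_on = 1e-4, 1e-3                       # assumed conductance bounds (S)
conductance = rng.uniform(g_off, g_on, size=(8, 8))
snapped, states = snap_to_states(conductance, g_off, g_on, conductance_states=5)

# Only values from the 5 allowed states remain.
print(np.unique(snapped))
```

With fewer states, the worst-case error introduced per device grows, as it is bounded by half the spacing between adjacent states.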
### 4.4 Modeling Non-Linear Device Characteristics

Non-ideal memristive devices have non-linear I/V device characteristics, especially at high voltages, which are difficult to accurately and efficiently model. The `memtorch.bh.nonideality.NonLinear.apply_non_linear` function can be used to efficiently model non-linear device I/V characteristics during inference for devices with an infinite number of discrete conductance states, and for devices with a finite number of conductance states.

For cases where devices are not simulated using their internal dynamics, it is assumed that the change in conductance during read cycles is negligible.

Within MemTorch, `memtorch.bh.nonideality.NonLinear.apply_non_linear` uses two methods to effectively model non-linear device I/V characteristics:

1. During inference, each device is simulated for timesteps of duration `device.time_series_resolution` using `device.simulate`.
2. Post weight mapping and programming, the I/V characteristics of each device are determined using a single reset voltage sweep.

In the cell below:
* The original patched model is copied using `copy.deepcopy`.
* Non-linear device characteristics are accounted for using `memtorch.bh.nonideality.NonLinear`.
* `simulate` is set to be equal to `True`, so during inference each device is simulated.




In [None]:
from memtorch.bh.nonideality.NonIdeality import apply_nonidealities

patched_model_ = apply_nonidealities(copy.deepcopy(patched_model),
                                  non_idealities=[memtorch.bh.nonideality.NonIdeality.NonLinear],
                                  simulate=True)

In [None]:
print(test(patched_model_, test_loader))

In the cell below:
* The original patched model is copied using `copy.deepcopy`.
* Non-linear device characteristics are accounted for using `memtorch.bh.nonideality.NonLinear`.
* `simulate` is not set, so the I/V characteristics of each device are determined using a single reset voltage sweep.
* `sweep_duration` is set to be equal to 2s.
* `sweep_voltage_signal_amplitude` is set to be equal to 1V.
* `sweep_voltage_signal_frequency` is set to be equal to 0.5Hz.


In [None]:
from memtorch.bh.nonideality.NonIdeality import apply_nonidealities

patched_model_ = apply_nonidealities(copy.deepcopy(patched_model),
                                  non_idealities=[memtorch.bh.nonideality.NonIdeality.NonLinear],
                                  sweep_duration=2,
                                  sweep_voltage_signal_amplitude=1,
                                  sweep_voltage_signal_frequency=0.5)

In [None]:
print(test(patched_model_, test_loader))

### 4.5 Modeling Stochastic Parameters

MemTorch supports stochastic parameters, defined using `memtorch.bh.StochasticParameter.StochasticParameter`, as a simple and flexible way to account for process variances. Stochastic parameters can be used when defining device characteristics.

In the cell below:
* A memristor object is characterised using stochastic parameters defining low and high resistance states.
* The memristor object is instantiated, and its hysteresis loop and bipolar switching behaviour are generated/plotted.

Each time the memristor object is instantiated, stochastic parameters will be resampled.


In [None]:
import memtorch

reference_memristor = memtorch.bh.memristor.VTEAM
# Assumed standard deviation for r_on (illustrative value).
sigma = 200
reference_memristor_params = {'time_series_resolution': 1e-10,
                              'r_off': memtorch.bh.StochasticParameter(loc=1000, scale=200, min=2),
                              'r_on': memtorch.bh.StochasticParameter(loc=5000, scale=sigma, min=1)}

memristor = reference_memristor(**reference_memristor_params)
memristor.plot_hysteresis_loop()
memristor.plot_bipolar_switching_behaviour()

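The resampling behaviour described above can be illustrated with the following sketch, which assumes a normally distributed parameter clipped to a minimum value. The helper `make_stochastic_parameter` is hypothetical and only mimics the idea; it is not MemTorch's `StochasticParameter`.

```python
import numpy as np

def make_stochastic_parameter(loc, scale, min_value=0.0, max_value=np.inf, rng=None):
    """Return a zero-argument function that draws a clipped normal sample."""
    if rng is None:
        rng = np.random.default_rng()

    def sample():
        return float(np.clip(rng.normal(loc, scale), min_value, max_value))

    return sample

rng = np.random.default_rng(0)
r_off = make_stochastic_parameter(loc=1000, scale=200, min_value=2, rng=rng)

# Each call stands in for instantiating a new device with its own r_off.
samples = [r_off() for _ in range(5)]
print(samples)
```

Each call yields a different value, which mirrors how every newly instantiated memristor object receives its own sampled parameters.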
## Final Remarks
A complete API is available [here](https://memtorch.readthedocs.io/). To learn how to use MemTorch, and to reproduce results of ‘_MemTorch: An Open-source Simulation Framework for Memristive Deep Learning Systems_’, we provide numerous tutorials in the form of Jupyter notebooks [here](https://memtorch.readthedocs.io/en/latest/tutorials.html).

Issues, feature requests, and improvements are welcome. They are tracked at https://github.com/coreylammie/MemTorch/projects/1 and should be reported [here](https://github.com/coreylammie/MemTorch/issues).