# Eve: Make Deep Learning More Interesting

Before starting, make sure **Eve** is on your Python path.

You can install **Eve** from PyPI with ```pip install eve-ml```,

then verify the installation with ```python -c "import eve; print(eve.__version__)"```.

Now, let's learn more about **Eve**.

## Eve is a native extension of PyTorch

The core module of Eve is **eve.cores.Eve**, which inherits from **torch.nn.Module**.

You can build a deep learning network with **Eve** just as you would with PyTorch.

First, we import the necessary packages.

In [1]:
import eve
import torch
import eve.cores
from torchvision.datasets import MNIST
from torchvision import transforms
from tqdm import tqdm

Let's build a toy model using **Eve** to solve the MNIST classification task.

In [2]:
# To use the new features of Eve, the network must inherit from eve.cores.Eve, not nn.Module.
class ToyModel(eve.cores.Eve):
    def __init__(self):
        super().__init__()
        
        # You can define any nn.Module inside Eve; Eve handles them well.
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=1, out_channels=3, kernel_size=3, stride=2, padding=1),
            torch.nn.BatchNorm2d(3),
            torch.nn.ReLU(),
        )
        
        self.linear = torch.nn.Linear(14 * 14 * 3, 10)
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        conv = self.conv(x)
        
        conv = torch.flatten(conv, 1)
        linear = self.linear(conv)
        return torch.log_softmax(linear, dim=-1)
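
The input size of the linear layer, `14 * 14 * 3`, follows from the convolution settings. As a minimal sketch (plain Python, independent of Eve), the standard output-size formula for a 2D convolution can be checked by hand, assuming the usual 28 x 28 MNIST images. The helper name `conv2d_out_size` is hypothetical and only used for illustration.

```python
def conv2d_out_size(size: int, kernel_size: int, stride: int,
                    padding: int, dilation: int = 1) -> int:
    """Output size along one spatial axis of a 2D convolution (floor division)."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Settings taken from ToyModel: kernel_size=3, stride=2, padding=1, out_channels=3.
side = conv2d_out_size(28, kernel_size=3, stride=2, padding=1)
out_channels = 3
flat_features = side * side * out_channels

print(side, flat_features)  # 14 588
```

If you change the stride, padding, or number of channels, the first argument of `torch.nn.Linear` has to change accordingly.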

Load the dataset and define the dataloaders. **data_root** is where the MNIST dataset will be saved.

In [3]:
data_root = "/media/densechen/data/dataset"

# define dataset
train_dataset = MNIST(root=data_root, train=True, download=True, transform=transforms.ToTensor())
test_dataset = MNIST(root=data_root, train=False, download=True, transform=transforms.ToTensor())

# define dataloader
train_dataloader = torch.utils.data.DataLoader(train_dataset, num_workers=2, batch_size=128, drop_last=False)
test_dataloader = torch.utils.data.DataLoader(test_dataset, num_workers=2, batch_size=128, drop_last=False)

Define the network and optimizer, and move the network to CUDA if available.

In [4]:
# define network
toy_model = ToyModel()

use_cuda = torch.cuda.is_available()

if use_cuda:
    toy_model.cuda()
    print("Use cuda")
else:
    print("Use cpu")
    
# define optimizer
# !WARN: always call model.torch_parameters() to gather the parameters the optimizer needs.
optimizer = torch.optim.Adam(toy_model.torch_parameters(), lr=1e-3)

Use cuda


Train toy_model on the dataset for 10 epochs.

In [5]:
for i in range(10):
    toy_model.train()
    for data, target in train_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(toy_model(data), target)
        loss.backward()
        optimizer.step()

    correct = 0.0
    toy_model.eval()
    for data, target in test_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
            
        predict = toy_model(data)
        correct += (predict.max(dim=1)[1] == target).float().sum()
    print(f"Acc on Epoch {i+1}/10 is {100 * correct/len(test_dataset):.2f}%")

Acc on Epoch 1/10 is 91.53%
Acc on Epoch 2/10 is 93.00%
Acc on Epoch 3/10 is 93.71%
Acc on Epoch 4/10 is 94.20%
Acc on Epoch 5/10 is 94.94%
Acc on Epoch 6/10 is 95.20%
Acc on Epoch 7/10 is 95.58%
Acc on Epoch 8/10 is 95.81%
Acc on Epoch 9/10 is 96.05%
Acc on Epoch 10/10 is 96.27%
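
Note that the model returns log-probabilities (`log_softmax`), while `torch.nn.functional.cross_entropy` applies `log_softmax` again internally. This is harmless because `log_softmax` is idempotent: applying it to values that are already log-probabilities leaves them unchanged. The sketch below shows this in plain Python; the helper functions are illustrative stand-ins, not PyTorch's implementation.

```python
import math


def log_softmax(xs):
    """Numerically stable log-softmax over a list of floats."""
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]


def nll(log_probs, target):
    """Negative log-likelihood of the target class."""
    return -log_probs[target]


def cross_entropy(scores, target):
    """Cross entropy = log_softmax followed by negative log-likelihood."""
    return nll(log_softmax(scores), target)


logits = [2.0, -1.0, 0.5]
log_probs = log_softmax(logits)

loss_from_logits = cross_entropy(logits, 0)
loss_from_log_probs = cross_entropy(log_probs, 0)  # log_softmax applied twice

print(math.isclose(loss_from_logits, loss_from_log_probs, abs_tol=1e-12))
```

If you prefer to make the intent explicit, `torch.nn.functional.nll_loss` takes log-probabilities directly.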


## Quantization Neural Network

**Eve** supports many recently developed quantization methods.

Now, let's design an STE (straight-through estimator) quantization model to solve the MNIST classification task.

In [6]:
# To use the new features of Eve, the network must inherit from eve.cores.Eve, not nn.Module.
class SteToyModel(eve.cores.Eve):
    def __init__(self, max_bits=4):
        super().__init__()
        
        # You can define any nn.Module inside Eve; Eve handles them well.
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=1, out_channels=3, kernel_size=3, stride=2, padding=1),
            torch.nn.BatchNorm2d(3),
            torch.nn.ReLU(),
        )
        
        state = eve.cores.State(self.conv)
        self.ste = eve.cores.SteQuan(state=state, max_bits=max_bits)
        
        self.linear = torch.nn.Linear(14 * 14 * 3, 10)
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        conv = self.conv(x)
        
        ste = self.ste(conv)
        
        ste = torch.flatten(ste, 1)
        linear = self.linear(ste)
        return torch.log_softmax(linear, dim=-1)

Define an STE toy model and optimizer, and move the model to CUDA if available.

In [7]:
# define network
ste_toy_model = SteToyModel(max_bits=4)

if use_cuda:
    ste_toy_model.cuda()
    print("Use cuda")
else:
    print("Use cpu")
    
# define optimizer
optimizer = torch.optim.Adam(ste_toy_model.torch_parameters(), lr=1e-3)

Use cuda


Train ste_toy_model on the dataset for 10 epochs.

In [8]:
for i in range(10):
    ste_toy_model.train()
    for data, target in train_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(ste_toy_model(data), target)
        loss.backward()
        optimizer.step()

    correct = 0.0
    ste_toy_model.eval()
    for data, target in test_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
            
        predict = ste_toy_model(data)
        correct += (predict.max(dim=1)[1] == target).float().sum()
    print(f"Acc on Epoch {i+1}/10 is {100 * correct/len(test_dataset):.2f}%")

Acc on Epoch 1/10 is 89.51%
Acc on Epoch 2/10 is 91.11%
Acc on Epoch 3/10 is 91.38%
Acc on Epoch 4/10 is 91.71%
Acc on Epoch 5/10 is 91.67%
Acc on Epoch 6/10 is 91.59%
Acc on Epoch 7/10 is 91.68%
Acc on Epoch 8/10 is 91.61%
Acc on Epoch 9/10 is 91.55%
Acc on Epoch 10/10 is 91.33%


Here, we use max_bits = 4 for the quantization operation. In this run, the quantized model plateaus at around 91.5%, below the 96.27% reached by the full-precision model above, so these results do not show an improvement from quantization.

We think that the quantization operation can benefit the generalization ability of a model compared with a full-precision network, but this short experiment does not demonstrate it.
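
To make the idea of STE quantization concrete, here is a minimal sketch in plain Python. It is not Eve's `SteQuan` implementation, whose internals are not shown here. It assumes activations are clipped to [0, 1] and mapped onto `2**bits` uniformly spaced levels; the function names `quantize_unit` and `ste_backward` are hypothetical.

```python
def quantize_unit(x: float, bits: int) -> float:
    """Forward pass: clip x to [0, 1] and snap it to one of 2**bits levels."""
    steps = 2 ** bits - 1
    clipped = min(max(x, 0.0), 1.0)
    return round(clipped * steps) / steps


def ste_backward(x: float, grad_output: float) -> float:
    """Backward pass of the straight-through estimator.

    round() has zero gradient almost everywhere, so the estimator treats it
    as the identity: the incoming gradient passes through unchanged inside
    the clipping range and is zeroed outside of it.
    """
    return grad_output if 0.0 <= x <= 1.0 else 0.0


# With 4 bits there are 16 representable values between 0 and 1.
levels = sorted({quantize_unit(i / 1000, 4) for i in range(1001)})
print(len(levels))

print(quantize_unit(0.37, 4))   # snapped to the nearest of the 16 levels
print(ste_backward(0.37, 2.5))  # gradient passes straight through
print(ste_backward(1.7, 2.5))   # outside the clipping range: no gradient
```

With `bits=1` the output is binary, which is why a 1-bit quantizer can also serve to turn a continuous signal into spikes.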

## Spiking Neural Network

**Eve** provides special attributes to hold hidden states, rather than storing them as PyTorch buffers.

The hidden states of **Eve** are reset when **Eve.reset()** is called, and they are not included in state_dict().
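
The distinction between saved configuration and resettable hidden state can be sketched with a tiny integrate-and-fire neuron in plain Python. This is a hypothetical stand-in, not Eve's `IfNode`: the class name, the soft reset (subtracting the threshold after a spike), and the contents of `state_dict()` are all assumptions made for illustration.

```python
class IntegrateAndFire:
    """Toy integrate-and-fire neuron with a resettable hidden voltage."""

    def __init__(self, voltage_threshold: float = 1.0):
        self.voltage_threshold = voltage_threshold  # configuration: saved
        self.voltage = 0.0                          # hidden state: not saved

    def reset(self) -> None:
        """Clear the hidden state before processing a new input."""
        self.voltage = 0.0

    def step(self, current: float) -> int:
        """Accumulate the input; emit a spike once the threshold is reached."""
        self.voltage += current
        if self.voltage >= self.voltage_threshold:
            self.voltage -= self.voltage_threshold  # soft reset (assumption)
            return 1
        return 0

    def state_dict(self) -> dict:
        """Only persistent values are exported; the voltage is left out."""
        return {"voltage_threshold": self.voltage_threshold}


node = IntegrateAndFire(voltage_threshold=1.0)
spikes = [node.step(0.375) for _ in range(6)]
print(spikes)            # [0, 0, 1, 0, 0, 1]
print(node.voltage)      # 0.25 left over from the last timesteps

node.reset()
print(node.voltage)      # 0.0
print(node.state_dict())
```

Forgetting to reset would let the leftover voltage from one input leak into the next one, which is why the model below calls `self.reset()` at the start of `forward`.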

In [9]:
class SnnToyModel(eve.cores.Eve):
    def __init__(self, timesteps=5):
        """Unlike a traditional neural network, a spiking neural network processes the same
        input image repeatedly over several timesteps."""
        super().__init__()
        
        # record the max timesteps
        self.timesteps = timesteps
        
        # A spiking neural network needs an encoder to convert the rate encoding into
        # spike trains. Here, we use a Poisson encoder, as many spiking neural networks do.
        self.encoder = eve.cores.PoissonEncoder(timesteps=timesteps)
        
        # You can define any nn.Module inside Eve; Eve handles them well.
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=1, out_channels=3, kernel_size=3, stride=2, padding=1),
            torch.nn.BatchNorm2d(3),
            # torch.nn.ReLU(), # the ReLU is already handled by the node
        )
        
        state = eve.cores.State(self.conv) # gather the arguments needed to define the node and quan layers
        
        # add an IfNode
        self.ifnode = eve.cores.IfNode(state=state, time_independent=False)
        
        # add an STE layer to convert the voltage into spike signals.
        self.ste = eve.cores.SteQuan(state=state, max_bits=1)
        self.linear = torch.nn.Linear(14 * 14 * 3, 10)
        
        self.spike() # turn on spiking mode
        
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # NOTE: a spiking neural network holds voltage hidden states, which we should reset on every forward pass.
        self.reset()
        
        res = []
        for i in range(self.timesteps):
            conv = self.conv(x)
            
            ifnode = self.ifnode(conv)
            
            # insert ste layer here
            ste = self.ste(ifnode)
            
            ste = torch.flatten(ste, 1)
            linear = self.linear(ste)
            res.append(linear)
            
        res = torch.stack(res, dim=0).mean(dim=0)

        return torch.nn.functional.log_softmax(res, dim=-1)

Define the SNN toy model and optimizer, and move the model to CUDA if available.

In [10]:
max_timesteps = 3
snn_toy_model = SnnToyModel(max_timesteps)

if use_cuda:
    snn_toy_model.cuda()
    print("use cuda")
else:
    print("use cpu")
    
optimizer = torch.optim.Adam(snn_toy_model.torch_parameters(), lr=1e-3)

use cuda


Train snn_toy_model on the dataset for 10 epochs.

In [11]:
for i in range(10):
    snn_toy_model.train()
    for data, target in train_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(snn_toy_model(data), target)
        loss.backward()
        optimizer.step()

    correct = 0.0
    snn_toy_model.eval()
    for data, target in test_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
            
        predict = snn_toy_model(data)
        correct += (predict.max(dim=1)[1] == target).float().sum()
    print(f"Acc on Epoch {i+1}/10 is {100 * correct/len(test_dataset):.2f}%")

Acc on Epoch 1/10 is 89.97%
Acc on Epoch 2/10 is 90.89%
Acc on Epoch 3/10 is 91.43%
Acc on Epoch 4/10 is 91.69%
Acc on Epoch 5/10 is 91.95%
Acc on Epoch 6/10 is 92.21%
Acc on Epoch 7/10 is 92.49%
Acc on Epoch 8/10 is 92.65%
Acc on Epoch 9/10 is 92.76%
Acc on Epoch 10/10 is 92.59%


Currently, the performance of the spiking neural network still has considerable room for improvement.
In this example, you can increase max_timesteps to improve the final accuracy.
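
One plausible reason why more timesteps help is that a rate code becomes more precise as the spike train gets longer. The sketch below illustrates this with a plain-Python Poisson-style encoder. It is an assumption about the general technique, not a description of `eve.cores.PoissonEncoder`, and the function names are hypothetical.

```python
import random


def poisson_encode(intensity: float, timesteps: int, rng: random.Random) -> list:
    """Emit one spike per timestep with probability `intensity` (in [0, 1])."""
    return [1 if rng.random() < intensity else 0 for _ in range(timesteps)]


def decoded_rate(spikes: list) -> float:
    """Estimate the original intensity as the fraction of timesteps that spiked."""
    return sum(spikes) / len(spikes)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
for timesteps in (3, 30, 3000):
    spikes = poisson_encode(0.3, timesteps, rng)
    print(timesteps, decoded_rate(spikes))
```

With only 3 timesteps the decoded rate can take just four values (0, 1/3, 2/3, 1), so a pixel intensity of 0.3 is represented coarsely. Longer spike trains give a finer estimate, at the cost of more forward passes per image.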


You can fetch all the hidden states of **Eve** via:

In [12]:
for k, hid in snn_toy_model.named_hidden_states():
    print(k, torch.typename(hid))

ifnode.voltage_hid torch.cuda.FloatTensor


## Quan

In the following cells, we use the trainer provided by **Eve** to compare the performance of different quantization methods.

## SteQuan

SteQuan is the most widely used quantization function. It uses a fixed alpha parameter.

Compared with the trainable-alpha methods below, SteQuan is more stable and can be trained with a larger learning rate.

In [13]:
import eve
import eve.app
from gym.envs import make, spec, registry

mnist_trainer = make("mnist-v0",
               eve_net_kwargs={
                   "node": "IfNode",
                   "node_kwargs": {
                       "voltage_threshold": 0.5,
                       "time_independent": True,
                       "requires_upgrade": True,
                   },
                   "quan": "SteQuan",
                   "quan_kwargs": {
                       "max_bits": 8,
                       "requires_upgrade": True,
                   },
                   "encoder": "RateEncoder",
                   "encoder_kwargs": {
                       "timesteps": 1,
                   }
               },
               max_bits=8,
               root_dir="/media/densechen/data/code/eve-mli/examples/logs",
               data_root="/media/densechen/data/dataset",
               pretrained=None,
               device="auto")
print("SteQuan")
for i in range(10):
    mnist_trainer.train_one_epoch()
    acc = mnist_trainer.test_one_epoch()["acc"]
    print(f"Acc on Epoch {i+1}/10 is {100 * acc:.2f}%")

Using cuda device
load pretrained None failed.
'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.
bit_width reset to 8.
set baseline acc as 0.0
SteQuan
Acc on Epoch 1/10 is 89.87%
Acc on Epoch 2/10 is 90.68%
Acc on Epoch 3/10 is 91.19%
Acc on Epoch 4/10 is 91.60%
Acc on Epoch 5/10 is 91.69%
Acc on Epoch 6/10 is 91.49%
Acc on Epoch 7/10 is 91.79%
Acc on Epoch 8/10 is 91.88%
Acc on Epoch 9/10 is 91.74%
Acc on Epoch 10/10 is 92.05%


## LsqQuan

LsqQuan has a trainable alpha parameter, which is supervised by the global error.
It is unstable during training and prone to producing an invalid alpha value, that is, one that is not positive. The run below shows this: accuracy collapses after epoch 5.
Using a smaller learning rate is vital.
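
To see why alpha has to stay positive, consider a minimal sketch of quantization with a step size, in the style of learned step size quantization. This is plain Python and an assumption about the general technique; it is not Eve's `LsqQuan`, and `quantize_with_step` is a hypothetical name. It assumes unsigned values with integer levels from 0 to `2**bits - 1`.

```python
def quantize_with_step(x: float, alpha: float, bits: int) -> float:
    """Quantize x to an integer multiple of the step size alpha."""
    if alpha <= 0.0:
        # A zero step divides by zero; a negative step flips the sign of
        # every level, so the mapping no longer resembles the input.
        raise ValueError("alpha must be positive")
    q_max = 2 ** bits - 1
    level = min(max(round(x / alpha), 0), q_max)  # integer level in [0, q_max]
    return level * alpha


print(quantize_with_step(0.26, alpha=0.1, bits=8))   # close to 0.3
print(quantize_with_step(100.0, alpha=0.1, bits=8))  # saturates at 255 * alpha

try:
    quantize_with_step(0.26, alpha=-0.1, bits=8)
except ValueError as err:
    print("invalid alpha:", err)
```

When alpha is a trainable parameter, a single large gradient step can push it to zero or below. A smaller learning rate keeps the updates small enough to make this less likely.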

In [14]:
import eve
import eve.app
from gym.envs import make, spec, registry

mnist_trainer = make("mnist-v0",
               eve_net_kwargs={
                   "node": "IfNode",
                   "node_kwargs": {
                       "voltage_threshold": 0.5,
                       "time_independent": True,
                       "requires_upgrade": True,
                   },
                   "quan": "LsqQuan",
                   "quan_kwargs": {
                       "max_bits": 8,
                       "requires_upgrade": True,
                   },
                   "encoder": "RateEncoder",
                   "encoder_kwargs": {
                       "timesteps": 1,
                   }
               },
               max_bits=8,
               root_dir="/media/densechen/data/code/eve-mli/examples/logs",
               data_root="/media/densechen/data/dataset",
               pretrained=None,
               device="auto")
print("LsqQuan")
for i in range(10):
    mnist_trainer.train_one_epoch()
    acc = mnist_trainer.test_one_epoch()["acc"]
    print(f"Acc on Epoch {i+1}/10 is {100 * acc:.2f}%")

Using cuda device
load pretrained None failed.
'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.
bit_width reset to 8.
set baseline acc as 0.0
LsqQuan
Acc on Epoch 1/10 is 91.54%
Acc on Epoch 2/10 is 92.33%
Acc on Epoch 3/10 is 93.20%
Acc on Epoch 4/10 is 93.78%
Acc on Epoch 5/10 is 93.99%
Acc on Epoch 6/10 is 90.55%
Acc on Epoch 7/10 is 81.58%
Acc on Epoch 8/10 is 16.22%
Acc on Epoch 9/10 is 11.36%
Acc on Epoch 10/10 is 11.36%


## LlsqQuan

LlsqQuan has a trainable alpha parameter, which is supervised by the local error. It is more stable than LsqQuan.

In [15]:
import eve
import eve.app
from gym.envs import make, spec, registry

mnist_trainer = make("mnist-v0",
               eve_net_kwargs={
                   "node": "IfNode",
                   "node_kwargs": {
                       "voltage_threshold": 0.5,
                       "time_independent": True,
                       "requires_upgrade": True,
                   },
                   "quan": "LlsqQuan",
                   "quan_kwargs": {
                       "max_bits": 8,
                       "requires_upgrade": True,
                   },
                   "encoder": "RateEncoder",
                   "encoder_kwargs": {
                       "timesteps": 1,
                   }
               },
               max_bits=8,
               root_dir="/media/densechen/data/code/eve-mli/examples/logs",
               data_root="/media/densechen/data/dataset",
               pretrained=None,
               device="auto")
print("LlsqQuan")
for i in range(10):
    mnist_trainer.train_one_epoch()
    acc = mnist_trainer.test_one_epoch()["acc"]
    print(f"Acc on Epoch {i+1}/10 is {100 * acc:.2f}%")

Using cuda device
load pretrained None failed.
'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.
bit_width reset to 8.
set baseline acc as 0.0
LlsqQuan
Acc on Epoch 1/10 is 93.25%
Acc on Epoch 2/10 is 94.65%
Acc on Epoch 3/10 is 95.56%
Acc on Epoch 4/10 is 95.93%
Acc on Epoch 5/10 is 96.39%
Acc on Epoch 6/10 is 96.28%
Acc on Epoch 7/10 is 96.56%
Acc on Epoch 8/10 is 96.68%
Acc on Epoch 9/10 is 96.73%
Acc on Epoch 10/10 is 96.67%


## Hybrid Spiking and Quantization

In [16]:
import eve
import eve.app
from gym.envs import make, spec, registry

mnist_trainer = make("mnist-v0",
               eve_net_kwargs={
                   "node": "IfNode",
                   "node_kwargs": {
                       "voltage_threshold": 0.5,
                       "time_independent": True,
                       "requires_upgrade": True,
                   },
                   "quan": "SteQuan",
                   "quan_kwargs": {
                       "requires_upgrade": True,
                   },
                   "encoder": "RateEncoder",
                   "encoder_kwargs": {
                       "timesteps": 1,
                   }
               },
               max_bits=1,
               root_dir="/media/densechen/data/code/eve-mli/examples/logs",
               data_root="/media/densechen/data/dataset",
               pretrained=None,
               device="auto")
print("QuanSpiking")
mnist_trainer.eve_net.spike()
for i in range(10):
    mnist_trainer.train_one_epoch()
    acc = mnist_trainer.test_one_epoch()["acc"]
    print(f"Acc on Epoch {i+1}/10 is {100 * acc:.2f}%")

Using cuda device
load pretrained None failed.
'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.
bit_width reset to 1.
set baseline acc as 0.0
QuanSpiking
Acc on Epoch 1/10 is 89.67%
Acc on Epoch 2/10 is 91.16%
Acc on Epoch 3/10 is 91.38%
Acc on Epoch 4/10 is 91.48%
Acc on Epoch 5/10 is 91.55%
Acc on Epoch 6/10 is 91.46%
Acc on Epoch 7/10 is 91.03%
Acc on Epoch 8/10 is 91.40%
Acc on Epoch 9/10 is 91.67%
Acc on Epoch 10/10 is 91.75%
