# Eve: Make Deep Learning More Interesting

Before starting, make sure **Eve** is on your Python path.

You can install **Eve** from PyPI with `pip install eve-ml`, then check the installation with `python -c "import eve; print(eve.__version__)"`.
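If you would rather run the check from inside a script, the sketch below uses only the standard library. The helper name `is_importable` is made up for this example, and it only reports whether a module named `eve` can be found, not which distribution provides it.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "eve" is the import name used throughout this tutorial.
    print("eve importable:", is_importable("eve"))
```

If this prints `False`, the interpreter you are running is probably not the one that `pip` installed into.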

Now, let's learn more about **Eve**.

## Eve is a native extension of PyTorch

The core class of Eve is **eve.cores.Eve**, which inherits from **torch.nn.Module**.

You can build a deep learning network with **Eve** just as you would with PyTorch.

First, we import the necessary packages.

In [1]:
import eve
import torch
import eve.cores
from torchvision.datasets import MNIST
from torchvision import transforms
from tqdm import tqdm

Let's build a toy model with **Eve** to solve the MNIST classification task.

In [2]:
# To use the new features of Eve, the network must inherit from eve.cores.Eve, not nn.Module.
class ToyModel(eve.cores.Eve):
    def __init__(self):
        super().__init__()
        
        # You can use any nn.Module inside an Eve network; Eve handles them well.
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=1, out_channels=3, kernel_size=3, stride=2, padding=1),
            torch.nn.BatchNorm2d(3),
            torch.nn.ReLU(),
        )
        
        self.linear = torch.nn.Linear(14 * 14 * 3, 10)
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        conv = self.conv(x)
        
        conv = torch.flatten(conv, 1)
        linear = self.linear(conv)
        return torch.log_softmax(linear, dim=-1)
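The `14 * 14 * 3` input size of the linear layer follows from the convolution settings. As a small worked check, assuming the usual 28x28 MNIST images and the standard convolution output-size formula with dilation 1 (the helper name `conv2d_output_size` is made up for this sketch):

```python
def conv2d_output_size(size: int, kernel_size: int, stride: int, padding: int) -> int:
    """Spatial output size of a 2D convolution with dilation 1."""
    return (size + 2 * padding - kernel_size) // stride + 1


side = conv2d_output_size(size=28, kernel_size=3, stride=2, padding=1)
channels = 3  # out_channels of the Conv2d layer
print(side, side * side * channels)  # 14 588
```

If you change the kernel size, stride or padding of the convolution, the first argument of `torch.nn.Linear` has to change accordingly.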

Load the dataset and define the dataloaders. **data_root** is where the MNIST dataset will be saved.

In [3]:
data_root = "/media/densechen/data/dataset"

# define dataset
train_dataset = MNIST(root=data_root, train=True, download=True, transform=transforms.ToTensor())
test_dataset = MNIST(root=data_root, train=False, download=True, transform=transforms.ToTensor())

# define dataloader
train_dataloader = torch.utils.data.DataLoader(train_dataset, num_workers=2, batch_size=128, drop_last=False)
test_dataloader = torch.utils.data.DataLoader(test_dataset, num_workers=2, batch_size=128, drop_last=False)

Define the network and optimizer, and move them to cuda if possible.

In [4]:
# define network
toy_model = ToyModel()

use_cuda = torch.cuda.is_available()

if use_cuda:
    toy_model.cuda()
    print("Use cuda")
else:
    print("Use cpu")
    
# define optimizer
# WARNING: always call model.torch_parameters() to gather the parameters for the optimizer.
optimizer = torch.optim.Adam(toy_model.torch_parameters(), lr=1e-3)

Use cuda


Train toy_model on the dataset for 10 epochs.

In [5]:
for i in range(10):
    toy_model.train()
    for data, target in train_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(toy_model(data), target)
        loss.backward()
        optimizer.step()

    correct = 0.0
    toy_model.eval()
    for data, target in test_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
            
        predict = toy_model(data)
        correct += (predict.max(dim=1)[1] == target).float().sum()
    print(f"Acc on Epoch {i+1}/10 is {100 * correct/len(test_dataset):.2f}%")

Acc on Epoch 1/10 is 91.21%
Acc on Epoch 2/10 is 92.32%
Acc on Epoch 3/10 is 93.25%
Acc on Epoch 4/10 is 94.00%
Acc on Epoch 5/10 is 94.62%
Acc on Epoch 6/10 is 94.99%
Acc on Epoch 7/10 is 95.29%
Acc on Epoch 8/10 is 95.52%
Acc on Epoch 9/10 is 95.86%
Acc on Epoch 10/10 is 95.77%
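Note that the model returns `log_softmax` outputs while the training loop passes them to `torch.nn.functional.cross_entropy`, which applies log-softmax to its input again. This is harmless, because log-softmax leaves log-probabilities unchanged. The plain-Python sketch below (not PyTorch code, helper names are illustrative) checks that property:

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum_exp for v in logits]


def cross_entropy(logits, target):
    """Negative log-probability of the target class, with log-softmax applied inside."""
    return -log_softmax(logits)[target]


logits = [2.0, -1.0, 0.5]
once = log_softmax(logits)
twice = log_softmax(once)
# Applying log-softmax to log-probabilities leaves them unchanged (up to rounding).
print(all(abs(a - b) < 1e-12 for a, b in zip(once, twice)))  # True
# So the loss is the same whether the model outputs logits or log-probabilities.
print(abs(cross_entropy(logits, 0) - cross_entropy(once, 0)) < 1e-12)  # True
```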


## Quantized Neural Network

**Eve** supports many recently developed quantization methods.

Now, let's design an STE (straight-through estimator) quantization model to solve the MNIST classification task.

In [6]:
# To use the new features of Eve, the network must inherit from eve.cores.Eve, not nn.Module.
class SteToyModel(eve.cores.Eve):
    def __init__(self, max_bit_width=4):
        super().__init__()
        
        # You can use any nn.Module inside an Eve network; Eve handles them well.
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=1, out_channels=3, kernel_size=3, stride=2, padding=1),
            torch.nn.BatchNorm2d(3),
            torch.nn.ReLU(),
        )
        
        state = eve.cores.State(self.conv)
        self.ste = eve.cores.SteQuan(state=state, max_bit_width=max_bit_width)
        
        self.linear = torch.nn.Linear(14 * 14 * 3, 10)
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        conv = self.conv(x)
        
        ste = self.ste(conv)
        
        ste = torch.flatten(ste, 1)
        linear = self.linear(ste)
        return torch.log_softmax(linear, dim=-1)
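To make the idea behind an STE quantizer concrete, here is a dependency-free sketch. It is not Eve's `SteQuan` implementation: the clipping range `[0, alpha]`, the uniform levels and the function names are assumptions chosen for illustration. The forward pass rounds to `2 ** bit_width` levels; the backward pass pretends the rounding was the identity.

```python
def quantize(x: float, bit_width: int, alpha: float = 1.0) -> float:
    """Clip x to [0, alpha] and round it to one of 2**bit_width evenly spaced levels."""
    steps = 2 ** bit_width - 1
    clipped = min(max(x, 0.0), alpha)
    return round(clipped / alpha * steps) / steps * alpha


def ste_grad(x: float, upstream_grad: float, alpha: float = 1.0) -> float:
    """Straight-through estimator: treat rounding as identity in the backward pass.

    The gradient passes through unchanged inside the clipping range and is
    zero outside it, where the clipped output no longer depends on x.
    """
    return upstream_grad if 0.0 <= x <= alpha else 0.0


for value in (-0.2, 0.3, 0.62, 1.7):
    print(value, quantize(value, bit_width=4), ste_grad(value, upstream_grad=1.0))
```

Rounding has a zero gradient almost everywhere, so without the straight-through trick no learning signal would reach the layers before the quantizer.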

Define an STE toy model and optimizer, and move them to cuda if possible.

In [7]:
# define network
ste_toy_model = SteToyModel(max_bit_width=4)

if use_cuda:
    ste_toy_model.cuda()
    print("Use cuda")
else:
    print("Use cpu")
    
# define optimizer
optimizer = torch.optim.Adam(ste_toy_model.torch_parameters(), lr=1e-3)

Use cuda


Train ste_toy_model on the dataset for 10 epochs.

In [8]:
for i in range(10):
    ste_toy_model.train()
    for data, target in train_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(ste_toy_model(data), target)
        loss.backward()
        optimizer.step()

    correct = 0.0
    ste_toy_model.eval()
    for data, target in test_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
            
        predict = ste_toy_model(data)
        correct += (predict.max(dim=1)[1] == target).float().sum()
    print(f"Acc on Epoch {i+1}/10 is {100 * correct/len(test_dataset):.2f}%")

Acc on Epoch 1/10 is 89.87%
Acc on Epoch 2/10 is 91.02%
Acc on Epoch 3/10 is 91.63%
Acc on Epoch 4/10 is 91.84%
Acc on Epoch 5/10 is 91.83%
Acc on Epoch 6/10 is 91.72%
Acc on Epoch 7/10 is 91.79%
Acc on Epoch 8/10 is 91.79%
Acc on Epoch 9/10 is 91.93%
Acc on Epoch 10/10 is 91.95%


Here, we use max_bit_width = 4 for the quantization operation. In this run the quantized model reaches 91.95% after 10 epochs, compared with 95.77% for the full-precision toy model above, so 4-bit quantization costs a few points of accuracy on this small network.

We think that quantization can benefit the generalization ability of a model compared with a full-precision network, although this toy experiment does not show that effect.

## Spiking Neural Network

**Eve** provides special attributes for hidden states, instead of using PyTorch buffers (**register_buffer**).

The hidden states of **Eve** are reset by calling **Eve.reset()** and are not included in state_dict().
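The contract described above can be pictured with a tiny stand-in class. This is a hypothetical sketch in plain Python, not Eve's code: the class name, attribute names and the `step` method are invented here to show the two properties, namely that hidden states are cleared by `reset()` and left out of `state_dict()`.

```python
class TinyModule:
    """Hypothetical stand-in that separates saved parameters from hidden states."""

    def __init__(self):
        self.params = {"weight": 0.5}          # saved and restored with the model
        self.hidden_states = {"voltage": 0.0}  # scratch state, cleared on reset

    def step(self, x: float) -> float:
        # Accumulate the weighted input into the hidden state.
        self.hidden_states["voltage"] += self.params["weight"] * x
        return self.hidden_states["voltage"]

    def reset(self) -> None:
        # Clear every hidden state back to its initial value.
        for name in self.hidden_states:
            self.hidden_states[name] = 0.0

    def state_dict(self) -> dict:
        # Only parameters are exported; hidden states are deliberately left out.
        return dict(self.params)


module = TinyModule()
module.step(2.0)
module.step(2.0)
print(module.hidden_states, module.state_dict())
module.reset()
print(module.hidden_states)
```

Keeping hidden states out of the saved state means a checkpoint never carries leftover activity from the last batch that was processed.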

In [9]:
class SnnToyModel(eve.cores.Eve):
    def __init__(self, max_timesteps=5):
        """Unlike a traditional neural network, a spiking neural network processes a given
        input image repeatedly over several timesteps."""
        super().__init__()
        
        # record the max timesteps
        self.max_timesteps = max_timesteps
        
        # A spiking neural network needs an encoder to convert rate-coded input into
        # spike trains. Here we use a Poisson encoder, as many spiking neural networks do.
        # NOTE: forward() below does not call self.encoder explicitly.
        self.encoder = eve.cores.PoissonEncoder(max_timesteps=max_timesteps)
        
        # You can use any nn.Module inside an Eve network; Eve handles them well.
        self.conv = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=1, out_channels=3, kernel_size=3, stride=2, padding=1),
            torch.nn.BatchNorm2d(3),
            # torch.nn.ReLU(),  # the activation is handled by the IfNode below
        )
        
        state = eve.cores.State(self.conv) # gather the arguments needed to define the layers below
        
        # add an IfNode
        self.ifnode = eve.cores.IfNode(state=state, time_independent=False)
        
        # add an STE layer to convert the voltage into spike signals.
        self.ste = eve.cores.SteQuan(state=state, max_bit_width=1)
        self.linear = torch.nn.Linear(14 * 14 * 3, 10)
        
        self.spike() # turn on spiking mode
        
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # NOTE: a spiking neural network keeps voltage hidden states, so reset them on every forward pass.
        self.reset()
        
        res = []
        for i in range(self.max_timesteps):
            conv = self.conv(x)
            
            ifnode = self.ifnode(conv)
            
            # insert ste layer here
            ste = self.ste(ifnode)
            
            ste = torch.flatten(ste, 1)
            linear = self.linear(ste)
            res.append(linear)
            
        res = torch.stack(res, dim=0).mean(dim=0)

        return torch.nn.functional.log_softmax(res, dim=-1)
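For readers new to spiking networks, the role of the integrate-and-fire node can be sketched for a single neuron in plain Python. This is an illustration under stated assumptions, not Eve's `IfNode`: the threshold of 1.0 and the reset-to-zero rule are choices made for this example.

```python
def if_neuron(inputs, threshold: float = 1.0):
    """Integrate-and-fire: accumulate input into a voltage and spike at the threshold."""
    voltage = 0.0  # the hidden state that must be reset between samples
    spikes = []
    for current in inputs:
        voltage += current
        if voltage >= threshold:
            spikes.append(1)
            voltage = 0.0  # reset after firing (an assumption of this sketch)
        else:
            spikes.append(0)
    return spikes


print(if_neuron([0.5, 0.5, 0.5, 0.5]))  # [0, 1, 0, 1]
```

The voltage carries information from one timestep to the next, which is why `forward` in the model above calls `self.reset()` before looping over the timesteps.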

Define the SNN toy model and optimizer, and move them to cuda if possible.

In [10]:
max_timesteps = 3
snn_toy_model = SnnToyModel(max_timesteps)

if use_cuda:
    snn_toy_model.cuda()
    print("use cuda")
else:
    print("use cpu")
    
optimizer = torch.optim.Adam(snn_toy_model.torch_parameters(), lr=1e-3)

use cuda


Train snn_toy_model on the dataset for 10 epochs.

In [11]:
for i in range(10):
    snn_toy_model.train()
    for data, target in train_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(snn_toy_model(data), target)
        loss.backward()
        optimizer.step()

    correct = 0.0
    snn_toy_model.eval()
    for data, target in test_dataloader:
        if use_cuda:
            data, target = data.cuda(), target.cuda()
            
        predict = snn_toy_model(data)
        correct += (predict.max(dim=1)[1] == target).float().sum()
    print(f"Acc on Epoch {i+1}/10 is {100 * correct/len(test_dataset):.2f}%")

Acc on Epoch 1/10 is 89.33%
Acc on Epoch 2/10 is 90.84%
Acc on Epoch 3/10 is 91.42%
Acc on Epoch 4/10 is 91.94%
Acc on Epoch 5/10 is 92.21%
Acc on Epoch 6/10 is 92.22%
Acc on Epoch 7/10 is 92.29%
Acc on Epoch 8/10 is 92.25%
Acc on Epoch 9/10 is 92.42%
Acc on Epoch 10/10 is 92.55%


Currently, the performance of the spiking neural network still has considerable room for improvement.
In this example, you can increase max_timesteps to improve the final accuracy.
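One intuition for why more timesteps can help, assuming the input is rate-coded with a Poisson-style encoder: the firing rate observed over a few timesteps is a noisy estimate of the underlying intensity, and the estimate tightens as the number of timesteps grows. The sketch below is plain Python, not `eve.cores.PoissonEncoder`, and draws one independent spike per timestep, which is the usual discrete approximation.

```python
import random


def poisson_encode(intensity: float, timesteps: int, rng: random.Random) -> list:
    """Emit one 0/1 spike per timestep, firing with probability `intensity`."""
    return [1 if rng.random() < intensity else 0 for _ in range(timesteps)]


rng = random.Random(0)  # fixed seed so the run is repeatable
for timesteps in (3, 30, 3000):
    spikes = poisson_encode(0.3, timesteps, rng)
    # The observed rate approaches the true intensity (0.3) as timesteps grow.
    print(timesteps, sum(spikes) / timesteps)
```

The price is compute: every extra timestep is another pass through the network.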


You can fetch all the hidden states of **Eve** via:

In [12]:
for k, hid in snn_toy_model.named_hidden_states():
    print(k, torch.typename(hid))

ifnode.voltage_hid torch.cuda.FloatTensor


## Quan

In the following cells, we use the trainer provided by **Eve** to compare the performance of different quantization methods.

In [13]:
from eve.app import make

mnist_trainer = make("mnist", 
                     checkpoint_path="", 
                     max_timesteps=1, 
                     data_kwargs={"root": "/media/densechen/data/dataset"}, 
                     kwargs={
                        "device": "cuda:0",
                        "root_dir": "/media/densechen/data/code/eve-mli/examples/logs"
                        },
                     net_arch_kwargs={
                        "quan": "SteQuan",  
                        "quan_kwargs":{
                            "max_bit_width": 4,
                        },
                     },
                     optimizer_kwargs={
                         "lr": 1e-3,
                     },
                    )
print("SteQuan")
for i in range(10):
    mnist_trainer.train_one_epoch()
    acc = mnist_trainer.test_one_epoch()
    print(f"Acc on Epoch {i+1}/10 is {100 * acc:.2f}%")

("making new trainer: mnist ({'checkpoint_path': '', 'max_timesteps': 1, "
 "'data_kwargs': {'root': '/media/densechen/data/dataset'}, 'kwargs': "
 "{'device': 'cuda:0', 'root_dir': "
 "'/media/densechen/data/code/eve-mli/examples/logs'}, 'net_arch_kwargs': "
 "{'quan': 'SteQuan', 'quan_kwargs': {'max_bit_width': 4}}, "
 "'optimizer_kwargs': {'lr': 0.001}})")
falied to load checkpoint , raise [Errno 2] No such file or directory: ''
original accuracy: 0.10848496835443038
create an upgrader automatically
SteQuan
Acc on Epoch 1/10 is 89.60%
Acc on Epoch 2/10 is 91.46%
Acc on Epoch 3/10 is 92.09%
Acc on Epoch 4/10 is 92.50%
Acc on Epoch 5/10 is 92.37%
Acc on Epoch 6/10 is 92.38%
Acc on Epoch 7/10 is 92.43%
Acc on Epoch 8/10 is 92.53%
Acc on Epoch 9/10 is 92.49%
Acc on Epoch 10/10 is 92.36%


## SteQuan

SteQuan, used in the run above, is the most widely used quantization function. Its alpha parameter is fixed.

SteQuan is more stable and can be trained with a larger learning rate.

In [14]:
from eve.app import make

mnist_trainer = make("mnist", 
                     checkpoint_path="", 
                     max_timesteps=1, 
                     data_kwargs={"root": "/media/densechen/data/dataset"}, 
                     kwargs={
                        "device": "cuda:0",
                        "root_dir": "/media/densechen/data/code/eve-mli/examples/logs"
                        },
                     net_arch_kwargs={
                        "quan": "LsqQuan",  
                        "quan_kwargs":{
                            "max_bit_width": 4,
                        },
                     },
                     optimizer_kwargs={
                         "lr": 1e-3,
                     },
                    )
print("LsqQuan")
for i in range(10):
    mnist_trainer.train_one_epoch()
    acc = mnist_trainer.test_one_epoch()
    print(f"Acc on Epoch {i+1}/10 is {100 * acc:.2f}%")

("making new trainer: mnist ({'checkpoint_path': '', 'max_timesteps': 1, "
 "'data_kwargs': {'root': '/media/densechen/data/dataset'}, 'kwargs': "
 "{'device': 'cuda:0', 'root_dir': "
 "'/media/densechen/data/code/eve-mli/examples/logs'}, 'net_arch_kwargs': "
 "{'quan': 'LsqQuan', 'quan_kwargs': {'max_bit_width': 4}}, "
 "'optimizer_kwargs': {'lr': 0.001}})")
falied to load checkpoint , raise [Errno 2] No such file or directory: ''
original accuracy: 0.10571598101265822
create an upgrader automatically
LsqQuan
Acc on Epoch 1/10 is 91.21%
Acc on Epoch 2/10 is 92.41%
Acc on Epoch 3/10 is 93.11%
Acc on Epoch 4/10 is 90.01%
Acc on Epoch 5/10 is 11.36%


ValueError: alpha must be positive

## LsqQuan

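LsqQuan, used in the run above, has a trainable alpha parameter, which is supervised with the global error.
It is unstable during training and prone to producing an invalid alpha value, that is, a non-positive one. In the run above with lr = 1e-3, accuracy fell to 11.36% at epoch 5 and training then stopped with `ValueError: alpha must be positive`.
Using a smaller learning rate is vital.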
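The failure mode can be shown with a single update step. The sketch below uses plain gradient descent rather than the optimizer the trainer actually uses, and the starting alpha and gradient are made-up numbers. The `softplus` reparameterisation at the end is a common general-purpose way to keep a learned scale positive; it is shown as an option, not as something Eve does.

```python
import math


def sgd_step(alpha: float, grad: float, lr: float) -> float:
    """One plain gradient-descent update on the raw alpha value."""
    return alpha - lr * grad


def softplus(raw: float) -> float:
    """Map any real number to a positive value."""
    return math.log1p(math.exp(raw))


alpha, grad = 0.1, 150.0  # made-up values, chosen only to show the failure mode
print(sgd_step(alpha, grad, lr=1e-3))  # negative: an invalid alpha
print(sgd_step(alpha, grad, lr=1e-4))  # still positive
print(softplus(-5.0))                  # positive even for a very negative raw value
```

A small alpha combined with a large gradient is enough for one step to cross zero, which is why lowering the learning rate helps.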

In [15]:
from eve.app import make
mnist_trainer = make("mnist", 
                     checkpoint_path="", 
                     max_timesteps=1, 
                     data_kwargs={"root": "/media/densechen/data/dataset"}, 
                     kwargs={
                        "device": "cuda:0",
                        "root_dir": "/media/densechen/data/code/eve-mli/examples/logs"
                        },
                     net_arch_kwargs={
                        "quan": "LlsqQuan", 
                        "quan_kwargs":{
                            "max_bit_width": 4,
                        },
                     },
                     optimizer_kwargs={
                         "lr": 1e-3,
                     },
                    )
print("LlsqQuan")
for i in range(10):
    mnist_trainer.train_one_epoch()
    acc = mnist_trainer.test_one_epoch()
    print(f"Acc on Epoch {i+1}/10 is {100 * acc:.2f}%")

("making new trainer: mnist ({'checkpoint_path': '', 'max_timesteps': 1, "
 "'data_kwargs': {'root': '/media/densechen/data/dataset'}, 'kwargs': "
 "{'device': 'cuda:0', 'root_dir': "
 "'/media/densechen/data/code/eve-mli/examples/logs'}, 'net_arch_kwargs': "
 "{'quan': 'LlsqQuan', 'quan_kwargs': {'max_bit_width': 4}}, "
 "'optimizer_kwargs': {'lr': 0.001}})")
falied to load checkpoint , raise [Errno 2] No such file or directory: ''
original accuracy: 0.08662974683544304
create an upgrader automatically
LlsqQuan
Acc on Epoch 1/10 is 92.34%
Acc on Epoch 2/10 is 94.48%
Acc on Epoch 3/10 is 95.33%
Acc on Epoch 4/10 is 95.40%
Acc on Epoch 5/10 is 95.06%
Acc on Epoch 6/10 is 93.93%
Acc on Epoch 7/10 is 93.44%
Acc on Epoch 8/10 is 93.08%
Acc on Epoch 9/10 is 92.94%
Acc on Epoch 10/10 is 92.70%


## LlsqQuan

LlsqQuan, used in the run above, has a trainable alpha parameter, which is supervised with the local error. It is more stable than LsqQuan.
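One way to read "local error" is the gap between a layer's values and their quantized versions, which depends only on that layer and not on the network's final loss. The sketch below follows that reading, which is an assumption and not a description of Eve's `LlsqQuan`. It also picks alpha by a simple search over made-up candidates instead of by gradient training.

```python
def quantize(x: float, bit_width: int, alpha: float) -> float:
    """Clip x to [0, alpha] and round it to one of 2**bit_width evenly spaced levels."""
    steps = 2 ** bit_width - 1
    clipped = min(max(x, 0.0), alpha)
    return round(clipped / alpha * steps) / steps * alpha


def local_error(values, bit_width: int, alpha: float) -> float:
    """Mean squared difference between the values and their quantized versions."""
    return sum((v - quantize(v, bit_width, alpha)) ** 2 for v in values) / len(values)


activations = [0.05, 0.2, 0.35, 0.5, 0.9]  # made-up layer outputs
candidates = [0.25, 0.5, 1.0, 2.0, 4.0]    # made-up alpha candidates
best_alpha = min(candidates, key=lambda a: local_error(activations, 2, a))
print(best_alpha, local_error(activations, 2, best_alpha))
```

Too small an alpha clips large values and too large an alpha wastes levels on an empty range, so the local error has a minimum somewhere in between.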

## Hybrid Spiking and Quantization

In [16]:
from eve.app import make
mnist_trainer = make("mnist", 
                     checkpoint_path="", 
                     max_timesteps=5, 
                     data_kwargs={"root": "/media/densechen/data/dataset"}, 
                     kwargs={
                        "device": "cuda:0",
                        "root_dir": "/media/densechen/data/code/eve-mli/examples/logs"
                        },
                     net_arch_kwargs={
                        "quan": "SteQuan", 
                        "quan_kwargs":{
                            "max_bit_width": 4,
                        },
                     },
                     optimizer_kwargs={
                         "lr": 1e-3,
                     },
                    )
print("QuanSpiking")
mnist_trainer.eve_module.spike()
for i in range(10):
    mnist_trainer.train_one_epoch()
    acc = mnist_trainer.test_one_epoch()
    print(f"Acc on Epoch {i+1}/10 is {100 * acc:.2f}%")

("making new trainer: mnist ({'checkpoint_path': '', 'max_timesteps': 5, "
 "'data_kwargs': {'root': '/media/densechen/data/dataset'}, 'kwargs': "
 "{'device': 'cuda:0', 'root_dir': "
 "'/media/densechen/data/code/eve-mli/examples/logs'}, 'net_arch_kwargs': "
 "{'quan': 'SteQuan', 'quan_kwargs': {'max_bit_width': 4}}, "
 "'optimizer_kwargs': {'lr': 0.001}})")
falied to load checkpoint , raise [Errno 2] No such file or directory: ''
original accuracy: 0.08910205696202532
create an upgrader automatically
QuanSpiking
Acc on Epoch 1/10 is 89.33%
Acc on Epoch 2/10 is 90.93%
Acc on Epoch 3/10 is 92.13%
Acc on Epoch 4/10 is 92.41%
Acc on Epoch 5/10 is 92.66%
Acc on Epoch 6/10 is 92.90%
Acc on Epoch 7/10 is 92.80%
Acc on Epoch 8/10 is 92.81%
Acc on Epoch 9/10 is 93.02%
Acc on Epoch 10/10 is 93.02%
