Accuracy drops during extended training. #8

Sahel13 · 2021-10-15T11:51:06Z

Hi,

I've built the following fully connected quaternion network using the methods provided.

  class QLeNet_300_100(nn.Module):
      def __init__(self):
          super().__init__()
          self.fc1 = layers.QLinear(196, 75)
          self.fc2 = layers.QLinear(75, 25)
          self.fc3 = layers.QLinear(25, 10)
          self.abs = layers.QuaternionToReal(10)
  
      def forward(self, x):
          x = torch.flatten(x, 1)
          x = F.relu(self.fc1(x))
          x = F.relu(self.fc2(x))
          x = self.abs(self.fc3(x))
          return x

When training the model for an extended duration on the MNIST dataset, the accuracy suddenly drops to about 10% (chance level for ten classes, which is what we would expect from an untrained model) and does not recover. An image of the accuracy values as training progresses is attached.

The same issue also occurs when using the methods in Parcollet's original repo. I would appreciate some insight into why this might be happening. If you need additional info, I can provide the code to recreate this issue.

Thanks,
Sahel


giorgiozannini · 2021-10-16T11:53:12Z

Thanks a lot for the interest in our work!
Yes, if you could kindly provide the full code, we will look into it ASAP!

Sahel13 · 2021-10-16T12:18:05Z

Hi,

Thanks for the quick reply. I'm attaching the file.

Sahel13 · 2021-10-16T12:19:42Z

I closed the issue by mistake. Here's the code. You might need to change the 'data_directory' variable.

import os
import torch
import torch.nn as nn
import torchvision
from htorch import layers
import torchvision.transforms as transforms
import torch.nn.functional as F


"""
Parameters.
"""
batch_size = 128
learning_rate = 0.1
num_epochs = 40
data_directory = os.path.join('data', 'mnist')

use_gpu = torch.cuda.is_available()  # fall back to the CPU when no GPU is present
device = torch.device("cuda:0" if use_gpu else "cpu")

"""
Get the data.
"""
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.1307], std=[0.3081])
])

trainset = torchvision.datasets.MNIST(
    root=data_directory,
    train=True,
    download=True,
    transform=transform
)
testset = torchvision.datasets.MNIST(
    root=data_directory,
    train=False,
    download=True,
    transform=transform
)

trainloader = torch.utils.data.DataLoader(
    trainset,
    batch_size=batch_size,
    shuffle=True,
    num_workers=2
)
testloader = torch.utils.data.DataLoader(
    testset,
    batch_size=batch_size,
    shuffle=False,  # sample order does not affect evaluation accuracy
    num_workers=2
)


"""
Define the model.
"""
class LeNet_300_100(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = layers.QLinear(196, 75)
        self.fc2 = layers.QLinear(75, 25)
        self.fc3 = layers.QLinear(25, 10)
        self.abs = layers.QuaternionToReal(10)

    def forward(self, x):
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.abs(self.fc3(x))
        return x


model = LeNet_300_100()
model.to(device)


"""
Train and test.
"""
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9)


def test_model(model, testloader, device):
    correct = 0
    total = 0

    with torch.no_grad():
        for data in testloader:
            images, labels = data[0].to(device), data[1].to(device)
            outputs = model(images)
            # .data is unnecessary inside torch.no_grad()
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    accuracy = 100 * correct / total
    return accuracy


for epoch in range(num_epochs):
    if epoch == 0:
        accuracy = test_model(model, testloader, device)
        print("ep  {:03d}  loss    {:.3f}  acc  {:.3f}%".format(epoch,
               0, accuracy))

    epoch_loss = 0.0
    for i, data in enumerate(trainloader, 0):

        images, labels = data[0].to(device), data[1].to(device)

        # Zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        epoch_loss += loss.item()

    # Test accuracy at the end of each epoch
    accuracy = test_model(model, testloader, device)

    print("ep  {:03d}  loss  {:.3f}  acc  {:.3f}%".format(
        epoch + 1, epoch_loss / len(trainloader), accuracy))

print("\nTraining complete.\n")
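
A hedged sketch, not part of the original script: since the reported symptom is accuracy falling to chance level and staying there, a small check at the end of each epoch could flag a collapsed run early. The helper name `training_collapsed` and its `margin` parameter are hypothetical, chosen only for illustration.

```python
import math


def training_collapsed(loss, accuracy, num_classes=10, margin=2.0):
    """Return True if the loss is non-finite or accuracy is at chance level.

    `accuracy` is a percentage in [0, 100]; chance level is 100 / num_classes.
    `margin` (hypothetical default) is the tolerance above chance, in points.
    """
    if not math.isfinite(loss):
        return True
    chance = 100.0 / num_classes
    return accuracy <= chance + margin
```

In the training loop it could be called after `test_model` with `epoch_loss / len(trainloader)` and `accuracy`, skipping the first epoch or two, when an untrained model is still legitimately near chance.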

giorgiozannini · 2021-10-16T13:59:29Z

This looks like a learning rate problem! I tried running your code with a learning rate of 0.05 and had no problems.
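
To illustrate why halving the step size can be the difference between stable and divergent training, here is a toy sketch, assuming plain gradient descent (no momentum, unlike the script in this thread) on a one-dimensional quadratic. The function name `gradient_descent` and the curvature value of 30 are assumptions, picked so that the stability threshold `2 / curvature` falls between 0.05 and 0.1; nothing here is measured from the quaternion model.

```python
def gradient_descent(lr, curvature=30.0, w0=1.0, steps=50):
    """Run plain gradient descent on f(w) = 0.5 * curvature * w**2."""
    w = w0
    for _ in range(steps):
        grad = curvature * w  # df/dw for the quadratic above
        w = w - lr * grad     # equivalent to w *= (1 - lr * curvature)
    return w


if __name__ == "__main__":
    # Assumed curvature of 30 puts the stability threshold at 2 / 30.
    for lr in (0.05, 0.1):
        final = abs(gradient_descent(lr))
        print(f"lr={lr}: |w| after 50 steps = {final:.3e}")
```

Each update multiplies `w` by `1 - lr * curvature`, so the iterate shrinks when that factor has magnitude below 1 and grows without bound otherwise. In a real network the curvature varies across directions and over the course of training, so the safe range for the learning rate is not fixed.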

Sahel13 · 2021-10-16T17:22:09Z

Oh, okay. I was using the same parameters for a comparable real-valued network and that worked fine, which is probably why I missed this. Thank you.

On a side note, have you used your methods to try and construct a quaternion model that performs better than a real-valued counterpart at a classification task, say like the one in Gaudet's paper?

giorgiozannini · 2021-10-17T19:06:03Z

We did! Actually, that's what we are working on right now. As soon as we find a configuration with a noticeable improvement over real NNs, we will update the repo.

Sahel13 closed this as completed Oct 16, 2021

Sahel13 reopened this Oct 16, 2021

Sahel13 mentioned this issue Oct 17, 2021

QBatchNorm2d RuntimeError 'shape invalid' #9

Closed

giorgiozannini closed this as completed Oct 17, 2021
