# Convolutional Neural Network Image Classifier

2D convolutions produce outputs by applying filters, i.e. kernels, to 2D matrices. The outputs are 2D matrices as well. For a detailed explanation, see: https://towardsdatascience.com/intuitively-understanding-convolutions-for-deep-learning-1f6f42faee1

In PyTorch, the `Conv2d` layer takes the following parameters: `nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)`
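
To make these parameters concrete, here is a minimal sketch that applies a single `Conv2d` layer to a random tensor and prints the output shape. The layer settings mirror the first convolution of the network defined further down; the random 64x64 input is only a stand-in for a real image and is an assumption of this example.

```python
import torch
import torch.nn as nn

# Same settings as the first convolution of the network defined further down.
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=11, stride=4, padding=2)

# One hypothetical RGB image of 64x64 pixels: (batch, channels, height, width).
x = torch.randn(1, 3, 64, 64)
out = conv(x)

# Spatial size per dimension: floor((64 + 2*2 - 11) / 4) + 1 = 15
print(out.shape)  # torch.Size([1, 64, 15, 15])
```

The number of output channels equals `out_channels`, while the height and width depend on `kernel_size`, `stride` and `padding`.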

In [23]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.utils.data
import torch.nn.functional as F
import torchvision
from torchvision import transforms
from PIL import Image

The `num_classes=2` parameter in `__init__` sets how many classes the classifier outputs. The network otherwise follows the AlexNet architecture, so only the size of the final `Linear` layer depends on this parameter.

`nn.Sequential()` chains layers together, which makes it possible to break the model up into more logical blocks. Two such chains are used here: `features` and `classifier`.
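
As a small illustration of how such a chain behaves, the sketch below builds a tiny `nn.Sequential` block and checks that calling it is the same as calling its layers one after another. The layer sizes and the name `block` are made up for this example and are not part of the model below.

```python
import torch
import torch.nn as nn

# A hypothetical three-layer chain, much smaller than the one in the model.
block = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),
)

x = torch.randn(1, 3, 32, 32)
y = block(x)  # runs all three layers in order

# The same computation, written out layer by layer.
manual = x
for layer in block:
    manual = layer(manual)

print(y.shape)  # torch.Size([1, 8, 16, 16])
```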

Pooling layers such as `MaxPool2d` reduce the spatial resolution of the feature maps coming from the previous layer, which means fewer parameters in the layers that follow. This compression speeds up computation and also helps reduce overfitting. For a detailed explanation, see: https://analyticsindiamag.com/max-pooling-in-convolutional-neural-network-and-its-features/
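
The effect of max pooling is easiest to see on a tiny input. The sketch below uses a hand-made 4x4 tensor and a 2x2 pooling window (both chosen only for illustration, they differ from the settings used in the model) and keeps the largest value in each window.

```python
import torch
import torch.nn as nn

# A 4x4 "image" holding the values 0..15, shaped (batch, channels, height, width).
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

# 2x2 window; the stride defaults to the kernel size, so windows do not overlap.
pool = nn.MaxPool2d(kernel_size=2)
out = pool(x)

print(out.shape)  # torch.Size([1, 1, 2, 2])
print(out)        # the maximum of each 2x2 window: 5, 7, 13, 15
```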

Alternatives to `MaxPool` and `AvgPool` are `AdaptiveMaxPool` and `AdaptiveAvgPool`, which take the desired output size rather than a kernel size and therefore work independently of the incoming tensor's dimensions; for that reason they are recommended. Architectures using them can work with different input dimensions, which is handy when working with disparate datasets.
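
A short sketch of this behaviour: the same `AdaptiveAvgPool2d((6, 6))` layer is applied to two feature maps of different spatial size, and both come out as 6x6. The input sizes 7x7 and 13x13 are arbitrary choices for this example.

```python
import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool2d((6, 6))  # only the output size is specified

# Two hypothetical feature maps with 256 channels but different resolutions.
small = torch.randn(1, 256, 7, 7)
large = torch.randn(1, 256, 13, 13)

out_small = pool(small)
out_large = pool(large)

print(out_small.shape)  # torch.Size([1, 256, 6, 6])
print(out_large.shape)  # torch.Size([1, 256, 6, 6])
```

Because the output size is fixed, a `Linear` layer expecting `256 * 6 * 6` inputs can follow regardless of the input resolution.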

A `Dropout` layer is a simple way to reduce overfitting. During training it randomly zeroes the outputs of a user-defined fraction of the nodes on every forward pass (0.5 by default in PyTorch), so those nodes do not contribute to that pass. As a result, the network cannot rely too heavily on any single node, and the randomness further improves generalization. Finally, `Dropout` layers are active only in training mode (`model.train()`) and are disabled in evaluation mode (`model.eval()`), so they do not affect validation.
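
The difference between the two modes can be checked directly. In the sketch below a standalone `nn.Dropout` layer is applied to a tensor of ones: in training mode some entries are zeroed and the remaining ones are scaled by `1 / (1 - p)`, while in evaluation mode the input passes through unchanged. The tensor size and the seed are arbitrary choices for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the run repeatable

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()          # training mode: dropout is active
train_out = drop(x)   # entries are either 0 or 1 / (1 - 0.5) = 2

drop.eval()           # evaluation mode: dropout does nothing
eval_out = drop(x)

print(train_out.unique())        # the distinct values present in training mode
print(torch.equal(eval_out, x))  # True
```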

In [24]:
class CNNNet(nn.Module):
                                                                    
    def __init__(self, num_classes=2):
        super(CNNNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),  # input layer
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),           # layer
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),          # layer
            nn.ReLU(),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),          # layer
            nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),          # layer
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2)
        )
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(),                                           # layer
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(),
            nn.Dropout(),                                           # layer
            nn.Linear(4096, 4096),
            nn.ReLU(),
            nn.Linear(4096, num_classes)                            # output layer
        )
  
    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x

In [25]:
cnnnet = CNNNet()

In [26]:
def train(model, optimizer, loss_fn, train_loader, val_loader, epochs=20, device="cpu"):

    for epoch in range(epochs):
        training_loss = 0.0
        valid_loss = 0.0
        model.train()
        for batch in train_loader:
            optimizer.zero_grad()
            inputs, targets = batch
            inputs = inputs.to(device)
            targets = targets.to(device)
            output = model(inputs)
            loss = loss_fn(output, targets)
            loss.backward()
            optimizer.step()
            training_loss += loss.item() * inputs.size(0)
        training_loss /= len(train_loader.dataset)

        model.eval()
        num_correct = 0
        num_examples = 0
        with torch.no_grad():  # no gradients are needed for validation
            for batch in val_loader:
                inputs, targets = batch
                inputs = inputs.to(device)
                targets = targets.to(device)
                output = model(inputs)
                loss = loss_fn(output, targets)
                valid_loss += loss.item() * inputs.size(0)
                # softmax over the class dimension, then take the most likely class
                correct = torch.eq(torch.max(F.softmax(output, dim=1), dim=1)[1], targets).view(-1)
                num_correct += torch.sum(correct).item()
                num_examples += correct.shape[0]
        valid_loss /= len(val_loader.dataset)

        print('Epoch: {}, TrainingLoss: {:.2f}, Validation Loss: {:.2f}, accuracy = {:.2f}'
          .format(epoch, training_loss, valid_loss, num_correct / num_examples))

In [27]:
def check_image(path):
    # Returns True if PIL can open the file, so unreadable files are skipped.
    try:
        with Image.open(path):
            return True
    except Exception:
        return False

img_transforms = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std =[0.229, 0.224, 0.225]) 
    ])

train_data_path = "/Users/nikolavetnic/Desktop/Datasets/img_catfish/train/"
train_data = torchvision.datasets.ImageFolder(
    root=train_data_path,
    transform=img_transforms,
    is_valid_file=check_image)

val_data_path = "/Users/nikolavetnic/Desktop/Datasets/img_catfish/val/"
val_data = torchvision.datasets.ImageFolder(
    root=val_data_path,
    transform=img_transforms,
    is_valid_file=check_image)

test_data_path = "/Users/nikolavetnic/Desktop/Datasets/img_catfish/test/"
test_data = torchvision.datasets.ImageFolder(
    root=test_data_path,
    transform=img_transforms,
    is_valid_file=check_image)

batch_size = 64

train_data_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, shuffle=True)
val_data_loader = torch.utils.data.DataLoader(val_data, batch_size=batch_size, shuffle=True)
test_data_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, shuffle=True)

if torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

In [28]:
cnnnet.to(device)
optimizer = optim.Adam(cnnnet.parameters(), lr=0.001)

In [29]:
train(cnnnet, optimizer, torch.nn.CrossEntropyLoss(), train_data_loader, val_data_loader, epochs=10, device=device)



Epoch: 0, TrainingLoss: 0.88, Validation Loss: 0.71, accuracy = 0.21
Epoch: 1, TrainingLoss: 0.68, Validation Loss: 0.87, accuracy = 0.55
Epoch: 2, TrainingLoss: 0.59, Validation Loss: 0.59, accuracy = 0.61
Epoch: 3, TrainingLoss: 0.53, Validation Loss: 0.45, accuracy = 0.73
Epoch: 4, TrainingLoss: 0.48, Validation Loss: 0.61, accuracy = 0.69
Epoch: 5, TrainingLoss: 0.44, Validation Loss: 0.40, accuracy = 0.83
Epoch: 6, TrainingLoss: 0.43, Validation Loss: 0.36, accuracy = 0.87
Epoch: 7, TrainingLoss: 0.49, Validation Loss: 0.42, accuracy = 0.82
Epoch: 8, TrainingLoss: 0.50, Validation Loss: 0.48, accuracy = 0.71
Epoch: 9, TrainingLoss: 0.42, Validation Loss: 0.76, accuracy = 0.61


I had problems getting the prediction code from the SimpleNet example to work, namely its final three lines. The problem was the shape of the tensor `img` passed to the model: the model expects a 4-dimensional batch of shape `(batch, channels, height, width)`, whereas a single transformed image is 3-dimensional. This did not come up during training because the `DataLoader` adds the batch dimension. It was resolved by calling `.unsqueeze(0)` on the tensor, a solution I found here: https://discuss.pytorch.org/t/runtimeerror-expected-4-dimensional-input-for-4-dimensional-weight-6-3-5-5-but-got-3-dimensional-input-of-size-3-256-256-instead/37189
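
The shape change made by `.unsqueeze(0)` can be seen in isolation. In this minimal sketch a random tensor stands in for a transformed 64x64 RGB image; the name `single_image` is hypothetical.

```python
import torch

# Stand-in for one transformed image: (channels, height, width).
single_image = torch.randn(3, 64, 64)

# Insert a new dimension of size 1 at position 0: (batch, channels, height, width).
batch_of_one = single_image.unsqueeze(0)

print(single_image.shape)  # torch.Size([3, 64, 64])
print(batch_of_one.shape)  # torch.Size([1, 3, 64, 64])
```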

In [35]:
labels = ['cat', 'fish']

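# Convert to RGB so that grayscale or RGBA files also yield three channels.
img = Image.open("/Users/nikolavetnic/Desktop/Datasets/img_catfish/test/fish/246232021_a806f30fbc.jpg").convert("RGB")
img = img_transforms(img).to(device)

cnnnet.eval()  # make sure dropout is disabled for inference
with torch.no_grad():
    prediction = cnnnet(img.unsqueeze(0))  # add the batch dimension
predicted_class = labels[torch.argmax(prediction)]
print(predicted_class)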

fish
