Model Training v_0.1
Faretra

In [1]:
#Load Libraries
from io import FileIO
import os
import numpy as np
import torch
import glob
import torch.nn as nn
from torchvision.transforms import transforms
from torch.utils.data import DataLoader
from torch.optim import Adam
from torch.autograd import Variable
import torchvision
import pathlib
from pathlib import Path
import pickle



Check for CUDA (GPU acceleration). If no GPU is available the model falls back to the CPU of the machine, 
and training and prediction will take considerably longer.

In [2]:
device= torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

cpu


For testing purposes we use the Intel Image Classification dataset, and we apply a random horizontal flip to 
every image presented to the model in order to increase variance without downloading more files. 
A vertical flip would be of little use here, since natural features such as trees rarely appear upside down. 
It may prove useful for the actual clothing dataset, as clothes come in different forms, shapes 
and sometimes orientations.
We also convert the RGB pixel values to a tensor and normalize it, so that the resulting tensor can be fed to the model to make predictions (computer vision).

In [3]:
transformer= transforms.Compose([
    transforms.Resize((150,150)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(), #PIL image with 0-255 pixel values to a float tensor in 0-1
    transforms.Normalize ([0.5,0.5,0.5], #per-channel mean; maps 0-1 to [-1,1], formula (x-mean)/std
                         [0.5,0.5,0.5])  #per-channel std
])
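
As a quick check of what the pipeline above does to the values, here is a minimal framework-free sketch of the normalization formula and of a horizontal flip. The helper names `normalize` and `horizontal_flip` and the `toy_image` data are made up for illustration; they are not torchvision functions.

```python
# Framework-free sketch of two steps in the transform pipeline.
# normalize, horizontal_flip and toy_image are hypothetical names.

def normalize(value, mean=0.5, std=0.5):
    """Apply (x - mean) / std to a single channel value in [0, 1]."""
    return (value - mean) / std

def horizontal_flip(image_rows):
    """Mirror an image left to right: rows stay in place, columns reverse."""
    return [row[::-1] for row in image_rows]

# Channel values after ToTensor lie in [0, 1]; mean=0.5 and std=0.5 map them to [-1, 1].
scaled = [normalize(v) for v in (0.0, 0.5, 1.0)]
print(scaled)  # [-1.0, 0.0, 1.0]

# A 2x3 single-channel toy image.
toy_image = [[1, 2, 3],
             [4, 5, 6]]
flipped = horizontal_flip(toy_image)
print(flipped)  # [[3, 2, 1], [6, 5, 4]]
```

Flipping twice returns the original image, which is why the flip adds variety without changing what the picture shows.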


Dataloader


It is common practice, in order to avoid memory overload, not to load the whole training image dataset at once but to divide it into several batches. The batch size should be adapted to the memory available on the CPU or GPU in order to avoid out-of-memory errors.
In the DataLoader we also shuffle the samples, so that the model is not biased by the order in which the images are stored.


In [4]:
train_path=r"C:/Users/LT_J/OneDrive/Desktop/ML/InnovatesApp/Archivio_Img/seg_train"
test_path=r"C:/Users/LT_J/OneDrive/Desktop/ML/InnovatesApp/Archivio_Img/seg_test"
train_loader=DataLoader(
    torchvision.datasets.ImageFolder(train_path, transform=transformer),
    batch_size=256, shuffle =True
)
test_loader=DataLoader(
    torchvision.datasets.ImageFolder(test_path, transform=transformer),
    batch_size=256, shuffle =True
)
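
To get a feel for what a batch size of 256 means here, the sketch below computes how many batches each loader yields per epoch. It uses the image counts reported further down in this notebook (14034 training and 3000 test images) and assumes the DataLoader keeps the incomplete last batch, which is its default behaviour. `batch_layout` is a hypothetical helper.

```python
import math

def batch_layout(num_samples, batch_size):
    """Return (number of batches, size of the last batch), assuming the
    incomplete final batch is kept rather than dropped."""
    num_batches = math.ceil(num_samples / batch_size)
    last_batch = num_samples - (num_batches - 1) * batch_size
    return num_batches, last_batch

train_layout = batch_layout(14034, 256)
test_layout = batch_layout(3000, 256)
print(train_layout)  # (55, 210)
print(test_layout)   # (12, 184)
```

If memory errors appear, lowering `batch_size` increases the number of batches but reduces how many images are held in memory at once.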

Now we're going to extract all the different categories from the folders

In [5]:
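train_root=pathlib.Path(train_path)
test_root=pathlib.Path(test_path)
#Sort the folder names so the order matches the class indices assigned by ImageFolder
train_classes=sorted(p.name for p in train_root.iterdir() if p.is_dir())
test_classes=sorted(p.name for p in test_root.iterdir() if p.is_dir())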

print("Training classes are")
print(train_classes)
print('')
print("Test classes are")
print(test_classes)


Training classes are
['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']

Test classes are
['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']
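
The order of this list matters, because ImageFolder assigns integer labels to the class folders in sorted order. A minimal sketch of that mapping, using the class names printed above (`class_to_idx` and `idx_to_class` are built by hand here for illustration):

```python
# Class names as printed above.
class_names = ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']

# Sorted folder name -> integer label, and the reverse lookup for predictions.
class_to_idx = {name: idx for idx, name in enumerate(sorted(class_names))}
idx_to_class = {idx: name for name, idx in class_to_idx.items()}

print(class_to_idx['glacier'])  # 2
print(idx_to_class[5])          # street
```

The reverse lookup is what turns the index predicted by the model back into a readable category name.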


Now we'll write our CNN Network class

In [6]:
class ConvNet(nn.Module):
    def __init__(self,num_classes=6):
        super(ConvNet,self).__init__()
        
        #Output size after convolution filter
        #((w-f+2P)/s) +1  w=Width/Height, f=kernelSize , P=padding , s=stride
        #Input shape= (256,3,150,150)--->(batchSize,colorChannels,Xsize,Ysize)
        self.conv1=nn.Conv2d(in_channels=3,out_channels=12,kernel_size=3,stride=1,padding=1)
        #Output shape (256,12,150,150)-->increased channel number
        self.bn1=nn.BatchNorm2d(num_features=12)  #n_features=channels
        #Output shape (256,12,150,150)
        self.relu1=nn.ReLU()
        #Output shape (256,12,150,150)
        
        self.pool = nn.MaxPool2d(kernel_size=2)
        #Reduce the image size by factor 2
        #Output shape (256,12,75,75)
        
        self.conv2=nn.Conv2d(in_channels=12,out_channels=20,kernel_size=3,stride=1,padding=1)
        #Output shape (256,20,75,75)
        self.relu2=nn.ReLU()
        #Output shape (256,20,75,75)
        
        self.conv3=nn.Conv2d(in_channels=20,out_channels=32,kernel_size=3,stride=1,padding=1)
        #Output shape (256,32,75,75)-->increased channel number
        self.bn3=nn.BatchNorm2d(num_features=32)  #n_features=channels
        #Output shape (256,32,75,75)
        self.relu3=nn.ReLU()
        #Output shape (256,32,75,75)

        
        #The pattern used in the previous lines is how we can stack multiple convolution layers in our model,
        #gradually increasing the number of channels and, ideally, the accuracy of the model
        
        self.fc=nn.Linear(in_features=32*75*75,out_features=num_classes)
        
        #Feed forward Function
        

    def forward(self,input):
        output=self.conv1(input)
        output=self.bn1(output)
        output=self.relu1(output)
            
        output=self.pool(output)
            
        output=self.conv2(output)
        output=self.relu2(output)
            
        output=self.conv3(output)
        output=self.bn3(output)
        output=self.relu3(output)
            
            #Output is a 4-D tensor of shape (256,32,75,75); 
            #flatten it to (256,32*75*75) before the fully connected layer
            
        output=output.view(-1,32*75*75)
            
        output=self.fc(output)
            
        return output
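
The shape comments in the class can be checked by applying the output-size formula ((w-f+2P)/s)+1 layer by layer. The following is a minimal sketch in plain Python; `conv_output_size` and `pool_output_size` are hypothetical helpers, and the pooling helper assumes the MaxPool2d default of a stride equal to the kernel size and no padding.

```python
def conv_output_size(w, f, p, s):
    """Spatial size after a convolution: ((w - f + 2P) / s) + 1."""
    return (w - f + 2 * p) // s + 1

def pool_output_size(w, kernel):
    """Spatial size after max pooling with stride == kernel and no padding."""
    return (w - kernel) // kernel + 1

size = 150                                        # input images are 150x150
size = conv_output_size(size, f=3, p=1, s=1)      # conv1 keeps the size
after_conv1 = size
size = pool_output_size(size, kernel=2)           # pooling halves it
after_pool = size
size = conv_output_size(size, f=3, p=1, s=1)      # conv2 keeps the size
size = conv_output_size(size, f=3, p=1, s=1)      # conv3 keeps the size

fc_in_features = 32 * size * size                 # 32 channels from conv3
print(after_conv1, after_pool, size)  # 150 75 75
print(fc_in_features)                 # 180000
```

This is where the value 32*75*75 passed to the linear layer comes from; changing the input resolution or adding another pooling step would require updating it.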
        
        

In [7]:
#Create the model and move it to the selected device
model=ConvNet(num_classes=6).to(device)

In [8]:
#Optimizer and loss function
optimizer=Adam(model.parameters(),lr=0.001,weight_decay=0.0001)
loss_function=nn.CrossEntropyLoss()
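
For intuition about the values this loss produces, here is a minimal plain-Python sketch of cross-entropy for a single sample: the negative log of the softmax probability given to the correct class. `cross_entropy` is a hypothetical helper, not the PyTorch implementation, which additionally averages over the batch by default.

```python
import math

def cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target class."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]

# A model that gives all 6 classes the same score is guessing at random.
uniform_loss = cross_entropy([0.0] * 6, target=2)
print(round(uniform_loss, 4))  # 1.7918

# A large score on the correct class drives the loss towards zero.
confident_loss = cross_entropy([0.0, 0.0, 10.0, 0.0, 0.0, 0.0], target=2)
print(confident_loss < 0.001)  # True
```

The value ln(6), about 1.79, is a useful baseline: a loss near it means the model is doing no better than chance on six classes.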

In [9]:
num_epochs=10

In [10]:
#Calculating the size of training and testing images
train_count=len(glob.glob(train_path+'/**/*.jpg'))
test_count=len(glob.glob(test_path+'/**/*.jpg'))

In [11]:
print(train_count,test_count)

14034 3000


In [12]:
#Model training. After each epoch we save the model if its test accuracy 
#is the best seen so far

best_accuracy=0.0

for epoch in range(num_epochs):
    #Evaluation and training on training dataset
    model.train()
    train_accuracy=0.0
    train_loss=0.0
    
    for i,(images,labels) in enumerate(train_loader):
        
        #Move the batch to the same device as the model
        images=images.to(device)
        labels=labels.to(device)
        
        optimizer.zero_grad()
        
        outputs=model(images)
        loss=loss_function(outputs, labels)
        loss.backward()
        optimizer.step()
        
        train_loss+= loss.item()*images.size(0)
        _,prediction=torch.max(outputs,1)
        
        train_accuracy+=int(torch.sum(prediction==labels))
        
    train_accuracy=train_accuracy/train_count
    train_loss=train_loss/train_count
    
    #Evaluation on testing dataset
    model.eval()
    
    test_accuracy=0.0
    
    with torch.no_grad(): #no gradients are needed for evaluation
        for i,(images,labels) in enumerate(test_loader):
            
            images=images.to(device)
            labels=labels.to(device)
            
            outputs=model(images)
            _,prediction=torch.max(outputs,1)
            test_accuracy+=int(torch.sum(prediction==labels))
        
    test_accuracy=test_accuracy/test_count
    
    print('Epoch: '+ str(epoch)+ 'Train Loss '+str(int(train_loss))+' Train Accuracy: '+str(train_accuracy)+' Test Accuracy '+ str(test_accuracy))
    
    #Save best model
    if test_accuracy>best_accuracy:
        torch.save(model.state_dict(), 'best_checkpoint.model')
        best_accuracy=test_accuracy

Epoch: 0Train Loss 10 Train Accuracy: 0.5106883283454468 Test Accuracy 0.6066666666666667
Epoch: 1Train Loss 1 Train Accuracy: 0.7034345161750035 Test Accuracy 0.6166666666666667
Epoch: 2Train Loss 1 Train Accuracy: 0.778395325637737 Test Accuracy 0.7166666666666667
Epoch: 3Train Loss 0 Train Accuracy: 0.8321932449764857 Test Accuracy 0.6413333333333333
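
The checkpoint rule used in the loop can be replayed on the test accuracies printed above to see which epochs actually wrote `best_checkpoint.model`. This is a minimal sketch of the same comparison; the list `saved_epochs` is added here only for illustration.

```python
# Test accuracies copied from the output above (epochs 0 to 3).
test_accuracies = [0.6066666666666667, 0.6166666666666667,
                   0.7166666666666667, 0.6413333333333333]

best_accuracy = 0.0
saved_epochs = []  # epochs at which the checkpoint would be overwritten
for epoch, accuracy in enumerate(test_accuracies):
    if accuracy > best_accuracy:
        saved_epochs.append(epoch)
        best_accuracy = accuracy

print(saved_epochs)   # [0, 1, 2]
print(best_accuracy)  # 0.7166666666666667
```

Epoch 3 does not overwrite the checkpoint because its test accuracy dropped while the training accuracy kept rising, which may be an early sign of overfitting.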


We will save our trained model to disk using the pickle library. Pickle serializes and de-serializes a Python object structure: the object is converted into a byte stream, and the dump() method writes that byte stream to the file object passed as its second argument.

In our case, we want to save our model so that it can be used by the server, so we save our model object to the file named model.pkl.

In [14]:
with open('model.pkl','wb') as f: #the with block closes the file after writing
    pickle.dump(model, f)
with open('model.pkl2','wb') as f:
    pickle.dump(model, f)
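
On the server side the object is read back with pickle.load(). The sketch below shows the dump/load round trip on a small stand-in object and an in-memory buffer, so it runs without the trained model; `toy_model` is a hypothetical placeholder, not the ConvNet.

```python
import io
import pickle

# Hypothetical stand-in for the trained model; any picklable object works.
toy_model = {"num_classes": 6, "weights": [0.1, -0.2, 0.3]}

buffer = io.BytesIO()            # in-memory substitute for open('model.pkl','wb')
pickle.dump(toy_model, buffer)   # serialize the object into a byte stream
buffer.seek(0)                   # rewind before reading, like reopening the file
restored = pickle.load(buffer)   # de-serialize the byte stream

print(restored == toy_model)  # True
```

For the real model, the code that loads the file needs access to the ConvNet class definition, and pickle files should only be loaded from trusted sources because unpickling can execute arbitrary code.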