
# A practical example to learn Transfer learning with PyTorch

#### Transfer learning is a way to solve a computer vision problem with a potentially small dataset and without much computing power or specialized hardware.


With transfer learning, training can be **faster** and **more accurate**, and it can need **less training data**. Two factors are key to consider before applying it:

1.   ***the size of the dataset***
2.   ***the similarity with the original dataset (the one on which the network was initially trained)***

There are four scenarios:

1. Small dataset and similar to the original: train only the (last) fully connected layer
2. Small dataset and different to the original: train only the fully connected layers
3. Large dataset and similar to the original: freeze the earlier layers (simple features) and train the rest of the layers
4. Large dataset and different to the original: reuse the network architecture and retrain the whole model, either from scratch or using the pretrained weights as a starting point.

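The four scenarios can be read as a small decision table. The sketch below is only an illustration: the function name `choose_strategy` and its two boolean arguments are hypothetical, and what counts as "small" or "similar" is a judgment call that the code does not make for you.

```python
def choose_strategy(dataset_is_small: bool, similar_to_original: bool) -> str:
    """Return the training strategy suggested by the four scenarios above."""
    if dataset_is_small and similar_to_original:
        return "train only the last fully connected layer"
    if dataset_is_small:
        return "train only the fully connected layers"
    if similar_to_original:
        return "freeze the earlier layers and train the rest"
    return "train the whole network, starting from the pretrained weights"


# Assumption: the cat/dog dataset counts as small. Cats and dogs are also
# ImageNet classes, so it is treated as similar to the original dataset.
print(choose_strategy(dataset_is_small=True, similar_to_original=True))
```
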
Note: In a network, the earlier layers capture the simplest features of the images (edges, lines…) whereas the deeper layers capture more complex features built from combinations of the earlier ones (for example eyes or a mouth in a face recognition problem). To fine-tune a model, we therefore retrain only the final layers: the earlier layers already hold knowledge that is useful for the new task.

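In PyTorch, "freezing" a layer means setting `requires_grad = False` on its parameters. Here is a minimal sketch of freezing only the earliest layer. It uses a tiny stand-in network (`toy_model` is hypothetical, not a pretrained model) so that it runs without downloading anything.

```python
import torch.nn as nn

# Hypothetical stand-in for a pretrained network: two convolutional
# stages followed by a fully connected classifier.
toy_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),   # early stage (simple features)
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),  # deeper stage
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 2),                            # classifier head
)

# Freeze only the first convolution; everything after it stays trainable.
for param in toy_model[0].parameters():
    param.requires_grad = False

frozen = sum(p.numel() for p in toy_model.parameters() if not p.requires_grad)
trainable = sum(p.numel() for p in toy_model.parameters() if p.requires_grad)
print(f"frozen: {frozen}, trainable: {trainable}")
```

With a real pretrained model the pattern is the same; only the choice of which sub-modules to loop over changes.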
# **Goal**

1.   Learn how transfer learning can help us solve a problem without spending too much time training a model.
2.   Take advantage of pretrained architectures.


 

Task and pretrained model

---

Dog and cat classification using the pretrained resnet34 model.

Needed packages

We will start by importing the necessary packages.

In [0]:
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt

# Loading Cat and Dog Dataset

In [0]:
# Shared Google Drive link to the dataset archive
link = "https://drive.google.com/file/d/1qXKSt6jAgh20f-2PkRT-Z61PW-3E4S_z/"

# The file ID is the path segment that follows 'd/' in the link
_, id_t = link.split('d/')
file_id = id_t.split('/')[0]

print("Loading file ...")
print(file_id)  # Verify that this is the part of the link between 'd/' and the next '/'

# Install the PyDrive wrapper & import libraries.
# This only needs to be done once per notebook.
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Download the file into the working directory under its Drive title
downloaded = drive.CreateFile({'id': file_id})
downloaded.FetchMetadata(fetch_all=True)
downloaded.GetContentFile(downloaded.metadata['title'])
print("Completed")

In [0]:
!pwd  # show the current working directory

In [0]:
!unzip data.zip

In [0]:
!ls data  # the code below expects data/train and data/val

In [0]:
# Named data_transforms so that the imported torchvision `transforms` module is not overwritten
data_transforms = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        # ImageNet channel means and standard deviations
        transforms.Normalize([0.485, 0.456, 0.406],
                             [0.229, 0.224, 0.225])
    ])

train_set = datasets.ImageFolder("data/train", data_transforms)
val_set   = datasets.ImageFolder("data/val", data_transforms)

train_loader = torch.utils.data.DataLoader(train_set, batch_size=4,
                                       shuffle=True, num_workers=4)
# Validation data does not need shuffling
val_loader = torch.utils.data.DataLoader(val_set, batch_size=4,
                                       shuffle=False, num_workers=4)
classes = train_set.classes  # class names, taken from the sub-folder names
device = torch.device("cuda:0" if torch.cuda.is_available()
                               else "cpu")



# Model Building

Now we define the network we’ll be training. The **resnet34** model was originally trained on a dataset with 1000 class labels, but our dataset has only two. We'll replace the final layer with a new, untrained layer that has only two outputs (dog and cat).

In [0]:
model = models.resnet34(pretrained=True)  # newer torchvision releases use the weights= argument instead

In [0]:
model

In [0]:
classes = ['cat', 'dog']  # ImageFolder sorts class folders alphabetically; assumes they are named cat and dog
mean, std = torch.tensor([0.485, 0.456, 0.406]), torch.tensor([0.229, 0.224, 0.225])

def denormalize(image):
    """Undo the normalization of a 3xHxW tensor and return an HxWx3 image in [0, 1]."""
    image = image.cpu() * std[:, None, None] + mean[:, None, None]  # x * std + mean
    image = image.permute(1, 2, 0)  # changing from 3x224x224 to 224x224x3
    image = torch.clamp(image, 0, 1)
    return image

# helper function to un-normalize and display an image
def imshow(img):
    img = denormalize(img)
    plt.imshow(img)

In [0]:
# imshow(next(iter(train_loader))[0][0])  # show the first image of one batch

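Normalization computes `(x - mean) / std` per channel, so denormalization is `x * std + mean`. A small round-trip check can confirm that the two are exact inverses. The random tensor `original` below is a hypothetical stand-in for a real image.

```python
import torch

mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])

# Hypothetical image: random values in [0, 1], shaped 3 x 224 x 224.
original = torch.rand(3, 224, 224)

# Normalize: (x - mean) / std, applied per channel.
normalized = (original - mean[:, None, None]) / std[:, None, None]

# Denormalize: x * std + mean, the exact inverse.
restored = normalized * std[:, None, None] + mean[:, None, None]

print(torch.allclose(restored, original, atol=1e-6))
```

The `[:, None, None]` indexing reshapes the three per-channel values to 3x1x1 so that they broadcast over height and width.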
## Feature Extraction & Image embedding

In [0]:
for param in model.parameters():
    param.requires_grad = False

In [0]:
# Replace the classifier head of the model
num_ftrs = model.fc.in_features  # number of input features of the final fully connected layer
model.fc = nn.Linear(num_ftrs, 2)  # new, trainable final layer with two outputs (cat, dog)


In [0]:

num_ftrs, model.fc

In [0]:
# Take a single batch from the training loader
inputs, labels = next(iter(train_loader))

# # plot the images in the batch, along with the corresponding labels
# fig = plt.figure(figsize=(20, 8))
# # display the 4 images of the batch
# for idx in np.arange(4):
#     ax = fig.add_subplot(2, 2, idx+1, xticks=[], yticks=[])
#     imshow(inputs[idx])
#     ax.set_title("{} ".format(classes[labels[idx]]))

print(inputs.shape)  # mini-batch size, number of channels, height and width
embeds = model(inputs)  # forward pass; the replaced fc layer gives 2 values per image

In [0]:
print('shape of the model output for one batch:', embeds.shape)
print("-------------------------------------------------")
print(embeds)  # one row per image: the two raw scores from the new, untrained fc layer

### Since we are doing binary classification, we alter the final layer to have two neurons.

In [0]:
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 2)
model = model.to(device)
criterion = nn.CrossEntropyLoss()
# Only the new fc layer is trainable, so only its parameters go to the optimizer
optimizer = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)

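`nn.CrossEntropyLoss` expects raw scores (logits) and integer class indices; it applies log-softmax internally. The sketch below reproduces its value by hand on a made-up batch. The numbers in `logits` and `labels` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical batch: raw scores (logits) for 4 images and 2 classes.
logits = torch.tensor([[2.0, 0.5], [0.1, 1.2], [1.5, 1.5], [-0.3, 0.8]])
labels = torch.tensor([0, 1, 0, 1])  # class indices, not one-hot vectors

loss = nn.CrossEntropyLoss()(logits, labels)

# Same value by hand: log-softmax, then the mean negative
# log-probability of the correct class.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(labels)), labels].mean()

print(loss.item(), manual.item())
```

This is why the model's final layer has no softmax of its own: the loss function already takes care of it.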
# Model Training

In [0]:
model.train()
for epoch in range(25):
    running_loss = 0.0
    for inputs, labels in train_loader:
        inputs = inputs.to(device)
        labels = labels.to(device)

        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # loss.item() is the batch mean, so weight it by the batch size
        running_loss += loss.item() * inputs.size(0)
    # Report the average loss per image for this epoch
    print('Train Epoch: {}\tLoss: {:.6f}'.format(
                epoch, running_loss / len(train_loader.dataset)))
print('Finished Training')

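When reading the loss printed per epoch, keep in mind that `loss.item()` is the mean over one batch. A plain sum of these values grows with the number of batches, whereas a per-image average weights each batch by its size. A minimal sketch with made-up numbers (the helper `epoch_average` is hypothetical):

```python
def epoch_average(batch_losses, batch_sizes):
    """Per-sample mean loss from per-batch mean losses and batch sizes."""
    total = sum(loss * size for loss, size in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


# Hypothetical epoch: three batches, the last one smaller.
losses = [0.5, 0.25, 1.0]
sizes = [4, 4, 2]
print(sum(losses))                    # running sum, grows with the batch count
print(epoch_average(losses, sizes))   # per-sample average
```

The per-sample average is comparable across epochs and across datasets of different sizes, which makes it the more useful number to track.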
# Model Validation

In [0]:
model.eval()  # use the stored batch-norm statistics during evaluation
class_correct = [0.0 for _ in range(2)]
class_total = [0.0 for _ in range(2)]
with torch.no_grad():
    for inputs, labels in val_loader:
        inputs = inputs.to(device)
        labels = labels.to(device)
        outputs = model(inputs)
        _, predicted = torch.max(outputs, 1)
        correct = (predicted == labels)
        # iterate over the actual batch, which may be smaller than 4 at the end
        for label, is_correct in zip(labels, correct):
            class_correct[label.item()] += is_correct.item()
            class_total[label.item()] += 1
for i in range(2):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))

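The per-class counts can also be computed without a Python loop over the batch. The sketch below uses `torch.bincount`. The helper name `per_class_accuracy` and the six example predictions are hypothetical.

```python
import torch


def per_class_accuracy(predicted, labels, num_classes):
    """Fraction of correct predictions for each class index."""
    correct = torch.bincount(labels[predicted == labels], minlength=num_classes)
    total = torch.bincount(labels, minlength=num_classes)
    # clamp avoids a division by zero for classes that never occur
    return correct.float() / total.clamp(min=1).float()


# Hypothetical predictions for six validation images (0 = cat, 1 = dog).
labels = torch.tensor([0, 0, 0, 1, 1, 1])
predicted = torch.tensor([0, 1, 0, 1, 1, 0])
print(per_class_accuracy(predicted, labels, num_classes=2))
```

To use it on a whole validation set, concatenate the predictions and labels of all batches first, then call the function once.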
# Test with our own image

In [0]:
from PIL import Image
from torchvision import transforms as T
model.eval()
#img_name1 = "index.jpeg" # change this to the name of your image file.
img_name2 = "Im2.jpg"

# Same preprocessing as for the training and validation images
preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

def predict_image(image_path, model):
    image = Image.open(image_path).convert("RGB")  # force 3 channels
    image_tensor = preprocess(image).unsqueeze(0)  # add the batch dimension
    image_tensor = image_tensor.to(device)
    with torch.no_grad():
        output = model(image_tensor)
    index = output.argmax().item()
    if index == 0:
        return "It's a Cat"
    elif index == 1:
        return "It's a Dog"
    else:
        return "None of them"
predict_image(img_name2, model)

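The model returns raw scores, so the prediction above says nothing about how confident the model is. Applying a softmax turns the scores into probabilities. A minimal sketch, where the values in `output` are made up and do not come from the trained model:

```python
import torch

# Hypothetical model output for one image: raw scores for [cat, dog].
output = torch.tensor([[0.3, 2.1]])

probs = torch.softmax(output, dim=1)   # each row sums to 1
index = probs.argmax(dim=1).item()
label = ["cat", "dog"][index]
print(f"{label} ({probs[0, index].item():.2f})")
```
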
In [0]:
! ls /content