<img src="https://i.ibb.co/qjt4Ymb/2022-09-19-004719.png" alt="2022-09-19-004719" border="0">

# AOI simple Pipeline (Part 2)

# Exercise: Full solution
* Single CNN model
* ImageDataSet
* ImageDataLoader
* Submit results

AIdea AOI Project
https://aidea-web.tw/topic/285ef3be-44eb-43dd-85cc-f0388bf85ea4

## Step 1: Load the dataset from Google Drive
If the following command does not work, download the files manually from the links below, upload them to your own Google Drive, and enable link sharing.

Download from:
https://drive.google.com/file/d/1tovCO2gsjesjJ8OsfHgahyt-buY34dk0/view?usp=sharing

https://drive.google.com/file/d/1SZXCzhR_cr11tiBq52ASOc0urFxYYhs3/view?usp=sharing

In [None]:
%%bash
gdown https://drive.google.com/uc?id=1_fSiJdT7X_BT_IOf23yn9x5AvvaXSFb_
unzip aoi-dataset.zip
rm aoi-dataset.zip
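
After unzipping, a quick check that the expected files are in place can save a confusing error later. The sketch below assumes the archive extracts `test.csv` and a `test_images` folder into the working directory (the names used later in this notebook). `missing_paths` is a hypothetical helper, not part of the original notebook.

```python
import os

def missing_paths(paths, root="."):
    """Return the entries of `paths` that do not exist under `root`."""
    return [p for p in paths if not os.path.exists(os.path.join(root, p))]

# Assumed archive contents, taken from the names used later in this notebook.
expected = ["test.csv", "test_images"]
missing = missing_paths(expected)
if missing:
    print(f"Missing after unzip: {missing}")
else:
    print("Dataset files found.")
```

If anything is reported missing, the manual download route described above is the fallback.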

## Step 2: Import python libraries

In [None]:
import os
import glob
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

In [None]:
import torch
# True means PyTorch can see a CUDA-capable GPU
print(torch.cuda.is_available())

True


In [None]:
device_name=torch.cuda.get_device_name(0)
print(f"Using GPU {device_name}")

Using GPU Tesla T4


In [None]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

## Step 3: Choose a CNN model and load its saved weights

### EfficientNet B0 to B7

__Model-EfficientNet__

https://pytorch.org/hub/nvidia_deeplearningexamples_efficientnet/

| Base model | Resolution (px) | Base model | Resolution (px) |
|---|---|---|---|
| EfficientNetB0  | 224  | EfficientNetB4  | 380  |
| EfficientNetB1  | 240  | EfficientNetB5  | 456  |
| EfficientNetB2  | 260  | EfficientNetB6  | 528  |
| EfficientNetB3  | 300  | EfficientNetB7  | 600  |
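
The table can be kept as a small lookup so that the resize value in the test transform always matches the chosen variant. This is a minimal sketch; `EFFICIENTNET_RESOLUTION` and `input_resolution` are hypothetical names introduced here for illustration.

```python
# Input resolutions from the table above, keyed by EfficientNet variant.
EFFICIENTNET_RESOLUTION = {
    "b0": 224, "b1": 240, "b2": 260, "b3": 300,
    "b4": 380, "b5": 456, "b6": 528, "b7": 600,
}

def input_resolution(variant):
    """Look up the input resolution for a variant name such as 'b1' or 'B1'."""
    key = variant.lower()
    if key not in EFFICIENTNET_RESOLUTION:
        raise ValueError(f"Unknown EfficientNet variant: {variant!r}")
    return EFFICIENTNET_RESOLUTION[key]

pretrained_size = input_resolution("b1")
print(pretrained_size)  # 240
```

This notebook uses EfficientNet-B1, which is why `pretrained_size` is set to 240 in Step 5.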


In [None]:
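import torchvision.models as models
num_classes = 6
filepath = "AOI-B1.pth"  # saved state_dict for EfficientNet-B1
# Build the architecture first, then load the saved weights into it
model = models.efficientnet_b1(num_classes=num_classes)
model.load_state_dict(torch.load(filepath))
model.cuda()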

## Step 4: Load the test set

In [None]:
import pandas as pd
df_test = pd.read_csv("test.csv")
print(df_test.shape)

(10142, 2)


In [None]:
df_test.head()

|   | ID | Label |
|---|---|---|
| 0 | test_00000.png | NaN |
| 1 | test_00001.png | NaN |
| 2 | test_00002.png | NaN |
| 3 | test_00003.png | NaN |
| 4 | test_00004.png | NaN |


In [None]:
test_files  = df_test.iloc[:,0].values
test_labels = df_test.iloc[:,1].values
print(test_labels[:10])

[nan nan nan nan nan nan nan nan nan nan]


## Step 5: Set up a test_dataloader with test_dataset

In [None]:
from torchvision import transforms
pretrained_size = 240  # EfficientNet-B1 input resolution
pretrained_means = [0.485, 0.456, 0.406]  # ImageNet channel means
pretrained_stds = [0.229, 0.224, 0.225]   # ImageNet channel standard deviations
test_transform = transforms.Compose([
    transforms.Resize(pretrained_size),
    transforms.ToTensor(),
    transforms.Normalize(mean=pretrained_means, std=pretrained_stds)
])
batches = 48  # batch size
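
To make the arithmetic of the transform concrete, the sketch below applies the `ToTensor` scaling and the `Normalize` step to a single RGB pixel by hand. `normalize_pixel` is a hypothetical helper for illustration only; the real transform operates on whole image tensors.

```python
pretrained_means = [0.485, 0.456, 0.406]
pretrained_stds = [0.229, 0.224, 0.225]

def normalize_pixel(rgb_uint8):
    """Apply the ToTensor scaling and Normalize arithmetic to a single RGB pixel."""
    scaled = [v / 255.0 for v in rgb_uint8]  # ToTensor: [0, 255] -> [0.0, 1.0]
    return [(s - m) / sd for s, m, sd in zip(scaled, pretrained_means, pretrained_stds)]

print(normalize_pixel((255, 255, 255)))  # a white pixel
print(normalize_pixel((0, 0, 0)))        # a black pixel
```

Note that `transforms.Resize` with a single integer matches the shorter edge to that size and keeps the aspect ratio, so batching as done here relies on the input images being square.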

In [None]:
from PIL import Image

class CustomDataset(torch.utils.data.Dataset):
    """Reads file names and labels from a CSV and loads the images from a folder."""
    def __init__(self, csv_path, images_folder, transform=None):
        self.df = pd.read_csv(csv_path)
        self.images_folder = images_folder
        self.transform = transform

    def __len__(self):
        return len(self.df)

    def __getitem__(self, index):
        filename = self.df.iloc[index]['ID']
        label = self.df.iloc[index]['Label']  # NaN for the unlabeled test set
        # Convert to RGB so every image has three channels
        image = Image.open(os.path.join(self.images_folder, filename)).convert('RGB')
        if self.transform is not None:
            image = self.transform(image)
        return image, label

In [None]:
imgdir = "test_images"
csvfile = "test.csv"


In [None]:
test_dataset = CustomDataset(csvfile, imgdir, test_transform)
# shuffle=False keeps predictions in the same order as the rows of test.csv
test_dataloader = DataLoader(test_dataset, batch_size=batches, shuffle=False)
print(f"Total images={len(test_dataset)}")

Total images=10142


In [None]:
# Ceiling division: the last batch may be smaller than the others
total_batch = (len(test_dataset) + batches - 1) // batches
print(total_batch)

212
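
The batch count is a ceiling division of the dataset size by the batch size. The sketch below compares it with the `// batches + 1` shortcut, which gives the same answer here but overcounts by one whenever the dataset size is an exact multiple of the batch size. `num_batches` is a hypothetical helper.

```python
import math

def num_batches(num_samples, batch_size):
    """Number of batches a DataLoader yields when drop_last is False."""
    return math.ceil(num_samples / batch_size)

print(num_batches(10142, 48))  # 212
print(10142 // 48 + 1)         # 212, agrees because 10142 is not a multiple of 48
print(num_batches(96, 48))     # 2
print(96 // 48 + 1)            # 3, one too many
```

In practice `len(test_dataloader)` returns the same count without any arithmetic.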


## Step 6: Check test results

In [None]:
test_predictions = np.zeros(len(test_labels))

In [None]:
model.eval()
# Inference only, so no gradients are needed
with torch.no_grad():
    total_batch = len(test_dataloader)
    for i, (batch_images, _) in enumerate(test_dataloader):
        images = batch_images.cuda()
        outputs = model(images)
        # The predicted class is the index of the largest output score
        _, predictions = torch.max(outputs, 1)
        test_predictions[i*batches:(i+1)*batches] = predictions.cpu().numpy()
        if (i+1) % 10 == 0:
            print(f'Iter [{i+1}/{total_batch}]')

Iter [10/212]
Iter [20/212]
Iter [30/212]
Iter [40/212]
Iter [50/212]
Iter [60/212]
Iter [70/212]
Iter [80/212]
Iter [90/212]
Iter [100/212]
Iter [110/212]
Iter [120/212]
Iter [130/212]
Iter [140/212]
Iter [150/212]
Iter [160/212]
Iter [170/212]
Iter [180/212]
Iter [190/212]
Iter [200/212]
Iter [210/212]
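
An alternative to writing into a pre-allocated array is to collect each batch's predictions in a list and concatenate at the end, which avoids the slice arithmetic. The sketch below is a minimal illustration under stated assumptions: `predict_all` is a hypothetical helper, and the toy data and linear model are stand-ins for the real dataloader and EfficientNet.

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def predict_all(model, dataloader, device):
    """Return predicted class indices for every sample, in dataloader order."""
    model.eval()
    chunks = []
    with torch.no_grad():
        for batch_images, _ in dataloader:
            outputs = model(batch_images.to(device))
            chunks.append(outputs.argmax(dim=1).cpu().numpy())
    return np.concatenate(chunks)

# Stand-in data and model (hypothetical): 10 random "images", 6 output classes.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
toy_images = torch.randn(10, 3, 8, 8)
toy_labels = torch.zeros(10)
toy_loader = DataLoader(TensorDataset(toy_images, toy_labels), batch_size=4, shuffle=False)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 6)).to(device)

preds = predict_all(toy_model, toy_loader, device)
print(preds.shape)
```

With the real objects, the call would be `predict_all(model, test_dataloader, device)`, and the result is already an integer array.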


In [None]:
test_predictions=test_predictions.astype(int)
test_predictions[:10]

array([1, 2, 5, 0, 2, 5, 5, 5, 0, 2])
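
Before writing the submission file, it can help to look at how the predictions are spread over the six classes; a model that predicts only one class usually points to a loading or preprocessing problem. The sketch below counts classes for the ten predictions shown above. `class_counts` is a hypothetical helper.

```python
from collections import Counter

def class_counts(predictions, num_classes=6):
    """Count predictions per class, including classes that never appear."""
    counts = Counter(int(p) for p in predictions)
    return [counts.get(c, 0) for c in range(num_classes)]

first_ten = [1, 2, 5, 0, 2, 5, 5, 5, 0, 2]  # the ten predictions printed above
print(class_counts(first_ten))  # [2, 1, 3, 0, 0, 4]
```

Passing the full `test_predictions` array gives the distribution over all test images.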

## Step 7: Output test results

In [None]:
# Work on a copy so that df_test is left unchanged
df_out = df_test.copy()
df_out.shape

(10142, 2)

In [None]:
df_out['Label'] = test_predictions
df_out.to_csv("pt-aoi_B28_1.csv", index=False)
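
A short check of the written file can catch format slips before uploading. The sketch below assumes the submission should have the same `ID` and `Label` columns as `test.csv`, one row per test image, and integer labels from 0 to 5; the platform's actual requirements are not stated in this notebook, so treat these checks as assumptions. `validate_submission` is a hypothetical helper.

```python
import csv
import os

def validate_submission(path, expected_rows, num_classes=6):
    """Check header, row count, and label range of a submission CSV."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        if reader.fieldnames != ["ID", "Label"]:
            return False
        rows = list(reader)
    if len(rows) != expected_rows:
        return False
    # Labels must be plain non-negative integers below num_classes.
    return all(r["Label"].isdigit() and int(r["Label"]) < num_classes for r in rows)

path = "pt-aoi_B28_1.csv"  # file name written by the cell above
if os.path.exists(path):
    print(validate_submission(path, expected_rows=10142))
```

The digit check also flags labels written as floats such as `1.0`, which is what would happen if the predictions were not cast to `int` first.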