<a href="https://colab.research.google.com/github/ST3ALT4/edgeai_robotics/blob/main/image_processing/Copy_of_Deeplearning_with_Images_Part_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook we will implement deep convolutional neural networks for image processing tasks. If you have never implemented a neural network, this notebook will help you start with the basics and gradually reach the stage where you can write your own networks and train them for a specific task. If you are already comfortable implementing a deep learning convolutional neural network, this will be a recap of the basics before we move to Part 2.
Let's get started:

We will begin with reading images and getting them ready for use. You have already used the two popular image libraries **cv2** and **PIL**. We will discuss a couple of points about them. At this point we will also talk about the deep learning frameworks and their formats.
The most common frameworks are **TensorFlow**, **Keras**, and **PyTorch**. Although these are widely used today, they are not the oldest ones. **Torch** and **Caffe** are two of the frameworks used in the early days. The MATLAB interface to Caffe was called **MatCaffe**, while the Python interface was **PyCaffe**. The Torch we are talking about is not the PyTorch we know today. It was a different package whose programming language was **Lua** (`.lua` files). Later it was adapted to Python as **PyTorch**. Let's now start reading images and discussing formats:

In [None]:
# Reading with cv2
import cv2

img = cv2.imread('/content/Image_1.jpg')  # returns None if the file cannot be read
print(img.shape)

# cv2 reads the image into a NumPy array in HWC format (height, width, channels), with the channels in BGR order. HWC is the layout commonly used by the TensorFlow and Keras frameworks. The pixel values are uint8 in the 0-255 range.

(240, 320, 3)
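
To make the HWC layout concrete without needing an image file, here is a minimal sketch on a synthetic array. The name `fake_bgr` and its pixel values are made up for illustration; the only fact it relies on is that cv2 stores channels in BGR order, so reversing the last axis yields RGB.

```python
import numpy as np

# Synthetic 240x320 "image" in the layout cv2.imread returns: HWC, uint8, BGR
fake_bgr = np.zeros((240, 320, 3), dtype=np.uint8)
fake_bgr[..., 0] = 255  # channel 0 is blue in BGR order

height, width, channels = fake_bgr.shape
print(height, width, channels)

# Reversing the channel axis converts BGR to RGB
fake_rgb = fake_bgr[..., ::-1]
print(fake_bgr[0, 0], fake_rgb[0, 0])
```

The same slicing trick works on a real array returned by `cv2.imread`, as an alternative to `cv2.cvtColor`.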


In [None]:
# Reading with Pillow
from PIL import Image
import torchvision.transforms as transforms

t = transforms.ToTensor()
im = Image.open('/content/Image_1.jpg')

im = t(im)
print(im.shape)

# PIL reads the image and returns a PIL Image object. ToTensor then converts it to a torch tensor in CHW format (channels, height, width), which is the layout commonly used by PyTorch, and scales the values to the 0-1 range.

torch.Size([3, 240, 320])
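
The two effects of the conversion described above (HWC to CHW, and 0-255 to 0-1) can be sketched in plain NumPy. This is an approximation for illustration, not torchvision's implementation; `to_chw_float` is a hypothetical helper name.

```python
import numpy as np

def to_chw_float(hwc_uint8):
    """Mimic the two effects described above: HWC -> CHW and 0-255 -> 0-1."""
    chw = np.transpose(hwc_uint8, (2, 0, 1))
    return chw.astype(np.float32) / 255.0

# A synthetic all-white image standing in for a decoded photo
hwc = np.full((240, 320, 3), 255, dtype=np.uint8)
chw = to_chw_float(hwc)
print(chw.shape, chw.dtype, chw.min(), chw.max())
```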


If you are wondering whether we always use cv2 with TensorFlow and PIL with PyTorch, the answer is "not really". We are of course free to use whichever we want, as long as we end up with the correct format. For example, when using PyTorch, I may read images with cv2, but I then need to transpose the NumPy array into CHW format and convert it into a torch tensor. Take a look at the example below if there is any confusion.

In [None]:
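import cv2
import torch

img = cv2.imread('/content/Image_1.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # cv2 loads BGR; most PyTorch models expect RGB
img = img.transpose((2, 0, 1))  # HWC -> CHW
print(img.shape)
img = torch.from_numpy(img)
print(img.shape)
img = img / 255  # float tensor in the 0-1 range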

(3, 240, 320)
torch.Size([3, 240, 320])


Now that we have discussed reading single images, let's talk about reading an entire dataset using a dataloader. We will write a class for it.

In [None]:
# Dataset and DataLoader for an image processing (regression) task

import os
from random import randrange

import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms

to_tensor = transforms.Compose([
    transforms.ToTensor()
])


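class DLdata(Dataset):
    """Paired dataset: input images in 'inp' and ground truth in 'out'."""

    def __init__(self, train=True):
        # `train` is accepted but not used yet; there is only one data folder
        data_dir = '/content/data/'

        self.output = data_dir + 'out'
        self.input = data_dir + 'inp'

        # sorted() keeps the index -> file mapping reproducible
        self.image_names = sorted(os.listdir(self.input))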

    def __len__(self):
        return len(self.image_names)

    def __getitem__(self, idx):
        img_name = self.image_names[idx]
        out_img_name = 'gt_' + img_name
        # Load both images as single-channel grayscale ('L')
        input_image = Image.open(os.path.join(self.input, img_name)).convert('L')  # e.g. noisy1.jpg
        output_image = Image.open(os.path.join(self.output, out_img_name)).convert('L')  # e.g. gt_noisy1.jpg

        x, y = input_image.size  # PIL reports size as (width, height)

        # Take the same random 260x260 crop from the input and the ground truth.
        # The +1 keeps randrange valid when an image is exactly 260 pixels wide or high.
        matrix = 260
        x1 = randrange(0, x - matrix + 1)
        y1 = randrange(0, y - matrix + 1)
        input_image = input_image.crop((x1, y1, x1 + matrix, y1 + matrix))
        output_image = output_image.crop((x1, y1, x1 + matrix, y1 + matrix))

        input_image = to_tensor(input_image)
        output_image = to_tensor(output_image)

        return input_image, output_image



kwargs = {'num_workers': 4, 'pin_memory': True} if torch.cuda.is_available() else {}
print('loading train')
training_set = DLdata(train=True)
train_loader = DataLoader(training_set, batch_size=3, shuffle=True, **kwargs)

loading train


FileNotFoundError: [Errno 2] No such file or directory: '/content/data/inp'
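
The key idea in the dataset class above is that the input and its ground truth must be cropped with the same window. Here is a minimal sketch of that step on synthetic NumPy arrays, so it runs without the `/content/data/` folder. The helper `paired_random_crop` and the arrays `clean` and `noisy` are hypothetical names used only for illustration.

```python
import numpy as np

def paired_random_crop(inp, target, size, rng):
    """Cut the same size x size window out of two equally sized 2-D arrays."""
    if target.shape != inp.shape:
        raise ValueError("input and target must have the same shape")
    h, w = inp.shape
    if h < size or w < size:
        raise ValueError("image is smaller than the crop size")
    top = int(rng.integers(0, h - size + 1))   # upper bound is exclusive
    left = int(rng.integers(0, w - size + 1))
    window = (slice(top, top + size), slice(left, left + size))
    return inp[window], target[window]

rng = np.random.default_rng(0)  # seed chosen arbitrarily
clean = np.arange(300 * 400, dtype=np.float32).reshape(300, 400)
noisy = clean + 1.0  # stand-in for a degraded version of the same image
crop_in, crop_gt = paired_random_crop(noisy, clean, 260, rng)
print(crop_in.shape, crop_gt.shape)
```

Because both crops share one window, every pixel in `crop_in` still lines up with its ground-truth pixel in `crop_gt`.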

In [None]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("nourabdoun/fruits-quality-fresh-vs-rotten")

print("Path to dataset files:", path)

Path to dataset files: /kaggle/input/fruits-quality-fresh-vs-rotten


In [None]:
# DataLoader for a classification task
# !unzip /content/archive.zip

import numpy as np
import torch
import torch.utils.data as data
import torchvision
from torch.utils.data.sampler import SubsetRandomSampler
from torchvision import transforms

TRANSFORM_IMG = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(256),
    transforms.ToTensor(),
    # Enable when using ImageNet-pretrained weights
    # transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

train_data = torchvision.datasets.ImageFolder(root="/kaggle/input/fruits-quality-fresh-vs-rotten/Quality Dataset/train", transform=TRANSFORM_IMG)
valid_data = torchvision.datasets.ImageFolder(root="/kaggle/input/fruits-quality-fresh-vs-rotten/Quality Dataset/valid", transform=TRANSFORM_IMG)

num_train = len(train_data)
indices = list(range(num_train))
split = int(np.floor(.25 * num_train))

print(num_train)
np.random.shuffle(indices)

# These samplers are only needed for the commented-out loaders below,
# which carve a validation subset out of train_data
train_idx, valid_idx = indices[split:], indices[:split]
train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)

# ImageFolder orders samples by class, so shuffle the training batches
train_loader = data.DataLoader(train_data, batch_size=5, shuffle=True, num_workers=4)

test_loader = data.DataLoader(valid_data, batch_size=1, num_workers=4)

# train_loader = data.DataLoader(train_data, batch_size=16, sampler=train_sampler, num_workers=4)

# test_loader  = data.DataLoader(valid_data, batch_size=1, sampler=valid_sampler, num_workers=4)


287
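
The index split computed in the cell above can be checked in isolation. This is a minimal sketch that assumes only the dataset size printed above (287) and the 25% validation fraction; the seed is arbitrary.

```python
import numpy as np

num_train = 287          # dataset size printed above
valid_fraction = 0.25

rng = np.random.default_rng(0)        # seed chosen arbitrarily
indices = rng.permutation(num_train)  # shuffled 0..286

split = int(np.floor(valid_fraction * num_train))
valid_idx, train_idx = indices[:split], indices[split:]
print(len(train_idx), len(valid_idx))  # 216 71
```

Index arrays like these are what `SubsetRandomSampler` expects, so that the training and validation loaders draw from disjoint parts of one dataset.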


In [None]:
# Model for classification (from scratch)
from torch import nn
import torch.nn.functional as F
import torch.optim as optim

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()  # initialise nn.Module so the layers are registered and model(data) dispatches to forward(data)
        # Convolutional layers
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3)
        self.pool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.pool2 = nn.MaxPool2d(2, 2)
        self.conv3 = nn.Conv2d(64,128, kernel_size=3)
        self.pool3 = nn.MaxPool2d(2, 2)
        self.conv4 = nn.Conv2d(128,256, kernel_size=3)
        self.pool4 = nn.MaxPool2d(2, 2)
        self.conv5 = nn.Conv2d(256,512, kernel_size=3)
        self.pool5 = nn.MaxPool2d(2, 2)

        # Fully connected layers
        self.fc1 = nn.Linear(18432, 256)
        self.fc2 = nn.Linear(256, 5)  # Final output layer

    def forward(self, x):
        # Convolutional layers (shapes in comments assume a 3x256x256 input)
        x = F.relu(self.conv1(x))  # 32x254x254
        x = self.pool1(x)          # 32x127x127
        x = F.relu(self.conv2(x))  # 64x125x125
        x = self.pool2(x)          # 64x62x62
        x = F.relu(self.conv3(x))  # 128x60x60
        x = self.pool3(x)          # 128x30x30
        x = F.relu(self.conv4(x))  # 256x28x28
        x = self.pool4(x)          # 256x14x14
        x = F.relu(self.conv5(x))  # 512x12x12
        x = self.pool5(x)          # 512x6x6

        # Flatten to (batch, 512*6*6 = 18432) for the fully connected layers
        x = x.view(x.size(0), -1)

        # Fully connected layers
        x = F.relu(self.fc1(x))
        # Return raw logits: nn.CrossEntropyLoss applies log-softmax itself,
        # so no ReLU or softmax belongs on the final layer
        x = self.fc2(x)

        return x
# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)
model = CNN().to(device)
print(model)


cuda
CNN(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1))
  (pool3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv4): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1))
  (pool4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv5): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1))
  (pool5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=18432, out_features=256, bias=True)
  (fc2): Linear(in_features=256, out_features=5, bias=True)
)
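
The `in_features=18432` of `fc1` follows from the layer sizes printed above. A small sketch of the arithmetic, assuming a 256x256 input (as produced by the `CenterCrop(256)` transform), unpadded 3x3 convolutions and 2x2 max pooling with stride 2; `conv_out` and `pool_out` are hypothetical helper names.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output size of max pooling (floor mode) along one spatial dimension."""
    return (size - kernel) // stride + 1

size = 256
channels = [32, 64, 128, 256, 512]
for out_channels in channels:
    size = pool_out(conv_out(size))  # one conv + pool stage
    print(out_channels, size)

flat_features = channels[-1] * size * size
print(flat_features)  # 18432
```

If the input resolution changes, rerunning this arithmetic gives the new value needed for `fc1`.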


In [None]:
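# Using a pretrained model for feature extraction: use this when the network is not to be trained from scratch

import torch
import torchvision.models as models
from torch import nn

# torchvision >= 0.13 takes a `weights` argument; older versions use pretrained=True
model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
model.classifier[6] = nn.Linear(in_features=4096, out_features=5)  # new 5-class output layer

# Freeze everything first ...
for param in model.parameters():
    param.requires_grad = False

# ... then unfreeze only the classifier so it is the only part that trains
for param in model.classifier.parameters():
    param.requires_grad = True

model = model.to(device)
print(model)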

VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1
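
To see what freezing does without downloading VGG16, here is a minimal sketch on a tiny stand-in network. The `backbone` and `head` modules are hypothetical and far smaller than VGG16; only the `requires_grad` mechanism is the same.

```python
import torch
from torch import nn

# Tiny stand-in for a pretrained feature extractor (hypothetical, not VGG16)
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
head = nn.Linear(8, 5)  # new 5-class output layer
net = nn.Sequential(backbone, head)

# Freeze the feature extractor; the head stays trainable
for param in backbone.parameters():
    param.requires_grad = False

trainable = sum(p.numel() for p in net.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in net.parameters() if not p.requires_grad)
print(trainable, frozen)

# After a backward pass only the head has gradients
out = net(torch.randn(2, 3, 32, 32))
out.sum().backward()
print(backbone[0].weight.grad is None, head.weight.grad is not None)
```

Counting trainable parameters this way is a quick check that the intended layers, and only those, will be updated by the optimizer.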

In [None]:
# Training

# Only optimise parameters that require gradients (matters when layers are frozen)
optimizer = optim.Adam([p for p in model.parameters() if p.requires_grad], lr=0.001)
criterion = nn.CrossEntropyLoss()  # expects raw logits and integer class labels

def train(model, device, train_loader, optimizer, epoch):
    model.train()
    num_samples = len(train_loader.dataset)  # assumes the loader covers the whole dataset (no sampler)
    for batch_idx, (data, target) in enumerate(train_loader):   # data: (batch, 3, 256, 256)
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 10 == 0:
            print(f'Train Epoch: {epoch} [{batch_idx * len(data)}/{num_samples} ({100. * batch_idx / len(train_loader):.0f}%)]\tLoss: {loss.item():.6f}')

def evaluate(model, device, test_loader):
    model.eval()
    test_loss = 0.0
    correct = 0
    total = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)

            pred = output.argmax(dim=1)
            correct += pred.eq(target).sum().item()
            total += target.size(0)  # count the samples actually evaluated
            test_loss += criterion(output, target).item()

    test_loss /= len(test_loader)  # mean of the per-batch losses
    print(f'\nTest set: Average loss: {test_loss:.4f}, Accuracy: {correct}/{total} ({100. * correct / total:.0f}%)\n')


for epoch in range(5):
    train(model, device, train_loader, optimizer, epoch)
    evaluate(model, device, test_loader)
torch.save(model.state_dict(), 'model_weights.pth')


Test set: Average loss: 604.3192, Accuracy: 24/71 (34%)


Test set: Average loss: 135.6931, Accuracy: 24/71 (34%)


Test set: Average loss: 187.0259, Accuracy: 24/71 (34%)


Test set: Average loss: 239.3784, Accuracy: 24/71 (34%)


Test set: Average loss: 103.2980, Accuracy: 24/71 (34%)
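
As a sanity check when reading numbers like those above: a 5-class model that gives every class the same score has a cross-entropy of ln(5), about 1.61. Average losses in the hundreds therefore suggest very large logits rather than a merely untrained model. The sketch below computes cross-entropy from raw logits in NumPy; it is an illustration, not PyTorch's implementation, and the `logits` and `targets` arrays are made up.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean cross-entropy from raw logits, using a numerically stable log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())

num_classes = 5
logits = np.zeros((4, num_classes))   # a model with no preference between classes
targets = np.array([0, 1, 2, 3])
baseline = cross_entropy(logits, targets)
print(round(baseline, 4), round(float(np.log(num_classes)), 4))

preds = logits.argmax(axis=1)         # ties resolve to class 0
accuracy = float((preds == targets).mean())
print(accuracy)
```

An accuracy that stays identical across epochs, as in the output above, is consistent with a model that predicts a single class for every image, which is what `argmax` produces here as well.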

