### **Part 2**
### **Uploading Original Images:**
***You are requested to ignore any warnings.***

**_Caution:_ Upload the file *data1.zip*, found in the *Dataset* folder of the submission zip file, before running any of the cells below.**

The original images are also available in the *data1 Original* and *data1* folders inside the *Dataset* folder attached along with the submission file.

The dataset contains 100 samples of each of the 10 Bengali numerals (1000 images in total), written by different people and collected by us.

In [1]:
#Importing the required libraries
import torch
import torchvision
import torchvision.transforms as transforms

from PIL import Image
import numpy as np
from numpy import savetxt

The cell below unzips the *data1.zip* file. \
Proceed only after the cell below shows output *Done*.

In [None]:
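from zipfile import ZipFile
file_name = 'data1.zip'

# Extract the archive into the current working directory
with ZipFile(file_name, 'r') as zf:   # named zf to avoid shadowing the built-in zip()
  zf.extractall()
  print('Done')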

Done
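
Before extracting, it can help to confirm that the archive is intact. The sketch below is a hypothetical check and not part of the original notebook: it builds a tiny stand-in archive (the names `demo.zip` and `sample0/0.jpg` are made up for illustration), then uses `ZipFile.testzip()` and `ZipFile.namelist()` from the standard library to verify and list its contents before extracting.

```python
import os
import tempfile
from zipfile import ZipFile

# Build a tiny stand-in archive; in the notebook this would be data1.zip.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, 'demo.zip')   # hypothetical name
with ZipFile(archive_path, 'w') as zf:
    zf.writestr('sample0/0.jpg', b'placeholder bytes')
    zf.writestr('sample0/1.jpg', b'placeholder bytes')

with ZipFile(archive_path, 'r') as zf:
    bad_member = zf.testzip()   # None means every member passed its CRC check
    members = zf.namelist()     # names of all entries in the archive
    zf.extractall(workdir)      # extract next to the archive

extracted = os.path.exists(os.path.join(workdir, 'sample0', '0.jpg'))
print(bad_member, len(members), extracted)
```

Replacing `archive_path` with `'data1.zip'` would apply the same check to the real upload.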


The function below returns the full paths of all files in the input directory, including those in its subdirectories.

In [None]:
import os

def ListofFiles(Dir):
    """Return the full paths of all files under Dir (searched recursively)."""
    Files = []
    # root: path of the directory being visited; dir_names: its subdirectories;
    # file_names: the files directly inside root
    for root, dir_names, file_names in os.walk(Dir):
        for name in file_names:
            Files.append(os.path.join(root, name))
    return Files
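
As a quick sanity check, the same recursive walk can be exercised on a throwaway directory. The sketch below is illustrative only: `list_files` is a hypothetical stand-in for `ListofFiles`, and the `sampleN/<digit>.jpg` layout is an assumption based on the paths that appear later in the zip output.

```python
import os
import tempfile

def list_files(directory):
    """Hypothetical stand-in for ListofFiles: full paths of all files under directory."""
    paths = []
    for root, _dirs, names in os.walk(directory):
        for name in names:
            paths.append(os.path.join(root, name))
    return paths

# Create a throwaway tree that mimics the assumed layout sampleN/<digit>.jpg
base = tempfile.mkdtemp()
for sample in ('sample1', 'sample2'):
    os.makedirs(os.path.join(base, sample))
    for digit in range(3):
        open(os.path.join(base, sample, f'{digit}.jpg'), 'w').close()

found = sorted(list_files(base))   # sorted, because os.walk gives no ordering guarantee
print(len(found))
```

`os.walk` does not guarantee any particular order, so sorting the result is worthwhile whenever a reproducible file order matters.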

### **Converting into MNIST format:**

The following cell converts every extracted image to grayscale and resizes it to 28px-by-28px, overwriting each file in place, to make the images **similar to the MNIST format**.

In [None]:
FileList = ListofFiles('/content/data1')

for file in FileList:
    img = Image.open(file).convert('L').resize((28,28))   # grayscaling and resizing to 28px-by-28px images
    img.save(file)
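
The effect of `convert('L').resize((28,28))` can be checked on a synthetic image without touching the dataset. This is a minimal sketch under stated assumptions: the 100x60 red rectangle is a made-up stand-in for a scanned numeral, and nothing is written to disk.

```python
from PIL import Image

# Synthetic 100x60 RGB image standing in for a scanned numeral (assumption).
original = Image.new('RGB', (100, 60), color=(200, 30, 30))

# Same two steps as the notebook cell: grayscale, then resize to 28x28.
converted = original.convert('L').resize((28, 28))

n_pixels = len(converted.tobytes())   # one byte per pixel in mode 'L'
print(converted.mode, converted.size, n_pixels)
```

Note that `resize((28, 28))` does not preserve the aspect ratio, so a non-square input is stretched to fit.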

The modified images are zipped into the *data2.zip* file. Since our original dataset contains 1000 images, this modified dataset also contains 1000 images. \
These images are provided inside the *data2 Resized* folder in the *Dataset* folder attached along with the submission file.

In [None]:
!zip -r /content/data2.zip /content/data1

  adding: content/data1/ (stored 0%)
  adding: content/data1/data1/ (stored 0%)
  adding: content/data1/data1/sample34/ (stored 0%)
  adding: content/data1/data1/sample34/7.jpg (stored 0%)
  adding: content/data1/data1/sample34/5.jpg (stored 0%)
  adding: content/data1/data1/sample34/8.jpg (stored 0%)
  adding: content/data1/data1/sample34/1.jpg (stored 0%)
  adding: content/data1/data1/sample34/3.jpg (stored 0%)
  adding: content/data1/data1/sample34/6.jpg (stored 0%)
  adding: content/data1/data1/sample34/9.jpg (stored 0%)
  adding: content/data1/data1/sample34/2.jpg (stored 0%)
  adding: content/data1/data1/sample34/0.jpg (stored 0%)
  adding: content/data1/data1/sample34/4.jpg (stored 0%)
  adding: content/data1/data1/sample42/ (stored 0%)
  adding: content/data1/data1/sample42/7.jpg (stored 0%)
  adding: content/data1/data1/sample42/5.jpg (stored 0%)
  adding: content/data1/data1/sample42/8.jpg (stored 0%)
  adding: content/data1/data1/sample42/1.jpg (stored 0%)
  adding: content/

### **Data Augmentation:**

As instructed in the problem statement, standard augmenting transforms are performed using the available PyTorch transforms. As a design choice, we left out the transforms that would be detrimental to our dataset.

Nine augmentations were performed on each of the 1000 modified images, so together with the unaugmented images the whole dataset now consists of 10000 images.

The 9 augmentations performed are explained in the comments in the cell below.

We finally save the images into a CSV file, *numbers.csv*, with one row per image (the label followed by the 784 pixel values), to make the training easier later.



In [None]:
FileList = ListofFiles('/content/data1')

pixels=[]
for file in FileList:

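    # The file name is "<digit>.jpg", so the character just before the first '.' is the label
    # (this assumes no other '.' appears earlier in the path)
    pos = file.find('.')
    label = int(file[pos-1])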

    # Identical image (no transformation being performed)    
    img0 = Image.open(file)
    file0 = file[:pos] + file[pos] + file[pos+1:]
    img0.save(file0)
    pixels.append([label] + list(img0.getdata()))
    
    # Random Rotation of the image (keeping center invariant), from -10 degrees to +10 degrees.
    transform1 = transforms.RandomAffine(degrees=10)
    img1 = transform1(img0)
    file1 = file[:pos] + '1' + file[pos] + file[pos+1:]
    img1.save(file1)
    pixels.append([label] + list(img1.getdata()))

    # Random rotation between -5 and +5 degrees, and a random horizontal translation of up to 0.1*width
    transform2 = transforms.RandomAffine(degrees=5,translate=(0.1,0))
    img2 = transform2(img0)
    file2 = file[:pos] + '2' + file[pos] + file[pos+1:]
    img2.save(file2)
    pixels.append([label] + list(img2.getdata()))

    # Random rotation between -5 and +5 degrees, and a random vertical translation of up to 0.1*height
    transform3 = transforms.RandomAffine(degrees=5,translate=(0,0.1))
    img3 = transform3(img0)
    file3 = file[:pos] + '3' + file[pos] + file[pos+1:]
    img3.save(file3)
    pixels.append([label] + list(img3.getdata()))

    # Random horizontal and vertical translations of up to 0.1*width and 0.1*height respectively (no rotation)
    transform4 = transforms.RandomAffine(degrees=0,translate=(0.1,0.1))
    img4 = transform4(img0)
    file4 = file[:pos] + '4' + file[pos] + file[pos+1:]
    img4.save(file4)
    pixels.append([label] + list(img4.getdata()))

    # No translation or rotation, but a random shear between -5 and +5 degrees parallel to both the X-axis and the Y-axis
    transform5 = transforms.RandomAffine(degrees=0,translate=(0,0),shear=[-5,5,-5,5])
    img5 = transform5(img0)
    file5 = file[:pos] + '5' + file[pos] + file[pos+1:]
    img5.save(file5)
    pixels.append([label] + list(img5.getdata()))

    # Gaussian blur with a 3x3 kernel and sigma drawn uniformly from [0.1, 1.0]
    transform6 = transforms.GaussianBlur(3,sigma=(0.1,1.0))
    img6 = transform6(img0)
    file6 = file[:pos] + '6' + file[pos] + file[pos+1:]
    img6.save(file6)
    pixels.append([label] + list(img6.getdata()))

    # Composing transform2 and transform3
    transform7 = transforms.Compose([transform2,transform3])
    img7 = transform7(img0)
    file7 = file[:pos] + '7' + file[pos] + file[pos+1:]
    img7.save(file7)
    pixels.append([label] + list(img7.getdata()))

    # Adjusting the sharpness of the image with a sharpness factor of 1.5, applied with a probability of 0.8
    transform8 = transforms.RandomAdjustSharpness(1.5,p=0.8)
    img8 = transform8(img0)
    file8 = file[:pos] + '8' + file[pos] + file[pos+1:]
    img8.save(file8)
    pixels.append([label] + list(img8.getdata()))

    # Perspective Transformation with a Distortion Scale of 0.1 and a Probability of 0.5
    transform9 = transforms.RandomPerspective(distortion_scale=0.1,p=0.5)
    img9 = transform9(img0)
    file9 = file[:pos] + '9' + file[pos] + file[pos+1:]
    img9.save(file9)
    pixels.append([label] + list(img9.getdata()))

# Saving the images into a CSV file "numbers.csv" to make the training easier later
pixels_arr=np.asarray(pixels)
print(pixels_arr.shape)     # shape of the array written to the CSV file: 10000 images x (1 label + 784 pixels)
savetxt('numbers.csv', pixels_arr, delimiter=',',fmt='%d')

(10000, 785)
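
Each row therefore holds 785 values: one label followed by 28 × 28 = 784 pixel intensities. A minimal sketch of reading such a file back for training is given below. It is illustrative only: it writes a small synthetic file (`numbers_demo.csv`, a hypothetical name) in the same layout instead of using the real *numbers.csv*, and the scaling to [0, 1] is an assumed preprocessing step, not one taken from this notebook.

```python
import os
import tempfile
import numpy as np

# Synthetic stand-in for numbers.csv: 5 rows of [label, 784 pixel values].
rng = np.random.default_rng(0)
labels = np.array([0, 1, 2, 3, 4])
pixels = rng.integers(0, 256, size=(5, 784))
demo = np.column_stack([labels, pixels])

csv_path = os.path.join(tempfile.mkdtemp(), 'numbers_demo.csv')  # hypothetical file name
np.savetxt(csv_path, demo, delimiter=',', fmt='%d')

# Load the file back and split the label column from the pixel columns.
data = np.loadtxt(csv_path, delimiter=',', dtype=np.int64)
y = data[:, 0]                                 # labels, shape (5,)
X = data[:, 1:].reshape(-1, 28, 28) / 255.0    # images, shape (5, 28, 28), scaled to [0, 1] (assumption)

print(y.shape, X.shape)
```

Pointing `csv_path` at the real *numbers.csv* would yield 10000 labels and 10000 images of shape 28 × 28.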


The augmented images, along with the original images, are zipped into the *data3.zip* file. Since our original dataset contains 1000 images, this augmented dataset contains 10000 images. \
All these images are provided inside the *data3 Augmented* folder in the *Dataset* folder attached along with the submission file.

In [None]:
!zip -r /content/data3.zip /content/data1

Streaming output truncated to the last 5000 lines.
  adding: content/data1/data1/sample8/6.jpg (stored 0%)
  adding: content/data1/data1/sample8/02.jpg (stored 0%)
  adding: content/data1/data1/sample8/46.jpg (stored 0%)
  adding: content/data1/data1/sample8/01.jpg (stored 0%)
  adding: content/data1/data1/sample8/78.jpg (stored 0%)
  adding: content/data1/data1/sample8/13.jpg (stored 0%)
  adding: content/data1/data1/sample8/92.jpg (stored 0%)
  adding: content/data1/data1/sample8/75.jpg (stored 0%)
  adding: content/data1/data1/sample8/04.jpg (stored 0%)
  adding: content/data1/data1/sample8/9.jpg (stored 0%)
  adding: content/data1/data1/sample8/85.jpg (stored 0%)
  adding: content/data1/data1/sample8/86.jpg (stored 0%)
  adding: content/data1/data1/sample8/51.jpg (stored 0%)
  adding: content/data1/data1/sample8/79.jpg (stored 0%)
  adding: content/data1/data1/sample8/49.jpg (stored 0%)
  adding: content/data1/data1/sample8/16.jpg (stored 0%)
  adding: content/data1/d
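
The file names in the listing above follow the naming used by the augmentation cell: the first digit is the numeral's label and the second digit, when present, is the number of the transform that produced the image (no second digit means the unaugmented image). A small sketch of decoding such a name is shown below. `parse_name` is a hypothetical helper, not part of the notebook; it works on the base name only, so it does not depend on where the first `.` occurs in the full path.

```python
import os

def parse_name(path):
    """Hypothetical helper: return (label, augmentation_index) for a path like '.../46.jpg'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    label = int(stem[0])                          # first digit: the numeral
    aug = int(stem[1]) if len(stem) > 1 else 0    # second digit: transform number, 0 if absent
    return label, aug

first = parse_name('content/data1/data1/sample8/46.jpg')
second = parse_name('content/data1/data1/sample8/9.jpg')
print(first, second)   # (4, 6) (9, 0)
```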

In [None]:
!rm -rf data1         # Deleting the extracted "data1" folder