# Model Development 
This notebook was used to develop the model.py, image_collator.py, and get_batch.py files.

It also showcases the data processing pipeline and model functionality.

## Load Image Data

In [1]:
import os
import torch, torchvision
import torchvision.transforms as transforms
from PIL import Image


image_dir = r'C:\Users\hunte\OneDrive\Documents\Coding Projects\Signature-Similarity-Checker\data\handwritten-signatures\sample_Signature\sample_Signature\forged'

# Get all file names in the directory
image_files = [f for f in os.listdir(image_dir) if f.endswith('.PNG')]

# Load images into a list using PIL
images = [Image.open(os.path.join(image_dir, img_file)) for img_file in image_files]

# Show the first three images
images[:3]

[<PIL.PngImagePlugin.PngImageFile image mode=L size=1354x530>,
 <PIL.PngImagePlugin.PngImageFile image mode=L size=1829x730>,
 <PIL.PngImagePlugin.PngImageFile image mode=L size=2110x834>]

## Collate Image Data

In [2]:
from image_collator import ImageCollator

collator = ImageCollator()

tensor_stack = collator.collate(images, num_poolings=3, print_shapes=True, resize_size=(50, 150))



Shapes of images post converting to tensor: 

torch.Size([1, 530, 1354]) torch.Size([1, 730, 1829]) torch.Size([1, 834, 2110]) torch.Size([1, 855, 1756]) torch.Size([1, 559, 1213]) 
torch.Size([1, 736, 933]) torch.Size([1, 423, 1304]) torch.Size([1, 599, 1675]) torch.Size([1, 826, 2073]) torch.Size([1, 653, 1545]) 
torch.Size([1, 623, 1078]) torch.Size([1, 573, 1203]) torch.Size([1, 470, 1345]) torch.Size([1, 698, 1928]) torch.Size([1, 611, 1864]) 
torch.Size([1, 619, 1435]) torch.Size([1, 398, 1777]) torch.Size([1, 647, 1081]) torch.Size([1, 505, 1527]) torch.Size([1, 653, 1816]) 
torch.Size([1, 730, 2178]) torch.Size([1, 800, 1648]) torch.Size([1, 542, 1419]) torch.Size([1, 684, 1019]) torch.Size([1, 579, 1533]) 
torch.Size([1, 864, 1635]) torch.Size([1, 800, 1952]) torch.Size([1, 766, 1760]) torch.Size([1, 476, 1045]) torch.Size([1, 740, 1183]) 


Shapes of images post max pooling: 

torch.Size([1, 66, 169]) torch.Size([1, 91, 228]) torch.Size([1, 104, 263]) torch.Size([1, 106, 21
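
The pooled shapes printed above are consistent with three rounds of 2x2 max pooling with stride 2 and no padding, where each round floor-divides the height and width by 2. The kernel size and stride are an assumption inferred from the printed shapes, not something confirmed from image_collator.py, and `pooled_size` and `pooled_shape` are hypothetical helpers written only for this check. A minimal sketch of the arithmetic:

```python
def pooled_size(size, num_poolings, kernel=2):
    """Length of one spatial dimension after repeated non-overlapping pooling.

    Assumes kernel == stride and no padding (an assumption, see lead-in).
    """
    for _ in range(num_poolings):
        size = size // kernel  # a trailing partial window is dropped
    return size


def pooled_shape(shape, num_poolings=3):
    """Apply pooled_size to the height and width of a (C, H, W) shape."""
    channels, height, width = shape
    return (
        channels,
        pooled_size(height, num_poolings),
        pooled_size(width, num_poolings),
    )


# The first three raw shapes from the output above
print(pooled_shape((1, 530, 1354)))  # (1, 66, 169)
print(pooled_shape((1, 730, 1829)))  # (1, 91, 228)
print(pooled_shape((1, 834, 2110)))  # (1, 104, 263)
```

Because the pooled images still differ in size, the `resize_size=(50, 150)` argument is presumably what brings them to a common shape before stacking.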



## Build Batches of Image Pairs

In [3]:
from get_batch import Build_Batch

batch_builder = Build_Batch()

batch = batch_builder.build_batch(tensor_stack)

The tensor shape below reads as [870 image pairs, 2 images per pair, 1 channel (a singleton dimension, since the images are grayscale), height 50, width 150]. The count of 870 is consistent with every ordered pair of the 30 input images (30 × 29 = 870).

In [4]:
batch.shape

torch.Size([870, 2, 1, 50, 150])
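
One way to arrive at a batch of this shape is to enumerate every ordered pair of distinct images in the stack. The sketch below is an assumption about how such a batch could be built, not the actual `Build_Batch` implementation in get_batch.py; `build_pairs` is a hypothetical name, and random data stands in for the collated signatures.

```python
import itertools

import torch


def build_pairs(stack: torch.Tensor) -> torch.Tensor:
    """Return all ordered pairs (i, j), i != j, from a stack of shape [N, C, H, W].

    Hypothetical helper: the real Build_Batch may select pairs differently.
    """
    n = stack.shape[0]
    # Index tensor of shape [N * (N - 1), 2], one row per ordered pair
    idx = torch.tensor(list(itertools.permutations(range(n), 2)))
    # Advanced indexing gathers both images of each pair:
    # result shape is [N * (N - 1), 2, C, H, W]
    return stack[idx]


# Stand-in for the collated images: 30 grayscale images of size 50 x 150
stack = torch.rand(30, 1, 50, 150)
pairs = build_pairs(stack)
print(pairs.shape)  # torch.Size([870, 2, 1, 50, 150])
```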

## Forward Pass

In [6]:
from model import BiEncoder

model = BiEncoder(threshold=0.5)

model


BiEncoder(
  (conv_layer_1): Conv2d(1, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_layer_2): Conv2d(8, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_layer_3): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_layer_4): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_layer_5): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_layer_6): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (linear_layer): Linear(in_features=3456, out_features=128, bias=True)
)
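
The `in_features=3456` of the linear layer can be checked by hand. The convolutions shown use 3x3 kernels with stride 1 and padding 1, so they preserve height and width. If the forward pass also applies three 2x2 max-pooling steps (an assumption, since functional pooling would not appear in the printed module), a 50 x 150 input shrinks to 6 x 18, and 32 channels x 6 x 18 = 3456. `flattened_features` below is a hypothetical helper for that arithmetic:

```python
def flattened_features(height, width, out_channels, num_poolings, kernel=2):
    """Number of features after pooling and flattening a [C, H, W] feature map.

    Assumes size-preserving convolutions and non-overlapping pooling
    with no padding (assumptions, not confirmed from model.py).
    """
    for _ in range(num_poolings):
        height //= kernel
        width //= kernel
    return out_channels * height * width


print(flattened_features(50, 150, out_channels=32, num_poolings=3))  # 3456
```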

Run Forward Pass

In [7]:
preds = model(batch)

print(f'preds shape: {preds.shape}')
print(f'first 5 preds: {preds[:5]}')

preds shape: torch.Size([870])
first 5 preds: tensor([1., 1., 1., 1., 1.])


The model outputs a vector of binary predictions, one per image pair, indicating whether the two images are judged similar.

The model has not been trained yet, so every prediction is 1.
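
A common way for a bi-encoder to turn two embeddings into a binary prediction is to threshold a similarity score. The sketch below assumes cosine similarity between 128-dimensional embeddings compared against `threshold=0.5`; this is an assumption about what `BiEncoder` does, and model.py may use a different score. `predict_similar` is a hypothetical name.

```python
import torch
import torch.nn.functional as F


def predict_similar(emb_a, emb_b, threshold=0.5):
    """Binary similarity predictions from two batches of embeddings [B, D].

    Hypothetical scoring rule: cosine similarity above the threshold -> 1.0.
    """
    sims = F.cosine_similarity(emb_a, emb_b, dim=1)  # shape [B], values in [-1, 1]
    return (sims > threshold).float()


emb = torch.rand(5, 128) + 0.1  # strictly positive stand-in embeddings
print(predict_similar(emb, emb))   # identical embeddings: all ones
print(predict_similar(emb, -emb))  # opposite embeddings: all zeros
```

Under this assumed rule, an untrained encoder that maps different signatures to embeddings pointing in similar directions would clear the threshold for every pair, which would be one plausible reason for uniform predictions of 1.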