Paper reference: https://arxiv.org/pdf/1707.02131.pdf

- Focus: *offline*, *writer-independent* signature verification.
- Harder than many other one-shot tasks because writing styles can differ greatly.
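
Writer-independent evaluation means the writers used for testing never appear in training. A minimal sketch of such a split is below. It assumes file names follow the `<prefix>_<writer>_<sample>.png` pattern seen in the archive listing (e.g. `forgeries_10_1.png`); the `original_` prefix in the usage example and the helper names `writer_id` and `split_by_writer` are hypothetical, not taken from the paper.

```python
import re

# Assumed naming scheme: "<prefix>_<writer>_<sample>.png", optionally
# preceded by "proc_" (the prefix added during preprocessing).
_NAME_RE = re.compile(r"^(?:proc_)?[A-Za-z]+_(\d+)_(\d+)\.png$")

def writer_id(filename):
    """Return the writer number encoded in a file name, or None if it does not match."""
    match = _NAME_RE.match(filename)
    return int(match.group(1)) if match else None

def split_by_writer(filenames, test_writers):
    """Split file names so that no writer appears in both train and test."""
    train, test = [], []
    for name in filenames:
        wid = writer_id(name)
        if wid is None:
            continue  # skip anything that is not a signature image
        (test if wid in test_writers else train).append(name)
    return train, test
```

For example, `split_by_writer(["forgeries_10_1.png", "original_3_2.png"], {10})` puts the writer-10 file in the test list and the writer-3 file in the train list.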

In [1]:
!pip install unrar

Collecting unrar
  Downloading unrar-0.4-py3-none-any.whl (25 kB)
Installing collected packages: unrar
Successfully installed unrar-0.4


In [2]:
!wget http://www.cedar.buffalo.edu/NIJ/data/signatures.rar

--2021-09-13 17:43:25--  http://www.cedar.buffalo.edu/NIJ/data/signatures.rar
Resolving www.cedar.buffalo.edu (www.cedar.buffalo.edu)... 128.205.33.100
Connecting to www.cedar.buffalo.edu (www.cedar.buffalo.edu)|128.205.33.100|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://cedar.buffalo.edu/NIJ/data/signatures.rar [following]
--2021-09-13 17:43:26--  https://cedar.buffalo.edu/NIJ/data/signatures.rar
Resolving cedar.buffalo.edu (cedar.buffalo.edu)... 128.205.33.100
Connecting to cedar.buffalo.edu (cedar.buffalo.edu)|128.205.33.100|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 253587033 (242M) [text/plain]
Saving to: ‘signatures.rar’


2021-09-13 17:43:35 (28.2 MB/s) - ‘signatures.rar’ saved [253587033/253587033]



In [3]:
!unrar x signatures.rar signatures/


UNRAR 5.50 freeware      Copyright (c) 1993-2017 Alexander Roshal


Extracting from signatures.rar

Creating    signatures                                                OK
Creating    signatures/signatures                                     OK
Creating    signatures/signatures/full_forg                           OK
Extracting  signatures/signatures/full_forg/forgeries_10_1.png             0%  OK 
Extracting  signatures/signatures/full_forg/forgeries_10_10.png            0%  OK 
Extracting  signatures/signatures/full_forg/forgeries_10_11.png            0%  OK 
Extracting  signatures/signatures/full_forg/forgeries_10_12.png            0%  OK 
Extracting  signatures/signatures/full_forg/forgeries_10_13.png            0%  OK 
Extracting  signatures/signatures/full_forg/forgeries_10_14.png            0%  OK 
Extracting  signatures/signatures/full_forg/forgeries_10_15.png            0%  OK 
Extracting  signatures/signatures/fu

In [4]:
import cv2
import os
import glob
import numpy as np
import matplotlib.pyplot as plt

In [5]:
base_dir = "signatures/signatures"

In [6]:
len(os.listdir(f"{base_dir}/full_org")), len(os.listdir(f"{base_dir}/full_forg"))

(1321, 1321)
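
`os.listdir` counts every entry in a directory, not only images, so the totals above may include non-PNG files. A small sketch for tallying entries by extension follows; the helper name `count_by_extension` is hypothetical.

```python
import os
from collections import Counter

def count_by_extension(directory):
    """Count directory entries grouped by lowercased file extension."""
    counts = Counter()
    for entry in os.listdir(directory):
        ext = os.path.splitext(entry)[1].lower()
        counts[ext] += 1
    return counts
```

Calling `count_by_extension(f"{base_dir}/full_org")` would show how many entries are `.png` files and how many are something else.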

In [7]:
# makedirs with exist_ok=True lets this cell be re-run without raising FileExistsError.
os.makedirs("processed_signatures/full_org", exist_ok=True)
os.makedirs("processed_signatures/full_forg", exist_ok=True)

In [8]:
from tqdm.auto import tqdm
from joblib import Parallel, delayed

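def _get_img_paths():
  """Yield the path of every PNG inside each class sub-directory of base_dir."""
  for sigclass in os.listdir(base_dir):
    class_path = os.path.join(base_dir, sigclass)
    # Skip plain files such as Readme.txt; only class directories hold images.
    if os.path.isdir(class_path):
      for image in os.listdir(class_path):
        if image.endswith(".png"):
          yield os.path.join(class_path, image)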

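def preprocess_signature(img_path):
  image = cv2.imread(img_path)
  if image is None:
    print(f"The image {img_path} cannot be read. Skipping")
    return
  # Bi-linear interpolation. cv2.resize takes dsize as (width, height), so
  # (220, 155) yields a 155x220 (height x width) image, assuming that is the
  # orientation the paper means by 155x220.
  image = cv2.resize(image, (220, 155), interpolation=cv2.INTER_LINEAR)
  # Invert image.
  image = cv2.bitwise_not(image)
  # Save image under processed_signatures/<class directory>/proc_<original name>.
  class_name = os.path.basename(os.path.dirname(img_path))
  filename = os.path.join("processed_signatures", class_name, "proc_" + os.path.basename(img_path))
  cv2.imwrite(filename, image)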

# TODO: Use joblib for faster saving?
# _ = Parallel(n_jobs=-1)(delayed(preprocess_signature)(img_path) for img_path in tqdm(_get_img_paths(), total=1321*2))
for img_path in _get_img_paths():
  preprocess_signature(img_path)
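
As one possible answer to the TODO above, the loop could be run with a thread pool from the standard library instead of joblib. This is only a sketch: the helper name `run_parallel` and the default of 8 workers are assumptions, and whether it is actually faster here has not been measured.

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(func, items, max_workers=8):
    """Apply func to every item using a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))
```

Usage would be `run_parallel(preprocess_signature, _get_img_paths())` in place of the `for` loop.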

In [9]:
!ls processed_signatures/full_org | wc -l

1320


In [10]:
!ls processed_signatures/full_forg | wc -l

1320
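
Counting files confirms the totals but not which images, if any, were skipped. A sketch that lists source PNGs with no processed counterpart is below; it assumes the `proc_` prefix used when saving, and the helper name `missing_outputs` is hypothetical.

```python
import os

def missing_outputs(src_dir, dst_dir, prefix="proc_"):
    """Return the expected output names in dst_dir that are absent, sorted."""
    expected = {prefix + f for f in os.listdir(src_dir) if f.endswith(".png")}
    present = set(os.listdir(dst_dir))
    return sorted(expected - present)
```

An empty list from `missing_outputs(f"{base_dir}/full_forg", "processed_signatures/full_forg")` would mean every forgery image was processed.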


In [None]:
!zip -r processed_signatures.zip processed_signatures/