## Project Setup

If you are using Miniconda, first create an environment with this Command Prompt command: ```conda create --name project_env python=3.12``` and then activate it with ```conda activate project_env```

Before using this program, install everything in the `requirements.txt` file with this Command Prompt command:

```pip install -r requirements.txt```

Download the training dataset here: https://iris.di.ubi.pt/ubipr.html (Use the 'original' version.)

Unzip the dataset into any folder, then set the `DATASET_PATH` field in `paths.env` to that folder.

## Import Statements and Parameters

**Run code in this section before any other section.**


In [None]:
print("Importing...")
from lib import *
print("Imported required modules.")

IMG_SIZE = (105, 105)                       # All images are scaled to this size (width, height).
IMG_WITH_CHANNELS_SIZE = (105, 105, 3)      # The full input shape of the images, including color channels (width, height, channels).
BATCH_SIZE = 32                             # The number of image-pair samples processed in each batch, i.e. between backpropagation passes.
EPOCHS = 4                                  # Number of training epochs.
STEPS_PER_EPOCH = 50                        # Number of batches the training loop runs per epoch.

# Each epoch therefore covers about STEPS_PER_EPOCH * BATCH_SIZE = 1,600 image pairs.

# Read the file path configuration from `paths.env` and store it in instance variables.
env = EnvLoader("paths.env")
print(env)
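
`EnvLoader` comes from this project's `lib` module and its implementation is not shown in this notebook. As a rough illustration of the idea, a loader of this kind might parse `KEY=VALUE` lines and expose each key as an attribute. The class name `SimpleEnvLoader` and the assumed file format are hypothetical and only serve this sketch; the real class may behave differently.

```python
from pathlib import Path


class SimpleEnvLoader:
    """Hypothetical stand-in for EnvLoader: reads KEY=VALUE lines into attributes."""

    def __init__(self, env_file):
        for raw_line in Path(env_file).read_text(encoding="utf-8").splitlines():
            line = raw_line.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            # Strip optional surrounding quotes and store the value as a Path.
            setattr(self, key.strip(), Path(value.strip().strip('"').strip("'")))

    def __str__(self):
        # One "KEY = value" line per configured path, sorted by key.
        return "\n".join(f"{k} = {v}" for k, v in sorted(vars(self).items()))
```

With a file containing a line such as `DATASET_PATH="/data/ubipr"` (a made-up path), `SimpleEnvLoader("paths.env").DATASET_PATH` would be a `Path` object pointing at that folder.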

Load data from the filesystem if it exists:

In [None]:
# Load gallery
gallery_dict = load_gallery_images(env.GALLERY_IMAGE_PATH)

# Define the structure of the CNN. Will eventually contain trained values.
base_cnn = create_base_cnn(IMG_WITH_CHANNELS_SIZE)

# Load gallery embeddings if they already exist
gallery_embeddings = load_gallery_embeddings(env.GALLERY_EMBEDDING_PATH)

try:
    siamese_model = load_siamese_model(env.MODEL_SAVE_PATH)

    # Copy weights from the trained Siamese model
    # In the Siamese model, the base CNN is the 3rd layer (index 2)
    for base_layer, siam_layer in zip(base_cnn.layers, siamese_model.layers[2].layers):
        base_layer.set_weights(siam_layer.get_weights())

except RuntimeError as e:
    print("Model was not loaded", e)

## Siamese Model Training

**Only run code in this section if new training data has been added, otherwise skip.**

Dataset Image Preparation: loads the training images (from the cache if it exists, otherwise from the raw files, which are then cached) and builds a TensorFlow dataset from them.

In [None]:
# Check if processed images already exist
if os.path.exists(env.PROCESSED_IMAGE_PATH):
    # Load already processed images (took 12s)
    print("Loading preprocessed images from cache...")
    images_dict = np.load(env.PROCESSED_IMAGE_PATH, allow_pickle=True).item()
else:
    # Process images for the first time (took 10m 57s)
    print("Processing raw images...")
    images_dict = load_images_by_filename(env.RAW_IMAGE_PATH, IMG_SIZE)
    
    print(f"Saving preprocessed images to {env.PROCESSED_IMAGE_PATH}...")
    np.save(env.PROCESSED_IMAGE_PATH, images_dict)

train_ds = make_tf_dataset(images_dict, BATCH_SIZE, IMG_SIZE)
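
The cache in the cell above stores a Python dictionary inside a NumPy file. The sketch below shows that round trip on toy data; the file name and the toy dictionary are made up for illustration. One detail worth knowing: `np.save` appends `.npy` to a file name that lacks it, so the `os.path.exists` check only finds the cache if the configured path already ends in `.npy`.

```python
import os
import tempfile

import numpy as np

# Toy stand-in for images_dict: person id -> list of image arrays (hypothetical data).
toy_images = {
    "person_a": [np.zeros((2, 2, 3), dtype=np.uint8)],
    "person_b": [np.ones((2, 2, 3), dtype=np.uint8)] * 2,
}

with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, "processed_images.npy")

    # np.save wraps the dict in a 0-d object array and pickles it.
    np.save(cache_path, toy_images)
    cache_exists = os.path.exists(cache_path)

    # allow_pickle=True is required to read object arrays; .item() unwraps the dict.
    restored = np.load(cache_path, allow_pickle=True).item()

print(cache_exists, sorted(restored), len(restored["person_b"]))
```

Because the file is unpickled on load, only load cache files that you created yourself or otherwise trust.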

Sanity check: confirm that almost every person has the expected number of images.

In [None]:
def get_image_counts():
    images_per_person = [len(imgs) for _person, imgs in images_dict.items()]
    bins = np.bincount(images_per_person)
    indices = np.arange(0, bins.shape[0])
    return np.transpose(np.array((indices, bins)))

get_image_counts()
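
To make the returned table easier to read, here is the same counting logic applied to a small made-up dictionary. Each row `[n, k]` means that `k` people have exactly `n` images. The toy data and the final filtering step are additions for this sketch only.

```python
import numpy as np

# Hypothetical toy data: three people with 3, 3, and 2 images.
toy_images = {"p1": ["a", "b", "c"], "p2": ["d", "e", "f"], "p3": ["g", "h"]}

images_per_person = [len(imgs) for imgs in toy_images.values()]
counts = np.bincount(images_per_person)  # counts[n] = number of people with n images
table = np.column_stack((np.arange(counts.size), counts))

# Keep only the image counts that actually occur.
print(table[table[:, 1] > 0])
```

If most people share one image count, a single row dominates the table and any other rows point at people with missing or extra images.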

Training: runs the Siamese model's training algorithm. This can take a while.

In [None]:
from keras.callbacks import EarlyStopping
from keras.optimizers import Adam

early_stop = EarlyStopping(
    monitor='loss',
    patience=5,            # Stop after 5 epochs with no improvement. With EPOCHS = 4, training ends before this can trigger.
    restore_best_weights=True
)

siamese_model, base_cnn = create_siamese_model(input_shape=IMG_WITH_CHANNELS_SIZE)
siamese_model.compile(
    loss=contrastive_loss,
    optimizer=Adam(learning_rate=1e-4),
    metrics=[siamese_accuracy]
)
siamese_model.summary()

siamese_model.fit(
    train_ds,
    steps_per_epoch=STEPS_PER_EPOCH,
    epochs=EPOCHS,
    callbacks=[early_stop]
)

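# Copy the trained weights into the standalone base CNN.
# In the Siamese model, the base CNN is the 3rd layer (index 2).
for base_layer, siam_layer in zip(base_cnn.layers, siamese_model.layers[2].layers):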
    base_layer.set_weights(siam_layer.get_weights())
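
`contrastive_loss` is imported from `lib` and its code is not shown here. For orientation, the sketch below implements the commonly used contrastive loss formulation in NumPy. It assumes that a label of 1 marks a same-person pair, that 0 marks a different-person pair, and that the margin is 1.0; the project's function may use the opposite label convention or a different margin.

```python
import numpy as np


def contrastive_loss_sketch(y_true, distances, margin=1.0):
    """Mean contrastive loss over a batch of pair distances.

    Assumption: y_true is 1 for same-person pairs and 0 for different-person pairs.
    """
    y_true = np.asarray(y_true, dtype=float)
    distances = np.asarray(distances, dtype=float)
    # Matching pairs are penalized for being far apart.
    same_term = y_true * np.square(distances)
    # Non-matching pairs are penalized only when they are closer than the margin.
    diff_term = (1.0 - y_true) * np.square(np.maximum(margin - distances, 0.0))
    return float(np.mean(same_term + diff_term))


print(contrastive_loss_sketch([1, 0, 0], [0.2, 0.3, 1.5]))
```

In this formulation, a different-person pair that is already farther apart than the margin contributes nothing to the loss, so training effort goes to the pairs that are still confusable.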

In [None]:
# Call this after training:
save_siamese_model(siamese_model, str(env.MODEL_SAVE_PATH))

## Process Gallery Images

**Only run code in this section if the gallery has been changed.**

The gallery contains a subfolder for each person containing images of their eyes.

Each person's identifier will be the same as the name of their subfolder.
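
A folder layout like this can be walked with a few lines of `pathlib`. The sketch below is only an illustration of the convention, not the project's `load_gallery_images()`; the function name `list_gallery_sketch` is hypothetical, and the accepted file extensions are borrowed from the test-image loop later in this notebook.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_gallery_sketch(gallery_root):
    """Map each subfolder name (the person's identifier) to its sorted image paths."""
    gallery = {}
    for person_dir in sorted(Path(gallery_root).iterdir()):
        # Files sitting directly in the gallery root belong to nobody; skip them.
        if not person_dir.is_dir():
            continue
        gallery[person_dir.name] = sorted(
            p for p in person_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
    return gallery
```

For a gallery holding `alice/left.jpg` and `bob/eye.png`, the result would have the keys `"alice"` and `"bob"`, each mapped to a one-element list of paths.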

In [None]:
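crop_gallery_images(env.GALLERY_IMAGE_PATH, IMG_SIZE)

# Reload the gallery in case it has changed since it was first loaded.
gallery_dict = load_gallery_images(env.GALLERY_IMAGE_PATH)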

gallery_embeddings = compute_gallery_embeddings(
    base_cnn,
    gallery_dict,
    IMG_WITH_CHANNELS_SIZE,
    env.GALLERY_EMBEDDING_PATH
)

## Compare Eyes

**Run this code to perform the identification algorithm.**

Uses `identify_eye()` to compare the query image (`QUERY_IMAGE_PATH` in `paths.env`) with the images in the gallery and outputs the match with the highest similarity.

In [None]:
identity, confidence = identify_eye(
    env.QUERY_IMAGE_PATH,
    "Query Image",
    base_cnn,
    gallery_embeddings,
    img_size=IMG_SIZE,
    margin=1.0,
    threshold=70.0
)
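
`identify_eye()` lives in `lib` and is not shown in this notebook. One plausible reading of its `margin` and `threshold` arguments is a nearest-neighbor search over the gallery embeddings followed by a confidence cutoff. The sketch below follows that reading; everything in it is an assumption, including the function name `identify_sketch`, the use of one embedding per identity, and the formula that turns a distance into a percentage.

```python
import numpy as np


def identify_sketch(query_embedding, gallery_embeddings, margin=1.0, threshold=70.0):
    """Return (identity, confidence); identity is None below the threshold.

    Assumptions: gallery_embeddings maps identity -> 1-D embedding, and
    confidence = 100 * (1 - distance / margin), clipped to [0, 100].
    """
    query = np.asarray(query_embedding, dtype=float)
    best_identity, best_distance = None, float("inf")
    for identity, embedding in gallery_embeddings.items():
        # Euclidean distance between the query and this gallery embedding.
        distance = float(np.linalg.norm(query - np.asarray(embedding, dtype=float)))
        if distance < best_distance:
            best_identity, best_distance = identity, distance

    confidence = float(np.clip(100.0 * (1.0 - best_distance / margin), 0.0, 100.0))
    if confidence < threshold:
        return None, confidence
    return best_identity, confidence
```

Under these assumptions, `threshold=70.0` with `margin=1.0` accepts a match only when the nearest gallery embedding lies within a distance of 0.3 of the query.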

Uses `identify_eye()` to identify all images in the TestImages folder.

In [None]:
for img_name in os.listdir(env.TEST_IMAGE_PATH):
    if not img_name.lower().endswith((".jpg", ".jpeg", ".png")):
        continue
    
    query_image = os.path.join(env.TEST_IMAGE_PATH, img_name)

    identity, confidence = identify_eye(
        query_image,
        img_name,
        base_cnn,
        gallery_embeddings,
        img_size=IMG_SIZE,
        margin=1.0,
        threshold=70.0
    )
