# Face Recognition Model Training

## Using UWA HSFD Hyperspectral Database

This notebook demonstrates building a face recognition model using the UWA HSFD hyperspectral database. We preprocess hyperspectral data to grayscale for simplicity and train a CNN to output face embeddings.

In [None]:
import os

import numpy as np
import matplotlib.pyplot as plt
import cv2
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization, GlobalAveragePooling2D
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.applications import MobileNetV2
from sklearn.model_selection import train_test_split

## 1. Data Preprocessing

The UWA HSFD database contains hyperspectral images. We'll convert them to grayscale for simplicity.

**Note**: The UWA HSFD database is located at the specified path and contains PNG files.

Dataset structure expected:

```
UWA_HSFD/
  ├── subject_01/
  │   ├── image_001.png
  │   └── ...
  ├── subject_02/
  └── ...
```
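The grayscale conversion mentioned above can be sketched as follows. This is only an illustration under stated assumptions: it assumes the spectral bands have already been stacked into an array of shape `(height, width, bands)`, and it uses a plain band average, which is one of several possible reductions. The helper name `cube_to_grayscale` and the band count in the demo are hypothetical and not part of the dataset tooling.

```python
import numpy as np


def cube_to_grayscale(cube):
    """Collapse a hyperspectral cube (H, W, bands) into an 8-bit grayscale image.

    Hypothetical helper: averages all bands with equal weight, then
    min-max scales the result to the 0-255 range.
    """
    cube = np.asarray(cube, dtype=np.float32)
    gray = cube.mean(axis=2)  # equal-weight average over the band axis

    lo, hi = gray.min(), gray.max()
    if hi > lo:
        gray = (gray - lo) / (hi - lo)  # min-max scale to [0, 1]
    else:
        gray = np.zeros_like(gray)      # constant image: avoid division by zero

    return (gray * 255.0).round().astype(np.uint8)


# Demo on random data; the 8 bands here are an arbitrary choice
demo_cube = np.random.default_rng(0).random((64, 64, 8))
demo_gray = cube_to_grayscale(demo_cube)
print(demo_gray.shape, demo_gray.dtype)
```

A weighted average that favours particular wavelengths may suit some sensors better; the equal-weight mean is simply the shortest option to show.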

In [None]:
# Set the data path
data_path = r"C:\Users\Anvitha\Face based Person Authentication\UWA HSFD V1.1 (1)\UWA HSFD V1.1\HyperSpec_Face_Session1"
print("Data path set to:", data_path)

In [None]:
def load_and_preprocess_data(data_dir, target_size=(224, 224)):
    """
    Load and preprocess UWA HSFD data from PNG files.

    Args:
        data_dir: Path to data directory (one sub-directory per subject)
        target_size: Target image size as (width, height)

    Returns:
        X: Image data array (RGB, float32 in [0, 1])
        y: Labels array
        labels: Label mapping (subject directory name -> integer label)
    """
    images = []
    labels = []
    label_map = {}
    current_label = 0

    if os.path.exists(data_dir):
        for subject_dir in sorted(os.listdir(data_dir)):
            subject_path = os.path.join(data_dir, subject_dir)
            if os.path.isdir(subject_path):
                label_map[subject_dir] = current_label

                # Sort file names so the loading order is reproducible
                for image_file in sorted(os.listdir(subject_path)):
                    if image_file.lower().endswith('.png'):
                        image_path = os.path.join(subject_path, image_file)

                        # Load PNG image. With its default flag, cv2.imread
                        # returns a 3-channel BGR array, even for grayscale files.
                        img = cv2.imread(image_path)
                        if img is not None:
                            # Convert OpenCV's BGR channel order to RGB
                            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

                            # Resize to target size
                            img = cv2.resize(img, target_size)

                            # Normalize to 0-1
                            img = img.astype(np.float32) / 255.0

                            images.append(img)
                            labels.append(current_label)

                current_label += 1

    return np.array(images), np.array(labels), label_map

## 2. Load Data

In [None]:
# Load the data
X, y, label_map = load_and_preprocess_data(data_path)

print(f"Loaded {len(X)} images from {len(label_map)} subjects")
print(f"Image shape: {X.shape[1:]}")
print(f"Label map: {label_map}")

## 3. Build CNN Model for Face Embeddings

We'll create a CNN model that outputs face embeddings. We'll use transfer learning with MobileNetV2 as the base.

In [None]:
def build_face_embedding_model(input_shape=(224, 224, 3), embedding_dim=128):
    """
    Build a CNN model for face embedding extraction.

    Args:
        input_shape: Input image shape
        embedding_dim: Dimension of output embedding

    Returns:
        Keras Model
    """
    import tensorflow as tf
    from tensorflow.keras.layers import Lambda, Rescaling

    # Use MobileNetV2 as base
    base_model = MobileNetV2(
        input_shape=input_shape,
        include_top=False,
        weights='imagenet'
    )

    # Freeze base model layers (for transfer learning)
    base_model.trainable = False

    # Build model
    inputs = Input(shape=input_shape)
    # The loader yields pixels in [0, 1], while the ImageNet weights of
    # MobileNetV2 expect inputs in [-1, 1], so rescale before the base model
    x = Rescaling(scale=2.0, offset=-1.0)(inputs)
    x = base_model(x, training=False)
    x = GlobalAveragePooling2D()(x)
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(embedding_dim, activation=None)(x)  # No activation for embeddings

    # L2 normalization of embeddings
    embeddings = Lambda(lambda t: tf.nn.l2_normalize(t, axis=1))(x)

    model = Model(inputs=inputs, outputs=embeddings)

    return model

In [None]:
# Create the model
model = build_face_embedding_model(input_shape=(224, 224, 3), embedding_dim=128)
model.summary()

## 4. Triplet Loss Function

For face recognition, we use triplet loss to train the network to produce similar embeddings for the same person and different embeddings for different people.

In [None]:
import tensorflow as tf


def triplet_loss(y_true, y_pred, alpha=0.2):
    """
    Triplet loss function.

    Args:
        y_true: Not used (required by Keras)
        y_pred: Predictions containing [anchor, positive, negative] embeddings,
            concatenated along axis 1 (128 values each, 384 in total)
        alpha: Margin parameter

    Returns:
        Loss value
    """
    anchor = y_pred[:, 0:128]
    positive = y_pred[:, 128:256]
    negative = y_pred[:, 256:384]

    # Calculate squared Euclidean distances
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=1)

    # Triplet loss: zero once the negative is at least alpha farther than the positive
    loss = tf.maximum(pos_dist - neg_dist + alpha, 0.0)

    return tf.reduce_mean(loss)
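To make the behaviour of the margin concrete, here is a small NumPy re-statement of the same formula applied to hand-picked 2-D vectors. It is a worked illustration rather than part of the training code; `triplet_loss_np` is a hypothetical name, and it takes the three embeddings as separate arrays instead of one concatenated tensor.

```python
import numpy as np


def triplet_loss_np(anchor, positive, negative, alpha=0.2):
    """NumPy version of the triplet loss, for checking values by hand."""
    pos_dist = np.sum((anchor - positive) ** 2, axis=1)  # squared distance to positive
    neg_dist = np.sum((anchor - negative) ** 2, axis=1)  # squared distance to negative
    return float(np.mean(np.maximum(pos_dist - neg_dist + alpha, 0.0)))


anchor = np.array([[1.0, 0.0]])
positive = np.array([[1.0, 0.0]])   # identical to the anchor: distance 0
negative = np.array([[0.0, 1.0]])   # orthogonal unit vector: squared distance 2

# Well-separated triplet: max(0 - 2 + 0.2, 0) = 0
easy = triplet_loss_np(anchor, positive, negative)

# Roles swapped, so the "positive" is far away: max(2 - 0 + 0.2, 0) = 2.2
hard = triplet_loss_np(anchor, negative, positive)

print(easy, hard)
```

Triplets like the first one contribute no gradient, which is why triplet selection matters during training.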

## 5. Training (Demonstration)

This section demonstrates the training process. In practice, you would:

1. Load the actual UWA HSFD dataset
2. Preprocess hyperspectral images to grayscale/RGB
3. Create triplet batches (anchor, positive, negative)
4. Train the model using triplet loss
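Step 3 above can be sketched with plain NumPy. The function below draws random triplets of sample indices from a label array; it is a minimal illustration that assumes uniform random sampling is acceptable (no hard-negative mining), and the name `sample_triplets` is hypothetical rather than something defined elsewhere in this project.

```python
import numpy as np


def sample_triplets(labels, n_triplets, seed=0):
    """Return an (n_triplets, 3) array of (anchor, positive, negative) indices."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)

    classes, counts = np.unique(labels, return_counts=True)
    # An anchor class needs at least two samples to supply a positive
    anchor_classes = classes[counts >= 2]
    if len(anchor_classes) == 0 or len(classes) < 2:
        raise ValueError("Need one class with >= 2 samples and at least 2 classes")

    triplets = np.empty((n_triplets, 3), dtype=np.int64)
    for i in range(n_triplets):
        c = rng.choice(anchor_classes)
        # Two distinct samples of the same subject
        a, p = rng.choice(np.flatnonzero(labels == c), size=2, replace=False)
        # One sample of any other subject
        n = rng.choice(np.flatnonzero(labels != c))
        triplets[i] = (a, p, n)
    return triplets


demo_labels = np.array([0, 0, 0, 1, 1, 2])
demo_triplets = sample_triplets(demo_labels, n_triplets=5)
print(demo_triplets)
```

The index triplets can then be used to gather images, for example `X_train[demo_triplets[:, 0]]` for the anchors, when building batches for the model.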

In [None]:
# Split data for training/validation
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# For demonstration, compile and train (would need triplet data generator)
# model.compile(optimizer=Adam(learning_rate=0.001), loss=triplet_loss)

# Training would look like:
# history = model.fit(
#     triplet_generator,  # Generator yielding (anchor, positive, negative) triplets
#     epochs=50,
#     validation_data=val_triplet_generator
# )

print("Data loaded and ready for training.")
print(f"Training samples: {len(X_train)}")
print(f"Validation samples: {len(X_val)}")
print("For actual training, implement triplet data generator.")

## 6. Using Pre-trained Model

For the face authentication system, we use a pre-trained MobileNetV2 model which is already integrated in the `face_utils.py` module. This provides a practical solution that works with RGB webcam inputs without requiring extensive training on hyperspectral data.

In [None]:
# Load the embedding extractor used in the system
import sys
sys.path.append('..')  # Make the project's src package importable

from src.face_utils import EmbeddingExtractor

extractor = EmbeddingExtractor()

print("Embedding extractor ready!")
print(f"Input shape: {extractor.input_shape}")
print("Model architecture:")
extractor.model.summary()

## 7. Testing Embedding Extraction

Let's test the embedding extraction with a sample image.

In [None]:
# Create a test image (random for demonstration)
# The upper bound of randint is exclusive, so 256 covers the full 0-255 range
test_image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)

# Extract embedding
embedding = extractor.extract_embedding(test_image)
normalized_embedding = extractor.normalize_embedding(embedding)

print(f"Embedding shape: {embedding.shape}")
print(f"Embedding dimension: {len(embedding)}")
print(f"Embedding L2 norm (before normalization): {np.linalg.norm(embedding):.4f}")
print(f"Embedding L2 norm (after normalization): {np.linalg.norm(normalized_embedding):.4f}")
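The source of `normalize_embedding` is not shown in this notebook. Assuming it performs standard L2 normalization, which the printed norms suggest, the operation amounts to the following sketch; `l2_normalize` is a hypothetical stand-in, not the method from `face_utils.py`.

```python
import numpy as np


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm; eps guards against division by zero."""
    vec = np.asarray(vec, dtype=np.float64)
    return vec / max(np.linalg.norm(vec), eps)


raw = np.array([3.0, 4.0])   # L2 norm is 5
unit = l2_normalize(raw)     # becomes [0.6, 0.8]
print(unit, np.linalg.norm(unit))
```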

## 8. Similarity Comparison

Demonstrate how embeddings are compared using cosine similarity.

In [None]:
from src.auth_utils import cosine_similarity

# Create two test embeddings
embedding1 = np.random.randn(1280)
embedding2 = np.random.randn(1280)
embedding3 = embedding1 + np.random.randn(1280) * 0.1  # Similar to embedding1

# Calculate similarities
sim_different = cosine_similarity(embedding1, embedding2)
sim_similar = cosine_similarity(embedding1, embedding3)
sim_same = cosine_similarity(embedding1, embedding1)

print(f"Similarity (different faces): {sim_different:.4f}")
print(f"Similarity (similar faces): {sim_similar:.4f}")
print(f"Similarity (same face): {sim_same:.4f}")
print("\nTypical threshold for authentication: 0.6")
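Since `src.auth_utils` is not reproduced here, the following self-contained sketch shows what a cosine-similarity check against the 0.6 threshold might look like. Both `cosine_similarity_sketch` and `is_match` are hypothetical names, and the actual implementation in `auth_utils.py` may differ.

```python
import numpy as np


def cosine_similarity_sketch(a, b, eps=1e-12):
    """Cosine of the angle between two vectors, in [-1, 1]."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / max(denom, eps))  # eps avoids division by zero


def is_match(a, b, threshold=0.6):
    """Accept the probe if its similarity to the enrolled embedding meets the threshold."""
    return cosine_similarity_sketch(a, b) >= threshold


rng = np.random.default_rng(0)
enrolled = rng.standard_normal(1280)
probe_same = enrolled + 0.1 * rng.standard_normal(1280)  # small perturbation
probe_other = rng.standard_normal(1280)                  # unrelated vector

print(is_match(enrolled, probe_same))
print(is_match(enrolled, probe_other))
```

Random high-dimensional vectors are nearly orthogonal, so their similarity sits close to zero; real face embeddings from different people are usually less cleanly separated, and the threshold should be tuned on validation data.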

## Summary

This notebook demonstrates:

1. How to load and preprocess UWA HSFD PNG images (handling grayscale)
2. Building a CNN for face embedding extraction
3. Triplet loss for training face recognition models
4. Using pre-trained models for practical applications
5. Embedding extraction and similarity comparison

The actual face authentication system uses MobileNetV2 pre-trained on ImageNet, which serves as a practical feature extractor for RGB webcam inputs without requiring extensive training on hyperspectral data.