### GoogleNet model transfer learning:

The GoogleNet (Inception) family, and its InceptionV3 variant in particular, marked a significant advance in deep learning for computer vision. This section gives an overview of the architecture before it is used for transfer learning below.

GoogleNet Overview
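Background: GoogleNet (styled "GoogLeNet" by its authors, in homage to LeNet) is a deep convolutional neural network introduced by researchers at Google in 2014. Strictly speaking, "Inception" is the name of the architecture family and GoogLeNet is its first, 22-layer incarnation, although the two names are often used interchangeably.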

Purpose: It was designed for computer vision tasks and was entered in the 2014 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), where it won the classification task and also performed strongly in detection.

Architecture Highlights: GoogleNet is deep but parameter-efficient. Its design keeps both the computational cost and the number of parameters low, which also reduces the risk of overfitting.

Inception Modules
Modular Design: The core building block of GoogleNet is the Inception module. Each module applies convolutions of several sizes (1x1, 3x3, 5x5) and a pooling operation to the same input in parallel, then concatenates the results along the channel axis into a single output tensor for the next stage.

Dimensionality Reduction: To keep the computational cost manageable, 1x1 convolutions reduce the number of channels before the more expensive 3x3 and 5x5 convolutions.

Stacking Modules: Multiple Inception modules are stacked together, allowing the network to learn complex features at various scales.
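
The effect of the 1x1 reduction layers can be checked with simple arithmetic. The sketch below counts convolution weights (biases ignored) for a single Inception module. The channel sizes follow the `inception (3a)` row of the GoogLeNet paper (192 input channels) and should be read as illustrative; `conv_weights` is a helper introduced here, not part of any library.

```python
def conv_weights(kernel, in_ch, out_ch):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_ch * out_ch


IN_CH = 192  # channels entering the module (illustrative, see lead-in)

# Naive module: the 3x3 and 5x5 convolutions see all 192 input channels.
naive = {
    "3x3": conv_weights(3, IN_CH, 128),
    "5x5": conv_weights(5, IN_CH, 32),
}

# Inception module: a 1x1 convolution first shrinks the channel count
# (to 96 and 16 respectively) before the larger convolution is applied.
reduced = {
    "3x3": conv_weights(1, IN_CH, 96) + conv_weights(3, 96, 128),
    "5x5": conv_weights(1, IN_CH, 16) + conv_weights(5, 16, 32),
}

# The four branch outputs are concatenated along the channel axis:
# 1x1 branch, 3x3 branch, 5x5 branch, pooling projection.
out_channels = 64 + 128 + 32 + 32

for name in naive:
    print(f"{name}: {naive[name]:,} -> {reduced[name]:,} weights")
print("output channels:", out_channels)
```

Under these assumptions the 5x5 branch shrinks by roughly a factor of ten, which is why the reduction layers matter most for the largest kernels.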

InceptionV3 Specifics
Evolution: InceptionV3 is a later iteration of the original GoogleNet that improves both the architecture and its efficiency.

Enhancements in V3:

- Factorized Convolutions: Larger convolutions are factorized into smaller ones (for example, a 5x5 convolution into two stacked 3x3 convolutions) to reduce computation.
- Expanded Filter Bank: Wider Inception modules (using more filters of different sizes) allow the model to represent a broader range of features.
- Label Smoothing: A regularization technique that prevents the model from becoming too confident about its predictions, improving generalization.

Applications: InceptionV3 continues to be widely used in image recognition and other computer vision tasks because of its strong feature extraction and its efficiency.

Performance: At the time of its release, it offered a strong trade-off between accuracy and computational cost for image recognition.
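
Two of the V3 enhancements described above, factorized convolutions and label smoothing, reduce to short calculations. The following is a minimal NumPy sketch, not the InceptionV3 implementation: `smooth_labels` is a helper name chosen here, and the 4-class target is a toy example.

```python
import numpy as np


def smooth_labels(one_hot, epsilon=0.1):
    """Mix one-hot targets with a uniform distribution over the classes."""
    num_classes = one_hot.shape[-1]
    return one_hot * (1.0 - epsilon) + epsilon / num_classes


# Factorization: two stacked 3x3 convolutions cover the same 5x5 receptive
# field as one 5x5 convolution, with fewer weights per input/output channel pair.
weights_5x5 = 5 * 5          # 25
weights_two_3x3 = 2 * 3 * 3  # 18
saving = 1 - weights_two_3x3 / weights_5x5

# Label smoothing on a toy one-hot target (class 2 of 4).
y = np.eye(4)[[2]]
y_smooth = smooth_labels(y, epsilon=0.1)

print(f"weight saving from factorization: {saving:.0%}")
print(y_smooth)
```

In Keras, the same smoothing is available through the `label_smoothing` argument of `tf.keras.losses.CategoricalCrossentropy`, so it does not need to be applied to the labels by hand.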

Impact
State-of-the-Art Results: GoogleNet and its successors such as InceptionV3 reported state-of-the-art ImageNet classification results when they were introduced.

Influence on Future Designs: The Inception module concept influenced many subsequent deep learning architectures, highlighting the importance of balancing depth, width, and computational efficiency.

In [7]:
import os
import cv2
import csv
import numpy as np
import pandas as pd
import random
import gc
import sys
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.layers import Dense, Conv2D, Flatten, MaxPooling2D, Input, concatenate, GlobalAveragePooling2D, Dropout, BatchNormalization
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.applications.inception_v3 import InceptionV3
import keras_tuner as kt
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.mixed_precision import set_global_policy
from tensorflow.keras.callbacks import LearningRateScheduler
from sklearn.model_selection import train_test_split
from tensorflow.keras.regularizers import l2

In [9]:
# Set the global policy to mixed_float16
set_global_policy('mixed_float16')
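
The reason mixed precision needs some care can be seen without a GPU. The sketch below uses NumPy only and a hypothetical gradient value to show float16 underflow and how scaling avoids it. The fixed `loss_scale` is an illustrative assumption; in `tf.keras`, `Model.compile` wraps the optimizer with dynamic loss scaling when the `mixed_float16` policy is active.

```python
import numpy as np

tiny_gradient = 1e-8  # hypothetical small gradient value

# float16 cannot represent a value this small, so it rounds to zero.
as_half = np.float16(tiny_gradient)

# Scaling before the cast keeps the value representable; dividing after
# casting back to float32 recovers it approximately.
loss_scale = 1024.0  # illustrative fixed scale
scaled_half = np.float16(tiny_gradient * loss_scale)
recovered = np.float32(scaled_half) / loss_scale

print("cast directly:", as_half)
print("scaled, cast, unscaled:", recovered)
```

For the same reason, the final softmax layer of a model is usually kept in float32 when this policy is enabled.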

In [10]:
# Ensure the script uses the GPU if available and set memory growth
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        # Memory growth must be set at program startup
        print(e)

### Load data 

In [11]:
# Load your preprocessed data
X_train = np.load('X_train-299.npy')
print('X_train loaded')
X_val = np.load('X_val-299.npy')
print('X_val loaded')
y_train = np.load('y_train-299.npy')
print('y_train loaded')
y_val = np.load('y_val-299.npy')
print('y_val loaded')

X_train loaded
X_val loaded
y_train loaded
y_val loaded
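
Before training, it can help to confirm that the loaded arrays have the layout the model expects. The check below is a sketch under assumptions inferred from the rest of the notebook: 299x299 RGB images (from the file names) and one-hot labels over 26 classes (from the output layer and the `categorical_crossentropy` loss). `check_dataset` is a hypothetical helper, and the arrays are small stand-ins for the real data.

```python
import numpy as np


def check_dataset(X, y, image_size=299, num_classes=26):
    """Raise ValueError if the arrays do not match the expected layout."""
    if X.ndim != 4 or X.shape[1:] != (image_size, image_size, 3):
        raise ValueError(f"unexpected image shape {X.shape}")
    if y.shape != (X.shape[0], num_classes):
        raise ValueError(f"unexpected label shape {y.shape}")
    if not np.allclose(y.sum(axis=1), 1.0):
        raise ValueError("label rows do not sum to 1 (not one-hot)")
    return True


# Toy stand-ins for X_train / y_train, used only to exercise the check.
X_demo = np.zeros((4, 299, 299, 3), dtype=np.float32)
y_demo = np.eye(26, dtype=np.float32)[[0, 5, 25, 3]]

print(check_dataset(X_demo, y_demo))
```

Calling `check_dataset(X_train, y_train)` and `check_dataset(X_val, y_val)` after loading would surface a shape or encoding mismatch before the tuner starts.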


### Model training

In [8]:
def build_googlenet_model(hp):
    # Load InceptionV3 (GoogleNet family) pretrained on ImageNet as the base model
    base_model = InceptionV3(include_top=False, input_tensor=Input(shape=(299, 299, 3)), weights='imagenet')

    # The tuner decides how many of the last base layers to unfreeze.
    # The cut index is computed explicitly because layers[:-0] is an empty
    # slice, which would freeze nothing when 0 layers are unfrozen.
    n_unfreeze = hp.Int('unfreeze_layers', min_value=0, max_value=len(base_model.layers), step=5)
    cut = len(base_model.layers) - n_unfreeze
    for i, layer in enumerate(base_model.layers):
        # BatchNormalization layers stay frozen even inside the unfrozen part
        layer.trainable = i >= cut and not isinstance(layer, BatchNormalization)

    # Add custom layers on top of the base model's 4D feature map
    x = base_model.output

    # Additional convolutional blocks; they need the spatial feature map,
    # so they are applied before global pooling
    for i in range(hp.Int('num_additional_conv_blocks', 1, 3)):
        x = Conv2D(filters=hp.Int(f'conv_filters_{i}', min_value=32, max_value=128, step=32),
                   kernel_size=hp.Choice(f'conv_kernel_size_{i}', values=[3, 5]),
                   activation='relu', padding='same',
                   kernel_regularizer=l2(hp.Float(f'conv_l2_reg_{i}', min_value=1e-5, max_value=1e-2, step=1e-5)))(x)
        x = BatchNormalization()(x)
        x = MaxPooling2D(pool_size=(2, 2))(x)
        x = Dropout(hp.Float(f'conv_dropout_rate_{i}', min_value=0.1, max_value=0.5, step=0.1))(x)

    # Collapse the spatial dimensions before the dense head
    x = GlobalAveragePooling2D()(x)
    x = Dense(hp.Int('dense_units', min_value=32, max_value=256, step=32), activation='relu',
              kernel_regularizer=l2(hp.Float('dense_l2_reg', min_value=1e-5, max_value=1e-2, step=1e-5)))(x)
    x = Dropout(hp.Float('dense_dropout_rate', min_value=0.1, max_value=0.5, step=0.1))(x)

    # Output layer (26 classes), kept in float32 for numerical stability under mixed_float16
    predictions = Dense(26, activation='softmax', dtype='float32')(x)

    # Compile the model
    model = Model(inputs=base_model.input, outputs=predictions)
    model.compile(optimizer=Adam(hp.Choice('learning_rate', values=[1e-3, 1e-4])),
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    return model


In [9]:
# Early stopping callback
early_stopping = EarlyStopping(monitor='val_loss', patience=3, verbose=1, mode='min')

# Set up the tuner for hyperparameter tuning using Hyperband
tuner = kt.Hyperband(
    build_googlenet_model,
    objective='val_accuracy',
    max_epochs=10,
    factor=3,
    hyperband_iterations=2,  # Number of times to iterate over the full Hyperband algorithm
    directory='googlenet-model-tuning',
    project_name='googlenet-tuning'
)
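
When the number of trailing layers to unfreeze is a tunable integer that may be 0, slicing with a negative index needs care, because `-0` is just `0`. The sketch below shows the pitfall on a plain list standing in for `base_model.layers`; `split_frozen` is a hypothetical helper, not a Keras function.

```python
layers = [f"layer_{i}" for i in range(10)]  # stand-in for base_model.layers


def split_frozen(layers, n_unfreeze):
    """Return (frozen, trainable) with the last n_unfreeze layers trainable."""
    cut = max(len(layers) - n_unfreeze, 0)
    return layers[:cut], layers[cut:]


# Pitfall: layers[:-0] is layers[:0], an empty list, so a loop over it
# would freeze nothing instead of freezing everything.
naive_frozen = layers[:-0]

frozen_none, trainable_none = split_frozen(layers, 0)
frozen_some, trainable_some = split_frozen(layers, 3)

print(len(naive_frozen))                      # 0
print(len(frozen_none), len(trainable_none))  # 10 0
print(len(frozen_some), len(trainable_some))  # 7 3
```

Computing the cut index from the front of the list keeps the 0 case and the "unfreeze everything" case consistent with the values in between.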

In [None]:
# Search for the best hyperparameters
tuner.search(X_train, y_train, epochs=10, validation_data=(X_val, y_val), callbacks=[early_stopping])

In [10]:
# Get the best hyperparameters
best_hp = tuner.get_best_hyperparameters()[0]

# Print each hyperparameter and its corresponding best value
for hp in best_hp.space:
    print(f"{hp.name}: {best_hp.get(hp.name)}")

num_conv_blocks: 1
conv_filters_0: 192
conv_kernel_size_0: 3
conv_l2_reg: 0.00204
conv_dropout_rate_0: 0.1
dense_units: 192
dense_l2_reg: 0.00443
dense_dropout_rate: 0.2
learning_rate: 0.0001
conv_filters_1: 224
conv_kernel_size_1: 3
conv_dropout_rate_1: 0.0
conv_filters_2: 160
conv_kernel_size_2: 5
conv_dropout_rate_2: 0.4


In [11]:
# Retrieve all completed trials
trials = [t for t in tuner.oracle.trials.values() if t.status == 'COMPLETED']

# Prepare data for CSV
data_to_save = [["Trial Number", "Hyperparameters", "Validation Accuracy"]]

# Add data from each trial
for i, trial in enumerate(trials):
    trial_hyperparams = trial.hyperparameters.values
    val_accuracy = trial.score  
    data_to_save.append([f"Trial {i+1}", trial_hyperparams, val_accuracy])

# Write to CSV
with open('googlenet_hyperparameter_trials.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data_to_save)
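
Writing the whole hyperparameter dictionary into one CSV cell makes the file hard to filter or sort. One alternative, sketched below with the standard library only, is to give each hyperparameter its own column. The trial records are hypothetical placeholders (in the notebook they would come from `trial.hyperparameters.values` and `trial.score`), and an in-memory buffer stands in for the output file.

```python
import csv
import io

# Hypothetical trial records: (hyperparameter dict, validation accuracy).
records = [
    ({"dense_units": 192, "learning_rate": 1e-4}, 0.91),
    ({"dense_units": 64, "learning_rate": 1e-3, "conv_filters_1": 96}, 0.87),
]

# Conditional hyperparameters mean trials can have different keys,
# so the header is the union of all names.
names = sorted({name for hps, _ in records for name in hps})
fieldnames = ["trial"] + names + ["val_accuracy"]

buffer = io.StringIO()  # swap for open(path, "w", newline="") to write a file
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
for i, (hps, score) in enumerate(records, start=1):
    # Missing hyperparameters are left blank via restval
    writer.writerow({"trial": i, **hps, "val_accuracy": score})

print(buffer.getvalue())
```

A file in this shape loads directly into `pd.read_csv` with one column per hyperparameter.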

In [11]:
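# Learning rate scheduler: hold the rate for the first 10 epochs (indices 0-9),
# then decay it by a factor of exp(-0.1) per epoch. With epochs=10 in the fit
# call below, the decay branch is never reached.
def scheduler(epoch, lr):
    if epoch < 10:
        return lr
    else:
        # Return a plain Python float rather than a tensor
        return float(lr * np.exp(-0.1))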

# Early stopping
early_stopping = EarlyStopping(monitor='val_loss', patience=5)

# Model checkpoint
model_checkpoint = ModelCheckpoint(
    'googlenet-model.h5',  # Path where to save the model
    save_best_only=True, 
    monitor='val_loss', 
    mode='min'
)

# Combine all callbacks
callbacks_list = [
    LearningRateScheduler(scheduler),
    early_stopping,
    model_checkpoint
]

# Build the model with the best hyperparameters found by the tuner
model = build_googlenet_model(best_hp)

# Fit the model
history = model.fit(
    X_train, y_train,
    epochs=10,
    validation_data=(X_val, y_val),
    callbacks=callbacks_list, 
    verbose=1
)

2023-11-15 00:36:25.152629: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:894] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355

Initial number of layers in the base model: 311
Total number of layers in the model: 319
Epoch 1/10


2023-11-15 00:37:01.278710: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8900
2023-11-15 00:37:10.551020: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f48d8004a70 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-11-15 00:37:10.551077: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Tesla T4, Compute Capability 7.5
2023-11-15 00:37:10.551085: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (1): Tesla T4, Compute Capability 7.5
2023-11-15 00:37:10.551091: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (2): Tesla T4, Compute Capability 7.5
2023-11-15 00:37:10.551097: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (3): Tesla T4, Compute Capability 7.5
2023-11-15 00:37:10.577893: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, s



Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
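
The learning rate schedule used above can be traced in plain Python to see what it does over a run. This sketch re-implements the same rule with `math.exp` and assumes an initial rate of 1e-4, the value shown in the tuner output; `trace` is a helper introduced here for illustration.

```python
import math


def scheduler(epoch, lr):
    """Hold the rate for epochs 0-9, then decay by exp(-0.1) per epoch."""
    return lr if epoch < 10 else lr * math.exp(-0.1)


def trace(initial_lr, epochs):
    """Apply the scheduler epoch by epoch, as the Keras callback would."""
    lr, rates = initial_lr, []
    for epoch in range(epochs):  # Keras passes 0-based epoch indices
        lr = scheduler(epoch, lr)
        rates.append(lr)
    return rates


rates_10 = trace(1e-4, 10)
rates_15 = trace(1e-4, 15)

print("10 epochs:", rates_10[-1])
print("15 epochs:", rates_15[-1])
```

With 10 epochs the rate never changes, so the schedule only has an effect if training is extended beyond epoch index 9.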


### Model Metrics save output

In [12]:
metrics_df = pd.DataFrame({
    'Epoch': range(1, len(history.history['loss']) + 1),
    'Loss': history.history['loss'],
    'Accuracy': history.history['accuracy'],
    'Val_Loss': history.history['val_loss'],
    'Val_Accuracy': history.history['val_accuracy']
})

# Save the metrics to a CSV file
metrics_df.to_csv('googlenet-metrics.csv', index=False)

# Save full model 
model.save('googlenet-fullmodel-full.h5')
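
Note that `ModelCheckpoint` with `save_best_only=True` keeps the epoch with the lowest `val_loss`, while the final `model.save` call stores the weights from the last epoch, and the two can differ. The sketch below locates the best epoch in a history dictionary. The numbers are made up for illustration and are not results from this notebook; `best_epoch` is a hypothetical helper.

```python
def best_epoch(history, metric="val_loss", mode="min"):
    """Return (1-based epoch, value) of the best entry for a metric."""
    values = history[metric]
    pick = min if mode == "min" else max
    index = pick(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


# Made-up numbers for illustration only.
demo_history = {
    "val_loss": [1.20, 0.85, 0.70, 0.74, 0.78],
    "val_accuracy": [0.61, 0.72, 0.78, 0.79, 0.77],
}

print(best_epoch(demo_history))
print(best_epoch(demo_history, "val_accuracy", "max"))
```

Applied to `history.history`, this shows which epoch the checkpoint file corresponds to and whether it matches the final saved model.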
