## Prepare the Base Model

> This notebook prepares and configures the base model for image classification using TensorFlow and Keras.

## Notebook Summary

> The notebook covers four steps:
>
> 1. **Configuration Management**: Reading configuration and parameter files (`config.yaml` and `params.yaml`) to set up paths, model parameters, and other settings.
> 2. **Base Model Preparation**: Downloading and saving a pre-trained VGG16 model as the base model.
> 3. **Model Customization**: Updating the base model by adding a `Flatten` layer and a softmax `Dense` classification head, freezing selected layers, and compiling it with an SGD optimizer and categorical cross-entropy loss.
> 4. **Utilities**: Using helper functions to manage directories and read YAML configuration files.
>
> The notebook is structured to ensure modularity and reusability, making it easy to adapt for different image classification projects.

In [1]:
# Import the os module for directory and file operations
import os

In [2]:
# Print the current working directory (useful for debugging relative paths)
%pwd

'/home/zkhechadoorian/CNNs_Cats_and_Dogs/research'

In [3]:
# Change working directory up one level (if needed for imports or file access)
os.chdir("../")

In [4]:
# Print the new working directory after changing it
%pwd

'/home/zkhechadoorian/CNNs_Cats_and_Dogs'

In [5]:
# List files in the current directory to verify structure
%ls

10493.jpg              home.py                scores.json
Dockerfile             logs/                  server-instructions.md
README.md              main.py                setup.py
app.py                 output/                src/
artifacts/             params.yaml            start.sh
config/                params_augmented.yaml  start_uvicorn_streamlit.sh
deployment_code.ipynb  requirements.txt       template.py
fastapi-req.py         research/              templates/


In [6]:
# Define a dataclass for base model configuration
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class PrepareBaseModelConfig:
    root_dir: Path  # Directory for artifacts
    base_model_path: Path  # Path to save the base model
    updated_base_model_path: Path  # Path to save the updated model
    params_image_size: list  # Input shape for the model (height, width, channels)
    params_learning_rate: float  # Learning rate for training
    params_include_top: bool  # Whether to include the top layer in the base model
    params_weights: str  # Pre-trained weights to use
    params_classes: int  # Number of output classes
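
Because the dataclass is declared with `frozen=True`, its fields cannot be reassigned once an instance exists. The sketch below illustrates that behavior with a hypothetical two-field `DemoConfig` (a stand-in, not the project's class); a modified copy is derived with `dataclasses.replace` instead.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from pathlib import Path


@dataclass(frozen=True)
class DemoConfig:  # hypothetical stand-in, not the project's config class
    root_dir: Path
    learning_rate: float


cfg = DemoConfig(root_dir=Path("artifacts"), learning_rate=0.01)

try:
    cfg.learning_rate = 0.1  # assignment on a frozen dataclass is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False

# replace() builds a new instance and leaves the original untouched
faster = replace(cfg, learning_rate=0.1)
print(mutated, cfg.learning_rate, faster.learning_rate)
```

Keeping the configuration immutable means a value read from the YAML files cannot be changed by accident later in the pipeline.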

In [7]:
# Import project constants and utility functions for YAML reading and directory creation
from ImageClassification.constants import CONFIG_FILE_PATH, PARAMS_FILE_PATH
from ImageClassification.utils import read_yaml, create_directories

In [8]:
# ConfigurationManager reads config and params YAML files and prepares config objects
class ConfigurationManager:
    def __init__(
        self,
        config_filepath=CONFIG_FILE_PATH,
        params_filepath=PARAMS_FILE_PATH,
    ):

        # Read configuration and parameter YAML files
        self.config = read_yaml(config_filepath)
        self.params = read_yaml(params_filepath)

        # Create the root artifact directory
        create_directories([self.config.artifacts_root])

    def get_prepare_base_model_config(self) -> PrepareBaseModelConfig:
        # Extract the prepare_base_model section from config
        config = self.config.prepare_base_model

        # Create the directory for base model artifacts
        create_directories([config.root_dir])

        # Build and return the config dataclass
        prepare_base_model_config = PrepareBaseModelConfig(
            root_dir=Path(config.root_dir),
            base_model_path=Path(config.base_model_path),
            updated_base_model_path=Path(config.updated_base_model_path),
            params_image_size=self.params.IMAGE_SIZE,
            params_learning_rate=self.params.LEARNING_RATE,
            params_include_top=self.params.INCLUDE_TOP,
            params_weights=self.params.WEIGHTS,
            params_classes=self.params.CLASSES,
        )

        return prepare_base_model_config
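
The manager reads nested keys with attribute access (`self.config.prepare_base_model`, `self.params.IMAGE_SIZE`), so `read_yaml` presumably returns something richer than a plain `dict`. The sketch below imitates that with `types.SimpleNamespace`. The key names follow the ones read in the cell above, but the helper `to_namespace` and every value (file names, image size, learning rate, weights, class count) are placeholders for illustration; the real `config.yaml` and `params.yaml` may differ.

```python
from pathlib import Path
from types import SimpleNamespace


def to_namespace(value):
    """Recursively convert nested dicts so keys become attributes."""
    if isinstance(value, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in value.items()})
    return value


# Placeholder values: the real config.yaml and params.yaml may differ.
raw_config = {
    "artifacts_root": "artifacts",
    "prepare_base_model": {
        "root_dir": "artifacts/prepare_base_model",
        "base_model_path": "artifacts/prepare_base_model/base_model.h5",
        "updated_base_model_path": "artifacts/prepare_base_model/base_model_updated.h5",
    },
}
raw_params = {
    "IMAGE_SIZE": [224, 224, 3],
    "LEARNING_RATE": 0.01,
    "INCLUDE_TOP": False,
    "WEIGHTS": "imagenet",
    "CLASSES": 2,
}

config = to_namespace(raw_config)
params = to_namespace(raw_params)

# Same access pattern as get_prepare_base_model_config
section = config.prepare_base_model
print(Path(section.base_model_path).name)
print(params.IMAGE_SIZE)
```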

In [9]:
# Import TensorFlow for model preparation (os is already imported above)
import tensorflow as tf

2025-09-19 23:05:25.171973: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2025-09-19 23:05:25.172461: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-09-19 23:05:25.240067: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.

In [10]:
class PrepareBaseModel:
    def __init__(self, config: PrepareBaseModelConfig):
        self.config = config

    
    def get_base_model(self):
        self.model = tf.keras.applications.vgg16.VGG16(
            input_shape=self.config.params_image_size,
            weights=self.config.params_weights,
            include_top=self.config.params_include_top
        )

        self.save_model(path=self.config.base_model_path, model=self.model)

    
    
    @staticmethod
    def _prepare_full_model(model, classes, freeze_all, freeze_till, learning_rate):
        if freeze_all:
            # Freeze every layer of the base model
            for layer in model.layers:
                layer.trainable = False
        elif (freeze_till is not None) and (freeze_till > 0):
            # Freeze everything except the last `freeze_till` layers
            for layer in model.layers[:-freeze_till]:
                layer.trainable = False


        flatten_in = tf.keras.layers.Flatten()(model.output)
        prediction = tf.keras.layers.Dense(
            units=classes,
            activation="softmax"
        )(flatten_in)

        full_model = tf.keras.models.Model(
            inputs=model.input,
            outputs=prediction
        )

        full_model.compile(
            optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate),
            loss=tf.keras.losses.CategoricalCrossentropy(),
            metrics=["accuracy"]
        )

        full_model.summary()
        return full_model

    # freeze_till_value: number of trailing base-model layers left trainable
    def update_base_model(self, freeze_till_value):
        base_model = tf.keras.applications.vgg16.VGG16(
            input_shape=self.config.params_image_size,
            weights=self.config.params_weights,
            include_top=self.config.params_include_top
        )

        self.full_model = self._prepare_full_model(
            model=base_model,
            classes=self.config.params_classes,
            freeze_all=False,
            freeze_till=freeze_till_value,
            learning_rate=self.config.params_learning_rate
        )

        self.save_model(path=self.config.updated_base_model_path, model=self.full_model)

    
    @staticmethod
    def save_model(path: Path, model: tf.keras.Model):
        model.save(path)
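
The slice `model.layers[:-freeze_till]` is easy to misread, so the sketch below replays the same branch logic on plain layer names rather than on a real model. It assumes the base is Keras VGG16 built with `include_top=False` (the notebook takes that flag from `params.yaml`); the helper `trainable_layers` is hypothetical, and the input layer's exact name varies by Keras version.

```python
# Layer names of Keras VGG16 with include_top=False (an assumption here).
VGG16_LAYERS = [
    "input_layer",  # exact name varies by Keras version
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
]


def trainable_layers(names, freeze_all, freeze_till):
    """Mirror the branch logic of _prepare_full_model on plain layer names."""
    if freeze_all:
        return []
    if freeze_till is not None and freeze_till > 0:
        return names[-freeze_till:]  # only the last freeze_till layers stay trainable
    return list(names)  # freeze_till of 0 or None freezes nothing


print(trainable_layers(VGG16_LAYERS, freeze_all=False, freeze_till=3))
print(len(trainable_layers(VGG16_LAYERS, freeze_all=False, freeze_till=0)))
```

Two consequences follow from this logic. With `freeze_till=3` the trainable base layers are `block5_conv2`, `block5_conv3`, and `block5_pool`, and since pooling layers have no weights only the two convolutions are actually updated. With `freeze_till=0` nothing is frozen, so the whole base is trainable. The `Dense` head is added after the freezing step and is trainable in every case.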

In [11]:
# Example usage: Prepare and update the base model
try:
    # Create configuration manager and get model config
    config = ConfigurationManager()
    prepare_base_model_config = config.get_prepare_base_model_config()
    prepare_base_model = PrepareBaseModel(config=prepare_base_model_config)
    prepare_base_model.get_base_model()  # Download and save base VGG16
    prepare_base_model.update_base_model(3)  # Customize and save updated model (freeze_till=3)
except Exception:
    raise  # Re-raise so failures surface in the notebook


[2025-09-19 23:05:28,449: INFO: common: yaml file: config/config.yaml loaded successfully]
[2025-09-19 23:05:28,451: INFO: common: yaml file: params.yaml loaded successfully]
[2025-09-19 23:05:28,452: INFO: common: created directory at: artifacts]
[2025-09-19 23:05:28,453: INFO: common: created directory at: artifacts/prepare_base_model]


2025-09-19 23:05:28.458350: E external/local_xla/xla/stream_executor/cuda/cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)
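
The output of `full_model.summary()` is not reproduced here, but the size of the added head can be estimated by hand. The sketch below assumes a 224x224 input, `include_top=False`, and 2 output classes; all three are guesses, since `params.yaml` is not shown, and `head_parameter_count` is a hypothetical helper.

```python
def head_parameter_count(height, width, classes, pools=5, channels=512):
    """Parameters of Flatten + Dense(classes) on top of VGG16 without its top.

    Each of VGG16's five max-pooling layers halves height and width
    (floor division); the last convolutional block outputs 512 channels.
    """
    for _ in range(pools):
        height //= 2
        width //= 2
    flat = height * width * channels
    return flat, flat * classes + classes  # weights plus one bias per class


flat, params = head_parameter_count(224, 224, classes=2)
print(flat, params)  # 25088 50178
```

Under these assumptions the head is small compared with the convolutional base, which is why the number of unfrozen base layers dominates the trainable parameter count.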






In [12]:
def train_models_with_freeze_till_range(prepare_base_model, freeze_till_range=range(0, 11)):
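    """
    Build one model per freeze_till value in the given range and return a list of
    (freeze_till_value, model) tuples. Despite the function name, no model is fitted here.

    Note: tf.keras.models.clone_model returns an uncompiled model, so each returned
    copy must be compiled before training. Every call to update_base_model also
    overwrites the file saved at updated_base_model_path.
    """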
    models = []
    for n in freeze_till_range:
        print(f"Building model with freeze_till={n}")
        prepare_base_model.update_base_model(n)
        # Clone the model to avoid overwriting in the next iteration
        model_copy = tf.keras.models.clone_model(prepare_base_model.full_model)
        model_copy.set_weights(prepare_base_model.full_model.get_weights())
        models.append((n, model_copy))
    return models

# Example usage:
# models_list = train_models_with_freeze_till_range(prepare_base_model)
# Now models_list is [(0, model_0), (1, model_1), ..., (10, model_10)]