<a href="https://colab.research.google.com/github/Sam-uel-Codes/AI/blob/main/Transfer_learning_without_finetuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Transfer Learning Without Fine-Tuning

(The notes are written to aid understanding of the code.)

**Importing**

In [2]:
# Imports
import wandb
from wandb.integration.keras import WandbMetricsLogger

import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow import keras

## **Adding Sweep Configurations**

A sweep configuration is a dictionary of key-value pairs, nested as necessary. Its `method` key selects one of three search strategies:

- **Grid Search**: Iterates through every possible combination of the specified parameter values.
- **Random Search**: Randomly selects combinations of parameters, as defined by their value distributions or sets.
- **Bayesian Search**: Uses a probabilistic model to select new parameter combinations, aiming to find good values in fewer runs.
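
To make the difference between the first two strategies concrete, here is a minimal pure-Python sketch. It is not how W&B implements sweeps; the search space `space`, its values, and the helper names `grid_search` and `random_search` are hypothetical and exist only for illustration. A Bayesian search would additionally use the results of earlier runs to choose the next combination, which is omitted here.

```python
import itertools
import random

# Hypothetical search space, for illustration only.
space = {
    "batch_size": [8, 16],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}

def grid_search(space):
    """Yield every combination of the listed values."""
    keys = list(space)
    for combo in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, combo))

def random_search(space, n_trials, seed=0):
    """Yield n_trials combinations, drawing each value independently at random."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        yield {k: rng.choice(v) for k, v in space.items()}

grid_configs = list(grid_search(space))
random_configs = list(random_search(space, n_trials=4))

print(len(grid_configs))    # 2 * 3 = 6 combinations
print(grid_configs[0])      # {'batch_size': 8, 'learning_rate': 0.0001}
print(len(random_configs))  # 4
```

Grid search grows multiplicatively with every added value, while random search runs exactly as many trials as requested, which is why it is often preferred for larger spaces.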

Top-Level Required Keys
- method (required): Defines the sweep method (grid, random, or bayes).
- parameters (required): Specifies the parameters to be swept and their possible values or distributions.

In [3]:
# Sweep configuration
sweep_config = {
    'method': 'grid',
    'metric': {'name': 'epoch/val_accuracy', 'goal': 'maximize'},  # WandbMetricsLogger logs epoch metrics under the "epoch/" prefix
    'parameters': {
        'batch_size': {'values': [8]},
        'learning_rate': {'values': [0.0001]},
        'img_size': {'values': [128]},  # One of the input sizes with pretrained MobileNetV2 weights (96, 128, 160, 192, 224)
        'epochs': {'values': [10]},
        'experiment': {'values': ['transfer_learning']}
    }
}
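
The configuration above pins every parameter to a single value, so the grid contains exactly one run. As a sketch of what a wider search might look like, the dictionary below switches to random search and opens up two parameters. The name `random_sweep_config` and all the ranges are illustrative assumptions, not settings used in this notebook. It also assumes the `log_uniform_values` distribution from the W&B sweep configuration format, and names the metric `epoch/val_accuracy` because that is the key that appears in the run summary at the end of this notebook.

```python
# Hypothetical variant of the sweep configuration: random search over a
# few batch sizes and a log-uniform learning-rate range.
random_sweep_config = {
    "method": "random",
    "metric": {"name": "epoch/val_accuracy", "goal": "maximize"},
    "parameters": {
        "batch_size": {"values": [8, 16, 32]},
        "learning_rate": {
            "distribution": "log_uniform_values",  # sample on a log scale
            "min": 1e-5,
            "max": 1e-3,
        },
        "img_size": {"values": [128]},
        "epochs": {"values": [10]},
    },
}
```

Because random search has no natural end, a run limit would normally be set when launching the agent.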

## Brief on How to Work with Weights & Biases

1 - Import and Login
--> Import wandb and log in (if not using an environment variable/token).

2 - Sweep Configuration
--> Define a configuration dictionary specifying the sweep method (grid, random, or bayes), the parameters, and the target metric.

3 - Initialize the Sweep
--> Call wandb.sweep() with the configuration and a project name; it returns the sweep ID.

4 - Define the Training Function
--> This function should encapsulate all experiment logic.
<br> Best practice: open the run with `with wandb.init() as run:` at the very beginning of this function.

5 - Data Preparation and Model Building
--> Process the dataset and build the model using settings from config.

6 - Compile and Train the Model
--> Compile the model, then train with callbacks such as WandbMetricsLogger.

7 - Launch the Sweep Agent
--> Call wandb.agent() with the sweep ID and your training function to execute the sweep.

### Questions on Why

**1 - When and Why to Initialize W&B** <br>
--> Calling wandb.init() starts a new W&B run. Call it at the start of the training function, before reading the configuration or logging anything. The call: <br>
- Creates a new run session in W&B, generating a unique run ID.
- Initializes experiment logging, so anything you log (metrics, parameters, files) is associated with that specific run.
- Pulls sweep or configuration values if you're running in a sweep, making them available via wandb.config.

**2 - What exactly does sweep do?** <br>
--> Initializing a sweep in Weights & Biases (W&B) sets up an automated, organized infrastructure for exploring combinations of hyperparameters in your machine learning workflow.

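- Automates hyperparameter search: W&B chooses the parameter values for each run according to the search method, so runs do not have to be configured by hand.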

- Defines an Experiment Plan:
  - The model training script or program to run
  - Which hyperparameters to test, with their possible values or ranges
  - The chosen search method
  - The metric to optimize and its direction (minimize or maximize)

- Returns a Sweep ID: This unique identifier is how W&B tracks and coordinates all experiments (runs) associated with this sweep.

- Orchestrates multiple runs: agents request configurations from the sweep and execute one run per configuration.



## How It All Comes Together

1. Agent starts --> Receives the sweep ID and asks the sweep controller for a configuration
2. wandb.init() --> Detects the sweep session and registers the run as part of the sweep
3. Config assignment --> Injects the hyperparameters for this run via wandb.config
4. Run tracking --> Logs results as a new run linked to the sweep (using the sweep ID)
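
The hand-off between controller, agent, and training function can be sketched as a toy simulation. Nothing here is a real wandb API: `ToyController`, `toy_agent`, and `toy_train` are hypothetical stand-ins that only mimic the order of events for a grid sweep.

```python
from itertools import product

class ToyController:
    """Holds the sweep's parameter grid and hands out one config at a time."""

    def __init__(self, parameters):
        keys = list(parameters)
        combos = product(*(parameters[k]["values"] for k in keys))
        self._pending = [dict(zip(keys, combo)) for combo in combos]

    def next_config(self):
        # Return the next untried config, or None when the grid is exhausted.
        return self._pending.pop(0) if self._pending else None

def toy_agent(controller, train_fn):
    """Ask the controller for configs until none remain; run train_fn on each."""
    results = []
    while (config := controller.next_config()) is not None:
        results.append(train_fn(config))
    return results

def toy_train(config):
    # A real training function would build and fit a model here.
    return {"config": config, "status": "finished"}

# Same single-valued parameters as the sweep configuration in this notebook.
parameters = {
    "batch_size": {"values": [8]},
    "learning_rate": {"values": [0.0001]},
    "img_size": {"values": [128]},
    "epochs": {"values": [10]},
}

runs = toy_agent(ToyController(parameters), toy_train)
print(len(runs))  # 1: every parameter has a single value, so the grid has one run
```

This mirrors what the agent output later in the notebook shows: one run is started, and the agent exits once no further configurations are available.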

In [4]:
# Initialize W&B sweep
sweep_id = wandb.sweep(sweep_config, project="5-flowers-transfer-learning")

Create sweep with ID: ikcoglxi
Sweep URL: https://wandb.ai/samtrieswnb-vellore-institute-of-technology/5-flowers-transfer-learning/sweeps/ikcoglxi


In [5]:
# Train function
def train():
    with wandb.init() as run:
        config = wandb.config

        IMG_HEIGHT = config.img_size
        IMG_WIDTH = config.img_size
        IMG_CHANNELS = 3
        CLASS_NAMES = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]

        # Helper functions
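        def read_and_decode(filename, resize_dims):
            # Read the JPEG file and decode it to an RGB tensor
            img_bytes = tf.io.read_file(filename)
            img = tf.image.decode_jpeg(img_bytes, channels=IMG_CHANNELS)
            # Convert to float32, which also scales pixels to [0, 1].
            # Note: Keras's mobilenet_v2.preprocess_input scales to [-1, 1];
            # this notebook feeds [0, 1], and the logged results reflect that.
            img = tf.image.convert_image_dtype(img, tf.float32)
            img = tf.image.resize(img, resize_dims)
            return img

        def parse_csvline(csv_line):
            # Each CSV row holds two string fields: image path and class name
            record_default = ["", ""]
            filename, label_string = tf.io.decode_csv(csv_line, record_default)
            img = read_and_decode(filename, [IMG_HEIGHT, IMG_WIDTH])
            # Integer label = position of the class name in CLASS_NAMES
            label = tf.where(tf.equal(CLASS_NAMES, label_string))[0, 0]
            return img, label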

        # Prepare datasets
        train_dataset = (
            tf.data.TextLineDataset("gs://cloud-ml-data/img/flower_photos/train_set.csv")
            .map(parse_csvline, num_parallel_calls=tf.data.AUTOTUNE)
            .batch(config.batch_size)
            .prefetch(tf.data.AUTOTUNE)
        )

        eval_dataset = (
            tf.data.TextLineDataset("gs://cloud-ml-data/img/flower_photos/eval_set.csv")
            .map(parse_csvline, num_parallel_calls=tf.data.AUTOTUNE)
            .batch(config.batch_size)
            .prefetch(tf.data.AUTOTUNE)
        )

        # Build model
        base_model = tf.keras.applications.MobileNetV2(
            input_shape=(IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS),
            include_top=False,    # Remove ImageNet head
            weights="imagenet"    # Use pretrained weights
        )
        base_model.trainable = False  # Freeze base

        model = keras.Sequential([
            base_model,
            keras.layers.GlobalAveragePooling2D(),
            keras.layers.BatchNormalization(),       # Add BatchNorm
            keras.layers.Dense(len(CLASS_NAMES), activation="softmax")
        ])

        # Compile model
        model.compile(
            optimizer=keras.optimizers.Adam(learning_rate=config.learning_rate),
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
            metrics=["accuracy"]
        )

        # Visualize model
        model.summary()

        keras.utils.plot_model(model, show_shapes=True, show_layer_names=True, to_file="model_visualization.png")

        # Train
        callbacks = [WandbMetricsLogger(log_freq=5)]

        model.fit(
            train_dataset,
            validation_data=eval_dataset,
            epochs=config.epochs,
            callbacks=callbacks
        )
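
The training function turns each class name into an integer label and trains with `SparseCategoricalCrossentropy(from_logits=False)`, meaning the loss receives softmax probabilities and integer (not one-hot) labels. The following pure-Python sketch shows that computation for a single example. The helper names and the scores passed to `softmax` are hypothetical, and this is not Keras's implementation.

```python
import math

CLASS_NAMES = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]

def label_index(label_string):
    """Map a class name to its integer label (its position in CLASS_NAMES)."""
    return CLASS_NAMES.index(label_string)

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(probs, label):
    """Loss for one example: negative log of the probability of the true class."""
    return -math.log(probs[label])

label = label_index("roses")
probs = softmax([0.0, 0.0, 2.0, 0.0, 0.0])  # hypothetical scores favoring "roses"
loss = sparse_categorical_crossentropy(probs, label)

# A model that guesses uniformly over 5 classes has loss ln(5).
uniform_loss = sparse_categorical_crossentropy([0.2] * 5, label)

print(label)                   # 2
print(round(uniform_loss, 4))  # 1.6094
print(loss < uniform_loss)     # True
```

The uniform-guess value is a useful reference point when reading the training log: a loss near ln(5) means the classifier has not yet learned to separate the five classes.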

## Base Model

**Pre-trained Model**

 - A pretrained model (like MobileNetV2) normally comes with a default "top": the last few layers, which output a prediction over the 1,000 ImageNet classes.
 - Setting include_top=True means the model ends with this default classifier for the 1,000 ImageNet outputs.
 - Setting include_top=False means these final layers are not loaded. You get the core of the network (the feature extractor) without the part that makes ImageNet predictions, so you can add your own output layers (e.g., for the number of classes in your problem).
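
As a worked example, the arithmetic below traces what the head added in this notebook receives and how many parameters it trains. It assumes MobileNetV2 at its default width (alpha=1.0), where the feature extractor downsamples by a factor of 32 and ends with 1280 channels. The numbers are derived from those assumptions rather than read from `model.summary()`.

```python
# Assumed MobileNetV2 properties (default width, alpha=1.0)
DOWNSAMPLE = 32          # total spatial reduction of the feature extractor
FEATURE_CHANNELS = 1280  # channels in the final feature map

IMG_SIZE = 128           # from the sweep configuration
NUM_CLASSES = 5          # daisy, dandelion, roses, sunflowers, tulips

# include_top=False output: a small grid of 1280-dimensional feature vectors
feature_map_shape = (IMG_SIZE // DOWNSAMPLE, IMG_SIZE // DOWNSAMPLE, FEATURE_CHANNELS)

# GlobalAveragePooling2D averages over the grid: one vector per image
pooled_size = FEATURE_CHANNELS

# BatchNormalization: gamma and beta are trainable; the moving mean and
# variance are tracked during training but not updated by gradients
bn_trainable = 2 * pooled_size
bn_non_trainable = 2 * pooled_size

# Dense: one weight per (feature, class) pair plus one bias per class
dense_params = pooled_size * NUM_CLASSES + NUM_CLASSES

print(feature_map_shape)            # (4, 4, 1280)
print(dense_params)                 # 6405
print(bn_trainable + dense_params)  # 8965 trainable parameters in the head
```

With the base frozen, these few thousand head parameters are the only weights updated by gradient descent.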

**Channels**
- In the input layer, channels are the color components of each pixel (3 for an RGB image).
- In a CNN layer, channels are the feature maps that the layer produces.
- Batch size is the number of images processed in one forward (and backward) pass.
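
A short sketch ties these terms to the numbers in this notebook. The helper `steps_per_epoch` is a hypothetical name. The figure of 413 steps per epoch comes from the training log further down, and it bounds the training-set size without revealing it exactly.

```python
import math

BATCH_SIZE = 8
IMG_SIZE = 128
IMG_CHANNELS = 3

# Shape of one input batch: (batch, height, width, channels)
batch_shape = (BATCH_SIZE, IMG_SIZE, IMG_SIZE, IMG_CHANNELS)

def steps_per_epoch(num_examples, batch_size):
    """Batches needed to see every example once (the last batch may be smaller)."""
    return math.ceil(num_examples / batch_size)

# The log reports 413 steps per epoch at batch size 8
STEPS = 413
min_examples = (STEPS - 1) * BATCH_SIZE + 1  # last batch holds a single image
max_examples = STEPS * BATCH_SIZE            # last batch is full

print(batch_shape)                 # (8, 128, 128, 3)
print(min_examples, max_examples)  # 3297 3304
```

Doubling the batch size would roughly halve the number of steps per epoch, at the cost of more memory per step.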

In [6]:
# Launch W&B agent
wandb.agent(sweep_id, function=train)

wandb: Agent Starting Run: xo9k7xm9 with config:
wandb:     batch_size: 8
wandb:     epochs: 10
wandb:     experiment: transfer_learning
wandb:     img_size: 128
wandb:     learning_rate: 0.0001


Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_128_no_top.h5
9406464/9406464 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step

Epoch 1/10
413/413 ━━━━━━━━━━━━━━━━━━━━ 785s 2s/step - accuracy: 0.3473 - loss: 1.7290 - val_accuracy: 0.6946 - val_loss: 0.8633
Epoch 2/10
413/413 ━━━━━━━━━━━━━━━━━━━━ 731s 2s/step - accuracy: 0.6949 - loss: 0.8534 - val_accuracy: 0.7919 - val_loss: 0.6336
Epoch 3/10
413/413 ━━━━━━━━━━━━━━━━━━━━ 657s 2s/step - accuracy: 0.7633 - loss: 0.6702 - val_accuracy: 0.8243 - val_loss: 0.5424
Epoch 4/10
413/413 ━━━━━━━━━━━━━━━━━━━━ 682s 2s/step - accuracy: 0.8027 - loss: 0.5811 - val_accuracy: 0.8351 - val_loss: 0.4958
Epoch 5/10
413/413 ━━━━━━━━━━━━━━━━━━━━ 742s 2s/step - accuracy: 0.8222 - loss: 0.5232 - val_accuracy: 0.8486 - val_loss: 0.4683
Epoch 6/10
413/413 ━━━━━━━━━━━━━━━━━━━━ 743s 2s/step - accuracy: 0.8337 - loss: 0.4801 - val_accura

Run history (per-metric trend over the run):

| Metric | Trend |
|---|---|
| batch/accuracy | ▁▁▂▂▃▆▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇█▇████████████████ |
| batch/batch_step | ▁▁▁▁▁▂▂▂▂▂▂▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▅▆▆▆▆▆▆▆▇▇▇▇██ |
| batch/learning_rate | ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ |
| batch/loss | █▇▄▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁ |
| epoch/accuracy | ▁▅▆▇▇▇████ |
| epoch/epoch | ▁▂▃▃▄▅▆▆▇█ |
| epoch/learning_rate | ▁▁▁▁▁▁▁▁▁▁ |
| epoch/loss | █▄▃▂▂▂▂▁▁▁ |
| epoch/val_accuracy | ▁▅▆▇▇▇████ |
| epoch/val_loss | █▄▃▂▂▁▁▁▁▁ |

Run summary (final logged values):

| Metric | Value |
|---|---|
| batch/accuracy | 0.87378 |
| batch/batch_step | 4145.0 |
| batch/learning_rate | 0.0001 |
| batch/loss | 0.36021 |
| epoch/accuracy | 0.87424 |
| epoch/epoch | 9.0 |
| epoch/learning_rate | 0.0001 |
| epoch/loss | 0.35955 |
| epoch/val_accuracy | 0.87027 |
| epoch/val_loss | 0.41992 |

wandb: Sweep Agent: Waiting for job.
wandb: Sweep Agent: Exiting.
