# Neuro-Fuzzy Computing - Project - Fall 2025
## Galaxy Zoo — Training

In this notebook, we train and evaluate on the **training** portion of the Galaxy Zoo dataset.

Dataset location (Drive): `MyDrive/galaxy-zoo-the-galaxy-challenge/`, which contains the files `images_training_rev1.zip` and `training_solutions_rev1.zip`.

#### Inspecting the initial dataset location

In [None]:
from google.colab import drive
from pathlib import Path

# Mount Google Drive
drive.mount("/content/drive")

# Path to the directory that holds the original dataset archives
data_dir = Path("/content/drive/MyDrive/galaxy-zoo-the-galaxy-challenge")

# List files and folders inside the directory
for item in data_dir.iterdir():
    print(item)

Mounted at /content/drive
/content/drive/MyDrive/galaxy-zoo-the-galaxy-challenge/images_training_rev1.zip
/content/drive/MyDrive/galaxy-zoo-the-galaxy-challenge/training_solutions_rev1.zip
/content/drive/MyDrive/galaxy-zoo-the-galaxy-challenge/results.csv


#### Extracting the dataset in its original form
The dataset gets extracted to `/content`, where it remains as long as the session is connected/active.

In [None]:
import zipfile

# Dataset directory
data_dir = Path("/content/drive/MyDrive/galaxy-zoo-the-galaxy-challenge")

# Directory where contents should be extracted, create folder if it doesn't exist
extract_dir = Path("/content")
extract_dir.mkdir(parents=True, exist_ok=True)

# ZIP files
zip_files = [
    data_dir / "images_training_rev1.zip",
    data_dir / "training_solutions_rev1.zip"
]

# Function to safely unzip a file
def safe_unzip(zip_path: Path, extract_to: Path):
    print(f"Unzipping {zip_path.name}...")
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        for file_info in zip_ref.infolist():
            extracted_path = extract_to / file_info.filename
            if not extracted_path.exists():  # Skip if already extracted
                zip_ref.extract(file_info, extract_to)
    print(f"Finished unzipping {zip_path.name}")

# Unzip each file
for zip_path in zip_files:
    safe_unzip(zip_path, extract_dir)

print("All files unzipped successfully.")

Unzipping images_training_rev1.zip...
Finished unzipping images_training_rev1.zip
Unzipping training_solutions_rev1.zip...
Finished unzipping training_solutions_rev1.zip
All files unzipped successfully.
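
The skip-if-present extraction above trusts the member names stored in the archive. As a hedged sketch (not part of the original pipeline), the variant below adds a path check before writing anything. `safe_extract` is a hypothetical helper name, and the small demo archive only stands in for the real ZIP files.

```python
import tempfile
import zipfile
from pathlib import Path


def safe_extract(zip_path: Path, extract_to: Path) -> list[str]:
    """Extract members that are not on disk yet; return the names written.

    Hypothetical helper: refuses any member whose resolved path would
    land outside `extract_to`.
    """
    root = extract_to.resolve()
    written = []
    with zipfile.ZipFile(zip_path, "r") as zf:
        for info in zf.infolist():
            target = (root / info.filename).resolve()
            if not target.is_relative_to(root):
                raise ValueError(f"Unsafe member path: {info.filename}")
            if target.exists():
                continue  # already extracted in an earlier run
            zf.extract(info, root)
            written.append(info.filename)
    return written


# Demo on a throwaway archive (stand-in for images_training_rev1.zip)
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    demo_zip = tmp / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as zf:
        zf.writestr("images/100008.jpg", b"fake-bytes")
    out_dir = tmp / "out"
    out_dir.mkdir()
    first_run = safe_extract(demo_zip, out_dir)   # extracts the file
    second_run = safe_extract(demo_zip, out_dir)  # finds it on disk and skips it
```

Returning the list of written names makes it easy to see whether a re-run actually extracted anything.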


### Dataset inspection and preprocessing

In the following cells, we perform the essential preprocessing steps:

1. **Define data paths and parameters**
   - Point to the extracted training images folder: `images_training_rev1/`
   - Point to the label file: `training_solutions_rev1.csv`

In [None]:
import pandas as pd

drive.mount("/content/drive")

DATA_ROOT = Path("/content")

img_dir = DATA_ROOT / "images_training_rev1"
csv_path = DATA_ROOT / "training_solutions_rev1.csv"

print("img_dir exists:", img_dir.is_dir())
print("csv_path exists:", csv_path.is_file())

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
img_dir exists: True
csv_path exists: True


2. **Check label to image consistency**
   - Confirm the number of label rows matches the number of extracted images
   - If there is a mismatch, we report example GalaxyIDs whose image files are missing

In [None]:
# If this cell takes minutes to run, Colab is probably struggling to list the image folder (most likely because of its size); if that happens, restart the session
solutions_df = pd.read_csv(csv_path)

# IDs of Galaxies are the labels of the column "GalaxyID"
ids = solutions_df["GalaxyID"].astype(str).tolist()

# These labels must match the names of the files inside the folder "images_training_rev1"
train_image_names = sorted([p.name for p in img_dir.glob("*.jpg")])

name_set = set(train_image_names)
# Scan every label (not only the first 50) so the reported examples are reliable
missing = [gid for gid in ids if f"{gid}.jpg" not in name_set]
if len(ids) != len(train_image_names) or missing:
    raise ValueError(f"Label/image mismatch: labels={len(ids)} images={len(train_image_names)}. Example missing IDs: {missing[:10]}")
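
A count comparison alone cannot tell which side is incomplete. The sketch below, using toy IDs rather than the real CSV, shows one way to report both directions with set differences. `compare_ids` is a hypothetical helper, not a function from the notebook.

```python
def compare_ids(label_ids, image_names, suffix=".jpg"):
    """Return (IDs without an image, image stems without a label), both sorted."""
    labelled = {str(g) for g in label_ids}
    on_disk = {name[: -len(suffix)] for name in image_names if name.endswith(suffix)}
    return sorted(labelled - on_disk), sorted(on_disk - labelled)


# Toy data standing in for solutions_df["GalaxyID"] and the folder listing
missing_images, unlabelled_images = compare_ids(
    [100008, 100023, 100053],
    ["100008.jpg", "100053.jpg", "999999.jpg"],
)
print(missing_images, unlabelled_images)
```

In this toy case one label has no image and one image has no label, so the counts match even though the two sets do not.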

3. **Prepare inputs for a TensorFlow dataset**

In this cell, we:
   - Create `paths` (image filepaths) and `labels` (soft targets) from `training_solutions_rev1.csv`
   - Define `load_image(path, y)`, which will be used later with `tf.data.Dataset.map(...)` to load/parse images **on demand**

In [None]:
import tensorflow as tf

# assumes solutions_df and img_dir are defined in the cells above
target_cols = [c for c in solutions_df.columns if c != "GalaxyID"]

paths  = (solutions_df["GalaxyID"].astype(int).astype(str) + ".jpg").apply(lambda fn: str(img_dir / fn)).to_numpy()
labels = solutions_df[target_cols].to_numpy(dtype="float32")

size = (424, 424)

def load_image(path, y):
    img = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)  # [0,1]
    img = tf.image.resize(img, size)  # "Guard rail": ensuring that images are of size 424x424
    return img, y

### Train/Val/Test split (80/10/10)

We create our own 80/10/10 train/val/test split from the edited training set.

We start from a dataset of `(filepath, target_vector)` pairs and shuffle it **once** with a fixed seed (`seed=42`, `reshuffle_each_iteration=False`) to get a reproducible, **fixed** random ordering.

We then split by slicing the shuffled dataset:
  - **Train:** first 80% of samples
  - **Validation:** next 10%
  - **Test:** final 10%

Lastly, we apply `load_image` **after** the split, so each subset loads and decodes its images lazily and independently.

In [None]:
n_total = len(paths)
n_train = int(0.8 * n_total)
n_val   = int(0.1 * n_total)
n_test  = n_total - n_train - n_val

print("Dataset size:", n_total)
print("Train:", n_train)
print("Val:", n_val)
print("Test:", n_test)

# Shuffle ONCE (fixed order for reproducibility)
base_ds = tf.data.Dataset.from_tensor_slices((paths, labels)).shuffle(buffer_size=n_total, seed=42, reshuffle_each_iteration=False)

# Split (no images loaded yet)
train_ds = base_ds.take(n_train)
val_ds   = base_ds.skip(n_train).take(n_val)
test_ds  = base_ds.skip(n_train + n_val)

Dataset size: 61578
Train: 49262
Val: 6157
Test: 6159
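
The split arithmetic can be checked without TensorFlow. The following is a minimal sketch in plain Python: `split_sizes` and `split_indices` are illustrative names, and `random.Random(42)` produces a different permutation from `tf.data`'s shuffle, so only the sizes and the disjointness carry over.

```python
import random


def split_sizes(n_total, train_frac=0.8, val_frac=0.1):
    """Sizes of the train/val/test slices; the test slice takes the remainder."""
    n_train = int(train_frac * n_total)
    n_val = int(val_frac * n_total)
    return n_train, n_val, n_total - n_train - n_val


def split_indices(n_total, seed=42):
    """Shuffle indices once with a fixed seed, then slice (take / skip)."""
    order = list(range(n_total))
    random.Random(seed).shuffle(order)  # one fixed permutation
    n_train, n_val, _ = split_sizes(n_total)
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]


sizes = split_sizes(61578)
train_idx, val_idx, test_idx = split_indices(61578)
print(sizes, len(set(train_idx) & set(val_idx)), len(set(val_idx) & set(test_idx)))
```

Because the test slice absorbs the rounding remainder, it ends up two samples larger than the validation slice.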


### Image loading pipeline (lazy + batched)

To build the dataset, we:
   - Convert the split datasets from `(filepath, target_vector)` into `(image_tensor, target_tensor)` using `map(load_image)`
     - Images are read/decoded **on demand** with `tf.io.read_file` + `tf.io.decode_jpeg`
     - Converted to `float32` in **[0, 1]** (and resized to 424×424 as a safety step)
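   - Optimize input throughput:
     - **Train:** shuffle (per epoch) → map → batch → prefetch
     - **Val/Test:** map → batch → prefetch
     - All three pipelines use `drop_remainder=True`, so a final partial batch (fewer than 32 samples) is dropped from each split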

Each dataset element is a **batch** `(image_tensor, target_tensor)` where:
  - `image_tensor` has shape **(batch_size, 424, 424, 3)** (channels-last) in **[0, 1]**
  - `target_tensor` has shape **(batch_size, 37)**


In [None]:
AUTOTUNE = tf.data.AUTOTUNE
BATCH_SIZE = 32

# Allow non-deterministic ordering for speed (especially with parallel map)
options = tf.data.Options()
options.deterministic = False  # replaces the deprecated `experimental_deterministic`

train_img_ds = (train_ds.shuffle(10_000, reshuffle_each_iteration=True).map(load_image, num_parallel_calls=AUTOTUNE).batch(BATCH_SIZE, drop_remainder=True).prefetch(AUTOTUNE).with_options(options))
val_img_ds = (val_ds.map(load_image, num_parallel_calls=AUTOTUNE).batch(BATCH_SIZE, drop_remainder=True).prefetch(AUTOTUNE).with_options(options))
test_img_ds = (test_ds.map(load_image, num_parallel_calls=AUTOTUNE).batch(BATCH_SIZE, drop_remainder=True).prefetch(AUTOTUNE).with_options(options))

# Estimate total steps (batches) for full training run
steps_per_epoch = tf.data.experimental.cardinality(train_img_ds).numpy()
total_steps = int(steps_per_epoch * 30)  # 30 = max epochs

xb, yb = next(iter(train_img_ds))
print("train batch x:", xb.shape, xb.dtype, "y:", yb.shape, yb.dtype)

train batch x: (32, 424, 424, 3) <dtype: 'float32'> y: (32, 37) <dtype: 'float32'>
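
For reference, the batch counts implied by the split sizes and `BATCH_SIZE = 32` can be worked out by hand. This is a small sketch assuming `drop_remainder=True` as in the cell above; `n_batches` is a hypothetical helper name.

```python
def n_batches(n_samples, batch_size, drop_remainder=True):
    """Number of batches produced from n_samples."""
    full, rest = divmod(n_samples, batch_size)
    return full if drop_remainder or rest == 0 else full + 1


BATCH_SIZE = 32
MAX_EPOCHS = 30
SPLITS = {"train": 49262, "val": 6157, "test": 6159}

steps_per_epoch = n_batches(SPLITS["train"], BATCH_SIZE)
total_steps = steps_per_epoch * MAX_EPOCHS
# Samples left out of each split because the last partial batch is dropped
dropped = {name: n % BATCH_SIZE for name, n in SPLITS.items()}
print(steps_per_epoch, total_steps, dropped)
```

A few samples per split are never seen in a given pass. For the training set the dropped samples change every epoch because of reshuffling, while for validation and test they depend on the iteration order.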


### Building our CNN model

In [None]:
# Sequential CNN, assumes inputs are (424, 424, 3); outputs are 37 linear (unbounded) values regressed against the soft targets.
model = tf.keras.Sequential()

# Input
model.add(tf.keras.layers.Input(shape=(424, 424, 3)))

# Block 1 (3 -> 32)
model.add(tf.keras.layers.Conv2D(32, kernel_size=3, use_bias=True, padding="same"))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.ReLU())

model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
model.add(tf.keras.layers.SpatialDropout2D(0.05))

# Block 2 (32 -> 64)
model.add(tf.keras.layers.Conv2D(64, kernel_size=3, use_bias=True, padding="same"))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.ReLU())

model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
model.add(tf.keras.layers.SpatialDropout2D(0.10))

# Block 3 (64 -> 128)
model.add(tf.keras.layers.Conv2D(128, kernel_size=3, use_bias=True, padding="same"))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.ReLU())

model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
model.add(tf.keras.layers.SpatialDropout2D(0.15))

# Block 4 (128 -> 256)
model.add(tf.keras.layers.Conv2D(256, kernel_size=3, use_bias=True, padding="same"))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.ReLU())

model.add(tf.keras.layers.MaxPooling2D(pool_size=2))
model.add(tf.keras.layers.SpatialDropout2D(0.20))

# Pooling layer
model.add(tf.keras.layers.GlobalAveragePooling2D())

# 37-dim output right after pooling:
model.add(tf.keras.layers.Dense(37, use_bias=True))

model.summary()
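
As a sanity check on the architecture, the feature-map sizes and parameter count can be derived by hand. The sketch below assumes the usual Keras conventions (3×3 kernels with bias, four values per channel in `BatchNormalization`, `"valid"` padding in `MaxPooling2D`). It is a hand calculation, not captured output from `model.summary()`.

```python
def conv_params(c_in, c_out, k=3):
    return k * k * c_in * c_out + c_out  # kernel weights + biases


def bn_params(c):
    return 4 * c  # gamma, beta (trainable) + moving mean, moving variance


side, channels = 424, 3
total = 0
sides = [side]
for filters in (32, 64, 128, 256):
    total += conv_params(channels, filters) + bn_params(filters)
    channels = filters
    side //= 2  # MaxPooling2D(pool_size=2) halves the side, rounding down
    sides.append(side)

total += channels * 37 + 37  # Dense(37) head after global average pooling
non_trainable = 2 * (32 + 64 + 128 + 256)  # BatchNormalization moving statistics
print(sides, total, total - non_trainable)
```

The last pooling step maps 53 to 26, so the global average pool sees a 26×26×256 feature map.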

### Optimizer, loss function and model compilation

In [None]:
lr_schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,
    decay_steps=total_steps,
    alpha=1e-2,  # final LR = alpha * initial (here: 1e-5)
)

# AdamW optimizer (decoupled weight decay)
optimizer = tf.keras.optimizers.AdamW(learning_rate=lr_schedule, weight_decay=1e-4)

# Model compilation, MSE loss and RMSE metric for reporting
model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=[tf.keras.metrics.RootMeanSquaredError(name="rmse")],
    jit_compile=True,
)
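
To make the schedule concrete, here is a plain-Python sketch of the cosine decay formula without warmup. The default `decay_steps=46170` is an assumption derived from the split sizes above (1539 steps per epoch × 30 epochs); the notebook itself takes the value from `total_steps`.

```python
import math


def cosine_decay(step, initial_lr=1e-3, decay_steps=46170, alpha=1e-2):
    """Learning rate at `step`; stays at alpha * initial_lr once decay_steps is reached."""
    step = min(step, decay_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / decay_steps))
    return initial_lr * ((1.0 - alpha) * cosine + alpha)


for s in (0, 23085, 46170):
    print(s, cosine_decay(s))
```

If early stopping ends training before `decay_steps` is reached, the learning rate never gets down to its floor of `1e-5`.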

### Defining the training loop

In [None]:
import time

# helper to read current LR (works for constant LR or schedules)
def get_current_lr(optimizer: tf.keras.optimizers.Optimizer):
    lr = optimizer.learning_rate
    # If lr is a schedule, call it with optimizer.iterations
    if isinstance(lr, tf.keras.optimizers.schedules.LearningRateSchedule):
        return float(lr(optimizer.iterations).numpy())
    # Otherwise it's a scalar/tensor/variable
    return float(tf.convert_to_tensor(lr).numpy())

def train_loop(model, train_ds, val_ds, epochs=30, patience=3, min_delta=1e-3):
    best_val = float("inf")
    patience_ctr = 0
    best_weights = None

    # Helper: get metric value by name
    def metric_value(name: str):
        for m in model.metrics:
            r = m.result()
            if isinstance(r, dict):
                # compiled metrics here
                if name in r:
                    return float(r[name].numpy())
            else:
                if m.name == name:
                    return float(r.numpy())
        # If not found, give a helpful error listing available keys/names
        available = []
        for m in model.metrics:
            r = m.result()
            if isinstance(r, dict):
                available.extend(list(r.keys()))
            else:
                available.append(m.name)
        raise ValueError(f"Metric '{name}' not found. Available: {available}")

    @tf.function(jit_compile=True)
    def train_step(xb, yb):
        with tf.GradientTape() as tape:
            preds = model(xb, training=True)
            loss = model.compute_loss(x=xb, y=yb, y_pred=preds, sample_weight=None, training=True)

        grads = tape.gradient(loss, model.trainable_variables)
        model.optimizer.apply_gradients(zip(grads, model.trainable_variables))

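        # model.metrics also contains the "loss" tracker (a Mean metric), which takes a single value
        for m in model.metrics:
            if m.name == "loss":
                m.update_state(loss)
            else:
                m.update_state(yb, preds)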

        return loss

    @tf.function(jit_compile=True)
    def val_step(xb, yb):
        preds = model(xb, training=False)
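        # Skip the "loss" tracker: it is a Mean metric and does not take (y_true, y_pred)
        for m in model.metrics:
            if m.name != "loss":
                m.update_state(yb, preds)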

    for epoch in range(1, epochs + 1):
        t0 = time.time()

        # Train
        model.reset_metrics()
        for xb, yb in train_ds:
            train_step(xb, yb)
        train_rmse_val = metric_value("rmse")

        # Evaluation
        model.reset_metrics()
        for xb, yb in val_ds:
            val_step(xb, yb)
        val_rmse_val = metric_value("rmse")

        lr_val = get_current_lr(model.optimizer)
        dt = time.time() - t0

        # Early stopping mechanism
        improved = (best_val - val_rmse_val) > min_delta
        if improved:
            best_val = val_rmse_val
            patience_ctr = 0
            best_weights = model.get_weights()
        else:
            patience_ctr += 1

        print(
            f"Epoch {epoch:02d}/{epochs} | "
            f"lr={lr_val:.6g} | "
            f"train_RMSE={train_rmse_val:.6f} | "
            f"eval_RMSE={val_rmse_val:.6f} | "
            f"patience={patience_ctr}/{patience} | "
            f"time={dt:.2f}s"
        )

        if patience_ctr >= patience:
            break

    if best_weights is not None:
        model.set_weights(best_weights)

    return epoch
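
The early-stopping rule in `train_loop` can be illustrated in isolation. The sketch below replays the same patience logic on a made-up sequence of validation RMSE values; the numbers are purely illustrative, not results from this notebook, and `simulate_early_stopping` is a hypothetical helper.

```python
def simulate_early_stopping(val_scores, patience=3, min_delta=1e-3):
    """Return (epoch at which training stops, best epoch, best score)."""
    best, best_epoch, waited = float("inf"), None, 0
    for epoch, score in enumerate(val_scores, start=1):
        if best - score > min_delta:  # improvement must exceed min_delta
            best, best_epoch, waited = score, epoch, 0
        else:
            waited += 1
        if waited >= patience:
            break
    return epoch, best_epoch, best


# Made-up validation RMSE curve: large gains first, then gains below min_delta
stopped_at, best_epoch, best_rmse = simulate_early_stopping(
    [0.20, 0.15, 0.1495, 0.1493, 0.1492, 0.10]
)
print(stopped_at, best_epoch, best_rmse)
```

Improvements smaller than `min_delta` do not reset the counter, so the loop stops at epoch 5 and would restore the epoch-2 weights, never reaching the last value.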

### Training our model

In [None]:
epochs_ran = train_loop(model, train_img_ds, val_img_ds, epochs=30, patience=3, min_delta=1e-3)

### Evaluate on held-out test split

In [None]:
test_rmse_metric = tf.keras.metrics.RootMeanSquaredError()

for xb, yb in test_img_ds:
    preds = model(xb, training=False)
    test_rmse_metric.update_state(yb, preds)

test_rmse = float(test_rmse_metric.result().numpy())
print("Test RMSE:", test_rmse)
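
A streaming RMSE metric accumulates squared errors and element counts across batches and takes the square root only at the end, which is not the same as averaging per-batch RMSE values. The toy sketch below shows the difference in plain Python; it reflects my understanding of how `RootMeanSquaredError` aggregates, and the two helper names are hypothetical.

```python
import math


def streaming_rmse(batches):
    """sqrt(total squared error / total element count) over all batches."""
    sq_sum, count = 0.0, 0
    for y_true, y_pred in batches:
        sq_sum += sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
        count += len(y_true)
    return math.sqrt(sq_sum / count)


def mean_of_batch_rmse(batches):
    """Naive alternative: average the RMSE of each batch."""
    return sum(streaming_rmse([b]) for b in batches) / len(batches)


# Two toy batches of unequal size
toy = [([0.0, 0.0], [1.0, 1.0]), ([0.0], [3.0])]
print(streaming_rmse(toy), mean_of_batch_rmse(toy))
```

With equal-sized batches the two numbers are usually close but still differ, since the square root is not linear.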

In [None]:
# Cell for cleanup
import gc
tf.keras.backend.clear_session()
gc.collect()

## Comparison with other models

#### DenseNet

##### Model build

In [None]:
# Prebuilt tf.keras.applications.DenseNet121

inputs = tf.keras.layers.Input(shape=(424, 424, 3))

# The dataset yields [0,1]; DenseNet's preprocess_input expects [0,255], scales back to [0,1] and normalizes with the ImageNet mean/std.
x = tf.keras.layers.Rescaling(255.0)(inputs)
x = tf.keras.layers.Lambda(tf.keras.applications.densenet.preprocess_input)(x)

base_model2 = tf.keras.applications.DenseNet121(
    include_top=False,
    weights=None,
    input_shape=(424, 424, 3),
)

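# weights=None, so the backbone is trained end-to-end. We do not hard-code training=True here:
# that would keep BatchNormalization in training mode even when the model is called with training=False.
x = base_model2(x)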

x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(37, use_bias=True)(x)

model2 = tf.keras.Model(inputs, outputs, name="DenseNet121_custom_head")
model2.summary()
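
To show what the DenseNet preprocessing does numerically, here is a per-pixel sketch of the "torch"-style normalization as I understand it from the Keras documentation: scale to [0, 1], then standardize each channel with the ImageNet mean and standard deviation. `torch_mode` is a hypothetical helper, not a Keras function.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)  # RGB channel means
IMAGENET_STD = (0.229, 0.224, 0.225)   # RGB channel standard deviations


def torch_mode(pixel_rgb_255):
    """Normalize one RGB pixel given in the [0, 255] range."""
    return tuple(
        (v / 255.0 - m) / s
        for v, m, s in zip(pixel_rgb_255, IMAGENET_MEAN, IMAGENET_STD)
    )


black = torch_mode((0, 0, 0))
white = torch_mode((255, 255, 255))
print(black, white)
```

Feeding [0, 1] images without the `Rescaling(255.0)` layer would squeeze every pixel into a narrow band near the "black" end of this range.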

##### Compilation

In [None]:
lr_schedule2 = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,
    decay_steps=total_steps,
    alpha=1e-2,
)

optimizer2 = tf.keras.optimizers.AdamW(learning_rate=lr_schedule2, weight_decay=1e-4)

model2.compile(
    optimizer=optimizer2,
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=[tf.keras.metrics.RootMeanSquaredError(name="rmse")],
    jit_compile=True,
)

##### Training

In [None]:
epochs_ran2 = train_loop(model2, train_img_ds, val_img_ds, epochs=30, patience=3, min_delta=1e-3)

##### Testing

In [None]:
test_rmse_metric2 = tf.keras.metrics.RootMeanSquaredError()
for xb, yb in test_img_ds:
    preds = model2(xb, training=False)
    test_rmse_metric2.update_state(yb, preds)

test_rmse2 = float(test_rmse_metric2.result().numpy())
print("DenseNet epochs:", epochs_ran2, "| Test RMSE:", test_rmse2)

In [None]:
# Cell for cleanup
import gc
tf.keras.backend.clear_session()
gc.collect()

#### ResNet
##### Model build

In [None]:
# Prebuilt tf.keras.applications.ResNet50

inputs = tf.keras.layers.Input(shape=(424, 424, 3))

# Our dataset yields [0,1] floats; ResNet's preprocess_input expects [0,255], converts RGB to BGR and subtracts the ImageNet channel means
x = tf.keras.layers.Rescaling(255.0)(inputs)
x = tf.keras.layers.Lambda(tf.keras.applications.resnet.preprocess_input)(x)

base_model3 = tf.keras.applications.ResNet50(
    include_top=False,
    weights=None,
    input_shape=(424, 424, 3),
)

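# No hard-coded training=True: the flag is taken from the outer call, so BatchNormalization uses its moving statistics at evaluation time
x = base_model3(x)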

x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(37, use_bias=True)(x)  # linear 37-dim output

model3 = tf.keras.Model(inputs, outputs, name="ResNet50_custom_head")
model3.summary()
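
The ResNet preprocessing differs from the DenseNet one. As a hedged per-pixel sketch of the "caffe"-style transform described in the Keras documentation: the channels are reordered from RGB to BGR and the ImageNet channel means are subtracted, with no division by a standard deviation. `caffe_mode` is a hypothetical helper.

```python
BGR_MEANS = (103.939, 116.779, 123.68)  # ImageNet means in B, G, R order


def caffe_mode(pixel_rgb_255):
    """Reorder one [0, 255] RGB pixel to BGR and zero-center it."""
    r, g, b = pixel_rgb_255
    return tuple(v - m for v, m in zip((b, g, r), BGR_MEANS))


pure_red = caffe_mode((255, 0, 0))
print(pure_red)
```

The red value ends up in the last position after the reordering, and the outputs stay on a 0 to 255 scale rather than being squeezed to unit range.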

##### Compilation

In [None]:
lr_schedule3 = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,
    decay_steps=total_steps,
    alpha=1e-2,
)

optimizer3 = tf.keras.optimizers.AdamW(learning_rate=lr_schedule3, weight_decay=1e-4)

model3.compile(
    optimizer=optimizer3,
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=[tf.keras.metrics.RootMeanSquaredError(name="rmse")],
    jit_compile=True,
)

##### Training

In [None]:
epochs_ran3 = train_loop(model3, train_img_ds, val_img_ds, epochs=30, patience=3, min_delta=1e-3)

##### Testing

In [None]:
test_rmse_metric3 = tf.keras.metrics.RootMeanSquaredError()
for xb, yb in test_img_ds:
    preds = model3(xb, training=False)
    test_rmse_metric3.update_state(yb, preds)

test_rmse3 = float(test_rmse_metric3.result().numpy())
print("ResNet50 epochs:", epochs_ran3, "| Test RMSE:", test_rmse3)

In [None]:
# Cell for cleanup
import gc
tf.keras.backend.clear_session()
gc.collect()

#### MobileNetV2 (pretrained transfer learning)
##### Model build

In [None]:
inputs = tf.keras.layers.Input(shape=(424, 424, 3))

# dataset gives [0,1]: mobilenet_v2 preprocess expects [0,255] then scales to [-1, 1]
x = tf.keras.layers.Rescaling(255.0)(inputs)
x = tf.keras.layers.Lambda(tf.keras.applications.mobilenet_v2.preprocess_input)(x)

base_model4 = tf.keras.applications.MobileNetV2(
    include_top=False,
    weights="imagenet",
    input_shape=(424, 424, 3),
)

base_model4.trainable = False

x = base_model4(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(37, use_bias=True)(x)  # linear regression head

model4 = tf.keras.Model(inputs, outputs, name="MobileNetV2_transfer")
model4.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5
9406464/9406464 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step


##### Compilation

In [None]:
lr_schedule4 = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,
    decay_steps=total_steps,
    alpha=1e-2,
)

optimizer4 = tf.keras.optimizers.AdamW(learning_rate=lr_schedule4, weight_decay=1e-4)

model4.compile(
    optimizer=optimizer4,
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=[tf.keras.metrics.RootMeanSquaredError(name="rmse")],
    jit_compile=True,
)

##### Training

In [None]:
epochs_ran4 = train_loop(model4, train_img_ds, val_img_ds, epochs=30, patience=3, min_delta=1e-3)

##### Testing

In [None]:
test_rmse_metric4 = tf.keras.metrics.RootMeanSquaredError()
for xb, yb in test_img_ds:
    preds = model4(xb, training=False)
    test_rmse_metric4.update_state(yb, preds)

test_rmse4 = float(test_rmse_metric4.result().numpy())
print("MobileNetV2 TL epochs:", epochs_ran4, "| Test RMSE:", test_rmse4)

In [None]:
# Cell for cleanup
import gc
tf.keras.backend.clear_session()
gc.collect()

#### VGG16 (pretrained)
##### Model build

In [None]:
inputs = tf.keras.layers.Input(shape=(424, 424, 3))

# dataset gives [0,1], vgg16 preprocess expects [0,255] then mean subtraction (BGR convention internally)
x = tf.keras.layers.Rescaling(255.0)(inputs)
x = tf.keras.layers.Lambda(tf.keras.applications.vgg16.preprocess_input)(x)

base_model5 = tf.keras.applications.VGG16(
    include_top=False,
    weights="imagenet",
    input_shape=(424, 424, 3),
)

base_model5.trainable = False

x = base_model5(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.3)(x)
x = tf.keras.layers.Dense(256)(x)
x = tf.keras.layers.ReLU()(x)
x = tf.keras.layers.Dropout(0.3)(x)

outputs = tf.keras.layers.Dense(37, use_bias=True)(x)

model5 = tf.keras.Model(inputs, outputs, name="VGG16_transfer")
model5.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58889256/58889256 ━━━━━━━━━━━━━━━━━━━━ 4s 0us/step


##### Compilation

In [None]:
lr_schedule5 = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,
    decay_steps=total_steps,
    alpha=1e-2,
)

optimizer5 = tf.keras.optimizers.AdamW(learning_rate=lr_schedule5, weight_decay=1e-4)

model5.compile(
    optimizer=optimizer5,
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=[tf.keras.metrics.RootMeanSquaredError(name="rmse")],
    jit_compile=True,
)

##### Training

In [None]:
epochs_ran5 = train_loop(model5, train_img_ds, val_img_ds, epochs=30, patience=3, min_delta=1e-3)

##### Testing

In [None]:
test_rmse_metric5 = tf.keras.metrics.RootMeanSquaredError()
for xb, yb in test_img_ds:
    preds = model5(xb, training=False)
    test_rmse_metric5.update_state(yb, preds)

test_rmse5 = float(test_rmse_metric5.result().numpy())
print("VGG16 TL epochs:", epochs_ran5, "| Test RMSE:", test_rmse5)

In [None]:
# Cell for cleanup
import gc
tf.keras.backend.clear_session()
gc.collect()

#### EfficientNetB0 (pretrained)
##### Model build

In [None]:
inputs = tf.keras.layers.Input(shape=(424, 424, 3))

# dataset gives [0,1]; EfficientNet expects [0,255] and normalizes inside the model itself, so its preprocess_input is a pass-through
x = tf.keras.layers.Rescaling(255.0)(inputs)
x = tf.keras.layers.Lambda(tf.keras.applications.efficientnet.preprocess_input)(x)

base_model6 = tf.keras.applications.EfficientNetB0(
    include_top=False,
    weights="imagenet",
    input_shape=(424, 424, 3),
)

base_model6.trainable = False

x = base_model6(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(37, use_bias=True)(x)

model6 = tf.keras.Model(inputs, outputs, name="EfficientNetB0_transfer")
model6.summary()

Downloading data from https://storage.googleapis.com/keras-applications/efficientnetb0_notop.h5
16705208/16705208 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step


##### Compilation

In [None]:
lr_schedule6 = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-3,
    decay_steps=total_steps,
    alpha=1e-2,
)

optimizer6 = tf.keras.optimizers.AdamW(learning_rate=lr_schedule6, weight_decay=1e-4)

model6.compile(
    optimizer=optimizer6,
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=[tf.keras.metrics.RootMeanSquaredError(name="rmse")],
    jit_compile=True,
)

##### Training

In [None]:
epochs_ran6 = train_loop(model6, train_img_ds, val_img_ds, epochs=30, patience=3, min_delta=1e-3)

##### Testing

In [None]:
test_rmse_metric6 = tf.keras.metrics.RootMeanSquaredError()
for xb, yb in test_img_ds:
    preds = model6(xb, training=False)
    test_rmse_metric6.update_state(yb, preds)

test_rmse6 = float(test_rmse_metric6.result().numpy())
print("EfficientNetB0 TL epochs:", epochs_ran6, "| Test RMSE:", test_rmse6)