<a href="https://colab.research.google.com/github/Kush-Singh-26/Image_Caption/blob/main/CaptionColab_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Image Caption Generation

## Importing Libraries

In [None]:
import torch
import torchvision.transforms as transforms
import torch.nn as nn
import torchvision.models as models
from torch.utils.data import Dataset, DataLoader
from PIL import Image
import os
import nltk
import json
import collections
from collections import Counter
import random
import time
from torch.nn.utils.rnn import pad_sequence
import torch.optim as optim
from torch.optim.lr_scheduler import ReduceLROnPlateau

## Importing the Data

In [None]:
!mkdir -p /content/coco/images/train2014
!mkdir -p /content/coco/images/val2014
!mkdir -p /content/coco/annotations


In [None]:
!wget http://images.cocodataset.org/zips/train2014.zip -P /content/coco/images/
!wget http://images.cocodataset.org/zips/val2014.zip -P /content/coco/images/


--2025-05-04 17:07:41--  http://images.cocodataset.org/zips/train2014.zip
Resolving images.cocodataset.org (images.cocodataset.org)... 54.231.232.209, 52.216.32.113, 3.5.2.125, ...
Connecting to images.cocodataset.org (images.cocodataset.org)|54.231.232.209|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13510573713 (13G) [application/zip]
Saving to: ‘/content/coco/images/train2014.zip’


2025-05-04 17:13:46 (35.3 MB/s) - ‘/content/coco/images/train2014.zip’ saved [13510573713/13510573713]

--2025-05-04 17:13:46--  http://images.cocodataset.org/zips/val2014.zip
Resolving images.cocodataset.org (images.cocodataset.org)... 3.5.28.192, 52.217.122.161, 16.15.185.235, ...
Connecting to images.cocodataset.org (images.cocodataset.org)|3.5.28.192|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6645013297 (6.2G) [application/zip]
Saving to: ‘/content/coco/images/val2014.zip’


2025-05-04 17:16:20 (41.2 MB/s) - ‘/content/coco/images/val2014.zip’ 

In [None]:
!wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip -P /content/coco/annotations/


--2025-05-04 17:16:20--  http://images.cocodataset.org/annotations/annotations_trainval2014.zip
Resolving images.cocodataset.org (images.cocodataset.org)... 3.5.13.41, 3.5.29.63, 52.216.178.59, ...
Connecting to images.cocodataset.org (images.cocodataset.org)|3.5.13.41|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 252872794 (241M) [application/zip]
Saving to: ‘/content/coco/annotations/annotations_trainval2014.zip’


2025-05-04 17:16:26 (39.4 MB/s) - ‘/content/coco/annotations/annotations_trainval2014.zip’ saved [252872794/252872794]



In [None]:
!unzip -q /content/coco/images/train2014.zip -d /content/coco/images/
!unzip -q /content/coco/images/val2014.zip -d /content/coco/images/
!unzip -q /content/coco/annotations/annotations_trainval2014.zip -d /content/coco/annotations/


In [None]:
!wget -c "https://github.com/Delphboy/karpathy-splits/raw/main/dataset_coco.json?download=" -O /content/dataset_coco.json


--2025-05-04 17:20:21--  https://github.com/Delphboy/karpathy-splits/raw/main/dataset_coco.json?download=
Resolving github.com (github.com)... 140.82.121.4
Connecting to github.com (github.com)|140.82.121.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://media.githubusercontent.com/media/Delphboy/karpathy-splits/main/dataset_coco.json?download=true [following]
--2025-05-04 17:20:21--  https://media.githubusercontent.com/media/Delphboy/karpathy-splits/main/dataset_coco.json?download=true
Resolving media.githubusercontent.com (media.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to media.githubusercontent.com (media.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 408860810 (390M) [application/octet-stream]
Saving to: ‘/content/dataset_coco.json’


2025-05-04 17:20:34 (141 MB/s) - ‘/content/dataset_coco.json’ saved [408860810/408860810]



## Splitting the data into train, validation, and test sets

In [None]:
import json

with open("/content/dataset_coco.json", "r") as f:
    karpathy_data = json.load(f)

karpathy_images = karpathy_data['images']

# Select the images belonging to each Karpathy split
train_data = [img for img in karpathy_images if img['split'] == 'train']
val_data = [img for img in karpathy_images if img['split'] == 'val']
test_data = [img for img in karpathy_images if img['split'] == 'test']

print(f"# Train: {len(train_data)} | Val: {len(val_data)} | Test: {len(test_data)}")


# Train: 82783 | Val: 5000 | Test: 5000
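
The three counts above do not cover every entry in the file: the Karpathy split file typically also labels a fourth group, `restval` (images from COCO val2014 that are often merged into the training set), which the filter above leaves unused. A quick way to see every split label and its size is to count the `split` field. The sketch below uses a tiny hypothetical `images` list in place of `karpathy_data['images']`.

```python
from collections import Counter

# Hypothetical stand-in for karpathy_data['images']; real entries carry more fields.
images = [
    {"filename": "a.jpg", "split": "train"},
    {"filename": "b.jpg", "split": "train"},
    {"filename": "c.jpg", "split": "restval"},
    {"filename": "d.jpg", "split": "val"},
    {"filename": "e.jpg", "split": "test"},
]

# Count how many images carry each split label.
split_counts = Counter(img["split"] for img in images)
print(dict(split_counts))

# Optionally fold 'restval' into the training set, a common convention for this split.
train_plus_restval = [img for img in images if img["split"] in {"train", "restval"}]
print(len(train_plus_restval))
```

Whether to include `restval` is a design choice; this notebook trains on the `train` split only.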


In [None]:
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

## Creating the Vocabulary

In [None]:
class Vocabulary:
    def __init__(self, freq_threshold=5):
        self.freq_threshold = freq_threshold
        # Initialize with special tokens
        self.word2idx = {"<pad>": 0, "<start>": 1, "<end>": 2, "<unk>": 3}
        self.idx2word = {0: "<pad>", 1: "<start>", 2: "<end>", 3: "<unk>"}
        self.idx = 4 # Next index to assign

    def build_vocabulary(self, sentence_list):
        frequencies = Counter()
        print(f"Building vocabulary from {len(sentence_list)} sentences...")
        for i, sentence in enumerate(sentence_list):
            tokens = nltk.tokenize.word_tokenize(sentence.lower())
            frequencies.update(tokens)
            if (i+1) % 100000 == 0:
                print(f"Processed {i+1}/{len(sentence_list)} sentences")


        original_size = len(frequencies)
        filtered_freq = {word: freq for word, freq in frequencies.items() if freq >= self.freq_threshold}

        for word in filtered_freq:
            if word not in self.word2idx: # Avoid adding duplicates if called multiple times
                self.word2idx[word] = self.idx
                self.idx2word[self.idx] = word
                self.idx += 1
        print(f"Original vocab size: {original_size}, Filtered size (freq>={self.freq_threshold}): {len(self.word2idx)}")


    def numericalize(self, text):
        tokens = nltk.tokenize.word_tokenize(text.lower())
        return [self.word2idx.get(token, self.word2idx["<unk>"]) for token in tokens]

    def __len__(self):
        return self.idx # Correctly returns the size including special tokens
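
To make the effect of `freq_threshold` concrete, here is a minimal sketch on three toy captions. It assumes whitespace splitting in place of `nltk.tokenize.word_tokenize` so that it runs without the NLTK data files. `build_toy_vocab` and `encode` are hypothetical helper names, not part of the notebook.

```python
from collections import Counter

SPECIALS = ["<pad>", "<start>", "<end>", "<unk>"]

def build_toy_vocab(sentences, freq_threshold):
    """Map words seen at least freq_threshold times to indices after the special tokens."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    word2idx = {tok: i for i, tok in enumerate(SPECIALS)}
    for word, freq in counts.items():
        if freq >= freq_threshold:
            word2idx[word] = len(word2idx)
    return word2idx

def encode(text, word2idx):
    """Replace every out-of-vocabulary token with the <unk> index."""
    return [word2idx.get(tok, word2idx["<unk>"]) for tok in text.lower().split()]

captions = ["a dog runs", "a dog sleeps", "a cat sleeps"]
word2idx = build_toy_vocab(captions, freq_threshold=2)
print(word2idx)
print(encode("a cat runs", word2idx))
```

With a threshold of 2, only "a", "dog", and "sleeps" receive their own indices; "cat" and "runs" both fall back to `<unk>`.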


In [None]:
nltk.download('punkt_tab')

[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.


True

In [None]:
all_captions = []
for img in train_data:
    for s in img['sentences']:
        all_captions.append(s['raw'])

vocab = Vocabulary(freq_threshold=5)
vocab.build_vocabulary(all_captions)
vocab_size = len(vocab)
print(f"Vocabulary Size: {vocab_size}")

Building vocabulary from 414113 sentences...
Processed 100000/414113 sentences
Processed 200000/414113 sentences
Processed 300000/414113 sentences
Processed 400000/414113 sentences
Original vocab size: 24916, Filtered size (freq>=5): 8853
Vocabulary Size: 8853


In [None]:
import pickle

with open("vocab.pkl", "wb") as f:
    pickle.dump(vocab, f)
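
Unpickling `vocab.pkl` later requires the `Vocabulary` class to be defined (or importable) in the loading session. As a hedged alternative, the mapping itself can be stored as JSON and the reverse mapping rebuilt on load. The sketch below assumes a small stand-in `word2idx` dict and a temporary file; `save_word2idx` and `load_word2idx` are hypothetical helper names.

```python
import json
import os
import tempfile

def save_word2idx(word2idx, path):
    """Write the word -> index mapping as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(word2idx, f)

def load_word2idx(path):
    """Read the mapping back and rebuild the index -> word direction."""
    with open(path, "r", encoding="utf-8") as f:
        word2idx = json.load(f)
    idx2word = {idx: word for word, idx in word2idx.items()}
    return word2idx, idx2word

# Stand-in for vocab.word2idx
toy_word2idx = {"<pad>": 0, "<start>": 1, "<end>": 2, "<unk>": 3, "dog": 4}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "vocab.json")
    save_word2idx(toy_word2idx, path)
    restored_word2idx, restored_idx2word = load_word2idx(path)

print(restored_word2idx == toy_word2idx)
print(restored_idx2word[4])
```

Only `word2idx` is written because JSON object keys are always strings, so an `idx2word` dict with integer keys would not round-trip unchanged.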


## Encoding using CNN

In [None]:
class EncoderCNN(nn.Module):
  def __init__(self, embed_size, dropout_p=0.5, fine_tune=True):
    super(EncoderCNN, self).__init__()
    print(f"Initializing EncoderCNN: embed_size={embed_size}, dropout={dropout_p}, fine_tune={fine_tune}")
    resnet = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)

    # Freeze all layers initially
    for param in resnet.parameters():
      param.requires_grad = False

    # Fine-tuning: Unfreeze later layers if fine_tune is True
    if fine_tune:
      print("Fine-tuning ResNet: Unfreezing layer4 parameters.")
      for param in resnet.layer4.parameters(): # Unfreeze layer 4
          param.requires_grad = True

    # Remove the final classification layer
    self.resnet = nn.Sequential(*list(resnet.children())[:-1])

    # Add trainable layers
    self.fc = nn.Linear(resnet.fc.in_features, embed_size)
    self.bn = nn.BatchNorm1d(embed_size, momentum=0.01) # BatchNorm after FC
    self.dropout = nn.Dropout(dropout_p) # Dropout layer

    # Initialize weights for the new layers
    self.fc.weight.data.normal_(0.0, 0.02)
    self.fc.bias.data.fill_(0)

  def forward(self, images):
    # Track gradients through the ResNet only in training mode
    with torch.set_grad_enabled(self.training):
        features = self.resnet(images) # [B, C, 1, 1]

    features = features.squeeze(3).squeeze(2) # [B, C]
    features = self.fc(features)              # [B, E]
    features = self.bn(features)              # [B, E] - Apply BN before dropout
    features = self.dropout(features)         # [B, E] - Apply dropout
    return features


## Decoding the image features using an LSTM

In [None]:
class DecoderRNN(nn.Module):
    def __init__(self, embed_size, hidden_size, vocab_size, num_layers=1, dropout_p=0.5):
        super().__init__()
        print(f"Initializing DecoderRNN: embed_size={embed_size}, hidden_size={hidden_size}, vocab_size={vocab_size}, num_layers={num_layers}, dropout={dropout_p}")
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.embed_dropout = nn.Dropout(dropout_p) # Dropout after embedding
        # Apply LSTM dropout between layers only if num_layers > 1
        lstm_dropout = dropout_p if num_layers > 1 else 0
        self.lstm = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True, dropout=lstm_dropout)
        self.dropout = nn.Dropout(dropout_p) # Dropout before final linear layer
        self.linear = nn.Linear(hidden_size, vocab_size)

        # Layers to initialize LSTM state from image features
        self.init_h = nn.Linear(embed_size, hidden_size)
        self.init_c = nn.Linear(embed_size, hidden_size)

        # Initialize weights
        self.embed.weight.data.uniform_(-0.1, 0.1)
        self.linear.weight.data.uniform_(-0.1, 0.1)
        self.linear.bias.data.fill_(0)
        self.init_h.weight.data.uniform_(-0.1, 0.1)
        self.init_h.bias.data.fill_(0)
        self.init_c.weight.data.uniform_(-0.1, 0.1)
        self.init_c.bias.data.fill_(0)


    def forward(self, features, captions):
        # features: [B, E], captions: [B, T] (T = sequence length)

        # Prepare initial LSTM state from image features
        # Need shape [num_layers, B, H]
        h0 = self.init_h(features).unsqueeze(0) # [1, B, H]
        c0 = self.init_c(features).unsqueeze(0) # [1, B, H]
        # If num_layers > 1, repeat the initial state for each layer
        if self.lstm.num_layers > 1:
             h0 = h0.repeat(self.lstm.num_layers, 1, 1)
             c0 = c0.repeat(self.lstm.num_layers, 1, 1)

        # Embed captions and apply dropout
        embeddings = self.embed(captions)    # [B, T, E]
        embeddings = self.embed_dropout(embeddings) # Apply dropout

        # Pass through LSTM
        # embeddings shape needs to be [B, T, E] for batch_first=True
        hiddens, _ = self.lstm(embeddings, (h0, c0))  # [B, T, H]

        # Apply dropout before the final linear layer
        hiddens = self.dropout(hiddens) # Apply dropout

        # Generate outputs (logits over vocabulary)
        outputs = self.linear(hiddens)  # [B, T, Vocab]
        return outputs


## Prepare the Dataset

In [None]:
class CocoDataset(Dataset):
    def __init__(self, data, img_root, vocab, transform=None):
        self.data = data
        self.img_root = img_root
        self.vocab = vocab
        self.transform = transform
        print(f"Initialized CocoDataset with {len(self.data)} items. Root: {self.img_root}")


    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        entry = self.data[idx]

        # --- Randomly select one caption ---
        caption_entry = random.choice(entry['sentences'])
        caption = caption_entry['raw']

        # Numericalize caption
        tokens = [self.vocab.word2idx["<start>"]] + \
                 self.vocab.numericalize(caption) + \
                 [self.vocab.word2idx["<end>"]]

        # Load image
        split_folder = entry.get('filepath', '')

        # Avoid repeating the split folder if img_root already points into it
        if split_folder in self.img_root:
            img_path = os.path.join(self.img_root, entry['filename'])
        else:
            img_path = os.path.join(self.img_root, split_folder, entry['filename'])

        try:
            image = Image.open(img_path).convert("RGB")
        except FileNotFoundError:
            # Fall back to the first item so the batch keeps its size
            print(f"Warning: Image not found at {img_path}. Using item 0 instead.")
            return self.__getitem__(0)

        # Apply transformations
        if self.transform:
            image = self.transform(image)

        return image, torch.tensor(tokens, dtype=torch.long) # Token ids as a LongTensor


## Transformations to be applied on the data

In [None]:
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

# Training Transform (with augmentation)
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)), # Random crop and resize
    transforms.RandomHorizontalFlip(),      # Random horizontal flip
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1), # Color augmentation
    transforms.ToTensor(),
    normalize
])

# Validation/Test Transform
val_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    normalize
])
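
`transforms.ToTensor()` scales 8-bit pixel values to [0, 1], and `Normalize` then applies `(x - mean) / std` per channel. The plain-Python sketch below reproduces that arithmetic for a single RGB pixel; `normalize_pixel` is a hypothetical helper used only for illustration.

```python
MEAN = [0.485, 0.456, 0.406]  # channel means passed to Normalize above
STD = [0.229, 0.224, 0.225]   # channel standard deviations passed to Normalize above

def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return [(value / 255.0 - m) / s for value, m, s in zip(rgb, MEAN, STD)]

black = normalize_pixel((0, 0, 0))
white = normalize_pixel((255, 255, 255))
print([round(v, 3) for v in black])
print([round(v, 3) for v in white])
```

Black pixels map to negative values and white pixels to positive ones, so the network input is roughly centered around zero.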

## Make all captions in a batch equal in length by padding them

In [None]:
def collate_fn(batch):
    # Separate images and captions
    images, captions = zip(*batch)

    # Stack images
    images = torch.stack(images, 0)

    # Pad captions to the max length in the batch
    captions = pad_sequence(captions, batch_first=True, padding_value=vocab.word2idx["<pad>"])

    return images, captions
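
The effect of `pad_sequence(..., batch_first=True)` can be mimicked in plain Python to see what a padded batch looks like. This is an illustrative sketch on lists rather than tensors; `pad_batch` is a hypothetical helper, and the token ids are made up.

```python
PAD_IDX = 0  # index of "<pad>" in the vocabulary above

def pad_batch(sequences, pad_value=PAD_IDX):
    """Right-pad every sequence to the length of the longest one in the batch."""
    max_len = max(len(seq) for seq in sequences)
    return [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]

# Two made-up captions: <start>=1, <end>=2, other ids stand for arbitrary words.
batch = [[1, 4, 5, 6, 2], [1, 7, 2]]
padded = pad_batch(batch)
for row in padded:
    print(row)
```

Because padding is done per batch, the padded length varies from batch to batch and is never longer than the longest caption in that batch.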

In [None]:
embed_size = 256
hidden_size = 512
num_layers = 1         # Number of LSTM layers
dropout_prob = 0.5     # Dropout probability
batch_size = 64       # Increased batch size 
num_epochs = 15       # Reduced epochs as it was overfitting quickly
learning_rate = 3e-4   # Adjusted learning rate
fine_tune_lr = 1e-5    # Separate learning rate for fine-tuned layers
weight_decay = 1e-5    # Weight decay (L2 regularization)
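fine_tune_encoder = True # Set to True to fine-tune ResNet layer4
patience_early_stop = 3  # Non-improving epochs that trigger early stopping
patience_scheduler = 1   # Non-improving epochs the scheduler tolerates before reducing the LR
delta_early_stop = 0.005 # Require a minimum improvement to reset counter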

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

train_img_root = '/content/coco/images/'
val_img_root = '/content/coco/images/'

train_dataset = CocoDataset(train_data, train_img_root, vocab, train_transform)
val_dataset = CocoDataset(val_data, val_img_root, vocab, val_transform)

num_workers = 2 if device.type == 'cuda' else 0
print(f"Using {num_workers} workers for DataLoaders.")

pin_memory = device.type == 'cuda'  # Pinned host memory only helps when copying to a GPU
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True,
                          collate_fn=collate_fn, num_workers=num_workers, pin_memory=pin_memory)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False,
                        collate_fn=collate_fn, num_workers=num_workers, pin_memory=pin_memory)


Using device: cuda
Initialized CocoDataset with 82783 items. Root: /content/coco/images/
Initialized CocoDataset with 5000 items. Root: /content/coco/images/
Using 2 workers for DataLoaders.


In [None]:
print("--- Initializing Models, Optimizer, etc. ---")
encoder = EncoderCNN(embed_size, dropout_p=dropout_prob, fine_tune=fine_tune_encoder).to(device)
decoder = DecoderRNN(embed_size, hidden_size, vocab_size, num_layers, dropout_p=dropout_prob).to(device)

decoder_params = list(decoder.parameters())
encoder_fc_params = list(encoder.fc.parameters()) + list(encoder.bn.parameters())
encoder_finetune_params = []
if fine_tune_encoder:
    layer4_index = 7
    encoder_finetune_params = list(encoder.resnet[layer4_index].parameters())
    print(f"Optimizing {len(encoder_finetune_params)} parameter tensors from ResNet layer4 with LR {fine_tune_lr}")


params_to_optimize = [
    {'params': decoder_params},
    {'params': encoder_fc_params}
]

if encoder_finetune_params:
    params_to_optimize.append({'params': encoder_finetune_params, 'lr': fine_tune_lr})

optimizer = optim.Adam(params_to_optimize, lr=learning_rate, weight_decay=weight_decay)

# Learning Rate Scheduler
scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=patience_scheduler, verbose=True)

# Loss Function (ignore padding)
criterion = nn.CrossEntropyLoss(ignore_index=vocab.word2idx["<pad>"])


--- Initializing Models, Optimizer, etc. ---
Initializing EncoderCNN: embed_size=256, dropout=0.5, fine_tune=True


Downloading: "https://download.pytorch.org/models/resnet101-63fe2227.pth" to /root/.cache/torch/hub/checkpoints/resnet101-63fe2227.pth
100%|██████████| 171M/171M [00:01<00:00, 168MB/s]


Fine-tuning ResNet: Unfreezing layer4 parameters.
Initializing DecoderRNN: embed_size=256, hidden_size=512, vocab_size=8853, num_layers=1, dropout=0.5
Optimizing 30 parameter tensors from ResNet layer4 with LR 1e-05




## Early Stopping mechanism to prevent overfitting

In [None]:
class EarlyStopping:
    def __init__(self, patience=3, delta=0.0):
        self.patience = patience
        self.counter = 0
        self.best_loss = None
        self.early_stop = False
        self.delta = delta # Minimum change to qualify as an improvement
        print(f"Initialized EarlyStopping: patience={patience}, delta={delta}")


    def __call__(self, val_loss):
        if self.best_loss is None:
            self.best_loss = val_loss
            print(f"EarlyStopping: Initial best loss set to {val_loss:.4f}")
        # Check if val_loss has improved significantly
        elif val_loss < self.best_loss - self.delta:
             print(f"EarlyStopping: Validation loss improved ({self.best_loss:.4f} --> {val_loss:.4f}). Resetting counter.")
             self.best_loss = val_loss
             self.counter = 0
        else:
            self.counter += 1
            print(f"EarlyStopping: No significant improvement for {self.counter}/{self.patience} epochs.")
            if self.counter >= self.patience:
                print("EarlyStopping: Triggering early stop.")
                self.early_stop = True

early_stopper = EarlyStopping(patience=patience_early_stop, delta=delta_early_stop)

Initialized EarlyStopping: patience=3, delta=0.005
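
To see how `patience` and `delta` interact, the counter logic can be replayed on a list of validation losses. The sketch below is a simplified stand-in for the class above (no printing, returns the stopping epoch directly); `first_stop_epoch` is a hypothetical helper and the loss values are made up.

```python
def first_stop_epoch(val_losses, patience=3, delta=0.005):
    """Return the 1-based epoch at which early stopping would trigger, or None."""
    best_loss = None
    counter = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if best_loss is None:
            best_loss = loss
        elif loss < best_loss - delta:
            # Improvement larger than delta: record it and reset the counter.
            best_loss = loss
            counter = 0
        else:
            # No improvement, or one smaller than delta: best_loss stays unchanged.
            counter += 1
            if counter >= patience:
                return epoch
    return None

# One clear improvement, then gains that stay below delta.
print(first_stop_epoch([3.26, 3.10, 3.099, 3.098, 3.097, 3.05]))
# Every epoch improves by more than delta, so training is never stopped.
print(first_stop_epoch([3.26, 3.10, 3.00, 2.95]))
```

Note that improvements smaller than `delta` neither reset the counter nor update the best loss, so a slow but steady decline can still end training.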


In [None]:
total_params_encoder = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
total_params_decoder = sum(p.numel() for p in decoder.parameters() if p.requires_grad)
print(f"\n--- Model Summary ---")
print(f"Trainable Parameters in Encoder: {total_params_encoder:,}")
print(f"Trainable Parameters in Decoder: {total_params_decoder:,}")
print(f"Total Trainable Parameters: {total_params_encoder + total_params_decoder:,}")
print("---------------------\n")


--- Model Summary ---
Trainable Parameters in Encoder: 15,489,792
Trainable Parameters in Decoder: 8,648,085
Total Trainable Parameters: 24,137,877
---------------------
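
The decoder figure can be checked by hand from the layer shapes. The sketch below recomputes it in plain Python, assuming PyTorch's standard parameter layout for a single-layer `nn.LSTM` (four gates, each with input weights, hidden weights, and two bias vectors).

```python
vocab_size, embed_size, hidden_size = 8853, 256, 512

# nn.Embedding(vocab_size, embed_size)
embedding = vocab_size * embed_size
# Single-layer LSTM: 4 gates x (input weights + hidden weights + two bias vectors)
lstm = 4 * (embed_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size)
# nn.Linear(hidden_size, vocab_size): weights plus bias
output_linear = hidden_size * vocab_size + vocab_size
# init_h and init_c each map the image embedding to the hidden size
state_init = 2 * (embed_size * hidden_size + hidden_size)

decoder_total = embedding + lstm + output_linear + state_init
print(f"{decoder_total:,}")
```

The total agrees with the 8,648,085 trainable decoder parameters reported in the summary; more than half of them sit in the output projection over the vocabulary.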



In [None]:
print("--- Starting Training ---")
best_val_loss = float('inf')

for epoch in range(num_epochs):
    start_time = time.time()
    epoch_train_loss = 0.0

    # --- Training Phase ---
    encoder.train()
    decoder.train()
    print(f"\nEpoch [{epoch+1}/{num_epochs}] - Training")
    for i, (images, captions) in enumerate(train_loader):
        images, captions = images.to(device), captions.to(device)

        # Zero the gradients
        optimizer.zero_grad()

        # Forward pass
        features = encoder(images)
        # Teacher forcing: Feed target captions (shifted) to decoder
        # Input: <start>, w1, w2, ... wn
        # Target: w1, w2, ... wn, <end>
        outputs = decoder(features, captions[:, :-1]) # Drop the last time step (<end> for the longest caption, <pad> otherwise)

        # Calculate loss
        # Target is captions shifted left, excluding <start> token
        loss = criterion(outputs.reshape(-1, vocab_size), captions[:, 1:].reshape(-1))

        # Backward pass and optimization
        loss.backward()
        optimizer.step()

        epoch_train_loss += loss.item()

        if (i + 1) % 100 == 0:
            print(f"Batch [{i+1}/{len(train_loader)}], Loss: {loss.item():.4f}")

    avg_train_loss = epoch_train_loss / len(train_loader)
    epoch_time = time.time() - start_time
    print(f"Epoch [{epoch+1}/{num_epochs}] Training completed in {epoch_time:.2f}s")
    print(f"Average Training Loss: {avg_train_loss:.4f}")

    # Validation Phase
    encoder.eval()
    decoder.eval()
    epoch_val_loss = 0.0
    print(f"Epoch [{epoch+1}/{num_epochs}] - Validation")
    with torch.no_grad():
        for val_images, val_captions in val_loader:
            val_images, val_captions = val_images.to(device), val_captions.to(device)

            val_features = encoder(val_images)
            val_outputs = decoder(val_features, val_captions[:, :-1]) # Drop the last time step

            val_loss = criterion(val_outputs.reshape(-1, vocab_size), val_captions[:, 1:].reshape(-1))
            epoch_val_loss += val_loss.item()

    avg_val_loss = epoch_val_loss / len(val_loader)
    print(f"Epoch [{epoch+1}/{num_epochs}], Average Validation Loss: {avg_val_loss:.4f}")

    if avg_val_loss < best_val_loss - delta_early_stop: # Use delta here too for saving
        print(f"Validation loss decreased ({best_val_loss:.4f} --> {avg_val_loss:.4f}). Saving model...")
        best_val_loss = avg_val_loss
        torch.save({
            'epoch': epoch + 1,
            'encoder_state_dict': encoder.state_dict(),
            'decoder_state_dict': decoder.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'vocab': vocab, # Save vocab for inference later
            'val_loss': best_val_loss,
            'embed_size': embed_size,
            'hidden_size': hidden_size,
            'num_layers': num_layers,
            'dropout_prob': dropout_prob,
            'fine_tune_encoder': fine_tune_encoder
        }, 'best_model_improved.pth')
    else:
        print(f"Validation loss did not improve significantly from {best_val_loss:.4f}.")

    # Learning Rate Scheduling
    scheduler.step(avg_val_loss)

    # Early Stopping Check
    early_stopper(avg_val_loss)
    if early_stopper.early_stop:
        print("Early stopping criteria met. Stopping training.")
        break


--- Starting Training ---

Epoch [1/15] - Training
Batch [100/1294], Loss: 5.0666
Batch [200/1294], Loss: 4.6535
Batch [300/1294], Loss: 4.5354
Batch [400/1294], Loss: 4.0987
Batch [500/1294], Loss: 4.1721
Batch [600/1294], Loss: 4.0385
Batch [700/1294], Loss: 3.6736
Batch [800/1294], Loss: 3.7483
Batch [900/1294], Loss: 3.8578
Batch [1000/1294], Loss: 3.7661
Batch [1100/1294], Loss: 3.7105
Batch [1200/1294], Loss: 3.3711
Epoch [1/15] Training completed in 1257.06s
Average Training Loss: 4.1539
Epoch [1/15] - Validation
Epoch [1/15], Average Validation Loss: 3.2645
Validation loss decreased (inf --> 3.2645). Saving model...
EarlyStopping: Initial best loss set to 3.2645

Epoch [2/15] - Training
Batch [100/1294], Loss: 3.3486
Batch [200/1294], Loss: 3.2202
Batch [300/1294], Loss: 3.3123
Batch [400/1294], Loss: 3.3849
Batch [500/1294], Loss: 3.3364
Batch [600/1294], Loss: 3.3308
Batch [700/1294], Loss: 3.6197
Batch [800/1294], Loss: 3.0635
Batch [900/1294], Loss: 3.1723
Batch [1000/1294]

KeyboardInterrupt: 
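
The teacher-forcing shift used in the training loop can be seen on a small padded batch. This plain-Python sketch uses lists instead of tensors and made-up token ids; `shift_for_teacher_forcing` is a hypothetical helper.

```python
PAD, START, END = 0, 1, 2  # special-token indices from the vocabulary above

def shift_for_teacher_forcing(padded_batch):
    """Split each padded caption into decoder inputs and next-token targets."""
    inputs = [row[:-1] for row in padded_batch]   # mirrors captions[:, :-1]
    targets = [row[1:] for row in padded_batch]   # mirrors captions[:, 1:]
    return inputs, targets

padded_batch = [
    [START, 4, 5, 6, END],
    [START, 7, END, PAD, PAD],
]
inputs, targets = shift_for_teacher_forcing(padded_batch)

# Positions whose target is PAD are the ones ignore_index removes from the loss.
scored = [[t != PAD for t in row] for row in targets]
print(inputs)
print(targets)
print(scored)
```

For the shorter caption the input still contains `<end>` followed by padding, but those steps have `<pad>` targets and so contribute nothing to the loss.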

# Training was cut short when the Colab session ended; it continues in the next Colab notebook.