# PapyrusNU Hieroglyph Detection - Model Training

This notebook trains a Detectron2 model for hieroglyph detection and classification.

## Objectives:
1. Set up Detectron2 configuration for hieroglyph detection
2. Configure training parameters and data augmentation
3. Train the model with monitoring and logging
4. Visualize training progress
5. Save the trained model

## Setup

In [1]:
# Install dependencies
!pip install -U torch torchvision cython
!pip install -U 'git+https://github.com/facebookresearch/fvcore.git' 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
import torch, torchvision
torch.__version__

Collecting git+https://github.com/facebookresearch/fvcore.git
  Cloning https://github.com/facebookresearch/fvcore.git to /tmp/pip-req-build-08c6muid
  Running command git clone --filter=blob:none --quiet https://github.com/facebookresearch/fvcore.git /tmp/pip-req-build-08c6muid
  Resolved https://github.com/facebookresearch/fvcore.git to commit 3b2d62f06b22ef743ac394e568e1e87ae12b30a8
  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI
  Cloning https://github.com/cocodataset/cocoapi.git to /tmp/pip-req-build-jajkqmud
  Running command git clone --filter=blob:none --quiet https://github.com/cocodataset/cocoapi.git /tmp/pip-req-build-jajkqmud
  Resolved https://github.com/cocodataset/cocoapi.git to commit 8c9bcc3cf640524c4c20a9c40e89cb6a2f2fa0e9
  Preparing metadata (setup.py) ... [?25l[?25hdone
Building wheels for collected packages: fvcore, pycocotools
  Building wheel for fvcore (setup.py) ... [?25l

'2.8.0+cu128'

In [2]:
!git clone https://github.com/facebookresearch/detectron2 detectron2_repo
!pip install -e detectron2_repo

fatal: destination path 'detectron2_repo' already exists and is not an empty directory.
Obtaining file:///content/detectron2_repo
  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting pycocotools>=2.0.2 (from detectron2==0.6)
  Using cached pycocotools-2.0.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.3 kB)
Collecting fvcore<0.1.6,>=0.1.5 (from detectron2==0.6)
  Using cached fvcore-0.1.5.post20221221-py3-none-any.whl
Using cached pycocotools-2.0.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (477 kB)
Installing collected packages: pycocotools, fvcore, detectron2
  Attempting uninstall: pycocotools
    Found existing installation: pycocotools 2.0
    Uninstalling pycocotools-2.0:
      Successfully uninstalled pycocotools-2.0
  Attempting uninstall: fvcore
    Found existing installation: fvcore 0.1.6
    Uninstalling fvcore-0.1.6:
      Successfully uninstalled fvcore-0.1.6
  Attempting uninstall: detectron2
    Found existing insta

In [3]:
import subprocess
import sys
import torch

# Check CUDA version to pick the correct Detectron2 wheel
cuda_version = torch.version.cuda
print(f"Detected CUDA version: {cuda_version}")

# Install other required packages
print("\nInstalling other required packages...")
other_packages = [
    'opencv-python-headless',
    'matplotlib>=3.3.0',
    'seaborn>=0.11.0',
    'plotly>=5.0.0',
    'pandas>=1.3.0',
    'scikit-learn>=1.0.0',
    'kaleido>=0.2.1',
    'pycocotools',
    'pillow>=8.0.0',
    'numpy',
    'tqdm',
    'ipywidgets'
]

try:
    subprocess.check_call([sys.executable, '-m', 'pip', 'install', '-q'] + other_packages)
    print("✅ All other packages installed successfully!")
except subprocess.CalledProcessError as e:
    print(f"❌ Failed to install other packages: {e}")
except Exception as e:
    print(f"❌ Installation of other packages failed: {e}")

Detected CUDA version: 12.8

Installing other required packages...
✅ All other packages installed successfully!


In [4]:
# Import libraries
import os
import sys
import json
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import cv2
import torch
from PIL import Image
import pandas as pd
from pathlib import Path
import logging
from datetime import datetime
import time
import pickle

# Detectron2 imports
import detectron2
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor, DefaultTrainer
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer, ColorMode
from detectron2.data import MetadataCatalog, DatasetCatalog, build_detection_train_loader, build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.utils.logger import setup_logger
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.modeling import build_model
from detectron2.solver import build_lr_scheduler, build_optimizer
from detectron2.utils.events import EventStorage
import detectron2.data.transforms as T

# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

# Set up paths
PROJECT_ROOT = '/content/drive/My Drive/PapyrusNU_Detectron'
DATA_DIR = f'{PROJECT_ROOT}/data'
SCRIPTS_DIR = f'{PROJECT_ROOT}/scripts'
MODELS_DIR = f'{PROJECT_ROOT}/models'
NOTEBOOKS_DIR = f'{PROJECT_ROOT}/notebooks'

# Add scripts to path
sys.path.append(SCRIPTS_DIR)

# Set up logging
setup_logger()
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

print(f"Detectron2 version: {detectron2.__version__}")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")

Mounted at /content/drive
Detectron2 version: 0.6
PyTorch version: 2.8.0+cu128
CUDA available: True
CUDA device: NVIDIA A100-SXM4-40GB


## Data Loading and Category Mapping

In [5]:
# Load split datasets and create master category mapping
with open(f"{DATA_DIR}/annotations/train_annotations.json", 'r') as f:
    train_data = json.load(f)
with open(f"{DATA_DIR}/annotations/val_annotations.json", 'r') as f:
    val_data = json.load(f)
with open(f"{DATA_DIR}/annotations/test_annotations.json", 'r') as f:
    test_data = json.load(f)

print(f"Loaded annotation files:")
print(f"  Train: {len(train_data['annotations'])} annotations, {len(train_data['categories'])} categories")
print(f"  Val: {len(val_data['annotations'])} annotations, {len(val_data['categories'])} categories")
print(f"  Test: {len(test_data['annotations'])} annotations, {len(test_data['categories'])} categories")

# Create a master category list from all splits
master_categories = {}
for data in [train_data, val_data, test_data]:
    for cat in data.get('categories', []):
        master_categories[cat['id']] = cat

# Sort categories and create a mapping from original COCO IDs to 0-indexed IDs for Detectron2
sorted_cat_ids = sorted(master_categories.keys())
category_id_map = {old_id: new_id for new_id, old_id in enumerate(sorted_cat_ids)}
category_names = [master_categories[old_id]['name'] for old_id in sorted_cat_ids]

print(f"\n✅ Created master category mapping:")
print(f"  Total unique categories: {len(category_names)}")
print(f"  Original ID range: {min(sorted_cat_ids)} - {max(sorted_cat_ids)}")
print(f"  New ID range: 0 - {len(category_names)-1}")
print(f"  Sample categories: {category_names[:5]}...")

Loaded annotation files:
  Train: 1682 annotations, 177 categories
  Val: 432 annotations, 66 categories
  Test: 316 annotations, 95 categories

✅ Created master category mapping:
  Total unique categories: 177
  Original ID range: 5 - 628
  New ID range: 0 - 176
  Sample categories: ['A13A', 'A14', 'A17', 'A1B', 'A2']...


In [6]:
# Function to load data and remap category IDs for Detectron2
def load_and_remap_dataset(data, image_dir, id_map):
    """Load COCO format data and remap category IDs to be 0-indexed"""
    dataset_dicts = []
    images = {img['id']: img for img in data['images']}

    # Group annotations by image
    annotations_by_image = {}
    for ann in data['annotations']:
        img_id = ann['image_id']
        if img_id not in annotations_by_image:
            annotations_by_image[img_id] = []
        annotations_by_image[img_id].append(ann)

    for image_id, image_info in images.items():
        record = {}
        record["file_name"] = os.path.join(image_dir, image_info['file_name'])
        record["height"] = image_info['height']
        record["width"] = image_info['width']
        record["image_id"] = image_id

        objs = []
        for ann in annotations_by_image.get(image_id, []):
            # Apply the remapping to the category ID
            remapped_id = id_map.get(ann['category_id'])
            if remapped_id is not None:
                obj = {
                    "bbox": ann['bbox'],
                    "bbox_mode": 1,  # BoxMode.XYWH_ABS
                    "segmentation": ann.get('segmentation', []),
                    "category_id": remapped_id,
                    "iscrowd": ann.get('iscrowd', 0),
                }
                objs.append(obj)
        record["annotations"] = objs
        dataset_dicts.append(record)
    return dataset_dicts

# Load and remap all data splits
image_dir = f"{DATA_DIR}/images"
train_dataset_dicts = load_and_remap_dataset(train_data, image_dir, category_id_map)
val_dataset_dicts = load_and_remap_dataset(val_data, image_dir, category_id_map)
test_dataset_dicts = load_and_remap_dataset(test_data, image_dir, category_id_map)

print(f"\n✅ Loaded and remapped datasets:")
print(f"  Train: {len(train_dataset_dicts)} images")
print(f"  Val: {len(val_dataset_dicts)} images")
print(f"  Test: {len(test_dataset_dicts)} images")

# Verify remapping worked correctly
all_cat_ids = set()
for dataset_dicts in [train_dataset_dicts, val_dataset_dicts, test_dataset_dicts]:
    for record in dataset_dicts:
        for ann in record['annotations']:
            all_cat_ids.add(ann['category_id'])

print(f"  Remapped category IDs range: {min(all_cat_ids)} - {max(all_cat_ids)}")
print(f"  Expected range: 0 - {len(category_names)-1}")
assert max(all_cat_ids) < len(category_names), "Category ID mapping failed!"
print("  ✅ Category ID remapping successful!")


✅ Loaded and remapped datasets:
  Train: 1 images
  Val: 1 images
  Test: 1 images
  Remapped category IDs range: 0 - 176
  Expected range: 0 - 176
  ✅ Category ID remapping successful!


## Dataset Registration

In [7]:
# Register datasets with Detectron2
dataset_names = ["hieroglyphs_train", "hieroglyphs_val", "hieroglyphs_test"]
dataset_dicts_list = [train_dataset_dicts, val_dataset_dicts, test_dataset_dicts]

# Clear existing registrations
for name in dataset_names:
    if name in DatasetCatalog.list():
        DatasetCatalog.remove(name)

# Register datasets
for name, dataset_dicts in zip(dataset_names, dataset_dicts_list):
    DatasetCatalog.register(name, lambda d=dataset_dicts: d)
    MetadataCatalog.get(name).set(
        thing_classes=category_names,
        evaluator_type="coco"
    )

print(f"✅ Datasets registered successfully!")
print(f"Number of categories: {len(category_names)}")
print(f"Sample categories: {category_names[:5]}...")

# Verify registration
hieroglyphs_metadata = MetadataCatalog.get("hieroglyphs_train")
print(f"Metadata classes: {len(hieroglyphs_metadata.thing_classes)}")

✅ Datasets registered successfully!
Number of categories: 177
Sample categories: ['A13A', 'A14', 'A17', 'A1B', 'A2']...
Metadata classes: 177


## Model Configuration

In [8]:
# Set up model configuration
cfg = get_cfg()

# Model architecture - using Mask R-CNN with ResNet-50 backbone
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))

# Dataset configuration
cfg.DATASETS.TRAIN = ("hieroglyphs_train",)
cfg.DATASETS.TEST = ("hieroglyphs_val",)  # Use validation set for periodic evaluation
cfg.DATALOADER.NUM_WORKERS = 2

# Model parameters
cfg.MODEL.ROI_HEADS.NUM_CLASSES = len(category_names)  # Number of hieroglyph categories
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128  # Reduced due to small object sizes
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # Detection threshold

# Training configuration
cfg.SOLVER.IMS_PER_BATCH = 1  # Batch size (limited by single image dataset)
cfg.SOLVER.BASE_LR = 0.00025  # Learning rate
cfg.SOLVER.MAX_ITER = 5000  # Number of training iterations
cfg.SOLVER.STEPS = (3000, 4500)  # Learning rate decay steps
cfg.SOLVER.GAMMA = 0.1  # Learning rate decay factor
cfg.SOLVER.WARMUP_ITERS = 500  # Warmup iterations
cfg.SOLVER.WARMUP_FACTOR = 1.0 / 1000
cfg.SOLVER.CHECKPOINT_PERIOD = 1000  # Save checkpoint every 1000 iterations

# Evaluation configuration
cfg.TEST.EVAL_PERIOD = 1000  # Evaluate every 1000 iterations

# Output directory
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
output_dir = f"{MODELS_DIR}/hieroglyph_model_{timestamp}"
cfg.OUTPUT_DIR = output_dir
os.makedirs(output_dir, exist_ok=True)

print(f"Model Configuration:")
print(f"  Architecture: Mask R-CNN with ResNet-50 FPN")
print(f"  Number of classes: {cfg.MODEL.ROI_HEADS.NUM_CLASSES}")
print(f"  Batch size: {cfg.SOLVER.IMS_PER_BATCH}")
print(f"  Learning rate: {cfg.SOLVER.BASE_LR}")
print(f"  Max iterations: {cfg.SOLVER.MAX_ITER}")
print(f"  Output directory: {output_dir}")

# Save configuration
with open(f"{output_dir}/config.yaml", 'w') as f:
    f.write(cfg.dump())
print(f"Configuration saved to {output_dir}/config.yaml")

Model Configuration:
  Architecture: Mask R-CNN with ResNet-50 FPN
  Number of classes: 177
  Batch size: 1
  Learning rate: 0.00025
  Max iterations: 5000
  Output directory: /content/drive/My Drive/PapyrusNU_Detectron/models/hieroglyph_model_20250807_190054
Configuration saved to /content/drive/My Drive/PapyrusNU_Detectron/models/hieroglyph_model_20250807_190054/config.yaml


## Data Augmentation Setup

In [12]:
# Custom data augmentation for hieroglyphs
from detectron2.data import detection_utils as utils
import copy # Import the copy module

def custom_mapper(dataset_dict):
    """
    Custom data mapper with augmentations suitable for hieroglyphs
    """
    # Use copy.deepcopy instead of utils.deepcopy
    dataset_dict = copy.deepcopy(dataset_dict)  # it will be modified by code below

    # Load image
    image = utils.read_image(dataset_dict["file_name"], format="BGR")

    # Define augmentations
    augmentations = [
        T.RandomFlip(prob=0.5, horizontal=True),  # Horizontal flip
        T.RandomBrightness(0.8, 1.2),  # Brightness adjustment
        T.RandomContrast(0.8, 1.2),  # Contrast adjustment
        T.RandomSaturation(0.8, 1.2),  # Saturation adjustment
        T.RandomRotation(angle=[-5, 5]),  # Small rotations
        T.ResizeShortestEdge(
            short_edge_length=[640, 672, 704, 736, 768, 800],
            max_size=1333,
            sample_style="choice"
        ),
    ]

    # Apply augmentations
    aug_input = T.AugInput(image)
    transforms = T.AugmentationList(augmentations)(aug_input)
    image = aug_input.image

    # Transform annotations
    annos = [
        utils.transform_instance_annotations(obj, transforms, image.shape[:2])
        for obj in dataset_dict.pop("annotations")
        if obj.get("iscrowd", 0) == 0
    ]

    instances = utils.annotations_to_instances(annos, image.shape[:2])
    dataset_dict["instances"] = utils.filter_empty_instances(instances)
    dataset_dict["image"] = torch.as_tensor(image.transpose(2, 0, 1).astype("float32"))

    return dataset_dict

print("Custom data mapper with augmentations defined:")
print("  - Random horizontal flips")
print("  - Brightness, contrast, saturation adjustments")
print("  - Small random rotations (±5 degrees)")
print("  - Multi-scale training")

Custom data mapper with augmentations defined:
  - Random horizontal flips
  - Brightness, contrast, saturation adjustments
  - Small random rotations (±5 degrees)
  - Multi-scale training


## Custom Trainer with Monitoring

In [10]:
# Custom trainer class with enhanced monitoring
class HieroglyphTrainer(DefaultTrainer):
    """
    Custom trainer for hieroglyph detection with enhanced logging and evaluation
    """

    def __init__(self, cfg):
        super().__init__(cfg)
        self.training_history = {
            'iterations': [],
            'total_loss': [],
            'learning_rate': [],
            'validation_ap': [],
            'validation_iterations': []
        }

    @classmethod
    def build_train_loader(cls, cfg):
        """Build training data loader with custom augmentations"""
        return build_detection_train_loader(cfg, mapper=custom_mapper)

    @classmethod
    def build_evaluator(cls, cfg, dataset_name, output_folder=None):
        """Build evaluator for validation"""
        if output_folder is None:
            output_folder = os.path.join(cfg.OUTPUT_DIR, "inference")
        return COCOEvaluator(dataset_name, cfg, False, output_folder)

    def run_step(self):
        """Custom run step with loss logging"""
        assert self.model.training, "[Trainer] model was changed to eval mode!"
        start = time.perf_counter()

        data = next(self._trainer._data_loader_iter)
        data_time = time.perf_counter() - start

        loss_dict = self.model(data)
        losses = sum(loss_dict.values())

        self.optimizer.zero_grad()
        losses.backward()

        self._trainer._write_metrics(loss_dict, data_time)

        # Log training metrics
        if self.iter % 50 == 0:  # Log every 50 iterations
            self.training_history['iterations'].append(self.iter)
            self.training_history['total_loss'].append(losses.item())
            self.training_history['learning_rate'].append(self.optimizer.param_groups[0]['lr'])

        self.optimizer.step()

    def save_training_history(self):
        """Save training history to file"""
        history_file = os.path.join(self.cfg.OUTPUT_DIR, "training_history.json")
        with open(history_file, 'w') as f:
            json.dump(self.training_history, f, indent=2)
        print(f"Training history saved to {history_file}")

print("Custom trainer class defined with:")
print("  - Enhanced loss logging")
print("  - Custom data augmentations")
print("  - Training history tracking")
print("  - Periodic validation evaluation")

Custom trainer class defined with:
  - Enhanced loss logging
  - Custom data augmentations
  - Training history tracking
  - Periodic validation evaluation


## Model Training

In [11]:
# Initialize trainer
print("Initializing trainer...")
trainer = HieroglyphTrainer(cfg)
trainer.resume_or_load(resume=False)

print(f"Starting training for {cfg.SOLVER.MAX_ITER} iterations...")
print(f"Model will be saved to: {output_dir}")
print(f"Training started at: {datetime.now()}")
print("-" * 60)

# Start training
start_time = time.time()

try:
    trainer.train()
    print("\n✅ Training completed successfully!")
except Exception as e:
    print(f"❌ Training failed with error: {e}")
    raise
finally:
    end_time = time.time()
    training_duration = end_time - start_time
    print(f"\nTraining duration: {training_duration/3600:.2f} hours")

    # Save training history
    trainer.save_training_history()

    # Save final model
    trainer.checkpointer.save("model_final")
    print(f"Final model saved to: {output_dir}/model_final.pth")

Initializing trainer...
[08/07 19:01:05 d2.engine.defaults]: Model:
GeneralizedRCNN(
  (backbone): FPN(
    (fpn_lateral2): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral3): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral4): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral5): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output5): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (top_block): LastLevelMaxPool()
    (bottom_up): ResNet(
      (stem): BasicStem(
        (conv1): Conv2d(
          3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
    

R-50.pkl: 102MB [00:01, 70.6MB/s]                           


[08/07 19:01:07 d2.checkpoint.c2_model_loading]: Renaming Caffe2 weights ......
[08/07 19:01:07 d2.checkpoint.c2_model_loading]: Following weights matched with submodule backbone.bottom_up - Total num: 54


backbone.fpn_lateral2.{bias, weight}
backbone.fpn_lateral3.{bias, weight}
backbone.fpn_lateral4.{bias, weight}
backbone.fpn_lateral5.{bias, weight}
backbone.fpn_output2.{bias, weight}
backbone.fpn_output3.{bias, weight}
backbone.fpn_output4.{bias, weight}
backbone.fpn_output5.{bias, weight}
proposal_generator.rpn_head.anchor_deltas.{bias, weight}
proposal_generator.rpn_head.conv.{bias, weight}
proposal_generator.rpn_head.objectness_logits.{bias, weight}
roi_heads.box_head.fc1.{bias, weight}
roi_heads.box_head.fc2.{bias, weight}
roi_heads.box_predictor.bbox_pred.{bias, weight}
roi_heads.box_predictor.cls_score.{bias, weight}
roi_heads.mask_head.deconv.{bias, weight}
roi_heads.mask_head.mask_fcn1.{bias, weight}
roi_heads.mask_head.mask_fcn2.{bias, weight}
roi_heads.mask_head.mask_fcn3.{bias, weight}
roi_heads.mask_head.mask_fcn4.{bias, weight}
roi_heads.mask_head.predictor.{bias, weight}
  fc1000.{bias, weight}
  stem.conv1.bias


Starting training for 5000 iterations...
Model will be saved to: /content/drive/My Drive/PapyrusNU_Detectron/models/hieroglyph_model_20250807_190054
Training started at: 2025-08-07 19:01:07.450879
------------------------------------------------------------
[08/07 19:01:07 d2.engine.train_loop]: Starting training from iteration 0
ERROR [08/07 19:01:07 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/content/detectron2_repo/detectron2/engine/train_loop.py", line 155, in train
    self.run_step()
  File "/tmp/ipython-input-473624132.py", line 34, in run_step
    data = next(self._trainer._data_loader_iter)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/detectron2_repo/detectron2/data/common.py", line 329, in __iter__
    for d in self.dataset:
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py", line 734, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/

AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/data/_utils/worker.py", line 349, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/data/_utils/fetch.py", line 33, in fetch
    data.append(next(self.dataset_iter))
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/detectron2_repo/detectron2/data/common.py", line 296, in __iter__
    yield self.dataset[idx]
          ~~~~~~~~~~~~^^^^^
  File "/content/detectron2_repo/detectron2/data/common.py", line 125, in __getitem__
    data = self._map_func(self._dataset[cur_idx])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/detectron2_repo/detectron2/utils/serialize.py", line 26, in __call__
    return self._obj(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/ipython-input-90798842.py", line 8, in custom_mapper
    dataset_dict = utils.deepcopy(dataset_dict)  # it will be modified by code below
                   ^^^^^^^^^^^^^^
AttributeError: module 'detectron2.data.detection_utils' has no attribute 'deepcopy'


## Model Evaluation

In [None]:
# Load trained model for evaluation
print("Loading trained model for evaluation...")

# Update config for inference
cfg.MODEL.WEIGHTS = os.path.join(output_dir, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # Detection threshold
cfg.DATASETS.TEST = ("hieroglyphs_val",)

# Create predictor
predictor = DefaultPredictor(cfg)

print("Model loaded successfully!")
print(f"Detection threshold: {cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST}")

# Run evaluation on validation set
print("Running evaluation on validation set...")

evaluator = COCOEvaluator("hieroglyphs_val", cfg, False, output_dir)
val_loader = build_detection_test_loader(cfg, "hieroglyphs_val")

eval_results = inference_on_dataset(predictor.model, val_loader, evaluator)

print("\n📊 Validation Results:")
print(f"AP (Average Precision): {eval_results['bbox']['AP']:.3f}")
print(f"AP50 (AP @ IoU=0.5): {eval_results['bbox']['AP50']:.3f}")
print(f"AP75 (AP @ IoU=0.75): {eval_results['bbox']['AP75']:.3f}")
print(f"APs (Small objects): {eval_results['bbox']['APs']:.3f}")
print(f"APm (Medium objects): {eval_results['bbox']['APm']:.3f}")
print(f"APl (Large objects): {eval_results['bbox']['APl']:.3f}")

if 'segm' in eval_results:
    print(f"\nSegmentation Results:")
    print(f"Mask AP: {eval_results['segm']['AP']:.3f}")
    print(f"Mask AP50: {eval_results['segm']['AP50']:.3f}")
    print(f"Mask AP75: {eval_results['segm']['AP75']:.3f}")

# Save evaluation results
eval_file = os.path.join(output_dir, "validation_results.json")
with open(eval_file, 'w') as f:
    json.dump(eval_results, f, indent=2)
print(f"\nValidation results saved to: {eval_file}")

## Training Summary

In [None]:
# Create comprehensive training summary
training_summary = {
    'model_info': {
        'architecture': 'Mask R-CNN with ResNet-50 FPN',
        'num_classes': len(category_names),
        'training_iterations': cfg.SOLVER.MAX_ITER,
        'batch_size': cfg.SOLVER.IMS_PER_BATCH,
        'learning_rate': cfg.SOLVER.BASE_LR,
        'timestamp': timestamp
    },
    'dataset_info': {
        'train_annotations': len(train_data['annotations']),
        'val_annotations': len(val_data['annotations']),
        'test_annotations': len(test_data['annotations']),
        'categories': len(category_names),
        'category_id_remapping': 'Applied - mapped from original COCO IDs to 0-indexed'
    },
    'training_results': {
        'training_duration_hours': training_duration / 3600 if 'training_duration' in locals() else None,
        'final_loss': trainer.training_history['total_loss'][-1] if trainer.training_history['total_loss'] else None
    },
    'validation_results': eval_results if 'eval_results' in locals() else None,
    'files_created': [
        'model_final.pth',
        'config.yaml',
        'training_history.json',
        'validation_results.json'
    ]
}

# Save training summary
summary_file = os.path.join(output_dir, "training_summary.json")
with open(summary_file, 'w') as f:
    json.dump(training_summary, f, indent=2, default=str)

# Create model info file for easy loading
model_info = {
    'model_path': os.path.join(output_dir, "model_final.pth"),
    'config_path': os.path.join(output_dir, "config.yaml"),
    'num_classes': len(category_names),
    'category_names': category_names,
    'category_id_mapping': category_id_map,
    'detection_threshold': cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST,
    'timestamp': timestamp
}

model_info_file = os.path.join(output_dir, "model_info.json")
with open(model_info_file, 'w') as f:
    json.dump(model_info, f, indent=2)

print("📋 TRAINING COMPLETED SUCCESSFULLY!")
print("=" * 60)
print(f"✅ Model trained and saved to: {output_dir}")
print(f"✅ Training summary saved to: {summary_file}")
print(f"✅ Model info saved to: {model_info_file}")

if 'eval_results' in locals():
    print(f"\n📊 Final Performance Metrics:")
    print(f"   AP (Average Precision): {eval_results['bbox']['AP']:.3f}")
    print(f"   AP50: {eval_results['bbox']['AP50']:.3f}")
    print(f"   AP75: {eval_results['bbox']['AP75']:.3f}")

print(f"\n🚀 NEXT STEPS:")
print(f"1. Run notebook 04_model_evaluation.ipynb for detailed evaluation")
print(f"2. Test the model on the test set")
print(f"3. Analyze failure cases and model performance")
print(f"4. Consider fine-tuning or adjusting hyperparameters if needed")

print(f"\n📁 Output Directory Contents:")
for file in os.listdir(output_dir):
    file_path = os.path.join(output_dir, file)
    if os.path.isfile(file_path):
        size_mb = os.path.getsize(file_path) / (1024 * 1024)
        print(f"   {file}: {size_mb:.1f} MB")

In [13]:
# Initialize trainer
print("Initializing trainer...")
trainer = HieroglyphTrainer(cfg)
trainer.resume_or_load(resume=False)

print(f"Starting training for {cfg.SOLVER.MAX_ITER} iterations...")
print(f"Model will be saved to: {output_dir}")
print(f"Training started at: {datetime.now()}")
print("-" * 60)

# Start training
start_time = time.time()

try:
    trainer.train()
    print("\n✅ Training completed successfully!")
except Exception as e:
    print(f"❌ Training failed with error: {e}")
    raise
finally:
    end_time = time.time()
    training_duration = end_time - start_time
    print(f"\nTraining duration: {training_duration/3600:.2f} hours")

    # Save training history
    trainer.save_training_history()

    # Save final model
    trainer.checkpointer.save("model_final")
    print(f"Final model saved to: {output_dir}/model_final.pth")

Initializing trainer...
[08/07 19:01:46 d2.engine.defaults]: Model:
GeneralizedRCNN(
  (backbone): FPN(
    (fpn_lateral2): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral3): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral4): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral5): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output5): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (top_block): LastLevelMaxPool()
    (bottom_up): ResNet(
      (stem): BasicStem(
        (conv1): Conv2d(
          3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
    

backbone.fpn_lateral2.{bias, weight}
backbone.fpn_lateral3.{bias, weight}
backbone.fpn_lateral4.{bias, weight}
backbone.fpn_lateral5.{bias, weight}
backbone.fpn_output2.{bias, weight}
backbone.fpn_output3.{bias, weight}
backbone.fpn_output4.{bias, weight}
backbone.fpn_output5.{bias, weight}
proposal_generator.rpn_head.anchor_deltas.{bias, weight}
proposal_generator.rpn_head.conv.{bias, weight}
proposal_generator.rpn_head.objectness_logits.{bias, weight}
roi_heads.box_head.fc1.{bias, weight}
roi_heads.box_head.fc2.{bias, weight}
roi_heads.box_predictor.bbox_pred.{bias, weight}
roi_heads.box_predictor.cls_score.{bias, weight}
roi_heads.mask_head.deconv.{bias, weight}
roi_heads.mask_head.mask_fcn1.{bias, weight}
roi_heads.mask_head.mask_fcn2.{bias, weight}
roi_heads.mask_head.mask_fcn3.{bias, weight}
roi_heads.mask_head.mask_fcn4.{bias, weight}
roi_heads.mask_head.predictor.{bias, weight}
  fc1000.{bias, weight}
  stem.conv1.bias


Starting training for 5000 iterations...
Model will be saved to: /content/drive/My Drive/PapyrusNU_Detectron/models/hieroglyph_model_20250807_190054
Training started at: 2025-08-07 19:01:47.149647
------------------------------------------------------------
[08/07 19:01:47 d2.engine.train_loop]: Starting training from iteration 0


  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]


[08/07 19:02:38 d2.utils.events]:  eta: 2:19:22  iter: 19  total_loss: 7.09  loss_cls: 4.859  loss_box_reg: 0.02399  loss_mask: 0.6916  loss_rpn_cls: 0.7007  loss_rpn_loc: 0.8201    time: 2.4078  last_time: 0.1041  data_time: 2.3014  last_data_time: 0.0024   lr: 9.7405e-06  max_mem: 11355M
[08/07 19:03:28 d2.utils.events]:  eta: 1:23:12  iter: 39  total_loss: 6.402  loss_cls: 4.228  loss_box_reg: 0.02331  loss_mask: 0.6915  loss_rpn_cls: 0.6961  loss_rpn_loc: 0.7546    time: 2.3674  last_time: 0.1077  data_time: 2.2341  last_data_time: 0.0029   lr: 1.973e-05  max_mem: 11799M
[08/07 19:04:18 d2.utils.events]:  eta: 1:22:52  iter: 59  total_loss: 5.031  loss_cls: 3.005  loss_box_reg: 0.02385  loss_mask: 0.6912  loss_rpn_cls: 0.6886  loss_rpn_loc: 0.6279    time: 2.4137  last_time: 0.1136  data_time: 2.2810  last_data_time: 0.0029   lr: 2.972e-05  max_mem: 11799M
[08/07 19:05:08 d2.utils.events]:  eta: 1:22:32  iter: 79  total_loss: 3.833  loss_cls: 1.933  loss_box_reg: 0.02344  loss_mask