# Sequential Multi-Task IQA Training Pipeline

This notebook trains three tasks sequentially:
1. **Stage 1**: Scene Classification
2. **Stage 2**: Distortion Classification (building on Scene knowledge)
3. **Stage 3**: Quality Assessment (building on Scene + Distortion knowledge)

Each stage runs in its own group of cells, so if one fails, you can fix it and continue from that stage.
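When a stage fails partway through, the Hugging Face `Trainer` will normally have written `checkpoint-<step>` folders under that stage's output directory. The following is a minimal sketch for locating the most recent one, assuming that default folder naming. The helper name `latest_checkpoint` is hypothetical and not part of this notebook.

```python
import re
from pathlib import Path
from typing import Optional

# Matches the default Trainer checkpoint folder naming, e.g. "checkpoint-800".
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(stage_dir: str) -> Optional[Path]:
    """Return the checkpoint-<step> folder with the highest step, or None."""
    root = Path(stage_dir)
    if not root.is_dir():
        return None
    candidates = []
    for child in root.iterdir():
        match = _CHECKPOINT_RE.match(child.name)
        if child.is_dir() and match:
            candidates.append((int(match.group(1)), child))
    if not candidates:
        return None
    # Compare by numeric step so that checkpoint-1000 sorts after checkpoint-900.
    return max(candidates, key=lambda item: item[0])[1]
```

The returned path could be passed to `trainer.train(resume_from_checkpoint=str(path))`. Whether resuming works cleanly with the custom `IQAModelWrapper` has not been verified here.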

## Configuration

In [1]:
# Configuration parameters
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

# Training configuration
DATASET_PATHS = ["datasets/koniq-10k/"]  # Change to your dataset
OUTPUT_DIR = "outputs/10310800_full_2"
BASE_MODEL = "src/owl3"

# Task selection - Configure which tasks to train
WITH_SCENE = True       # Train scene classification task
WITH_DISTORTION = True  # Train distortion classification task

# Training hyperparameters
MAX_STEPS = -1  # Number of steps per stage (-1 for full epochs)
NUM_TRAIN_EPOCHS = 3
BATCH_SIZE = 1
GRAD_ACCUM = 8
LEARNING_RATE = 2e-4
EVAL_STEPS = 100
SAVE_STEPS = 100
LOGGING_STEPS = 50

# Early stopping configuration
EARLY_STOPPING_PATIENCE = 5  # Stop if no improvement after 5 evaluations

# LoRA parameters
LORA_R = 16
LORA_ALPHA = 32
LORA_DROPOUT = 0.05

# Loss configuration: fidelity loss weight is 1.0 when True, 0.0 when False
USE_FIDELITY_LOSS = True

print("✅ Configuration set!")
print(f"📁 Dataset: {DATASET_PATHS}")
print(f"📁 Output: {OUTPUT_DIR}")
print(f"🎯 Training: {NUM_TRAIN_EPOCHS} epochs, max {MAX_STEPS} steps per stage")
print(f"⚙️  Batch Size: {BATCH_SIZE} × {GRAD_ACCUM} = {BATCH_SIZE * GRAD_ACCUM}")
print(f"🛑 Early Stopping: patience={EARLY_STOPPING_PATIENCE}")
print()
print(f"📋 Tasks to train:")
print(f"   Scene:       {'✅ Enabled' if WITH_SCENE else '❌ Disabled'}")
print(f"   Distortion:  {'✅ Enabled' if WITH_DISTORTION else '❌ Disabled'}")
print(f"   Quality:     ✅ Enabled (always trained)")
print()
print(f"📊 Dataset will use:")
print(f"   use_scene_labels={WITH_SCENE}")
print(f"   use_distortion_labels={WITH_DISTORTION}")

✅ Configuration set!
📁 Dataset: ['datasets/koniq-10k/']
📁 Output: outputs/10310800_full_2
🎯 Training: 3 epochs, max -1 steps per stage
⚙️  Batch Size: 1 × 8 = 8
🛑 Early Stopping: patience=5

📋 Tasks to train:
   Scene:       ✅ Enabled
   Distortion:  ✅ Enabled
   Quality:     ✅ Enabled (always trained)

📊 Dataset will use:
   use_scene_labels=True
   use_distortion_labels=True


## Imports and Setup

In [2]:
import sys
from pathlib import Path
import torch

from transformers import (
    AutoTokenizer,
    TrainingArguments,
    set_seed,
)

# Add src to path
sys.path.insert(0, str(Path.cwd()))

from src.new_train.model_wrapper import IQAModelWrapper
from src.new_train.dataset_adapter import IQAPairDataset, collate_fn_pair
from src.new_train.processor_no_cut import create_processor_no_cut
from src.new_train.iqa_trainer import IQATrainer
from src.new_train.plot_utils import plot_training_curves

# Import collate functions
from src.new_train.train_scene import collate_fn_scene
from src.new_train.train_distortion import collate_fn_distortion
from transformers import EarlyStoppingCallback

# Set seed
set_seed(42)

print("✅ Imports completed!")

✅ Imports completed!




## Initialize Model (Run Once)

In [3]:
dataset_paths = [Path(p) for p in DATASET_PATHS]

# Create output directory
Path(OUTPUT_DIR).mkdir(parents=True, exist_ok=True)

print("🔧 Loading tokenizer and processor...")
tokenizer = AutoTokenizer.from_pretrained(
    BASE_MODEL,
    trust_remote_code=True,
)
processor = create_processor_no_cut(tokenizer)

print("🔧 Initializing model with LoRA...")
model = IQAModelWrapper(
    model_name_or_path=BASE_MODEL,
    lora_r=LORA_R,
    lora_alpha=LORA_ALPHA,
    lora_dropout=LORA_DROPOUT,
    weight_fidelity=1.0 if USE_FIDELITY_LOSS else 0.0,
)

print("\n✅ Model initialized!")
print(f"📊 Model will be trained sequentially on 3 tasks")

🔧 Loading tokenizer and processor...
🔧 Initializing model with LoRA...
use flash_attn rotary


HyperQwen2ForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
  - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
  - If you are not the owner of the model architecture class, please contact the model code owner to update it.


trainable params: 43,356,160 || all params: 8,115,903,040 || trainable%: 0.5342

✅ Model initialized!
📊 Model will be trained sequentially on 3 tasks
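The `trainable params` line above can be sanity-checked with a little arithmetic. The sketch below recomputes the trainable percentage from the two logged counts and shows the usual LoRA bookkeeping: in the standard formulation, a rank-`r` adapter on a linear layer with `d_in` inputs and `d_out` outputs adds `r * (d_in + d_out)` weights, and its update is scaled by `lora_alpha / r`. The 4096-wide layer is a made-up example; the notebook does not state which modules `IQAModelWrapper` adapts.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Weights added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def trainable_percent(trainable: int, total: int) -> float:
    """Share of trainable parameters, in percent."""
    return 100.0 * trainable / total


LORA_R, LORA_ALPHA = 16, 32          # values from the Configuration cell
scaling = LORA_ALPHA / LORA_R        # standard LoRA scaling factor
pct = trainable_percent(43_356_160, 8_115_903_040)  # counts from the log line
example = lora_param_count(4096, 4096, LORA_R)      # hypothetical layer size

print(f"scaling={scaling}, trainable%={pct:.4f}, example adapter={example:,}")
```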


---
## Stage 1: Scene Classification Training

Train the model to classify scene types (e.g., landscape, cityscape, human).

In [4]:
if not WITH_SCENE:
    print("="*80)
    print("⏭️  STAGE 1/3: Scene Classification Training - SKIPPED")
    print("="*80)
    print("Scene training is disabled in configuration.")
else:
    print("="*80)
    print("STAGE 1/3: Scene Classification Training")
    print("="*80)

    # Create dataset
    print("\n📊 Creating scene classification dataset...")
    dataset_paths = [Path(p) for p in DATASET_PATHS]
    train_dataset_scene = IQAPairDataset(
        dataset_paths=dataset_paths,
        processor=processor,
        tokenizer=tokenizer,
        split="training",
        use_scene_labels=WITH_SCENE,
        use_distortion_labels=WITH_DISTORTION,
    )

    val_dataset_scene = IQAPairDataset(
        dataset_paths=dataset_paths,
        processor=processor,
        tokenizer=tokenizer,
        split="validation",
        use_scene_labels=WITH_SCENE,
        use_distortion_labels=WITH_DISTORTION,
    )

    print(f"✅ Training dataset size: {len(train_dataset_scene)}")
    print(f"✅ Validation dataset size: {len(val_dataset_scene)}")

STAGE 1/3: Scene Classification Training

📊 Creating scene classification dataset...
✅ Training dataset size: 7252
✅ Validation dataset size: 1813


In [5]:
if not WITH_SCENE:
    print("⏭️  Skipping Scene training arguments configuration...")
else:
    # Training arguments for Scene task
    output_dir_scene = f"{OUTPUT_DIR}/01_scene"
    training_args_scene = TrainingArguments(
        output_dir=output_dir_scene,
        num_train_epochs=NUM_TRAIN_EPOCHS if MAX_STEPS <= 0 else 1,
        max_steps=MAX_STEPS if MAX_STEPS > 0 else -1,
        per_device_train_batch_size=BATCH_SIZE,
        per_device_eval_batch_size=BATCH_SIZE,
        gradient_accumulation_steps=GRAD_ACCUM,
        learning_rate=LEARNING_RATE,
        lr_scheduler_type="cosine",
        warmup_ratio=0.03,
        weight_decay=0.0,
        logging_steps=LOGGING_STEPS,
        eval_strategy="steps",
        eval_steps=EVAL_STEPS,
        save_strategy="steps",
        save_steps=SAVE_STEPS,
        save_total_limit=2,
        bf16=True,
        dataloader_num_workers=12,
        remove_unused_columns=False,
        report_to="none",
        load_best_model_at_end=True,  # Load best model based on eval_loss
        metric_for_best_model="eval_loss",
        greater_is_better=False,  # Lower loss is better
    )

    print("✅ Training arguments configured for Scene task")
    print("   📌 Will load best model (lowest eval_loss) at end")
    print(f"   📌 Early stopping: patience={EARLY_STOPPING_PATIENCE}")

✅ Training arguments configured for Scene task
   📌 Will load best model (lowest eval_loss) at end
   📌 Early stopping: patience=5
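With the training set size reported above (7252 samples) and the batch settings from the configuration, the length of each stage can be estimated by hand. This is a rough sketch that assumes a single device; the exact rounding inside `Trainer` varies between `transformers` versions, so treat the numbers as approximate.

```python
import math

train_size = 7252            # training dataset size printed in Stage 1
batch_size, grad_accum = 1, 8
num_epochs = 3
warmup_ratio = 0.03
eval_steps = 100

# One optimizer step consumes batch_size * grad_accum samples (single device assumed).
steps_per_epoch = train_size // (batch_size * grad_accum)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)
max_evaluations = total_steps // eval_steps  # upper bound if early stopping never fires

print(steps_per_epoch, total_steps, warmup_steps, max_evaluations)
```

Under these assumptions, step 100 falls at roughly epoch 100 / 906 ≈ 0.11, which matches the `Epoch 0.11` lines in the training logs.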


In [6]:
if not WITH_SCENE:
    print("⏭️  Skipping Scene trainer creation...")
else:
    # Custom trainer for scene task
    class SceneTrainer(IQATrainer):
        def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
            outputs = model.forward_scene_task(
                pixel_values_A=inputs["pixel_values_A"],
                input_ids_scene_A=inputs["input_ids_scene_A"],
                attention_mask_scene_A=inputs["attention_mask_scene_A"],
                labels_scene_A=inputs["labels_scene_A"],
                media_offset_A=inputs["media_offset_A"],
                pixel_values_B=inputs["pixel_values_B"],
                input_ids_scene_B=inputs["input_ids_scene_B"],
                attention_mask_scene_B=inputs["attention_mask_scene_B"],
                labels_scene_B=inputs["labels_scene_B"],
                media_offset_B=inputs["media_offset_B"],
            )
            loss = outputs["loss"]
            return (loss, outputs) if return_outputs else loss
        
        def prediction_step(self, model, inputs, prediction_loss_only: bool, ignore_keys=None):
            has_labels = "labels_scene_A" in inputs and "labels_scene_B" in inputs
            with torch.no_grad():
                if has_labels:
                    loss, outputs = self.compute_loss(model, inputs, return_outputs=True)
                    loss = loss.mean().detach()
                else:
                    loss = None
            return (loss, None, None)

    # Create early stopping callback
    early_stopping_callback = EarlyStoppingCallback(
        early_stopping_patience=EARLY_STOPPING_PATIENCE,
        early_stopping_threshold=0.0,  # Any improvement counts
    )

    # Create trainer
    trainer_scene = SceneTrainer(
        model=model,
        args=training_args_scene,
        train_dataset=train_dataset_scene,
        eval_dataset=val_dataset_scene,
        data_collator=collate_fn_scene,
        callbacks=[early_stopping_callback],
    )

    print("✅ Scene trainer created!")
    print(f"   🛑 Early stopping enabled: patience={EARLY_STOPPING_PATIENCE}")

✅ Scene trainer created!
   🛑 Early stopping enabled: patience=5
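To make the patience setting concrete, here is a simplified sketch of the counter that early stopping maintains, applied to the Stage 1 validation losses logged in this notebook. It approximates the behavior of `EarlyStoppingCallback` with `early_stopping_threshold=0.0` and is not the library's actual code; `simulate_early_stopping` is a hypothetical helper.

```python
def simulate_early_stopping(eval_losses, patience, threshold=0.0):
    """Return (best_index, stop_index); stop_index is None if never triggered."""
    best_loss = float("inf")
    best_index = None
    bad_evals = 0
    for i, loss in enumerate(eval_losses):
        if best_loss - loss > threshold:   # strictly better than the best so far
            best_loss, best_index, bad_evals = loss, i, 0
        else:
            bad_evals += 1                 # one more evaluation without improvement
            if bad_evals >= patience:
                return best_index, i
    return best_index, None


# Validation losses copied from the Stage 1 log table (steps 100 to 800).
steps = [100, 200, 300, 400, 500, 600, 700, 800]
val_loss = [0.042632, 0.03742, 0.032693, 0.033466,
            0.037936, 0.043953, 0.035021, 0.034282]

best, stop = simulate_early_stopping(val_loss, patience=5)
print(f"best at step {steps[best]}, stopped at step {steps[stop]}")
```

The simulation picks step 300 as the best evaluation and stops five evaluations later at step 800, which is consistent with where the Stage 1 log table ends.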


In [7]:
if not WITH_SCENE:
    print("⏭️  Skipping Scene training...")
else:
    # Train scene model
    print("\n" + "="*60)
    print("🔄 Starting STAGE 1: Scene Training")
    print("="*60)
    
    trainer_scene.train()
    
    print("✅ Stage 1 (Scene) completed!")
    print(f"Training logs in: {output_dir_scene}/")


🔄 Starting STAGE 1: Scene Training


The attention layers in this model are transitioning from computing the RoPE embeddings internally through `position_ids` (2D tensor with the indexes of the tokens), to using externally computed `position_embeddings` (Tuple of tensors, containing cos and sin). In v4.46 `position_ids` will be removed and `position_embeddings` will be mandatory.


Step,Training Loss,Validation Loss
100,0.0545,0.042632
200,0.0352,0.03742
300,0.0307,0.032693
400,0.0351,0.033466
500,0.024,0.037936
600,0.0248,0.043953
700,0.0225,0.035021
800,0.023,0.034282


Epoch 0.06 | Loss: 0.3500
Epoch 0.11 | Loss: 0.0545

[DEBUG] Collected 0 predictions
[DEBUG] Collected 0 loss_ce values
[DEBUG] Collected 0 loss_kl values
[DEBUG] Collected 0 loss_fidelity values

📊 Validation Results at Epoch 0.11
  Total Loss: 0.042632


Epoch 0.17 | Loss: 0.0406
Epoch 0.22 | Loss: 0.0352

[DEBUG] Collected 0 predictions
[DEBUG] Collected 0 loss_ce values
[DEBUG] Collected 0 loss_kl values
[DEBUG] Collected 0 loss_fidelity values

📊 Validation Results at Epoch 0.22
  Total Loss: 0.037420


Ep

---
## Stage 2: Distortion Classification Training

Build on Scene knowledge to classify distortion types (e.g., blur, noise, compression).

In [8]:
if not WITH_DISTORTION:
    print("="*80)
    print("⏭️  STAGE 2/3: Distortion Classification Training - SKIPPED")
    print("="*80)
    print("Distortion training is disabled in configuration.")
else:
    print("="*80)
    print("STAGE 2/3: Distortion Classification Training")
    print("="*80)

    # Create dataset
    print("\n📊 Creating distortion classification dataset...")
    train_dataset_distortion = IQAPairDataset(
        dataset_paths=dataset_paths,
        processor=processor,
        tokenizer=tokenizer,
        split="training",
        use_scene_labels=WITH_SCENE,
        use_distortion_labels=WITH_DISTORTION,
    )

    val_dataset_distortion = IQAPairDataset(
        dataset_paths=dataset_paths,
        processor=processor,
        tokenizer=tokenizer,
        split="validation",
        use_scene_labels=WITH_SCENE,
        use_distortion_labels=WITH_DISTORTION,
    )

    print(f"✅ Training dataset size: {len(train_dataset_distortion)}")
    print(f"✅ Validation dataset size: {len(val_dataset_distortion)}")

STAGE 2/3: Distortion Classification Training

📊 Creating distortion classification dataset...
✅ Training dataset size: 7252
✅ Validation dataset size: 1813


In [9]:
if not WITH_DISTORTION:
    print("⏭️  Skipping Distortion training arguments configuration...")
else:
    # Training arguments for Distortion task
    output_dir_distortion = f"{OUTPUT_DIR}/02_distortion"
    training_args_distortion = TrainingArguments(
        output_dir=output_dir_distortion,
        num_train_epochs=NUM_TRAIN_EPOCHS if MAX_STEPS <= 0 else 1,
        max_steps=MAX_STEPS if MAX_STEPS > 0 else -1,
        per_device_train_batch_size=BATCH_SIZE,
        per_device_eval_batch_size=BATCH_SIZE,
        gradient_accumulation_steps=GRAD_ACCUM,
        learning_rate=LEARNING_RATE,
        lr_scheduler_type="cosine",
        warmup_ratio=0.03,
        weight_decay=0.0,
        logging_steps=LOGGING_STEPS,
        eval_strategy="steps",
        eval_steps=EVAL_STEPS,
        save_strategy="steps",
        save_steps=SAVE_STEPS,
        save_total_limit=2,
        bf16=True,
        dataloader_num_workers=12,
        remove_unused_columns=False,
        report_to="none",
        load_best_model_at_end=True,
        metric_for_best_model="eval_loss",
        greater_is_better=False,
    )

    print("✅ Training arguments configured for Distortion task")
    print("   📌 Will load best model (lowest eval_loss) at end")
    print(f"   📌 Early stopping: patience={EARLY_STOPPING_PATIENCE}")

✅ Training arguments configured for Distortion task
   📌 Will load best model (lowest eval_loss) at end
   📌 Early stopping: patience=5


In [10]:
if not WITH_DISTORTION:
    print("⏭️  Skipping Distortion trainer class definition...")
else:
    # Custom trainer for distortion task
    class DistortionTrainer(IQATrainer):
        def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
            outputs = model.forward_distortion_task(
                pixel_values_A=inputs["pixel_values_A"],
                input_ids_distortion_A=inputs["input_ids_distortion_A"],
                attention_mask_distortion_A=inputs["attention_mask_distortion_A"],
                labels_distortion_A=inputs["labels_distortion_A"],
                media_offset_A=inputs["media_offset_A"],
                pixel_values_B=inputs["pixel_values_B"],
                input_ids_distortion_B=inputs["input_ids_distortion_B"],
                attention_mask_distortion_B=inputs["attention_mask_distortion_B"],
                labels_distortion_B=inputs["labels_distortion_B"],
                media_offset_B=inputs["media_offset_B"],
            )
            loss = outputs["loss"]
            return (loss, outputs) if return_outputs else loss
        
        def prediction_step(self, model, inputs, prediction_loss_only: bool, ignore_keys=None):
            has_labels = "labels_distortion_A" in inputs and "labels_distortion_B" in inputs
            with torch.no_grad():
                if has_labels:
                    loss, outputs = self.compute_loss(model, inputs, return_outputs=True)
                    loss = loss.mean().detach()
                else:
                    loss = None
            return (loss, None, None)
    
    print("✅ DistortionTrainer class defined!")

✅ DistortionTrainer class defined!


In [11]:
if not WITH_DISTORTION:
    print("⏭️  Skipping Distortion trainer creation...")
else:
    # Create early stopping callback
    early_stopping_callback = EarlyStoppingCallback(
        early_stopping_patience=EARLY_STOPPING_PATIENCE,
        early_stopping_threshold=0.0,
    )

    # Create trainer
    trainer_distortion = DistortionTrainer(
        model=model,
        args=training_args_distortion,
        train_dataset=train_dataset_distortion,
        eval_dataset=val_dataset_distortion,
        data_collator=collate_fn_pair,
        callbacks=[early_stopping_callback],
    )
    
    print("✅ Distortion trainer created!")
    print(f"   🛑 Early stopping enabled: patience={EARLY_STOPPING_PATIENCE}")

✅ Distortion trainer created!
   🛑 Early stopping enabled: patience=5


In [12]:
if not WITH_DISTORTION:
    print("⏭️  Skipping Distortion training...")
else:
    # Train distortion model
    print("\n" + "="*60)
    print("🔄 Starting STAGE 2: Distortion Training")
    print("="*60)
    
    trainer_distortion.train()
    
    print("✅ Stage 2 (Distortion) completed!")
    print(f"   📊 Training logs in: {output_dir_distortion}/")


🔄 Starting STAGE 2: Distortion Training


Step,Training Loss,Validation Loss
100,0.0564,0.05146
200,0.0423,0.052788
300,0.044,0.048393
400,0.4837,0.045239
500,0.0522,0.047829
600,0.0413,0.053623
700,0.0526,0.052859
800,0.0376,0.046992
900,0.0373,0.049099


Epoch 0.06 | Loss: 0.1300
Epoch 0.11 | Loss: 0.0564

[DEBUG] Collected 0 predictions
[DEBUG] Collected 0 loss_ce values
[DEBUG] Collected 0 loss_kl values
[DEBUG] Collected 0 loss_fidelity values

📊 Validation Results at Epoch 0.11
  Total Loss: 0.051460


Epoch 0.17 | Loss: 0.0455
Epoch 0.22 | Loss: 0.0423

[DEBUG] Collected 0 predictions
[DEBUG] Collected 0 loss_ce values
[DEBUG] Collected 0 loss_kl values
[DEBUG] Collected 0 loss_fidelity values

📊 Validation Results at Epoch 0.22
  Total Loss: 0.052788


Ep

---
## Stage 3: Quality Assessment Training

Build on Scene + Distortion knowledge to predict image quality scores.

In [13]:
print("="*80)
print("STAGE 3/3: Quality Assessment Training")
print("="*80)

# Create dataset
print("\n📊 Creating quality assessment dataset...")
train_dataset_quality = IQAPairDataset(
    dataset_paths=dataset_paths,
    processor=processor,
    tokenizer=tokenizer,
    split="training",
    use_scene_labels=WITH_SCENE,
    use_distortion_labels=WITH_DISTORTION,
)

val_dataset_quality = IQAPairDataset(
    dataset_paths=dataset_paths,
    processor=processor,
    tokenizer=tokenizer,
    split="validation",
    use_scene_labels=WITH_SCENE,
    use_distortion_labels=WITH_DISTORTION,
)

print(f"✅ Training dataset size: {len(train_dataset_quality)}")
print(f"✅ Validation dataset size: {len(val_dataset_quality)}")

STAGE 3/3: Quality Assessment Training

📊 Creating quality assessment dataset...
✅ Training dataset size: 7252
✅ Validation dataset size: 1813


In [14]:
# Training arguments for Quality task
output_dir_quality = f"{OUTPUT_DIR}/03_quality"
training_args_quality = TrainingArguments(
    output_dir=output_dir_quality,
    num_train_epochs=NUM_TRAIN_EPOCHS if MAX_STEPS <= 0 else 1,
    max_steps=MAX_STEPS if MAX_STEPS > 0 else -1,
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=BATCH_SIZE,
    gradient_accumulation_steps=GRAD_ACCUM,
    learning_rate=LEARNING_RATE,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.0,
    logging_steps=LOGGING_STEPS,
    eval_strategy="steps",
    eval_steps=EVAL_STEPS,
    save_strategy="steps",
    save_steps=SAVE_STEPS,
    save_total_limit=2,
    bf16=True,
    dataloader_num_workers=12,
    remove_unused_columns=False,
    report_to="none",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

print("✅ Training arguments configured for Quality task")
print("   📌 Will load best model (lowest eval_loss) at end")
print(f"   📌 Early stopping: patience={EARLY_STOPPING_PATIENCE}")

✅ Training arguments configured for Quality task
   📌 Will load best model (lowest eval_loss) at end
   📌 Early stopping: patience=5


In [15]:
# Create early stopping callback
early_stopping_callback = EarlyStoppingCallback(
    early_stopping_patience=EARLY_STOPPING_PATIENCE,
    early_stopping_threshold=0.0,
)

# Create trainer
trainer_quality = IQATrainer(
    model=model,
    args=training_args_quality,
    train_dataset=train_dataset_quality,
    eval_dataset=val_dataset_quality,
    data_collator=collate_fn_pair,
    callbacks=[early_stopping_callback],
)

print("✅ Quality trainer created!")
print(f"   🛑 Early stopping enabled: patience={EARLY_STOPPING_PATIENCE}")

✅ Quality trainer created!
   🛑 Early stopping enabled: patience=5


In [16]:
# Train quality model
print("\n" + "="*60)
print("🔄 Starting STAGE 3: Quality Training")
print("="*60)

trainer_quality.train()

print("✅ Stage 3 (Quality) completed!")
print(f"   📊 Training logs in: {output_dir_quality}/")

# Auto-save final model with tokenizer for evaluation
print("\n📦 Saving final model (model + tokenizer)...")
final_path = f"{OUTPUT_DIR}/final_model"
model.model.save_pretrained(final_path)
tokenizer.save_pretrained(final_path)
print(f"✅ Final model saved to: {final_path}")
print(f"💡 Use this path for evaluation: {final_path}")

print("\n" + "="*60)
print("🎉 ALL TRAINING STAGES COMPLETED!")
print("="*60)


🔄 Starting STAGE 3: Quality Training


Step,Training Loss,Validation Loss,Loss Ce,Loss Kl,Loss Fidelity,Mae,Mse,Rmse,Plcc,Srcc
100,0.1114,0.141385,0.112645,0.453777,0.006051,0.313137,0.132844,0.364478,0.945565,0.927229
200,0.0957,0.140432,0.118255,0.317575,0.006298,0.190627,0.058983,0.242864,0.94542,0.925677
300,0.093,0.134704,0.109201,0.391578,0.005924,0.208298,0.065526,0.25598,0.951426,0.93402
400,0.0961,0.121248,0.101509,0.279982,0.005739,0.193744,0.059168,0.243245,0.950378,0.935299
500,0.1162,0.122957,0.102772,0.286195,0.005875,0.202009,0.062269,0.249537,0.951406,0.940978
600,0.0995,0.117853,0.102744,0.19942,0.005137,0.173669,0.051676,0.227325,0.953477,0.939735
700,0.0888,0.131348,0.113645,0.25738,0.004834,0.240209,0.080381,0.283515,0.957157,0.944501
800,0.0913,0.119288,0.103571,0.226819,0.004376,0.170165,0.051566,0.227081,0.960351,0.947013
900,0.0911,0.13308,0.111485,0.338211,0.004684,0.227314,0.072491,0.269241,0.959877,0.947271
1000,0.0753,0.138439,0.123205,0.206229,0.004922,0.180524,0.049899,0.223381,0.959401,0.943701


Epoch 0.06 | Loss: 0.3390
Epoch 0.11 | Loss: 0.1114

[DEBUG] Collected 1813 predictions
[DEBUG] Collected 1813 loss_ce values
[DEBUG] Collected 1813 loss_kl values
[DEBUG] Collected 1813 loss_fidelity values
[DEBUG] Average loss_ce: 0.112645
[DEBUG] Average loss_kl: 0.453777
[DEBUG] Average loss_fidelity: 0.006051
[DEBUG] Computing metrics: pred shape=(1813,), gt shape=(1813,)
[DEBUG] Computed metrics: {'mae': 0.3131371089521851, 'mse': 0.132843862129213, 'rmse': 0.36447751937425854, 'plcc': 0.9455645342231441, 'srcc': 0.9272293494295094}
[DEBUG] Added metrics to output.metrics: ['eval_mae', 'eval_mse', 'eval_rmse', 'eval_plcc', 'eval_srcc']

📊 Validation Results at Epoch 0.11
  Total Loss: 0.141385
  - CE Loss:      0.112645
  - KL Loss:      0.453777
  - Fidelity Loss: 0.006051

  MAE:        0.3131
  RMSE:       0.3645
  ------------------------------------------------------------------
  Correlation Metrics:
  PLCC:       0.9456  [██████████████████░░]
  S

---
## Save Final Model

In [17]:
# Note: Final model is already saved after Quality training
# This cell is optional - only run if you want to manually save again

print("="*80)
print("MANUAL MODEL SAVE (Optional)")
print("="*80)

final_path = f"{OUTPUT_DIR}/final_model"
print(f"📍 Model is already saved at: {final_path}")
print()
print("If you made changes after training and want to re-save:")
print("  1. Uncomment the lines below")
print("  2. Run this cell")
print()
# Uncomment to re-save:
# model.model.save_pretrained(final_path)
# tokenizer.save_pretrained(final_path)
# print(f"✅ Model re-saved to: {final_path}")

print("\n" + "="*80)
print("📊 Training Summary:")
print("="*80)
if WITH_SCENE:
    print(f"  ✅ Stage 1 (Scene):      {output_dir_scene}/")
else:
    print(f"  ⏭️  Stage 1 (Scene):      SKIPPED")
if WITH_DISTORTION:
    print(f"  ✅ Stage 2 (Distortion): {output_dir_distortion}/")
else:
    print(f"  ⏭️  Stage 2 (Distortion): SKIPPED")
print(f"  ✅ Stage 3 (Quality):    {output_dir_quality}/")
print(f"  📦 Final Model:          {final_path}/")
print()
print("🎯 To evaluate the model, run:")
print(f"   uv run -m eval_sequential_model --model_path {final_path} --dataset_paths {' '.join(DATASET_PATHS)} --split testing")

MANUAL MODEL SAVE (Optional)
📍 Model is already saved at: outputs/10310800_full_2/final_model

If you made changes after training and want to re-save:
  1. Uncomment the lines below
  2. Run this cell


📊 Training Summary:
  ✅ Stage 1 (Scene):      outputs/10310800_full_2/01_scene/
  ✅ Stage 2 (Distortion): outputs/10310800_full_2/02_distortion/
  ✅ Stage 3 (Quality):    outputs/10310800_full_2/03_quality/
  📦 Final Model:          outputs/10310800_full_2/final_model/

🎯 To evaluate the model, run:
   uv run -m eval_sequential_model --model_path outputs/10310800_full_2/final_model --dataset_paths datasets/koniq-10k/ --split testing


---
## Evaluation (Optional)

Evaluate the final model on the test set.

In [None]:
# You can evaluate on test set here if needed
# Example:
# test_dataset = IQAPairDataset(
#     dataset_paths=dataset_paths,
#     processor=processor,
#     tokenizer=tokenizer,
#     split="testing",
# )
# 
# test_results = trainer_quality.evaluate(test_dataset)
# print(test_results)

print("💡 To evaluate the model, use the eval_sequential_model.py script:")
print(f"   python eval_sequential_model.py --model_path {final_path} --dataset_paths {' '.join(DATASET_PATHS)} --split testing")

💡 To evaluate the model, use the eval_sequential_model.py script:
   python eval_sequential_model.py --model_path outputs/10310800_full_2/final_model --dataset_paths datasets/koniq-10k/ --split testing

