# Qwen Fine-tuning for IRG Pipeline
## Phase 2: Fine-tune Qwen 2.5 on Visual Reasoning Tasks

This notebook fine-tunes Qwen using LoRA/QLoRA for memory efficiency.

**GPU Required:** T4 x2 (32GB VRAM) recommended

**Estimated Time:** 3-4 hours

**Required Inputs:**
- Qwen model: Add from Kaggle datasets
- Training data: From Notebook 1 (upload as dataset)

### 1. Setup & Install Dependencies

In [1]:
# Install required packages
!pip install -q transformers>=4.35.0 \
    peft>=0.7.0 \
    bitsandbytes>=0.41.0 \
    accelerate>=0.25.0 \
    datasets \
    tqdm \
    pandas \
    wandb

print("‚úÖ Dependencies installed successfully")

[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
bigframes 2.12.0 requires google-cloud-bigquery-storage<3.0.0,>=2.30.0, which is not installed.
pylibcudf-cu12 25.2.2 requires pyarrow<20.0.0a0,>=14.0.0; platform_machine == "x86_64", but you have pyarrow 22.0.0 which is incompatible.
cudf-cu12 25.2.2 requires pyarrow<20.0.0a0,>=14.0.0; platform_machine == "x86_64", but you have pyarrow 22.0.0 which is incompatible.
bigframes 2.12.0 requires google-cloud-bigquery[bqstorage,pandas]>=3.31.0, but you have google-cloud-bigquery 3.25.0 which is incompatible.
bigframes 2.12.0 requires rich<14,>=12.4.4, but you have rich 14.1.0 which is incompatible.
libcugraph-cu12 25.6.0 requires libraft-cu12==25.6.*, but you have libraft-cu12 25.2.0 which is incompatible.
gradio 5.38.1 requires pydantic<2.12,>=2.0, but you have pydantic 2.12.0a1 which is incompatible.
cud

### 2. Check GPU & Memory

In [2]:
import torch
import os

print("GPU Information:")
print("="*60)
if torch.cuda.is_available():
    print(f"‚úì CUDA available: {torch.cuda.get_device_name(0)}")
    print(f"‚úì GPU Count: {torch.cuda.device_count()}")
    for i in range(torch.cuda.device_count()):
        print(f"  GPU {i}: {torch.cuda.get_device_name(i)}")
        mem_total = torch.cuda.get_device_properties(i).total_memory / (1024**3)
        print(f"    Total Memory: {mem_total:.2f} GB")
else:
    print("‚ö†Ô∏è  No GPU available - this notebook requires GPU!")
    print("   Go to Settings ‚Üí Accelerator ‚Üí GPU T4 x2")

# Check disk space
import shutil
total, used, free = shutil.disk_usage("/kaggle/working")
print(f"\nDisk Space:")
print(f"  Total: {total // (2**30)} GB")
print(f"  Free: {free // (2**30)} GB")

GPU Information:
‚úì CUDA available: Tesla T4
‚úì GPU Count: 2
  GPU 0: Tesla T4
    Total Memory: 14.74 GB
  GPU 1: Tesla T4
    Total Memory: 14.74 GB

Disk Space:
  Total: 19 GB
  Free: 19 GB


### 3. Verify Input Datasets

**Before running:** Make sure you've added these datasets as inputs:
1. Qwen model (search "qwen2.5" on Kaggle datasets)
2. Your training data from Notebook 1

In [3]:
import os

# Check for Qwen model - adjust path based on your dataset
# Common paths:
QWEN_PATHS = [
    "/kaggle/input/qwen2.5/transformers/3b-instruct/1"
]

QWEN_MODEL_PATH = None
for path in QWEN_PATHS:
    if os.path.exists(path):
        QWEN_MODEL_PATH = path
        print(f"‚úì Found Qwen model at: {path}")
        break

if QWEN_MODEL_PATH is None:
    print("‚ö†Ô∏è  Qwen model not found!")
    print("   Available inputs:")
    for item in os.listdir("/kaggle/input"):
        print(f"     - {item}")
    print("\n   Please add Qwen model as input dataset")

# Check for training data
TRAINING_DATA_PATHS = [
    "/kaggle/input/irg-1-dataset-generation/irg_training_data_improved",
    "/kaggle/input/irg_training_data",
]

TRAINING_DATA_PATH = None
for path in TRAINING_DATA_PATHS:
    if os.path.exists(path):
        TRAINING_DATA_PATH = path
        print(f"‚úì Found training data at: {path}")
        # List files
        files = os.listdir(path)
        print(f"  Files: {files}")
        break

if TRAINING_DATA_PATH is None:
    print("‚ö†Ô∏è  Training data not found!")
    print("   Please add your training dataset from Notebook 1 as input")

print("\n" + "="*60)
if QWEN_MODEL_PATH and TRAINING_DATA_PATH:
    print("‚úÖ All inputs verified - ready to fine-tune!")
else:
    print("‚ö†Ô∏è  Missing required inputs - add them before continuing")

‚úì Found Qwen model at: /kaggle/input/qwen2.5/transformers/3b-instruct/1
‚úì Found training data at: /kaggle/input/irg-1-dataset-generation/irg_training_data_improved
  Files: ['complete_improved_dataset.json', 'train_improved.json', 'test_improved.json', 'train_improved.csv', 'dataset_stats.png', 'complete_improved_dataset.csv', 'test_improved.csv', 'val_improved.json', 'val_improved.csv']

‚úÖ All inputs verified - ready to fine-tune!


### 4. Load Fine-tuning Script
**Upload `qwen_finetune.py` to this notebook before running this cell**

In [4]:
%%writefile qwen_finetune.py

"""
Qwen Fine-tuning System for Enhanced IRG Pipeline Performance
Optimizes Qwen for better visual reasoning and image generation guidance
"""

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from transformers import (
    AutoModelForCausalLM, 
    AutoTokenizer,
    TrainingArguments,
    Trainer,
    DataCollatorForLanguageModeling,
    get_linear_schedule_with_warmup
)
from peft import (
    LoraConfig,
    get_peft_model,
    TaskType,
    prepare_model_for_kbit_training
)
from datasets import load_dataset, Dataset as HFDataset
import json
import numpy as np
from pathlib import Path
from typing import Dict, List, Optional, Tuple, Any
from dataclasses import dataclass, field
import wandb
from tqdm import tqdm
import pandas as pd
from accelerate import Accelerator
import bitsandbytes as bnb

# ==================== CONFIGURATION ====================
@dataclass
class FineTuneConfig:
    """Configuration for Qwen fine-tuning"""
    
    # Model settings
    model_path: str = "/kaggle/input/qwen2.5/transformers/0.5b-instruct/1"
    output_dir: str = "./qwen_irg_finetuned"
    
    # LoRA configuration
    use_lora: bool = True
    lora_r: int = 32  # Rank
    lora_alpha: int = 64  # Scaling parameter
    lora_dropout: float = 0.1
    lora_target_modules: List[str] = field(default_factory=lambda: [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ])
    
    # Training parameters
    num_epochs: int = 3
    batch_size: int = 4
    gradient_accumulation_steps: int = 4
    learning_rate: float = 2e-4
    warmup_ratio: float = 0.1
    weight_decay: float = 0.01
    max_grad_norm: float = 1.0
    
    # Optimization
    use_8bit: bool = False
    use_4bit: bool = True  # QLoRA
    bnb_4bit_compute_dtype: str = "float16"
    bnb_4bit_quant_type: str = "nf4"
    use_nested_quant: bool = True
    
    # Data settings
    max_seq_length: int = 2048
    train_split: float = 0.9
    seed: int = 42
    
    # Logging
    logging_steps: int = 10
    save_steps: int = 100
    eval_steps: int = 50
    save_total_limit: int = 3
    use_wandb: bool = True
    wandb_project: str = "qwen-irg-finetune"

# ==================== DATASET CREATION ====================

class IRGReasoningDataset:
    """
    Create specialized dataset for training Qwen on visual reasoning tasks
    """
    
    @staticmethod
    def create_visual_reasoning_examples() -> List[Dict[str, str]]:
        """Create high-quality visual reasoning examples"""
        
        examples = []
        
        # Template categories for comprehensive training
        templates = {
            "composition": [
                {
                    "prompt": "A majestic lion resting under an acacia tree at sunset",
                    "reasoning": "Compose the scene with the lion as the focal point in the lower third, positioned slightly off-center using the rule of thirds. The acacia tree should frame the composition from the left, creating depth. Use golden hour lighting with warm oranges and deep shadows. The sunset should create rim lighting on the lion's mane. Include savanna grass in the foreground with bokeh effect. Add atmospheric haze for depth. Use a low camera angle to emphasize the lion's majesty.",
                    "refinement": "Enhance the rim lighting on the lion's mane to create more dramatic contrast. Add more detail to the lion's eyes - they should reflect the sunset light. Adjust the acacia tree's silhouette to be more distinctive. Increase the warmth in the golden hour tones. Add dust particles in the air catching the light. Refine the grass texture in the foreground for better realism."
                },
                {
                    "prompt": "A futuristic cityscape with flying vehicles and neon lights",
                    "reasoning": "Create a vertical composition emphasizing the height of skyscrapers. Use cyberpunk aesthetic with dominant cyan and magenta neon colors. Position multiple flying vehicles at different depths for scale. Apply atmospheric perspective with fog in the distance. Include reflective surfaces on buildings to multiply the neon lights. Use a slightly tilted camera angle for dynamism. Add rain for enhanced reflections and mood.",
                    "refinement": "Intensify the neon glow effects with proper bloom. Add more variety to the flying vehicle designs. Enhance the rain streaks and their interaction with lights. Increase detail in building textures - add more windows, balconies, and architectural elements. Adjust the fog density for better depth separation. Add subtle lens flares from bright light sources."
                }
            ],
            "lighting": [
                {
                    "prompt": "A still life with fruits in dramatic chiaroscuro lighting",
                    "reasoning": "Set up strong directional lighting from the top-left at 45 degrees. Create deep shadows for dramatic contrast. Use a single key light source to emulate Caravaggio's technique. Arrange fruits (apples, grapes, pears) with varying textures and translucency. Position them to create interesting shadow patterns. Use a dark background to emphasize the light-dark contrast. Add subtle rim lighting to separate subjects from background.",
                    "refinement": "Increase the contrast between light and shadow areas. Add subsurface scattering to grapes for translucency. Enhance the texture details on fruit surfaces - show imperfections and natural patterns. Adjust the shadow edges - softer for distant objects, sharper for close ones. Add subtle reflected light in shadow areas from nearby fruits."
                }
            ],
            "detail_enhancement": [
                {
                    "prompt": "An elderly person's portrait showing wisdom and experience",
                    "reasoning": "Focus on capturing fine details in facial features. Use soft, diffused lighting from a window. Emphasize wrinkles, age spots, and texture in skin. Capture the depth in eyes with catchlights. Use shallow depth of field with eyes in sharp focus. Include subtle details like individual hair strands, fabric texture in clothing. Apply Rembrandt lighting for character.",
                    "refinement": "Enhance the micro-details in skin texture - pores, fine lines, age spots. Add more depth to the eyes with subtle color variations in the iris. Refine individual hair strands and eyebrows. Increase the catchlight clarity. Add subtle veins visible under thin skin. Enhance fabric texture with visible weave patterns."
                }
            ],
            "atmosphere": [
                {
                    "prompt": "A misty forest path at dawn",
                    "reasoning": "Create layered depth with multiple fog density levels. Use cool blue-green color palette for dawn atmosphere. Position trees to create a natural leading line along the path. Apply volumetric lighting with sun rays filtering through trees. Add dew drops on leaves and spider webs. Use atmospheric perspective with distant trees fading into mist. Include ground fog rolling across the path.",
                    "refinement": "Increase the variation in fog density between tree layers. Enhance the god rays with more defined light shafts. Add more dew drops catching the light. Refine the tree bark textures with moss and lichen details. Adjust the color temperature gradient from cool shadows to warm highlights. Add subtle movement blur to fog for dynamism."
                }
            ],
            "style_specific": [
                {
                    "prompt": "A samurai warrior in traditional armor",
                    "reasoning": "Compose with strong diagonal lines from the katana. Use dramatic lighting to highlight armor details and create depth. Apply rich, saturated colors - deep reds, blacks, and gold accents. Focus on the intricate patterns in the armor plates. Create a sense of power through low camera angle. Add atmospheric elements like subtle smoke or mist. Ensure historically accurate details in armor design.",
                    "refinement": "Enhance the metallic reflections on armor plates. Add more intricate details to the armor lacing and patterns. Refine the katana blade with proper hamon line and reflections. Increase the texture detail in fabric elements. Add subtle battle wear and patina to armor. Enhance the facial expression for more intensity."
                }
            ]
        }
        
        # Generate examples from templates
        for category, items in templates.items():
            for item in items:
                # Initial reasoning example
                examples.append({
                    "instruction": f"Analyze this prompt and provide detailed visual reasoning for image generation: \"{item['prompt']}\"",
                    "input": item['prompt'],
                    "output": item['reasoning'],
                    "type": "initial_reasoning"
                })
                
                # Refinement reasoning example
                examples.append({
                    "instruction": f"Based on the current image state, provide specific refinement instructions for improvement. Original prompt: \"{item['prompt']}\"",
                    "input": f"Prompt: {item['prompt']}\nCurrent state: Image has been generated with basic composition.",
                    "output": item['refinement'],
                    "type": "refinement_reasoning"
                })
        
        return examples
    
    @staticmethod
    def create_quality_improvement_examples() -> List[Dict[str, str]]:
        """Create examples for quality improvement reasoning"""
        
        quality_examples = [
            {
                "instruction": "Identify quality issues and provide improvement suggestions",
                "input": "The image appears blurry and lacks sharp details, especially in the focal area.",
                "output": "To improve sharpness: 1) Increase the image guidance scale to 8.5 for better prompt adherence, 2) Add 'sharp focus, highly detailed, 4k, crisp details' to the positive prompt, 3) Reduce the denoising strength to 0.6 to preserve more detail, 4) Focus on enhancing edge definition in the main subject, 5) Add subtle sharpening in post-processing focusing on mid-frequency details.",
                "type": "quality_fix"
            },
            {
                "instruction": "Analyze color balance issues and suggest corrections",
                "input": "The image has oversaturated colors and poor color harmony.",
                "output": "Color correction approach: 1) Reduce global saturation by 15-20%, 2) Adjust individual color channels - particularly reduce reds and magentas, 3) Improve color harmony by shifting towards analogous color scheme, 4) Add 'balanced colors, natural saturation, harmonious palette' to the prompt, 5) Apply subtle color grading to unify the overall tone, 6) Ensure shadows and highlights have complementary color temperatures.",
                "type": "quality_fix"
            },
            {
                "instruction": "Address composition problems in the generated image",
                "input": "The composition feels unbalanced with too much empty space on the right side.",
                "output": "Composition improvements: 1) Apply rule of thirds - reposition main subject to left third vertical line, 2) Add secondary elements on the right to balance visual weight, 3) Create depth by adding foreground elements, 4) Use leading lines to guide the eye through the composition, 5) Adjust crop to eliminate excessive negative space, 6) Include 'balanced composition, rule of thirds, visual harmony' in the prompt.",
                "type": "quality_fix"
            }
        ]
        
        return quality_examples
    
    @staticmethod
    def create_style_transfer_examples() -> List[Dict[str, str]]:
        """Create examples for style-specific reasoning"""
        
        style_examples = [
            {
                "instruction": "Apply photorealistic style to this concept",
                "input": "A dragon perched on a mountain peak",
                "output": "For photorealistic rendering: 1) Focus on anatomically plausible dragon design based on reptilian references, 2) Apply realistic lighting with proper shadow casting and ambient occlusion, 3) Include atmospheric effects - fog, haze, and distance blur, 4) Add realistic textures - scales with iridescent properties, weathered rock surfaces, 5) Use 'photorealistic, 8k photography, ultra detailed, shot on Hasselblad' in prompt, 6) Ensure proper scale references with trees or birds, 7) Apply realistic color grading similar to nature photography.",
                "type": "style_reasoning"
            },
            {
                "instruction": "Convert to oil painting style",
                "input": "A busy market street scene",
                "output": "Oil painting stylization: 1) Apply visible brushstrokes with impasto technique for texture, 2) Use rich, saturated oil paint colors with subtle mixing, 3) Create depth through overlapping brush strokes and color temperature shifts, 4) Add 'oil painting, thick brushstrokes, impasto, traditional art, canvas texture' to prompt, 5) Emphasize light and shadow with bold strokes, 6) Slightly reduce fine details in favor of impressionistic rendering, 7) Include paint mixing effects at color boundaries.",
                "type": "style_reasoning"
            }
        ]
        
        return style_examples
    
    @staticmethod
    def create_full_dataset() -> List[Dict[str, str]]:
        """Combine all examples into a complete dataset"""
        
        all_examples = []
        all_examples.extend(IRGReasoningDataset.create_visual_reasoning_examples())
        all_examples.extend(IRGReasoningDataset.create_quality_improvement_examples())
        all_examples.extend(IRGReasoningDataset.create_style_transfer_examples())
        
        # Add prompt engineering examples
        prompt_engineering = [
            {
                "instruction": "Enhance this basic prompt with technical photography terms",
                "input": "A cat sitting on a windowsill",
                "output": "Enhanced prompt: 'A tabby cat gracefully perched on a weathered wooden windowsill, soft natural lighting, shallow depth of field with bokeh background, shot at golden hour, rule of thirds composition, Kodak Portra 400 film aesthetic, 85mm lens, f/1.8 aperture, subtle rim lighting highlighting fur texture, warm color grading, photorealistic detail'",
                "type": "prompt_enhancement"
            },
            {
                "instruction": "Add artistic style descriptors to this prompt",
                "input": "A mountain landscape",
                "output": "Artistic prompt: 'Majestic mountain landscape in the style of Albert Bierstadt, dramatic romantic lighting, sublime natural beauty, oil on canvas, thick impasto technique, Hudson River School aesthetic, golden hour illumination, atmospheric perspective with misty valleys, rich earth tones contrasting with snow-capped peaks, masterpiece quality, museum-worthy composition'",
                "type": "prompt_enhancement"
            }
        ]
        all_examples.extend(prompt_engineering)
        
        return all_examples

# ==================== CUSTOM DATASET CLASS ====================

class QwenIRGDataset(Dataset):
    """PyTorch dataset for Qwen fine-tuning"""
    
    def __init__(
        self,
        examples: List[Dict[str, str]],
        tokenizer,
        max_length: int = 2048,
        is_training: bool = True
    ):
        self.examples = examples
        self.tokenizer = tokenizer
        self.max_length = max_length
        self.is_training = is_training
        
    def __len__(self):
        return len(self.examples)
    
    def __getitem__(self, idx):
        example = self.examples[idx]
        
        # Format as conversation
        messages = [
            {"role": "system", "content": "You are an expert visual reasoning assistant specialized in providing detailed guidance for high-quality image generation."},
            {"role": "user", "content": example['instruction']},
            {"role": "assistant", "content": example['output']}
        ]
        
        # Apply chat template
        text = self.tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=False
        )
        
        # Tokenize
        encodings = self.tokenizer(
            text,
            truncation=True,
            max_length=self.max_length,
            padding="max_length",
            return_tensors="pt"
        )
        
        # Set up labels for training
        labels = encodings["input_ids"].clone()
        
        # Find where assistant response starts and mask everything before
        response_start = text.find(example['output'])
        if response_start != -1:
            # Mask tokens before the response
            response_token_start = len(self.tokenizer.encode(text[:response_start]))
            labels[0, :response_token_start] = -100
        
        return {
            "input_ids": encodings["input_ids"].squeeze(),
            "attention_mask": encodings["attention_mask"].squeeze(),
            "labels": labels.squeeze()
        }

# ==================== TRAINING UTILITIES ====================

class QwenFineTuner:
    """Main class for fine-tuning Qwen"""

    def __init__(self, config: FineTuneConfig):
        self.config = config
        # Note: Don't use Accelerator when using Trainer with quantized models
        # Trainer handles acceleration internally

        # Initialize wandb if enabled
        if config.use_wandb:
            wandb.init(project=config.wandb_project, config=vars(config))
    
    def prepare_model_and_tokenizer(self):
        """Load and prepare model for fine-tuning"""
        
        print("Loading tokenizer and model...")
        
        # Load tokenizer
        tokenizer = AutoTokenizer.from_pretrained(
            self.config.model_path,
            trust_remote_code=True
        )
        
        if tokenizer.pad_token is None:
            tokenizer.pad_token = tokenizer.eos_token
        
        # Quantization config for QLoRA
        bnb_config = None
        if self.config.use_4bit:
            from transformers import BitsAndBytesConfig
            bnb_config = BitsAndBytesConfig(
                load_in_4bit=True,
                bnb_4bit_compute_dtype=torch.float16,
                bnb_4bit_quant_type=self.config.bnb_4bit_quant_type,
                bnb_4bit_use_double_quant=self.config.use_nested_quant
            )
        elif self.config.use_8bit:
            from transformers import BitsAndBytesConfig
            bnb_config = BitsAndBytesConfig(load_in_8bit=True)

        # Determine device map - for quantized models, use current device
        if self.config.use_4bit or self.config.use_8bit:
            device_map = {"": 0}  # Load everything on GPU 0
        else:
            device_map = "auto"

        # Load model
        model = AutoModelForCausalLM.from_pretrained(
            self.config.model_path,
            quantization_config=bnb_config,
            torch_dtype=torch.float16 if not (self.config.use_4bit or self.config.use_8bit) else None,
            trust_remote_code=True,
            device_map=device_map
        )
        
        # Prepare model for training
        if self.config.use_4bit or self.config.use_8bit:
            model = prepare_model_for_kbit_training(model)
        
        # Apply LoRA if enabled
        if self.config.use_lora:
            print("Applying LoRA configuration...")
            lora_config = LoraConfig(
                r=self.config.lora_r,
                lora_alpha=self.config.lora_alpha,
                target_modules=self.config.lora_target_modules,
                lora_dropout=self.config.lora_dropout,
                bias="none",
                task_type=TaskType.CAUSAL_LM
            )
            model = get_peft_model(model, lora_config)
            model.print_trainable_parameters()
        
        return model, tokenizer
    
    def prepare_datasets(self, tokenizer, external_train_data=None, external_val_data=None):
        """Prepare training and validation datasets"""

        print("Preparing datasets...")

        # Use external data if provided, otherwise use built-in
        if external_train_data is not None and external_val_data is not None:
            print("Using external dataset (from Phase 1)")
            train_examples = external_train_data
            val_examples = external_val_data
        else:
            print("Using built-in dataset (small - not recommended)")
            # Create examples
            examples = IRGReasoningDataset.create_full_dataset()

            # Augment with more examples
            examples = self.augment_dataset(examples)

            # Split into train/val
            split_idx = int(len(examples) * self.config.train_split)
            train_examples = examples[:split_idx]
            val_examples = examples[split_idx:]

        print(f"Training examples: {len(train_examples)}")
        print(f"Validation examples: {len(val_examples)}")
        
        # Create datasets
        train_dataset = QwenIRGDataset(
            train_examples,
            tokenizer,
            self.config.max_seq_length,
            is_training=True
        )
        
        val_dataset = QwenIRGDataset(
            val_examples,
            tokenizer,
            self.config.max_seq_length,
            is_training=False
        )
        
        return train_dataset, val_dataset
    
    def augment_dataset(self, examples: List[Dict[str, str]]) -> List[Dict[str, str]]:
        """Augment dataset with variations"""
        
        augmented = examples.copy()
        
        # Add variations for each example
        for example in examples:
            if example['type'] == 'initial_reasoning':
                # Create variation with different focus
                variation = example.copy()
                variation['instruction'] = variation['instruction'].replace(
                    "provide detailed visual reasoning",
                    "focus on composition and lighting"
                )
                augmented.append(variation)
        
        return augmented
    
    def create_training_args(self):
        """Create training arguments"""
        
        return TrainingArguments(
            output_dir=self.config.output_dir,
            num_train_epochs=self.config.num_epochs,
            per_device_train_batch_size=self.config.batch_size,
            per_device_eval_batch_size=self.config.batch_size,
            gradient_accumulation_steps=self.config.gradient_accumulation_steps,
            learning_rate=self.config.learning_rate,
            warmup_ratio=self.config.warmup_ratio,
            weight_decay=self.config.weight_decay,
            logging_steps=self.config.logging_steps,
            save_steps=self.config.save_steps,
            eval_steps=self.config.eval_steps,
            save_total_limit=self.config.save_total_limit,
            eval_strategy="steps" if self.config.eval_steps < 1000000 else "no",
            save_strategy="steps",
            load_best_model_at_end=True if self.config.eval_steps < 1000000 else False,
            metric_for_best_model="loss" if self.config.eval_steps < 1000000 else None,
            greater_is_better=False,
            push_to_hub=False,
            report_to=["wandb"] if self.config.use_wandb else ["none"],
            bf16=True,  # BFloat16 is more stable than FP16
            fp16=False,  # Disable FP16 to avoid CUBLAS errors
            gradient_checkpointing=True,
            max_grad_norm=self.config.max_grad_norm,
            optim="paged_adamw_8bit" if self.config.use_4bit else "adamw_torch",
            seed=self.config.seed,
            remove_unused_columns=False,
            dataloader_pin_memory=False,  # Reduce memory pressure
            ddp_find_unused_parameters=False,  # Stability for distributed training
        )
    
    def train(self, external_train_data=None, external_val_data=None):
        """Main training function"""

        print("="*80)
        print("Starting Qwen Fine-tuning for IRG Pipeline")
        print("="*80)

        # Prepare model and tokenizer
        model, tokenizer = self.prepare_model_and_tokenizer()

        # Prepare datasets
        train_dataset, val_dataset = self.prepare_datasets(
            tokenizer,
            external_train_data,
            external_val_data
        )
        
        # Create training arguments
        training_args = self.create_training_args()
        
        # Data collator
        data_collator = DataCollatorForLanguageModeling(
            tokenizer=tokenizer,
            mlm=False
        )
        
        # Create memory cleanup callback
        from transformers import TrainerCallback
        import gc

        class MemoryCleanupCallback(TrainerCallback):
            """Aggressively clean memory after evaluation to prevent CUBLAS errors"""
            def on_evaluate(self, args, state, control, **kwargs):
                """Clean up after evaluation"""
                print("\nüßπ Cleaning GPU memory after evaluation...")
                gc.collect()
                torch.cuda.empty_cache()
                torch.cuda.synchronize()
                if torch.cuda.is_available():
                    for i in range(torch.cuda.device_count()):
                        torch.cuda.reset_peak_memory_stats(i)
                print("‚úÖ Memory cleaned\n")
                return control

        # Create trainer with memory cleanup callback
        trainer = Trainer(
            model=model,
            args=training_args,
            train_dataset=train_dataset,
            eval_dataset=val_dataset,
            data_collator=data_collator,
            tokenizer=tokenizer,
            callbacks=[MemoryCleanupCallback()],
        )
        
        # Start training
        print("\nüöÄ Starting training...")
        trainer.train()
        
        # Save final model
        print("\nüíæ Saving fine-tuned model...")
        trainer.save_model(self.config.output_dir)
        tokenizer.save_pretrained(self.config.output_dir)
        
        # Save LoRA weights separately if used
        if self.config.use_lora:
            model.save_pretrained(f"{self.config.output_dir}/lora_weights")
        
        print(f"\n‚úÖ Fine-tuning complete! Model saved to {self.config.output_dir}")
        
        return model, tokenizer

# ==================== INFERENCE WITH FINE-TUNED MODEL ====================

class OptimizedQwenInference:
    """Use the fine-tuned Qwen for improved IRG pipeline"""
    
    def __init__(self, model_path: str, use_lora: bool = True):
        self.model_path = model_path
        self.use_lora = use_lora
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        
        # Load model and tokenizer
        self.load_model()
    
    def load_model(self):
        """Load fine-tuned model"""
        
        print(f"Loading fine-tuned model from {self.model_path}")
        
        self.tokenizer = AutoTokenizer.from_pretrained(
            self.model_path,
            trust_remote_code=True
        )
        
        if self.use_lora:
            # Load base model + LoRA weights
            from peft import PeftModel
            
            base_model = AutoModelForCausalLM.from_pretrained(
                self.model_path,
                torch_dtype=torch.float16,
                device_map="auto",
                trust_remote_code=True
            )
            
            self.model = PeftModel.from_pretrained(
                base_model,
                f"{self.model_path}/lora_weights"
            )
            self.model = self.model.merge_and_unload()  # Merge LoRA weights
        else:
            self.model = AutoModelForCausalLM.from_pretrained(
                self.model_path,
                torch_dtype=torch.float16,
                device_map="auto",
                trust_remote_code=True
            )
        
        self.model.eval()
    
    @torch.no_grad()
    def generate_reasoning(
        self,
        prompt: str,
        reasoning_type: str = "initial",
        max_length: int = 500,
        temperature: float = 0.7
    ) -> str:
        """Generate improved reasoning with fine-tuned model"""
        
        # Create appropriate instruction based on type
        if reasoning_type == "initial":
            instruction = f"Analyze this prompt and provide detailed visual reasoning for image generation: \"{prompt}\""
        elif reasoning_type == "refinement":
            instruction = f"Based on the current image state, provide specific refinement instructions for improvement. Original prompt: \"{prompt}\""
        else:
            instruction = prompt
        
        # Format as conversation
        messages = [
            {"role": "system", "content": "You are an expert visual reasoning assistant specialized in providing detailed guidance for high-quality image generation."},
            {"role": "user", "content": instruction}
        ]
        
        # Apply chat template
        text = self.tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=True
        )
        
        # Tokenize
        inputs = self.tokenizer(
            text,
            return_tensors="pt",
            truncation=True,
            max_length=2048
        ).to(self.device)
        
        # Generate
        outputs = self.model.generate(
            **inputs,
            max_new_tokens=max_length,
            temperature=temperature,
            top_p=0.9,
            do_sample=True,
            pad_token_id=self.tokenizer.eos_token_id
        )
        
        # Decode
        response = self.tokenizer.decode(outputs[0], skip_special_tokens=True)
        
        # Extract assistant response
        if "assistant" in response:
            response = response.split("assistant")[-1].strip()
        
        return response

# ==================== EVALUATION ====================

class FineTuneEvaluator:
    """Evaluate the fine-tuned model's performance"""
    
    def __init__(self, original_model_path: str, finetuned_model_path: str):
        self.original_path = original_model_path
        self.finetuned_path = finetuned_model_path
        
    def compare_outputs(self, test_prompts: List[str]) -> pd.DataFrame:
        """Compare outputs between original and fine-tuned models"""
        
        # Load both models
        original_inference = OptimizedQwenInference(
            self.original_path,
            use_lora=False
        )
        finetuned_inference = OptimizedQwenInference(
            self.finetuned_path,
            use_lora=True
        )
        
        results = []
        
        for prompt in test_prompts:
            # Generate with both models
            original_output = original_inference.generate_reasoning(prompt)
            finetuned_output = finetuned_inference.generate_reasoning(prompt)
            
            # Simple quality metrics
            results.append({
                "prompt": prompt,
                "original_length": len(original_output),
                "finetuned_length": len(finetuned_output),
                "original_detail_keywords": self._count_detail_keywords(original_output),
                "finetuned_detail_keywords": self._count_detail_keywords(finetuned_output),
                "original_output": original_output[:200] + "...",
                "finetuned_output": finetuned_output[:200] + "..."
            })
        
        return pd.DataFrame(results)
    
    def _count_detail_keywords(self, text: str) -> int:
        """Count visual detail keywords in output"""
        keywords = [
            "lighting", "shadow", "composition", "texture", "color",
            "detail", "contrast", "depth", "focus", "atmosphere",
            "reflection", "highlight", "tone", "saturation", "sharpness"
        ]
        
        text_lower = text.lower()
        return sum(1 for keyword in keywords if keyword in text_lower)

# ==================== MAIN EXECUTION ====================

if __name__ == "__main__":
    
    # Configuration
    config = FineTuneConfig(
        model_path="/kaggle/input/qwen2.5/transformers/0.5b-instruct/1",
        output_dir="./qwen_irg_finetuned",
        use_lora=True,
        lora_r=32,
        lora_alpha=64,
        num_epochs=3,
        batch_size=4,
        learning_rate=2e-4,
        use_4bit=True,  # QLoRA for memory efficiency
        use_wandb=False  # Set True if you have wandb configured
    )
    
    # Initialize fine-tuner
    fine_tuner = QwenFineTuner(config)
    
    # Start fine-tuning
    model, tokenizer = fine_tuner.train()
    
    # Test the fine-tuned model
    print("\n" + "="*80)
    print("Testing Fine-tuned Model")
    print("="*80)
    
    inference = OptimizedQwenInference(
        config.output_dir,
        use_lora=config.use_lora
    )
    
    # Test prompts
    test_prompts = [
        "A serene lake at sunset with mountains in the background",
        "A cyberpunk street scene with neon lights",
        "A medieval castle on a hilltop during a storm"
    ]
    
    for prompt in test_prompts:
        print(f"\nüìù Prompt: {prompt}")
        reasoning = inference.generate_reasoning(prompt, reasoning_type="initial")
        print(f"ü§ñ Reasoning: {reasoning[:300]}...")
    
    print("\n‚úÖ Fine-tuning and testing complete!")


Writing qwen_finetune.py


In [5]:
import sys
sys.path.append('/kaggle/working')

# Import fine-tuning components
from qwen_finetune import (
    QwenFineTuner,
    FineTuneConfig,
    IRGReasoningDataset
)

print("‚úì Fine-tuning modules imported successfully")

2025-11-11 12:36:51.562755: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1762864611.763971      19 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1762864611.826257      19 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered


‚úì Fine-tuning modules imported successfully


### 5. Configure Fine-tuning (Optimized for T4 x2)

In [6]:
# Fine-tuning configuration optimized for Qwen 2.5-7B on T4 x2 (32GB VRAM)
# IMPORTANT: This config is specifically for 7B model (much more aggressive than 0.5B)
config = FineTuneConfig(
    # Model paths
    model_path=QWEN_MODEL_PATH,
    output_dir="/kaggle/working/qwen_irg_finetuned",
    
    # LoRA settings (REDUCED for 7B model)
    use_lora=True,
    lora_r=8,               # REDUCED from 32 ‚Üí 8 (less trainable params)
    lora_alpha=16,          # REDUCED from 64 ‚Üí 16 (2x rank)
    lora_dropout=0.1,
    
    # Training parameters (OPTIMIZED for 7B)
    num_epochs=3,           # 3 epochs is usually sufficient
    batch_size=1,           # REDUCED from 2 ‚Üí 1 (critical for 7B!)
    gradient_accumulation_steps=8,  # INCREASED from 4 ‚Üí 8 (effective batch = 1*8 = 8)
    learning_rate=2e-4,     # Standard for LoRA
    warmup_ratio=0.1,       # 10% warmup
    weight_decay=0.01,
    max_grad_norm=1.0,
    
    # Memory optimization (CRITICAL for 7B on T4)
    use_4bit=True,          # QLoRA - 4-bit quantization (ESSENTIAL!)
    use_8bit=False,
    bnb_4bit_compute_dtype="float16",
    bnb_4bit_quant_type="nf4",
    use_nested_quant=True,  # Double quantization for extra savings
    
    # Data settings (REDUCED for 7B)
    max_seq_length=1024,     # REDUCED from 2048 ‚Üí 512 (huge memory savings!)
    train_split=0.9,        # 90% train, 10% validation
    seed=42,
    
    # Checkpointing (IMPORTANT for Kaggle timeout protection)
    logging_steps=10,       # Log every 10 steps
    save_steps=100,         # Save checkpoint every 100 steps
    eval_steps=10000000,          # Evaluate every 50 steps
    save_total_limit=3,     # Keep only last 3 checkpoints
    
    # Logging
    use_wandb=False,        # Set True if you have wandb configured
    wandb_project="qwen-irg-finetune-7b"
)

print("Fine-tuning Configuration for Qwen 2.5-7B:")
print("="*60)
print(f"Model: {config.model_path}")
print(f"Output: {config.output_dir}")
print(f"\nTraining:")
print(f"  Epochs: {config.num_epochs}")
print(f"  Batch size: {config.batch_size} (REDUCED for 7B)")
print(f"  Gradient accumulation: {config.gradient_accumulation_steps}")
print(f"  Effective batch size: {config.batch_size * config.gradient_accumulation_steps}")
print(f"  Learning rate: {config.learning_rate}")
print(f"\nMemory Optimization (7B-specific):")
print(f"  4-bit quantization: {config.use_4bit}")
print(f"  LoRA rank: {config.lora_r} (REDUCED for memory)")
print(f"  Max sequence length: {config.max_seq_length} (REDUCED from 2048)")
print(f"\nExpected Memory Usage:")
print(f"  Model (4-bit): ~3.5-4GB")
print(f"  LoRA adapters (r=8): ~0.5GB")
print(f"  Activations + gradients: ~8-10GB")
print(f"  Total estimated: ~12-15GB per GPU")
print(f"\nCheckpointing:")
print(f"  Save every {config.save_steps} steps")
print(f"  Evaluate every {config.eval_steps} steps")
print("="*60)
print("\n‚ö†Ô∏è  NOTE: If still OOM, further reduce max_seq_length to 256")

Fine-tuning Configuration for Qwen 2.5-7B:
Model: /kaggle/input/qwen2.5/transformers/3b-instruct/1
Output: /kaggle/working/qwen_irg_finetuned

Training:
  Epochs: 3
  Batch size: 1 (REDUCED for 7B)
  Gradient accumulation: 8
  Effective batch size: 8
  Learning rate: 0.0002

Memory Optimization (7B-specific):
  4-bit quantization: True
  LoRA rank: 8 (REDUCED for memory)
  Max sequence length: 1024 (REDUCED from 2048)

Expected Memory Usage:
  Model (4-bit): ~3.5-4GB
  LoRA adapters (r=8): ~0.5GB
  Activations + gradients: ~8-10GB
  Total estimated: ~12-15GB per GPU

Checkpointing:
  Save every 100 steps
  Evaluate every 10000000 steps

‚ö†Ô∏è  NOTE: If still OOM, further reduce max_seq_length to 256


### 6. Load or Generate Training Data

In [7]:
import json

# Option 1: Use generated dataset from Notebook 1 (RECOMMENDED)
if TRAINING_DATA_PATH:
    train_file = os.path.join(TRAINING_DATA_PATH, 'complete_improved_dataset.json')
    val_file = os.path.join(TRAINING_DATA_PATH, 'val.json')
    
    if os.path.exists(train_file):
        with open(train_file, 'r') as f:
            train_examples = json.load(f)
        
        if os.path.exists(val_file):
            with open(val_file, 'r') as f:
                val_examples = json.load(f)
        else:
            # Create validation split if not present
            split_idx = int(len(train_examples) * 0.9)
            val_examples = train_examples[split_idx:]
            train_examples = train_examples[:split_idx]
        
        print(f"‚úì Loaded external dataset:")
        print(f"  Training examples: {len(train_examples)}")
        print(f"  Validation examples: {len(val_examples)}")
        
        USE_EXTERNAL_DATA = True
    else:
        print("‚ö†Ô∏è  train.json not found in training data path")
        USE_EXTERNAL_DATA = False
else:
    USE_EXTERNAL_DATA = False

# Option 2: Use built-in dataset (fallback)
if not USE_EXTERNAL_DATA:
    print("Using built-in dataset from qwen_finetune.py")
    print("‚ö†Ô∏è  This is a smaller dataset - external data recommended for better results")

‚úì Loaded external dataset:
  Training examples: 3600
  Validation examples: 400


### 7. Initialize Fine-tuner

In [8]:
# Initialize the fine-tuner
fine_tuner = QwenFineTuner(config)

print("‚úì Fine-tuner initialized")
print("  This loaded the Accelerator for distributed training")

‚úì Fine-tuner initialized
  This loaded the Accelerator for distributed training


### 8. Start Fine-tuning

**This will take 3-4 hours on T4 x2**

**Checkpoints** will be saved every 100 steps to `/kaggle/working/qwen_irg_finetuned/`

In [9]:
import time

print("="*60)
print("Starting Fine-tuning")
print("="*60)
print(f"Start time: {time.strftime('%Y-%m-%d %H:%M:%S')}")
print("\nThis will take approximately 3-4 hours...")
print("You can monitor progress below\n")

start_time = time.time()

# Start training - PASS EXTERNAL DATA TO FIX BUG
try:
    if USE_EXTERNAL_DATA:
        print(f"‚ö†Ô∏è IMPORTANT: Using {len(train_examples)} training examples from external dataset\n")
        model, tokenizer = fine_tuner.train(
            external_train_data=train_examples,
            external_val_data=val_examples
        )
    else:
        print("‚ö†Ô∏è WARNING: Using built-in small dataset (not recommended)\n")
        model, tokenizer = fine_tuner.train()
    
    elapsed_time = time.time() - start_time
    
    print("\n" + "="*60)
    print("‚úÖ Fine-tuning Complete!")
    print("="*60)
    print(f"End time: {time.strftime('%Y-%m-%d %H:%M:%S')}")
    print(f"Total time: {elapsed_time/3600:.2f} hours")
    print(f"Model saved to: {config.output_dir}")
    
except Exception as e:
    print(f"\n‚ö†Ô∏è  Training interrupted: {str(e)}")
    print(f"Checkpoints saved in: {config.output_dir}")
    print("You can resume training from the last checkpoint")
    raise

Starting Fine-tuning
Start time: 2025-11-11 12:37:16

This will take approximately 3-4 hours...
You can monitor progress below

‚ö†Ô∏è IMPORTANT: Using 3600 training examples from external dataset

Starting Qwen Fine-tuning for IRG Pipeline
Loading tokenizer and model...


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Applying LoRA configuration...


  trainer = Trainer(
No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


trainable params: 14,966,784 || all params: 3,100,905,472 || trainable%: 0.4827
Preparing datasets...
Using external dataset (from Phase 1)
Training examples: 3600
Validation examples: 400

üöÄ Starting training...


`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.


Step,Training Loss
10,1.6282
20,1.319
30,0.9542
40,0.5748
50,0.2666
60,0.1343
70,0.0913
80,0.0746
90,0.0655
100,0.0682



üíæ Saving fine-tuned model...

‚úÖ Fine-tuning complete! Model saved to /kaggle/working/qwen_irg_finetuned

‚úÖ Fine-tuning Complete!
End time: 2025-11-11 18:22:32
Total time: 5.75 hours
Model saved to: /kaggle/working/qwen_irg_finetuned


### 9. Test Fine-tuned Model

In [10]:
from qwen_finetune import OptimizedQwenInference

# Load fine-tuned model for testing
print("Loading fine-tuned model for testing...")
inference = OptimizedQwenInference(
    model_path=config.output_dir,
    use_lora=config.use_lora
)

print("‚úì Model loaded successfully\n")

# Test prompts
test_prompts = [
    "A serene lake at sunset with mountains in the background",
    "A cyberpunk street scene with neon lights and rain",
    "A medieval castle on a hilltop during a thunderstorm",
    "Portrait of an elderly person reading by candlelight"
]

print("="*60)
print("Testing Fine-tuned Model")
print("="*60)

for i, prompt in enumerate(test_prompts, 1):
    print(f"\n{i}. Prompt: {prompt}")
    print("-" * 60)
    
    # Generate reasoning
    reasoning = inference.generate_reasoning(
        prompt=prompt,
        reasoning_type="initial",
        max_length=300,
        temperature=0.7
    )
    
    print(f"Reasoning: {reasoning[:500]}...")
    print()

Loading fine-tuned model for testing...
Loading fine-tuned model from /kaggle/working/qwen_irg_finetuned


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]



‚úì Model loaded successfully

Testing Fine-tuned Model

1. Prompt: A serene lake at sunset with mountains in the background
------------------------------------------------------------
Reasoning: Composition: Use centered to create visual balance and interest.
Position the main subject according to this principle.

Lighting: Apply soft light to establish mood and depth.
Ensure consistent light direction and appropriate shadows.

Color palette: Choose hues that support the subject and mood.
Apply proper color temperature and saturation.

Technical: Maintain sharp focus on the subject with appropriate depth of field.
Ensure high detail quality and proper exposure.

Apply these principles t...


2. Prompt: A cyberpunk street scene with neon lights and rain
------------------------------------------------------------
Reasoning: Composition: Use frame within frame to create visual balance and interest. Position the main subject according to this principle. Lighting: Apply studio lighting t

### 10. Save Model for Next Notebook

In [11]:
import shutil

# Verify model files
model_dir = config.output_dir

print("Model files in output directory:")
print("="*60)

if os.path.exists(model_dir):
    for item in os.listdir(model_dir):
        item_path = os.path.join(model_dir, item)
        if os.path.isfile(item_path):
            size_mb = os.path.getsize(item_path) / (1024**2)
            print(f"  ‚úì {item} ({size_mb:.2f} MB)")
        else:
            print(f"  ‚úì {item}/ (directory)")
    
    # Calculate total size
    total_size = sum(
        os.path.getsize(os.path.join(model_dir, f)) 
        for f in os.listdir(model_dir) 
        if os.path.isfile(os.path.join(model_dir, f))
    ) / (1024**2)
    
    print(f"\nTotal model size: {total_size:.2f} MB")
    print(f"Location: {model_dir}")
else:
    print("‚ö†Ô∏è  Model directory not found!")

print("\n" + "="*60)
print("NEXT STEPS:")
print("="*60)
print("1. Create a Kaggle Dataset from this output:")
print("   - Go to 'File' ‚Üí 'Download' (or use Kaggle API)")
print("   - Create new dataset: 'qwen-irg-finetuned'")
print("   - Upload the entire output folder")
print("")
print("2. Use in Notebook 3 (Benchmarking):")
print("   - Add fine-tuned model as input dataset")
print("   - Path: /kaggle/input/qwen-irg-finetuned/")
print("="*60)

Model files in output directory:
  ‚úì lora_weights/ (directory)
  ‚úì added_tokens.json (0.00 MB)
  ‚úì checkpoint-600/ (directory)
  ‚úì checkpoint-500/ (directory)
  ‚úì checkpoint-675/ (directory)
  ‚úì adapter_model.safetensors (57.16 MB)
  ‚úì tokenizer_config.json (0.00 MB)
  ‚úì chat_template.jinja (0.00 MB)
  ‚úì tokenizer.json (10.89 MB)
  ‚úì special_tokens_map.json (0.00 MB)
  ‚úì adapter_config.json (0.00 MB)
  ‚úì merges.txt (1.59 MB)
  ‚úì README.md (0.01 MB)
  ‚úì training_args.bin (0.01 MB)
  ‚úì vocab.json (2.65 MB)

Total model size: 72.31 MB
Location: /kaggle/working/qwen_irg_finetuned

NEXT STEPS:
1. Create a Kaggle Dataset from this output:
   - Go to 'File' ‚Üí 'Download' (or use Kaggle API)
   - Create new dataset: 'qwen-irg-finetuned'
   - Upload the entire output folder

2. Use in Notebook 3 (Benchmarking):
   - Add fine-tuned model as input dataset
   - Path: /kaggle/input/qwen-irg-finetuned/


### 11. Optional: Save Training Metrics

In [12]:
# If training logs exist, visualize them
import pandas as pd
import matplotlib.pyplot as plt

log_history_file = os.path.join(config.output_dir, "trainer_state.json")

if os.path.exists(log_history_file):
    with open(log_history_file, 'r') as f:
        trainer_state = json.load(f)
    
    if 'log_history' in trainer_state:
        logs = pd.DataFrame(trainer_state['log_history'])
        
        # Plot training curves
        fig, axes = plt.subplots(1, 2, figsize=(14, 5))
        
        # Loss curve
        if 'loss' in logs.columns:
            axes[0].plot(logs['step'], logs['loss'], label='Training Loss')
            axes[0].set_xlabel('Step')
            axes[0].set_ylabel('Loss')
            axes[0].set_title('Training Loss Over Time')
            axes[0].legend()
            axes[0].grid(alpha=0.3)
        
        # Learning rate
        if 'learning_rate' in logs.columns:
            axes[1].plot(logs['step'], logs['learning_rate'], color='orange')
            axes[1].set_xlabel('Step')
            axes[1].set_ylabel('Learning Rate')
            axes[1].set_title('Learning Rate Schedule')
            axes[1].grid(alpha=0.3)
        
        plt.tight_layout()
        plt.savefig(os.path.join(config.output_dir, 'training_curves.png'), dpi=150)
        plt.show()
        
        print("‚úì Training metrics visualized and saved")
else:
    print("No training logs found")

No training logs found


### 12. Clean Up GPU Memory

In [13]:
import gc

# Clean up
del model
del tokenizer
if 'inference' in locals():
    del inference

gc.collect()
torch.cuda.empty_cache()

print("‚úì GPU memory cleared")

# Show final GPU memory
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        mem_allocated = torch.cuda.memory_allocated(i) / (1024**3)
        mem_reserved = torch.cuda.memory_reserved(i) / (1024**3)
        print(f"GPU {i}:")
        print(f"  Allocated: {mem_allocated:.2f} GB")
        print(f"  Reserved: {mem_reserved:.2f} GB")

‚úì GPU memory cleared
GPU 0:
  Allocated: 1.18 GB
  Reserved: 3.17 GB
GPU 1:
  Allocated: 0.00 GB
  Reserved: 0.00 GB


---
## ‚úÖ Fine-tuning Complete!

Your fine-tuned Qwen model is ready for the benchmarking phase.

**Next:** Create Kaggle dataset from output and move to Notebook 3