# XiYan-SQL Evaluation on Google Colab

This notebook provides a complete evaluation pipeline for XiYan-SQL models on Google Colab.

## What this notebook does:
1. ✅ Installs dependencies
2. ✅ Clones the repository
3. ✅ Mounts Google Drive
4. ✅ Copies model from Google Drive
5. ✅ Runs inference on test dataset
6. ✅ Runs evaluation (execution accuracy)

## Prerequisites:
- Upload your trained model to Google Drive
- Enable GPU runtime in Colab (Runtime → Change runtime type → GPU)
- Test dataset is already in the repository

## Step 1: Install Dependencies

Install all required packages for evaluation.
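
A version specifier such as `modelscope>=1.33.0` contains `>`, which a shell reads as output redirection unless the requirement is quoted. The sketch below sidesteps the shell by passing each requirement as a single list element. `pip_install_cmd` is a hypothetical helper name used only for illustration; it is not part of the repository.

```python
import sys


def pip_install_cmd(requirement: str) -> list[str]:
    """Build a pip command as an argument list (hypothetical helper).

    A list passed to subprocess.run is never parsed by a shell, so the
    `>` in `>=` cannot be mistaken for output redirection.
    """
    return [
        sys.executable, "-m", "pip", "install",
        "-q", "--disable-pip-version-check", requirement,
    ]


cmd = pip_install_cmd("modelscope>=1.33.0")
print(cmd[-1])
# To actually install: subprocess.run(cmd, check=True)
```

In a notebook cell, the equivalent precaution is to wrap the requirement in quotes after `!pip install`.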

In [None]:
# Install system dependencies
!apt-get update -qq
!apt-get install -y -qq libaio-dev  # Required for DeepSpeed

print("📦 Installing Python packages...")
print("⚠️  Note: Installing in specific order to avoid numpy/DeepSpeed conflicts.\n")

# Install DeepSpeed AFTER numpy and torch are set
print("\n🔧 Installing DeepSpeed (may show some warnings)...")
!pip install -q --disable-pip-version-check --no-cache-dir deepspeed

# Install remaining packages
# Quote each requirement so the shell does not treat `>` as output redirection
!pip install -q --disable-pip-version-check "llama-index>=0.9.6.post2"
!pip install -q --disable-pip-version-check "modelscope>=1.33.0"
!pip install -q --disable-pip-version-check "mysql-connector-python>=9.5.0"
!pip install -q --disable-pip-version-check "protobuf>=6.33.3"
!pip install -q --disable-pip-version-check "psycopg2-binary>=2.9.11"
!pip install -q --disable-pip-version-check "swanlab>=0.7.6"
!pip install -q --disable-pip-version-check "textdistance>=4.6.3"
!pip install -q --disable-pip-version-check "jedi>=0.16"

# Install flash-attn (optional, for faster attention)
print("\n🔨 Attempting to install flash-attn (this may take a few minutes)...")
import subprocess
import sys
# Call pip through the notebook's own interpreter so the package lands in the active environment
result = subprocess.run(
    [sys.executable, "-m", "pip", "install", "-q", "--no-build-isolation", "flash-attn"],
    capture_output=True,
    text=True
)
if result.returncode == 0:
    print("✅ flash-attn installed successfully")
else:
    print("⚠️  flash-attn installation failed (this is optional, continuing without it)")

print("\n✅ Core dependencies installed!")
print("\n💡 If you see numpy warnings, they are expected and won't affect evaluation.")

# Verify installation
print("\n🔍 Verifying installation...")
try:
    import torch
    import transformers
    import accelerate
    import deepspeed
    import peft
    import numpy as np

    print(f"✅ PyTorch: {torch.__version__}")
    print(f"✅ Transformers: {transformers.__version__}")
    print(f"✅ Accelerate: {accelerate.__version__}")
    print(f"✅ DeepSpeed: {deepspeed.__version__}")
    print(f"✅ PEFT: {peft.__version__}")
    print(f"✅ NumPy: {np.__version__}")
    print(f"✅ CUDA Available: {torch.cuda.is_available()}")

    if torch.cuda.is_available():
        print(f"✅ GPU: {torch.cuda.get_device_name(0)}")
        gpu_mem = torch.cuda.get_device_properties(0).total_memory / 1024**3
        print(f"✅ GPU Memory: {gpu_mem:.1f} GB")

        # Thresholds sit slightly below the nominal sizes because reported memory is lower
        if gpu_mem >= 14:
            print("\n🎯 Your GPU has 15GB+ memory - a good fit for evaluation!")
        elif gpu_mem >= 10:
            print("\n📊 Your GPU has 12GB memory - fine for evaluation with small batches.")
        else:
            print("\n⚠️  Your GPU has limited memory - keep the batch size at 1.")
    else:
        print("\n❌ No GPU detected! Make sure to enable GPU in Runtime → Change runtime type")

    print("\n🚀 Ready to proceed!")

except ImportError as e:
    print(f"\n❌ Import error: {e}")
    print("\n🔄 If you see numpy errors, restart runtime and run this cell again.")
    print("   Go to: Runtime → Restart runtime")

## Step 2: Clone Repository

Clone the XiYan-SQL repository to Colab.
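
After cloning, it can help to confirm that the files the later steps call are actually present. The sketch below assumes the layout this notebook relies on (`sql_infer.py`, `sql_eval.py`, and the test dataset under the evaluation directory). `missing_files` and `EXPECTED_FILES` are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Files the later steps rely on, relative to the evaluation directory.
EXPECTED_FILES = [
    "sql_infer.py",
    "sql_eval.py",
    "datasets/nl2sql_standard_test.json",
]


def missing_files(eval_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent from eval_dir."""
    root = Path(eval_dir)
    return [name for name in expected if not (root / name).is_file()]


# Example: check the directory this notebook changes into after cloning.
missing = missing_files("/content/XiYan-SQL/XiYan-SQLTraining/evaluation")
print("Missing files:", missing or "none")
```

An empty result suggests the clone is complete; any listed file points to a failed clone or a repository layout that differs from the one assumed here.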

In [None]:
# Change to content directory
import os
import sys
os.chdir('/content')

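# Clone the repository
# ⚠️ UPDATE THIS with your repository URL
REPO_URL = "https://github.com/rezaarrazi/XiYan-SQL.git"

if not os.path.exists('XiYan-SQL'):
    # os.system returns the command's exit status; report success only when it is 0
    if os.system(f'git clone {REPO_URL}') == 0:
        print("✅ Repository cloned successfully")
    else:
        print("❌ git clone failed - check REPO_URL and your network connection")
else:
    print("✅ Repository already exists")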

# Navigate to evaluation directory
os.chdir('XiYan-SQL/XiYan-SQLTraining/evaluation')

# Add to Python path
TRAINING_DIR = os.path.dirname(os.getcwd())
if TRAINING_DIR not in sys.path:
    sys.path.insert(0, TRAINING_DIR)
if os.path.dirname(TRAINING_DIR) not in sys.path:
    sys.path.insert(0, os.path.dirname(TRAINING_DIR))

print(f"\n📁 Current directory: {os.getcwd()}")
print("✅ Python path configured")

## Step 3: Mount Google Drive

Mount your Google Drive to access the trained model.

In [None]:
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

print("✅ Google Drive mounted successfully")
print("\n📂 Drive path: /content/drive/MyDrive")

## Step 4: Copy Model from Google Drive

Copy your trained model from Google Drive. The model can be either:
- A merged model (single folder)
- A base model + LoRA adapter (two separate folders)

**Configure the path below to match your Google Drive structure.**
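
If you are unsure which of the two layouts a folder holds, its contents usually give it away. The sketch below rests on a file-naming convention rather than anything defined in this notebook: PEFT adapters ship an `adapter_config.json`, while full Hugging Face checkpoints ship a `config.json`. `detect_model_type` is a hypothetical helper.

```python
import os


def detect_model_type(folder: str) -> str:
    """Guess "adapter", "merged", or "unknown" from the files in a folder.

    Assumption: PEFT adapters contain adapter_config.json, while full
    Hugging Face checkpoints contain config.json.
    """
    if os.path.isfile(os.path.join(folder, "adapter_config.json")):
        return "adapter"
    if os.path.isfile(os.path.join(folder, "config.json")):
        return "merged"
    return "unknown"


# Placeholder path from this notebook; replace it with your own folder.
print(detect_model_type("/content/drive/MyDrive/XiYan-SQL/Trained-Models/your-model-name"))
```

A result of `"unknown"` typically means the path is wrong, Drive is not mounted, or the upload is incomplete.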

In [None]:
import shutil
import os

# ⚠️ CONFIGURE THIS: Path to your model in Google Drive
# Option 1: Merged model (single folder)
MODEL_DRIVE_PATH = "/content/drive/MyDrive/XiYan-SQL/Trained-Models/your-model-name"

# Option 2: Base model + LoRA adapter (if using separate adapter)
BASE_MODEL_DRIVE_PATH = "/content/drive/MyDrive/XiYan-SQL/Models/Qwen/Qwen2.5-Coder-3B-Instruct"
LORA_ADAPTER_DRIVE_PATH = "/content/drive/MyDrive/XiYan-SQL/Trained-Models/your-adapter-name"

# Choose which model to use:
# - "merged": Use merged model (single path)
# - "adapter": Use base model + LoRA adapter (two paths)
MODEL_TYPE = "merged"  # Change to "adapter" if using separate adapter

# Target directory in the repository
MODEL_TARGET_DIR = "../train/model/Qwen"
os.makedirs(MODEL_TARGET_DIR, exist_ok=True)

# Defaults so later cells always see defined values, even if MODEL_TYPE is mistyped
MODEL_PATH = None
LORA_PATH = ""

if MODEL_TYPE == "merged":
    # Copy merged model
    if os.path.exists(MODEL_DRIVE_PATH):
        model_name = os.path.basename(MODEL_DRIVE_PATH)
        target_path = os.path.join(MODEL_TARGET_DIR, model_name)

        if os.path.exists(target_path):
            print(f"⚠️  Model already exists at {target_path}")
            print("Skipping copy (delete manually if you want to re-copy)")
        else:
            print(f"📥 Copying merged model from {MODEL_DRIVE_PATH}...")
            shutil.copytree(MODEL_DRIVE_PATH, target_path)
            print(f"✅ Model copied to {target_path}")

        MODEL_PATH = target_path
        LORA_PATH = ""  # No adapter needed for merged model
    else:
        print(f"❌ Model not found at {MODEL_DRIVE_PATH}")
        print("\nPlease check:")
        print("1. Google Drive is mounted correctly")
        print("2. The model path is correct")
        MODEL_PATH = None
        LORA_PATH = ""
        
elif MODEL_TYPE == "adapter":
    # Copy base model
    if os.path.exists(BASE_MODEL_DRIVE_PATH):
        base_model_name = os.path.basename(BASE_MODEL_DRIVE_PATH)
        base_target_path = os.path.join(MODEL_TARGET_DIR, base_model_name)

        if os.path.exists(base_target_path):
            print(f"⚠️  Base model already exists at {base_target_path}")
        else:
            print(f"📥 Copying base model from {BASE_MODEL_DRIVE_PATH}...")
            shutil.copytree(BASE_MODEL_DRIVE_PATH, base_target_path)
            print(f"✅ Base model copied to {base_target_path}")

        MODEL_PATH = base_target_path
    else:
        print(f"❌ Base model not found at {BASE_MODEL_DRIVE_PATH}")
        MODEL_PATH = None

    # Copy LoRA adapter
    if os.path.exists(LORA_ADAPTER_DRIVE_PATH):
        adapter_name = os.path.basename(LORA_ADAPTER_DRIVE_PATH)
        adapter_target_path = os.path.join("../train/output", adapter_name)
        os.makedirs(os.path.dirname(adapter_target_path), exist_ok=True)

        if os.path.exists(adapter_target_path):
            print(f"⚠️  Adapter already exists at {adapter_target_path}")
        else:
            print(f"📥 Copying LoRA adapter from {LORA_ADAPTER_DRIVE_PATH}...")
            shutil.copytree(LORA_ADAPTER_DRIVE_PATH, adapter_target_path)
            print(f"✅ Adapter copied to {adapter_target_path}")

        LORA_PATH = adapter_target_path
    else:
        print(f"❌ Adapter not found at {LORA_ADAPTER_DRIVE_PATH}")
        LORA_PATH = ""

if MODEL_PATH:
    print(f"\n📌 Model path: {MODEL_PATH}")
    if LORA_PATH:
        print(f"📌 LoRA adapter path: {LORA_PATH}")
    print("\n✅ Model ready for evaluation!")
else:
    print("\n❌ Model not ready - fix the paths or MODEL_TYPE above and re-run this cell")

## Step 5: Configure Evaluation Parameters

Set up the evaluation configuration.
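
The batch size in this step is chosen from GPU memory using fixed thresholds. As a minimal sketch of that rule in isolation, the hypothetical function `pick_batch_size` below mirrors the 40 GB and 24 GB cut-offs used in the configuration cell. The thresholds are the notebook's own heuristics, not measured limits.

```python
def pick_batch_size(gpu_memory_gb: float) -> int:
    """Map total GPU memory (GB) to an inference batch size.

    Mirrors the notebook's thresholds: 40 GB or more -> 4,
    24 GB or more -> 2, anything smaller -> 1.
    """
    if gpu_memory_gb >= 40:
        return 4
    if gpu_memory_gb >= 24:
        return 2
    return 1


for mem in (80, 40, 24, 15, 8):
    print(f"{mem} GB -> batch size {pick_batch_size(mem)}")
```

Keeping the rule in a function makes it easy to override for a particular model, for example by lowering the result after an out-of-memory error.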

In [None]:
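import importlib.util
import subprocess
import os

# Check GPU memory for optimal batch size
try:
    result = subprocess.run(['nvidia-smi', '--query-gpu=memory.total', '--format=csv,noheader,nounits'],
                          capture_output=True, text=True)
    # nvidia-smi prints one line per GPU; use the first GPU
    gpu_memory_mb = int(result.stdout.strip().splitlines()[0])
    gpu_memory_gb = gpu_memory_mb / 1024
    print(f"🎮 Detected GPU Memory: {gpu_memory_gb:.1f} GB")

    # Detect GPU architecture for flash attention
    gpu_name_result = subprocess.run(['nvidia-smi', '--query-gpu=name', '--format=csv,noheader'],
                                     capture_output=True, text=True)
    gpu_name = gpu_name_result.stdout.strip().splitlines()[0]
    print(f"🎮 GPU: {gpu_name}")

    gpu_supports_flash = any(x in gpu_name.upper() for x in ['A100', 'A10', 'RTX 30', 'RTX 40', 'H100', 'L4'])
    # flash-attn is optional in Step 1, so also confirm the package is importable
    flash_installed = importlib.util.find_spec("flash_attn") is not None
    supports_flash = gpu_supports_flash and flash_installed
    if supports_flash:
        print("✅ Flash Attention supported!")
    elif gpu_supports_flash:
        print("⚠️  GPU supports Flash Attention, but flash-attn is not installed (see Step 1)")
    else:
        print("⚠️  Flash Attention not supported on this GPU")
except Exception:
    # nvidia-smi is missing or its output could not be parsed
    gpu_memory_gb = 15.0
    supports_flash = False
    print(f"⚠️  Could not detect GPU, assuming {gpu_memory_gb} GB")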

# Auto-configure batch size based on GPU memory
if gpu_memory_gb >= 40:
    BATCH_SIZE = 4
elif gpu_memory_gb >= 24:
    BATCH_SIZE = 2
else:
    BATCH_SIZE = 1  # 15 GB-class GPUs and anything smaller

# Check for db_conn.json (repository or Google Drive)
DB_CONN_REPO_PATH = "../data/data_warehouse/train/db_conn.json"
DB_CONN_DRIVE_PATH = "/content/drive/MyDrive/XiYan-SQL/db_conn.json"
if os.path.exists(DB_CONN_REPO_PATH):
    DB_CONN_CONFIG = DB_CONN_REPO_PATH
elif os.path.exists(DB_CONN_DRIVE_PATH):
    DB_CONN_CONFIG = DB_CONN_DRIVE_PATH
else:
    DB_CONN_CONFIG = ""

# Evaluation configuration
EVAL_CONFIG = {
    # Model paths (set in Step 4); fall back to the defaults if Step 4 left them unset or None
    "model_path": globals().get("MODEL_PATH") or "../train/model/Qwen/Qwen2.5-Coder-3B-Instruct",
    "lora_path": globals().get("LORA_PATH") or "",

    # Test dataset (already in repository)
    "test_set_path": "datasets/nl2sql_standard_test.json",

    # Experiment version (for output naming)
    "expr_version": "colab_eval",

    # Inference parameters
    "batch_size": BATCH_SIZE,
    "use_flash_attention": supports_flash,
    "max_samples": None,  # Set to a number (e.g., 100) to test on subset

    # Evaluation parameters (for sql_eval.py)
    "db_conn_config": DB_CONN_CONFIG,  # Empty string if not found in repo or Drive
}

print("\n📋 Evaluation Configuration:")
print(f"  Model: {EVAL_CONFIG['model_path']}")
if EVAL_CONFIG["lora_path"]:
    print(f"  LoRA Adapter: {EVAL_CONFIG['lora_path']}")
print(f"  Test Dataset: {EVAL_CONFIG['test_set_path']}")
print(f"  Batch Size: {EVAL_CONFIG['batch_size']}")
print(f"  Flash Attention: {'✅ Enabled' if EVAL_CONFIG['use_flash_attention'] else '❌ Disabled'}")
if EVAL_CONFIG["max_samples"]:
    print(f"  Max Samples: {EVAL_CONFIG['max_samples']} (testing on subset)")
else:
    print("  Max Samples: All (full evaluation)")
if EVAL_CONFIG["db_conn_config"]:
    print(f"  Database Config: {EVAL_CONFIG['db_conn_config']} ✅")
else:
    print("  Database Config: Not found (evaluation will be skipped)")


## Step 6: Run Inference

Generate SQL predictions for the test dataset.
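
After inference, this notebook looks for the predictions file at a path built from the experiment version and the current date. The sketch below reproduces that naming pattern as the notebook expects it; whether `sql_infer.py` writes exactly this path is an assumption inherited from the notebook. `expected_results_path` is a hypothetical helper.

```python
import datetime
from pathlib import PurePosixPath


def expected_results_path(expr_version: str, day: datetime.date) -> str:
    """Build the results path the notebook looks for after inference.

    Pattern assumed from this notebook:
    datasets/output/<expr_version>/<expr_version>_<YYYYMMDD>_results.json
    """
    stamp = day.strftime("%Y%m%d")
    file_name = f"{expr_version}_{stamp}_results.json"
    return str(PurePosixPath("datasets", "output", expr_version, file_name))


print(expected_results_path("colab_eval", datetime.date(2025, 1, 31)))
# datasets/output/colab_eval/colab_eval_20250131_results.json
```

Because the date is part of the file name, a run that starts before midnight and finishes after it may not match the path computed afterwards; in that case, look in the output directory for the file stamped with the earlier date.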

In [None]:
import os
import subprocess
import sys

# Ensure we're in the evaluation directory
os.chdir('/content/XiYan-SQL/XiYan-SQLTraining/evaluation')

# Build inference command
cmd = [
    sys.executable, "sql_infer.py",
    "--model_name_or_path", EVAL_CONFIG["model_path"],
    "--test_set_path", EVAL_CONFIG["test_set_path"],
    "--expr_version", EVAL_CONFIG["expr_version"],
    "--batch_size", str(EVAL_CONFIG["batch_size"]),
]

# Add LoRA path if using adapter
if EVAL_CONFIG["lora_path"]:
    cmd.extend(["--lora_path", EVAL_CONFIG["lora_path"]])

# Add flash attention flag if enabled
if EVAL_CONFIG["use_flash_attention"]:
    cmd.append("--use_flash_attention")

# Add max samples if specified
if EVAL_CONFIG["max_samples"]:
    cmd.extend(["--max_samples", str(EVAL_CONFIG["max_samples"])])

print("🚀 Starting Inference")
print("="*60)
print(f"📁 Model: {EVAL_CONFIG['model_path']}")
print(f"📊 Test Dataset: {EVAL_CONFIG['test_set_path']}")
print(f"💾 Output Version: {EVAL_CONFIG['expr_version']}")
print(f"📦 Batch Size: {EVAL_CONFIG['batch_size']}")
print("="*60)
print("\n⏳ Inference will take some time depending on dataset size...")
print("💡 Keep this tab active to prevent disconnection\n")

# Show command
print("📝 Running command:")
print(" ".join(cmd))
print("\n" + "="*60 + "\n")

# Run inference
try:
    result = subprocess.run(cmd, cwd=os.getcwd(), check=False)

    if result.returncode == 0:
        print("\n" + "="*60)
        print("✅ Inference completed successfully!")

        # Find output file
        import datetime
        today = datetime.date.today().strftime('%Y%m%d')
        output_dir = os.path.join("datasets", "output", EVAL_CONFIG["expr_version"])
        output_file = os.path.join(output_dir, f"{EVAL_CONFIG['expr_version']}_{today}_results.json")

        if os.path.exists(output_file):
            import json
            with open(output_file, 'r') as f:
                results = json.load(f)
            print(f"📁 Results saved to: {output_file}")
            print(f"📊 Total predictions: {len(results)}")

            # Store output path for evaluation (cell-level variable, visible to later cells)
            PRED_SQL_PATH = output_file
        else:
            print(f"⚠️  Output file not found at expected path: {output_file}")

        print("="*60)
    else:
        print("\n" + "="*60)
        print(f"❌ Inference failed with return code {result.returncode}")
        print("="*60)
        print("\n💡 Common issues:")
        print("  - Model not found: Check MODEL_PATH in Step 4")
        print("  - Dataset not found: Check test_set_path")
        print("  - Out of memory: Try reducing batch_size")

except Exception as e:
    print(f"\n❌ Error during inference: {e}")

## Step 7: Run Evaluation

### Option A: Configure db_conn.json

The notebook will check for `db_conn.json` in the repository first (`data/data_warehouse/train/db_conn.json`), then Google Drive.
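
Before running the evaluation, a quick look at the config's keys can catch an incomplete file. The key names in this sketch come from the example structure this notebook prints when the file is missing; the schema that `sql_eval.py` actually requires may differ, so treat `REQUIRED_KEYS` and `missing_db_keys` as hypothetical.

```python
import json

# Key names assumed from the example structure printed by this notebook.
REQUIRED_KEYS = ("db_name", "db_user", "db_password", "db_host", "db_port", "dialect")


def missing_db_keys(config: dict) -> list[str]:
    """Return the assumed-required keys that the config lacks."""
    return [key for key in REQUIRED_KEYS if key not in config]


sample = json.loads('{"db_name": "your_database", "db_host": "hostname", "dialect": "sqlite"}')
print(missing_db_keys(sample))
# ['db_user', 'db_password', 'db_port']
```

To check a real file, load it with `json.load` and pass the resulting dictionary to the same function.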


In [None]:
# Check for db_conn.json - first in repository, then Google Drive
import os

# Path in repository (relative to evaluation directory)
DB_CONN_REPO_PATH = "../data/data_warehouse/train/db_conn.json"

# Path in Google Drive (if you uploaded it there)
DB_CONN_DRIVE_PATH = "/content/drive/MyDrive/XiYan-SQL/db_conn.json"

# Check repository first
if os.path.exists(DB_CONN_REPO_PATH):
    print("✅ Database config found in repository")
    print(f"📁 Location: {DB_CONN_REPO_PATH}")
    EVAL_CONFIG["db_conn_config"] = DB_CONN_REPO_PATH
elif os.path.exists(DB_CONN_DRIVE_PATH):
    # Copy from Google Drive to current directory
    import shutil
    DB_CONN_TARGET_PATH = "db_conn.json"
    shutil.copy2(DB_CONN_DRIVE_PATH, DB_CONN_TARGET_PATH)
    print("✅ Database config copied from Google Drive")
    print(f"📁 Location: {DB_CONN_TARGET_PATH}")
    EVAL_CONFIG["db_conn_config"] = DB_CONN_TARGET_PATH
else:
    print("⚠️  Database config not found")
    print(f"   Checked repository: {DB_CONN_REPO_PATH}")
    print(f"   Checked Google Drive: {DB_CONN_DRIVE_PATH}")
    print("\n💡 You can:")
    print("  1. Ensure db_conn.json exists at: data/data_warehouse/train/db_conn.json in the repository")
    print("  2. Or upload db_conn.json to Google Drive at the path above")
    print("  3. Or skip evaluation and manually inspect predictions")
    print("\n📝 Example db_conn.json structure:")
    # JSON has no comment syntax, so the dialect options are printed separately
    print("""{
  "db_name": "your_database",
  "db_user": "username",
  "db_password": "password",
  "db_host": "hostname",
  "db_port": 5432,
  "dialect": "postgresql"
}""")
    print('Set "dialect" to "postgresql", "mysql", or "sqlite".')



### Option B: Run Evaluation
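
The evaluation cell summarizes the saved results by counting per-sample flags. As a minimal sketch of that calculation, the hypothetical function `summarize` below assumes the results file holds a list of records with 0/1 fields named `executable`, `ex_eq`, and `ex_bird`, which is how this notebook reads it. It also guards against an empty list.

```python
def summarize(eval_results):
    """Count 0/1 flags per metric and convert them to percentages."""
    total = len(eval_results)
    summary = {"total": total}
    for key in ("executable", "ex_eq", "ex_bird"):
        hits = sum(1 for record in eval_results if record.get(key, 0) == 1)
        summary[key] = hits
        # Avoid dividing by zero when the results list is empty.
        summary[f"{key}_pct"] = round(hits / total * 100, 2) if total else 0.0
    return summary


# Made-up records for illustration only; not real evaluation output.
sample = [
    {"executable": 1, "ex_eq": 1, "ex_bird": 1},
    {"executable": 1, "ex_eq": 0, "ex_bird": 1},
    {"executable": 0, "ex_eq": 0, "ex_bird": 0},
    {"executable": 1, "ex_eq": 1, "ex_bird": 1},
]
print(summarize(sample))
```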

In [None]:
import os
import subprocess
import sys
import datetime

# Check if we have predictions and database config
if 'PRED_SQL_PATH' not in globals() or not os.path.exists(globals().get('PRED_SQL_PATH', '')):
    print("❌ No predictions found! Please run Step 6 (Inference) first.")
elif not EVAL_CONFIG.get('db_conn_config') or not os.path.exists(EVAL_CONFIG['db_conn_config']):
    print("⚠️  Database config not available. Skipping evaluation.")
    print("\n💡 You can still inspect the predictions manually:")
    if 'PRED_SQL_PATH' in globals():
        print(f"   📁 Predictions file: {globals()['PRED_SQL_PATH']}")
else:
    # Ensure we're in the evaluation directory
    os.chdir('/content/XiYan-SQL/XiYan-SQLTraining/evaluation')
    
    # Build evaluation command
    today = datetime.date.today().strftime('%Y%m%d')
    save_eval_path = f"datasets/output/{EVAL_CONFIG['expr_version']}/eval_results_{today}.json"
    os.makedirs(os.path.dirname(save_eval_path), exist_ok=True)
    
    cmd = [
        sys.executable, "sql_eval.py",
        "--pred_sql_path", globals()['PRED_SQL_PATH'],
        "--test_sql_path", EVAL_CONFIG["test_set_path"],
        "--db_conn_config", EVAL_CONFIG["db_conn_config"],
        "--save_eval_path", save_eval_path,
    ]
    
    print("🚀 Starting Evaluation")
    print("="*60)
    print(f"📁 Predictions: {globals()['PRED_SQL_PATH']}")
    print(f"📊 Ground Truth: {EVAL_CONFIG['test_set_path']}")
    print(f"💾 Results: {save_eval_path}")
    print("="*60)
    print("\n⏳ Evaluation will execute SQL queries against the database...")
    print("💡 This may take some time depending on dataset size\n")

    # Show command
    print("📝 Running command:")
    print(" ".join(cmd))
    print("\n" + "="*60 + "\n")
    
    # Run evaluation
    try:
        result = subprocess.run(cmd, cwd=os.getcwd(), check=False)

        if result.returncode == 0:
            print("\n" + "="*60)
            print("✅ Evaluation completed successfully!")
            print(f"📁 Results saved to: {save_eval_path}")
            print("="*60)

            # Try to read and display metrics
            try:
                import json
                with open(save_eval_path, 'r') as f:
                    eval_results = json.load(f)

                # Calculate metrics from results
                total = len(eval_results)
                ex_eq = sum(1 for r in eval_results if r.get('ex_eq', 0) == 1)
                ex_bird = sum(1 for r in eval_results if r.get('ex_bird', 0) == 1)
                executable = sum(1 for r in eval_results if r.get('executable', 0) == 1)

                print("\n📊 Evaluation Metrics:")
                print(f"  Total samples: {total}")
                if total == 0:
                    # Avoid dividing by zero when the results file is empty
                    print("  No samples in the results file; nothing to score.")
                else:
                    print(f"  Executable: {executable} ({executable/total*100:.2f}%)")
                    print(f"  Execution Accuracy (ex_eq): {ex_eq} ({ex_eq/total*100:.2f}%)")
                    print(f"  Execution Accuracy (ex_bird): {ex_bird} ({ex_bird/total*100:.2f}%)")
            except Exception as e:
                print(f"\n⚠️  Could not parse evaluation results: {e}")
        else:
            print("\n" + "="*60)
            print(f"❌ Evaluation failed with return code {result.returncode}")
            print("="*60)
            print("\n💡 Common issues:")
            print("  - Database connection failed: Check db_conn.json")
            print("  - Database not accessible from Colab: Use VPN or different connection")
            print("  - Predictions file not found: Re-run Step 6")

    except Exception as e:
        print(f"\n❌ Error during evaluation: {e}")

## Step 8: View Results (Optional)

Inspect a few predictions to verify the model is working correctly.
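
Each prediction record stores the full prompt, so showing just the question means cutting it out of that text. The sketch below assumes the prompt marks the question with `【Question】`. The hypothetical helper `extract_question` also handles a marker followed directly by a line break, and falls back to a truncated prompt when the marker is absent.

```python
def extract_question(prompt: str, marker: str = "【Question】", limit: int = 200) -> str:
    """Return the first non-empty line after the marker (assumed prompt format).

    Without the marker, return the prompt itself, truncated to `limit`
    characters plus "..." when it is longer.
    """
    if marker in prompt:
        after = prompt.split(marker, 1)[1]
        lines = after.strip().splitlines()
        return lines[0].strip() if lines else ""
    return prompt[:limit] + "..." if len(prompt) > limit else prompt


# Made-up prompt for illustration only.
prompt = "【Schema】\nusers(id, name)\n【Question】\nHow many users are there?\n【Evidence】"
print(extract_question(prompt))
```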

In [None]:
import json
import os

if 'PRED_SQL_PATH' in globals() and os.path.exists(globals()['PRED_SQL_PATH']):
    with open(globals()['PRED_SQL_PATH'], 'r') as f:
        results = json.load(f)

    print(f"📊 Total predictions: {len(results)}")
    print("\n" + "="*80)
    print("🔍 Sample Predictions (first 3)")
    print("="*80)

    for i, result in enumerate(results[:3], 1):
        print(f"\n{'─'*80}")
        print(f"Sample {i}:")
        print(f"{'─'*80}")

        # Extract question from conversations
        if 'conversations' in result and len(result['conversations']) > 0:
            question_text = result['conversations'][0]['content']
            # Extract just the question part (simplified)
            if '【Question】' in question_text:
                question = question_text.split('【Question】')[1].split('\n')[0].strip()
            else:
                question = question_text[:200] + "..." if len(question_text) > 200 else question_text
            print(f"\n❓ Question: {question}")

        # Show ground truth SQL
        if 'sql' in result:
            print("\n✅ Ground Truth SQL:")
            print(f"   {result['sql']}")

        # Show predicted SQL
        if 'pred_sql' in result:
            print("\n🤖 Predicted SQL:")
            print(f"   {result['pred_sql']}")

        # Show evaluation results if available
        if 'ex_eq' in result:
            ex_eq = result['ex_eq']
            ex_bird = result.get('ex_bird', 0)
            executable = result.get('executable', 0)
            print("\n📊 Evaluation:")
            print(f"   Executable: {'✅' if executable else '❌'}")
            print(f"   Execution Match (ex_eq): {'✅' if ex_eq else '❌'}")
            print(f"   Execution Match (ex_bird): {'✅' if ex_bird else '❌'}")

    print(f"\n{'='*80}")
    print("✅ Results inspection complete!")
    print(f"\n💡 Full results saved at: {globals()['PRED_SQL_PATH']}")
else:
    print("❌ No predictions found! Please run Step 6 (Inference) first.")

## Troubleshooting

### Out of Memory (OOM) Errors
- Reduce `batch_size` to 1 in Step 5
- Set `max_samples` to a smaller number for testing
- Keep flash attention enabled if your GPU supports it; it lowers attention memory use, so disabling it will not help with OOM

### Model Not Found
- Check that `MODEL_DRIVE_PATH` in Step 4 is correct
- Verify the model folder exists in Google Drive
- Ensure the model folder contains all required files (config.json, tokenizer files, etc.)

### Dataset Not Found
- The test dataset should be in `evaluation/datasets/nl2sql_standard_test.json`
- If missing, check that the repository was cloned correctly

### Database Connection Issues
- Evaluation (Step 7) requires database access
- If you don't have database access, you can skip evaluation and manually inspect predictions
- Make sure `db_conn.json` has the correct database credentials

### Slow Inference
- Colab free tier has limited GPU time
- Consider using Colab Pro for longer sessions
- Reduce `max_samples` for quick testing
- Enable flash attention if your GPU supports it