# Full Hybrid Pipeline: Thai-English NLP System
*Typhoon Translate 4B (GGUF) + XLM-RoBERTa Hierarchical Classifier + Typhoon 2.1 Instruct (4B)*

## Overview
This notebook implements a complete hybrid NLP pipeline for Thai-to-English grammar analysis with three components:
1. **Translation**: Typhoon Translate 4B (GGUF) for Thai → English translation
2. **Classification**: Custom XLM-RoBERTa model for hierarchical tense classification
3. **Explanation**: Typhoon 2.1 Instruct (4B) for detailed grammar explanations

### Pipeline Flow
Thai Input → Translation → Tense Classification → Grammar Explanation → Structured Output
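
The flow above can be sketched as three callables chained in order. This is a minimal illustration, not the notebook's implementation: `run_pipeline`, `PipelineResult`, and the `fake_*` stand-ins are hypothetical names introduced here, and the stubs return canned values instead of calling any model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Scored = List[Tuple[str, float]]  # (label, probability) pairs, best first


@dataclass
class PipelineResult:
    thai: str
    english: str
    coarse: Scored
    fine: Scored
    explanation: str


def run_pipeline(
    thai: str,
    translate: Callable[[str], str],
    classify: Callable[[str], Tuple[Scored, Scored]],
    explain: Callable[[str, str, str, str], str],
) -> PipelineResult:
    """Chain the stages: translate, classify the English, explain the top labels."""
    english = translate(thai)
    coarse, fine = classify(english)
    explanation = explain(thai, english, coarse[0][0], fine[0][0])
    return PipelineResult(thai, english, coarse, fine, explanation)


# Hypothetical stand-ins so the sketch runs without any model loaded.
def fake_translate(thai: str) -> str:
    return "I went to school yesterday."


def fake_classify(english: str) -> Tuple[Scored, Scored]:
    return [("Past", 0.99)], [("NORFIN", 0.98)]


def fake_explain(thai: str, english: str, coarse: str, fine: str) -> str:
    return f"{fine} ({coarse}): {english}"


result = run_pipeline("<Thai sentence>", fake_translate, fake_classify, fake_explain)
print(result.explanation)
```

Passing the stages in as callables keeps each model swappable and lets the flow be exercised before any weights are downloaded.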

---
## Section 1: Environment Setup

In [None]:
# 1.1 Install Build Tools (Optional - only if CUDA wheel not available)
!apt-get update -y && apt-get install -y build-essential cmake ninja-build
!pip install -U 'pip>=23.3'

Get:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,632 B]
Hit:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Hit:3 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:4 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Get:5 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Get:6 https://r2u.stat.illinois.edu/ubuntu jammy InRelease [6,555 B]
Get:7 https://r2u.stat.illinois.edu/ubuntu jammy/main amd64 Packages [2,758 kB]
Get:8 http://archive.ubuntu.com/ubuntu jammy
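
The next cell (1.2) turns the CUDA version reported by the driver into a wheel index tag such as `cu124`. A minimal sketch of that mapping in isolation, assuming the version appears in the `nvidia-smi` banner as `CUDA Version: X.Y`; `cuda_wheel_tag` is a hypothetical helper and the sample banner line is made up for illustration.

```python
import re
from typing import Optional


def cuda_wheel_tag(nvidia_smi_output: str) -> Optional[str]:
    """Return a tag like 'cu124' from nvidia-smi text, or None if no version is found."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    if match is None:
        return None
    major, minor = match.groups()
    return f"cu{major}{minor}"


# Made-up banner line in the usual nvidia-smi layout.
sample = "| NVIDIA-SMI 550.54.15    Driver Version: 550.54.15    CUDA Version: 12.4  |"
print(cuda_wheel_tag(sample))
print(cuda_wheel_tag("no version in this text"))
```

Returning `None` instead of slicing whatever text came back makes a failed detection visible before it ends up inside a URL.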

In [None]:
# 1.2 Install llama-cpp-python with CUDA support
import os, re, subprocess, sys

# Detect the CUDA version and try a prebuilt wheel first.
# The original `nvidia-smi --query-gpu=cuda_version` query returned an error message
# rather than a version in the recorded run (the tag came out as "cuFie"), so the
# version is parsed from the standard nvidia-smi banner instead.
smi_output = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
if match is None:
    raise RuntimeError("Could not detect a CUDA version from nvidia-smi output")
cu_tag = "".join(match.groups())  # e.g. "12.4" -> "124"

extra_index = f"https://abetlen.github.io/llama-cpp-python/whl/cu{cu_tag}"
print(f"🔍 Trying prebuilt wheel from: {extra_index}")

try:
    # --only-binary makes pip fail when the index has no matching wheel,
    # instead of falling back to a source build from PyPI without CUDA flags.
    subprocess.check_call([
        sys.executable, "-m", "pip", "install", "llama-cpp-python",
        "--only-binary", "llama-cpp-python",
        "--extra-index-url", extra_index
    ])
    print("✅ Prebuilt wheel installed successfully")
except subprocess.CalledProcessError:
    print("⚠️ Prebuilt wheel not found - compiling from source...")
    arch = "80" if "A100" in smi_output else "75"

    # Build llama-cpp-python itself (not the llama.cpp repo) with CUDA enabled.
    # Recent releases use -DGGML_CUDA=on; older ones used -DLLAMA_CUBLAS=on.
    env = os.environ.copy()
    env["CMAKE_ARGS"] = f"-DGGML_CUDA=on -DCMAKE_CUDA_ARCHITECTURES={arch}"
    env["FORCE_CMAKE"] = "1"

    subprocess.check_call([
        sys.executable, "-m", "pip", "install", "--no-cache-dir",
        "--force-reinstall", "llama-cpp-python"
    ], env=env)

# Install core dependencies
!pip install -q -U huggingface_hub transformers accelerate safetensors

print("✅ llama-cpp-python and dependencies ready")

🔍 Trying prebuilt wheel from: https://abetlen.github.io/llama-cpp-python/whl/cuFie
✅ Prebuilt wheel installed successfully

In [None]:
# 1.3 Mount Google Drive and Setup Directories
from google.colab import drive
import os, torch, gc, time

drive.mount("/content/drive")

# Define project structure
BASE_DIR = "/content/drive/MyDrive/Dissertation/Model v.5"
MODEL_DIR = f"{BASE_DIR}/models"
RESULTS_DIR = f"{BASE_DIR}/results"
CACHE_DIR = f"{BASE_DIR}/cache"

# Create directories
for directory in [MODEL_DIR, RESULTS_DIR, CACHE_DIR]:
    os.makedirs(directory, exist_ok=True)

# Optimize PyTorch settings
torch.set_float32_matmul_precision("high")

print("✅ Google Drive mounted and directories configured")
print(f"📁 Base directory: {BASE_DIR}")
print(f"📁 Models: {MODEL_DIR}")
print(f"📁 Results: {RESULTS_DIR}")

Mounted at /content/drive
✅ Google Drive mounted and directories configured
📁 Base directory: /content/drive/MyDrive/Dissertation/Model v.5
📁 Models: /content/drive/MyDrive/Dissertation/Model v.5/models
📁 Results: /content/drive/MyDrive/Dissertation/Model v.5/results


---
## Section 2: Model Architecture Definitions

In [None]:
# 2.1 Hierarchical BERT Classifier Architecture
import torch.nn as nn
from transformers import XLMRobertaModel, AutoConfig, PreTrainedModel

class XLMRHierClassifier(PreTrainedModel):
    """
    Hierarchical XLM-RoBERTa classifier for tense classification
    Predicts both coarse (Past/Present/Future) and fine-grained tense labels
    """
    config_class = AutoConfig

    def __init__(self, config, n_coarse=3, n_fine=25, coarse_w=0.3):
        super().__init__(config)
        self.encoder = XLMRobertaModel(config, add_pooling_layer=False)
        h = self.encoder.config.hidden_size

        # Dual classification heads
        self.coarse_head = nn.Linear(h, n_coarse)  # Past/Present/Future
        self.fine_head = nn.Linear(h, n_fine)      # Detailed tense categories

        self.crit = nn.CrossEntropyLoss()
        self.coarse_w = coarse_w  # Loss weighting
        self.post_init()

    def forward(self, input_ids=None, attention_mask=None, labels=None, **_):
        # Encode input
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        pooled = hidden[:, 0]  # Use CLS token

        # Predict both levels
        logits_c = self.coarse_head(pooled)
        logits_f = self.fine_head(pooled)

        if labels is None:
            return {"logits": (logits_c, logits_f)}

        # Calculate hierarchical loss
        lab_c, lab_f = labels[:, 0], labels[:, 1]
        loss_c = self.crit(logits_c, lab_c)

        mask = lab_f != -100
        if mask.any():
            loss_f = self.crit(logits_f[mask], lab_f[mask])
            loss = self.coarse_w * loss_c + (1 - self.coarse_w) * loss_f
        else:
            loss = loss_c

        return {"loss": loss, "logits": (logits_c, logits_f)}

print("✅ Hierarchical classifier architecture defined")

✅ Hierarchical classifier architecture defined
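
The loss in `forward` mixes a coarse and a fine cross-entropy term, and skips the fine term for examples whose fine label is `-100`. The arithmetic can be checked without PyTorch. The sketch below is a plain-Python re-derivation under stated assumptions: it uses mean reduction (the `nn.CrossEntropyLoss` default), a 4-way fine head instead of 25 to keep the numbers short, and hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple

IGNORE_INDEX = -100  # same sentinel the classifier uses for "no fine label"


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-softmax of the target class for one example."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def hierarchical_loss(
    coarse_logits: List[List[float]],
    fine_logits: List[List[float]],
    labels: List[Tuple[int, int]],  # (coarse_id, fine_id) per example
    coarse_w: float = 0.3,
) -> float:
    loss_c = mean([cross_entropy(lg, c) for lg, (c, _) in zip(coarse_logits, labels)])
    # Only examples that carry a fine label contribute to the fine term.
    fine_terms = [
        cross_entropy(lg, f)
        for lg, (_, f) in zip(fine_logits, labels)
        if f != IGNORE_INDEX
    ]
    if not fine_terms:
        return loss_c
    return coarse_w * loss_c + (1 - coarse_w) * mean(fine_terms)


# Uniform logits make each term easy to verify: ln(3) coarse, ln(4) fine.
coarse_logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
fine_logits = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
mixed = hierarchical_loss(coarse_logits, fine_logits, [(0, 2), (1, IGNORE_INDEX)])
coarse_only = hierarchical_loss(
    coarse_logits, fine_logits, [(0, IGNORE_INDEX), (1, IGNORE_INDEX)]
)
print(mixed, coarse_only)
```

With `coarse_w=0.3` the fine term carries 70% of the weight, so the coarse head acts mainly as an auxiliary signal whenever fine labels are present.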


In [None]:
# 2.2 Download and Load Models
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer, AutoConfig
from safetensors.torch import load_file as safe_load_file
import json

# Model paths and configurations
BERT_CLS_PATH = f"{BASE_DIR}/bert-tense-hier/best"
TRANS_REPO = "scb10x/typhoon-translate-4b-gguf"
TRANS_FILE = "typhoon-translate-4b-q4_k_m.gguf"
TRANS_PATH = f"{MODEL_DIR}/{TRANS_FILE}"
EXPL_REPO = "scb10x/typhoon2.1-gemma3-4b"
EXPL_DIR = f"{MODEL_DIR}/typhoon2.1-gemma3-4b"

print("📥 Downloading/verifying models...")

# Download GGUF translator if needed
if not os.path.exists(TRANS_PATH):
    print("⬇️ Downloading GGUF translator...")
    hf_hub_download(
        repo_id=TRANS_REPO,
        filename=TRANS_FILE,
        local_dir=MODEL_DIR  # interrupted downloads resume by default in recent huggingface_hub
    )
    print(f"✅ Translator saved to {TRANS_PATH}")
else:
    print("✅ GGUF translator already exists")

# Download explainer model if needed.
# snapshot_download(local_dir=...) writes config, tokenizer and weights directly into
# EXPL_DIR, so from_pretrained(EXPL_DIR) works later. Using cache_dir= instead nests the
# files under models--<org>--<name>/snapshots/<hash>/ and loads the full model into
# memory just to download it.
if not os.path.exists(EXPL_DIR):
    print("⬇️ Downloading explainer model...")
    from huggingface_hub import snapshot_download
    snapshot_download(repo_id=EXPL_REPO, local_dir=EXPL_DIR)
    print(f"✅ Explainer downloaded to {EXPL_DIR}")
else:
    print("✅ Explainer model already exists")

# Load BERT classifier
print(f"📂 Loading BERT classifier from: {BERT_CLS_PATH}")
config = AutoConfig.from_pretrained(BERT_CLS_PATH)
config.add_pooling_layer = False

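# n_fine must match the fine head stored in the checkpoint. Note that the run below
# reports 24 names in fine_labels.json, so one of the 25 head indices has no label.
bert_cls = XLMRHierClassifier(
    config=config,
    n_coarse=3,
    n_fine=25,
    coarse_w=0.3
)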

# Load weights
weights = safe_load_file(os.path.join(BERT_CLS_PATH, "model.safetensors"))
bert_cls.load_state_dict(weights)

# Move to device
device = "cuda" if torch.cuda.is_available() else "cpu"
bert_cls.eval().to(device)

# Load label mappings
with open(f"{BERT_CLS_PATH}/coarse_labels.json") as f:
    coarse_labels = json.load(f)
with open(f"{BERT_CLS_PATH}/fine_labels.json") as f:
    fine_labels = json.load(f)

print(f"✅ BERT classifier loaded on {device}")
print(f"✅ Label mappings loaded: {len(coarse_labels)} coarse, {len(fine_labels)} fine labels")

📥 Downloading/verifying models...
✅ GGUF translator already exists
✅ Explainer model already exists
📂 Loading BERT classifier from: /content/drive/MyDrive/Dissertation/Model v.5/bert-tense-hier/best
✅ BERT classifier loaded on cuda
✅ Label mappings loaded: 3 coarse, 24 fine labels
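
The log above reports 24 fine labels, while the classifier is constructed with `n_fine=25`. If the head ever predicts the index that has no name, a plain list lookup would raise `IndexError`. A small defensive sketch follows; `unlabeled_indices` and `label_for` are hypothetical helpers, and the `TAG_i` names are placeholders rather than the real label set.

```python
from typing import List, Sequence


def unlabeled_indices(n_outputs: int, labels: Sequence[str]) -> List[int]:
    """Head indices that have no entry in the label list."""
    return list(range(len(labels), n_outputs))


def label_for(index: int, labels: Sequence[str]) -> str:
    """Look up a label name, falling back to a marker instead of raising."""
    if 0 <= index < len(labels):
        return labels[index]
    return f"UNKNOWN_{index}"


fine_names = [f"TAG_{i}" for i in range(24)]  # placeholder names, 24 as in the log
missing = unlabeled_indices(25, fine_names)
print(missing, label_for(3, fine_names), label_for(24, fine_names))
```

Running a check like `unlabeled_indices` right after loading would surface the mismatch at startup rather than on an unlucky prediction.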


In [None]:
# 2.3 Test BERT Classifier
import torch
import torch.nn.functional as F

# Load tokenizer for testing
bert_tokenizer = AutoTokenizer.from_pretrained(BERT_CLS_PATH)

# Test with sample sentence
test_sentence = "I used to go to school"
print(f"🧪 Testing classifier with: '{test_sentence}'")

# Tokenize and predict
inputs = bert_tokenizer(test_sentence, return_tensors="pt").to(device)

with torch.no_grad():
    logits = bert_cls(**inputs)["logits"]
    coarse_logits, fine_logits = logits

# Convert to probabilities
coarse_probs = F.softmax(coarse_logits, dim=1)
fine_probs = F.softmax(fine_logits, dim=1)

# Get top 3 predictions
coarse_topk = torch.topk(coarse_probs, k=3, dim=1)
fine_topk = torch.topk(fine_probs, k=3, dim=1)

print("\n🏷️ Top 3 Coarse Predictions:")
for i in range(3):
    idx = coarse_topk.indices[0][i].item()
    prob = coarse_topk.values[0][i].item()
    print(f"   {i+1}. {coarse_labels[idx]} ({prob:.2%})")

print("\n🏷️ Top 3 Fine-grained Predictions:")
for i in range(3):
    idx = fine_topk.indices[0][i].item()
    prob = fine_topk.values[0][i].item()
    print(f"   {i+1}. {fine_labels[idx]} ({prob:.2%})")

print("\n✅ BERT classifier working correctly")

🧪 Testing classifier with: 'I used to go to school'

🏷️ Top 3 Coarse Predictions:
   1. Past (99.70%)
   2. Future (0.16%)
   3. Present (0.14%)

🏷️ Top 3 Fine-grained Predictions:
   1. NORFIN (99.94%)
   2. DOINGATSOMETIMEPAST (0.01%)
   3. HABIT (0.01%)

✅ BERT classifier working correctly
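
The test cell converts logits to probabilities with softmax and then takes the top three. The same two steps in plain Python, as a sketch: `softmax` and `top_k` are hypothetical helpers, the logits are invented, and `k` is clamped so that asking for more entries than there are classes cannot fail.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits: Sequence[float], labels: Sequence[str], k: int = 3) -> List[Tuple[str, float]]:
    """Return up to k (label, probability) pairs, most probable first."""
    probs = softmax(logits)
    k = min(k, len(probs))  # never request more entries than there are classes
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(labels[i], probs[i]) for i in order]


coarse_names = ["Past", "Present", "Future"]
ranked = top_k([2.0, 0.0, 1.0], coarse_names, k=5)  # invented logits
for name, prob in ranked:
    print(f"{name}: {prob:.2%}")
```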


---
## Section 3: Complete Hybrid Pipeline System

### 3.1 Tense Definitions and Classifications

In [None]:
# 3.2 Complete Hybrid Pipeline Implementation
import torch, json, time, re
from safetensors.torch import load_file as safe_load_file
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoConfig
from llama_cpp import Llama

class TenseTagDefinitions:
    """Comprehensive tense tag definitions for grammar explanations"""

    def __init__(self):
        # Coarse-level definitions
        self.coarse_definitions = {
            "PAST": "หมายถึง เหตุการณ์หรือการกระทำที่เกิดขึ้นและจบลงแล้วในอดีต",
            "PRESENT": "หมายถึง เหตุการณ์หรือการกระทำที่เกิดขึ้นในปัจจุบัน หรือเป็นความจริงทั่วไป",
            "FUTURE": "หมายถึง เหตุการณ์หรือการกระทำที่จะเกิดขึ้นในอนาคต"
        }

        # Fine-grained tense definitions with detailed explanations
        self.fine_definitions = {
            # Present Simple categories
            "HABIT": {
                "tense": "Present Simple",
                "thai_name": "Present Simple - กิจวัตร/นิสัย",
                "usage": "ใช้เมื่อพูดถึงกิจวัตรหรือพฤติกรรมที่ทำเป็นประจำ",
                "structure": "Subject + V1 (ถ้าประธานเอกพจน์เติม s/es)",
                "keywords": "always, usually, often, sometimes, every day",
                "example": "I drink coffee every morning."
            },
            "FACT": {
                "tense": "Present Simple",
                "thai_name": "Present Simple - ข้อเท็จจริง",
                "usage": "ใช้กับข้อเท็จจริงที่เป็นสัจธรรมหรือเป็นความรู้ทางวิทยาศาสตร์",
                "structure": "Subject + V1",
                "keywords": "ข้อเท็จจริงทั่วไป, สัจธรรม",
                "example": "The sun rises in the east."
            },
            "SCHEDULEDFUTURE": {
                "tense": "Present Simple",
                "thai_name": "Present Simple - ตารางเวลา/แผนการที่กำหนดไว้",
                "usage": "ใช้เมื่อกล่าวถึงตารางเวลา ตารางเดินรถ แผนที่กำหนดไว้แน่นอน หรือแผนการในอนาคตที่วางไว้",
                "structure": "Subject + V1",
                "keywords": "schedule, timetable, plan to, intend to",
                "example": "The train leaves at 9 AM. / I plan to study abroad next year."
            },
            "SAYING": {
                "tense": "Present Simple",
                "thai_name": "Present Simple - สุภาษิต/คำพังเพย",
                "usage": "ใช้กับสุภาษิต คำพังเพย หรือคำกล่าวทั่วไป",
                "structure": "Subject + V1",
                "keywords": "สุภาษิต, คำกล่าว",
                "example": "Practice makes perfect."
            },
            "HEADLINE": {
                "tense": "Present Simple",
                "thai_name": "Present Simple - พาดหัวข่าว",
                "usage": "ใช้ในพาดหัวข่าวหรือข้อความที่เน้นย่อประโยค แม้จะพูดถึงเหตุการณ์ที่เกิดขึ้นแล้ว เพื่อสร้างความรู้สึกสดใหม่",
                "structure": "Subject + V1",
                "keywords": "พาดหัวข่าว",
                "example": "Prime Minister visits flood victims."
            },

            # Present Continuous categories
            "HAPPENING": {
                "tense": "Present Continuous",
                "thai_name": "Present Continuous - กำลังเกิดขึ้น",
                "usage": "สิ่งที่ทำอยู่ขณะพูด",
                "structure": "Subject + is/am/are + V-ing",
                "keywords": "now, right now, at the moment",
                "example": "I am writing an email now."
            },
            "NOWADAYS": {
                "tense": "Present Continuous",
                "thai_name": "Present Continuous - ช่วงนี้",
                "usage": "สิ่งที่ทำอยู่ในช่วงนี้ เช่น โปรเจค หรือสิ่งที่ใช้เวลาทำนานเป็นหลักวัน",
                "structure": "Subject + is/am/are + V-ing",
                "keywords": "these days, nowadays, currently",
                "example": "I am working on a big project these days."
            },
            "SUREFUT": {
                "tense": "Present Continuous",
                "thai_name": "Present Continuous - อนาคตที่วางแผนไว้",
                "usage": "เหตุการณ์ที่จะเกิดขึ้นในอนาคตโดยมีการวางแผนไว้แล้ว มักเจอ be + going to",
                "structure": "Subject + is/am/are + going to + V1",
                "keywords": "tomorrow, next week, planning",
                "example": "I am going to visit my parents tomorrow."
            },
            "PROGRESS": {
                "tense": "Present Continuous",
                "thai_name": "Present Continuous - กำลังเปลี่ยนแปลง",
                "usage": "เหตุการณ์ที่กำลังมีการเปลี่ยนแปลง พัฒนาขึ้น หรือก้าวหน้าขึ้น",
                "structure": "Subject + is/am/are + V-ing",
                "keywords": "changing, improving, getting better",
                "example": "The weather is getting warmer."
            },

            # Present Perfect categories
            "JUSTFIN": {
                "tense": "Present Perfect",
                "thai_name": "Present Perfect - เพิ่งจบ",
                "usage": "ใช้เมื่อเหตุการณ์เพิ่งจะสิ้นสุดลง",
                "structure": "Subject + have/has + V3",
                "keywords": "just, just now",
                "example": "I have just finished my homework."
            },
            "RESULT": {
                "tense": "Present Perfect",
                "thai_name": "Present Perfect - มีผลถึงปัจจุบัน",
                "usage": "สิ่งที่เกิดขึ้นตั้งแต่อดีตและมีผลหรือคงสภาพจนถึงปัจจุบัน",
                "structure": "Subject + have/has + V3",
                "keywords": "already, yet, still",
                "example": "I have lost my keys."
            },
            "EXP": {
                "tense": "Present Perfect",
                "thai_name": "Present Perfect - ประสบการณ์",
                "usage": "ประสบการณ์ (เจอคำว่า First / ... time)",
                "structure": "Subject + have/has + V3",
                "keywords": "ever, never, first time",
                "example": "This is the first time I have visited Japan."
            },

            # Present Perfect Continuous
            "SINCEFOR": {
                "tense": "Present Perfect Continuous",
                "thai_name": "Present Perfect Continuous - ทำมาตั้งแต่",
                "usage": "สิ่งที่ทำเรื่อยมาจนถึงปัจจุบัน โดยเน้นระยะเวลา มักเจอ for/since",
                "structure": "Subject + have/has + been + V-ing",
                "keywords": "for, since, all day",
                "example": "I have been studying for 3 hours."
            },

            # Past Simple
            "NORFIN": {
                "tense": "Past Simple",
                "thai_name": "Past Simple - อดีตทั่วไป",
                "usage": "การกระทำในอดีตทั่วไป โดยไม่มีบริบทหรือรายละเอียดเพิ่มเติม",
                "structure": "Subject + V2",
                "keywords": "yesterday, last week, ago",
                "example": "I went to school yesterday."
            },

            # Past Continuous
            "INTERRUPT": {
                "tense": "Past Continuous",
                "thai_name": "Past Continuous - ถูกขัดจังหวะ",
                "usage": "ใช้คู่กับ Past Simple เพื่อบอกว่าเหตุการณ์ใน Past simple เกิดแทรกอีกเหตุการณ์ที่กำลังทำในอดีต",
                "structure": "Subject + was/were + V-ing + when + Past Simple",
                "keywords": "when, while",
                "example": "I was sleeping when the phone rang."
            },
            "DOINGATSOMETIMEPAST": {
                "tense": "Past Continuous",
                "thai_name": "Past Continuous - กำลังทำในอดีต",
                "usage": "สิ่งที่กำลังทำอยู่ ณ เวลาหนึ่งในอดีต",
                "structure": "Subject + was/were + V-ing",
                "keywords": "at ... yesterday, at that time",
                "example": "I was reading at 8 PM yesterday."
            },

            # Past Perfect
            "BEFOREPAST": {
                "tense": "Past Perfect",
                "thai_name": "Past Perfect - ก่อนเหตุการณ์ในอดีต",
                "usage": "ใช้เมื่อต้องการแสดงว่าเหตุการณ์หนึ่งเกิดขึ้นและเสร็จก่อนอีกเหตุการณ์ในอดีต",
                "structure": "Subject + had + V3",
                "keywords": "before, after, already",
                "example": "She had finished homework before she ate dinner."
            },

            # Past Perfect Continuous
            "DURATION": {
                "tense": "Past Perfect Continuous",
                "thai_name": "Past Perfect Continuous - ทำมาก่อนในอดีต",
                "usage": "ใช้คู่กับ Past Simple เพื่อบอกสิ่งที่กำลังทำอยู่สักพักหนึ่งก่อนอีกสิ่งจะเกิดในอดีต",
                "structure": "Subject + had + been + V-ing",
                "keywords": "for, since, before",
                "example": "I had been waiting for 2 hours before he arrived."
            },

            # Future Simple
            "50PERC": {
                "tense": "Future Simple",
                "thai_name": "Future Simple - คาดการณ์ 50%",
                "usage": "สิ่งที่คาดจะเกิด หรือมีแนวโน้มในอนาคต (50%)",
                "structure": "Subject + will + V1",
                "keywords": "probably, maybe, I think",
                "example": "It will probably rain tomorrow."
            },
            "PROMISE": {
                "tense": "Future Simple",
                "thai_name": "Future Simple - สัญญา/เสนอ",
                "usage": "การให้คำสัญญา หรือเสนออะไรให้ใคร",
                "structure": "Subject + will + V1",
                "keywords": "promise, offer",
                "example": "I will help you with your homework."
            },
            "RIGHTNOW": {
                "tense": "Future Simple",
                "thai_name": "Future Simple - ตัดสินใจทันที",
                "usage": "สิ่งที่เพิ่งคิดว่าจะทำเดี๋ยวนั้น (ไม่ได้วางแผนว่าจะทำมาก่อน)",
                "structure": "Subject + will + V1",
                "keywords": "OK, I'll..., spontaneous decision",
                "example": "The doorbell is ringing. I'll answer it."
            },

            # Future Continuous
            "LONGFUTURE": {
                "tense": "Future Continuous",
                "thai_name": "Future Continuous - กำลังทำในอนาคต",
                "usage": "สิ่งที่คาดว่าน่าจะเกิดขึ้น คงจะทำอยู่ หรือวางแผนว่าจะทำ ณ เวลาหนึ่งในอนาคต",
                "structure": "Subject + will + be + V-ing",
                "keywords": "at ... tomorrow, this time next week",
                "example": "I will be studying at 8 PM tomorrow."
            },

            # Future Perfect
            "PREDICT": {
                "tense": "Future Perfect",
                "thai_name": "Future Perfect - จะเสร็จก่อน",
                "usage": "สิ่งที่คาดว่าคงจบ เสร็จสิ้น หรือครบเวลาแล้ว ณ เวลาหนึ่งในอนาคต",
                "structure": "Subject + will + have + V3",
                "keywords": "by, by the time",
                "example": "I will have finished by 5 PM."
            },

            # Future Perfect Continuous
            "WILLCONTINUEINFUTURE": {
                "tense": "Future Perfect Continuous",
                "thai_name": "Future Perfect Continuous - ทำต่อเนื่องในอนาคต",
                "usage": "สิ่งที่คาดว่าคงจบ เสร็จสิ้น หรือครบเวลาแล้ว ณ เวลาหนึ่งในอนาคต และจะคงสภาพหรือทำแบบนี้ต่อไปอีก",
                "structure": "Subject + will + have + been + V-ing",
                "keywords": "for, by the time",
                "example": "By next year, I will have been working here for 10 years."
            }
        }

class Hybrid4BSystem:
    """
    Complete hybrid NLP pipeline system combining:
    - GGUF Translator (Typhoon Translate 4B)
    - BERT Classifier (XLM-RoBERTa Hierarchical)
    - Instruct Explainer (Typhoon 2.1 4B)
    """

    def __init__(self):
        # Model paths
        self.gguf_path = TRANS_PATH
        self.exp_dir = EXPL_DIR
        self.cls_dir = BERT_CLS_PATH

        # Model instances
        self.translator = None
        self.expl_tokenizer = None
        self.expl_model = None
        self.cls_tokenizer = None
        self.cls_model = None

        # Device configuration
        self.device = "cuda" if torch.cuda.is_available() else "cpu"

        # Initialize components
        self.tense_defs = TenseTagDefinitions()
        self._load_label_mappings()
        self._setup_examples()

        # Caching for efficiency
        self.translation_cache = {}
        self.explanation_cache = {}
        self.max_cache_size = 100

    def _load_label_mappings(self):
        """Load tense label mappings"""
        with open(f"{self.cls_dir}/fine_labels.json") as f:
            self.id2fine = json.load(f)
        with open(f"{self.cls_dir}/coarse_labels.json") as f:
            self.id2coarse = json.load(f)

    def _setup_examples(self):
        """Setup few-shot examples for improved translation"""
        self.translation_examples = [
            {
                "thai": "ฉันกินข้าวเป็นประจำทุกเช้า",
                "english": "I eat breakfast every morning.",
                "fine": "HABIT",
                "coarse": "PRESENT"
            },
            {
                "thai": "เขากำลังทำงานอยู่ตอนนี้",
                "english": "He is working right now.",
                "fine": "HAPPENING",
                "coarse": "PRESENT"
            },
            {
                "thai": "ฉันเพิ่งทำการบ้านเสร็จ",
                "english": "I have just finished my homework.",
                "fine": "JUSTFIN",
                "coarse": "PRESENT"
            }
        ]

    def load_translator(self, n_gpu_layers=-1):
        """Load GGUF translator model"""
        if self.translator is None:
            print("🔌 Loading GGUF translator...")
            self.translator = Llama(
                model_path=self.gguf_path,
                n_gpu_layers=n_gpu_layers,
                n_ctx=4096,
                n_batch=512,
                f16_kv=True,
                use_mlock=True,
                use_mmap=True,
                logits_all=False,
                verbose=False
            )
            print("✅ Translator loaded successfully")

    def load_explainer(self):
        """Load Transformers explainer model"""
        if self.expl_model is None:
            print("🔌 Loading explainer model...")
            # Configure torch compilation
            import torch._dynamo
            torch._dynamo.config.suppress_errors = True
            torch._dynamo.config.cache_size_limit = 64

            self.expl_tokenizer = AutoTokenizer.from_pretrained(self.exp_dir, use_fast=True)
            self.expl_model = AutoModelForCausalLM.from_pretrained(
                self.exp_dir,
                torch_dtype=torch.bfloat16,
                device_map="auto",
                low_cpu_mem_usage=True,
                attn_implementation="eager"
            ).eval()
            print("✅ Explainer loaded successfully")

    def load_classifier(self):
        """Load BERT classifier model"""
        if self.cls_model is None:
            print("🔌 Loading BERT classifier...")
            self.cls_tokenizer = AutoTokenizer.from_pretrained(self.cls_dir)
            config = AutoConfig.from_pretrained(self.cls_dir)
            self.cls_model = XLMRHierClassifier(config, n_coarse=3, n_fine=25, coarse_w=0.3)
            weights = safe_load_file(f"{self.cls_dir}/model.safetensors")
            self.cls_model.load_state_dict(weights)
            self.cls_model.eval().to(self.device)
            print("✅ Classifier loaded successfully")

    def translate(self, thai_text, max_tokens=80):
        """Translate Thai text to English"""
        if self.translator is None:
            raise RuntimeError("Translator not loaded")

        # Check cache
        if thai_text in self.translation_cache:
            return self.translation_cache[thai_text]

        # Simple prompt for non-instruct model
        prompt = f"""Translate the following Thai sentence to English:
Thai: {thai_text}
English:"""

        start_time = time.time()
        output = self.translator(
            prompt,
            max_tokens=max_tokens,
            temperature=0.1,
            top_p=0.95,
            top_k=40,
            repeat_penalty=1.1,
            stop=["\n", "Thai:", "Translate"]
        )
        duration = time.time() - start_time

        # Clean output
        translation = output["choices"][0]["text"].strip()
        translation = translation.split("\n")[0]

        # Cache result
        if len(self.translation_cache) >= self.max_cache_size:
            self.translation_cache.pop(next(iter(self.translation_cache)))
        self.translation_cache[thai_text] = (translation, duration)

        return translation, duration

    def predict_tense(self, english_sentence, top_k=3):
        """Predict tense labels with confidence scores"""
        if self.cls_model is None or self.cls_tokenizer is None:
            raise RuntimeError("Classifier not loaded")

        inputs = self.cls_tokenizer(english_sentence, return_tensors="pt").to(self.device)

        with torch.inference_mode():
            logits = self.cls_model(**inputs)["logits"]
            coarse_logits, fine_logits = logits

        # Convert to probabilities
        coarse_probs = torch.nn.functional.softmax(coarse_logits, dim=1)
        fine_probs = torch.nn.functional.softmax(fine_logits, dim=1)

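        # Get top-k predictions (k is capped at the number of classes in each head,
        # since torch.topk raises an error when k exceeds the dimension size)
        coarse_topk = torch.topk(coarse_probs, k=min(top_k, coarse_probs.shape[1]), dim=1)
        fine_topk = torch.topk(fine_probs, k=min(top_k, fine_probs.shape[1]), dim=1)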

        # Format results
        coarse_result = [(self.id2coarse[i.item()], p.item())
                        for i, p in zip(coarse_topk.indices[0], coarse_topk.values[0])]
        fine_result = [(self.id2fine[i.item()], p.item())
                      for i, p in zip(fine_topk.indices[0], fine_topk.values[0])]

        return coarse_result, fine_result

    def analyze_sentence_context(self, thai_text, english_translation):
        """Analyze sentence context for better explanations"""
        # Determine sentence type
        if any(word in thai_text for word in ['อุณหภูมิ', 'องศา', 'โลก', 'ดวงอาทิตย์', 'วิทยาศาสตร์']):
            sentence_type = "ข้อเท็จจริงทางวิทยาศาสตร์"
        elif any(word in thai_text for word in ['ทุกวัน', 'เสมอ', 'ประจำ', 'ทุกเช้า']):
            sentence_type = "กิจวัตรประจำวัน"
        elif any(word in thai_text for word in ['มักจะ', 'ชอบ', 'เคย', 'บ่อย']):
            sentence_type = "นิสัยส่วนบุคคล"
        elif any(word in thai_text for word in ['กำลัง', 'อยู่', 'ตอนนี้']):
            sentence_type = "การกระทำที่กำลังเกิดขึ้น"
        else:
            sentence_type = "ประโยคทั่วไป"

        # Extract signal words
        thai_signals = {
            'ทุกวัน': 'every day', 'เสมอ': 'always', 'ปกติ': 'usually',
            'บ่อยๆ': 'often', 'ตอนนี้': 'now', 'กำลัง': 'currently',
            'เมื่อวาน': 'yesterday', 'พรุ่งนี้': 'tomorrow'
        }

        found_signals = []
        for thai, eng in thai_signals.items():
            if thai in thai_text:
                found_signals.append(f"{thai} ({eng})")

        return {
            "sentence_type": sentence_type,
            "signal_words": ", ".join(found_signals) if found_signals else "ไม่มีคำสัญญาณชัดเจน"
        }

    def explain(self, thai_text, english_translation, max_tokens=600):
        """Generate detailed grammar explanation"""
        if self.expl_model is None:
            raise RuntimeError("Explainer not loaded")

        # Get tense predictions
        coarse_preds, fine_preds = self.predict_tense(english_translation)

        print("🔍 [Tense Prediction]")
        print(f"   Coarse: {coarse_preds[0][0]} ({coarse_preds[0][1]:.2%})")
        print(f"   Fine  : {fine_preds[0][0]} ({fine_preds[0][1]:.2%})")

        fine_label = fine_preds[0][0]
        confidence = fine_preds[0][1]
        fine_def = self.tense_defs.fine_definitions.get(fine_label, {})

        # Build context-aware prompt
        tag_context = f"""
Tense ที่ตรวจพบ: {fine_label}
ประเภท: {fine_def.get('tense', 'Unknown')} - {fine_def.get('thai_name', '')}
การใช้งาน: {fine_def.get('usage', '')}
โครงสร้าง: {fine_def.get('structure', '')}
คำสัญญาณ: {fine_def.get('keywords', '')}
ตัวอย่าง: {fine_def.get('example', '')}
"""

        prompt_body = f"""<context>
คุณคือระบบวิเคราะห์ไวยากรณ์ภาษาอังกฤษสำหรับผู้เรียนไทย
คุณมีความรู้ลึกซึ้งเกี่ยวกับระบบ tense ในภาษาอังกฤษและความแตกต่างกับภาษาไทย
</context>

<tense_knowledge>
{tag_context}
</tense_knowledge>

<task>
วิเคราะห์การแปลประโยคและอธิบายการใช้ tense ที่เลือกอย่างละเอียด
</task>

<input>
ประโยคภาษาไทย: {thai_text}
การแปลภาษาอังกฤษ: {english_translation}
Tense ที่ระบบตรวจพบ: {fine_label} (ความมั่นใจ: {confidence:.1%})
</input>

<requirements>
โปรดอธิบายโดยครอบคลุมประเด็นต่อไปนี้:

1. **วิเคราะห์ Tense ที่ใช้**:
   - อธิบาย tense ทางไวยากรณ์ที่ใช้ (เช่น Present Simple, Past Perfect)
   - อธิบายการใช้งานในบริบทนี้โดยเฉพาะ
   - โครงสร้างไวยากรณ์: {fine_def.get('structure', '')}

2. **คำศัพท์ที่น่าสนใจ**:
   - เลือกคำศัพท์ / วลี ภาษาอังกฤษที่น่าสนใจจากประโยคมาหนึ่งคำ / วลี และอธิบายว่าทำไมถึงเลือกใช้คำนั้นในการแปล

3. **ข้อผิดพลาดที่พบบ่อย**:
   - ผู้เรียนไทยมักใช้ tense ที่ใช้ในประโยคผิดอย่างไร
   - วิธีจำง่าย ๆ

</requirements>

<format>
- ใช้ภาษาไทยที่เข้าใจง่าย
- อธิบายเป็นขั้นตอน มีหัวข้อชัดเจน
- ยกตัวอย่างประกอบ
- เน้นสิ่งที่ผู้เรียนไทยควรระวัง
- **เริ่มต้นด้วยการวิเคราะห์ทันที**
- **ใช้รูปแบบวิชาการ ไม่ใช่รูปแบบสนทนา**
- ห้ามเขียนหัวข้ออื่น ๆ ที่ไม่อยู่ใน Requirement
</format>"""

        messages = [
            {
                "role": "system",
                "content": """คุณคือระบบวิเคราะห์ไวยากรณ์ภาษาอังกฤษสำหรับผู้เรียนไทย คุณให้คำอธิบายที่ตรงประเด็น กระชับ และเป็นวิชาการ

กฎสำคัญ:
- อธิบาย TENSE ทางไวยากรณ์ (Present Simple, Past Perfect ฯลฯ) ไม่ใช่รหัสจัดหมวดหมู่
- ไม่ใช้คำทักทาย คำลา หรือบทนำ
- เริ่มต้นด้วยการวิเคราะห์ทันที
- ใช้ภาษาวิชาการที่เข้าใจง่าย
- ตอบตรงประเด็นตามหัวข้อที่กำหนด"""
            },
            {"role": "user", "content": prompt_body}
        ]

        # Generate explanation
        full_prompt = self.expl_tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )

        toks = self.expl_tokenizer(full_prompt, return_tensors="pt").to(self.expl_model.device)

        start_time = time.time()
        with torch.inference_mode():
            with torch.backends.cuda.sdp_kernel(enable_flash=False, enable_math=True, enable_mem_efficient=True):
                output = self.expl_model.generate(
                    **toks,
                    max_new_tokens=max_tokens,
                    do_sample=True,
                    temperature=0.7,
                    top_p=0.9,
                    top_k=50,
                    repetition_penalty=1.1,
                    pad_token_id=self.expl_tokenizer.eos_token_id,
                    eos_token_id=self.expl_tokenizer.eos_token_id,
                    use_cache=True
                )
        duration = time.time() - start_time

        explanation = self.expl_tokenizer.decode(
            output[0][toks['input_ids'].shape[-1]:],
            skip_special_tokens=True
        ).strip()

        # Validate explanation quality
        if self.validate_explanation(explanation, fine_label):
            print("✅ Explanation validation passed")
        else:
            print("⚠️ Explanation may be incomplete")

        return explanation, duration

    def validate_explanation(self, explanation, fine_label):
        """Validate explanation contains required elements"""
        tense_name = self.tense_defs.fine_definitions.get(fine_label, {}).get('tense', '').lower()
        required_elements = [
            ("tense_mention", [tense_name]),
            ("structure", ["โครงสร้าง", "รูปประโยค", "subject", "verb"]),
            ("mistakes", ["ผิดพลาด", "ข้อผิด", "พลาด", "ระวัง"])
        ]

        text = explanation.lower()
        missing = []
        for category, keywords in required_elements:
            # Skip empty keywords: '' is a substring of every string and would always match
            if not any(keyword and keyword in text for keyword in keywords):
                missing.append(category)

        if missing:
            print(f"⚠️ Missing elements: {', '.join(missing)}")

        return len(missing) == 0

    def get_tense_info(self, fine_label):
        """Get detailed information about a tense tag"""
        if fine_label not in self.tense_defs.fine_definitions:
            return f"Unknown tense tag: {fine_label}"

        info = self.tense_defs.fine_definitions[fine_label]
        return f"""
Tense Tag: {fine_label}
Tense: {info.get('tense', 'Unknown')}
Thai Name: {info.get('thai_name', '')}
Usage: {info.get('usage', '')}
Structure: {info.get('structure', '')}
Keywords: {info.get('keywords', '')}
Example: {info.get('example', '')}
"""

    def full_pipeline(self, thai_text, verbose=True):
        """Execute complete pipeline: Translation → Classification → Explanation"""
        if verbose:
            print("="*60)
            print(f"📝 Input Thai: {thai_text}")
            print("="*60)

        # Step 1: Translation
        if verbose:
            print("\n🔄 Step 1: Translation")
        translation, trans_time = self.translate(thai_text)
        if not translation:
            raise ValueError("Translation failed")
        if verbose:
            print(f"   Result: {translation}")
            print(f"   Time: {trans_time:.2f}s")

        # Step 2: Tense Classification
        if verbose:
            print("\n🏷️ Step 2: Tense Classification")
        coarse_preds, fine_preds = self.predict_tense(translation)
        if verbose:
            print("   Top predictions:")
            for i, (label, conf) in enumerate(fine_preds[:3]):
                tense_info = self.tense_defs.fine_definitions.get(label, {})
                print(f"   {i+1}. {label} ({tense_info.get('thai_name', 'Unknown')}): {conf:.2%}")

        # Step 3: Grammar Explanation
        if verbose:
            print("\n💡 Step 3: Grammar Explanation")
        explanation, expl_time = self.explain(thai_text, translation)

        # Compile results
        result = {
            "thai_input": thai_text,
            "english_translation": translation,
            "tense_predictions": {
                "coarse": coarse_preds[0],
                "fine": fine_preds[0],
                "all_predictions": fine_preds
            },
            "tense_details": self.get_tense_info(fine_preds[0][0]),
            "explanation": explanation,
            "timing": {
                "translation": trans_time,
                "explanation": expl_time,
                "total": trans_time + expl_time
            }
        }

        if verbose:
            print(f"\n📊 Total processing time: {result['timing']['total']:.2f}s")
            print("\n" + "="*60)
            print("📖 TENSE DETAILS:")
            print(result['tense_details'])
            print("="*60)
            print("📖 EXPLANATION:")
            print("="*60)
            print(explanation)
            print("="*60)

        return result

print("✅ Complete Hybrid4BSystem implemented")

✅ Complete Hybrid4BSystem implemented
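
The translation cache in `translate` evicts whichever key was inserted first once `max_cache_size` is reached, so a sentence that is requested often can still be dropped. A least-recently-used policy is one possible alternative. The sketch below is an illustration only: `LRUTranslationCache` and its methods are hypothetical names that do not exist in the notebook, and the cached values mimic the `(translation, duration)` tuples that `translate` stores.

```python
from collections import OrderedDict


class LRUTranslationCache:
    """Hypothetical helper: bounded cache that evicts the least recently used entry."""

    def __init__(self, max_size=128):
        self.max_size = max_size
        self._store = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss; a hit marks the key as recent."""
        if key not in self._store:
            return None
        self._store.move_to_end(key)
        return self._store[key]

    def put(self, key, value):
        """Insert or refresh an entry, evicting the oldest one when over capacity."""
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.max_size:
            self._store.popitem(last=False)  # drop the least recently used key

    def __len__(self):
        return len(self._store)


# Usage with made-up values shaped like (translation, duration)
cache = LRUTranslationCache(max_size=2)
cache.put("a", ("A", 0.1))
cache.put("b", ("B", 0.2))
cache.get("a")              # "a" becomes the most recently used key
cache.put("c", ("C", 0.3))  # capacity exceeded, so "b" is evicted
```

With the first-in-first-out eviction used in the notebook, the same sequence would have dropped `"a"` instead of `"b"`.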

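
`predict_tense` converts each head's logits to probabilities with a softmax and keeps the `top_k` highest entries. The same arithmetic can be sketched in plain Python without PyTorch. The logits and the `id2coarse` mapping below are made up for illustration (only the `Present` label appears in the notebook output), and `top_k_labels` is a hypothetical helper name.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits, id2label, k=3):
    """Return up to k (label, probability) pairs, highest probability first."""
    probs = softmax(logits)
    k = min(k, len(probs))  # never ask for more entries than there are classes
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(id2label[i], probs[i]) for i in ranked]


# Illustrative mapping and logits, not taken from the trained classifier
id2coarse = {0: "Past", 1: "Present", 2: "Future"}
result = top_k_labels([0.5, 4.0, -1.0], id2coarse, k=5)
for label, prob in result:
    print(f"{label}: {prob:.2%}")
```

Capping `k` matters for the coarse head, which has only three classes: requesting more than three entries from a three-way distribution is not meaningful.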

In [None]:
if __name__ == "__main__":
    import pandas as pd
    import random
    from datetime import datetime

    # Initialize system
    system = Hybrid4BSystem()

    # Load all models
    system.load_translator()
    system.load_classifier()
    system.load_explainer()

    # Extended test sentences
    test_sentences = [
        "ฉันกินข้าวเช้าทุกวัน",  # HABIT
        "เราไปวิ่งตอนเช้าเป็นประจำ",  # HABIT
        "เขาตื่นสายเสมอ",  # HABIT
        "คุณพ่อของผมตื่นนอนเวลา 8 โมงเช้า",  # HABIT
        "น้ำเปล่าเดือดที่อุณหภูมิ 100 องศา",  # FACT
        "โลกหมุนรอบดวงอาทิตย์",  # FACT
        "ดวงอาทิตย์ขึ้นทิศตะวันออก",  # FACT
        "หนึ่งปีมีทั้งหมด 365 วัน",  # FACT
        "รถไฟออกตอน 9 โมง",  # SCHEDULEDFUTURE
        "ฉันวางแผนจะเรียนต่อต่างประเทศปีหน้า",  # SCHEDULEDFUTURE
        "คุณวางแผนอนาคตว่าอย่างไรบ้าง",  # SCHEDULEDFUTURE
        "เขาตั้งใจจะลาออกจากงาน",  # SCHEDULEDFUTURE
        "นายกเข้าให้กำลังใจประชาชนที่ประสบน้ำท่วม",  # HEADLINE
        "ประเทศไทย ประสบปัญหาขาดความคล่องตัวทางการเงิน",  # HEADLINE
        "สุนัขเป็นฮีโร่จากการเข้าช่วยเจ้าของ",  # HEADLINE
        "นักวิทยาศาสตร์ค้นพบแร่หายากใหม่",  # HEADLINE
        "น้ำขึ้นให้รีบตัก",  # SAYING
        "ความพยายามอยู่ที่ไหน ความสำเร็จอยู่ที่นั่น",  # SAYING
        "ช้าๆ ได้พร้าเล่มงาม",  # SAYING
        "ทุกปัญหามีทางออก",  # SAYING
        "เขากำลังอ่านหนังสืออยู่ ณ ตอนนี้",  # HAPPENING
        "ฝนกำลังตกตอนนี้",  # HAPPENING
        "คุณแม่กำลังล้างจานอยู่",  # HAPPENING
        "แฟนสาวของฉันกำลังซักผ้า",  # HAPPENING
        "ฉันกำลังทำโปรเจคเกี่ยวกับการศึกษาอยู่ช่วงนี้",  # NOWADAYS
        "ช่วงนี้พี่สาวฉันออกกำลังกาย",  # NOWADAYS
        "ฉันกำลังต่อเติมบ้านอยู่",  # NOWADAYS
        "บริษัทฉันกำลังปรับฐานเงินเดือนให้ทุกคน",  # NOWADAYS
        "ฉันกำลังจะไปสนามบิน",  # SUREFUT
        "เขากำลังจะสอบพรุ่งนี้",  # SUREFUT
        "เรากำลังจะเริ่มประชุมใน 5 นาที",  # SUREFUT
        "พวกเขากำลังจะย้ายบ้านสัปดาห์หน้า",  # SUREFUT
        "เศรษฐกิจกำลังดีขึ้นเรื่อยๆ",  # PROGRESS
        "ฉันกำลังปรับปรุงทักษะการเขียนของฉัน",  # PROGRESS
        "ทีมกำลังพัฒนาเทคโนโลยีใหม่",  # PROGRESS
        "สภาพอากาศกำลังอุ่นขึ้น",  # PROGRESS
        "ฉันเพิ่งทำการบ้านเสร็จ",  # JUSTFIN
        "เธอเพิ่งกลับมาถึงบ้าน",  # JUSTFIN
        "เราเพิ่งกินข้าวเสร็จ",  # JUSTFIN
        "เขาเพิ่งออกจากห้องประชุม",  # JUSTFIN
        "ฉันทำกุญแจหาย",  # RESULT
        "เขาซื้อของออนไลน์ไปแล้ว",  # RESULT
        "เราดูหนังเรื่องนี้แล้ว",  # RESULT
        "เธอลืมกระเป๋าไว้ที่บ้าน",  # RESULT
        "ฉันเคยไปญี่ปุ่นมาแล้ว",  # EXP
        "นี่เป็นครั้งแรกที่ฉันได้เจอหิมะ",  # EXP
        "คุณเคยเห็นปรากฏการณ์นี้ไหม",  # EXP
        "เขาไม่เคยกินทุเรียน",  # EXP
        "ฉันเรียนภาษาอังกฤษมา 3 ปีแล้ว",  # SINCEFOR
        "เขาทำงานที่นี่มาตั้งแต่ปี 2015",  # SINCEFOR
        "เรารอรถเมล์มาตั้งแต่บ่ายโมง",  # SINCEFOR
        "ฉันออกกำลังกายมาเป็นเวลา 2 ชั่วโมง",  # SINCEFOR
        "เมื่อวานฉันไปโรงเรียน",  # NORFIN
        "เขากินข้าวเที่ยงตอนบ่ายสอง",  # NORFIN
        "เราไปดูหนังเมื่อคืน",  # NORFIN
        "ฉันซื้อน้ำมาเมื่อเช้า",  # NORFIN
        "ฉันกำลังนอนตอนที่เธอโทรมา",  # INTERRUPT
        "เขากำลังอาบน้ำเมื่อไฟดับ",  # INTERRUPT
        "เรากำลังดูหนังตอนที่เพื่อนมาถึง",  # INTERRUPT
        "แม่กำลังทำกับข้าวตอนที่พ่อกลับบ้าน",  # INTERRUPT
        "เมื่อคืนตอนสองทุ่มฉันกำลังดูหนัง",  # DOINGATSOMETIMEPAST
        "ตอนนั้นเขากำลังวิ่งอยู่",  # DOINGATSOMETIMEPAST
        "เวลา 5 โมงเมื่อวาน ฉันกำลังทำกับข้าว",  # DOINGATSOMETIMEPAST
        "ตอนนั้นเรากำลังประชุมกันอยู่",  # DOINGATSOMETIMEPAST
        "ฉันกินข้าวก่อนออกจากบ้าน",  # BEFOREPAST
        "เธออ่านหนังสือก่อนนอน",  # BEFOREPAST
        "เราทำการบ้านเสร็จก่อนดูทีวี",  # BEFOREPAST
        "เขาล้างจานเสร็จก่อนที่แม่จะกลับมา",  # BEFOREPAST
        "ฉันรอเขาอยู่เป็นชั่วโมงก่อนที่เขาจะมาถึง",  # DURATION
        "เขาทำงานที่บริษัทนี้มา 10 ปีก่อนจะลาออก",  # DURATION
        "เรารอฝนหยุดตกอยู่ครึ่งชั่วโมง ถึงจะได้กลับบ้าน",  # DURATION
        "เธอทำกับข้าวอยู่นานก่อนที่แขกจะมา",  # DURATION
        "ฉันจะไปเที่ยวพรุ่งนี้",  # 50PERC
        "พรุ่งนี้ฉันจะไปหาหมอ",  # 50PERC
        "ฝนคงจะตกตอนเย็น",  # 50PERC
        "เขาอาจจะไม่มางานเลี้ยง",  # 50PERC
        "ฉันจะช่วยคุณทำการบ้าน",  # PROMISE
        "ผมสัญญาว่าจะไม่ลืม",  # PROMISE
        "ฉันจะซื้อของขวัญให้เธอ",  # PROMISE
        "ผมจะโทรหาคุณคืนนี้",  # PROMISE
        "มีคนกดกริ่ง ฉันจะไปเปิดประตู",  # RIGHTNOW
        "ฉันเหนื่อยแล้ว ฉันจะนอนละ",  # RIGHTNOW
        "ตกลง ฉันจะช่วยคุณ",  # RIGHTNOW
        "ได้เลย ฉันจะเริ่มเดี๋ยวนี้",  # RIGHTNOW
        "พรุ่งนี้ตอนบ่ายสอง ฉันจะกำลังประชุมอยู่",  # LONGFUTURE
        "เวลานี้สัปดาห์หน้า เราจะเดินทางอยู่",  # LONGFUTURE
        "ฉันจะกำลังเรียนอยู่ตอนนั้น",  # LONGFUTURE
        "เขาจะกำลังนอนตอนเราไปถึง",  # LONGFUTURE
        "ภายในสิ้นปีนี้ ฉันจะเก็บเงินได้ 1 แสนบาท",  # PREDICT
        "เขาจะทำวิจัยเสร็จก่อนกำหนด",  # PREDICT
        "พรุ่งนี้ตอนเช้า เราจะถึงเชียงใหม่",  # PREDICT
        "ฉันจะเขียนรายงานเสร็จก่อน 5 โมงเย็น",  # PREDICT
        "ภายในปีหน้า ฉันจะทำงานที่นี่ครบ 10 ปี",  # WILLCONTINUEINFUTURE
        "ตอนนั้นเขาจะฝึกซ้อมมาแล้ว 6 เดือน",  # WILLCONTINUEINFUTURE
        "เมื่อถึงเวลานั้น เราจะคบกันมา 3 ปี",  # WILLCONTINUEINFUTURE
        "ฉันจะเรียนหลักสูตรนี้มาแล้ว 1 ปีเต็ม ในอีกสองเดือน",  # WILLCONTINUEINFUTURE
    ]

    random.seed(42)
    random.shuffle(test_sentences)

    all_results = []

    for sentence in test_sentences:
        result = system.full_pipeline(sentence, verbose=True)

        all_results.append({
            "thai_input": result["thai_input"],
            "english_translation": result["english_translation"],
            "coarse_label": result["tense_predictions"]["coarse"][0],
            "fine_label": result["tense_predictions"]["fine"][0],
            "confidence": result["tense_predictions"]["fine"][1],
            "translation_time_sec": result["timing"]["translation"],
            "explanation_time_sec": result["timing"]["explanation"],
            "total_time_sec": result["timing"]["total"],
            "explanation": result["explanation"]
        })

        print("\n" + "="*80 + "\n")

    # Save CSV to Google Drive path
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    output_path = f"/content/drive/MyDrive/Dissertation/Model v.5/results/hybrid4b_results_{timestamp}.csv"
    df = pd.DataFrame(all_results)
    df.to_csv(output_path, index=False)
    print(f"✅ Logged results saved to Google Drive:\n{output_path}")
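
Each test sentence above carries its expected fine label only as a trailing comment, so the saved CSV has no column to score the predictions against. One way to make the run measurable is to keep the expected label next to each sentence and compute per-label accuracy afterwards. The sketch below assumes a hypothetical `per_label_accuracy` helper and uses made-up `(expected, predicted)` pairs; it does not reproduce any result from the notebook.

```python
from collections import defaultdict


def per_label_accuracy(rows):
    """Compute accuracy per expected label from (expected, predicted) pairs."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for expected, predicted in rows:
        totals[expected] += 1
        if expected == predicted:
            hits[expected] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Made-up pairs for illustration only, not output of the pipeline
rows = [
    ("HABIT", "HABIT"),
    ("HABIT", "FACT"),
    ("FACT", "FACT"),
    ("FACT", "FACT"),
]
acc = per_label_accuracy(rows)
overall = sum(expected == predicted for expected, predicted in rows) / len(rows)

for label, value in acc.items():
    print(f"{label}: {value:.0%}")
print(f"overall: {overall:.0%}")
```

In the notebook itself, the pairs could be built by storing `(sentence, expected_label)` tuples in `test_sentences` and comparing `expected_label` with `result["tense_predictions"]["fine"][0]` inside the loop.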

In [None]:
if __name__ == "__main__":
    import pandas as pd
    import random
    from datetime import datetime

    # Initialize system
    system = Hybrid4BSystem()

    # Load all models
    system.load_translator()
    system.load_classifier()
    system.load_explainer()

    # Extended test sentences
    test_sentences = [
        "ฉันกินข้าวเช้าทุกวัน",  # HABIT
        "เราไปวิ่งตอนเช้าเป็นประจำ",  # HABIT
        "เขาตื่นสายเสมอ",  # HABIT
        "คุณพ่อของผมตื่นนอนเวลา 8 โมงเช้า",  # HABIT
        "น้ำเปล่าเดือดที่อุณหภูมิ 100 องศา",  # FACT
        "โลกหมุนรอบดวงอาทิตย์",  # FACT
        "ดวงอาทิตย์ขึ้นทิศตะวันออก",  # FACT
        "หนึ่งปีมีทั้งหมด 365 วัน",  # FACT
        "รถไฟออกตอน 9 โมง",  # SCHEDULEDFUTURE
    ]

    random.seed(42)
    random.shuffle(test_sentences)

    all_results = []

    for sentence in test_sentences:
        result = system.full_pipeline(sentence, verbose=True)

        all_results.append({
            "thai_input": result["thai_input"],
            "english_translation": result["english_translation"],
            "coarse_label": result["tense_predictions"]["coarse"][0],
            "fine_label": result["tense_predictions"]["fine"][0],
            "confidence": result["tense_predictions"]["fine"][1],
            "translation_time_sec": result["timing"]["translation"],
            "explanation_time_sec": result["timing"]["explanation"],
            "total_time_sec": result["timing"]["total"],
            "explanation": result["explanation"]
        })

        print("\n" + "="*80 + "\n")

    # Save CSV to Google Drive path
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    output_path = f"/content/drive/MyDrive/Dissertation/Model v.5/results/hybrid4b_results_{timestamp}.csv"
    df = pd.DataFrame(all_results)
    df.to_csv(output_path, index=False)
    print(f"✅ Logged results saved to Google Drive:\n{output_path}")

🔌 Loading GGUF translator...


llama_context: n_ctx_per_seq (4096) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
llama_kv_cache_unified_iswa: using full-size SWA cache (ref: https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)
llama_kv_cache_unified: LLAMA_SET_ROWS=0, using old ggml_cpy() method for backwards compatibility
llama_kv_cache_unified: LLAMA_SET_ROWS=0, using old ggml_cpy() method for backwards compatibility


✅ Translator loaded successfully
🔌 Loading BERT classifier...
✅ Classifier loaded successfully
🔌 Loading explainer model...


The following generation flags are not valid and may be ignored: ['cache_implementation']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
The following generation flags are not valid and may be ignored: ['cache_implementation']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

✅ Explainer loaded successfully
📝 Input Thai: คุณพ่อของผมตื่นนอนเวลา 8 โมงเช้า

🔄 Step 1: Translation
   Result: My dad wakes up at 8 o'clock in the morning.
   Time: 1.82s

🏷️ Step 2: Tense Classification
   Top predictions:
   1. HABIT (Present Simple - กิจวัตร/นิสัย): 89.69%
   2. SCHEDULEDFUTURE (Present Simple - ตารางเวลา/แผนการที่กำหนดไว้): 9.24%
   3. FACT (Present Simple - ข้อเท็จจริง): 0.18%

💡 Step 3: Grammar Explanation
🔍 [Tense Prediction]
   Coarse: Present (99.89%)
   Fine  : HABIT (89.69%)


  self.gen = func(*args, **kwds)
W0713 14:53:10.146000 229 torch/_inductor/utils.py:1137] [0/0] Not enough SMs to use max_autotune_gemm mode


✅ Explanation validation passed

📊 Total processing time: 99.40s

📖 TENSE DETAILS:

Tense Tag: HABIT
Tense: Present Simple
Thai Name: Present Simple - กิจวัตร/นิสัย
Usage: ใช้เมื่อพูดถึงกิจวัตรหรือพฤติกรรมที่ทำเป็นประจำ
Structure: Subject + V1 (ถ้าประธานเอกพจน์เติม s/es)
Keywords: always, usually, often, sometimes, every day
Example: I drink coffee every morning.

📖 EXPLANATION:
**1. วิเคราะห์ Tense ที่ใช้:**

*   **Tense ที่ใช้:** ประโยคนี้ใช้ **Present Simple** (ปัจจุบันบุคลิก)
*   **การใช้งานในบริบทนี้:** Present Simple ถูกใช้เพื่อบรรยายกิจวัตรประจำวันของบุคคล ซึ่งในกรณีนี้คือ...

  self.gen = func(*args, **kwds)


✅ Explanation validation passed

📊 Total processing time: 17.62s

📖 TENSE DETAILS:

Tense Tag: FACT
Tense: Present Simple
Thai Name: Present Simple - ข้อเท็จจริง
Usage: ใช้กับข้อเท็จจริงที่เป็นสัจธรรมหรือเป็นความรู้ทางวิทยาศาสตร์
Structure: Subject + V1
Keywords: ข้อเท็จจริงทั่วไป, สัจธรรม
Example: The sun rises in the east.

📖 EXPLANATION:
**การวิเคราะห์ Tense ในประโยค “ดวงอาทิตย์ขึ้นทิศตะวันออก”**

1.  **วิเคราะห์ Tense ที่ใช้:**

    *   Tense ที่ใช้ในประโยค "The sun rises in the east." คือ **Present Simple**
    *   **‡∏Å‡∏≤‡∏£‡πÉ‡∏ä‡πâ‡∏á‡∏≤‡∏ô:** Present Simple ‡πÉ‡∏ä‡πâ‡πÄ‡∏û‡∏∑‡πà‡∏≠‡πÅ‡∏™‡∏î‡∏á‡∏Ç‡πâ‡∏≠‡πÄ‡∏ó‡πá‡∏à‡∏à‡∏£‡∏¥‡