# 🔄 Resume Training: English-Vietnamese Model

Resume training from an existing checkpoint.

**Prerequisites:**
- Checkpoint file (`best_model.pt` or `checkpoint_epoch_X.pt`)
- Tokenizer files (`tokenizer_en.model`, `tokenizer_vi.model`)
- Processed data (`train.pt`, `val.pt`)

---

## 1. ⚙️ Setup

In [None]:
!git clone https://github.com/TranKien2005/EV_Translate_Modle_NLP_Project.git
%cd EV_Translate_Modle_NLP_Project

In [None]:
!pip install -q datasets sentencepiece sacrebleu google-generativeai python-dotenv tqdm tensorboard pyyaml

In [None]:
GEMINI_API_KEY = "YOUR_GEMINI_API_KEY_HERE"
HF_TOKEN = "YOUR_HF_TOKEN_HERE"

with open('.env', 'w') as f:
    f.write(f'GEMINI_API_KEY={GEMINI_API_KEY}\n')
    f.write(f'HF_TOKEN={HF_TOKEN}\n')
print('✓ .env created')

## 2. 🔧 Configure Paths

**Enter the paths to your files on Kaggle directly:**

In [None]:
# ============================================================
# 📁 ENTER THE PATHS TO YOUR FILES ON KAGGLE
# ============================================================

# Path to the checkpoint to resume from (either best_model.pt or checkpoint_epoch_X.pt)
CHECKPOINT_PATH = "/kaggle/input/YOUR_DATASET/best_model.pt"  # CHANGE THIS

# Paths to the tokenizers
TOKENIZER_SRC_PATH = "/kaggle/input/YOUR_DATASET/tokenizer_en.model"  # CHANGE THIS
TOKENIZER_TGT_PATH = "/kaggle/input/YOUR_DATASET/tokenizer_vi.model"  # CHANGE THIS

# Paths to the processed data
TRAIN_DATA_PATH = "/kaggle/input/YOUR_DATASET/train.pt"  # CHANGE THIS
VAL_DATA_PATH = "/kaggle/input/YOUR_DATASET/val.pt"  # CHANGE THIS

# Output directories (where new checkpoints and logs are written)
OUTPUT_CHECKPOINT_DIR = "/kaggle/working/checkpoints"
OUTPUT_LOG_DIR = "/kaggle/working/logs"

# ============================================================
print("📁 Input Paths:")
print(f"  📦 Checkpoint: {CHECKPOINT_PATH}")
print(f"  📝 Tokenizer EN: {TOKENIZER_SRC_PATH}")
print(f"  📝 Tokenizer VI: {TOKENIZER_TGT_PATH}")
print(f"  📊 Train data: {TRAIN_DATA_PATH}")
print(f"  📊 Val data: {VAL_DATA_PATH}")
print("\n📁 Output Paths:")
print(f"  💾 Checkpoints: {OUTPUT_CHECKPOINT_DIR}")
print(f"  📈 Logs: {OUTPUT_LOG_DIR}")

In [None]:
# Verify input files exist
import os

files_to_check = [
    ("Checkpoint", CHECKPOINT_PATH),
    ("Tokenizer EN", TOKENIZER_SRC_PATH),
    ("Tokenizer VI", TOKENIZER_TGT_PATH),
    ("Train data", TRAIN_DATA_PATH),
    ("Val data", VAL_DATA_PATH),
]

all_exist = True
for name, path in files_to_check:
    exists = os.path.exists(path)
    status = "✓" if exists else "✗"
    print(f"{status} {name}: {path}")
    if not exists:
        all_exist = False

if all_exist:
    print("\n✅ All input files found!")
else:
    print("\n❌ Some files not found. Please check the paths above.")
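
The check above only prints a warning, so later cells can still run against bad paths. A minimal sketch of a fail-fast variant follows; `find_missing` and `require_files` are hypothetical helper names, not part of the project.

```python
import os

def find_missing(files):
    """Return the (name, path) pairs whose path does not exist on disk."""
    return [(name, path) for name, path in files if not os.path.exists(path)]

def require_files(files):
    """Raise FileNotFoundError naming every missing input file."""
    missing = find_missing(files)
    if missing:
        details = ", ".join(f"{name} ({path})" for name, path in missing)
        raise FileNotFoundError(f"Missing input files: {details}")
```

Calling `require_files(files_to_check)` at the end of the verification cell would stop the notebook at this step instead of failing later inside the trainer.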

## 3. 📂 Setup Directories & Symlinks

In [None]:
import os
import shutil

# Create output directories
os.makedirs(OUTPUT_CHECKPOINT_DIR, exist_ok=True)
os.makedirs(f"{OUTPUT_CHECKPOINT_DIR}/tokenizers", exist_ok=True)
os.makedirs(OUTPUT_LOG_DIR, exist_ok=True)
os.makedirs("/kaggle/working/data/processed", exist_ok=True)

# Copy tokenizers to output dir (needed for saving)
shutil.copy(TOKENIZER_SRC_PATH, f"{OUTPUT_CHECKPOINT_DIR}/tokenizers/tokenizer_en.model")
shutil.copy(TOKENIZER_TGT_PATH, f"{OUTPUT_CHECKPOINT_DIR}/tokenizers/tokenizer_vi.model")

# Create symlinks for data (avoid copying large files)
for src, name in [(TRAIN_DATA_PATH, "train.pt"), (VAL_DATA_PATH, "val.pt")]:
    dst = f"/kaggle/working/data/processed/{name}"
    if os.path.islink(dst):
        # Replace a stale or broken link left by an earlier run;
        # os.path.exists() alone returns False for a broken link
        os.remove(dst)
    if not os.path.exists(dst):
        os.symlink(src, dst)

print("✓ Directories created")
print("✓ Tokenizers copied")
print("✓ Data symlinks created")

In [None]:
# Update config with paths
import yaml

CONFIG_FILE = 'config/config.yaml'

with open(CONFIG_FILE, 'r') as f:
    cfg = yaml.safe_load(f)

cfg['paths'] = {
    'data_dir': '/kaggle/working/data',
    'checkpoint_dir': OUTPUT_CHECKPOINT_DIR,
    'log_dir': OUTPUT_LOG_DIR
}
cfg['data']['source'] = 'processed'

with open(CONFIG_FILE, 'w') as f:
    yaml.dump(cfg, f, default_flow_style=False, allow_unicode=True)

print('✓ Config updated')
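
The cell above replaces `cfg['paths']` wholesale and assumes `cfg['data']` already exists. If `config.yaml` held other keys under `paths`, they would be dropped. The sketch below shows a recursive merge that keeps unrelated keys. `merge_overrides` is a hypothetical helper, and the `cache_dir` key is invented purely for illustration; the real contents of `config.yaml` are not shown in this notebook.

```python
import copy

def merge_overrides(base, overrides):
    """Return a copy of `base` with `overrides` merged in recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key instead of replacing
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

# Hypothetical config content, for illustration only
example_cfg = {"paths": {"data_dir": "data", "cache_dir": "cache"}}
overrides = {
    "paths": {"data_dir": "/kaggle/working/data"},
    "data": {"source": "processed"},
}
updated = merge_overrides(example_cfg, overrides)
```

Here `updated["paths"]` keeps `cache_dir` while `data_dir` is overridden, and the missing `data` section is created rather than raising `KeyError`.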

## 4. 🔍 Verify Checkpoint

In [None]:
import sys
sys.path.insert(0, '.')

import torch

# Load and inspect checkpoint
checkpoint = torch.load(CHECKPOINT_PATH, map_location='cpu')

print("📦 Checkpoint Info:")
print(f"  Epoch: {checkpoint.get('epoch', 'N/A')}")
print(f"  Val Loss: {checkpoint['val_loss']:.4f}" if 'val_loss' in checkpoint else "  Val Loss: N/A")
print(f"  Best Val Loss: {checkpoint['best_val_loss']:.4f}" if 'best_val_loss' in checkpoint else "  Best Val Loss: N/A")
print(f"  Has scheduler state: {'scheduler_state_dict' in checkpoint}")
print(f"  Keys: {list(checkpoint.keys())}")
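
The `:.4f` format in the prints above fails if a key is present but holds `None` or a non-numeric value. A small sketch of a more tolerant formatter follows; `format_metric` is a hypothetical helper, and which keys the checkpoint actually stores depends on the project's trainer.

```python
def format_metric(checkpoint, key, digits=4):
    """Format checkpoint[key] as fixed-point text, or 'N/A' if missing or non-numeric."""
    value = checkpoint.get(key)
    if value is None:
        return "N/A"
    try:
        # float() also accepts values that define __float__, such as a scalar tensor
        return f"{float(value):.{digits}f}"
    except (TypeError, ValueError):
        return "N/A"
```

For example, `print(f"  Val Loss: {format_metric(checkpoint, 'val_loss')}")` covers both the present and the missing case in one line.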

## 5. 🏋️ Resume Training

In [None]:
from src.train import Trainer
from src.config import load_config

config = load_config('config/config.yaml')
trainer = Trainer(config_path='config/config.yaml')
trainer.setup()

print("\n" + "="*50)
print("🔄 Resuming EN → VI Training")
print("="*50)

# Resume from the checkpoint specified above
trainer.train(resume_from=CHECKPOINT_PATH)
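
If a session is interrupted, it can be convenient to resume from the newest per-epoch checkpoint rather than a hard-coded path. The sketch below assumes files are named `checkpoint_epoch_<N>.pt` with an integer `N`, as the Prerequisites suggest. The trainer's actual naming scheme is not confirmed here, and `latest_epoch_checkpoint` is a hypothetical helper.

```python
import os
import re

# Assumed naming scheme: checkpoint_epoch_<integer>.pt
EPOCH_PATTERN = re.compile(r"^checkpoint_epoch_(\d+)\.pt$")

def latest_epoch_checkpoint(directory):
    """Return the path of the highest-numbered epoch checkpoint, or None if there is none."""
    best_epoch, best_path = -1, None
    for name in os.listdir(directory):
        match = EPOCH_PATTERN.match(name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch = int(match.group(1))
            best_path = os.path.join(directory, name)
    return best_path
```

The epoch number is compared as an integer, so `checkpoint_epoch_10.pt` ranks above `checkpoint_epoch_2.pt`, which a plain string sort would get wrong. A call such as `latest_epoch_checkpoint(OUTPUT_CHECKPOINT_DIR) or CHECKPOINT_PATH` would fall back to the configured checkpoint when no epoch file exists yet.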