# TranslateGemma - Document Translator

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/jimmyliao/trans-gemma/blob/main/document-translator-colab.ipynb)

Translate PDFs and images to Traditional Chinese (zh-TW) using Google's TranslateGemma model.

**Features:**
- 📄 Download from arXiv automatically
- 🖼️ Upload images or PDFs
- 🚀 Fast GPU inference on Colab (T4)
- 🇹🇼 Force Traditional Chinese output (configurable)

**Single Source of Truth:** This notebook runs the same code as the [trans-gemma repository](https://github.com/jimmyliao/trans-gemma).

---

## 👤 About the Author

**Jimmy Liao** - AI GDE (Google Developer Expert), CTO/Co-Founder of an AI startup

Focused on the smart manufacturing and finance sectors, Jimmy turns the technical challenges of AI advancement into competitive advantages while improving client value and operational efficiency.

- 🐦 Twitter: [@jimmyliao](https://twitter.com/jimmyliao)
- 💼 LinkedIn: [jimmyliao](https://linkedin.com/in/jimmyliao)
- 📝 Blog: [memo.jimmyliao.net](https://memo.jimmyliao.net)
- 🔗 Sessionize: [jimmy-liao](https://sessionize.com/jimmy-liao/)

---

**Disclaimer:** This notebook is provided for educational and research purposes. The author is not affiliated with Google's TranslateGemma team. Use at your own discretion.

## 1️⃣ Setup: Clone Repository

In [None]:
# Clone the repository (single source of truth)
!git clone https://github.com/jimmyliao/trans-gemma.git
%cd trans-gemma

## 2️⃣ Install Dependencies

In [None]:
# Install uv (fast Python package manager)
!pip install uv -q

# Install project dependencies
!uv pip install --system -e ".[examples]"

## 2.5️⃣ HuggingFace Authentication

**IMPORTANT:** TranslateGemma is a gated model. You need to:
1. Get a HuggingFace token from [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
2. Accept model access at [https://huggingface.co/google/translategemma-4b-it](https://huggingface.co/google/translategemma-4b-it)
3. Add your token to Colab Secrets (🔑 icon on the left sidebar, name: `HF_TOKEN`)
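
The lookup order described above (a stored secret first, then manual entry) can be sketched as a small helper. This is only an illustration: `resolve_hf_token` and its parameters are hypothetical names and are not part of the trans-gemma repository or `huggingface_hub`.

```python
from typing import Callable, Mapping, Optional


def resolve_hf_token(
    secrets: Mapping[str, Optional[str]],
    prompt: Optional[Callable[[], str]] = None,
    name: str = "HF_TOKEN",
) -> str:
    """Return the first non-empty token from `secrets`, else from `prompt`.

    Hypothetical helper: `secrets` is any mapping (for example os.environ),
    and `prompt` is a zero-argument callable used as a manual fallback.
    """
    token = (secrets.get(name) or "").strip()
    if token:
        return token
    if prompt is not None:
        token = prompt().strip()
        if token:
            return token
    raise ValueError(f"No {name} found; add it to Colab Secrets or enter it manually")
```

For example, `resolve_hf_token(os.environ, prompt=getpass.getpass)` would read the environment variable if it is set and otherwise ask for the token without echoing it.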

In [None]:
from huggingface_hub import login
import os

try:
    # Try to load from Colab secrets (recommended)
    from google.colab import userdata
    HF_TOKEN = userdata.get('HF_TOKEN')
    os.environ['HF_TOKEN'] = HF_TOKEN
    login(token=HF_TOKEN)
    print("✅ Authenticated with HuggingFace using Colab secrets")
except Exception as e:
    print("⚠️  Could not load HF_TOKEN from Colab secrets")
    print("Please add HF_TOKEN to Colab Secrets (🔑 icon on left sidebar)")
    print(f"Error: {e}")

    # Fallback: manual token entry (getpass keeps the token out of the cell output)
    from getpass import getpass
    print("\n📝 Or enter your token manually:")
    HF_TOKEN = getpass("HuggingFace Token: ")
    os.environ['HF_TOKEN'] = HF_TOKEN
    login(token=HF_TOKEN)
    print("✅ Authenticated with HuggingFace")

## 3️⃣ Configuration

In [None]:
import os

# Target language (default: Traditional Chinese)
TARGET_LANG = "zh-TW"  # Change to "zh-CN", "ja", "ko", etc. if needed

# Backend (transformers is best for Colab GPU)
BACKEND = "transformers"

print(f"✅ Target language: {TARGET_LANG}")
print(f"✅ Backend: {BACKEND}")

## 4️⃣ Option A: Download from arXiv

Automatically download and translate arXiv papers.
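
For orientation, a new-style arXiv ID (with or without a version suffix) maps directly onto a PDF URL under `https://arxiv.org/pdf/`. The sketch below shows that mapping; `arxiv_pdf_url` is a hypothetical helper, and the repository's `translate.py` may resolve IDs differently.

```python
import re

# New-style arXiv IDs look like "2601.09012", optionally followed by "v2".
_ARXIV_ID = re.compile(r"^(\d{4}\.\d{4,5})(v\d+)?$")


def arxiv_pdf_url(arxiv_id: str) -> str:
    """Build the PDF URL for a new-style arXiv ID (hypothetical helper)."""
    match = _ARXIV_ID.match(arxiv_id.strip())
    if match is None:
        raise ValueError(f"Not a new-style arXiv ID: {arxiv_id!r}")
    base, version = match.group(1), match.group(2) or ""
    # Without a version suffix, arXiv serves the latest version of the paper.
    return f"https://arxiv.org/pdf/{base}{version}"
```

Validating the ID before running the translation gives an immediate error for typos instead of a failed download later.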

In [None]:
# Enter arXiv ID (e.g., "2601.09012v2" or "2601.09012")
ARXIV_ID = "2601.09012v2"  # TranslateGemma technical report

# Translate specific pages (1-indexed)
START_PAGE = 1
END_PAGE = 1  # Set to None for all pages

# Build command
cmd = f"python examples/translate.py --mode pdf --arxiv {ARXIV_ID} --backend {BACKEND} --target {TARGET_LANG}"
if START_PAGE:
    cmd += f" --start-page {START_PAGE}"
if END_PAGE:
    cmd += f" --end-page {END_PAGE}"

print(f"Running: {cmd}\n")
!{cmd}

## 4️⃣ Option B: Upload PDF

Upload your own PDF file.
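
Uploaded filenames often contain spaces or parentheses (for example `paper (1).pdf`), which a shell would split or misparse if they were interpolated into a command string unquoted. One way to guard against that is to build the command as an argument list and let `shlex.join` do the quoting. This is a minimal sketch using only the flags that appear in this notebook; `build_translate_cmd` is a hypothetical helper, not part of the repository.

```python
import shlex
from typing import Optional


def build_translate_cmd(
    pdf_file: str,
    backend: str,
    target: str,
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
) -> str:
    """Return a shell-safe command line for examples/translate.py (sketch)."""
    args = [
        "python", "examples/translate.py",
        "--mode", "pdf",
        "--file", pdf_file,
        "--backend", backend,
        "--target", target,
    ]
    # Page flags are optional; None means "do not pass the flag".
    if start_page is not None:
        args += ["--start-page", str(start_page)]
    if end_page is not None:
        args += ["--end-page", str(end_page)]
    # shlex.join quotes every argument that needs it (Python 3.8+).
    return shlex.join(args)
```

The returned string can be run in a notebook cell with `!{cmd}`, in the same way as the other commands in this notebook.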

In [None]:
from google.colab import files
import shlex

# Upload PDF
print("📤 Please upload your PDF file:")
uploaded = files.upload()

# Get uploaded filename
pdf_file = list(uploaded.keys())[0]
print(f"\n✅ Uploaded: {pdf_file}")

# Translate
START_PAGE = 1
END_PAGE = 3  # Change as needed

# Quote the filename so spaces or parentheses survive the shell
cmd = f"python examples/translate.py --mode pdf --file {shlex.quote(pdf_file)} --backend {BACKEND} --target {TARGET_LANG}"
if START_PAGE:
    cmd += f" --start-page {START_PAGE}"
if END_PAGE:
    cmd += f" --end-page {END_PAGE}"

print(f"\nRunning: {cmd}\n")
!{cmd}

## 4️⃣ Option C: PDF with Image Mode (Multimodal)

Use multimodal TranslateGemma to preserve visual context (tables, charts).

In [None]:
# For arXiv
ARXIV_ID = "2601.09012v2"
START_PAGE = 1
END_PAGE = 1
DPI = 96  # Lower DPI = faster (72, 96, or 150)

cmd = f"python examples/translate.py --mode pdf --arxiv {ARXIV_ID} --backend {BACKEND} --target {TARGET_LANG} --pdf-as-image --dpi {DPI}"
if START_PAGE:
    cmd += f" --start-page {START_PAGE}"
if END_PAGE:
    cmd += f" --end-page {END_PAGE}"

print(f"Running: {cmd}\n")
!{cmd}

## 5️⃣ Single Image Translation

Upload and translate a single image.

In [None]:
from google.colab import files
from PIL import Image
import sys
sys.path.insert(0, 'examples')

# Upload image
print("📤 Please upload your image:")
uploaded = files.upload()
image_file = list(uploaded.keys())[0]

# Load backend
from backends import TransformersMultimodalBackend

print("\n🔄 Loading multimodal backend...")
backend = TransformersMultimodalBackend()
backend.load_model()

# Translate
print(f"\n🔄 Translating {image_file}...")
result = backend.translate_image(image_file, source_lang="en", target_lang=TARGET_LANG)

# Display result
print("\n✅ Translation:")
print(result['translation'])
print(f"\n⏱️  Time: {result['time']:.2f}s, Speed: {result['metadata']['tokens_per_second']:.1f} tok/s")

## 📝 Notes

- **Backend**: `transformers` is the best choice for Colab GPU (T4)
- **Target Language**: Default is `zh-TW` (Traditional Chinese); change it in the Configuration section
- **Image Mode**: Slower, but preserves visual context (tables, charts, layout)
- **DPI**: Lower DPI (72-96) is faster; higher DPI (150) gives better quality
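
To make the DPI trade-off concrete: a rendered page has `inches × DPI` pixels per side, so the total pixel count grows with the square of the DPI. The sketch below assumes a US Letter page (8.5 × 11 in) and simple linear scaling. How the repository actually rasterizes pages is not stated here, so treat the numbers as illustrative; `page_pixels` is a hypothetical helper.

```python
from typing import Tuple


def page_pixels(width_in: float, height_in: float, dpi: int) -> Tuple[int, int]:
    """Pixel dimensions of a page rendered at `dpi` (assumes linear scaling)."""
    return round(width_in * dpi), round(height_in * dpi)


# Assumed page size: US Letter, 8.5 x 11 inches.
for dpi in (72, 96, 150):
    width, height = page_pixels(8.5, 11.0, dpi)
    print(f"{dpi} DPI: {width} x {height} px ({width * height:,} pixels)")
```

Under these assumptions a page is 612 × 792 px at 72 DPI, 816 × 1056 px at 96 DPI, and 1275 × 1650 px at 150 DPI, so 150 DPI produces roughly 4.3 times as many pixels as 72 DPI.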

## 🔗 Links

- [GitHub Repository](https://github.com/jimmyliao/trans-gemma)
- [TranslateGemma Model](https://huggingface.co/google/translategemma-4b-it)
- [Documentation](https://github.com/jimmyliao/trans-gemma/blob/main/examples/README.md)