# Basic Handwriting Recognition with Thulium

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/thulium-htr/Thulium/blob/main/examples/01_basic_recognition.ipynb)
[![PyPI](https://img.shields.io/pypi/v/thulium-htr)](https://pypi.org/project/thulium-htr/)

This notebook shows how to recognize handwritten text in images with Thulium.

**Topics covered:**
- Single image recognition with confidence scores
- Listing supported languages
- Model selection
- Multilingual recognition
- Batch processing

## Installation

Install Thulium from PyPI:

In [None]:
# Install Thulium (uncomment in Colab)
# !pip install thulium-htr -q

In [None]:
# Verify installation
import thulium
print(f"Thulium version: {thulium.__version__}")

## 1. Single Image Recognition

The simplest way to recognize text from an image:

In [None]:
from thulium import recognize_image

# Recognize text from an image (uncomment and point to your own file)
# result = recognize_image("your_image.png", language="en")
# print(f"Text: {result.text}")
# print(f"Confidence: {result.confidence:.2%}")
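
The confidence value can be used to route uncertain results to manual review. The following is a minimal sketch, assuming only that a result exposes `text` and a `confidence` fraction between 0 and 1, as the cell above suggests. `FakeResult`, `needs_review`, and the 0.85 threshold are hypothetical names and values chosen for illustration, not part of Thulium.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Stand-in for a recognition result (hypothetical, not Thulium's class)."""
    text: str
    confidence: float  # assumed to be a fraction in [0, 1]


def needs_review(result, threshold=0.85):
    """Return True when the confidence falls below the chosen threshold."""
    return result.confidence < threshold


results = [FakeResult("Dear Anna,", 0.97), FakeResult("se3 y0u s0on", 0.41)]
flagged = [r.text for r in results if needs_review(r)]
print(flagged)
```

A suitable threshold depends on your data, so it is worth checking a sample of flagged and unflagged results by hand before relying on it.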

## 2. Check Supported Languages

Thulium supports 52+ languages across 10 scripts:

In [None]:
from thulium.data.language_profiles import list_supported_languages, get_language_profile

# List all languages
languages = list_supported_languages()
print(f"Supported languages: {len(languages)}")
print(f"First 10: {languages[:10]}")

In [None]:
# Get details for a specific language
profile = get_language_profile("de")
print(f"Language: {profile.name}")
print(f"Script: {profile.script}")
print(f"Region: {profile.region}")
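
To see how the languages are distributed over scripts, the two calls above can be combined. This is a sketch under assumptions: `group_by_script` is a hypothetical helper, and the stand-in profiles below use made-up script names so the example runs without Thulium installed. With the library available you would pass `list_supported_languages()` and `get_language_profile` instead.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_script(codes, get_profile):
    """Map each script name to the sorted language codes that use it."""
    groups = defaultdict(list)
    for code in codes:
        groups[get_profile(code).script].append(code)
    return {script: sorted(langs) for script, langs in groups.items()}


# Stand-in lookup with illustrative values (not Thulium's real profiles).
fake_profiles = {
    "de": SimpleNamespace(script="Latin"),
    "fr": SimpleNamespace(script="Latin"),
    "ar": SimpleNamespace(script="Arabic"),
}
by_script = group_by_script(fake_profiles, fake_profiles.__getitem__)
print(by_script)
```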

## 3. Model Selection

Choose the right model for your use case:

| Model | Parameters | Best For |
|-------|-----------|----------|
| `thulium-tiny` | ~5M | Edge/Mobile |
| `thulium-base` | ~25M | Balanced |
| `thulium-large` | ~100M | Maximum accuracy |

In [None]:
from thulium import HTRPipeline

# Load a specific model
# pipeline = HTRPipeline.from_pretrained("thulium-base")
# `image` is a placeholder; supply your own input image
# result = pipeline.recognize(image)
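
If the choice is driven by a size budget, the table can be turned into a small lookup. This is only a sketch: `pick_model` is a hypothetical helper, and the parameter counts are the approximate figures from the table, not values queried from the library.

```python
# Approximate parameter counts (in millions) taken from the table above.
MODEL_PARAMS_M = {"thulium-tiny": 5, "thulium-base": 25, "thulium-large": 100}


def pick_model(max_params_m):
    """Return the largest model whose size fits within the budget."""
    fitting = [name for name, size in MODEL_PARAMS_M.items() if size <= max_params_m]
    if not fitting:
        raise ValueError(f"No model fits within {max_params_m}M parameters")
    return max(fitting, key=MODEL_PARAMS_M.get)


print(pick_model(30))  # thulium-base
```

The returned name could then be passed to `HTRPipeline.from_pretrained` as in the cell above.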

## 4. Multilingual Recognition

Recognize text in different languages by changing the `language` code passed to `recognize_image` (imported in section 1):

In [None]:
# German
# result_de = recognize_image("german.png", language="de")

# French
# result_fr = recognize_image("french.png", language="fr")

# Arabic (RTL)
# result_ar = recognize_image("arabic.png", language="ar")

# Japanese
# result_ja = recognize_image("japanese.png", language="ja")
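
When many files are involved, the per-language calls can be driven from a mapping of file name to language code. The sketch below assumes a recognizer that is called as `recognize(path, language=code)`, matching the commented calls above. `recognize_all` and `fake_recognize` are hypothetical, and the stand-in recognizer exists only so the example runs without Thulium or image files.

```python
def recognize_all(jobs, recognize):
    """Run `recognize(path, language=code)` for every (path, code) pair."""
    return {path: recognize(path, language=code) for path, code in jobs.items()}


def fake_recognize(path, language):
    """Stand-in recognizer that returns a placeholder string."""
    return f"<{language} text from {path}>"


jobs = {
    "german.png": "de",
    "french.png": "fr",
    "arabic.png": "ar",
    "japanese.png": "ja",
}
texts = recognize_all(jobs, fake_recognize)
print(texts["arabic.png"])
```

With the library installed, `recognize_image` would take the place of `fake_recognize`.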

## 5. Batch Processing

Process multiple images efficiently:

In [None]:
# Batch recognition (requires the `pipeline` created in section 3)
# images = ["page1.png", "page2.png", "page3.png"]
# results = pipeline.recognize_batch(images, batch_size=16)
# 
# for img, result in zip(images, results):
#     print(f"{img}: {result.text[:50]}...")
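
In practice the list of images usually comes from a folder rather than being typed out. The sketch below uses only the standard library: `find_images` and `chunked` are hypothetical helpers, and the extension list is an assumption about which formats you want to include. Grouping the paths is optional, since `recognize_batch` already takes a `batch_size`, but it can help when you want to report progress or save results between groups.

```python
from pathlib import Path


def find_images(folder, extensions=(".png", ".jpg", ".jpeg")):
    """Return sorted image paths (as strings) directly inside `folder`."""
    return sorted(
        str(p) for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


pages = [f"page{i}.png" for i in range(1, 6)]
groups = list(chunked(pages, 2))
print(groups)
```

Each group could then be passed to `pipeline.recognize_batch` in turn.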

## Next Steps

- [Benchmarking](02_benchmarking.ipynb) - Evaluate model performance
- [Error Analysis](03_error_analysis.ipynb) - Debug recognition errors
- [Model Zoo](../docs/models/model_zoo.md) - All available models
- [API Reference](../docs/api/reference.md) - Complete API documentation