<a href="https://colab.research.google.com/github/prabhatpathak77/Punjabi-character-set-recognition-SLM-/blob/main/Punjabi_Character_recognizer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
"""
Train a Punjabi character recognizer that works on BOTH
printed (synthetic) and handwritten characters.

What you get in ONE file:
- Character set (vowels + consonants + extended letters)
- Synthetic dataset generator (printed) using TrueType fonts
- Data directory setup & train/val split
- PyTorch Dataset/DataLoader with strong augmentations
- Compact CNN (ResNet18, trained from scratch) for the 47 classes in CHARSET
- Mixed‑precision training, early stopping, checkpointing
- Evaluation (accuracy + confusion matrix)
- Simple inference helper

USAGE (typical):
1) Put a few Punjabi fonts (TTF) inside:  assets/fonts/
   Examples: Raavi.ttf, AnmolUni.ttf, GurbaniAkhar.ttf (any fonts that support Punjabi)

2) (Optional but recommended) Add your OWN handwritten samples here:
   data/raw_handwritten/<label>/*.png  (label is the literal character, e.g. "ਕ")
   You can have multiple images per character. They’ll be merged with synthetic data.

3) Generate synthetic printed images + split data + train:
   Set `generate = True` and `train = True` below, then run the cell.

4) Evaluate and predict a single image:
   Set `eval = True` or `predict = "path/to/image.png"` below, then run the cell.

Folders that will be created:
- data/synth/{train,val}/<label>/xxx.png      (synthetic printed)
- data/handwritten/{train,val}/<label>/xxx.png (your raw handwritten split)
- data/final/{train,val}/<label>/xxx.png      (merged for training)
- checkpoints/best_model.pt                    (best weights)

Tested with PyTorch 2.x, torchvision, Pillow, scikit-learn, numpy.
"""

# -----------------------------
# SET YOUR ARGS HERE
# -----------------------------
generate = True # Set to True to generate synthetic data and split handwritten data
train = False   # Set to True to train the model
eval = False    # Set to True to evaluate the model on the validation set
predict = "/WhatsApp Image 2025-08-22 at 10.35.44 (1).jpeg"  # Path of a single image to classify, e.g. "data/final/val/ਕ/hand_00001.png"; set to None to skip prediction

# Data/Model paths
from pathlib import Path
fonts_dir = Path("assets/fonts")              # Folder with .ttf Punjabi fonts
raw_hand_dir = Path("data/raw_handwritten")   # Your handwritten source images by label
synth_out_dir = Path("data/synth")            # Where to place synthetic dataset
hand_out_dir = Path("data/handwritten")       # Where to split handwritten
final_out_dir = Path("data/final")            # Merged dataset root
checkpoint_path = Path("checkpoints/best_model.pt") # Path to save/load the best model

# Training parameters
images_per_char = 300
epochs = 20
batch_size = 128
lr = 1e-3
img_size = 96
patience = 5


# -----------------------------
# IMPORTS
# -----------------------------
import json
import math
import os
import random
import shutil
from collections import defaultdict

import numpy as np
from PIL import Image, ImageDraw, ImageFont, ImageOps, ImageFilter

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

from sklearn.metrics import confusion_matrix, classification_report


# -----------------------------
# 1) CHARACTER SET
# -----------------------------
# You can edit this list to match exactly what you want to classify.
CHARSET = [
    # Vowels
    "ਅ","ਆ","ਇ","ਈ","ਉ","ਊ","ਏ","ਐ","ਓ","ਔ",
    # Consonants
    "ਕ","ਖ","ਗ","ਘ","ਙ",
    "ਚ","ਛ","ਜ","ਝ","ਞ",
    "ਟ","ਠ","ਡ","ਢ","ਣ",
    "ਤ","ਥ","ਦ","ਧ","ਨ",
    "ਪ","ਫ","ਬ","ਭ","ਮ",
    # Semivowels & liquids
    "ਯ","ਰ","ਲ","ਵ",
    # Sibilants & h
    "ਸ਼","ਸ","ਹ",
    # Nukta/extended
    "ਖ਼","ਗ਼","ਜ਼","ਫ਼","ਲ਼"
]

NUM_CLASSES = len(CHARSET)
LABEL_TO_IDX = {ch: i for i, ch in enumerate(CHARSET)}
IDX_TO_LABEL = {i: ch for ch, i in LABEL_TO_IDX.items()}

# -----------------------------
# 2) UTILS
# -----------------------------

def seed_everything(seed: int = 42):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


def ensure_dir(p: Path):
    p.mkdir(parents=True, exist_ok=True)


# -----------------------------
# 3) SYNTHETIC DATA GENERATOR (PRINTED)
# -----------------------------

def _rand_affine(img: Image.Image):
    """Apply a small random affine perturbation (scale, shear, translation).

    Pillow's Image.transform(size, Image.AFFINE, (a, b, c, d, e, f)) takes the
    value of each output pixel (x, y) from the input position
    (a*x + b*y + c, d*x + e*y + f), so the identity is (1, 0, 0, 0, 1, 0).
    The coefficients are sampled close to the identity: a and e change the
    scale by up to 3%, b and d add a slight shear/rotation-like skew, and
    c and f shift the image by up to 4% of its width/height.
    """
    a = 1.0 + random.uniform(-0.03, 0.03)
    b = random.uniform(-0.05, 0.05)
    c = random.uniform(-img.size[0] * 0.04, img.size[0] * 0.04)
    d = random.uniform(-0.05, 0.05)
    e = 1.0 + random.uniform(-0.03, 0.03)
    f = random.uniform(-img.size[1] * 0.04, img.size[1] * 0.04)

    return img.transform(
        img.size,
        Image.AFFINE,
        (a, b, c, d, e, f),
        fillcolor=255, # Fill with white
        resample=random.choice([Image.BICUBIC, Image.BILINEAR])
    )


def generate_synthetic_dataset(
    fonts_dir: Path,
    out_root: Path,
    images_per_char: int = 300,
    canvas_size: int = 96,
):
    """Render each class (character) using multiple fonts, sizes, and jitters."""
    ensure_dir(out_root / "train")
    ensure_dir(out_root / "val")

    # Recursively search for TTF files
    font_paths = [p for p in fonts_dir.glob("**/*.ttf")]
    if not font_paths:
        print(f"No TTF fonts found in {fonts_dir} or its subdirectories. Please add Gurmukhi fonts (*.ttf). Skipping synthetic data generation.")
        return

    for ch in CHARSET:
        for split in ["train", "val"]:
            ensure_dir(out_root / split / ch)

        for i in range(images_per_char):
            font_path = random.choice(font_paths)
            # random font size tuned for 96x96 canvas
            fsize = random.randint(int(canvas_size * 0.55), int(canvas_size * 0.85))
            font = ImageFont.truetype(str(font_path), fsize)

            img = Image.new("L", (canvas_size, canvas_size), color=255)
            draw = ImageDraw.Draw(img)
            bbox = draw.textbbox((0, 0), ch, font=font)
            tw, th = bbox[2] - bbox[0], bbox[3] - bbox[1]
            x = (canvas_size - tw) // 2 - bbox[0]
            # in Gurmukhi, the headline (top bar) can get clipped; nudge down a bit
            y = (canvas_size - th) // 2 - bbox[1] + random.randint(-2, 4)
            draw.text((x, y), ch, font=font, fill=0)

            # Augmentations specific to printed text
            if random.random() < 0.8:
                img = _rand_affine(img)
            if random.random() < 0.25:
                img = img.filter(ImageFilter.GaussianBlur(radius=random.uniform(0.2, 0.8)))
            if random.random() < 0.25:
                # stretch contrast to the full 0-255 range for robustness
                img = ImageOps.autocontrast(img)

            # Normalize canvas back to fixed size after affine
            img = ImageOps.contain(img, (canvas_size, canvas_size), Image.BICUBIC)
            final = Image.new("L", (canvas_size, canvas_size), 255)
            ox = (canvas_size - img.size[0]) // 2
            oy = (canvas_size - img.size[1]) // 2
            final.paste(img, (ox, oy))

            split = "val" if i % 10 == 0 else "train"  # ~10% val
            fname = f"{ch}_{i:05d}.png"
            final.save(out_root / split / ch / fname)

    print(f"✅ Synthetic dataset generated at: {out_root}")


# -----------------------------
# 4) MERGE HANDWRITTEN + SYNTHETIC & SPLIT
# -----------------------------

def split_handwritten(raw_src: Path, out_root: Path, val_ratio: float = 0.1):
    """Split your own handwritten images into train/val under the same label folders.
    Assumes raw_src/<label>/*.png or *.jpg
    """
    for ch in CHARSET:
        src = raw_src / ch
        if not src.exists():
            continue
        imgs = [p for p in src.glob("*.*") if p.suffix.lower() in {".png", ".jpg", ".jpeg", ".webp", ".bmp"}]
        if not imgs:
            continue
        random.shuffle(imgs)
        n_val = max(1, int(len(imgs) * val_ratio))
        val_imgs = imgs[:n_val]
        train_imgs = imgs[n_val:]
        for split, subset in [("train", train_imgs), ("val", val_imgs)]:
            dst = out_root / split / ch
            ensure_dir(dst)
            for i, p in enumerate(subset):
                shutil.copy2(p, dst / f"hand_{i:05d}{p.suffix.lower()}")
    print(f"📦 Handwritten split into: {out_root}")


def merge_datasets(synth_root: Path, hand_root: Path, final_root: Path):
    # Copy both into final/<split>/<label>/
    for split in ["train", "val"]:
        for ch in CHARSET:
            dst = final_root / split / ch
            ensure_dir(dst)
            for src_root in [synth_root, hand_root]:
                src = src_root / split / ch
                if src.exists():
                    for p in src.glob("*.*"):
                        shutil.copy2(p, dst / p.name)
    print(f"🔗 Merged datasets into: {final_root}")


# -----------------------------
# 5) DATASET & TRANSFORMS
# -----------------------------

def build_dataloaders(final_root: Path, batch_size: int = 128, img_size: int = 96):
    # Handwritten often has stroke width variance and background; keep strong augs on train
    train_tf = transforms.Compose([
        transforms.Grayscale(),
        transforms.Resize((img_size, img_size)),  # handwritten images arrive in arbitrary sizes
        transforms.RandomApply([transforms.GaussianBlur(3)], p=0.15),
        transforms.RandomAffine(degrees=8, translate=(0.05, 0.05), shear=6, fill=255),  # pad with white, not black
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,)),
    ])

    val_tf = transforms.Compose([
        transforms.Grayscale(),
        transforms.Resize((img_size, img_size)),
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,)),
    ])

    train_ds = datasets.ImageFolder(str(final_root / "train"), transform=train_tf)
    val_ds = datasets.ImageFolder(str(final_root / "val"), transform=val_tf)

    # Sanity check on class order -> save mapping
    class_to_idx = train_ds.class_to_idx
    with open("class_mapping.json", "w", encoding="utf-8") as f:
        json.dump(class_to_idx, f, ensure_ascii=False, indent=2)

    train_loader = DataLoader(train_ds, batch_size=batch_size, shuffle=True, num_workers=4, pin_memory=True)
    val_loader = DataLoader(val_ds, batch_size=batch_size, shuffle=False, num_workers=4, pin_memory=True)
    return train_loader, val_loader, class_to_idx


# -----------------------------
# 6) MODEL
# -----------------------------

def build_model(num_classes: int = NUM_CLASSES):
    model = models.resnet18(weights=None)  # keep it light; from scratch is fine here
    # Change first conv to accept 1 channel
    model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model


# -----------------------------
# 7) TRAINING LOOP
# -----------------------------

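# NOTE: `def train` rebinds the module-level name `train`, which would discard the
# boolean `train` flag set at the top of this cell and make `if train:` in the
# main block always true. Capture the flag first and honor it inside the function.
TRAIN_REQUESTED = bool(train)


def train(
    final_root: Path,
    epochs: int = 20,
    batch_size: int = 128,
    lr: float = 1e-3,
    img_size: int = 96,
    patience: int = 5,
    device: str | None = None,
    checkpoint_path: Path = Path("checkpoints/best_model.pt"),
):
    if not TRAIN_REQUESTED:
        print("Skipping training because the `train` flag is False.")
        return
    device = device or ("cuda" if torch.cuda.is_available() else "cpu")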
    train_loader, val_loader, class_to_idx = build_dataloaders(final_root, batch_size, img_size)

    model = build_model(len(class_to_idx)).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.AdamW(model.parameters(), lr=lr)
    scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

    best_val = 0.0
    epochs_no_improve = 0
    ensure_dir(checkpoint_path.parent)  # create the folder the checkpoint is actually written to

    for epoch in range(1, epochs + 1):
        model.train()
        running_loss = 0.0
        correct = 0
        total = 0

        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad(set_to_none=True)
            with torch.cuda.amp.autocast(enabled=(device == "cuda")):
                outputs = model(images)
                loss = criterion(outputs, labels)
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()

            running_loss += loss.item() * images.size(0)
            _, preds = outputs.max(1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)

        train_loss = running_loss / total
        train_acc = correct / total

        # validation
        model.eval()
        v_correct, v_total = 0, 0
        v_loss_total = 0.0
        with torch.no_grad():
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                loss = criterion(outputs, labels)
                v_loss_total += loss.item() * images.size(0)
                _, preds = outputs.max(1)
                v_correct += (preds == labels).sum().item()
                v_total += labels.size(0)
        val_loss = v_loss_total / v_total
        val_acc = v_correct / v_total

        print(f"Epoch {epoch:02d} | train loss {train_loss:.4f} acc {train_acc:.4f} | val loss {val_loss:.4f} acc {val_acc:.4f}")

        # early stopping on val_acc
        if val_acc > best_val:
            best_val = val_acc
            epochs_no_improve = 0
            torch.save({
                "model_state": model.state_dict(),
                "class_to_idx": class_to_idx,
            }, checkpoint_path)
            print(f"  ↳ ✅ Saved new best (val acc={val_acc:.4f})")
        else:
            epochs_no_improve += 1
            if epochs_no_improve >= patience:
                print("  ↳ Early stopping")
                break

    print(f"Best val acc: {best_val:.4f}")


# -----------------------------
# 8) EVALUATION
# -----------------------------

def evaluate(final_root: Path, checkpoint_path: Path = Path("checkpoints/best_model.pt")):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    _, val_loader, _ = build_dataloaders(final_root, batch_size=256)

    if not checkpoint_path.exists():
        print(f"Error: Checkpoint not found at {checkpoint_path}. Please train the model first.")
        return

    ckpt = torch.load(checkpoint_path, map_location=device)
    class_to_idx = ckpt["class_to_idx"]
    idx_to_class = {v: k for k, v in class_to_idx.items()}

    model = build_model(len(class_to_idx))
    model.load_state_dict(ckpt["model_state"])
    model.to(device)
    model.eval()

    y_true, y_pred = [], []

    with torch.no_grad():
        for images, labels in val_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, preds = outputs.max(1)
            y_true.extend(labels.cpu().numpy().tolist())
            y_pred.extend(preds.cpu().numpy().tolist())

    acc = (np.array(y_true) == np.array(y_pred)).mean()
    print(f"Validation accuracy: {acc:.4f}")

    cm = confusion_matrix(y_true, y_pred, labels=list(range(len(idx_to_class))))
    print("Confusion matrix (rows=true, cols=pred):")
    np.set_printoptions(linewidth=160)
    print(cm)

    target_names = [idx_to_class[i] for i in range(len(idx_to_class))]
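    print(classification_report(
        y_true, y_pred,
        labels=list(range(len(idx_to_class))),  # keep the report aligned even if a class is missing from val
        target_names=target_names,
        zero_division=0,
    ))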


# -----------------------------
# 9) INFERENCE
# -----------------------------
def predict_image(img_path: Path, checkpoint_path: Path = Path("checkpoints/best_model.pt")):
    device = "cuda" if torch.cuda.is_available() else "cpu"

    if not checkpoint_path.exists():
        print(f"❌ Checkpoint not found at {checkpoint_path}. Train the model first.")
        return None

    # Load checkpoint
    ckpt = torch.load(checkpoint_path, map_location=device)
    class_to_idx = ckpt["class_to_idx"]
    idx_to_class = {v: k for k, v in class_to_idx.items()}

    # Build and load model
    model = build_model(len(class_to_idx))
    model.load_state_dict(ckpt["model_state"])
    model.to(device)
    model.eval()

    # Transform for single image (same as val)
    tf = transforms.Compose([
        transforms.Grayscale(),
        transforms.Resize((img_size, img_size)),  # match the training resolution (module-level img_size)
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,))
    ])

    img = Image.open(img_path).convert("RGB")
    img = tf(img).unsqueeze(0).to(device)

    with torch.no_grad():
        outputs = model(img)
        probs = torch.softmax(outputs, dim=1)
        pred_idx = probs.argmax(dim=1).item()
        pred_label = idx_to_class[pred_idx]
        confidence = probs[0, pred_idx].item()

    print(f"🖼️ Prediction: '{pred_label}' (confidence={confidence:.2f})")
    return pred_label, confidence



# -----------------------------
# 10) MAIN EXECUTION BLOCK
# -----------------------------

seed_everything(123)

if generate:
    generate_synthetic_dataset(fonts_dir, synth_out_dir, images_per_char=images_per_char)
    # also split any handwritten you already have
    if raw_hand_dir.exists():
        split_handwritten(raw_hand_dir, hand_out_dir)
    merge_datasets(synth_out_dir, hand_out_dir, final_out_dir)

if train:
    if not (final_out_dir / "train").exists():
        print("Merged training data not found. Please set `generate = True` first or manually prepare data/final/...")
    else:
        train(final_out_dir, epochs=epochs, batch_size=batch_size, lr=lr, patience=patience, checkpoint_path=checkpoint_path)

if eval:
    evaluate(final_out_dir, checkpoint_path=checkpoint_path)

if predict:  # skipped when predict is None or an empty string
    predict_image(Path(predict), checkpoint_path=checkpoint_path)


In [None]:
# Install a compatible version of PyTorch with CUDA support
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Looking in indexes: https://download.pytorch.org/whl/cu121


In [None]:
from pathlib import Path

# Define the paths for the directories
assets_dir = Path("assets")
fonts_dir = assets_dir / "fonts"
raw_handwritten_dir = Path("data/raw_handwritten")
synth_out_dir = Path("data/synth")
handwritten_out_dir = Path("data/handwritten")
final_out_dir = Path("data/final")
checkpoints_dir = Path("checkpoints")


# Create the directories if they don't exist
assets_dir.mkdir(exist_ok=True)
fonts_dir.mkdir(exist_ok=True)
raw_handwritten_dir.mkdir(parents=True, exist_ok=True)
synth_out_dir.mkdir(parents=True, exist_ok=True)
handwritten_out_dir.mkdir(parents=True, exist_ok=True)
final_out_dir.mkdir(parents=True, exist_ok=True)
checkpoints_dir.mkdir(exist_ok=True)

print(f"Created directories: {assets_dir}, {fonts_dir}, {raw_handwritten_dir}, {synth_out_dir}, {handwritten_out_dir}, {final_out_dir}, {checkpoints_dir}")


Created directories: assets, assets/fonts, data/raw_handwritten, data/synth, data/handwritten, data/final, checkpoints


In [None]:
!wget https://releases.pagure.org/lohit/lohit-gurmukhi-ttf-2.91.2.tar.gz
!tar -xvzf lohit-gurmukhi-ttf-2.91.2.tar.gz -C assets/fonts/

--2025-08-22 08:20:05--  https://releases.pagure.org/lohit/lohit-gurmukhi-ttf-2.91.2.tar.gz
Resolving releases.pagure.org (releases.pagure.org)... 8.43.85.76, 2620:52:3:1:dead:beef:cafe:fed8
Connecting to releases.pagure.org (releases.pagure.org)|8.43.85.76|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 18397 (18K) [application/x-gzip]
Saving to: ‘lohit-gurmukhi-ttf-2.91.2.tar.gz’


2025-08-22 08:20:06 (189 KB/s) - ‘lohit-gurmukhi-ttf-2.91.2.tar.gz’ saved [18397/18397]

lohit-gurmukhi-ttf-2.91.2/
lohit-gurmukhi-ttf-2.91.2/io.pagure.lohit.gurmukhi.font.metainfo.xml
lohit-gurmukhi-ttf-2.91.2/COPYRIGHT
lohit-gurmukhi-ttf-2.91.2/Lohit-Gurmukhi.ttf
lohit-gurmukhi-ttf-2.91.2/ChangeLog
lohit-gurmukhi-ttf-2.91.2/AUTHORS
lohit-gurmukhi-ttf-2.91.2/66-lohit-gurmukhi.conf
lohit-gurmukhi-ttf-2.91.2/README
lohit-gurmukhi-ttf-2.91.2/OFL.txt


In [None]:
!mv assets/fonts/lohit-gurmukhi-ttf-2.91.2/Lohit-Gurmukhi.ttf assets/fonts/


In [None]:
%pip install scikit-learn



# Task
Update the code to allow image uploads from Google Drive for prediction.

## Update imports

### Subtask:
Add necessary imports for Google Drive integration.


**Reasoning**:
Add the necessary import statements for Google Drive integration to the main script.



**Reasoning**:
The module `google.colab.vcheck` is not found. Remove the import statement `from google.colab.vcheck import SkipTest`.



## Authenticate Google Drive

### Subtask:
Add a cell to authenticate and mount Google Drive.


**Reasoning**:
Add a cell to authenticate and mount Google Drive as requested by the subtask instructions.



**Reasoning**:
The previous attempt to mount Google Drive failed due to a credential propagation error. This often indicates a temporary issue with the Colab environment's ability to authenticate. Retrying the mount command is the standard procedure to resolve this.
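
A minimal sketch of such a mount cell with a retry, assuming the notebook runs in Colab. The helper name `mount_drive` and its `retries` parameter are hypothetical and not part of the original notebook; outside Colab the `google.colab` import fails and the helper returns `False` instead of raising.

```python
import time
from pathlib import Path

DRIVE_MOUNT_POINT = Path("/content/drive")  # mount point used in this notebook


def mount_drive(mount_point: Path = DRIVE_MOUNT_POINT, retries: int = 2) -> bool:
    """Try to mount Google Drive; return True on success, False otherwise."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        print("google.colab is not available; skipping the Drive mount.")
        return False

    for attempt in range(1, retries + 1):
        try:
            # Force a remount on later attempts in case the first one left a stale mount.
            drive.mount(str(mount_point), force_remount=(attempt > 1))
            return True
        except Exception as exc:  # e.g. a transient credential propagation error
            print(f"Mount attempt {attempt} failed: {exc}")
            time.sleep(2)
    return False


mounted = mount_drive()
print(f"Drive mounted: {mounted}")
```

Returning a boolean lets later cells decide whether paths under `/content/drive` can be used at all.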



## Modify prediction function

### Subtask:
Update the `predict_image` function to handle Google Drive file IDs and download the image.


**Reasoning**:
Modify the predict_image function to handle Google Drive file paths and update the prediction message.



**Reasoning**:
The KeyError indicates an issue with mapping the predicted index to a character label. The `idx_to_class` dictionary seems to be empty or incorrectly populated. This might be due to how `class_to_idx` is loaded and then inverted. I will regenerate the `predict_image` function, ensuring the `idx_to_class` dictionary is correctly created.
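
The index-to-label lookup can be made to fail with a clearer message. The sketch below assumes the checkpoint stores `class_to_idx` as a `{label: index}` dictionary, as the training code does; `invert_class_mapping` and `label_for` are hypothetical helper names, and the three-class mapping is toy data.

```python
def invert_class_mapping(class_to_idx: dict) -> dict:
    """Build {index: label} from {label: index}, rejecting empty or ambiguous mappings."""
    if not class_to_idx:
        raise ValueError("class_to_idx is empty; was it saved in the checkpoint?")
    idx_to_class = {int(idx): label for label, idx in class_to_idx.items()}
    if len(idx_to_class) != len(class_to_idx):
        raise ValueError("class_to_idx maps several labels to the same index")
    return idx_to_class


def label_for(pred_idx: int, idx_to_class: dict) -> str:
    """Look up a predicted index and report the valid range when it is unknown."""
    if pred_idx not in idx_to_class:
        known = sorted(idx_to_class)
        raise KeyError(f"index {pred_idx} not in mapping (known: {known[0]}..{known[-1]})")
    return idx_to_class[pred_idx]


# Toy mapping standing in for ckpt["class_to_idx"]
class_to_idx = {"ਅ": 0, "ਕ": 1, "ਖ": 2}
idx_to_class = invert_class_mapping(class_to_idx)
print(label_for(1, idx_to_class))
```

Validating the mapping once, right after loading the checkpoint, surfaces a missing or mismatched mapping before the model is run.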



## Update execution block

### Subtask:
Modify the main execution block to accept a Google Drive file ID for prediction.


**Reasoning**:
Modify the main execution block to pass the predict variable to the predict_image function when prediction is active.



**Reasoning**:
Correct the NameError by changing `final_out_out_dir` to `final_out_dir` in the train function call within the main execution block.
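
One way to make the execution block less prone to name mix-ups is to keep the run flags in a single object, so they cannot collide with function names such as `train`. This is only a sketch: `RunConfig`, `run_pipeline`, the stub callables, and the path `samples/ka.png` are hypothetical stand-ins for the notebook's real functions and data.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List, Optional


@dataclass
class RunConfig:
    generate: bool = False
    train: bool = False
    evaluate: bool = False
    predict: Optional[str] = None  # image path, or None to skip prediction


def run_pipeline(cfg: RunConfig, steps: Dict[str, Callable]) -> List[str]:
    """Run the enabled steps in a fixed order and return their names."""
    executed = []
    for name in ("generate", "train", "evaluate"):
        if getattr(cfg, name):
            steps[name]()
            executed.append(name)
    if cfg.predict:
        steps["predict"](Path(cfg.predict))
        executed.append("predict")
    return executed


# Stubs that only record the call; the notebook would pass its real functions here.
calls = []
stub_steps = {
    "generate": lambda: calls.append("generate"),
    "train": lambda: calls.append("train"),
    "evaluate": lambda: calls.append("evaluate"),
    "predict": lambda path: calls.append(f"predict:{path.name}"),
}

cfg = RunConfig(generate=True, predict="samples/ka.png")
executed = run_pipeline(cfg, stub_steps)
print(executed)
```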



## Summary

### Data Analysis Key Findings

*   The necessary import `from google.colab import drive` was successfully added to the script for Google Drive integration.
*   An attempt to import `google.colab.vcheck` failed with a `ModuleNotFoundError`, indicating it was not a valid or accessible module.
*   Google Drive authentication and mounting were successfully performed at `/content/drive` after an initial transient error.
*   The `predict_image` function was updated to handle image paths that could potentially reside in the mounted Google Drive (`/content/drive/My Drive/...`).
*   A `KeyError` occurred during prediction in an intermediate step, which was resolved by correctly populating the `idx_to_class` mapping.
*   The main execution block was modified to pass the specified prediction image path (a local path or a path under the mounted Drive) to the updated `predict_image` function.

### Insights or Next Steps

*   The current implementation relies on Google Drive being mounted to `/content/drive`. For robustness, consider adding a check within `predict_image` to confirm if `/content/drive` is mounted before attempting to access paths within it, or provide clearer instructions to the user about mounting.
*   While the code now handles Google Drive *paths*, the original request mentioned handling Google Drive *file IDs*. If predicting directly from file IDs without mounting is a requirement, the `predict_image` function would need further modification to use the Google Drive API or `gdown` to download the file locally based on the ID before processing.
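
Both next steps could be combined in a small resolver that runs before `predict_image`. The sketch below rests on several assumptions: Drive is mounted at `/content/drive`, a bare token of 20 or more URL-safe characters is treated as a Drive file ID (a heuristic, not an official format), and the third-party `gdown` package is installed when IDs are used. `looks_like_file_id`, `resolve_image`, and the `downloads` folder are hypothetical names.

```python
import os
import re
from pathlib import Path

DRIVE_ROOT = Path("/content/drive")
# Heuristic only: a long token of URL-safe characters with no path separators or dots.
FILE_ID_RE = re.compile(r"^[A-Za-z0-9_-]{20,}$")


def looks_like_file_id(source: str) -> bool:
    """Guess whether `source` is a Drive file ID rather than a file path."""
    return bool(FILE_ID_RE.match(source)) and not Path(source).exists()


def resolve_image(source: str, download_dir: Path = Path("downloads")) -> Path:
    """Return a local path for `source`, downloading it first if it is a file ID."""
    if looks_like_file_id(source):
        import gdown  # third-party: pip install gdown

        download_dir.mkdir(parents=True, exist_ok=True)
        target = download_dir / source
        result = gdown.download(id=source, output=str(target), quiet=True)
        if result is None:
            raise RuntimeError(f"Could not download Drive file {source}")
        return Path(result)

    path = Path(source)
    if DRIVE_ROOT in path.parents and not os.path.ismount(DRIVE_ROOT):
        raise RuntimeError(f"{DRIVE_ROOT} is not mounted; mount Google Drive first.")
    if not path.exists():
        raise FileNotFoundError(f"Image not found: {path}")
    return path


print(looks_like_file_id("data/final/val/sample.png"))
```

A call such as `predict_image(resolve_image(predict), checkpoint_path=checkpoint_path)` would then accept local paths, Drive paths, and file IDs alike.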
