# Kurdish Letter OCR - Model Testing Notebook

## Overview
This notebook loads the trained Kurdish letter OCR model and tests it on new images.

## Contents
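1. **Setup** - Load dependencies, configuration, and the trained model
2. **Prediction Functions** - Single-image classification and formatted results with confidence scores
3. **Usage Examples** - Test one image or several images at once
4. **Live Camera GUI** - Capture an ROI from the camera and classify it
5. **Troubleshooting & Tips** - Common problems and known assumptions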

## About this notebook

This notebook demonstrates inference for a trained Kurdish Letter OCR model and provides a live camera GUI to capture a Region of Interest (ROI), binarize it, and classify it.

What you can do here:
- Load the model and verify the environment
- Predict single images from disk
- Use a threaded live camera preview with ROI, capture, and classification

Quick start:
1) Run the cells from top to bottom, up to and including the "Interactive GUI" cell
2) In the GUI, click Start, align the letter in the highlighted ROI, then click Capture ROI
3) The predicted Kurdish letter (rendered large), its confidence, and the per-class probabilities appear below the preview

Notes:
- Input size is 64×64 grayscale
- Only the predicted letter is rendered large; other labels (Predicted, Confidence) are normal-sized
- Predicted classes are displayed as Kurdish letters, mapped from the training folder names via `CLASS_DISPLAY_MAP`


In [32]:
import os
import cv2
import numpy as np
import torch
import torch.nn as nn
from sklearn.preprocessing import LabelEncoder
import glob

print("Libraries loaded successfully!")

Libraries loaded successfully!


## Configuration & Model Definition

Set up parameters and define the CNN model architecture.

In [33]:
# Configuration parameters (must match training parameters)
IMG_SIZE = 64
CHANNELS = 1
MODEL_PATH = "kurdish_letter_model_pytorch.pth"

# Device setup
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

# Categories (must match training)
CATEGORIES = [
    {"folder": "EEE_letters", "prefix": "eee_letter_"},
    {"folder": "LLL_letters", "prefix": "lll_letter_"},
    {"folder": "OOO_letters", "prefix": "ooo_letter_"},
    {"folder": "RRR_letters", "prefix": "rrr_letter_"},
    {"folder": "VVV_letters", "prefix": "vvv_letter_"}
]

# Display mapping from folder/class names to Kurdish letters
CLASS_DISPLAY_MAP = {
    "EEE_letters": "ێ",
    "LLL_letters": "ڵ",
    "OOO_letters": "ۆ",
    "RRR_letters": "ڕ",
    "VVV_letters": "ڤ",
}

EXTENSIONS = [".png", ".jpg", ".jpeg", ".bmp"]

class KurdishCNN(nn.Module):
    """
    Convolutional Neural Network for Kurdish letter classification.
    
    Architecture:
    - 3 convolutional layers with ReLU activation
    - MaxPooling after each convolution
    - 2 fully connected layers with dropout
    - Output: 5 classes (Kurdish letters)
    """
    def __init__(self, num_classes):
        super(KurdishCNN, self).__init__()

        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

        # Calculate flattened features
        with torch.no_grad():
            dummy_input = torch.zeros(1, 1, IMG_SIZE, IMG_SIZE)
            x = self.pool(self.relu(self.conv1(dummy_input)))
            x = self.pool(self.relu(self.conv2(x)))
            x = self.pool(self.relu(self.conv3(x)))
            num_flat_features = x.numel() // x.shape[0]

        # Fully connected layers
        self.fc1 = nn.Linear(num_flat_features, 64)
        self.dropout = nn.Dropout(0.5)
        self.fc2 = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        x = self.pool(self.relu(self.conv3(x)))
        x = x.view(x.size(0), -1)
        x = self.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)
        return x


Using device: cpu
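
The dummy forward pass in `KurdishCNN.__init__` determines the input size of `fc1`. The same number can be checked by hand: each 3×3 convolution with `padding=1` preserves the spatial size, and each 2×2 max-pool with stride 2 halves it. The sketch below redoes that arithmetic in plain Python; the helper names `conv_out`, `pool_out`, and `flat_features` are illustrative and not part of the notebook.

```python
def conv_out(size, kernel=3, padding=1, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of max-pooling along one dimension (no padding)."""
    return (size - kernel) // stride + 1


def flat_features(img_size=64, conv_channels=(32, 64, 64)):
    """Number of features entering fc1 after three conv + pool stages."""
    size = img_size
    for _ in conv_channels:
        size = pool_out(conv_out(size))
    # Channels of the last conv layer times the remaining height and width
    return conv_channels[-1] * size * size


# 64 -> 32 -> 16 -> 8 per side, so 64 channels * 8 * 8 = 4096 features
print(flat_features(64))
```

If `IMG_SIZE` changes, the dummy pass in the model adapts automatically, which is why the notebook computes the value instead of hard-coding it.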


## Model Loading

Load the trained model weights and prepare for inference.

In [34]:
# Initialize label encoder with training categories
le = LabelEncoder()
le.fit(np.array([cat["folder"] for cat in CATEGORIES]))

print(f"Classes available: {le.classes_}\n")

# Load the model
model = KurdishCNN(num_classes=len(CATEGORIES)).to(device)

# Load saved weights
if os.path.exists(MODEL_PATH):
    model.load_state_dict(torch.load(MODEL_PATH, map_location=device))
    print(f"✓ Model loaded from: {MODEL_PATH}")
else:
    print(f"✗ Model file not found: {MODEL_PATH}")
    print("Please train the model first using Kurdish_Letter_OCR_CNN.ipynb")

# Set model to evaluation mode (disables dropout; this model has no batch-norm layers)
model.eval()
print("✓ Model set to evaluation mode\n")


Classes available: ['EEE_letters' 'LLL_letters' 'OOO_letters' 'RRR_letters' 'VVV_letters']

✓ Model loaded from: kurdish_letter_model_pytorch.pth
✓ Model set to evaluation mode
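
`LabelEncoder.fit` stores the class names in sorted order, so the index returned by `argmax` refers to the alphabetically sorted folder names rather than to the order of `CATEGORIES`. Here the two orders coincide. The plain-Python sketch below mirrors that index-to-letter lookup without scikit-learn; `index_to_letter` is a hypothetical helper name.

```python
CLASS_DISPLAY_MAP = {
    "EEE_letters": "ێ",
    "LLL_letters": "ڵ",
    "OOO_letters": "ۆ",
    "RRR_letters": "ڕ",
    "VVV_letters": "ڤ",
}

folders = ["EEE_letters", "LLL_letters", "OOO_letters", "RRR_letters", "VVV_letters"]

# sorted() gives the same ordering LabelEncoder.classes_ has for these strings
classes = sorted(folders)


def index_to_letter(idx):
    """Map a predicted class index to the Kurdish letter shown in the UI."""
    raw_class = classes[idx]
    # Fall back to the raw folder name if no display letter is registered
    return CLASS_DISPLAY_MAP.get(raw_class, raw_class)


for i, name in enumerate(classes):
    print(i, name, index_to_letter(i))
```

The encoder at inference time must be fitted on the same folder names as during training; otherwise the indices would point at the wrong letters.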



## Prediction Functions

Functions to classify a single image and print the results.

In [35]:
def predict_single_image(image_path):
    """
    Predict the class of a single image.
    
    Parameters:
    -----------
    image_path : str
        Path to the image file
    
    Returns:
    --------
    tuple
        (predicted_class_display, confidence, probabilities)
        - predicted_class_display: str - The predicted Kurdish letter (mapped)
        - confidence: float - Confidence score (0-100)
        - probabilities: np.array - Confidence for each class
    """
    try:
        # Read image in grayscale
        img_array = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        
        if img_array is None:
            print(f"✗ Error: Could not load image from {image_path}")
            return None, None, None
        
        # Resize to model input size
        img_resized = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
        
        # Normalize (0-1 range)
        img_normalized = img_resized / 255.0
        
        # Convert to tensor (batch_size=1, channels=1)
        img_tensor = torch.tensor(img_normalized, dtype=torch.float32).unsqueeze(0).unsqueeze(0)
        
        # Move to device
        img_tensor = img_tensor.to(device)
        
        # Forward pass (no gradient computation)
        with torch.no_grad():
            output = model(img_tensor)
            probabilities = torch.softmax(output, dim=1)[0].cpu().numpy()
            predicted_idx = torch.argmax(output, dim=1).item()
            confidence = probabilities[predicted_idx] * 100
        
        # Get class name and map to display
        raw_class = le.classes_[predicted_idx]
        predicted_class_display = CLASS_DISPLAY_MAP.get(raw_class, raw_class)
        
        return predicted_class_display, confidence, probabilities
    
    except Exception as e:
        print(f"✗ Error processing image: {str(e)}")
        return None, None, None


def display_prediction(image_path, predicted_class, confidence, probabilities):
    """
    Display prediction results in a formatted table.
    
    Parameters:
    -----------
    image_path : str
        Path to the image
    predicted_class : str
        Predicted class display name (mapped)
    confidence : float
        Confidence score
    probabilities : np.array
        Probabilities for all classes
    """
    print(f"\n{'='*70}")
    print(f"File: {os.path.basename(image_path)}")
    print(f"{'='*70}")
    print(f"Predicted Class: {predicted_class}")
    print(f"Confidence: {confidence:.2f}%\n")
    
    print("Class Probabilities:")
    print(f"{'Class':<10} {'Probability':<15} {'Confidence %':<15}")
    print("-" * 50)
    
    for idx, raw_class in enumerate(le.classes_):
        class_name = CLASS_DISPLAY_MAP.get(raw_class, raw_class)
        prob = probabilities[idx]
        percent = prob * 100
        bar_length = int(percent / 5)
        bar = "█" * bar_length + "░" * (20 - bar_length)
        print(f"{class_name:<10} {prob:<15.4f} {bar} {percent:6.2f}%")
    
    print(f"{'='*70}\n")
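
The usage examples in the next section call `test_multiple_images`, whose definition does not appear in this notebook. The following is a minimal reconstruction, assuming it accepts either `folder_path` or `file_list` (the two keyword arguments used in the examples) and delegates to `predict_single_image` and `display_prediction` defined above. The helper `collect_image_paths` is an assumption for illustration, not the original implementation.

```python
import os

EXTENSIONS = [".png", ".jpg", ".jpeg", ".bmp"]  # same list as in the configuration cell


def collect_image_paths(folder_path=None, file_list=None, extensions=EXTENSIONS):
    """Return a sorted, de-duplicated list of image paths (hypothetical helper)."""
    paths = list(file_list or [])
    if folder_path is not None:
        for name in os.listdir(folder_path):
            # Compare extensions case-insensitively so ".JPG" is accepted too
            if os.path.splitext(name)[1].lower() in extensions:
                paths.append(os.path.join(folder_path, name))
    return sorted(set(paths))


def test_multiple_images(folder_path=None, file_list=None):
    """Classify several images and print a result table for each one."""
    paths = collect_image_paths(folder_path, file_list)
    print(f"TESTING {len(paths)} IMAGE(S)\n")
    for path in paths:
        # predict_single_image and display_prediction come from the cell above
        predicted, confidence, probabilities = predict_single_image(path)
        if predicted is None:
            continue  # unreadable image; the error was already printed
        display_prediction(path, predicted, confidence, probabilities)
```

Run this cell before the usage examples so that the name is defined when they call it.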


## Usage Examples

Use the functions to test images.

In [36]:
# Example 1: Test a single image
# Uncomment and modify the path to test a specific image
# image_path = "path/to/your/image.jpg"
# predicted_class, confidence, probabilities = predict_single_image(image_path)
# if predicted_class:
#     display_prediction(image_path, predicted_class, confidence, probabilities)


# Example 2: Test all JPG files in current directory
print("Testing JPG files in current directory...\n")
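# De-duplicate: on case-insensitive platforms such as Windows, "*.jpg" and "*.JPG" can match the same file
jpg_files = sorted(set(glob.glob("*.jpg") + glob.glob("*.JPG") + glob.glob("*.jpeg") + glob.glob("*.JPEG")))

if jpg_files:
    test_multiple_images(file_list=jpg_files)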
else:
    print("No JPG files found in current directory")


# Example 3: Test images in a specific folder
# Uncomment to test images in a folder
# test_multiple_images(folder_path="EEE_letters")


# Example 4: Test specific image files
# Uncomment to test specific files
# test_multiple_images(file_list=["image1.jpg", "image2.png", "image3.jpg"])


Testing JPG files in current directory...


TESTING 1 IMAGE(S)


File: IMG_1836.jpg
Predicted Class: ۆ
Confidence: 99.92%

Class Probabilities:
Class      Probability     Confidence %   
--------------------------------------------------
ێ          0.0008          ░░░░░░░░░░░░░░░░░░░░   0.08%
ڵ          0.0000          ░░░░░░░░░░░░░░░░░░░░   0.00%
ۆ          0.9992          ███████████████████░  99.92%
ڕ          0.0000          ░░░░░░░░░░░░░░░░░░░░   0.00%
ڤ          0.0000          ░░░░░░░░░░░░░░░░░░░░   0.00%
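
The table above is derived from the model's raw outputs (logits): softmax turns them into probabilities that sum to 1, the largest one gives the predicted class, and each percentage is drawn as a 20-cell bar with one cell per 5%. The sketch below reproduces those steps in plain Python; the logits are made-up values for illustration, not outputs of the trained model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def text_bar(percent, cells=20):
    """Render a percentage as a fixed-width bar, one cell per 100/cells percent."""
    filled = int(percent / (100 / cells))
    return "█" * filled + "░" * (cells - filled)


logits = [1.0, -2.0, 8.0, -3.0, -1.5]  # hypothetical scores for the five classes
probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability

print(f"predicted index: {best}, confidence: {probs[best] * 100:.2f}%")
for p in probs:
    print(f"{p:.4f} {text_bar(p * 100)} {p * 100:6.2f}%")
```

Because the bar is truncated to whole cells, a confidence of 99.92% fills 19 of the 20 cells, as in the output above.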



In [37]:
# Sanity check: ensure OpenCV (cv2) is available and show version
import cv2
print("cv2 imported OK, version:", cv2.__version__)

cv2 imported OK, version: 4.12.0


In [38]:
# Sanity check: ensure torch is available and show device
import torch
print("torch imported OK, version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU name:", torch.cuda.get_device_name(0))
else:
    print("Using CPU")

torch imported OK, version: 2.9.1
CUDA available: False
Using CPU


In [39]:
# Sanity check: ensure scikit-learn is available and show version
import sklearn
from sklearn.preprocessing import LabelEncoder
print("sklearn imported OK, version:", sklearn.__version__)
print("LabelEncoder available:", LabelEncoder is not None)

sklearn imported OK, version: 1.7.2
LabelEncoder available: True


## Live Camera GUI – What it does and how to use it

This GUI provides a threaded live preview from your camera with a center ROI overlay. When you click Capture ROI, the selected area is converted to grayscale, binarized with Otsu thresholding, resized to 64×64, and classified by the CNN model.

Controls:
- Camera: choose a camera index (0/1/2)
- Start: opens the camera and starts a smooth preview
- Stop: stops preview and releases the camera
- Capture ROI: classifies the current ROI and shows:
  - ROI (left) and binarized copy (right)
  - Predicted Kurdish letter (large)
  - Confidence (%) and a per-class probability bar list
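
The Start/Stop behavior described above follows a common background-worker pattern: Start refuses to launch a second thread, and Stop signals the loop and waits for it to finish before the camera is released. The sketch below shows that pattern with a stand-in frame source instead of a camera. It uses `threading.Event` rather than the notebook's `running` flag, and `PreviewWorker` is a hypothetical class, so treat it as an illustration rather than the GUI's actual code.

```python
import threading
import time


class PreviewWorker:
    """Runs read_frame() repeatedly on a background thread until stopped."""

    def __init__(self, read_frame, interval=0.01):
        self._read_frame = read_frame
        self._interval = interval
        self._stop_event = threading.Event()
        self._thread = None
        self.frames = 0

    def start(self):
        """Start the loop; return False if it is already running."""
        if self._thread is not None and self._thread.is_alive():
            return False
        self._stop_event.clear()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()
        return True

    def _loop(self):
        while not self._stop_event.is_set():
            if self._read_frame() is None:
                break  # the source failed, comparable to cap.read() returning no frame
            self.frames += 1
            # wait() sleeps like time.sleep() but returns early once stop is requested
            self._stop_event.wait(self._interval)

    def stop(self, timeout=1.0):
        """Request a stop and wait for the thread; return True once it has ended."""
        self._stop_event.set()
        thread, self._thread = self._thread, None
        if thread is not None:
            thread.join(timeout)
            return not thread.is_alive()
        return True


worker = PreviewWorker(read_frame=lambda: b"fake-frame")
worker.start()
time.sleep(0.05)
worker.stop()
print("frames read:", worker.frames)
```

Joining the thread before releasing the resource avoids the case where the loop is still reading while the camera is being closed.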

Tips:
- Place a black letter on white background inside the ROI
- Ensure good lighting and focus
- If you see no frames, try a different camera index or grant camera permissions

Styling:
- Only the predicted class glyph is large; labels and confidence remain normal size for readability.
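
The capture step binarizes the ROI with Otsu's method, which picks the threshold that maximizes the between-class variance of the pixel histogram, and then inverts the result if the background came out dark. The notebook relies on `cv2.threshold` with `cv2.THRESH_OTSU` for this; the plain-Python sketch below only illustrates the idea on a flat list of pixel values, omits the Gaussian blur, and is not OpenCV's implementation. The function names are illustrative.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of intensities of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize_dark_on_light(pixels):
    """Threshold with Otsu, then invert if the result is mostly black."""
    t = otsu_threshold(pixels)
    binary = [255 if p > t else 0 for p in pixels]
    # Same idea as the notebook's to_binary: keep a white background, black letter
    if sum(binary) / len(binary) < 127:
        binary = [255 - v for v in binary]
    return binary


# Toy ROI: bright paper (220) with a smaller area of dark ink (30)
pixels = [220] * 80 + [30] * 20
binary = binarize_dark_on_light(pixels)
print(binary.count(255), "white pixels,", binary.count(0), "black pixels")
```

The inversion rule assumes the letter covers less than half of the ROI, which matches the tip about placing a single glyph on a plain background.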


In [40]:
## Interactive GUI: Camera ROI Capture and Prediction (robust, with diagnostics)

# Enhancements:
# - Camera index selector (0/1/2) and resolution hints
# - Immediate single-frame test after start to confirm preview
# - Heartbeat indicator updated from the preview loop
# - Defensive checks and clear status messages
# - Stop always releases camera; Start prevents duplicate threads

import cv2
import numpy as np
import time
import threading
from IPython.display import display
import ipywidgets as widgets

# Helper: predict from a grayscale numpy array (already 2D)
def predict_from_array(gray_array):
    assert len(gray_array.shape) == 2, "Expected grayscale 2D array"
    img_resized = cv2.resize(gray_array, (IMG_SIZE, IMG_SIZE))
    img_normalized = img_resized / 255.0
    img_tensor = torch.tensor(img_normalized, dtype=torch.float32).unsqueeze(0).unsqueeze(0).to(device)
    with torch.no_grad():
        output = model(img_tensor)
        probabilities = torch.softmax(output, dim=1)[0].cpu().numpy()
        predicted_idx = int(torch.argmax(output, dim=1).item())
        confidence = float(probabilities[predicted_idx] * 100)
    raw_class = le.classes_[predicted_idx]
    predicted_class_display = CLASS_DISPLAY_MAP.get(raw_class, raw_class)
    return predicted_class_display, confidence, probabilities

# Widgets
cam_index_dropdown = widgets.Dropdown(options=[0,1,2], value=0, description='Camera', layout=widgets.Layout(width='180px'))
start_button = widgets.Button(description="Start", button_style="success")
stop_button = widgets.Button(description="Stop", button_style="warning")
capture_button = widgets.Button(description="Capture ROI", button_style="primary")
status_label = widgets.HTML(value="<b>Status:</b> Ready")
heartbeat_label = widgets.HTML(value="<b>Heartbeat:</b> idle")
output_area = widgets.Output()
preview_area = widgets.VBox([])
image_widget = widgets.Image(format='png')
preview_area.children = [image_widget]

controls = widgets.HBox([cam_index_dropdown, start_button, stop_button, capture_button])
ui = widgets.VBox([
    widgets.HTML("""
    <div style='font-family:Inter,Helvetica,Arial,sans-serif; padding:8px 0;'>
      <h2 style='margin:0 0 8px;'>Kurdish Letter OCR – Live Capture</h2>
      <p style='color:#555; margin:0;'>Place a white paper with a black Kurdish letter within the ROI, then press <b>Capture ROI</b>.</p>
    </div>
    """),
    controls,
    status_label,
    heartbeat_label,
    preview_area,
    output_area
])

display(ui)

# Camera and ROI config
cap = None
running = False
preview_thread = None
last_frame = None
roi = None  # (x,y,w,h)
COLOR_ROI = (0,200,255)

# Frame overlay
def draw_overlay(frame):
    global roi
    h, w = frame.shape[:2]
    if roi is None:
        # Default ROI: centered rectangle covering 60% of each dimension
        rw, rh = int(w * 0.6), int(h * 0.6)
        rx, ry = (w - rw) // 2, (h - rh) // 2
        roi = (rx, ry, rw, rh)
    rx, ry, rw, rh = roi
    # Dim the whole frame by blending it with black, then restore the ROI at full
    # brightness (the previous all-black mask darkened the ROI as well)
    alpha = 0.35
    overlay = cv2.addWeighted(frame, 1 - alpha, np.zeros_like(frame), alpha, 0)
    overlay[ry:ry+rh, rx:rx+rw] = frame[ry:ry+rh, rx:rx+rw]
    frame = overlay
    cv2.rectangle(frame, (rx, ry), (rx+rw, ry+rh), COLOR_ROI, 2)
    cv2.putText(frame, 'Align paper within ROI', (rx+10, max(30, ry-10)), cv2.FONT_HERSHEY_SIMPLEX, 0.7, COLOR_ROI, 2, cv2.LINE_AA)
    return frame

# Binarize to white background / black text
def to_binary(gray):
    blur = cv2.GaussianBlur(gray, (5,5), 0)
    _, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
    if np.mean(thresh) < 127:
        thresh = 255 - thresh
    return thresh

# Preview loop

def preview_loop():
    global cap, running, last_frame
    frame_count = 0
    while running and cap is not None and cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            status_label.value = "<b>Status:</b> Camera read failed"
            break
        target_w = 800
        target_h = int(frame.shape[0] * target_w / frame.shape[1])
        frame = cv2.resize(frame, (target_w, target_h))
        last_frame = frame.copy()
        frame_disp = draw_overlay(frame.copy())
        # cv2.imencode expects BGR input, so encode the frame directly (no RGB conversion)
        ok, buf = cv2.imencode('.png', frame_disp)
        if ok:
            image_widget.value = buf.tobytes()
        frame_count += 1
        if frame_count % 15 == 0:
            heartbeat_label.value = f"<b>Heartbeat:</b> {frame_count} frames"
        time.sleep(0.02)
    heartbeat_label.value = "<b>Heartbeat:</b> stopped"

# Event handlers

def on_start_clicked(_):
    global cap, running, preview_thread, last_frame
    if running:
        status_label.value = "<b>Status:</b> Preview already running"
        return
    cam_idx = cam_index_dropdown.value
    cap = cv2.VideoCapture(cam_idx)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
    if not cap.isOpened():
        status_label.value = f"<b>Status:</b> Could not open camera index {cam_idx}"
        cap = None
        return
    ret, frame = cap.read()
    if not ret or frame is None:
        status_label.value = "<b>Status:</b> Camera delivered no frames"
        cap.release(); cap = None
        return
    target_w = 800
    target_h = int(frame.shape[0] * target_w / frame.shape[1])
    frame = cv2.resize(frame, (target_w, target_h))
    last_frame = frame.copy()
    frame_disp = draw_overlay(frame.copy())
    # cv2.imencode expects BGR input, so encode the frame directly (no RGB conversion)
    ok, buf = cv2.imencode('.png', frame_disp)
    if ok:
        image_widget.value = buf.tobytes()
    status_label.value = "<b>Status:</b> Camera started"
    heartbeat_label.value = "<b>Heartbeat:</b> starting"
    running = True
    preview_thread = threading.Thread(target=preview_loop, daemon=True)
    preview_thread.start()


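def on_stop_clicked(_):
    global cap, running, preview_thread
    running = False
    # Wait for the preview loop to exit before releasing the camera it reads from
    if preview_thread is not None:
        preview_thread.join(timeout=1.0)
    if cap is not None:
        cap.release()
    cap = None
    preview_thread = None
    status_label.value = "<b>Status:</b> Camera stopped"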


def on_capture_clicked(_):
    global last_frame, roi
    if last_frame is None:
        status_label.value = "<b>Status:</b> No frame available; start camera first"
        return
    rx, ry, rw, rh = roi if roi is not None else (0,0,last_frame.shape[1], last_frame.shape[0])
    roi_img = last_frame[ry:ry+rh, rx:rx+rw]
    if roi_img.size == 0:
        status_label.value = "<b>Status:</b> ROI out of bounds"
        return
    gray = cv2.cvtColor(roi_img, cv2.COLOR_BGR2GRAY)
    binary = to_binary(gray)

    predicted_class_display, confidence, probabilities = predict_from_array(binary)

    with output_area:
        output_area.clear_output(wait=True)
        # Keep BGR channel order: cv2.imencode expects BGR, so no RGB conversion is needed
        bin_bgr = cv2.cvtColor(binary, cv2.COLOR_GRAY2BGR)
        top = np.hstack([roi_img, bin_bgr])
        caption = np.zeros((40, top.shape[1], 3), dtype=np.uint8)
        cv2.putText(caption, 'ROI (left)  |  Binarized (right)', (10, 28), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255,255,255), 2, cv2.LINE_AA)
        comp = np.vstack([caption, top])
        ok, buf = cv2.imencode('.png', comp)
        if ok:
            display(widgets.Image(value=buf.tobytes(), format='png'))
        prob_lines = []
        for idx, raw_class in enumerate(le.classes_):
            class_name = CLASS_DISPLAY_MAP.get(raw_class, raw_class)
            percent = probabilities[idx] * 100
            bar_len = int(percent / 5)
            bar = '█' * bar_len + '░' * (20 - bar_len)
            prob_lines.append(f"{class_name}: {percent:6.2f}% {bar}")
        # Join outside the f-string: a backslash inside an f-string expression
        # is a SyntaxError before Python 3.12
        prob_text = "\n".join(prob_lines)
        html = f"""
        <div style='font-family:Inter,Helvetica,Arial,sans-serif; padding:8px 0;'>
          <div style='margin:6px 0; line-height:1;'>
            <span style='color:#0b5;'>Predicted:</span>
            <b style='font-size:180px; line-height:1;'>{predicted_class_display}</b>
            &nbsp;&nbsp;|&nbsp;&nbsp;
            <span style='color:#06c;'>Confidence:</span>
            <b>{confidence:.2f}%</b>
          </div>
          <pre style='background:#111; color:#ddd; padding:12px; border-radius:8px; line-height:1.4;'>{prob_text}</pre>
        </div>
        """
        display(widgets.HTML(html))
    status_label.value = "<b>Status:</b> ROI captured and classified"

# Bind events
start_button.on_click(on_start_clicked)
stop_button.on_click(on_stop_clicked)
capture_button.on_click(on_capture_clicked)
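
The preview geometry used in the cell above is simple arithmetic: each frame is scaled to a width of 800 pixels with its aspect ratio preserved, and the default ROI is a centered rectangle covering 60% of the width and height. The sketch below works through the numbers for a 1280×720 frame, the resolution the cell requests from the camera (the camera may deliver a different one). `preview_size` and `centered_roi` are illustrative helper names.

```python
def preview_size(frame_w, frame_h, target_w=800):
    """Scale a frame to target_w while keeping its aspect ratio."""
    return target_w, int(frame_h * target_w / frame_w)


def centered_roi(w, h, fraction=0.6):
    """Return (x, y, width, height) of a centered ROI covering `fraction` of each side."""
    rw, rh = int(w * fraction), int(h * fraction)
    return (w - rw) // 2, (h - rh) // 2, rw, rh


w, h = preview_size(1280, 720)  # 1280x720 scales to 800x450
print((w, h))
print(centered_roi(w, h))       # ROI of 480x270 starting at (160, 90)
```

The ROI coordinates refer to the resized preview frame, which is also the frame that Capture ROI crops from.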



In [41]:
# Verify ipywidgets availability
import ipywidgets as widgets
from IPython.display import display
btn = widgets.Button(description="Widgets OK", button_style="success")
display(btn)
print("ipywidgets imported OK")


ipywidgets imported OK


## Troubleshooting & Tips

Widgets don’t render:
- Ensure the notebook is running in a Jupyter environment that supports ipywidgets
- If necessary, install/enable ipywidgets in your environment and reload the notebook

Camera doesn’t start or shows a black frame:
- Try another Camera index from the dropdown (0, 1, 2)
- Close other apps using the camera
- On macOS, grant camera permission to your Jupyter app/kernel in System Settings → Privacy & Security → Camera

Prediction seems incorrect or unstable:
- Use high-contrast input (black ink on white paper)
- Center the glyph fully within the ROI and reduce motion
- Ensure the glyph resembles training data and isn’t mirrored/rotated

Performance:
- The device is selected automatically: a CUDA-capable GPU is used when PyTorch detects one; otherwise the model runs on CPU

Known assumptions:
- Input is a single glyph centered in the ROI
- Model expects 64×64 grayscale
- Classes and UI display mapping:
  - EEE_letters → ێ
  - LLL_letters → ڵ
  - OOO_letters → ۆ
  - RRR_letters → ڕ
  - VVV_letters → ڤ
