# 🔢 NumtaDB Bengali Digit Recognition - Google Colab Training

Train a deep learning model to recognize Bengali handwritten digits using Google Colab's free GPU!

**Dataset**: [NumtaDB by BengaliAI](https://www.kaggle.com/datasets/BengaliAI/numta) (85,000+ images)

**Model**: MobileNetV2 (Fast and accurate)

**Expected Results**: ~96-98% accuracy after 30-60 minutes of training on a free Colab GPU

---

## 📋 Prerequisites

Before running this notebook, you need:

1. **Kaggle API Credentials** (`kaggle.json`)
   - Go to https://www.kaggle.com/settings
   - Scroll to "API" section
   - Click "Create New API Token"
   - Download `kaggle.json`

2. **Trainer Package** (`trainer_package.zip`)
   - Run `./prepare_for_colab.sh` on your local machine
   - This creates `trainer_package.zip`

**You'll be prompted to upload both files when needed!**
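
Before uploading, it can help to confirm that the downloaded token is well-formed. The sketch below assumes the usual `kaggle.json` layout, a JSON object with `username` and `key` fields. `check_kaggle_token` is a hypothetical helper written for this notebook, not part of the Kaggle CLI.

```python
import json

def check_kaggle_token(text):
    """Return a list of problems found in the contents of a kaggle.json file."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    # Assumption: the token file carries at least these two fields.
    return [f"missing or empty field: {name}"
            for name in ("username", "key") if not data.get(name)]

# Example with placeholder values (not real credentials)
print(check_kaggle_token('{"username": "alice", "key": "0123abcd"}'))  # []
```

An empty list means the file looks usable. Anything else is worth fixing before the upload step.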


---
## Step 1: GPU Check

⚠️ **IMPORTANT**: Make sure you've enabled GPU!

Go to: **Runtime → Change runtime type → GPU → Save**

Then run the cell below to verify:


In [None]:
import torch
import sys

print("Python version:", sys.version.split()[0])
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())

if torch.cuda.is_available():
    print(f"✅ GPU: {torch.cuda.get_device_name(0)}")
    print(f"   Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
else:
    print("⚠️  No GPU detected! Training will be very slow.")
    print("   Go to Runtime → Change runtime type → GPU → Save")
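
Later steps need to know which device to run on. The following is a minimal sketch of the usual fallback pattern. `pick_device` is a hypothetical helper, and the `ImportError` branch exists only so the snippet also runs outside Colab, where PyTorch may not be installed.

```python
def pick_device():
    """Return "cuda" when a GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch there is nothing to run on a GPU anyway.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(f"Using device: {device}")
```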


---
## Step 2: Mount Google Drive (Optional but Recommended)

Mounting Google Drive allows you to:
- Save your trained model permanently
- Resume training later
- Keep your data between sessions

Authorize access when prompted. If you skip this step, everything is written to the temporary Colab VM and is lost when the session ends.


In [None]:
from google.colab import drive
import os

# Mount Google Drive
drive.mount('/content/drive')

# Create project directory in Google Drive
project_dir = '/content/drive/MyDrive/numtadb-project'
os.makedirs(project_dir, exist_ok=True)
os.chdir(project_dir)

print(f"\n✅ Working directory: {os.getcwd()}")
print("   Your files will be saved in Google Drive!")
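
Because mounting Drive is optional, the project directory can be chosen based on whether the mount point exists. This is a sketch under stated assumptions: `choose_project_dir` is a hypothetical helper, and `/content` as the local fallback is an assumption about the Colab VM layout.

```python
import os

def choose_project_dir(drive_root="/content/drive/MyDrive",
                       local_root="/content",
                       name="numtadb-project"):
    """Prefer Google Drive when it is mounted; otherwise fall back to local disk."""
    base = drive_root if os.path.isdir(drive_root) else local_root
    return os.path.join(base, name)

print(choose_project_dir())
```

Files under the local fallback do not survive the session, so download anything you want to keep before disconnecting.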


---
## Step 3: Install Dependencies

Install required Python packages. This takes ~30 seconds.


In [None]:
!pip install -q kaggle pandas matplotlib seaborn scikit-learn tqdm Pillow

print("✅ All dependencies installed!")
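
The cell above prints its success message regardless of what `pip` did. One way to double-check is to query the installed distributions with the standard library. `missing_packages` is a hypothetical helper; the package list mirrors the `pip install` line.

```python
from importlib import metadata

REQUIRED = ["kaggle", "pandas", "matplotlib", "seaborn", "scikit-learn", "tqdm", "Pillow"]

def missing_packages(names):
    """Return the distribution names that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing

print("Missing:", missing_packages(REQUIRED) or "none")
```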


---
## Step 4: Setup Kaggle API

**Upload your `kaggle.json` file when prompted below!**

This file contains your Kaggle API credentials needed to download the dataset.


In [None]:
from google.colab import files
import os

# Create Kaggle directory
os.makedirs('/root/.kaggle', exist_ok=True)

# Check if kaggle.json already exists
if os.path.exists('/root/.kaggle/kaggle.json'):
    print("✅ Kaggle credentials already configured!")
else:
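    print("📤 Please upload your kaggle.json file:")
    uploaded = files.upload()  # returns {filename: file contents as bytes}
    if not uploaded:
        raise RuntimeError("No file uploaded. Re-run this cell and select kaggle.json.")

    # Write the contents directly so a renamed upload (e.g. "kaggle (1).json") still works
    with open('/root/.kaggle/kaggle.json', 'wb') as f:
        f.write(next(iter(uploaded.values())))
    os.chmod('/root/.kaggle/kaggle.json', 0o600)

    # Remove the uploaded copy so credentials are not left in the working directory
    for name in uploaded:
        if os.path.exists(name):
            os.remove(name)

    print("\n✅ Kaggle credentials configured successfully!")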

# Verify
!kaggle datasets list -s numta | head -3


---
## Step 5: Download NumtaDB Dataset

Downloads ~300MB of Bengali handwritten digits from Kaggle.

This takes 2-3 minutes depending on connection speed.


In [None]:
import os

# Check if dataset already exists
if os.path.exists('data/raw/training-a.csv'):
    print("✅ Dataset already downloaded!")
    print(f"   Location: {os.path.abspath('data/raw')}")
else:
    print("📥 Downloading NumtaDB dataset from Kaggle...")
    
    # Create data directory
    os.makedirs('data/raw', exist_ok=True)
    os.chdir('data/raw')
    
    # Download and extract
    !kaggle datasets download -d BengaliAI/numta
    !unzip -q numta.zip
    !rm numta.zip
    
    # Go back to project root
    os.chdir('../..')
    
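    # Only report success if the expected file is actually there
    if os.path.exists('data/raw/training-a.csv'):
        print("\n✅ Dataset downloaded successfully!")
    else:
        print("\n⚠️  Download failed. Check your Kaggle credentials and re-run this cell.")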

# Show dataset structure
print("\n📊 Dataset structure:")
!ls -lh data/raw/*.csv 2>/dev/null || echo "⚠️  No CSV files found in data/raw"
print("\nImage directories:")
!ls -d data/raw/training-* data/raw/testing-* 2>/dev/null | head -10
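
For a more detailed look than `ls`, the extracted folder can be summarized in Python. This sketch assumes images are stored as `.png` or `.jpg` files below each sub-directory of `data/raw`. `summarize_dataset` is a hypothetical helper.

```python
from pathlib import Path

def summarize_dataset(raw_dir):
    """Count CSV files and images per sub-directory under raw_dir."""
    raw = Path(raw_dir)
    csv_files = sorted(p.name for p in raw.glob("*.csv"))
    image_counts = {}
    for sub in sorted(p for p in raw.iterdir() if p.is_dir()):
        # Assumption: images are .png / .jpg files somewhere below each folder.
        image_counts[sub.name] = sum(
            1 for f in sub.rglob("*") if f.suffix.lower() in {".png", ".jpg", ".jpeg"}
        )
    return csv_files, image_counts

if Path("data/raw").is_dir():
    csvs, counts = summarize_dataset("data/raw")
    print(f"{len(csvs)} CSV files")
    for name, n in counts.items():
        print(f"{name}: {n:,} images")
```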


---
## Step 6: Upload Trainer Package

**Upload your `trainer_package.zip` file when prompted!**

This contains all the training code from your local machine.


In [None]:
from google.colab import files
import zipfile
import os

# Check if trainer folder already exists
if os.path.exists('trainer') and os.path.exists('train_model.py'):
    print("✅ Trainer package already uploaded!")
else:
    print("📤 Please upload your trainer_package.zip file:")
    uploaded = files.upload()
    
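    # Extract the zip file (stop early if the upload was cancelled)
    if not uploaded:
        raise RuntimeError("No file uploaded. Re-run this cell and select trainer_package.zip.")
    zip_file = next(iter(uploaded))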
    print(f"\n📦 Extracting {zip_file}...")
    
    with zipfile.ZipFile(zip_file, 'r') as z:
        z.extractall('.')
    
    # Clean up zip file
    os.remove(zip_file)
    
    print("✅ Trainer package extracted successfully!")

# Verify files
print("\n📁 Project files:")
!ls -lh train_model.py 2>/dev/null
!ls trainer/ 2>/dev/null | head -10
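
If the verification above lists nothing, the archive may have been built with a different layout. The sketch below inspects a zip without extracting it. It assumes the package holds `train_model.py` and a `trainer/` folder at the top level, matching the paths this step checks. `missing_entries` is a hypothetical helper.

```python
import zipfile

# Assumption: these are the top-level entries the notebook expects after extraction.
EXPECTED = ("train_model.py", "trainer/")

def _present(entry, names):
    """Folders match by prefix; files must match exactly."""
    if entry.endswith("/"):
        return any(n.startswith(entry) for n in names)
    return entry in names

def missing_entries(zip_path, expected=EXPECTED):
    """Return the expected entries that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as z:
        names = z.namelist()
    return [e for e in expected if not _present(e, names)]
```

Calling `missing_entries('trainer_package.zip')` on your local machine before uploading returns an empty list when both entries are present.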


---
## Step 7: Configure Training

Set up training parameters. Adjust these based on your needs:

- **BATCH_SIZE**: Larger batches train faster but use more GPU memory (reduce it if you hit an out-of-memory error)
- **NUM_EPOCHS**: More epochs usually improve accuracy, at the cost of longer training
- **LEARNING_RATE**: Lower it if the loss oscillates or diverges
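
When memory is tight, a common approach is to halve the batch size until training fits. The sketch below only generates the sequence of sizes to try. `batch_size_candidates` and its default bounds are illustrative, not part of the trainer package.

```python
def batch_size_candidates(start=64, minimum=16):
    """Yield batch sizes to try, halving each time (e.g. 64, 32, 16)."""
    size = start
    while size >= minimum:
        yield size
        size //= 2

print(list(batch_size_candidates()))  # [64, 32, 16]
```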


In [None]:
import sys
import os

# Add current directory to Python path
sys.path.insert(0, os.getcwd())

# Import configuration
from trainer.config import Config

# Configure training parameters
Config.BATCH_SIZE = 64          # Reduce to 32 or 16 if you get OOM errors
Config.NUM_EPOCHS = 30          # Increase for better accuracy
Config.LEARNING_RATE = 0.001
Config.NUM_WORKERS = 2          # Colab works best with 2 workers
Config.MODEL_NAME = 'mobilenetv2'

# Print configuration
print("\n⚙️  Training Configuration:")
print("="*60)
print(f"Model:         {Config.MODEL_NAME}")
print(f"Batch Size:    {Config.BATCH_SIZE}")
print(f"Epochs:        {Config.NUM_EPOCHS}")
print(f"Learning Rate: {Config.LEARNING_RATE}")
print(f"Image Size:    {Config.IMAGE_SIZE}")
print(f"Device:        {Config.DEVICE}")
print(f"Workers:       {Config.NUM_WORKERS}")
print("="*60)

# Create necessary directories
Config.create_dirs()
print("\n✅ Configuration complete!")


---
## Step 8: Load Dataset

Creates training, validation, and test data loaders.


In [None]:
from trainer.dataset import create_dataloaders
import os

print("📊 Loading dataset...\n")

# Create data loaders
train_loader, val_loader, test_loader = create_dataloaders(
    str(Config.DATA_DIR), 
    Config
)

# Show dataset statistics
print("\n✅ Dataset loaded successfully!")
print("\n📈 Dataset Statistics:")
print("="*60)
print(f"Training samples:   {len(train_loader.dataset):,}")
print(f"Validation samples: {len(val_loader.dataset):,}")
print(f"Test samples:       {len(test_loader.dataset):,}")
print(f"Training batches:   {len(train_loader):,}")
print(f"Batch size:         {Config.BATCH_SIZE}")
print("="*60)
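
The "Training batches" figure follows directly from the sample count and the batch size. A short sketch of that arithmetic, using a hypothetical sample count; `drop_last` mirrors the PyTorch `DataLoader` option of the same name, and whether the trainer package sets it is not known here.

```python
import math

def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches a data loader yields for one pass over the data."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)

# Hypothetical sample count, chosen only to illustrate the arithmetic
print(batches_per_epoch(1000, 64))                  # 16
print(batches_per_epoch(1000, 64, drop_last=True))  # 15
```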


---
## Step 9: Train the Model

**This is the main training step!**

Training time: ~30-60 minutes depending on:
- Dataset size
- Number of epochs
- GPU type (usually T4 on free tier)

You'll see:
- Loss and accuracy for each epoch
- Progress bars
- Validation metrics

**The best model will be saved automatically!**


In [None]:
from trainer.train import Trainer
import time

print("🚀 Starting training...\n")
start_time = time.time()

# Create trainer
trainer = Trainer(Config)

# Train the model
metrics = trainer.train(train_loader, val_loader)

# Calculate training time
training_time = time.time() - start_time
hours = int(training_time // 3600)
minutes = int((training_time % 3600) // 60)
seconds = int(training_time % 60)

print("\n" + "="*60)
print("🎉 Training Complete!")
print("="*60)
print(f"Training time: {hours}h {minutes}m {seconds}s")
print(f"Best validation accuracy: {max(metrics.get('val_acc', [0])):.2f}%")
print(f"Model saved at: {Config.CHECKPOINT_DIR}/best_model.pth")
print("="*60)
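
Besides the best accuracy, it is often useful to know which epoch produced it. The sketch assumes `metrics['val_acc']` is a list with one accuracy value per epoch, as the `max(...)` call above implies. `summarize_history` is a hypothetical helper and the numbers are made up for illustration.

```python
def summarize_history(val_acc):
    """Return (best_epoch, best_accuracy) from a list of per-epoch accuracies."""
    if not val_acc:
        return None, None
    best_index = max(range(len(val_acc)), key=val_acc.__getitem__)
    return best_index + 1, val_acc[best_index]  # epochs are counted from 1

# Hypothetical accuracies, for illustration only
epoch, acc = summarize_history([91.2, 95.8, 97.1, 96.9])
print(f"Best validation accuracy {acc:.2f}% at epoch {epoch}")
```

If the best epoch is far from the last one, extra epochs are unlikely to help without other changes.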


---
## Step 10: Evaluate on Test Set

Test the trained model on unseen data to get final accuracy.


In [None]:
print("📊 Evaluating on test set...\n")

# Evaluate
test_loss, test_acc = trainer.validate(test_loader)

print("\n" + "="*60)
print("📈 Final Test Results")
print("="*60)
print(f"Test Loss:     {test_loss:.4f}")
print(f"Test Accuracy: {test_acc:.2f}%")
print("="*60)
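
A single accuracy number can hide digits that the model confuses more often than others. The sketch below computes per-class accuracy from two plain lists of labels. Collecting those lists from `test_loader` is left out, and the labels shown are hypothetical.

```python
from collections import Counter

def per_class_accuracy(true_labels, pred_labels):
    """Return {label: accuracy in percent} for every label in true_labels."""
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, pred_labels) if t == p)
    return {label: 100.0 * correct[label] / totals[label] for label in sorted(totals)}

# Hypothetical labels, for illustration only
y_true = [0, 0, 1, 1, 1, 9]
y_pred = [0, 1, 1, 1, 1, 9]
print(per_class_accuracy(y_true, y_pred))  # {0: 50.0, 1: 100.0, 9: 100.0}
```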


---
## Step 11: Visualize Training Results

Generate plots to visualize training progress.


In [None]:
from trainer.visualize import plot_training_history
import matplotlib.pyplot as plt
import pandas as pd
import os

metrics_file = Config.LOG_DIR / 'training_metrics.csv'
if os.path.exists(metrics_file):
    history = pd.read_csv(metrics_file)
    
    # Plot
    plot_training_history(history, save_path=Config.LOG_DIR / 'training_plots.png')
    
    print("✅ Training plots generated!")
    print(f"   Saved at: {Config.LOG_DIR}/training_plots.png")
    
    # Display in notebook
    plt.show()
else:
    print("⚠️  No training metrics file found.")


---
## Step 12: Download Trained Model

Download your trained model to use it locally or deploy it!

The model file is ~14MB.


In [None]:
from google.colab import files
import os

print("📥 Preparing files for download...\n")

# Files to download
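# Build paths from Config so they match where training and plotting wrote the files
download_files = [
    (os.path.join(str(Config.CHECKPOINT_DIR), 'best_model.pth'), 'Best model checkpoint'),
    (str(Config.LOG_DIR / 'training_metrics.csv'), 'Training history'),
    (str(Config.LOG_DIR / 'training_plots.png'), 'Training plots'),
]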

for file_path, description in download_files:
    if os.path.exists(file_path):
        print(f"✅ {description}: {file_path}")
        files.download(file_path)
    else:
        print(f"⚠️  {description} not found: {file_path}")

print("\n✅ Download complete!")
print("\n💡 If you used Google Drive, your files are also saved at:")
print(f"   {os.getcwd()}")
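
To see how large each file is before downloading, the sizes can be read from disk. A small sketch using binary units; `human_size` and `report_sizes` are hypothetical helpers.

```python
import os

def human_size(num_bytes):
    """Format a byte count using binary units (1 KB = 1024 bytes)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            return f"{size:.1f} {unit}"
        size /= 1024

def report_sizes(paths):
    """Map each existing path to its formatted size; missing files are skipped."""
    return {p: human_size(os.path.getsize(p)) for p in paths if os.path.exists(p)}

print(human_size(14 * 1024 * 1024))  # 14.0 MB
```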


---
## 🎁 Bonus: Test with Sample Images

Try your trained model on sample images from the test set!


In [None]:
import torch
import matplotlib.pyplot as plt

# Reuse the model trained in Step 9 (still in memory) and switch to inference mode
model = trainer.model
model.eval()

# Get some test samples
test_iter = iter(test_loader)
images, labels = next(test_iter)

# Predict
with torch.no_grad():
    images = images.to(Config.DEVICE)
    outputs = model(images)
    _, predicted = torch.max(outputs, 1)

# Display first 10 predictions
fig, axes = plt.subplots(2, 5, figsize=(15, 6))
axes = axes.ravel()

for i in range(min(10, len(images))):  # guard against batches smaller than 10
    img = images[i].cpu().numpy().transpose(1, 2, 0)
    img = (img - img.min()) / (img.max() - img.min() + 1e-8)  # Rescale to [0, 1] for display
    
    axes[i].imshow(img)
    axes[i].axis('off')
    
    true_label = labels[i].item()
    pred_label = predicted[i].item()
    
    color = 'green' if true_label == pred_label else 'red'
    axes[i].set_title(f'True: {true_label}\nPred: {pred_label}', color=color)

plt.tight_layout()
plt.savefig('sample_predictions.png', dpi=150, bbox_inches='tight')
plt.show()

print("\n✅ Sample predictions displayed above!")
print("   Green = Correct, Red = Incorrect")
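
The cell above keeps only the index of the largest output. If the model returns raw logits (an assumption, since the trainer code is not shown here), a softmax turns them into a confidence for each prediction. The pure-Python sketch below shows the computation; in PyTorch the equivalent call would be `torch.softmax(outputs, dim=1)`.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_prediction(logits):
    """Return (class_index, probability) of the highest-scoring class."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return index, probs[index]

# Hypothetical logits for a 3-class toy case
label, confidence = top_prediction([1.0, 3.0, 0.5])
print(label, round(confidence, 3))
```

Low-confidence predictions are good candidates to inspect by eye when looking for mislabeled or ambiguous digits.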


---
## 🎯 Next Steps

Congratulations! You've successfully trained a Bengali digit recognition model! 🎉

### What to do next:

1. **Deploy your model**
   - Convert to ONNX for web deployment
   - Create a web interface
   - Build a mobile app

2. **Improve accuracy**
   - Train for more epochs
   - Try different architectures
   - Tune hyperparameters
   - Add more data augmentation

3. **Experiment**
   - Try the AlexNet model
   - Adjust learning rate
   - Use different optimizers
   - Enable/disable pretrained weights

4. **Share your work**
   - Create a demo
   - Write a blog post
   - Share on social media

### Resources:

- 📄 [NumtaDB Paper](https://arxiv.org/abs/1806.02452)
- 🌐 [BengaliAI Community](https://www.bengali.ai/)
- 📊 [Dataset on Kaggle](https://www.kaggle.com/datasets/BengaliAI/numta)
- 💻 [Project Repository](https://github.com/smafjal/NumtaDB)

---

**Made with ❤️ for Bengali language technology**

If you found this useful, give it a ⭐ on GitHub!
