# Image Recognition with EuroSAT Dataset

[![image](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/opengeos/geoai/blob/main/docs/examples/image_recognition.ipynb)

This notebook demonstrates how to train an image recognition (classification) model using the `geoai.recognize` module. The module provides a high-level, single-function API for training classifiers on ImageFolder-style datasets, i.e., directories of images organized by class.

## Key Features

- **Single-function training API**: Pass a directory, get a trained model back
- **1000+ architectures**: ResNet, EfficientNet, Vision Transformers, ConvNeXt, and more via [timm](https://github.com/huggingface/pytorch-image-models)
- **Multi-format support**: Works with JPEG, PNG, and multi-band GeoTIFF images
- **Built-in evaluation**: Classification reports, confusion matrices, and prediction visualization
- **Transfer learning**: Pretrained ImageNet weights with optional backbone freezing

## Install packages

In [None]:
# %pip install geoai-py

## Import libraries

In [None]:
import os
from geoai.utils import download_file
from geoai.recognize import (
    load_image_dataset,
    train_image_classifier,
    predict_images,
    evaluate_classifier,
    plot_training_history,
    plot_confusion_matrix,
    plot_predictions,
)

## Download the EuroSAT RGB Dataset

The [EuroSAT](https://github.com/phelber/eurosat) dataset contains 27,000 Sentinel-2 satellite image patches (64x64 pixels, RGB) in 10 land use/land cover classes:

- AnnualCrop, Forest, HerbaceousVegetation, Highway, Industrial
- Pasture, PermanentCrop, Residential, River, SeaLake

In [None]:
url = "https://data.source.coop/opengeos/geoai/EuroSAT_RGB.zip"
download_dir = download_file(url)

In [None]:
data_dir = os.path.join(download_dir, "EuroSAT_RGB")
print(f"Dataset directory: {data_dir}")
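# List only subdirectories so stray files are not reported as classes
classes = sorted(
    d for d in os.listdir(data_dir) if os.path.isdir(os.path.join(data_dir, d))
)
print(f"Classes: {classes}")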
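
The extracted folder follows the ImageFolder convention: one subdirectory per class, with that class's images inside it. The sketch below is not the `geoai` implementation. It is a minimal standard-library illustration of how such a layout can be turned into class names, a class-to-index map, image paths, and integer labels. The helper `scan_image_folder`, the extension list, and the two-class toy layout are all hypothetical.

```python
import tempfile
from pathlib import Path

# Hypothetical set of extensions treated as images in this sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def scan_image_folder(root):
    """Return class names, a class-to-index map, image paths, and integer labels."""
    root = Path(root)
    # Each immediate subdirectory is treated as one class.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_idx = {name: i for i, name in enumerate(class_names)}
    image_paths, labels = [], []
    for name in class_names:
        for path in sorted((root / name).iterdir()):
            if path.suffix.lower() in IMAGE_EXTENSIONS:
                image_paths.append(str(path))
                labels.append(class_to_idx[name])
    return class_names, class_to_idx, image_paths, labels


# Build a tiny throwaway layout (empty placeholder files) to show the structure.
with tempfile.TemporaryDirectory() as tmp:
    for cls, files in {"Forest": ["a.jpg", "b.jpg"], "River": ["c.jpg"]}.items():
        (Path(tmp) / cls).mkdir()
        for f in files:
            (Path(tmp) / cls / f).touch()
    class_names, class_to_idx, image_paths, labels = scan_image_folder(tmp)

print(class_names)   # ['Forest', 'River']
print(class_to_idx)  # {'Forest': 0, 'River': 1}
print(labels)        # [0, 0, 1]
```

Sorting the class names before assigning indices keeps the label mapping stable across runs and machines.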

## Explore the Dataset

Use `load_image_dataset` to scan the directory and display the class distribution.

In [None]:
dataset_info = load_image_dataset(data_dir)
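
If you want the class distribution as plain numbers, it can be derived from the integer labels. The sketch below uses small made-up lists as stand-ins for `dataset_info["class_names"]` and `dataset_info["labels"]`, and assumes `labels` is a list of class indices, as the next cell suggests.

```python
from collections import Counter

# Toy stand-ins for dataset_info["class_names"] and dataset_info["labels"].
class_names = ["Forest", "River", "SeaLake"]
labels = [0, 0, 1, 2, 2, 2]

# Count how many samples carry each class index.
counts = Counter(labels)
distribution = {name: counts.get(i, 0) for i, name in enumerate(class_names)}

for name, n in distribution.items():
    print(f"{name:<10} {n:>4}  ({n / len(labels):.1%})")
```

A strongly imbalanced distribution is worth noting before training, because overall accuracy can then hide poor performance on rare classes.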

In [None]:
import matplotlib.pyplot as plt
from PIL import Image

class_names = dataset_info["class_names"]
image_paths = dataset_info["image_paths"]
labels = dataset_info["labels"]
class_to_idx = dataset_info["class_to_idx"]

fig, axes = plt.subplots(2, 5, figsize=(20, 8))

for idx, class_name in enumerate(class_names):
    ax = axes[idx // 5, idx % 5]
    # Find first image of this class
    img_idx = labels.index(class_to_idx[class_name])
    img = Image.open(image_paths[img_idx])
    ax.imshow(img)
    ax.set_title(class_name, fontsize=12)
    ax.axis("off")

plt.suptitle("Sample Image from Each Class", fontsize=14)
plt.tight_layout()
plt.show()

## Train a ResNet50 Classifier

The `train_image_classifier` function handles everything: scanning the directory, splitting into train/val/test, creating datasets, training, and evaluating.

In [None]:
result = train_image_classifier(
    data_dir=data_dir,
    model_name="resnet50",
    num_epochs=5,
    batch_size=32,
    learning_rate=1e-3,
    image_size=64,
    in_channels=3,
    pretrained=True,
    output_dir="image_recognition_output/resnet50",
    num_workers=4,
    seed=42,
)
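
To make the role of `seed` more concrete, here is one way a reproducible train/val/test split could be done. This is only a sketch under stated assumptions: the helper `stratified_split`, the 15%/15% validation and test fractions, and the per-class (stratified) strategy are hypothetical and are not taken from the `geoai` source.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_frac=0.15, test_frac=0.15, seed=42):
    """Split sample indices into train/val/test, preserving class proportions."""
    rng = random.Random(seed)  # local RNG so the same seed gives the same split
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = int(len(indices) * val_frac)
        n_test = int(len(indices) * test_frac)
        val.extend(indices[:n_val])
        test.extend(indices[n_val:n_val + n_test])
        train.extend(indices[n_val + n_test:])
    return train, val, test


# Toy labels: 100 samples of class 0 and 60 samples of class 1.
labels = [0] * 100 + [1] * 60
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Fixing the seed matters when comparing architectures: both models should be evaluated on the same held-out images.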

## Plot Training History

Visualize the training and validation loss and accuracy curves.

In [None]:
fig = plot_training_history("image_recognition_output/resnet50/models")
plt.show()

## Evaluate on Test Set

Generate a classification report with precision, recall, and F1-score for each class.

In [None]:
eval_result = evaluate_classifier(
    model=result["model"],
    dataset=result["test_dataset"],
    class_names=result["class_names"],
)
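
For reference, precision, recall, and F1-score can all be derived from a confusion matrix. The sketch below works on a made-up 3-class matrix and assumes the common convention that rows are true classes and columns are predicted classes. The helper `per_class_metrics` is illustrative and not part of `geoai`.

```python
def per_class_metrics(cm):
    """Compute precision, recall, and F1 per class from a square confusion matrix.

    Assumes cm[i][j] counts samples of true class i predicted as class j.
    """
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, actually another class
        fn = sum(cm[k]) - tp                        # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


# Made-up confusion matrix for three classes (10 samples each).
cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 1, 9],
]
metrics = per_class_metrics(cm)
for k, m in enumerate(metrics):
    print(k, {name: round(v, 3) for name, v in m.items()})

# Overall accuracy is the diagonal sum divided by the total count.
accuracy = sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))
print(f"accuracy: {accuracy:.3f}")
```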

## Plot Confusion Matrix

In [None]:
fig = plot_confusion_matrix(
    eval_result["confusion_matrix"],
    result["class_names"],
)
plt.show()

In [None]:
fig = plot_confusion_matrix(
    eval_result["confusion_matrix"],
    result["class_names"],
    normalize=True,
)
plt.show()
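
A normalized confusion matrix is usually easier to read when classes have different sizes. The sketch below shows row normalization, where each row is divided by its total so that entries become fractions of the true class. That `normalize=True` uses exactly this convention is an assumption here, and `normalize_rows` is a hypothetical helper.

```python
def normalize_rows(cm):
    """Divide each row by its sum so entries become per-true-class fractions."""
    normalized = []
    for row in cm:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


# Made-up counts for two classes of very different size.
cm = [
    [45, 5],   # 50 samples of class 0
    [2, 8],    # 10 samples of class 1
]
normalized = normalize_rows(cm)
for row in normalized:
    print([round(v, 2) for v in row])
```

With row normalization, the diagonal holds per-class recall, so a small class with weak recall stands out even when its raw counts are low.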

## Visualize Predictions

Show model predictions on test images with confidence scores. Green titles indicate correct predictions; red titles indicate misclassifications.

In [None]:
test_dataset = result["test_dataset"]
test_paths = test_dataset.image_paths
test_labels = test_dataset.labels

pred_result = predict_images(
    model=result["model"],
    image_paths=test_paths[:20],
    class_names=result["class_names"],
    image_size=64,
    in_channels=3,
)

fig = plot_predictions(
    image_paths=test_paths[:20],
    predictions=pred_result["predictions"],
    true_labels=test_labels[:20],
    class_names=result["class_names"],
    probabilities=pred_result["probabilities"],
)
plt.show()
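
The confidence score shown for each image is typically the probability of the predicted class. As a rough sketch, assuming the probabilities come from a softmax over the model's raw outputs (not confirmed for `predict_images`), the prediction, confidence, and title color for a single image could be derived as follows. The logits are made up.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities; subtract the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class_names = ["Forest", "River", "SeaLake"]
logits = [2.0, 0.5, -1.0]  # made-up raw scores for one image
probs = softmax(logits)

# The predicted class is the index with the highest probability.
pred = max(range(len(probs)), key=probs.__getitem__)
true_label = 0
color = "green" if pred == true_label else "red"
print(f"{class_names[pred]} ({probs[pred]:.1%}) -> title color: {color}")
```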

## Train an EfficientNet-B0 Classifier

EfficientNet models are designed to balance accuracy against computational cost. The cells below train EfficientNet-B0 with the same settings used for ResNet50 so that the two models can be compared.

In [None]:
result_effnet = train_image_classifier(
    data_dir=data_dir,
    model_name="efficientnet_b0",
    num_epochs=5,
    batch_size=32,
    learning_rate=1e-3,
    image_size=64,
    in_channels=3,
    pretrained=True,
    output_dir="image_recognition_output/efficientnet_b0",
    num_workers=4,
    seed=42,
)

In [None]:
eval_effnet = evaluate_classifier(
    model=result_effnet["model"],
    dataset=result_effnet["test_dataset"],
    class_names=result_effnet["class_names"],
)

In [None]:
fig = plot_confusion_matrix(
    eval_effnet["confusion_matrix"],
    result_effnet["class_names"],
    normalize=True,
)
plt.show()

## Compare Results

Compare the overall accuracy of both models.

In [None]:
print(f"ResNet50 accuracy:        {eval_result['accuracy']:.4f}")
print(f"EfficientNet-B0 accuracy: {eval_effnet['accuracy']:.4f}")

## Summary

This notebook demonstrated:

1. **Dataset download**: Using `download_file` to fetch the EuroSAT RGB dataset
2. **Single-function training**: `train_image_classifier` handles splitting, dataset creation, training, and evaluation
3. **Evaluation**: Classification reports and confusion matrices with `evaluate_classifier`
4. **Visualization**: Training curves, confusion matrices, and prediction grids
5. **Architecture comparison**: Comparing ResNet50 and EfficientNet-B0

## Key Parameters

- `model_name`: Choose from 1000+ timm models
- `image_size`: Resize images to this size (this notebook uses 64, the native EuroSAT patch size)
- `in_channels`: 3 for RGB, 4+ for multispectral GeoTIFF
- `pretrained`: Use ImageNet pretrained weights for transfer learning
- `freeze_backbone`: Freeze backbone for faster fine-tuning
- `num_epochs`, `batch_size`, `learning_rate`: Standard training hyperparameters

## Next Steps

- Try more architectures (ConvNeXt, Swin Transformer, ViT)
- Experiment with data augmentation transforms
- Use `freeze_backbone=True` for faster fine-tuning
- Apply to multi-band GeoTIFF datasets by setting `in_channels` > 3