# EverWatch Bird Object Detection Dataset

## Intro

EverWatch [(Garner et al. 2024)](https://zenodo.org/records/11165946) is a benchmark dataset for detecting birds in high-resolution aerial imagery of the Everglades. It provides training and evaluation data for automated detection and species classification of wading birds, supporting ecological research and conservation monitoring.

## Dataset Characteristics

- **Modalities**: 
  - High-resolution RGB aerial imagery from drones
- **Spatial Resolution**: Sub-meter resolution (typically 0.1-0.5 m/pixel)
- **Temporal Resolution**: Single acquisitions per location
- **Spectral Bands**: 
  - RGB: 3 channels (Red, Green, Blue)
- **Image Dimensions**: Variable sizes (typically 1000-4000 pixels)
- **Labels**: Bird object detection bounding boxes
  - Annotations for seven bird species
- **Geographic Distribution**: Locations across the Everglades, Florida, USA

## Dataset Setup and Initialization

```python
from pathlib import Path

from geobench_v2.datamodules import GeoBenchEverWatchDataModule

# Setup paths
PROJECT_ROOT = Path("../../")

# Initialize datamodule
datamodule = GeoBenchEverWatchDataModule(
    img_size=512,
    batch_size=8,
    num_workers=4,
    root=PROJECT_ROOT / "data" / "everwatch",
    download=True,
)
datamodule.setup("fit")
datamodule.setup("test")

print("EverWatch datamodule initialized successfully!")
print(f"Training samples: {len(datamodule.train_dataset)}")
print(f"Validation samples: {len(datamodule.val_dataset)}")
print(f"Test samples: {len(datamodule.test_dataset)}")
```
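Once the datamodule is set up, it can help to check what a batch actually contains before training. The helper below is a minimal sketch, not part of GeoBenchV2: `summarize_batch` is a hypothetical name, and it assumes that batches are dict-like and that the datamodule exposes a Lightning-style `train_dataloader()`. Neither assumption is confirmed by this notebook, so the datamodule call is left as a comment.

```python
def summarize_batch(batch):
    """Return {key: description} for a dict-like batch.

    Tensors/arrays are described by shape and dtype, lists and tuples
    by their length, and anything else by its type name.
    """
    summary = {}
    for key, value in batch.items():
        if hasattr(value, "shape"):
            dtype = getattr(value, "dtype", "n/a")
            summary[key] = f"shape={tuple(value.shape)}, dtype={dtype}"
        elif isinstance(value, (list, tuple)):
            # Detection batches often hold one entry per image here,
            # because the number of boxes differs between images.
            summary[key] = f"{type(value).__name__} of length {len(value)}"
        else:
            summary[key] = type(value).__name__
    return summary


# Usage with the datamodule above (assumes a Lightning-style API):
# batch = next(iter(datamodule.train_dataloader()))
# for key, description in summarize_batch(batch).items():
#     print(f"{key}: {description}")
```

Printing the summary once shows which keys hold images, boxes, and labels, without relying on key names that are not documented here.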

## Geographic Distribution Visualization

The EverWatch imagery was collected at sites across the Everglades in Florida. The cell below plots the geographic distribution of the samples:

```python
geo_fig = datamodule.visualize_geospatial_distribution()
```

## Sample Data Visualization

The cell below visualizes one batch of aerial images together with their bird bounding-box annotations:

```python
fig, batch = datamodule.visualize_batch()
```

## GeoBenchV2 Processing Pipeline

### Preprocessing Steps

1. **Image Processing**:
   - Resize variable-size drone imagery to consistent patches of size 512x512

2. **Split Generation**:
   - Keep the dataset's original train/test split, and create a validation set by splitting on colony location (metadata that is available for some, but not all, images)
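
The two steps above can be illustrated with a short sketch. This is not the GeoBenchV2 implementation. The function names, the `"colony"` record key, the 20% validation fraction, and the choice to keep images without colony metadata in the training set are all assumptions made for illustration. The first function shows how bounding boxes would have to be rescaled if an image is resized directly to 512x512 (no cropping or padding). The second holds out whole colonies so that no colony appears in both training and validation.

```python
import random
from collections import defaultdict


def rescale_boxes(boxes, orig_size, new_size=(512, 512)):
    """Scale [xmin, ymin, xmax, ymax] pixel boxes from orig_size to new_size.

    Sizes are (width, height). Assumes a plain resize of the whole image.
    """
    sx = new_size[0] / orig_size[0]
    sy = new_size[1] / orig_size[1]
    return [[x0 * sx, y0 * sy, x1 * sx, y1 * sy] for x0, y0, x1, y1 in boxes]


def split_by_colony(records, val_fraction=0.2, seed=0):
    """Split records into (train, val), holding out whole colonies.

    records: list of dicts with a hypothetical "colony" key (None if unknown).
    val_fraction is the share of colonies, not images, sent to validation.
    Records without colony metadata stay in training (an assumption).
    """
    by_colony = defaultdict(list)
    for record in records:
        by_colony[record.get("colony")].append(record)

    train = by_colony.pop(None, [])  # unknown colony -> training set
    colonies = sorted(by_colony)
    random.Random(seed).shuffle(colonies)  # reproducible colony order

    n_val = max(1, round(len(colonies) * val_fraction)) if colonies else 0
    val_colonies = set(colonies[:n_val])

    val = []
    for colony, colony_records in by_colony.items():
        target = val if colony in val_colonies else train
        target.extend(colony_records)
    return train, val


# Example with made-up records:
records = [
    {"image": "a.png", "colony": "A"},
    {"image": "b.png", "colony": "B"},
    {"image": "c.png", "colony": None},
]
train_records, val_records = split_by_colony(records, val_fraction=0.5)
```

Splitting by colony rather than by image keeps near-duplicate views of the same site out of the validation set, which gives a more honest estimate of how a detector generalizes to new locations.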

## References

1. Garner, Lindsey, Ben Weinstein, Michael Rickershauser, Melissa Baldino, Holly Coates, Mary Commins, Tracey Faber, et al. “Everwatch Benchmark: Training and Evaluation Data for Detection and Species Classification of Everglades Wading Birds from Airborne Imagery”. Zenodo, May 13, 2024. https://doi.org/10.5281/zenodo.11165946.