# Anomaly Detection in Remote Sensing Data

This notebook demonstrates how to perform anomaly detection on remote sensing raster data using the Isolation Forest algorithm from `scikit-learn` in Python. Here, anomaly detection flags pixels whose band values are rare relative to the rest of the image; such pixels can correspond to deforestation, urban expansion, or sensor errors.

## Prerequisites
- Install required libraries: `rasterio`, `scikit-learn`, `numpy`, `matplotlib` (listed in `requirements.txt`).
- A multi-band GeoTIFF file (e.g., `sample.tif`). Replace the file path with your own data.

## Learning Objectives
- Apply Isolation Forest for anomaly detection on raster data.
- Visualize and interpret anomaly maps.
- Save the anomaly detection results as a raster.

In [None]:
# Import required libraries
import rasterio
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import IsolationForest

## Step 1: Load the Raster File

Load a multi-band raster file and prepare it for anomaly detection.
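
The layout change used in this step can be checked on a small synthetic array before running it on real data. The sketch below is only an illustration: the array `cube` and its values are made up and are not read from `sample.tif`. It moves the band axis last, flattens the image into one row per pixel, and drops every pixel that is masked in at least one band.

```python
import numpy as np

# Hypothetical 3-band, 2x2 raster; one pixel is marked as no-data in band 0.
cube = np.ma.masked_array(
    np.arange(12, dtype=np.float32).reshape(3, 2, 2),
    mask=np.zeros((3, 2, 2), dtype=bool),
)
cube[0, 1, 1] = np.ma.masked

bands, height, width = cube.shape

# Move bands to the last axis, then flatten rows and columns into one pixel axis.
pixels = cube.transpose(1, 2, 0).reshape(-1, bands)

# A pixel is unusable if any of its bands is masked.
invalid = np.any(np.ma.getmaskarray(cube), axis=0).ravel()
valid_pixels = pixels[~invalid].data  # plain ndarray, one row per valid pixel

print(pixels.shape)        # (4, 3)
print(valid_pixels.shape)  # (3, 3)
print(valid_pixels[0])     # band values of the top-left pixel: [0. 4. 8.]
```

Each row of `valid_pixels` holds the band values of one pixel, which is the sample layout that scikit-learn estimators expect.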

In [None]:
# Define the path to the raster file
raster_path = 'sample.tif'

# Open the raster file
with rasterio.open(raster_path) as src:
    raster_data = src.read(masked=True)  # Shape: (bands, height, width)
    profile = src.profile

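# Reshape for anomaly detection: (height * width, bands)
bands, height, width = raster_data.shape
X = raster_data.transpose(1, 2, 0).reshape(-1, bands)

# Remove masked (no-data) pixels: drop a pixel if any of its bands is masked.
# getmaskarray returns a full boolean array even when nothing is masked.
mask = np.any(np.ma.getmaskarray(raster_data), axis=0).ravel()
X_valid = X[~mask].data  # plain ndarray; the remaining rows contain no masked values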

# Print basic information
print(f'Raster shape: {raster_data.shape}')
print(f'Valid pixels for anomaly detection: {X_valid.shape[0]}')

## Step 2: Apply Isolation Forest

Use Isolation Forest to detect anomalies in the raster data.
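
To get a feel for the labels and scores before running on a full raster, the following minimal sketch fits an Isolation Forest to synthetic samples. The data, the planted outliers, and the variable names are assumptions made for illustration; they stand in for the valid pixels of a real image.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Hypothetical "pixels": 500 background samples with 4 bands, plus 5 planted outliers.
background = rng.normal(loc=0.0, scale=1.0, size=(500, 4))
outliers = rng.normal(loc=12.0, scale=0.5, size=(5, 4))
samples = np.vstack([background, outliers])

forest = IsolationForest(contamination=0.05, random_state=42)
labels = forest.fit_predict(samples)    # -1 = anomaly, 1 = normal
scores = forest.score_samples(samples)  # lower score = easier to isolate

print("flagged:", int((labels == -1).sum()), "of", len(labels))
print("planted outliers flagged:", bool((labels[-5:] == -1).all()))
```

Because `contamination` fixes the share of samples labeled `-1`, some ordinary background samples are flagged alongside the planted outliers. The continuous `scores` are often more informative than the binary labels when ranking pixels for review.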

In [None]:
# Initialize Isolation Forest
# contamination=0.1 flags roughly the 10% most isolated valid pixels as anomalies
iso_forest = IsolationForest(contamination=0.1, random_state=42)

# Fit and predict anomalies
anomaly_labels = iso_forest.fit_predict(X_valid)

# Create anomaly map (-1 for anomalies, 1 for normal, -9999 for no-data pixels)
anomaly_map = np.full(height * width, -9999, dtype=np.int32)
anomaly_map[~mask] = anomaly_labels
anomaly_map = anomaly_map.reshape(height, width)

# Visualize anomaly map; hide no-data pixels so -9999 is not clipped to the anomaly color
plt.figure(figsize=(8, 8))
plt.imshow(np.ma.masked_equal(anomaly_map, -9999), cmap='bwr', vmin=-1, vmax=1)
plt.colorbar(label='Anomaly (-1) / Normal (1)')
plt.title('Anomaly Detection Map')
plt.xlabel('Column')
plt.ylabel('Row')
plt.show()
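
The step that writes labels back into the image grid can be sketched on a tiny example. The validity mask and labels below are invented for illustration; only the `-9999` sentinel is taken from the cell above.

```python
import numpy as np

NODATA = -9999  # same sentinel value as in the cell above

height, width = 2, 3
# Hypothetical validity mask (True = no-data) and labels for the valid pixels only.
invalid = np.array([False, True, False, False, True, False])
valid_labels = np.array([1, -1, 1, 1])

grid = np.full(height * width, NODATA, dtype=np.int32)
grid[~invalid] = valid_labels  # valid pixels keep their row-major order
grid = grid.reshape(height, width)

# Mask the sentinel so plotting and statistics ignore no-data pixels.
display = np.ma.masked_equal(grid, NODATA)
anomaly_fraction = float((display.compressed() == -1).mean())

print(grid)
print("anomaly fraction among valid pixels:", anomaly_fraction)
```

Keeping the sentinel in the stored array and masking it only for display or statistics means the saved raster still carries an explicit no-data value.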

## Step 3: Save Anomaly Map

Save the anomaly detection result as a single-band GeoTIFF.

In [None]:
# Update profile for single-band output
output_profile = profile.copy()
output_profile.update(count=1, dtype=rasterio.int32, nodata=-9999)

# Save anomaly map
with rasterio.open('anomaly_map.tif', 'w', **output_profile) as dst:
    dst.write(anomaly_map, 1)

print('Anomaly map saved to: anomaly_map.tif')

## Next Steps

- Replace `sample.tif` with your own multi-band raster file.
- Adjust the `contamination` parameter, the expected fraction of anomalous pixels. It sets the score threshold, so higher values flag more pixels.
- Explore other anomaly detection algorithms (e.g., One-Class SVM).
- Proceed to the next notebook (`20_object_detection_yolo.ipynb`) for object detection with YOLO.
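
The effect of `contamination` mentioned in the list above can be explored without refitting the model for every value. The sketch below is one possible approach, not the notebook's own method: it fits once on synthetic samples (a made-up stand-in for valid pixels), then thresholds the anomaly scores at different quantiles, which approximates what a different `contamination` setting would flag.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
samples = rng.normal(size=(2000, 3))  # hypothetical stand-in for valid pixels

# Fit once; contamination="auto" is used so that no fraction is fixed at fit time.
forest = IsolationForest(contamination="auto", random_state=42).fit(samples)
scores = forest.score_samples(samples)  # lower = more anomalous

flagged = {}
for fraction in (0.01, 0.05, 0.10):
    # Flag the lowest-scoring share of samples for this candidate fraction.
    threshold = np.quantile(scores, fraction)
    labels = np.where(scores <= threshold, -1, 1)
    flagged[fraction] = float((labels == -1).mean())
    print(f"fraction={fraction:.2f} -> flagged share {flagged[fraction]:.3f}")
```

Comparing the resulting maps for a few fractions is a simple way to choose a value when the true share of anomalies is unknown.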

## Notes
- Ensure the raster has multiple bands for meaningful anomaly detection.
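- Per-band normalization or scaling (e.g., see `11_normalization_scaling.ipynb`) has little effect on Isolation Forest, whose splits are thresholds on individual bands, but it matters for distance-based methods such as One-Class SVM.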
- See `docs/installation.md` for troubleshooting library installation.