# Normalization and Scaling of Raster Data

This notebook demonstrates how to normalize and scale raster data using `rasterio` and `numpy` in Python. Normalization and scaling are essential preprocessing steps in remote sensing to prepare data for machine learning or visualization by adjusting pixel values to a standard range.

## Prerequisites
- Install required libraries: `rasterio`, `numpy`, `matplotlib` (listed in `requirements.txt`).
- A sample GeoTIFF file (e.g., `sample.tif`). Replace the file path with your own raster file.

## Learning Objectives
- Apply min-max normalization to scale raster data to [0, 1].
- Standardize raster data using z-score scaling.
- Visualize and save the normalized/scaled raster.

In [None]:
# Import required libraries
import rasterio
import numpy as np
import matplotlib.pyplot as plt

## Step 1: Load the Raster File

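Load the raster file with `masked=True` so that no-data pixels are masked and excluded from later statistics. `src.read()` returns an array of shape `(bands, rows, columns)`.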

In [None]:
# Define the path to the raster file
raster_path = 'sample.tif'

# Open the raster file
with rasterio.open(raster_path) as src:
    raster_data = src.read(masked=True)  # Read with mask for no-data
    profile = src.profile

# Print basic information
print(f'Raster shape: {raster_data.shape}')
print(f'Number of bands: {raster_data.shape[0]}')
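
To see what `masked=True` changes, here is a minimal sketch that uses a small synthetic array instead of a real GeoTIFF. The no-data value `-9999` and the pixel values are assumptions for illustration, and `np.ma.masked_equal` stands in for the mask that `rasterio` builds from the file's no-data metadata.

```python
import numpy as np

# Hypothetical 1-band, 2x3 raster where -9999 marks no-data pixels
nodata = -9999
raw = np.array([[[10, 20, nodata],
                 [30, nodata, 50]]], dtype=np.int32)

# Stand-in for src.read(masked=True): mask every pixel equal to no-data
masked = np.ma.masked_equal(raw, nodata)

# Statistics on the plain array are polluted by the sentinel value
print(raw.min())       # -9999
# Statistics on the masked array skip no-data pixels
print(masked.min())    # 10
print(masked.count())  # 4 valid pixels
```

Without the mask, the sentinel would become the band minimum and distort every scaling step that follows.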

## Step 2: Min-Max Normalization

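Normalize each band to the range [0, 1] using min-max scaling: `(x - min) / (max - min)`, where `min` and `max` are computed per band over the valid (unmasked) pixels. A small constant is added to the denominator so that a constant band, where `max` equals `min`, does not cause division by zero.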

In [None]:
# Initialize array for normalized data
normalized_data = np.zeros_like(raster_data, dtype=np.float32)

# Normalize each band
for band in range(raster_data.shape[0]):
    band_data = raster_data[band]
    band_min = np.min(band_data)
    band_max = np.max(band_data)
    normalized_data[band] = (band_data - band_min) / (band_max - band_min + 1e-10)  # Avoid division by zero

# Visualize first normalized band
plt.figure(figsize=(8, 8))
plt.imshow(normalized_data[0], cmap='gray', vmin=0, vmax=1)
plt.colorbar(label='Normalized Value')
plt.title('Min-Max Normalized First Band')
plt.xlabel('Column')
plt.ylabel('Row')
plt.show()
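
The per-band formula can be checked by hand on a tiny array. The sketch below wraps it in a helper named `min_max_normalize` (a hypothetical name, not part of `rasterio` or `numpy`) and applies it to made-up values, with `-9999` assumed as the no-data value.

```python
import numpy as np

def min_max_normalize(band, eps=1e-10):
    """Scale one masked band to [0, 1], ignoring masked pixels."""
    band = band.astype(np.float32)
    band_min = band.min()  # Minimum over valid pixels only
    band_max = band.max()  # Maximum over valid pixels only
    return (band - band_min) / (band_max - band_min + eps)

# Valid pixels are 10, 20, and 50; the -9999 pixel is masked
band = np.ma.masked_equal(np.array([[10, 20], [-9999, 50]]), -9999)
scaled = min_max_normalize(band)
print(scaled)  # Roughly 0.0, 0.25, masked, 1.0

# A constant band maps to all zeros instead of raising an error
flat = np.ma.array(np.full((2, 2), 7))
flat_scaled = min_max_normalize(flat)
print(flat_scaled)
```

The masked pixel stays masked in the output, so it can still be written as no-data later.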

## Step 3: Z-Score Standardization

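Standardize each band using z-score scaling: `(x - mean) / std`, where `mean` and `std` are computed per band over the valid (unmasked) pixels. Each standardized band then has a mean of approximately 0 and a standard deviation of approximately 1.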

In [None]:
# Initialize array for standardized data
standardized_data = np.zeros_like(raster_data, dtype=np.float32)

# Standardize each band
for band in range(raster_data.shape[0]):
    band_data = raster_data[band]
    band_mean = np.mean(band_data)
    band_std = np.std(band_data)
    standardized_data[band] = (band_data - band_mean) / (band_std + 1e-10)  # Avoid division by zero

# Visualize first standardized band
plt.figure(figsize=(8, 8))
plt.imshow(standardized_data[0], cmap='viridis')
plt.colorbar(label='Z-Score')
plt.title('Z-Score Standardized First Band')
plt.xlabel('Column')
plt.ylabel('Row')
plt.show()
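
A quick way to confirm the standardization is to check that the result has a mean near 0 and a standard deviation near 1. The sketch below does this on made-up values; `z_score` is a hypothetical helper name and `-9999` is an assumed no-data value. Note that `np.std` and the masked-array `std` method compute the population standard deviation (`ddof=0`) by default.

```python
import numpy as np

def z_score(band, eps=1e-10):
    """Standardize one masked band, ignoring masked pixels."""
    band = band.astype(np.float64)
    return (band - band.mean()) / (band.std() + eps)

# Valid pixels are 2, 4, and 6; the -9999 pixel is masked
band = np.ma.masked_equal(np.array([[2, 4], [-9999, 6]]), -9999)
z = z_score(band)

print(z)                # Symmetric values around 0, one pixel masked
print(float(z.mean()))  # Approximately 0
print(float(z.std()))   # Approximately 1
```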

## Step 4: Save Normalized and Standardized Rasters

Save the normalized and standardized rasters to new GeoTIFF files.

In [None]:
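# Update profile for float32 output and set an explicit no-data value
nodata_value = -9999.0  # Assumed sentinel; must lie outside the valid output ranges
output_profile = profile.copy()
output_profile.update(dtype=rasterio.float32, nodata=nodata_value)

# Save normalized raster, filling masked pixels with the no-data value
with rasterio.open('normalized_raster.tif', 'w', **output_profile) as dst:
    dst.write(np.ma.filled(normalized_data, nodata_value).astype(rasterio.float32))

# Save standardized raster the same way
with rasterio.open('standardized_raster.tif', 'w', **output_profile) as dst:
    dst.write(np.ma.filled(standardized_data, nodata_value).astype(rasterio.float32))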

print('Normalized raster saved to: normalized_raster.tif')
print('Standardized raster saved to: standardized_raster.tif')

## Next Steps

- Replace `sample.tif` with your own raster file.
- Experiment with different scaling ranges (e.g., [0, 255] for visualization).
- Apply normalization to specific bands or derived indices (e.g., NDVI).
- Proceed to the next notebook (`12_classification_rf_svm.ipynb`) for machine learning classification.
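
As a starting point for the scaling-range experiment, the sketch below generalizes min-max scaling to an arbitrary target range and converts the result to 8-bit values for display. The helper name `scale_to_range`, the sample values, and the choice to fill masked pixels with 0 are all assumptions for illustration. Because 0 is also a valid output value here, this version is meant for quick visualization rather than analysis.

```python
import numpy as np

def scale_to_range(band, new_min=0.0, new_max=1.0, eps=1e-10):
    """Min-max scale one masked band to [new_min, new_max]."""
    band = band.astype(np.float32)
    unit = (band - band.min()) / (band.max() - band.min() + eps)
    return unit * (new_max - new_min) + new_min

# Valid pixels are 10, 20, and 50; the -9999 pixel is masked
band = np.ma.masked_equal(np.array([[10, 20], [-9999, 50]]), -9999)
scaled = scale_to_range(band, 0, 255)

# Fill masked pixels before casting so no out-of-range value reaches uint8
byte_band = np.ma.filled(scaled, 0).round().astype(np.uint8)
print(byte_band)
```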

## Notes
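- Use `masked=True` so that no-data pixels are excluded from the min, max, mean, and standard deviation calculations.
- Min-max normalization maps values to a fixed range, which suits visualization but is sensitive to outliers. Z-score standardization is often preferred for machine learning models that expect centered features with unit variance.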
- See `docs/installation.md` for troubleshooting library installation.