## Training Material Outline: Accuracy Assessment of Remote Sensing Data

This training material will guide you through the process of conducting accuracy assessments for thematic maps or classified images derived from remote sensing data.

**I. Introduction**

* **What is Accuracy Assessment?**
    * Definition: Quantifying how well interpreted or classified remote sensing data agrees with independent reference (ground truth) data.
    * Importance: Ensuring the reliability and usability of remote sensing products.
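    * Qualitative vs. Quantitative Assessment: A qualitative assessment is a visual check of whether the map looks plausible; a quantitative assessment compares map labels with reference data at sample locations and summarizes the agreement numerically.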
* **Objectives of this Training:**
    * Understand the concept of accuracy assessment.
    * Learn how to create and interpret an error matrix.
    * Calculate and interpret key accuracy metrics (Overall Accuracy, Producer's Accuracy, User's Accuracy (also called Consumer's Accuracy), Kappa Coefficient).
    * Apply these concepts using Python and relevant libraries.

**II. Core Concepts**

* **Ground Truth Data:**
    * Definition:  Independently collected data representing the “true” conditions on the ground.
    * Sources: Field surveys, high-resolution imagery, existing reference maps.
    * Importance of representative and accurate ground truth data.
* **Error Matrix (Confusion Matrix):**
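    * Structure: Rows represent ground truth, columns represent classified data. This is the layout returned by scikit-learn's `confusion_matrix`; some remote sensing texts transpose it, so always check which axis holds the reference data.
    * Interpretation of cells: Diagonal cells count correctly classified samples; off-diagonal cells count confusion between classes. For a two-class map these reduce to True Positives, True Negatives, False Positives, and False Negatives.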
    * Example error matrix.
* **Accuracy Metrics:**
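    * **Overall Accuracy:** Proportion of reference samples that were classified correctly. Formula: sum of the diagonal cells / total number of samples.
    * **Producer's Accuracy (PA):** Accuracy from the data producer's perspective. Focus on errors of omission (missed classifications). Formula for a class: correctly classified samples / total reference samples of that class (the row total). Omission error = 1 - PA.
    * **User's Accuracy (UA), also called Consumer's Accuracy (CA):** Accuracy from the data user's perspective. Focus on errors of commission (incorrect classifications). Formula for a class: correctly classified samples / total samples classified as that class (the column total). Commission error = 1 - UA.
    * **Kappa Coefficient:** Measure of agreement corrected for chance agreement. Formula: (observed agreement - expected chance agreement) / (1 - expected chance agreement). A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.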
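
The metrics above can be sketched directly from an error matrix. The following is a minimal illustration, not a reference implementation: the function name `accuracy_metrics` and the 3-class matrix are hypothetical, and rows are assumed to hold ground truth while columns hold classified data. It also assumes every class has at least one reference sample and one classified sample, so no division by zero occurs.

```python
import numpy as np

def accuracy_metrics(matrix):
    """Compute accuracy metrics from a square error matrix.

    Assumes rows are ground truth and columns are classified data.
    """
    m = np.asarray(matrix, dtype=float)
    total = m.sum()
    diagonal = np.diag(m)            # correctly classified samples per class
    row_totals = m.sum(axis=1)       # reference samples per class
    col_totals = m.sum(axis=0)       # samples assigned to each class

    overall = diagonal.sum() / total
    producers = diagonal / row_totals    # 1 - omission error
    users = diagonal / col_totals        # 1 - commission error

    # Agreement expected by chance, from the marginal totals
    expected = (row_totals * col_totals).sum() / total**2
    kappa = (overall - expected) / (1 - expected)
    return overall, producers, users, kappa

# Hypothetical 3-class error matrix (illustrative numbers only)
example = [[45, 4, 1],
           [6, 38, 6],
           [2, 5, 43]]

overall, producers, users, kappa = accuracy_metrics(example)
print(f"Overall accuracy: {overall:.2f}")   # 0.84
print(f"Kappa: {kappa:.2f}")                # 0.76
print("Producer's accuracy per class:", np.round(producers, 2))
print("User's accuracy per class:", np.round(users, 2))
```

In this made-up example, 126 of 150 samples lie on the diagonal, and the chance agreement works out to one third, which is why Kappa is lower than the overall accuracy.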

**III. Practical Implementation using Python**

In [1]:
# Import necessary libraries
import geopandas as gpd
import rasterio
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix

In [2]:
# Load vector data (replace with your shapefile path)
vector_data = gpd.read_file('/home/jovyan/shared/Arissara/ALOS-2/sample-points/Indo_Random_sampling_points.shp')

In [3]:
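# Load raster data and sample it at each point (replace with your raster file path)
with rasterio.open('/home/jovyan/shared/Arissara/ALOS-2/Output/rm-noise_lee_calib_N01E102_sl_HH_water_extent_mask.tif') as src:
    # Full band 1 array (not needed for point sampling, kept for inspection)
    raster_data = src.read(1)
    # Reproject the points to the raster CRS so the coordinates line up
    # (assumes both the shapefile and the raster have a CRS defined)
    points = vector_data.to_crs(src.crs)
    coords = [(x, y) for x, y in zip(points.geometry.x, points.geometry.y)]
    # src.sample yields one array per point; index 0 is the band 1 value
    raster_values = [val[0] for val in src.sample(coords)]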

In [4]:
# Extract ground truth labels
ground_truth_labels = vector_data['class'].values

In [5]:
# Convert to NumPy arrays
raster_values = np.array(raster_values)
ground_truth_labels = np.array(ground_truth_labels)
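
Before computing metrics, it can help to confirm that both arrays contain only the expected class codes. The sketch below is one possible check, not part of the original workflow: the helper name `keep_valid_pairs`, the class codes `(0, 1)`, and the nodata code `255` are all assumptions made for illustration.

```python
import numpy as np

def keep_valid_pairs(reference, classified, valid_classes=(0, 1)):
    """Keep only the point pairs where both values are known class codes."""
    reference = np.asarray(reference)
    classified = np.asarray(classified)
    if reference.shape != classified.shape:
        raise ValueError("reference and classified arrays must have the same shape")
    mask = np.isin(reference, valid_classes) & np.isin(classified, valid_classes)
    dropped = int((~mask).sum())     # number of pairs removed
    return reference[mask], classified[mask], dropped

# Hypothetical values: 255 stands in for a nodata code in the sampled raster
ref = np.array([0, 1, 1, 0, 1])
cls = np.array([0, 1, 255, 0, 1])

ref_ok, cls_ok, dropped = keep_valid_pairs(ref, cls)
print(f"Kept {ref_ok.size} pairs, dropped {dropped}")
```

Reporting how many pairs were dropped makes it obvious when points fall on nodata pixels or carry unexpected labels, which would otherwise show up as extra rows or columns in the confusion matrix.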

In [6]:
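# Calculate accuracy metrics (ground truth first, classified values second)
overall_accuracy = accuracy_score(ground_truth_labels, raster_values)
kappa = cohen_kappa_score(ground_truth_labels, raster_values)
# labels=[0, 1] fixes the layout to 2x2 (0 = non-water, 1 = water),
# with rows as ground truth and columns as classified values
conf_matrix = confusion_matrix(ground_truth_labels, raster_values, labels=[0, 1])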

print(f"Overall Accuracy: {overall_accuracy}")
print(f"Kappa Coefficient: {kappa}")
print("Confusion Matrix:")
print(conf_matrix)

Overall Accuracy: 1.0
Kappa Coefficient: 1.0
Confusion Matrix:
[[20  0]
 [ 0 20]]


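With rows as ground truth and columns as classified values (0 = non-water, 1 = water), the matrix is laid out as [[TN, FP], [FN, TP]]. All 40 sample points agree with the ground truth, which is why both Overall Accuracy and Kappa equal 1.0. Cell by cell: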

- 20 (True Negative, TN): 20 points were correctly classified as non-water (0).
- 0 (False Positive, FP): No points were incorrectly classified as water (1) when they should have been non-water (0).
- 0 (False Negative, FN): No points were incorrectly classified as non-water (0) when they should have been water (1).
- 20 (True Positive, TP): 20 points were correctly classified as water (1).
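
Producer's and user's accuracy for each class can be read off the same four cells. The sketch below re-enters the matrix printed above as a literal so that it runs on its own; the variable names are illustrative, and the layout assumes rows are ground truth and columns are classified values.

```python
import numpy as np

# Confusion matrix from the output above: rows = ground truth, columns = classified
conf_matrix = np.array([[20, 0],
                        [0, 20]])

# For a 2x2 matrix with labels [0, 1], ravel() returns TN, FP, FN, TP in that order
tn, fp, fn, tp = conf_matrix.ravel()

# Water class (1)
producers_water = tp / (tp + fn)   # share of true water points mapped as water
users_water = tp / (tp + fp)       # share of mapped water points that are truly water

# Non-water class (0)
producers_non_water = tn / (tn + fp)
users_non_water = tn / (tn + fn)

print(f"Water:     PA = {producers_water:.2f}, UA = {users_water:.2f}")
print(f"Non-water: PA = {producers_non_water:.2f}, UA = {users_non_water:.2f}")
```

With no off-diagonal counts, all four values are 1.0. With only 20 points per class, that result says more about these particular samples than about the whole map.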