<a href="https://colab.research.google.com/github/alemlakes/accuracyAssessmentTools-hfff/blob/main/AccuracyAssessmentTools-hfff.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Installs and Imports

In [None]:
%cd /content
!git clone https://github.com/alemlakes/accuracyAssessmentTools-hfff.git
%cd accuracyAssessmentTools-hfff
!pip install .

In [None]:
# Refresh to the latest GitHub version in Colab
%cd /content
!rm -rf accuracyAssessmentTools-hfff
!git clone https://github.com/alemlakes/accuracyAssessmentTools-hfff.git
%cd accuracyAssessmentTools-hfff
!pip install -U .

In [None]:
import pandas as pd
from acc_assessment.olofsson import Olofsson
from acc_assessment.congalton import Congalton
from acc_assessment.stehman import Stehman
from acc_assessment.cardille import Cardille
from acc_assessment.gue import GUE
from acc_assessment.mcem import MCEM

# Verify that the code works as expected

In [None]:
!pytest

# Stehman Assessment Example

The Stehman assessment is based on:
Stehman, S.V., 2014. "Estimating area and map accuracy for stratified
random sampling when the strata are different from the map classes",
International Journal of Remote Sensing. Vol. 35 (No. 13).
https://doi.org/10.1080/01431161.2014.930207

## Load Data

Use the data from Table 2 in Stehman 2014 as the example assessment. Each row represents one pixel. The table must have at least three columns: one for the stratum each pixel was sampled from, one for the map class of each pixel, and one for the reference class of each pixel.

In [None]:
stehman_df = pd.read_csv("./tests/stehman2014_table2.csv", skiprows=1)

In addition to the table of reference data, the Stehman assessment also needs a dictionary containing the total size of each of the strata. Keys in the dictionary should match the labels used in the strata column of the table.

In [None]:
stehman_strata_populations = {1: 40000, 2: 30000, 3: 20000, 4: 10000}
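Because the dictionary keys must match the stratum labels, a quick consistency check before building the assessment can catch mismatches early (for example, integer keys against labels that were read as strings). The sketch below is illustrative only: `sampled_strata` is made-up data standing in for the strata column, and `missing_strata` is a hypothetical helper, not part of `acc_assessment`.

```python
def missing_strata(sampled_strata, strata_populations):
    """Return stratum labels seen in the sample that have no population entry."""
    missing = set(sampled_strata) - set(strata_populations)
    # sort by repr so that mixed label types (e.g. 1 and "1") do not raise
    return sorted(missing, key=repr)


# Hypothetical stratum labels, standing in for the strata column of the table
sampled_strata = [1, 1, 2, 3, 4, 4]
strata_populations = {1: 40000, 2: 30000, 3: 20000, 4: 10000}

print(missing_strata(sampled_strata, strata_populations))          # nothing missing
print(missing_strata(sampled_strata + ["1"], strata_populations))  # the string "1" is not the key 1
```

An empty list means every sampled stratum has a population entry.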

## Create the Assessment

In [None]:
stehman_assessment = Stehman(
    data=stehman_df,
    strata_col="Stratum",
    map_col="Map class",
    ref_col="Reference class",
    strata_population=stehman_strata_populations
)
stehman_assessment

The Stehman assessment object has the same `users_accuracy` and `producers_accuracy` methods as the Cardille assessment.
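To give a feel for why the strata sizes matter, here is a minimal sketch of a stratified estimate of overall accuracy: the agreement rate within each stratum is weighted by that stratum's population. The function name and the toy samples are hypothetical, and this is a simplified illustration of the general idea rather than the code used by the `Stehman` class.

```python
def stratified_overall_accuracy(samples, strata_populations):
    """Population-weighted mean of the per-stratum agreement rate.

    samples: iterable of (stratum, map_class, ref_class) tuples.
    strata_populations: dict mapping stratum label to its total size.
    Every stratum in strata_populations needs at least one sample.
    """
    agree = {}  # stratum -> sampled pixels where map == reference
    count = {}  # stratum -> sampled pixels
    for stratum, map_class, ref_class in samples:
        count[stratum] = count.get(stratum, 0) + 1
        agree[stratum] = agree.get(stratum, 0) + (map_class == ref_class)

    total_population = sum(strata_populations.values())
    weighted = sum(
        size * agree[stratum] / count[stratum]
        for stratum, size in strata_populations.items()
    )
    return weighted / total_population


# Toy data (made up): stratum 1 agrees on 3 of 4 pixels, stratum 2 on 1 of 2
toy_samples = [
    (1, "A", "A"), (1, "A", "A"), (1, "A", "B"), (1, "B", "B"),
    (2, "B", "B"), (2, "B", "A"),
]
toy_populations = {1: 40000, 2: 10000}

estimate = stratified_overall_accuracy(toy_samples, toy_populations)
naive = sum(m == r for _, m, r in toy_samples) / len(toy_samples)
print(estimate, naive)
```

The weighted estimate differs from the naive fraction of agreeing samples because stratum 1 represents four times as many pixels as stratum 2.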

# Olofsson Assessment Example

The Olofsson assessment is based on: Olofsson, P., et al., 2014. "Good practices for estimating area and assessing accuracy of land change", Remote Sensing of Environment. Vol. 148, pp. 42-57. https://doi.org/10.1016/j.rse.2014.02.015

## Load Data

Use the data from Table 8 in Olofsson et al. 2014 for the example.

The Olofsson assessment can be initialized in two ways: with an error matrix of pixel counts plus a dictionary of the mapped areas, or with a longform table of each pixel's map and reference values. Both produce the same results, so pick the one that matches the form your data is in.

In [None]:
olofsson_mapped_populations = {
    "Deforestation": 200000,
    "Forest gain": 150000,
    "Stable forest": 3200000,
    "Stable non-forest": 6450000
}

In [None]:
# from an error matrix
olofsson_data = [
    [66, 0, 1, 2],
    [0, 55, 0, 1],
    [5, 8, 153, 9],
    [4, 12, 11, 313],
]
classes = ["Deforestation", "Forest gain", "Stable forest", "Stable non-forest"]
olofsson_error_matrix = pd.DataFrame(olofsson_data, index=classes, columns=classes)

olofsson_error_matrix

In [None]:
olofsson_assessment1 = Olofsson(olofsson_error_matrix, olofsson_mapped_populations)
olofsson_assessment1
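As a rough illustration of how the mapped areas enter the calculation, the sketch below computes an area-weighted overall accuracy by hand from the same counts. It assumes that the columns of the matrix are map classes and the rows are reference classes; that orientation is an assumption made here, not something confirmed from the library, and the code is not the `Olofsson` class implementation.

```python
classes = ["Deforestation", "Forest gain", "Stable forest", "Stable non-forest"]
# counts[i][j]: samples with reference class i (row) and map class j (column).
# This orientation is an assumption of the sketch.
counts = [
    [66, 0, 1, 2],
    [0, 55, 0, 1],
    [5, 8, 153, 9],
    [4, 12, 11, 313],
]
mapped_area = {
    "Deforestation": 200000,
    "Forest gain": 150000,
    "Stable forest": 3200000,
    "Stable non-forest": 6450000,
}

total_area = sum(mapped_area.values())
n_classes = len(classes)
# number of samples drawn from each map class (column totals)
samples_per_map_class = [
    sum(counts[i][j] for i in range(n_classes)) for j in range(n_classes)
]

overall_accuracy = 0.0
for j, name in enumerate(classes):
    weight = mapped_area[name] / total_area  # share of the map covered by class j
    overall_accuracy += weight * counts[j][j] / samples_per_map_class[j]

print(samples_per_map_class)       # [75, 75, 165, 325]
print(round(overall_accuracy, 3))  # 0.947
```

Each map class contributes its sample agreement rate scaled by its share of the total mapped area, so large classes such as "Stable non-forest" dominate the result.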

In [None]:
# from a longform table
# first convert the error matrix to long form
from acc_assessment.utils import _expand_error_matrix
olofsson_longform = _expand_error_matrix(olofsson_error_matrix, "map", "ref")
olofsson_longform
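For readers unfamiliar with the longform layout, the following sketch shows the general idea of the conversion: each cell count in the matrix becomes that many repeated rows. `expand_counts` is a hypothetical stand-in written for this illustration, not the library's `_expand_error_matrix`, and it makes no claim about which axis the library treats as map or reference.

```python
def expand_counts(matrix, labels, row_name, col_name):
    """Repeat each (row label, column label) pair once per counted sample."""
    rows = []
    for i, row_label in enumerate(labels):
        for j, col_label in enumerate(labels):
            rows.extend(
                {row_name: row_label, col_name: col_label}
                for _ in range(matrix[i][j])
            )
    return rows


# Tiny made-up matrix with 3 + 1 + 0 + 2 = 6 samples
labels = ["Deforestation", "Forest gain"]
matrix = [
    [3, 1],
    [0, 2],
]
longform = expand_counts(matrix, labels, "row_class", "col_class")

print(len(longform))  # 6
print(longform[3])    # the single off-diagonal sample
```

The longform table carries the same information as the matrix, which is why both initialization routes give the same results.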

In [None]:
# to indicate that the input is a longform table, pass the names of the map
# and reference columns
olofsson_assessment2 = Olofsson(
    olofsson_longform,
    olofsson_mapped_populations,
    map_col="map",
    ref_col="ref",
)
olofsson_assessment2


# GUE Assessment Example

Use probabilistic map and reference tables to compute analytical estimates with GUE.

In [None]:
map_table = pd.read_csv("./tests/map_data_table.csv")
ref_table = pd.read_csv("./tests/ref_data_table.csv")
strata_population_dict = {
    "a": 5000,
    "f": 10000,
    "w": 1000,
}
gue_assessment = GUE(
    map_data=map_table,
    ref_data=ref_table,
    strata_col="strata",
    id_col="id",
    strata_population=strata_population_dict,
)
gue_assessment

# MCEM Assessment Example

Run Monte Carlo simulations using the same probabilistic inputs.

In [None]:
mcem_assessment = MCEM(
    map_data=map_table,
    ref_data=ref_table,
    strata_col="strata",
    id_col="id",
    strata_population=strata_population_dict,
    n_simulations=10000,
)
mcem_assessment

# Comparing GUE to MCEM

The mean of the MCEM simulation distribution should converge to the GUE point estimate as the number of simulations grows. The point estimates (accuracy and area) should therefore align, but the MCEM percentile-based confidence intervals can differ from intervals built from the GUE analytical standard errors. The percentile intervals may give a more realistic picture of uncertainty for rare classes such as deforestation.

In [None]:
print("=== ANALYTICAL RESULTS (GUE) ===")
print(gue_assessment)
print("\n=== MONTE CARLO RESULTS (MCEM) ===")
print(mcem_assessment)
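To see why a percentile interval and a standard-error interval can disagree for a rare class, consider this generic, self-contained illustration. It simulates a simple sample proportion and is not the MCEM algorithm; the sample size, the true proportion, and the number of simulations are all made-up values.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n_samples = 50        # hypothetical sample size
true_p = 0.04         # hypothetical proportion of a rare class
n_simulations = 5000

# Simulate the sample proportion many times
estimates = []
for _ in range(n_simulations):
    hits = sum(random.random() < true_p for _ in range(n_samples))
    estimates.append(hits / n_samples)

mc_mean = statistics.fmean(estimates)
ordered = sorted(estimates)
lower = ordered[int(0.025 * n_simulations)]      # 2.5th percentile
upper = ordered[int(0.975 * n_simulations) - 1]  # 97.5th percentile

# Symmetric interval from the analytical standard error of a proportion
analytic_se = (true_p * (1 - true_p) / n_samples) ** 0.5
symmetric_interval = (true_p - 1.96 * analytic_se, true_p + 1.96 * analytic_se)

print("simulation mean:     ", round(mc_mean, 4))
print("percentile interval: ", (lower, upper))
print("symmetric interval:  ", tuple(round(x, 4) for x in symmetric_interval))
```

The simulation mean lands close to the true proportion, while the symmetric interval extends below zero, which is impossible for a proportion. The percentile interval stays within the valid range because it is read directly from the simulated values.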

# Congalton Assessment Example

The Congalton assessment is based on: "A Review of Assessing the Accuracy of Classifications of Remotely Sensed Data", Congalton, R. G., 1991. Remote Sensing of Environment, Vol. 37. pp 35-46 https://doi.org/10.1016/0034-4257(91)90048-B

## Load Data

Use the data from Table 1 in Congalton 1991 to create the example.

In [None]:
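from acc_assessment.utils import _expand_error_matrix  # same helper as in the Olofsson example

congalton_data = [[65, 4, 22, 24], [6, 81, 5, 8], [0, 11, 85, 19], [4, 7, 3, 90]]
congalton_classes = ["D", "C", "BA", "SB"]
congalton_df = pd.DataFrame(congalton_data, index=congalton_classes, columns=congalton_classes)
# expand the error matrix into a longform table of map and reference values
congalton_table = pd.DataFrame(_expand_error_matrix(congalton_df, "map", "ref"))
congalton_table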

## Create the Assessment

In [None]:
congalton_assessment = Congalton(congalton_table, "map", "ref")
congalton_assessment
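The count-based accuracies can be checked by hand from the same matrix. The sketch below assumes that rows are the classified (map) classes and columns are the reference classes; that orientation is an assumption of this illustration, and the code is not the `Congalton` class implementation.

```python
congalton_classes = ["D", "C", "BA", "SB"]
# Rows: classified (map) classes, columns: reference classes (assumed orientation)
congalton_counts = [
    [65, 4, 22, 24],
    [6, 81, 5, 8],
    [0, 11, 85, 19],
    [4, 7, 3, 90],
]
n_classes = len(congalton_classes)

total = sum(sum(row) for row in congalton_counts)
correct = sum(congalton_counts[i][i] for i in range(n_classes))
overall_accuracy = correct / total

# User's accuracy divides the diagonal by the row (map) total,
# producer's accuracy divides it by the column (reference) total.
users = {}
producers = {}
for i, name in enumerate(congalton_classes):
    row_total = sum(congalton_counts[i])
    col_total = sum(congalton_counts[k][i] for k in range(n_classes))
    users[name] = congalton_counts[i][i] / row_total
    producers[name] = congalton_counts[i][i] / col_total

print(round(overall_accuracy, 3))  # 0.74
print(round(users["D"], 3))        # 0.565
print(round(producers["D"], 3))    # 0.867
```

Unlike the Olofsson and Stehman estimates, these values use raw counts only, with no weighting by area or strata size.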

# Cardille Assessment Example

The Cardille Assessment is based on ongoing work in the Cardille Computational Landscape Ecology Lab.

## Load data

Data should come in two CSV files: one containing the map data and one containing the reference data. Each file should have one column containing the point id (used to link rows in the map file to rows in the reference file), one column for the stratum that the point was sampled from, and one column for each possible class containing the map or reference probability that the point belongs to that class. Column names should match between the two CSV files.

In addition to the two CSV files, you also need to supply a dictionary mapping each stratum to its total size.

In [None]:
map_table = pd.read_csv("./tests/map_data_table.csv")
map_table

In [None]:
ref_table = pd.read_csv("./tests/ref_data_table.csv")
ref_table

In [None]:
strata_population_dict = {
    'a': 5000,
    'f': 10000,
    'w': 1000,
}
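Before building the assessment, it can help to sanity-check the two tables. The sketch below is a hypothetical pre-flight check on made-up rows: it verifies that both tables contain the same ids and that the class probabilities in each row sum to 1. The sum-to-1 requirement is an assumption of this sketch rather than a documented rule of the library, and the `Water` class column is invented for the example.

```python
def check_tables(map_rows, ref_rows, class_cols, id_col="id", tol=1e-6):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    map_ids = {row[id_col] for row in map_rows}
    ref_ids = {row[id_col] for row in ref_rows}
    if map_ids != ref_ids:
        unmatched = sorted(map_ids ^ ref_ids)
        problems.append(f"ids present in only one table: {unmatched}")
    for table_name, rows in (("map", map_rows), ("ref", ref_rows)):
        for row in rows:
            total = sum(row[col] for col in class_cols)
            if abs(total - 1.0) > tol:
                problems.append(
                    f"{table_name} id {row[id_col]}: probabilities sum to {total:.3f}"
                )
    return problems


# Made-up rows in the layout described above (id, strata, one column per class)
map_rows = [
    {"id": 1, "strata": "f", "Forest": 0.9, "Water": 0.1},
    {"id": 2, "strata": "w", "Forest": 0.2, "Water": 0.8},
]
ref_rows = [
    {"id": 1, "strata": "f", "Forest": 1.0, "Water": 0.0},
    {"id": 2, "strata": "w", "Forest": 0.0, "Water": 1.0},
]
bad_ref_rows = ref_rows + [{"id": 3, "strata": "w", "Forest": 0.5, "Water": 0.4}]

print(check_tables(map_rows, ref_rows, ["Forest", "Water"]))
print(check_tables(map_rows, bad_ref_rows, ["Forest", "Water"]))
```

With real data, the rows could be produced from the loaded tables with `map_table.to_dict("records")`.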

## Create the assessment

The Cardille accuracy assessment is a class. An explanation of its constructor can be viewed by calling `help` on the `__init__` function.

In [None]:
help(Cardille.__init__)

In [None]:
assessment = Cardille(
    map_data=map_table,
    ref_data=ref_table,
    strata_col="strata",
    id_col="id",
    strata_population=strata_population_dict,
)

## View assessment results

An overview of the assessment can be seen by printing the assessment object.

In [None]:
assessment

Individual accuracies can be accessed by calling the appropriate method.

These methods all return a tuple of two floats which are the value and the standard error respectively. If the standard error is not calculable it will be returned as `None`.

In [None]:
forest_users_accuracy = assessment.users_accuracy("Forest")
forest_producers_accuracy = assessment.producers_accuracy("Forest")

print(forest_users_accuracy)
print(forest_producers_accuracy)
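Since the standard error may be `None`, code that consumes these tuples should handle that case explicitly. A minimal sketch, using a hypothetical `format_estimate` helper and made-up values in place of real assessment output:

```python
def format_estimate(estimate):
    """Format a (value, standard_error) tuple, allowing for a missing standard error."""
    value, standard_error = estimate
    if standard_error is None:
        return f"{value:.3f} (standard error not available)"
    return f"{value:.3f} +/- {standard_error:.3f}"


# Made-up tuples in the (value, standard error) shape described above
print(format_estimate((0.912, 0.034)))
print(format_estimate((1.0, None)))
```

Unpacking the tuple first keeps the `None` check in one place instead of scattering it through later calculations.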

To follow the practices outlined in Olofsson 2014 and Stehman 2014, the error matrix is returned as proportion of area by default. You can get an error matrix of point counts by setting `proportions=False`. Note that for the Cardille assessment these counts are scaled by their "weights".

In [None]:
assessment.error_matrix()

In [None]:
assessment.error_matrix(proportions=False)

An overview of all the methods can be seen by calling `help`.

In [None]:
help(assessment)