<img src="gval_light_mode.png" style="float:left">

In [None]:
import rioxarray as rxr
import gval
import numpy as np

## Load Datasets

Loading with `mask_and_scale=True` is preferred. If your original data has no nodata value, or none has been assigned, assign one using: `rio.set_nodata(<your_nodata_value>)`

In [None]:
candidate = rxr.open_rasterio('candidate_map_two_class_categorical.tif', mask_and_scale=True)
benchmark = rxr.open_rasterio('benchmark_map_two_class_categorical.tif', mask_and_scale=True)

## Run GVAL Categorical Compare

An example of running the entire process with one command using minimal arguments is demonstrated below.

In [None]:
agreement_map, crosstab_table, metric_table = candidate.gval.categorical_compare(benchmark,
                                                                                 positive_categories=[2],
                                                                                 negative_categories=None)

## Output

#### Agreement Map

The agreement map compares the encodings of the benchmark map and candidate map using a "comparison function" and outputs a unique encoding for each pairing.  In this case the "szudzik" comparison function was used by default because nothing was passed for the `comparison_function` argument.  The Szudzik function is defined below:

$
c = \text{candidate value} \\
b = \text{benchmark value} \\
f(c, b) =
\begin{cases}
    c^{2} + c + b, & \text{if } c \geq b \\
    b^{2} + c, & \text{otherwise}
\end{cases}$


The resulting map allows a user to visualize these encodings as follows:

In [None]:
agreement_map.plot()
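
As a quick check on the definition above, the Szudzik pairing can be evaluated by hand in plain Python. This is only a sketch for non-negative integer encodings and is not GVAL's implementation; the `szudzik` helper and the `encodings` dictionary are illustrative names, and the pairs are the candidate values `[1, 2]` crossed with the benchmark values `[0, 2]`.

```python
def szudzik(c: int, b: int) -> int:
    """Szudzik pairing of a candidate value c and a benchmark value b."""
    if c >= b:
        return c * c + c + b
    return b * b + c


# Every (candidate, benchmark) combination for candidate {1, 2} and benchmark {0, 2}
pairs = [(1, 0), (1, 2), (2, 0), (2, 2)]
encodings = {pair: szudzik(*pair) for pair in pairs}

print(encodings)  # {(1, 0): 2, (1, 2): 5, (2, 0): 6, (2, 2): 8}
```

Each combination maps to a distinct integer, which is what lets the agreement map be decoded back into candidate and benchmark classes.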

#### Cross-tabulation Table

A cross-tabulation table displays the frequency of each class in the presence of another within the spatial unit of interest (in this case, a pixel in each raster dataset).  This can then be used to compute categorical statistics.

In [None]:
crosstab_table
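
To make the idea concrete, the following sketch counts how often each candidate/benchmark pairing occurs across a handful of made-up pixels. It uses only the standard library; the pixel lists are invented for illustration, and the resulting `Counter` is not the layout of the table GVAL returns.

```python
from collections import Counter

# Hypothetical co-located pixel values from two aligned rasters
candidate_pixels = [1, 2, 2, 1, 2, 2]
benchmark_pixels = [0, 2, 0, 0, 2, 2]

# Count each (candidate, benchmark) pairing
crosstab = Counter(zip(candidate_pixels, benchmark_pixels))

for (c, b), count in sorted(crosstab.items()):
    print(f"candidate={c} benchmark={b} count={count}")
```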

#### Metric Table

A metric table contains information about the unit of analysis (a single band in this case) and selected categorical statistics.  This is done by specifying the positive and negative categories of each dataset and then choosing the statistics of interest.  Since we did not provide the `metrics` argument, GVAL computed all of the available categorical statistics.  (<b>Note: if there is no negative class encoding, all statistics that require negatives will be skipped.</b>)

In [None]:
metric_table

## Access to Individual GVAL Operations

Aside from running the entire process, it is possible to run each of the following steps individually: spatial alignment, computing an agreement map, computing a cross-tabulation table, and computing a metric table. This allows for flexibility in workflows so that a user may use as much or as little functionality as needed.

### Spatial Alignment

Spatial alignment aligns to the benchmark map by default; however, one can also align to the candidate map:

In [None]:
candidate, benchmark = candidate.gval.spatial_alignment(benchmark_map=benchmark,
                                                        target_map="candidate")

Or to a different map altogether:

In [None]:
target_map = rxr.open_rasterio('target_map_two_class_categorical.tif')
candidate, benchmark = candidate.gval.spatial_alignment(benchmark_map=benchmark,
                                                        target_map=target_map)

The default is to resample using the "nearest" method.  Although not applicable for this case of categorical comparisons, one can change the `resampling` argument to use alternative resampling methods such as bilinear or cubic resampling.  These methods would be relevant in the case of continuous datasets.
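
The reason nearest-neighbor resampling suits categorical data can be seen in a small one-dimensional sketch. The two helpers below are hypothetical stand-ins written in plain Python, not the routines GVAL or rioxarray use; they only show that interpolation can produce values that are not valid class encodings.

```python
def resample_nearest(values, positions):
    """Take the value of the closest source cell for each target position."""
    return [values[round(p)] for p in positions]


def resample_linear(values, positions):
    """Linearly interpolate between the two neighboring source cells."""
    out = []
    for p in positions:
        lo = min(int(p), len(values) - 2)
        frac = p - lo
        out.append(values[lo] * (1 - frac) + values[lo + 1] * frac)
    return out


classes = [0, 2, 2, 0]           # categorical encodings along one row
positions = [0.25, 1.25, 2.75]   # target cell centers in source coordinates

nearest = resample_nearest(classes, positions)
linear = resample_linear(classes, positions)

print(nearest)  # only original class values appear
print(linear)   # contains 0.5, which is not a class
```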

### Agreement Map

The "szudzik" comparison function is run by default if the `comparison_function` argument is not provided, but one may use the "cantor" pairing function, the "pairing_dict" function, or a custom callable.

In [None]:
agreement_map = candidate.gval.compute_agreement_map(benchmark_map=benchmark, 
                                                     comparison_function='cantor')

agreement_map.plot()
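
For comparison with Szudzik, the textbook Cantor pairing function can be sketched in plain Python as follows. The argument order (candidate first, benchmark second) is an assumption made for illustration, and GVAL's own implementation may treat ordering, negative values, or nodata differently.

```python
def cantor(c: int, b: int) -> int:
    """Standard Cantor pairing of two non-negative integers."""
    return (c + b) * (c + b + 1) // 2 + b


# Candidate values {1, 2} crossed with benchmark values {0, 2}
pairs = [(1, 0), (1, 2), (2, 0), (2, 2)]
cantor_encodings = {pair: cantor(*pair) for pair in pairs}

print(cantor_encodings)  # {(1, 0): 1, (1, 2): 8, (2, 0): 3, (2, 2): 12}
```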

#### Pairing Dictionary

A pairing dictionary can be provided by the user to allow for more control when specifying the agreement value outputs. 

A pairing dictionary has keys that are tuples corresponding to every unique combination of values in the candidate and benchmark, respectively. The values represent the agreement values for each combination. An example pairing dictionary for the candidate values `[1, 2]` and benchmark values `[0, 2]` is provided below. The user is currently responsible for including the NoDataValue in the pairing dictionary, which in the masked case is `np.nan`.

NOTE: Pairing dictionary functionality is currently slow and needs some work to speed up.

In [None]:
pairing_dict = {
    (1, 0) : 0,
    (1, 2) : 1, 
    (2, 0) : 2,
    (2, 2) : 3,
    (np.nan, 0) : np.nan,
    (np.nan, 2) : np.nan,
    (1, np.nan) : np.nan,
    (2, np.nan) : np.nan
}

agreement_map = candidate.gval.compute_agreement_map(benchmark_map=benchmark,
                                                     pairing_dict=pairing_dict)

agreement_map.plot()

Instead of building a pairing dictionary, a user can pass the unique candidate and benchmark values to use and a pairing dictionary will be built for the user. The user should also pass the NoDataValue used within the candidate and benchmark maps.

In [None]:
agreement_map = candidate.gval.compute_agreement_map(benchmark_map=benchmark, 
                                                     comparison_function='pairing_dict',
                                                     allow_candidate_values=[np.nan, 1, 2],
                                                     allow_benchmark_values=[np.nan, 0, 2])

agreement_map.plot()
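
One plausible way to picture the automatic construction described above is to enumerate every candidate/benchmark combination and number them in order. The helper `build_pairing_dict` below is hypothetical and is not GVAL's internal code; nodata handling is left out for brevity.

```python
from itertools import product


def build_pairing_dict(candidate_values, benchmark_values):
    """Assign a sequential agreement value to each (candidate, benchmark) pair."""
    pairs = product(candidate_values, benchmark_values)
    return {pair: code for code, pair in enumerate(pairs)}


auto_pairing_dict = build_pairing_dict([1, 2], [0, 2])
print(auto_pairing_dict)  # {(1, 0): 0, (1, 2): 1, (2, 0): 2, (2, 2): 3}
```

With these inputs the result matches the non-nodata entries of the hand-written dictionary shown earlier.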

#### Registration of Custom Functions

In this case we register the arbitrary pairing function `multiply` under the name "multi" and vectorize it.  `multiply` can also be passed directly as a callable to the `comparison_function` argument.

In [None]:
from gval import Comparison
from numbers import Number

@Comparison.register_function(name='multi', vectorize_func=True)
def multiply(c: Number, b: Number) -> Number:
    return c * b

agreement_map = candidate.gval.compute_agreement_map(benchmark_map=benchmark, 
                                                     comparison_function="multi")

agreement_map.plot()

A user can also pick which candidate values or benchmark values to use by providing lists to the `allow_candidate_values` and `allow_benchmark_values` arguments.  Finally, a user can choose to write nodata to unmasked datasets with the `nodata` value, or to masked/scaled datasets with `encode_nodata`. 

### Cross-tabulation Table

When computing a crosstab table, a user may create an allow list for candidate/benchmark values just as in `compute_agreement_map`.  They may also exclude a nodata value with `exclude_value` in the case that `mask_and_scale` is not applied when loading the original data files.

In [None]:
crosstab_table_allow = candidate.gval.compute_crosstab(benchmark,
                                                       allow_benchmark_values=[0, 2],
                                                       allow_candidate_values=[2])

In [None]:
crosstab_table_allow

### Metric Table

Although all categorical metrics are computed by default if no argument is provided, `metrics` can also take a list of the desired metrics and will only return metrics in this list.

In [None]:
metric_table_select = crosstab_table.gval.compute_metrics(negative_categories=[0, 1],
                                                          positive_categories=[2],
                                                          metrics=['true_positive_rate', 'prevalence'])

In [None]:
metric_table_select
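
For reference, the two metrics requested above have standard confusion-matrix definitions, sketched here in plain Python. The counts are invented for illustration, and GVAL's own implementations may differ in details such as vectorization or how missing negatives are handled.

```python
def true_positive_rate(tp: float, fn: float) -> float:
    """Share of benchmark positives that the candidate also marks positive."""
    return tp / (tp + fn)


def prevalence(tp: float, fp: float, tn: float, fn: float) -> float:
    """Share of all pixels that are positive in the benchmark."""
    return (tp + fn) / (tp + fp + tn + fn)


# Hypothetical confusion-matrix counts
tp, fp, tn, fn = 30, 10, 50, 10

tpr_value = true_positive_rate(tp, fn)
prevalence_value = prevalence(tp, fp, tn, fn)

print(tpr_value, prevalence_value)
```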

Just as with pairing functions, you can register metric functions either individually or as a class of functions.  Below, a single metric function is registered:

In [None]:
from gval import CatStats

@CatStats.register_function(name="error_balance", vectorize_func=True)
def error_balance(fp: Number, fn: Number) -> float:
    return fp / fn

The following registers a class of metric functions. In this case, the name associated with each function will correspond to the method's name.

In [None]:
@CatStats.register_function_class(vectorize_func=True)
class MetricFunctions:
    
    @staticmethod
    def arbitrary1(tp: Number, tn: Number) -> float:
        return tp + tn
    
    @staticmethod
    def arbitrary2(tp: Number, tn: Number) -> float:
        return tp - tn

All of these functions are now callable as metrics:

In [None]:
metric_table_register = crosstab_table.gval.compute_metrics(negative_categories=None,
                                                            positive_categories=[2],
                                                            metrics=['error_balance', 'arbitrary1', 'arbitrary2'])

In [None]:
metric_table_register

## Save Output

Finally, a user can take the results and save them to a directory of their choice.  The following is an example of saving the agreement map and then the metric table:

In [None]:
# output file names for the agreement map and the metric table
agreement_file = 'agreement_map.tif'
metric_file = 'metric_file.csv'

agreement_map.rio.to_raster(agreement_file)
metric_table.to_csv(metric_file)