In [5]:
import scores.categorical
import xarray as xr

Forecast systems (including NWP and point forecasts) typically generate numerical rather than categorical data, so the first step in computing a contingency score is often to generate event/non-event tables for the forecast and observed conditions. Users who already have their own method for defining events can pass pre-computed event tables directly to `scores`.

Other users will want a streamlined way of defining events and having `scores` generate the event tables and contingency tables together. This notebook demonstrates several approaches, starting from sample real-valued data and deriving the contingency scores.

In [6]:
# Provides a basic forecast data structure in three dimensions
simple_forecast = xr.DataArray(
    [
        [
            [0.9, 0.0, 5.0],
            [0.7, 1.4, 2.8],
            [0.4, 0.5, 2.3],
        ],
        [
            [1.9, 1.0, 1.5],
            [1.7, 2.4, 1.1],
            [1.4, 1.5, 3.3],
        ],
    ],
    coords=[[10, 20], [0, 1, 2], [5, 6, 7]],
    dims=["height", "lat", "lon"],
)

In [7]:
# Within 0.2 of the forecast at most points; three points differ by more (0.3, 2.8 and 6.6)
# Can be used to find some exact matches, and some close matches
simple_obs = xr.DataArray(
    [
        [
            [0.9, 0.0, 5.0],
            [0.7, 1.3, 2.7],
            [0.3, 0.4, 2.2],
        ],
        [
            [1.7, 1.2, 1.7],
            [1.7, 2.2, 3.9],
            [1.6, 1.2, 9.9],
        ],
    ],
    coords=[[10, 20], [0, 1, 2], [5, 6, 7]],
    dims=["height", "lat", "lon"],
)

In [11]:
# An event here is defined as a value (e.g. temperature) above 1.3
# The EventThresholdOperator can take a variety of operators from the
# Python "operator" module, or a user-defined function
# The default is operator.gt, which is the same as ">" but in functional form
event_operator = scores.categorical.EventThresholdOperator(default_event_threshold=1.3)
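
To make the role of the comparison operator concrete, here is a minimal sketch in plain NumPy. It is not the `scores` implementation, and `to_events` is a hypothetical helper introduced only for illustration. It shows how the choice between `operator.gt` and `operator.ge` changes the outcome for a value that sits exactly on the threshold.

```python
import operator

import numpy as np


def to_events(data, threshold, op=operator.gt):
    """Hypothetical helper: return a boolean event table for `data`."""
    return op(np.asarray(data), threshold)


# One row of the observations used in this notebook
obs_row = [0.7, 1.3, 2.7]

# Strictly greater than 1.3: the value 1.3 itself is a non-event
events_gt = to_events(obs_row, 1.3)

# Greater than or equal to 1.3: the value 1.3 now counts as an event
events_ge = to_events(obs_row, 1.3, op=operator.ge)

print(events_gt)
print(events_ge)
```

Only the middle value changes category between the two operators, which is why the operator should be chosen to match the event definition.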

In [12]:
# This is the simplified functional API: the forecast, the observations and
# the event operator are all passed in, and a single score is returned
scores.categorical.accuracy(simple_forecast, simple_obs, event_operator=event_operator)
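
As a cross-check on the value returned above, accuracy can be worked out by hand. The sketch below uses plain NumPy arrays holding the same values as `simple_forecast` and `simple_obs`, and assumes the usual definition of accuracy: the fraction of points where the forecast and observed events agree. It does not call `scores`.

```python
import numpy as np

forecast = np.array([
    [[0.9, 0.0, 5.0], [0.7, 1.4, 2.8], [0.4, 0.5, 2.3]],
    [[1.9, 1.0, 1.5], [1.7, 2.4, 1.1], [1.4, 1.5, 3.3]],
])
obs = np.array([
    [[0.9, 0.0, 5.0], [0.7, 1.3, 2.7], [0.3, 0.4, 2.2]],
    [[1.7, 1.2, 1.7], [1.7, 2.2, 3.9], [1.6, 1.2, 9.9]],
])

threshold = 1.3
forecast_events = forecast > threshold
observed_events = obs > threshold

# A point is correct when both are events or both are non-events
correct = forecast_events == observed_events
accuracy = correct.sum() / correct.size

print(correct.sum(), "of", correct.size, "points agree")
print(accuracy)
```

For this data, 15 of the 18 points agree, giving an accuracy of about 0.83.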

In [27]:
# The event operator can also be used to create a contingency table object
# It is more efficient to use this approach if generating multiple scores
table = event_operator.make_table(simple_forecast, simple_obs, event_threshold=1.3)

In [19]:
table.accuracy()

In [24]:
table.false_alarm_rate()
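
The names "false alarm rate" and "false alarm ratio" are easy to confuse. The sketch below assumes the common definitions (rate = false positives over observed non-events, ratio = false positives over forecast events) and uses the totals for this notebook's sample data at a threshold of 1.3. Check the method's docstring to confirm which definition `false_alarm_rate` follows.

```python
# Totals over all 18 points of the sample data at threshold 1.3
tp, fp, tn, fn = 9, 2, 6, 1

# False alarm rate (also called probability of false detection):
# share of observed non-events that were forecast as events
false_alarm_rate = fp / (fp + tn)

# False alarm ratio: share of forecast events that were not observed
false_alarm_ratio = fp / (fp + tp)

print(false_alarm_rate)
print(false_alarm_ratio)
```

Under these assumed definitions the two values differ (0.25 versus roughly 0.18), so it matters which one is reported.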

In [25]:
# Specific dimensions can be preserved or reduced when calling a score on the table
table.accuracy(preserve_dims=['height'])
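
Conceptually, preserving `height` means averaging over the remaining dimensions (`lat` and `lon`) separately for each height. The following is a plain NumPy sketch of that idea, not the `scores` implementation; axis 0 stands in for `height`, axis 1 for `lat` and axis 2 for `lon`.

```python
import numpy as np

forecast = np.array([
    [[0.9, 0.0, 5.0], [0.7, 1.4, 2.8], [0.4, 0.5, 2.3]],
    [[1.9, 1.0, 1.5], [1.7, 2.4, 1.1], [1.4, 1.5, 3.3]],
])
obs = np.array([
    [[0.9, 0.0, 5.0], [0.7, 1.3, 2.7], [0.3, 0.4, 2.2]],
    [[1.7, 1.2, 1.7], [1.7, 2.2, 3.9], [1.6, 1.2, 9.9]],
])

# True where forecast and observed events agree at threshold 1.3
correct = (forecast > 1.3) == (obs > 1.3)

# Keep axis 0 (height); average over lat (axis 1) and lon (axis 2)
accuracy_by_height = correct.mean(axis=(1, 2))

print(accuracy_by_height)
```

This yields one accuracy value per height: 8/9 at height 10 and 7/9 at height 20.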

In [26]:
help(table)

Help on BinaryContingencyTable in module scores.categorical.contingency_impl object:

class BinaryContingencyTable(builtins.object)
 |  BinaryContingencyTable(forecast_events, observed_events)
 |
 |  At each location, the value will either be:
 |   - A true positive    (0)
 |   - A false positive   (1)
 |   - A true negative    (2)
 |   - A false negative   (3)
 |
 |  It will be common to want to operate on masks of these values,
 |  such as:
 |   - Plotting these attributes on a map
 |   - Calculating the total number of these attributes
 |   - Calculating various ratios of these attributes, potentially
 |     masked by geographical area (e.g. accuracy in a region)
 |
 |  As such, the per-pixel information is useful as well as the overall
 |  ratios involved.
 |
 |  Methods defined here:
 |
 |  __init__(self, forecast_events, observed_events)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |
 |  accuracy(self, *, preserve_dims=None, reduce_dims=None)
 |      Th

In [29]:
# If needed, the underlying event counts can be accessed directly
table.generate_counts(reduce_dims=["lat", "height"])

{'tp_count': <xarray.DataArray (lon: 3)> Size: 24B
 array([3, 1, 5])
 Coordinates:
   * lon      (lon) int64 24B 5 6 7,
 'tn_count': <xarray.DataArray (lon: 3)> Size: 24B
 array([3, 3, 0])
 Coordinates:
   * lon      (lon) int64 24B 5 6 7,
 'fp_count': <xarray.DataArray (lon: 3)> Size: 24B
 array([0, 2, 0])
 Coordinates:
   * lon      (lon) int64 24B 5 6 7,
 'fn_count': <xarray.DataArray (lon: 3)> Size: 24B
 array([0, 0, 1])
 Coordinates:
   * lon      (lon) int64 24B 5 6 7,
 'total_count': <xarray.DataArray (lon: 3)> Size: 24B
 array([6, 6, 6])
 Coordinates:
   * lon      (lon) int64 24B 5 6 7}
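
The counts above are enough to derive further scores by hand. As a sketch, the block below copies the printed counts into plain NumPy arrays (one entry per `lon` value), checks that the four categories account for every point, and computes the hit rate, assuming the standard definition tp / (tp + fn). The variable names are illustrative and not part of `scores`.

```python
import numpy as np

# Counts per lon (5, 6, 7), copied from the output above
counts = {
    "tp_count": np.array([3, 1, 5]),
    "tn_count": np.array([3, 3, 0]),
    "fp_count": np.array([0, 2, 0]),
    "fn_count": np.array([0, 0, 1]),
    "total_count": np.array([6, 6, 6]),
}

# Sanity check: the four categories should sum to the total at each lon
category_sum = (
    counts["tp_count"] + counts["tn_count"]
    + counts["fp_count"] + counts["fn_count"]
)
totals_match = bool((category_sum == counts["total_count"]).all())

# Hit rate (probability of detection): share of observed events
# that were also forecast as events
hit_rate = counts["tp_count"] / (counts["tp_count"] + counts["fn_count"])

print(totals_match)
print(hit_rate)
```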