r.edm.eval: port shell/R script to Python (#1001)
The addon is rewritten in Python. This includes some changes in parameter names and functionality to anticipate some additional functionality. 

Added option to compute stats for multiple prediction layers. Added example to illustrate this option.
ecodiv committed Jan 12, 2024
1 parent e978f60 commit 73bd437
Showing 5 changed files with 793 additions and 502 deletions.
250 changes: 231 additions & 19 deletions src/raster/r.edm.eval/r.edm.eval.html
@@ -1,36 +1,248 @@
<h2>DESCRIPTION</h2>

<p>
<em>r.edm.eval</em> is an environmental distribution model evaluation
module. It returns a number of evaluation statistics, including the
area under the curve (AUC), the maximum true skill statistic (TSS),
and the maximum kappa. For the TSS and kappa, the corresponding
probability threshold values are also given. Optionally, it draws the
receiver operating characteristic (ROC) curve.

<p>
The addon takes as input a raster layer with observations
(<em>reference</em>) and one or more <em>prediction</em> layers. The
<em>reference</em> layer should be encoded with 1 (presence/cases) and
0 (absences/controls). The <em>prediction</em> layers give the
predicted values (e.g., the outcome of a model). These are typically
probability or suitability scores between 0 and 1, but other scales
work too. With two or more <em>prediction</em> layers, the statistics
are calculated for each prediction layer separately. This allows one
to compare the predictions of different models.

<p>
By default, the prediction layer is divided into 200 bins (the user
can change this number). For each bin, the module computes the
cumulative true and false positives (TP, FP), true and false negatives
(TN, FN), the true and false positive rates (TPR, FPR), the true
negative rate (TNR), Cohen's kappa and the true skill statistic (TSS).

<div class="code"><pre>
TPR = TP / (TP + FN)
</pre>
</div>
<div class="code"><pre>
FPR = FP / (FP + TN)
</pre>
</div>
<div class="code"><pre>
TNR = TN / (TN + FP)
</pre>
</div>
<div class="code"><pre>
TSS = TPR - FPR
</pre>
</div>
<div class="code"><pre>
Kappa = 2 * (TP * TN - FN * FP) / ((TP + FP) * (FP + TN) + (TP + FN) * (FN + TN))
</pre>
</div>
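
<p>
As a minimal illustration (a sketch, not the module's actual
implementation), these statistics can be computed from the
confusion-matrix counts at a single threshold as follows:

<div class="code"><pre>
# Illustrative sketch only (not the module's code): confusion-matrix
# statistics for a single probability threshold.
import numpy as np

def threshold_stats(obs, pred, threshold):
    """obs: 0/1 observations; pred: predicted scores; both 1-D arrays.
    Assumes both presences and absences occur in obs."""
    obs = np.asarray(obs)
    hit = np.asarray(pred) &gt;= threshold
    tp = np.sum(hit & (obs == 1))
    fp = np.sum(hit & (obs == 0))
    fn = np.sum(~hit & (obs == 1))
    tn = np.sum(~hit & (obs == 0))
    tpr = tp / (tp + fn)          # sensitivity
    fpr = fp / (fp + tn)
    tnr = tn / (tn + fp)          # specificity
    tss = tpr - fpr
    kappa = 2 * (tp * tn - fn * fp) / (
        (tp + fp) * (fp + tn) + (tp + fn) * (fn + tn)
    )
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn,
            "TPR": tpr, "FPR": fpr, "TNR": tnr,
            "TSS": tss, "kappa": kappa}
</pre>
</div>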

<p>
These values can be saved to a CSV file, together with the
corresponding upper and lower boundaries of the bins. The table can be
used to calculate additional statistics.
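
<p>
For example, the AUC can be recomputed from such a file with Pandas. A
minimal sketch, assuming the file contains columns named FPR and TPR
(check the header of your own file):

<div class="code"><pre>
import numpy as np
import pandas as pd

# file and column names are illustrative; adjust them to your output
stats = pd.read_csv("r_edm_eval_stats.csv").sort_values("FPR")
auc = np.trapz(stats["TPR"], stats["FPR"])  # trapezoidal rule over the ROC curve
print(f"AUC = {auc:.4f}")
</pre>
</div>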

<p>
With the <em>-b</em> flag, the module uses the fraction of points that
are predicted to be present, instead of the false positive rate, to
create the ROC curve and calculate the AUC. Use this when the
predictions were created with a model that works with background
points instead of (pseudo-)absence points (like Maxent).

<p>
A problem with the AUC is that it varies with the spatial extent used
to select background points (Lobo et al. 2008, Jiménez-Valverde 2012).
Generally, the larger that extent, the higher the AUC value. AUC
values are therefore often biased and cannot be compared directly. One
way to address this is to use pairwise distance sampling of presence
and absence testing points (Hijmans 2012). An admittedly simpler way
to evaluate the effect of the extent on the AUC is to only use
(pseudo-)absence or background points within a certain distance of the
presence points (raster cells). To do so, the user can use the
<em>buffer</em> parameter to limit the absence/background points that
are used to calculate the various statistics to those within a certain
distance from the presence locations/areas.
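
<p>
Conceptually, this restricts the evaluation to a zone around the
presence cells. The <em>buffer</em> parameter handles this internally;
the following manual sketch only illustrates the idea (the map names
and the 5 km distance are illustrative, assuming a projected location
with units in meters):

<div class="code"><pre>
# distance of every cell to the nearest presence cell
r.mapcalc "presence_only = if(reference == 1, 1, null())"
r.grow.distance input=presence_only distance=dist_to_presence
# keep only reference cells within 5 km of a presence cell
r.mapcalc "reference_5km = if(dist_to_presence &lt;= 5000, reference, null())"
</pre>
</div>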

<p>
The user can set the required ratio of presence to total points used
in the evaluation (<em>preval</em>; a value between 0 and 1). This
allows the user to test how sensitive the evaluation statistics are to
the prevalence of the presence points.
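
<p>
For example, a hypothetical invocation that evaluates a prediction
with an equal number of presence and absence points (the map names are
illustrative):

<div class="code"><pre>
r.edm.eval reference=my_reference prediction=my_prediction preval=0.5
</pre>
</div>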


<h2>NOTES</h2>

<p>
The module expects a <em>reference</em> raster layer encoded with 1
(presence/cases) and 0 (absences/controls). Optionally, the user can
define other values using the <em>absence</em> and <em>presence</em>
parameters. If the layer contains more than two unique values, or if
the layer is not of type CELL (integer), the module will terminate
with a warning.
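
<p>
One way to check whether a reference layer meets these requirements
(the map name is illustrative):

<div class="code"><pre>
# the data type should be CELL
r.info -g my_reference | grep datatype
# list the unique values and their cell counts
r.stats -c input=my_reference
</pre>
</div>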

<p>
The selection of the best evaluation statistic depends on your data
and what you want to test. This module provides only a few statistics,
but the user can calculate additional ones from the table in the CSV
file. See the References for a few useful publications in that
respect.

<p>
This module depends on the Python
<a href="https://pandas.pydata.org/">Pandas</a> library.
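
<p>
If Pandas is not yet available in the Python environment used by GRASS
GIS, it can usually be installed with pip (the exact procedure depends
on how GRASS GIS was installed):

<div class="code"><pre>
python -m pip install pandas
</pre>
</div>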

<h2>EXAMPLES</h2>

<p>
Run this example in the "landsat" mapset of the North Carolina sample
data set location. First, create a binary map with herbaceous land
cover (1) and other land cover (0), and randomly sample 1000 training
pixels from it. Next, create a prediction layer using
<em>r.learn.train</em> and <em>r.learn.predict</em> from the <a
href="https://grass.osgeo.org/grass-stable/manuals/addons/r.learn.ml2.html">r.learn.ml2</a>
addon. Note that <em>r.learn.train</em> can compute its own accuracy
measures as well.

<div class="code"><pre>
g.region raster=landclass96
r.mapcalc "herbaceous = if(landclass96 == 3, 1, 0)"
r.random input=herbaceous npoints=1000 raster=training_pixels seed=10
</pre>
</div>

<p>
Then we can use these training pixels to train a random forest
regression model on the more recently obtained Landsat 7 image using
<em>r.learn.train</em>, and create a prediction layer with
<em>r.learn.predict</em>.

<div class="code"><pre>
# train a random forest regression model using r.learn.train
r.learn.train group=lsat7_2000 training_map=training_pixels \
model_name=RandomForestRegressor n_estimators=500 \
save_model=rf_model.gz cv=2

# perform prediction using r.learn.predict
r.learn.predict group=lsat7_2000 load_model=rf_model.gz \
output=rf_regressor
</pre>
</div>

<p>
Now we can see how well the model performed using
<em>r.edm.eval</em>.

<div class="code"><pre>
r.edm.eval -p reference=herbaceous prediction=rf_regressor
</pre>
</div>

<p>
The accuracy measures are printed on screen.

<div class="code"><pre>
AUC = 0.8362
maximum TSS = 0.5289
maximum kappa = 0.4438
threshold maximum TSS = 0.1118
threshold maximum kappa = 0.2936
threshold minimum distance to (0,1) = 0.0979
</pre>
</div>

<p>
And the accompanying ROC curve is shown below (the figure can
optionally be saved to file).

<p>
<div align="center" style="margin: 10px"> <a
href="r_edm_eval_01.png"> <img src="r_edm_eval_01.png" alt="
ROC curve" border="0"> </a></div>

<p>
Let's see how well a logistic regression model performs, and compare
this with the previous results.

<div class="code"><pre>
# train a logistic regression model using r.learn.train
r.learn.train group=lsat7_2000 training_map=training_pixels \
model_name=LogisticRegression save_model=lr_model.gz cv=2

# predict class membership probabilities using r.learn.predict
r.learn.predict -p group=lsat7_2000 load_model=lr_model.gz \
output=lr_regressor
</pre>
</div>

<div class="code"><pre>
r.edm.eval -p reference=herbaceous prediction=rf_regressor,lr_regressor_1
</pre>
</div>

<p>
The accuracy measures are printed on screen.

<div class="code"><pre>
AUC = 0.8362
maximum TSS = 0.5289
maximum kappa = 0.4438
threshold maximum TSS = 0.1118
threshold maximum kappa = 0.2936
threshold minimum distance to (0,1) = 0.0979

--- performance lr_regressor_1 ---
AUC = 0.8384
maximum TSS = 0.5621
maximum kappa = 0.4669
threshold maximum TSS = 0.1047
threshold maximum kappa = 0.1995
threshold minimum distance to (0,1) = 0.0848
</pre>
</div>

<p>
And the accompanying ROC curve is shown below (the figure can
optionally be saved to file).

<p>
<div align="center" style="margin: 10px"> <a href="r_edm_eval_02.png">
<img src="r_edm_eval_02.png" alt=" ROC curve of random forest
regression model and logistic regression model." border="0"> </a></div>


<h2>REFERENCES</h2>

<p>
Jiménez-Valverde, A. 2012. Insights into the area under the receiver
operating characteristic curve (AUC) as a discrimination measure in
species distribution modelling. Global Ecology and Biogeography 21:
498-507.

<p>
Lobo, J. M., Jim&eacute;nez-Valverde, A., & Real, R. 2008. AUC: a
misleading measure of the performance of predictive distribution
models. Global Ecology and Biogeography 17: 145-151.

<p>
Hijmans, R. J. 2012. Cross-validation of species distribution models:
removing spatial sorting bias and calibration with a null model.
Ecology 93: 679-688.

<p>
Allouche, O., Tsoar, A., & Kadmon, R. 2006. Assessing the accuracy of
species distribution models: prevalence, kappa and the true skill
statistic (TSS). Journal of Applied Ecology 43: 1223-1232.

<p>
McPherson, J. M., Jetz, W., & Rogers, D. J. 2004. The effects of
species' range sizes on the accuracy of distribution models: ecological
phenomenon or statistical artefact? Journal of Applied Ecology 41:
811-823.

<h2>SEE ALSO</h2>

<em><a href="https://grass.osgeo.org/grass-stable/manuals/addons/r.confusionmatrix.html">r.confusionmatrix</a></em>


<h2>AUTHOR</h2>

Paulo van Breugel
