vignettes/articles/make-predictions.Rmd

---
title: "Make predictions"
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(collapse = TRUE,
                      comment = "#>",
                      fig.align = "center")
options(knitr.table.format = "html")
```

```{r load-data, echo=FALSE, message=FALSE}
default_model <- SDMtune:::bm_maxent
data <- SDMtune:::t
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd",
                    full.names = TRUE)
predictors <- terra::rast(files)
```

## Intro

In the previous articles you have learned how to [prepare the data](./prepare-data.html) for the analysis and how to [train a model](./train-model.html) using **SDMtune**. In this article you will learn how to use a trained model to make predictions. We will use the `default model` that we trained in the previous article to predict the distribution of the virtual species.

## Make prediction

First we load the **SDMtune** package:

```{r load-SDMtune}
library(SDMtune)
```

New locations are predicted with the function [predict()](../../reference/predict-SDMmodel-method.html). The function takes three main arguments:

* a trained model given as `SDMmodel()` object;
* a new dataset, used to make prediction (can be a `data.frame`, an `SWD()` object or a raster object);
* the output type, that for **Maxent** models can be: **raw**, **logistic** or **cloglog**.

Next we get the prediction for our training locations using the **cloglog** output type:

```{r predict-train}
pred <- predict(default_model,
                data = data,
                type = "cloglog")
```

The output in this case is a vector containing all the predicted values for the training locations:

```{r print-pred}
head(pred)
```

We can get the prediction only for the presence location with:

```{r predict-presence}
p <- data@data[data@pa == 1, ]
pred <- predict(default_model,
                data = p,
                type = "cloglog")
tail(pred)
```

For models trained with the **Maxent**  method, the function performs the prediction in R without calling the MaxEnt Java software. This results in a faster computation for large datasets and might result in a slightly different output compared to the Java software.

## Create a distribution map

We can use the same function to create a distribution map starting from the `predictors` raster object that we created in the [first article](./prepare-data.html#acquire-environmental-variables).

```{r predict-raster}
map <- predict(default_model,
               data = predictors,
               type = "cloglog")
```

In this case the output is a raster object:

```{r print-raster-output}
map
```

The map can be saved in a file directly when running the prediction, we just have to pass additional arguments to the [predict()](../../reference/predict-SDMmodel-method.html) function. In the next example we save the map in a file called "**my_file**" in the **GeoTIFF** format:

```{r save-map, eval=FALSE}
map <- predict(default_model,
               data = predictors,
               type = "cloglog",
               file = "my_map",
               format = "GTiff")
```

The [predict()](../../reference/predict-SDMmodel-method.html) function has other arguments useful when predicting large datasets:

* **progress**: can be set to `"text"` to visualize a progress bar;
* **extent**: can be passed to reduce the prediction to the given extent.

## Plot a distribution map

To plot the distribution map we can use the function `plotPred()`:

```{r plot-map-default}
plotPred(map)
```

The function `plotPred()` plots a map with a color ramp similar to the one used by the MaxEnt Java software. We can pass additional arguments to customize the map. In the next example we provide a custom color ramp and we add a title to the legend:

```{r plot-map-custom}
plotPred(map,
         lt = "Habitat\nsuitability",
         colorramp = c("#2c7bb6", "#abd9e9", "#ffffbf", "#fdae61", "#d7191c"))
```

### Try yourself

Restrict the prediction to Chile and plot the prediction. To see the solution highlight the following cell:

```{r exercise, eval=FALSE, class.source='exercise'}
# First create the extent that surrounds Chile
e = terra::ext(c(-77, -60, -56, -15))
# Now use the extent to make the prediction
map_e <- predict(default_model, 
                 data = predictors, 
                 type = "cloglog", 
                 extent = e)
# And finally plot the prediction
plotPred(map_e)
```

## Plot a presence/absence map

To plot a presence/absence map we need a threshold value that splits the prediction in presence and absence values. The function `thresholds()` returns some commonly used threshold values starting from an `SDMmodel()` object. In the next example we print the threshold values for the `default_model` object using the type `"cloglog"`:

```{r thresholds, eval=FALSE}
ths <- thresholds(default_model, 
                  type = "cloglog")

ths
```
```{r thresholds output, echo=FALSE}
ths <- thresholds(default_model, 
                  type = "cloglog")
kableExtra::kable(ths) |> 
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover"),
                            position = "center",
                            full_width = FALSE)
```

For example we want to create a presence/absence map using the threshold that maximize the training sensitivity plus specificity. We use the function `plotPA()` passing the threshold value as argument:

```{r plot-pa}
plotPA(map, 
       th = ths[3, 2])
```

We can also save the map in a file with the following code:

```{r plot-and-save-pa, eval=FALSE}
plotPA(map, 
       th = ths[3, 2], 
       filename = "my_pa_map", 
       format = "GTiff")
```

Both functions `plotPred()` and `plotPA()` have the argument `hr`to plot the the map with high resolution, useful when the map is used in a scientific publication. For all the arguments of the functions that we have used in this article please refer to the [documentation](../..//reference/index.html).

### Try yourself

Repeat the same analysis using a model trained with the **Maxnet** method.

## Conclusion

In this article you have learned:

* how to make prediction for datasets;
* how to create and save a distribution map;
* how to plot a distribution map;
* how to get thresholds values to create a presence/absence map;
* how to plot and save a presence/absence map.

In the [next article](./evaluate-model.html) you will learn how to evaluate a model.