This project contains code to update LANDFIRE raster maps of Existing Vegetation Cover (EVC) and and Existing Vegetation Height (EVH) using Fuel Disturbance (FDist) data and transition rules. This process is broken into three principal tasks: generating a SyncroSim library that organizes these inputs and transition rules by Map Zone; running the SyncroSim library to forecast these new vegetation parameters by Map Zone; and reconstructing EVC and EVH maps for the entire Geo Area from the maps generated for each Map Zone.
Below are the instructions to setup, configure, and run the code.
These scripts require working installations of R and SyncroSim, and were
developed on R version v4.3.3, SyncroSim v2.4.18, and rsyncrosim v1.4.2.
Additionally the following R packages must be installed: rsyncrosim
,
tidyverse
, terra
, furrr
, logr
, yaml
, raster
. The ST-Sim package
(v3.3.14) must also be installed in SyncroSim. The instructions to run the
script assume you will be using RStudio, however, this
is not a strict requirement.
Note that the raster
package is installed as a soft dependency for this
version of rsyncrosim
, but is not actually used by the scripts.
A number of data files are required that are not included on the GitHub repository due to size constraints. These files have been compressed into a zip folder that can be downloaded here. Please note that this file is quite large (~2.2GB).
The zip folder also contains the expected paths of the files, and so it is
recommended to extract the zip file directly into the root of this git
repository. In other words, the data/
folder should be present in the same
folder as this README after extraction, not inside another folder (such as
LANDFIRE Update Data Files/
). This data archive includes all the input data
files needed to process the NW Geo Area. The default configuration is set to run
a small test section of Map Zone 19 in the NW Geo Area.
The run can be configured by editing the config/config.yaml
file. R Studio and
most modern text editors have syntax highlighting for YAML files that can help
with editing these files, but may not be associated with *.yaml
files by
default. The YAML file syntax is fairly self-evident, but please see this short
overview
for details.
Descriptions of the configuration variables follow in the sections below.
This section is used to decide which Geo Area and Map Zones should be processed.
The first variable, dataFolder
, should point to the location of a folder
containing all the raw input data for a single Geo Area. The expected contents
of this folder are described in the Input Data Structure
section, and descriptions of all input data including these Geo-Area-specific
files can be found in the Data Dictionary section.
The second variable, mapzonesToKeep
, is an itemized list of every Map Zone
in the Geo Area that you would like to retain in the simulation. One SyncroSim
scenario will be built for each Map Zone in this list. By default, all the Map
Zones in the NW Geo Area are listed here but are commented out, with the exception
of Map Zone 19. You can remove these comments to run the entire NW Geo Area. For
all other Geo Areas you must update the list of Map Zones to keep.
Finally, tsdToRemove
is another itemized list that is used to decide which if
any Time Since Distrubance (TSD) codes to filter out of the FDIST layer during
processing. This can be used to exclude older disturbances for the update, such
as those that have already been accounted for in the raw vegetation maps.
These are the names of the SyncroSim library and project that will be built by the R scripts. It is recommended to include the name of the Geo Area in both of these names.
This workflow is parallelized at two different stages: building the SyncroSim
library and running the SyncroSim library. The nThreads
configuration variable
determines the number of cores R should use while building the library, while
ssimJobs
determines the number of cores SyncroSim should use while running the
library.
These variables are set independently as these two overarching tasks have very
different resource requirements and accordingly benefit from different numbers
of cores on most machines. In particular, building the library can only ever use
as many cores as there are unique disturbance events in a single Map Zone (about
20, on average) and makes most efficient use of about 8 cores. In contrast,
running the library will almost always benefit from more cores, but this task
is strongly memory limited. Accordingly, setting too many jobs here can lead to
Out-of-Memory errors. Please refer to the Suggested Configuration
Section below for suggestions on setting ssimJobs
.
Finally, tileSize
can be used to decide how finely the raster maps are split
up for the spatial multiprocessing in SyncroSim. Lower values will decrease
memory usage in SyncroSim (to a point), but will generally increase run time.
These options can be used to crop the output raster maps down to a smaller fixed
extent, primarily to speed up runs for testing. This feature can be enabled by
setting cropToExtent
to 1. Set this variable to 0 to disable cropping to
extent. When this feature is enabled, an extent will be read from the CSV file
at cropExtentPath
. See the example crop extent CSV for Map Zone 19,
config/Crop Extents/MZ 19 Example.csv
for the proper format. Note that all
Map Zones to be kept must at least partially overlap the chosen extent.
ssimDir
can be used to explicitly set the location of the SyncroSim
installation. This is primarily of use on Linux machines. Leave this cell empty
to use the default installation directory.
Although continuous EVC and EVH maps are required as inputs for the update, the
actual transitions are based on categorical EVC and EVH codes. As part of the
pre-processing, the input continuous maps are converted to categorical maps as
needed using the included crosswalk tables. To view and configure this
conversion, please refer to data/shared/EVC Crosswalk.csv
and data/shared/EVH Crosswalk.csv
.
The first column CONTINUOUS
, lists all valid continuous codes for EVC and EVH
respectively, and the second column, CATEGORICAL
, lists the categorical code
that the corresponding continuous code will be converted to. Please refer to the
Data Dictionary section for more details about these and other
data files.
As part of the final reconstruction of the entire Geo Area, the categorical Existing Vegetation Cover (EVC) and Existing Vegetation Height (EVH) codes are converted to their continuous code counterparts. While this conversion is straightforward for some categorical codes, there is some degree of choice for others.
To view and configure this conversion, refer to the data/shared/EVC LUT.csv
and data/shared/EVH LUT.csv
look up tables. The first column, VALUE
, lists
all valid categorical codes for EVC and EVH respectively while the final column,
CONTINUOUS
, lists the continuous code that corresponding categorical value
will be converted to. Please do not edit any of the columns in these tables
except for the final column, CONTINUOUS
. Please refer to the
Data Dictionary section for more details about these and other
data files.
If changes are made to the conversion rules after completing a run, they can be
applied retroactively by rerunning the reconstructGeoArea.R
script.
The suggested run configuration will depend on the available compute resources. Below is a table outlining some general recommendations and the corresponding expected run times.
Min. Processors | Min. Memory | SyncroSim Jobs | Tile Size | Approx. Run Time for NW Geo Area | Approx. Run Time for Map Zone 19 |
---|---|---|---|---|---|
4 | 32GB | 2 | 150 | 4 - 5 d | 4 - 8 h |
4 | 64GB | 4 | 150 | 2 - 3 d | 2 - 4 h |
8 | 128GB | 8 | 250 | ~ 1 d | 1 - 2 h |
16 | 256GB | 16 | 250 | 12 - 18 h | ~ 45 m |
32 | 512GB | 32 | 250 | 8 - 12 h | ~ 30 m |
To use this table, find the last row for which you have at least the minimum
number of available processors and minimum amount of memory. Set the number of
SyncroSim jobs (ssimJobs
) and tile sizes (tileSize
) accordingly in the
config file. The remaining two columns provide ranges for the expected run times.
Begin by opening the LANDFIRE Update.Rproj
R project file to ensure the
correct working directory is set in RStudio. Next, open the batchProcess.R
script and either run line-by-line or press the source
button in the top right
corner of the file editor pane of RStudio.
This script is responsible for processing the raw input raster maps, including cropping and masking down to the chosen Map Zone, converting FDist maps to vDist, and layerizing the disturbance map for SyncroSim. The script then uses these cleaned rasters to generate a SyncroSim library file to run the update.
Next, open the generated SyncroSim library using the SyncroSim UI. This file
will be stored in the library/
subdirectory. Select the NW Geo Area
project,
(or the projectName
you chose in the configuration) and the scenario you would
like to run and press run
in the main toolbar. Note that, depending on your
computer configuration, you may need to change the maximum number of
multiprocessing jobs.
As an example of requirements, the NW Geo Area (consisting of 12 Map Zones) was
run with a maximum of 32 multiprocessing jobs on a Linux server with 64 virtual
cores and 512GB of RAM and took just under 21 hours.
Results can be viewed directly in the graphical user interface (GUI) by creating Charts and Maps. You can also export tabular and map data to be viewed externally.
Finally, to reconstruct EVC and EVH raster maps for the entire Geo Area from the
SyncroSim results, run the reconstructGeoArea.R
script as before. The full maps
will be stored in the stitched/
subdirectory of the input dataFolder
you set
in the configuration.
NOTE that using the GUI is optional and the scenario can also be run directly through the SyncroSim commandline or by using the rsyncrosim package for R.
A number of automated checks are incorporated into the library building process to catch transition issues prior to simulation. These tests are described here:
The Geo-Area-specific input file Transition Table.csv
defines transition rules
for cells as a function of the disturbance (VDIST), Existing Vegetation Cover
(EVC), Existing Vegetation Height (EVH), Existing Vegetation Type (EVT), and Map
Zone. This check ensures that each combination of inputs maps to a unique output
EVC and EVH. If there are multiple transition rules for a given set of inputs,
the script will throw an error and stop building the library. The script will
also a log file to the library/
folder if any duplicate rules are found to
help identify the duplicates.
This test finds every combination of disturbance (VDIST), Existing Vegetation Cover (EVC), Existing Vegetation Height (EVH), Existing Vegetation Type (EVT), and Map Zone that are present in the cleaned raster maps. By comparing this table to the table of all defined transition rules, this test identifies all the disturbed cells which are missing an explicit transition rule.
This table of missing transition rules is then split into a table of missing
rules that are known not to be problematic and a table of unexpected missing
rules. Both are written to the library/
folder and a warning is issued if any
unexpected missing rules were found.
Currently, only mechanical disturbances in herb cover and any disturbances in urban, developed, and orchard vegetation types are recognized as known disturbances.
This test finds every combination of Existing Vegetation Cover (EVC) and
Existing Vegetation Height (EVH) that is present in the cleaned raster maps and
ensures that they are valid combinations. For example, a cell with 50% forest
cover should not have a corresponding shrub vegetation height code. A table of
the invalid states and the Map Zones in which they were found are written to the
library/
folder and a warning is given if any invalid states were found.
Flagged states that should be considered valid can be appended to the
data/shared/All Combinations.csv
file to avoid future warnings. See the
Data Dictionary section for more information about this file.
Flagged states that truly are invalid are indicative of errors in the raw EVC or
EVH map for the Geo Area.
To generalize the workflow for running multiple Geo Areas with minimal reconfiguration, the scripts have strict requirements for how input files are named and organized.
The input files for each Geo Area must be stored in its own folder, preferably
in the data/
directory with a name that indicates which Geo Area the folder is
for. For example, the example data archive linked in the Setup section
contains the input files for the NW Geo Area and are accordingly are held in
data/NW/
.
Within this directory, there must be a subdirectory named raw/
. This is used
to separate the raw inputs from the final clean and stitched raster maps that
will be generated and stored in their respective subdirectories.
Within the raw/
subdirectory, the raw raster maps of the Map Zones, EVC, EVH,
EVT, and FDist must be present and named Map Zones.tif
, EVC.tif
, EVH.tif
,
EVT.tif
, and FDIST.tif
respectively.
Also required within the raw/
subdirectory is the nonspatial/
subdirectory,
which includes Geo-Area-specific data files that are not stored as maps. This
folder requires a CSV file of all transition rules, Transition Table.csv
,
and a color mapping for EVT codes present in the Geo Area, EVT Colors.csv
.
Altogether, the folder should have the following structure:
Geo Area Name
│
└───raw
│ Map Zones.tif
│ EVC.tif
│ EVH.tif
│ EVT.tif
│ FDIST.tif
│
└───nonspatial
│ Transition Table.csv
│ EVT Colors.csv
Please see the Data Dictionary section below for a description of these input files and the example data files linked in the Setup section for the expected format of each file.
Below is a description of the steps to prepare data for updating a Geo Area using the South Central Geo Area as an example.
To begin, create a folder for the Geo Area (data/SC/
in this example) as well
as the necessary subfolders, raw/
and raw/nonspatial/
. The folder structure
will look like this:
SC
│
└───raw
│
└───nonspatial
│
Next, we will populate the spatial data, which consists of seven raster maps of the entire Geo Area. These maps are provided by LANDFIRE but must be masked by Geo Area. Wendell Hann and Katherine Hegewisch from the University of Idaho have already done this masking for us in ArcGIS and converted the output to GeoTiff format. As part of this pre-processing, they also removed a handful of EVC and EVH cells that were mistakenly coded 0 or 255. Finally, these masked files were made available to Apex RMS through the University of Idaho OwnCloud server. Please contact Wendel or Katherine to get access to these files.
These pre-porcessed files were then downloaded, decompressed, moved into the
raw/
subfolder of the corresponding Geo Area data folder and renamed in order
to be recognized by the script.
For each of these seven maps, the following table lists the original file and archive names for the SC Geo Area.
Cleaned File Name | Name of Archive Containing the File | Original File Name |
---|---|---|
Map Zones.tif | sc_mapzone_tiff.zip | sc_mapzone.tif |
EVC.tif | sc_evc2.0_tiff.zip | sc_evc20.tif |
EVH.tif | sc_evh2.0_tiff.zip | sc_evh20.tif |
EVT.tif | sc_evt2.0_tiff.zip | sc_evt20.tif |
FDIST.tif | sc_fdist2020_tiff.zip | sc_fdist.tif |
To use this table, download the zip archives listed in the second column. Next,
extract the contents and copy the file listed in the third column into the
raw/
subfolder created in the previous section. Finally rename the files
to the clean names listed in the first column.
For example, to prepare the raster map of Map Zones in the SC Geo Area for the
example data archive, I first downloaded and extracted sc_mapzone_tiff.zip
.
In the uncompressed folder, I found and moved the file sc_mapzone.tif
into
the folder data/SC/raw/
. Finally, I renamed this raster map to
Map Zones.tif
. After repeating this process for all seven maps, the SC data
folder should have the following structure:
SC
│
└───raw
│ Map Zones.tif
│ EVC.tif
│ EVH.tif
│ EVT.tif
│ FDIST.tif
│
└───nonspatial
│
Next, we will prepare the two non-spatial data files needed for each Geo Are.
Both of these files are provided by LANDFIRE as tables within a Microsoft Access
database file. These Access database tables must be exported, converted to Comma
Separated Value (CSV) format, moved into the raw/nonspatial/
subfolder of the
corresponding Geo Area data folder and renamed to be recognized for the script.
Both tables needed for the SC Geo Area were found within the
SC_GeoArea_VegTransitions_Update_for_Remap_KCH_complete_KCH_2020_12_18.accdb
database file. The following table connects the non-spatial data file names
with their original table names in the Access database.
Cleaned File Name | Original Table Name within Database |
---|---|
Transition Table.csv | vegtransf_rv02i_d |
EVT Colors.csv | sc_evt200 |
To use this table, open the Access database file using Microsoft Access and
locate the tables listed in the second column. Select the table from the
navigator on the left and open the External Data
tab in the main toolbar at
the top of the window. In this tab, select Text File
to export the table to a
text file. In the dialog window that opens, use the Browse...
button to save
the files to the raw/nonspatial
subfolder and use the textbox to rename the
files to the cleaned file names listed in the table above. Ensure that the
Export data with formatting and layout
checkbox is not checked, and press
OK
to continue. In the next window, selected Delimited
to use comma-separated
values and hit Next
. In this window, ensure that Comma
is selected in the
delimiter section, check the Include Field Names on First Row
checkbox, and
set the Text Qualifier
value to {none}
. Finally, hit Next
one last time,
double check that the correct folder and cleaned file name is set in the
Export to file
text box, and hit Finish
.
For example, to prepare the transition table for the SC Geo Area, I first
downloaded the Access database file. I located the vegtransf_rv02i_d
table
within this database, exported it into an Excel file using Access, and then
used Excel to convert the file into a CSV. I moved this CSV file into the
folder data/SC/raw/nonspatial/
and renamed it to Transition Table.csv
.
After repeating these steps for both nonspatial datasets, you should have
all the necessary data to process the entire Geo Area, with your folder/file
structure shown below.
SC
│
└───raw
│ Map Zones.tif
│ EVC.tif
│ EVH.tif
│ EVT.tif
│ FDIST.tif
│
└───nonspatial
│ Transition Table.csv
| EVT Colors.csv
Once the data files for a new Geo Area have been prepared, it is important to
update the configuration file config/config.yaml
to reflect these changes.
Please refer to the Configuration section for details, but
be sure to update the path to the new data folder (dataFolder
), the list of
map zones to generate scenarios for (mapzonesToKeep
), and the SyncroSim
library and project names (libraryName
and projectName
, respectively). The
crop extent used for testing (cropExtentPath
) must also be updated to use
testing mode with a new Geo Area.
Below is a description of all data files used in the update. This is includes both spatial and non-spatial inputs. A number of data files are common to all Geo Areas and are described in the Shared Data Files subsection below. The remaining input data files are described in the Geo-Area-Specific Data Files subsection. Descriptions of the columns within each table can also be found within these subsections.
These common files are stored in the data/Shared/
folder and are used for all
Geo Areas.
File Name | Description |
---|---|
All Combinations.csv | This table provides every valid combination of EVC and EVH. This is used to check for invalid state classes. |
Disturbance Crosswalk.csv | This crosswalk is used to convert between Fuel Disturbance (FDist) and Vegetation Disturbance (VDist) codes. |
EVC Crosswalk.csv | This crosswalk is used to convert from the continuous Existing Vegetation Cover (EVC) codes used in the raster maps to the categorical codes used by the transition rules. |
EVH Crosswalk.csv | This crosswalk is used to convert from the continuous Existing Vegetation Height (EVH) codes used in the raster maps to the categorical codes used by the transition rules. |
EVC LUT.csv | This look-up table is used connect to Existing Vegetation Cover (EVC) codes to human-readable names. |
EVH LUT.csv | This look-up table is used connect to Existing Vegetation Height (EVH) codes to human-readable names. |
VDIST LUT.csv | This look-up table is used connect to Vegetation Disturbace (VDist) codes to human-readable names. |
Recently Disturbed EVT.csv | This table is used to identify recently disturbed cells that should be filtered out from the raster maps prior to simulation. |
EVC Colors.csv | This table is used to assign colors to Existing Vegetation Cover (EVC) codes. These are used when rendering raster maps. |
Default SyncroSim Charts.csv | This table is used to build the default SyncroSim charts for visualizing the run results. |
Default SyncroSim Maps.csv | This table is used to build the default SyncroSim maps for visualizing the run results. |
Column | Description |
---|---|
EVC | An Existing Vegetation Cover (EVC) code. |
EVH | An Existing Vegetation Height (EVH) code that can be paired with the given EVC. |
Column | Description |
---|---|
VDIST | A Vegetation Disturbance (VDist) code. |
FDIST | A matching Fuel Disturbance (FDist) code. |
d_type | The type of disturbance. |
d_severity | The severity of the disturbance. |
d_time | The time since the disturbance. |
Column | Description |
---|---|
CONTINUOUS | A continuous Existing Vegetation Cover (EVC) code. |
CATEGORICAL | The corresponding categorical Existing Vegetation Cover (EVC) code. |
Column | Description |
---|---|
CONTINUOUS | A continuous Existing Vegetation Height (EVH) code. |
CATEGORICAL | The corresponding categorical Existing Vegetation Height (EVH) code. |
Column | Description |
---|---|
VALUE | An Existing Vegetation Cover (EVC) code. |
CLASSNAMES | A matching name for the cover code. |
EVT_LIFEFORM | A grouping variable describing the lifeform type. |
CONTINUOUS | A matching continuous Existing Vegetation Cover (EVC) code to use during Geo Area reconstruction. |
Column | Description |
---|---|
VALUE | An Existing Vegetation Height (EVH) code. |
CLASSNAMES | A matching name for the cover code. |
LIFEFORM | A grouping variable describing the lifeform type. |
HC_ID | Not used. |
CONTINUOUS | A matching continuous Existing Vegetation Height (EVH) code to use during Geo Area reconstruction. |
Column | Description |
---|---|
value | A Vegetation Disturbance (VDist) code. |
d_type | The type of disturbance. |
d_severity | The severity of the disturbance. |
d_time | The time since the disturbance. |
R | The Red component of the disturbance color as an integer from 0 to 255. |
G | The Green component of the disturbance color as an integer from 0 to 255. |
B | The Blue component of the disturbance color as an integer from 0 to 255. |
Column | Description |
---|---|
EVT | An Existing Vegetation code that identifies a recently disturbed cell. |
Column | Description |
---|---|
value | An Existing Vegetation Cover (EVC) code. |
R | The Red component of the state class color as an integer from 0 to 255. |
G | The Green component of the state class color as an integer from 0 to 255. |
B | The Blue component of the state class color as an integer from 0 to 255. |
These two CSV files are generated and exported from SyncroSim and are used to construct the default charts and maps. It is not recommended to edit these tables by hand.
These files are specific to the Geo Area being processed. As described in the
Input Data Structure section, these files are suggested to be
stored in a folder within the data/
folder that indicates the Geo Area name.
This section also describes the expected organization of these files within this
Geo-Area-specific folder.
File Name | Description |
---|---|
Transition Table.csv | This table defines how every combination of Existing Vegetation Cover (EVC), Height (EVH), and Type (EVT) responds to a given disturbance (VDist). |
EVT Colors.csv | This table is used to assign colors to Existing Vegetation Type (EVT) codes. These are used when rendering raster maps. |
Map Zones.tif | This is the raster map of Map Zones found within the Geo Area. |
EVC.tif | This is the raster map of Existing Vegetation Covers (EVC) for the entire Geo Area using the continuous EVC codes. |
EVH.tif | This is the raster map of Existing Vegetation Heights (EVH) for the entire Geo Area using the continuous EVH codes. |
EVT.tif | This is the raster map of Existing Vegetation Types (EVT) for the entire Geo Area. |
FDIST.tif | This is the raster map of Fuel Disturbances (FDist) for the entire Geo Area. |
Column | Description |
---|---|
MZ | The Map Zone to apply the transition rule. |
VDIST | The Vegetation Disturbance (VDist) code that triggers the transition. |
DIST_CATID | Not used. |
EVT7B | The Existing Vegetation Type (EVT) code prior to the disturbance. |
EVT7B_Name | The name of the Existing Vegetation Type (EVT) prior to the disturbance. |
EVHB | The Existing Vegetation Cover (EVC) code prior to the disturbance. |
EVCB | The Existing Vegetation Height (EVH) code prior to the disturbance. |
EVT7R | Not used. |
EVT7R_Name | Not used. |
EVHR | The Existing Vegetation Cover (EVC) code after the disturbance. |
EVCR | The Existing Vegetation Height (EVH) code after the disturbance. |
Column | Description |
---|---|
VALUE | An Existing Vegetation Type (EVT) code. |
COUNT | Not used. |
VALUE_1 | Not used. |
EVT_NAME | Not used. |
LFRDB | Not used. |
EVT_FUEL | Not used. |
EVT_FUEL_N | Not used. |
EVT_LF | Not used. |
EVT_PHYS | Not used. |
EVT_GP | Not used. |
EVT_GP_N | Not used. |
SAF_SRM | Not used. |
EVT_ORDER | Not used. |
EVT_CLASS | Not used. |
EVT_SBCLS | Not used. |
R | The Red component of the primary stratum color as an integer from 0 to 255. |
G | The Green component of the primary stratum color as an integer from 0 to 255. |
B | The Blue component of the primary stratum color as an integer from 0 to 255. |
RED | Not used. |
GREEN | Not used. |
BLUE | Not used. |