Getting Started Guide
Update package versions and install the SpaDES package via CRAN:
    # Restart your R session so it is clear
    # Ctrl-shift-F10 if you are in RStudio
    #
    # If you have any of our packages or their dependencies, please update them first
    # Get latest versions of key SpaDES packages from CRAN
    dependencies <- tools::package_dependencies("SpaDES", recursive = TRUE)

    # Update any versions of the dependencies of those packages
    update.packages(oldPkgs = unlist(dependencies), ask = FALSE)

    # install the latest version of the SpaDES packages
    install.packages("SpaDES")
SpaDES is still in the early stages of maturity, so the development branch on GitHub may contain useful bug fixes that are not in the CRAN version. To install this development version:
    library("devtools")
    install_github("PredictiveEcology/SpaDES", ref = "development")
Using the instructions above, devtools::install_github() will automatically try to install the optional fastshp package. This additional package requires OS development tools (e.g., Rtools for Windows). If the above installation doesn't work for you, be sure to install the necessary development tools before reinstalling the package.
install.packages("fastshp", repos = "https://rforge.net", type = "source")
Getting Started with SpaDES
Load the SpaDES package in your R session using:
    library("SpaDES")
Set your default working directories
Simulations make use of several working directories:
- an inputs directory, inputPath, where SpaDES looks to find simulation inputs;
- an outputs directory, outputPath, where simulation outputs are saved;
- a cache directory, cachePath, where simulation outputs are cached;
- a modules directory, modulePath, where modules and their data are downloaded and saved.
Unless otherwise specified during simInit (by passing a paths argument), the default working directories are set via options.
Unless the user changes these options, temporary locations are used.
To configure the location of these working directories:
    ## use 'setPaths' to quickly set all paths to a default location
    setPaths() ## set all paths to defaults

    ## alternatively, custom paths can be set as arguments to 'setPaths'
    setPaths(inputPath = 'path/to/my/inputs') ## set custom inputPath; all others set to defaults
    setPaths(inputPath = 'path/to/my/inputs',
             outputPath = 'path/to/my/outputs',
             cachePath = 'path/to/my/cache',
             modulePath = 'path/to/my/modules') ## set all paths custom

    ## or by changing the global options directly
    options(spades.inputPath = 'path/to/my/inputs')
    options(spades.outputPath = 'path/to/my/outputs')
    options(spades.cachePath = 'path/to/my/cache')
    options(spades.modulePath = 'path/to/my/modules')
Remember that calling setPaths() resets every directory that is not passed explicitly to its default, even if custom paths were set earlier. To retrieve a path that is currently set (for example, to store it before resetting), use getPaths():
    myinputPaths <- getPaths()$inputPath
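Because the working directories are ordinary R options, they can also be saved and restored with base R alone. The following is a minimal sketch of that idea, not SpaDES functionality: `projectDir` and the folder names are hypothetical, and only the option names `spades.inputPath` and `spades.outputPath` come from the guide above.

```r
# Hypothetical project directory, used only for this illustration.
projectDir <- file.path(tempdir(), "myProject")

# options() invisibly returns the previous values of the options it changes,
# so they can be restored later.
oldOpts <- options(
  spades.inputPath  = file.path(projectDir, "inputs"),
  spades.outputPath = file.path(projectDir, "outputs")
)

# Read a custom path back from the options.
myInputPath <- getOption("spades.inputPath")
print(myInputPath)

# Restore whatever was set before (NULL entries unset the option again).
options(oldOpts)
```

Keeping the old values in a variable avoids having to remember which paths were customized before they are changed.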
Using pre-existing modules
Browse locally available modules:
    openModules(path = "/path/to/my/modules")         # opens all modules in a directory
    openModules("moduleName", "/path/to/my/modules")  # opens only the named module
Browse modules at https://github.com/PredictiveEcology/SpaDES-modules
Download modules for use:
    downloadModule("moduleName", path = "/path/to/my/modules", data = TRUE)
    openModules("moduleName", "/path/to/my/modules")
If no path is specified, modules and data will be downloaded and saved in the location returned by getOption('spades.modulePath'). See above to change this default location.
Try the LCC2005 module tutorial to see SpaDES at work.
Creating new modules
Create an empty module template:
newModule("moduleName", path = "/path/to/my/modules")
Read the modules vignette for more details.
Module development checklist
- are module metadata fully and correctly specified (module description, authorship and citation info, parameters and inputs/outputs, etc.)?
- citation should specify how to cite the module, or if published, the paper that describes the module.
- module object dependencies: use objectDiagram to confirm how data objects are passed among modules.
- are all event types defined in doEvent?
- use function(sim) to access event functions from within a module: function calls are correctly namespaced (i.e., it looks first inside the functions built in the module)
- use sim$object to access and make "global" data objects, shared among events and modules
- use `sim$$object to access and make module-specific functions, not intended to be shared with other modules
- use sim[[globals(sim)$objectName]] to access variable-named objects
- have you provided useful (meaningful) documentation in the module's
- have you built (knitted) the .Rmd file to generate a
- have you specified the terms under which your module code can be reused and/or modified? Add a license!
- we suggest that data you wish to include with your module are saved in data/; this makes modules more easily shareable with other people. Access those data with
- verify that external data sources are included in the
- verify that any additional data preparation/transformation steps used in
- SpaDES.tools::prepInputs() may be very useful
- CHECKSUMS.txt file for all data using checksums(..., write = TRUE)
Distributing your module
- where will your module code/data be hosted? Currently Google Drive and Dropbox appear to be easy places which can be private or public, and can now be easily accessed with
- downloadData from a temp dir to ensure your module can be downloaded correctly by others
Strategies for module development
Since modules will often have to run many, many times because of replication, there are a few strategies that should be followed:
- Always write fast code. This likely means using data.table (usually fastest) or dplyr (not quite as fast) for data and data wrangling.
- Matrices and vectors are generally fastest, if they provide the necessary features.
- Avoid loops.
- Use the reproducible package to Cache functions for speed.
- For computationally intensive functions, consider writing them in C++, via the
- For large (out of RAM) situations, use bigMemory. Sometimes, this can be done seamlessly inside functions using getOption("spades.lowMemory"), where two alternatives are provided, one "in memory", the other "on disk". See the "if (lowMemory)" code block about 20 lines from the start of the spread function for one way to do this with
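The pattern of offering two code paths behind the `spades.lowMemory` option can be sketched in base R. This is an illustration under stated assumptions, not code from `spread`: the function `cellSums` and its `chunkSize` argument are hypothetical, and the chunked loop merely stands in for a true on-disk implementation. The in-memory branch also shows the vectorized style recommended above.

```r
# Hypothetical helper, not part of SpaDES: row sums of a numeric matrix with
# an "in memory" path and a reduced-memory path chosen via an option.
cellSums <- function(m,
                     lowMemory = getOption("spades.lowMemory", FALSE),
                     chunkSize = 1000L) {
  if (nrow(m) == 0L) return(numeric(0))
  if (isTRUE(lowMemory)) {
    # Reduced-memory path: work on a few rows at a time. A real module might
    # read each chunk from disk here instead of subsetting an in-memory matrix.
    out <- numeric(nrow(m))
    for (s in seq(1L, nrow(m), by = chunkSize)) {
      idx <- s:min(s + chunkSize - 1L, nrow(m))
      out[idx] <- rowSums(m[idx, , drop = FALSE])
    }
    out
  } else {
    # In-memory path: one vectorized call, no explicit loop.
    rowSums(m)
  }
}

m <- matrix(as.numeric(1:20), nrow = 5)
inMem  <- cellSums(m, lowMemory = FALSE)
lowMem <- cellSums(m, lowMemory = TRUE, chunkSize = 2L)

# Both paths must give the same answer.
stopifnot(isTRUE(all.equal(inMem, lowMem)))
```

Reading the option in the argument default lets a user switch every such function at once with `options(spades.lowMemory = TRUE)`, while still allowing a per-call override.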
Other best practices
- Don't write modules that depend internally on other modules. Instead, pass data via the outputObjects in the metadata. This means avoiding scheduling an event in module A from module B, if possible.
- Use and push publicly sharable modules from and to the SpaDES-Modules repository (https://github.com/PredictiveEcology/SpaDES-modules) using downloadModule() or via pull request.
Types of modules
The concept of a "module" can be very broadly defined, i.e., what a particular module does can vary widely.
The only components that must exist are the metadata and the
This means that many, many types of modules can be written.
As we slowly build a SpaDES ecosystem of modules designed to be used and re-used, we can consider writing our entire workflow (raw data, data wrangling, data analysis, calibration of simulation model, simulation, output analysis, decision support) all in one chain.
We can cache everything along the way, so that if something must run again, but its inputs are identical to a previous run, then it can just read from disk.
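The idea of reading a result from disk when the inputs are unchanged can be sketched in base R. This is only an illustration of the principle, not how the reproducible package's Cache is implemented: the helper `cachedCall`, the counter `calls`, and the toy function `slowSquare` are hypothetical.

```r
# Hypothetical helper: run fun(...) once per distinct set of inputs and
# reuse the result saved on disk for identical later calls.
cachedCall <- function(fun, ..., cacheDir = file.path(tempdir(), "cache")) {
  dir.create(cacheDir, showWarnings = FALSE, recursive = TRUE)

  # Build a key from the function's code and its arguments by hashing
  # their serialized form (tools::md5sum only works on files).
  tmp <- tempfile()
  on.exit(unlink(tmp), add = TRUE)
  writeBin(serialize(list(deparse(fun), ...), connection = NULL), tmp)
  key <- unname(tools::md5sum(tmp))

  cacheFile <- file.path(cacheDir, paste0(key, ".rds"))
  if (file.exists(cacheFile)) {
    return(readRDS(cacheFile))  # identical inputs: read from disk
  }
  result <- fun(...)
  saveRDS(result, cacheFile)
  result
}

calls <- 0
slowSquare <- function(x) {
  calls <<- calls + 1  # count how often the body really runs
  x^2
}

demoCache <- tempfile("cacheDemo")  # fresh cache directory for the demo
a <- cachedCall(slowSquare, 1:5, cacheDir = demoCache)
b <- cachedCall(slowSquare, 1:5, cacheDir = demoCache)

# The second call returns the same result without re-running slowSquare.
stopifnot(identical(a, b), calls == 1)
```

Changing either the arguments or the function's code produces a different key, so the function runs again and a new result is stored.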
This is an evolving list of types of modules that would be useful to have in this "re-use" cycle:
- "classical" simulation models
- NetLogo-type models
- SELES-type models
- time is a component of the model
- e.g., predict methods from statistical outputs
agent based models
- animals, plants
- processes, such as fire
- e.g., forest succession, cellular automata
calibration and optimization
- taking outputs from other modules and rescheduling those other modules again, iterating through a heuristic optimization
- from one data type to another to allow two different modules to talk
- reprojection, crop, mask etc.
- modules that go to specific web resources (e.g., Dryad etc.)
- simplifying, joining etc.
- e.g., takes time series of rasters and visualizes them
quality scanning - e.g., from external databases
Current modules on the SpaDES-Modules repository (see above) include simple versions of dynamic forecasting (Forest Succession, fireSpreadLcc, forestAge), GIS (cropReprojectLccAge), translators (LccToBeaconsReclassify),