title: README
author: Renata L. Muylaert
date: 3/28/2022
output: html_document (df_print: paged)
knitr::opts_chunk$set(echo = TRUE)

Guidelines

This repository contains the code from Muylaert et al. (2022) and guidelines for replicating the workflow.

Please read the following guidelines:

  • Read the paper.
  • Make sure you download all data from Dryad.
  • Unzip the data in muylaert_et_al_data.zip.
  • Clone the dynamic repository from GitHub.
  • Unzip the code repository and find the distribution_models folder.
  • Move the extracted data files from muylaert_et_al_data to the distribution_models folder.
  • Make sure you are working in the distribution_models R project.
  • Run the 00_packages and 01_settings scripts first.
  • Run the remaining scripts in ascending order of their 3-digit numeric prefix, or as needed.
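
The run order described above can be sketched in base R. This is only an illustration: it assumes the scripts sit in the project folder, carry a numeric prefix followed by an underscore, and end in `.R`. The helper names `list_workflow_scripts` and `run_workflow` are hypothetical and not part of the repository.

```r
# Hypothetical helper: list scripts whose names start with a numeric prefix
# (e.g. 00_packages.R, 01_settings.R) and sort them by that prefix.
list_workflow_scripts <- function(project_dir = ".") {
  scripts <- list.files(project_dir, pattern = "^[0-9]+_.*\\.R$", full.names = TRUE)
  prefix <- as.integer(sub("_.*$", "", basename(scripts)))
  scripts[order(prefix, basename(scripts))]
}

# Hypothetical helper: source every script in order, in the global environment,
# so that packages and settings loaded first stay available to later scripts.
run_workflow <- function(project_dir = ".") {
  for (script in list_workflow_scripts(project_dir)) {
    message("Running ", basename(script))
    source(script)
  }
  invisible(NULL)
}
```

Calling `list_workflow_scripts()` from inside the distribution_models project shows the order before anything is executed. Sourcing scripts one by one remains the safer option when only part of the workflow is needed.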

Folder structure

Please find the content description for each folder below:

  • _env_27km_ss6 contains all environmental layers (*.tif) for the present, already scaled.
  • _env_fut_27km_ss6 contains all environmental layers (*.tif) for the future, already scaled, for all global circulation models, periods and scenarios.
  • dynamic_copies is an auxiliary data folder for when the user decides to update the species list used (*.xlsx and *.csv files).
  • dynamic_master contains the text files for the master dataset with IUCN-intersected and non-intersected occurrences within our working extent.
  • hotspots contains data for species presence and future projections.
  • IUCN_assessment_list_chiroptera contains the IUCN assessments for the Order Chiroptera (*.xlsx).
  • iucn_shapefile contains the IUCN ranges dataset for the Class Mammalia (terrestrial only).
  • olson contains the shapefiles for all terrestrial ecosystems of the world (Olson et al. 2001).
  • rasters_temp_forest contains the rasters used for hotspot calculation (*.tif).
  • results_40o_ss6_maxent_15rep contains the ENMTML data structure for non-intersected occurrences (text files and *.tif).
  • results_iucni_40o_ss6_maxent_15rep contains the ENMTML data structure for IUCN-intersected occurrences (text files and *.tif).
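
Before running the scripts, it can help to confirm that the folders listed above were moved into the project. The following is a minimal sketch in base R. The folder names are taken from the list above, and the helper `missing_folders` is hypothetical, not part of the repository.

```r
# Folder names as listed in this README.
expected_folders <- c(
  "_env_27km_ss6", "_env_fut_27km_ss6", "dynamic_copies", "dynamic_master",
  "hotspots", "IUCN_assessment_list_chiroptera", "iucn_shapefile", "olson",
  "rasters_temp_forest", "results_40o_ss6_maxent_15rep",
  "results_iucni_40o_ss6_maxent_15rep"
)

# Hypothetical helper: return the expected folders that are not present
# under project_dir (an empty character vector means nothing is missing).
missing_folders <- function(project_dir = ".", folders = expected_folders) {
  folders[!dir.exists(file.path(project_dir, folders))]
}
```

Running `missing_folders()` from the distribution_models project returns `character(0)` when every folder is in place.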

Workflow table

The workflow table follows Zurell et al. (2020). A standard protocol for reporting species distribution models. Ecography 43, 1261–1277.

Section Subsection Element Value
Overview Authorship Study title Present and future distribution of bat hosts of sarbecoviruses: implications for conservation and public health.
Overview Authorship Author names Renata L. Muylaert; Tigga Kingston; Jinhong Luo; Maurício Humberto Vancine; Nikolas Galli; Colin J. Carlson; Reju Sam John; Maria Cristina Rulli; David T. S. Hayman.
Overview Authorship Contact R.deLaraMuylaert@massey.ac.nz
Overview Authorship Study link https://doi.org/10.1098/rspb.2022.0397
Overview Model objective Model objective Forecasting and transfer.
Overview Model objective Target output Continuous occurrence probabilities and binary maps of potential presence.
Overview Focal Taxon Focal Taxon Bat hosts of sarbecoviruses.
Overview Location Location World.
Overview Scale of Analysis Spatial extent -30, 160, -30, 70 (xmin, xmax, ymin, ymax)
Overview Scale of Analysis Spatial resolution 0.25 dd
Overview Scale of Analysis Temporal extent Near-current and Future (2021-2100).
Overview Scale of Analysis Temporal resolution Near-current, 2021-2040, 2041-2060, 2061-2080, 2081-2100.
Overview Scale of Analysis Boundary Terrestrial areas of the world.
Overview Biodiversity data Observation type Human observation of occurrences.
Overview Biodiversity data Response data type Presence.
Overview Predictors Predictor types Bioclimatic; karst; forest cover.
Overview Hypotheses Hypotheses Implications for conservation and public health, evaluated through species distribution change in response to climate, karst, and forest cover.
Overview Assumptions Model assumptions Bats occur within the bioregions where they were detected, and around their highest density of occurrence points (MSDMs). Bat distribution is driven by bioclimatic covariates, karst, and native forest cover. Accessibility bias partially drives observed occurrences. Sampling bias is minimized by filtering, spatial thinning, and a minimum number of occurrences required for inclusion (N=40).
Overview Algorithms Modelling techniques Maxent through the ENMTML R package.
Overview Algorithms Model complexity The following six covariates were used (*.tif files): bio 1, bio 4, bio 12, bio 15, karstm, primf.
Overview Algorithms Model averaging True skill statistic-weighted (TSS-weighted) averaging.
Overview Workflow Model workflow ENMTML workflow.
Overview Software Software R 4.
Overview Software Code availability https://github.com/renatamuy/dynamic
Overview Software Data availability Dryad.
Data Biodiversity data Taxon names Aselliscus stoliczkanus; Hipposideros armiger; Hipposideros galeritus; Hipposideros larvatus; Hipposideros pomona (gentilis); Hipposideros pratti; Hipposideros ruber; Miniopterus schreibersii; Chaerephon plicatus; Tadarida teniotis; Rhinolophus acuminatus; Rhinolophus affinis; Rhinolophus blasii; Rhinolophus blythi; Rhinolophus cornutus; Rhinolophus creaghi; Rhinolophus euryale; Rhinolophus ferrumequinum; Rhinolophus hipposideros; Rhinolophus luctus; Rhinolophus macrotis; Rhinolophus malayanus; Rhinolophus marshalli; Rhinolophus mehelyi; Rhinolophus monoceros; Rhinolophus pearsonii; Rhinolophus rex; Rhinolophus shameli; Rhinolophus siamensis; Rhinolophus sinicus; Rhinolophus stheno; Rhinolophus thomasi; Nyctalus leisleri; Plecotus auritus.
Data Biodiversity data Taxonomic reference system Wilson D, Mittermeier R, editors. Handbook of the Mammals of the World. Barcelona: Springer; 2019.
Data Biodiversity data Ecological level Assemblage-level and species-level.
Data Biodiversity data Data sources Darkcides v1, Global Biodiversity Information Facility (GBIF), Berkeley Ecoinformatics Engine (Ecoengine), VertNet, Integrated Digitized Biocollections (iDigBio), iNaturalist, OBIS, and data compiled for previous publications: Darkcides v1, Rulli et al. (2020), Luo et al. (2013).
Data Biodiversity data Sampling design ENMTML workflow.
Data Biodiversity data Clipping Terrestrial areas of the world.
Data Biodiversity data Scaling None.
Data Biodiversity data Cleaning Temporal range 1970-2020. Cleaning through the CoordinateCleaner R package; only species with at least 40 occurrence points were included.
Data Biodiversity data Absence data None.
Data Biodiversity data Background data pres_abs_ratio = 1
Data Biodiversity data Errors and biases Sampling rate estimates through the sampbias R package.
Data Data partitioning Training data 75:25 training:test.
Data Data partitioning Validation data 75:25 training:test.
Data Data partitioning Test data Ratio of 75:25 training:test cross-validation splits with 10 repeats.
Data Predictor variables Predictor variables Bioclimatic variables, Karst composite layer, Primary forest cover.
Data Predictor variables Data sources Table S3.
Data Predictor variables Spatial extent -30, 160, -30, 70 (xmin, xmax, ymin, ymax)
Data Predictor variables Spatial resolution 0.25 dd.
Data Predictor variables Coordinate reference system WGS84.
Data Predictor variables Temporal extent Bioclimatic variables cover 1970-2000 for near-current conditions. Future projection periods: 2021-2040, 2041-2060, 2061-2080, 2081-2100.
Data Predictor variables Temporal resolution Future projection periods: 2021-2040, 2041-2060, 2061-2080, 2081-2100.
Data Predictor variables Data processing Covariates resampled to 0.25 dd.
Data Predictor variables Errors and biases Assessed via sampbias R package.
Data Predictor variables Dimension reduction None.
Data Transfer data Data sources
Data Transfer data Spatial extent World.
Data Transfer data Spatial resolution 0.25 dd
Data Transfer data Temporal extent 1970-present
Data Transfer data Temporal resolution Yearly
Data Transfer data Models and scenarios Future bioclimatic data downloaded from Worldclim (CMIP6).
Data Transfer data Data processing Future occurrence projections were made for each species and then ensembled per period per GCM and SSP.
Data Transfer data Quantification of Novelty NA
Model Variable pre-selection Variable pre-selection Relevance for our conceptual model of important native habitats for the selected species.
Model Multicollinearity Multicollinearity All bioclimatic covariates, karst layer and forest layer were pre-selected and then filtered after correlation analysis (0.7 cutoff value).
Model Model settings Model settings (fitting) MXS (Maxent simple) and MXD (Maxent default) algorithms.
Model Model settings Model settings (extrapolation) Extrapolations over near-current accessible areas assuming MSDM 'OBR' for the present.
Model Model estimates Coefficients NA
Model Model estimates Parameter uncertainty NA
Model Model estimates Variable importance Correlative.
Model Model selection - model averaging - ensembles Model selection NA
Model Model selection - model averaging - ensembles Model averaging NA
Model Model selection - model averaging - ensembles Model ensembles Weighted averaging of the algorithms through TSS.
Model Analysis and Correction of non-independence Spatial autocorrelation NA
Model Analysis and Correction of non-independence Temporal autocorrelation NA
Model Analysis and Correction of non-independence Nested data NA
Model Threshold selection Threshold selection We used the sensitivity‐specificity sum maximisation (max TSS) approach to select the optimal suitability threshold.
Assessment Performance statistics Performance on training data NA
Assessment Performance statistics Performance on validation data NA
Assessment Performance statistics Performance on test data True skill statistic (TSS).
Assessment Plausibility check Response shapes NA
Assessment Plausibility check Expert judgement IUCN range polygons and the Handbook of the Mammals of the World.
Prediction Prediction output Prediction unit Continuous suitability and estimated richness for hotspots inference (sum of final binary maps).
Prediction Prediction output Post-processing Area calculation through raster R package.
Prediction Uncertainty quantification Algorithmic uncertainty Ensemble over two algorithms and 10 repeats.
Prediction Uncertainty quantification Input data uncertainty Sampling-bias-adjusted map in Figure 2; two SSPs and two GCMs for future scenarios.
Prediction Uncertainty quantification Parameter uncertainty Table S2 for parameters used in sampbias.
Prediction Uncertainty quantification Scenario uncertainty SSP2-4.5 and SSP5-8.5 scenario evaluation.
Prediction Uncertainty quantification Novel environments NA
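
The table refers to three related quantities: the true skill statistic (TSS = sensitivity + specificity - 1), the threshold that maximises it, and a TSS-weighted average across models. In this workflow they are computed inside ENMTML; the base R sketch below only illustrates the arithmetic on toy data and is not the package's implementation. All function names and values are hypothetical.

```r
# TSS at a given threshold; observed is 1 for presence, 0 for background/absence.
tss_at <- function(suitability, observed, threshold) {
  predicted <- suitability >= threshold
  sensitivity <- sum(predicted & observed == 1) / sum(observed == 1)
  specificity <- sum(!predicted & observed == 0) / sum(observed == 0)
  sensitivity + specificity - 1
}

# Try every observed suitability value as a threshold and keep the one with
# the highest TSS (ties resolve to the lowest such threshold).
max_tss_threshold <- function(suitability, observed) {
  candidates <- sort(unique(suitability))
  scores <- vapply(candidates,
                   function(th) tss_at(suitability, observed, th),
                   numeric(1))
  best <- which.max(scores)
  list(threshold = candidates[best], tss = scores[best])
}

# TSS-weighted average: predictions has one column per model,
# tss holds one (positive) TSS value per column.
tss_weighted_mean <- function(predictions, tss) {
  as.vector(predictions %*% (tss / sum(tss)))
}

# Toy data (made up for illustration only).
toy_suitability <- c(0.9, 0.8, 0.7, 0.4, 0.3, 0.2)
toy_observed <- c(1, 1, 1, 0, 0, 0)
toy_best <- max_tss_threshold(toy_suitability, toy_observed)

toy_predictions <- cbind(model_a = c(0.2, 0.8), model_b = c(0.4, 0.6))
toy_ensemble <- tss_weighted_mean(toy_predictions, tss = c(0.6, 0.2))
```

Cells with ensemble suitability at or above the selected threshold would then be treated as potential presence, which is how continuous output becomes the binary maps summed for hotspot inference.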