Data & Code Supplement for "Environmental DNA as a management tool for tracking artificial waterhole use in savanna ecosystems"

maxfarrell/eDNAcamtrap

Reference Libraries (refLib folder)

Gathering barcoding reference sequences for the Kruger National Park (also available as a standalone git repository).

Scripts

Two scripts gather sequences from GenBank and BOLD. "GB_BOLD_seq_download.R" takes a list of Latin binomials and downloads sequences using the rentrez and bold R packages; it can be modified to download reference sequences for other marker genes. "CO1_from_GB_mito_genomes.R" follows the same process, but uses rentrez and modified scripts from the PrimerMiner R package to download whole mitochondrial genomes and extract COI sequences.
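As an illustration of the first step, the sketch below turns a list of Latin binomials into GenBank search terms. It is written in Python rather than R, and the gene synonyms and the function name `build_genbank_query` are assumptions for illustration; the search terms actually used in "GB_BOLD_seq_download.R" are not shown here and may differ.

```python
def build_genbank_query(binomial, gene_names=("COI", "CO1", "COX1")):
    """Build an Entrez search term for one species and one marker gene.

    gene_names is an assumed list of COI synonyms; swap it for another
    marker gene's names to target a different barcode region.
    """
    # Normalise whitespace so "Loxodonta  africana " and "Loxodonta africana" match.
    name = " ".join(binomial.split())
    organism = f'"{name}"[Organism]'
    genes = " OR ".join(f"{gene}[Gene]" for gene in gene_names)
    return f"{organism} AND ({genes})"


def build_queries(binomials):
    """Map each binomial in the species list to its search term."""
    return {name: build_genbank_query(name) for name in binomials}


if __name__ == "__main__":
    for species, term in build_queries(["Loxodonta africana", "Panthera leo"]).items():
        print(species, "->", term)
```

The resulting strings use the standard Entrez `[Organism]` and `[Gene]` field tags, so they could be passed to any Entrez client, including rentrez.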

After downloading the sequences, "format_GB_BOLD_refLib.sh", "generate_taxonomy.R", and "format_refLib_dada_2.R" are used to clean up the downloaded FASTA files, generate the taxonomy file, and format the library for use with dada2's built-in RDP classifier.
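For orientation, dada2 expects two header styles in its reference FASTA files: `assignTaxonomy` reads semicolon-delimited lineages ending in a semicolon, and `addSpecies` reads `>ID Genus species`. The Python sketch below produces both header styles from a lineage record. The rank list, function names, and the sequence ID `seq1` are illustrative assumptions, not taken from "format_refLib_dada_2.R".

```python
# Ranks used for the genus-level training file (assumed here to run Kingdom to Genus).
RANKS = ("kingdom", "phylum", "class", "order", "family", "genus")


def assign_taxonomy_header(lineage):
    """Header for dada2 assignTaxonomy: '>Level1;Level2;...;LevelN;'."""
    missing = [rank for rank in RANKS if rank not in lineage]
    if missing:
        raise ValueError(f"lineage is missing ranks: {missing}")
    return ">" + ";".join(lineage[rank] for rank in RANKS) + ";"


def add_species_header(seq_id, genus, species):
    """Header for dada2 addSpecies: '>ID Genus species'."""
    return f">{seq_id} {genus} {species}"


if __name__ == "__main__":
    elephant = {
        "kingdom": "Animalia", "phylum": "Chordata", "class": "Mammalia",
        "order": "Proboscidea", "family": "Elephantidae", "genus": "Loxodonta",
    }
    print(assign_taxonomy_header(elephant))
    # "seq1" is a placeholder ID, not a real accession.
    print(add_species_header("seq1", "Loxodonta", "africana"))
```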

Beyond the custom library, additional scripts are included to format the MIDORI and terrimporter COI reference databases for use with dada2 (these source databases must be downloaded separately).

Data

Contains downloaded FASTA files, whole mitochondrial genomes, and species lists generated by the Kruger National Park.

Output

The "output" folder contains intermediate files, plus the final dada2-formatted reference sequences:

  • Kingdom to Genus: "Kruger_Vertebrates_refLib_dada2.fasta"
  • Species: "Kruger_Vertebrates_refLib_dada2_species.fasta"
  • Phylum to Species: "Kruger_Vertebrates_refLib_dada2_phy2species.fasta"

Data processing, merger, and analyses

Data

Analyses can be reproduced with the data files included here. To rerun the DADA2 pipelines, first download the raw sequence reads, which are archived in the NCBI Sequence Read Archive:

BioProject PRJNA490450, accession numbers SRR7822814 to SRR7822901
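To fetch the reads in bulk, it can help to expand the accession range into an explicit list. The Python sketch below does this under the assumption that the range is contiguous and that every accession in it belongs to this BioProject; the function name `accession_range` is hypothetical.

```python
def accession_range(first, last):
    """Expand an inclusive accession range such as SRR7822814..SRR7822901."""
    prefix = first.rstrip("0123456789")
    if last.rstrip("0123456789") != prefix:
        raise ValueError("accessions must share the same prefix")
    width = len(first) - len(prefix)
    start, end = int(first[len(prefix):]), int(last[len(prefix):])
    if end < start:
        raise ValueError("last accession precedes first accession")
    # Zero-pad to the original width so the accession format is preserved.
    return [f"{prefix}{number:0{width}d}" for number in range(start, end + 1)]


if __name__ == "__main__":
    runs = accession_range("SRR7822814", "SRR7822901")
    print(len(runs), "runs, from", runs[0], "to", runs[-1])
```

The list can be written to a text file, one accession per line, for use with whichever SRA download tool is preferred.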

The "data" folder contains the raw camera trap annotations, field notes, mammal phylogeny, mammal trait data, and the final merged data file ("merged_eDNA_camtrap_data_nov20_2019.RData", created by "scripts/merging_data.R").

Scripts

Raw sequences are first separated by primer set ("separate_coi_by_primer.sh"). For each primer set, a separate DADA2 pipeline then performs quality filtering, denoising, chimera removal, ASV calling, and taxonomy assignment (dada2_*.R files). The sequence tables resulting from the separate pipelines are merged with "tax_assign_dada2.R".
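The primer-separation step can be pictured as binning each read by the primer found at its 5' end. The Python sketch below shows one simple way to do that, with support for IUPAC degenerate bases. It is not the logic of "separate_coi_by_primer.sh", which is not reproduced here, and the primer names and sequences in the usage example are made-up placeholders rather than the primers used in the study.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def starts_with_primer(read, primer):
    """True if the read begins with the (possibly degenerate) primer.

    Ambiguous bases in the read itself (e.g. 'N') count as mismatches.
    """
    if len(read) < len(primer):
        return False
    return all(base in IUPAC[code] for base, code in zip(read.upper(), primer.upper()))


def split_by_primer(reads, primers):
    """Bin reads by the first primer that matches their 5' end."""
    bins = {name: [] for name in primers}
    bins["unassigned"] = []
    for read in reads:
        for name, primer in primers.items():
            if starts_with_primer(read, primer):
                bins[name].append(read)
                break
        else:
            bins["unassigned"].append(read)
    return bins


if __name__ == "__main__":
    # Placeholder primers for illustration only.
    primers = {"setA": "GGWACW", "setB": "CCNGAY"}
    reads = ["GGTACAAAA", "CCAGATTT", "TTTT"]
    for name, members in split_by_primer(reads, primers).items():
        print(name, members)
```

This exact-prefix approach tolerates no sequencing errors in the primer region; dedicated trimming tools typically allow a configurable mismatch rate.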

"merging_data.R" merges the eDNA sequence tables, camera trap, and water sample data into an RData object for subsequent analyses ("merged_eDNA_camtrap_data_nov20_2019.RData").

Statistical analyses, figures, and tables for most analyses are produced with "analyses.R", with the exception of the hierarchical models, which are fit with "stan_models.R".
