# Fireveg DB - check taxonomic alignment of species names

Authors: [José R. Ferrer-Paris](https://github.com/jrfep) and [Ada Sánchez-Mercado](https://github.com/adasanchez)

Date: July 2024

This Jupyter Notebook includes R code to check taxonomic alignment for species names used in the Fireveg Database. 

The input is loaded from a public data record of the database.


The animal and plant lists in BioNet Atlas are maintained by the BioNet Data Team.

For taxonomic decisions on plants, BioNet uses the following sources in order of precedence (e.g., if the NSW Biodiversity Conservation Act 2016 and PlantNet are inconsistent, BioNet names follow the former):
 - NSW Biodiversity Conservation Act 2016 https://legislation.nsw.gov.au/view/html/inforce/current/act-2016-063
 - PlantNet https://plantnet.rbgsyd.nsw.gov.au/search/simple.htm
 - APNI https://www.anbg.gov.au/apni/
 - IPNI https://www.ipni.org/

[BioNet species names web service data standard](https://www.environment.nsw.gov.au/-/media/OEH/Corporate-Site/Documents/BioNet/bionet-species-names-web-service-data-standard-v-1-2-220050.pdf)
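
The order of precedence listed above can be sketched as a small lookup: walk the sources from highest to lowest precedence and keep the first name found. This is only an illustration of the rule as described here, not BioNet's actual implementation; `resolve_name()`, the short source labels, and the placeholder names are all hypothetical.

```r
# Sources in order of precedence (highest first), using hypothetical short labels
source_precedence <- c("BC Act 2016", "PlantNet", "APNI", "IPNI")

# Return the name given by the highest-precedence source that provides one.
# `names_by_source` is a named list with one entry per source (NA or absent if no name).
resolve_name <- function(names_by_source, precedence = source_precedence) {
  for (src in precedence) {
    candidate <- names_by_source[[src]]
    if (!is.null(candidate) && !is.na(candidate) && nzchar(candidate)) {
      return(list(name = candidate, source = src))
    }
  }
  # No source provides a name
  list(name = NA_character_, source = NA_character_)
}

# Placeholder names, not real taxa: the Act has no entry, so PlantNet wins over APNI
example_record <- list(
  "BC Act 2016" = NA_character_,
  "PlantNet"    = "Genus alpha",
  "APNI"        = "Genus beta"
)
resolve_name(example_record)
```

When two sources disagree, the result always carries the name from the source that appears first in `source_precedence`, which mirrors the example given for the Act and PlantNet.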

## Set-up
### Load libraries


In [1]:
library(ggplot2)
##library(forcats)
library(dplyr)
#library(data.table)
require(tidyr)


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


Loading required package: tidyr



### Working directory

In [2]:
here::i_am("taxonomic-coverage/Check-taxonomy.ipynb")

here() starts at /Users/z3529065/proyectos/fireveg/fireveg-analysis



### Load data
Check whether a copy of the data is available in the local data folder; otherwise, download it from the OSF cloud drive.

In [3]:
# Local folder where the data files are cached
data_dir <- here::here("data")
if (!dir.exists(data_dir))
    dir.create(data_dir)

for (RDSfile in c("Summary-traits-species.rds")) {
    RDSpath <- file.path(data_dir, RDSfile)
    if (file.exists(RDSpath)) {
        cat(sprintf("RDS file found at:\n%s\nNo need to download!\n\n", RDSpath))
    } else {
        # Download the missing file from the OSF project
        library(osfr)
        osf_project <- osf_retrieve_node("https://osf.io/h96q2")
        file_list <- osf_ls_files(osf_project, pattern = RDSfile)
        osf_download(file_list,
                     path = data_dir,
                     conflicts = "overwrite")
    }
}

RDS file found at:
/Users/z3529065/proyectos/fireveg/fireveg-analysis/data/Summary-traits-species.rds
No need to download!



In [4]:
#quadrats_table <- readRDS(here::here(data_dir,"Quadrat-sample-data.rds"))
all_traits <- readRDS(here::here(data_dir,"Summary-traits-species.rds"))


In [5]:
glimpse(all_traits)

Rows: 15,732
Columns: 24
$ family          <chr> "Brassicaceae", "Myrtaceae", "Myrtaceae", "Apiaceae", …
$ genus           <chr> "Lepidium", "Eucalyptus", "Melaleuca", "Actinotus", "A…
$ spp             <dbl> 2358, 2359, 2360, 2361, 2362, 2363, 2364, 2365, 2366, …
$ species         <chr> "Lepidium oxytrichum", "Eucalyptus williamsiana", "Mel…
$ current_spp     <dbl> 2358, 2359, 2360, 2361, 2362, 2363, 22952, 2365, 2366,…
$ current_species <chr> "Lepidium oxytrichum", "Eucalyptus williamsiana", "Mel…
$ taxonrank       <chr> "Species", "Species", "Species", "Species", "Species",…
$ establishment   <chr> "Alive in NSW, Native", "Alive in NSW, Native", "Alive…
$ current         <chr> "true", "true", "true", "true", "true", "true", "false…
$ nquadrat        <dbl> 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 15, 0, 0, 0

## Check taxonomic alignment with APCalign

From Will Cornwell:

> The APC is apparently going to move to formal version control in the medium-term future, but in the meantime we've been working with Anne Fuchs and Anna Monro to get a dynamic updating system here: https://www.publish.csiro.au/BT/pdf/BT24014.
>
> Somewhere in the flow from RBG -> BioNET -> AVH -> ALA, there is a taxonomy switch. AVH (Australian Virtual Herbarium) seems to be APC, so I guess it happens either on export from BioNET or on input into AVH?
>
> I checked Palmeria racemosa, which is for some reason not recognized by NSW, and it's not in the database, so it does seem like PlantNET taxonomy. So maybe check what's happening to names not in PlantNET?

Following recommendations from:
https://traitecoevo.github.io/APCalign/articles/APCalign.html

In [6]:
library(APCalign)

Download the stable version of the taxonomic resources

In [7]:
stable_resources <- load_taxonomic_resources(stable_or_current_data = "stable")

Loading resources into memory...
...done



In [22]:
fireveg_species_names <- all_traits |> 
  distinct(species) |>
  pull(species) |>
  create_taxonomic_update_lookup(resources = stable_resources)

Checking alignments of 15648 taxa


  -> of these 9044 names have a perfect match to a scientific name in the APC. 
      Alignments being sought for remaining names.



In [23]:

fireveg_species_names |> 
  print(n = 6)

# A tibble: 15,648 × 12
  original_name       aligned_name accepted_name suggested_name genus taxon_rank
  <chr>               <chr>        <chr>         <chr>          <chr> <chr>     
1 Lepidium oxytrichum Lepidium ox… Lepidium oxy… Lepidium oxyt… Lepi… species   
2 Eucalyptus william… Eucalyptus … Eucalyptus w… Eucalyptus wi… Euca… species   
3 Melaleuca glomerata Melaleuca g… Melaleuca gl… Melaleuca glo… Mela… species   
4 Actinotus helianthi Actinotus h… Actinotus he… Actinotus hel… Acti… species   
5 Apium prostratum    Apium prost… Apium prostr… Apium prostra… Apium species   
6 Cryptocarya obovata Cryptocarya… Cryptocarya … Cryptocarya o… Cryp… species   
# ℹ 15,642 more rows
# ℹ 6 more variables: taxonomic_dataset <chr>, taxonomic_status <chr>,
#   scientific_name <chr>, 

In [27]:
fireveg_species_names |> 
    group_by(update_reason,taxon_rank) |> 
    summarise(names = n_distinct(original_name), .groups = "drop") |>
    pivot_wider(names_from = taxon_rank, values_from = names)

| update_reason | form | species | subspecies | variety | family | genus | NA |
|---|---|---|---|---|---|---|---|
| `<chr>` | `<int>` | `<int>` | `<int>` | `<int>` | `<int>` | `<int>` | `<int>` |
| aligned name accepted by APC | 28 | 8259 | 811 | 379 | | | |
| basionym | | 137 | 20 | 13 | | | |
| doubtful taxonomic synonym | 2 | 8 | 2 | 1 | | | |
| excluded | 3 | 96 | 4 | 1 | | | |
| misapplied | | 132 | 6 | 8 | | | |
| nomenclatural synonym | 4 | 1125 | 211 | 261 | | | |
| orthographic variant | | 139 | 6 | 15 | | | |
| pro parte misapplied | | 36 | 1 | | | | |
| pro parte nomenclatural synonym | | | 1 | 1 | | | |
| pro parte taxonomic synonym | | 12 | 3 | 5 | | | |
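
A possible follow-up to this summary is to list the names for which no alignment was found, which may help with the question raised above about names that fall outside the reference taxonomy. The sketch below assumes that unresolved names carry `NA` in `aligned_name` or `accepted_name`; that is an assumption about the lookup table, not something shown in the output here. `find_unresolved()` is a hypothetical helper, and the second row of the toy table is a made-up placeholder.

```r
# Keep the rows of an alignment lookup that have no aligned or accepted name.
# Assumption: unresolved names are stored as NA in these columns.
find_unresolved <- function(lookup) {
  needed <- c("original_name", "aligned_name", "accepted_name")
  stopifnot(is.data.frame(lookup), all(needed %in% names(lookup)))
  unresolved <- is.na(lookup$aligned_name) | is.na(lookup$accepted_name)
  lookup[unresolved, needed, drop = FALSE]
}

# Toy lookup: the first row comes from this notebook's output,
# the second row is a placeholder and not a real record
toy_lookup <- data.frame(
  original_name = c("Festuca elatior", "Placeholder name"),
  aligned_name  = c("Festuca elatior", NA),
  accepted_name = c("Lolium arundinaceum", NA),
  stringsAsFactors = FALSE
)
find_unresolved(toy_lookup)
```

With the real table, the same call would be `find_unresolved(fireveg_species_names)`.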


In [11]:
fireveg_species_names |> filter(update_reason %in% "pro parte misapplied") |> distinct()

original_name,aligned_name,accepted_name,suggested_name,genus,taxon_rank,taxonomic_dataset,taxonomic_status,scientific_name,aligned_reason,update_reason,number_of_collapsed_taxa
<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<dbl>
Myriophyllum elatinoides,Myriophyllum elatinoides,Myriophyllum caput-medusae,Myriophyllum caput-medusae [alternative possible names: Myriophyllum porcatum (pro parte misapplied) | Myriophyllum salsugineum (pro parte misapplied) | Myriophyllum triphyllum (pro parte misapplied)],Myriophyllum,species,APC,accepted,Myriophyllum caput-medusae Orchard,Exact match of taxon name to an APC-known canonical name once punctuation and filler words are removed (2024-08-16),pro parte misapplied,1
Xanthium strumarium,Xanthium strumarium,Xanthium cavanillesii,Xanthium cavanillesii [alternative possible names: Xanthium italicum (pro parte misapplied) | Xanthium occidentale (pro parte misapplied) | Xanthium orientale (pro parte misapplied)],Xanthium,species,APC,accepted,Xanthium cavanillesii Schouw,Exact match of taxon name to an APC-known canonical name once punctuation and filler words are removed (2024-08-16),pro parte misapplied,1
Mischocarpus sundaicus,Mischocarpus sundaicus,Mischocarpus australis,Mischocarpus australis [alternative possible names: Mischocarpus macrocarpus (pro parte misapplied) | Mischocarpus stipitatus (pro parte misapplied) | Mischocarpus australis (misapplied)],Mischocarpus,species,APC,accepted,Mischocarpus australis S.T.Reynolds,Exact match of taxon name to an APC-known canonical name once punctuation and filler words are removed (2024-08-16),pro parte misapplied,1
Rubus fruticosus,Rubus fruticosus,Rubus anglocandicans,Rubus anglocandicans [alternative possible names: Rubus erythrops (pro parte misapplied) | Rubus laciniatus (pro parte misapplied) | Rubus leucostachys (pro parte misapplied) | Rubus phaeocarpus (pro parte misapplied) | Rubus riddelsdellii (pro parte misapplied) | Rubus rubritinctus (pro parte misapplied) | Rubus ulmifolius var. ulmifolius (pro parte misapplied) | Rubus anglocandicans (misapplied) | Rubus laudatus (misapplied)],Rubus,species,APC,accepted,Rubus anglocandicans A.Newton,Exact match of taxon name to an APC-known canonical name once punctuation and filler words are removed (2024-08-16),pro parte misapplied,1
Phaius tancarvilleae,Phaius tancarvilleae,Phaius amboinensis,Phaius amboinensis [alternative possible names: Phaius australis (pro parte misapplied) | Phaius bernaysii (pro parte misapplied) | Phaius australis (misapplied) | Phaius bernaysii (misapplied)],Phaius,species,APC,accepted,Phaius amboinensis Blume,Exact match of taxon name to an APC-known canonical name once punctuation and filler words are removed (2024-08-16),pro parte misapplied,1
Cuscuta racemosa,Cuscuta racemosa,Cuscuta epithymum,Cuscuta epithymum [alternative possible names: Cuscuta suaveolens (pro parte misapplied)],Cuscuta,species,APC,accepted,Cuscuta epithymum (L.) L.,Exact match of taxon name to an APC-known canonical name once punctuation and filler words are removed (2024-08-16),pro parte misapplied,1
Rubus chloocladus,Rubus chloocladus,Rubus anglocandicans,Rubus anglocandicans [alternative possible names: Rubus leucostachys (pro parte misapplied) | Rubus vestitus (pro parte misapplied) | Rubus leucostachys (misapplied)],Rubus,species,APC,accepted,Rubus anglocandicans A.Newton,Exact match of taxon name to an APC-known canonical name once punctuation and filler words are removed (2024-08-16),pro parte misapplied,1
Festuca elatior,Festuca elatior,Lolium arundinaceum,Lolium arundinaceum [alternative possible names: Lolium pratense (pro parte misapplied)],Lolium,species,APC,accepted,Lolium arundinaceum (Schreb.) Darbysh.,Exact match of taxon name to an APC-known canonical name once punctuation and filler words are removed (2024-08-16),pro parte misapplied,1
Picris hieracioides,Picris hieracioides,Picris angustifolia subsp. merxmuelleri,Picris angustifolia subsp. merxmuelleri [alternative possible names: Picris angustifolia subsp. angustifolia (misapplied) | Picris angustifolia subsp. merxmuelleri (misapplied)],Picris,species,APC,accepted,Picris angustifolia subsp. merxmuelleri Lack & S.Holzapfel,Exact match of taxon name to an APC-known canonical name once punctuation and filler words are removed (2024-08-16),pro parte misapplied,1
Rubus discolor,Rubus discolor,Rubus anglocandicans,Rubus anglocandicans [alternative possible names: Rubus anglocandicans (misapplied)],Rubus,species,APC,accepted,Rubus anglocandicans A.Newton,Exact match of taxon name to an APC-known canonical name once punctuation and filler words are removed (2024-08-16),pro parte misapplied,1
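
For "pro parte misapplied" names, the `suggested_name` column packs the suggested name and its alternatives into one string. A minimal base R sketch for splitting that string is shown below; `parse_suggested_name()` is a hypothetical helper, and the pattern is inferred only from the rows printed above, so other string layouts may need adjustments.

```r
# Split one `suggested_name` string into the suggested name, the alternative
# names, and the reason attached to each alternative.
parse_suggested_name <- function(suggested_name) {
  stopifnot(is.character(suggested_name), length(suggested_name) == 1)
  pattern <- "^(.*?)\\s*\\[alternative possible names:\\s*(.*)\\]\\s*$"
  # Strings without a bracketed list have no alternatives
  if (!grepl(pattern, suggested_name, perl = TRUE)) {
    return(list(suggested = trimws(suggested_name),
                alternatives = character(0),
                reasons = character(0)))
  }
  suggested <- sub(pattern, "\\1", suggested_name, perl = TRUE)
  inside <- sub(pattern, "\\2", suggested_name, perl = TRUE)
  # Alternatives are separated by "|", each followed by a reason in parentheses
  parts <- trimws(strsplit(inside, "|", fixed = TRUE)[[1]])
  list(
    suggested = suggested,
    alternatives = sub("\\s*\\([^()]*\\)$", "", parts),
    reasons = sub("^.*\\(([^()]*)\\)$", "\\1", parts)
  )
}

# String copied from the Xanthium strumarium row above
xanthium <- paste0(
  "Xanthium cavanillesii [alternative possible names: ",
  "Xanthium italicum (pro parte misapplied) | ",
  "Xanthium occidentale (pro parte misapplied) | ",
  "Xanthium orientale (pro parte misapplied)]"
)
parse_suggested_name(xanthium)
```

Applied over the whole column (for example with `lapply()`), this gives one list per name, which makes it easier to review the alternatives by hand before accepting the suggested name.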


### Reproducibility

Here are details about the version of the `APCalign` package used here:

In [12]:
packageVersion("APCalign")
default_version()

[1] ‘1.0.2’

In [13]:
citation("APCalign")

To cite package ‘APCalign’ in publications use:

  Wenk E, Cornwell W, Fuchs A, Kar F, Monro A, Sauquet H, Stephens R,
  Falster D (2024). “APCalign: an R package workflow and app for
  aligning and updating flora names to the Australian Plant Census.”
  _Australian Journal of Botany_. R package version: 1.0.1,
  <https://www.biorxiv.org/content/10.1101/2024.02.02.578715v1>.

A BibTeX entry for LaTeX users is

  @Article{,
    title = {APCalign: an R package workflow and app for aligning and updating flora names to the Australian Plant Census},
    journal = {Australian Journal of Botany},
    author = {Elizabeth Wenk and Will Cornwell and Ann Fuchs and Fonti Kar and Anna Monro and Herve Sauquet and Ruby Stephens and Daniel Falster},
    year = {2024},
    note = {R package version: 1.0.1},
    url = {https://www.biorxiv.org/content/10.1101/2024.02.02.578715v1},
  }

## Check taxonomy with WCVP

In [28]:

library(rWCVP)

In [31]:
matched_table <- wcvp_match_names(head(all_traits), name_col="species")

── Matching names to WCVP ──────────────────────────────────────────────────────
ℹ Using the `species` column
! No author information supplied - matching on taxon name only

── Exact matching  names ──
✔ Found 6 of  names

── Matching complete! ──
✔ Matched 6 of 6 names
ℹ Exact (without author): 6
! Names with multiple matches: 0



In [32]:
str(matched_table)

'data.frame':	6 obs. of  36 variables:
 $ family             : chr  "Brassicaceae" "Myrtaceae" "Myrtaceae" "Apiaceae" ...
 $ genus              : chr  "Lepidium" "Eucalyptus" "Melaleuca" "Actinotus" ...
 $ spp                : num  2358 2359 2360 2361 2362 ...
 $ species            : chr  "Lepidium oxytrichum" "Eucalyptus williamsiana" "Melaleuca glomerata" "Actinotus helianthi" ...
 $ current_spp        : num  2358 2359 2360 2361 2362 ...
 $ current_species    : chr  "Lepidium oxytrichum" "Eucalyptus williamsiana" "Melaleuca glomerata" "Actinotus helianthi" ...
 $ taxonrank          : chr  "Species" "Species" "Species" "Species" ...
 $ establishment      : chr  "Alive in NSW, Native" "Alive in NSW, Native" "Alive in NSW, Native" "Alive in NSW, Native" ...
 $ current            : chr  "true" "true" "true" "true" ...
 $ nquadrat           : num  0 0 0 0 0 2
 $ germ8              : num  0 0 0 0 0 0
 $ rect2              : num  0 1 0 2 0 0
 $ germ1              : num  0 1 0 1 0 0
 $ grow1

In [41]:
fireveg_species <- all_traits |> 
    filter(!species %in% c(NA,"")) |>
    select(species) |>
    distinct()

In [43]:
matched_table <- wcvp_match_names(fireveg_species, name_col="species", progress_bar = FALSE)

── Matching names to WCVP ──────────────────────────────────────────────────────
ℹ Using the `species` column
! No author information supplied - matching on taxon name only

── Exact matching  names ──
✔ Found 11718 of  names

── Fuzzy matching 3930

