Scientific names

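Scientific names follow nomenclatural rules, mainly governed by the International Code of Zoological Nomenclature (ICZN) and the International Code of Nomenclature for algae, fungi, and plants (ICN).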

GBIF name parser

Rather than manually verifying whether the names in a dataset are well-formed, you can use the GBIF name parser to do so automatically. rgbif provides the function parsenames() to interact with the GBIF name parser:

parsed_names <- input_data %>%
  distinct(scientific_name) %>%   # Remove duplicate names: you only need to parse each name once
  pull() %>%                      # Transform dataframe to vector: parsenames() needs a vector of names
  rgbif::parsenames()             # Parse names

The name parser will dissect the name into its components and return the following values for a well-formed name:

type = SCIENTIFIC
parsed = TRUE
parsedpartially = FALSE

Information deviating from these criteria could imply that the scientific name is incorrect.

Note that the name parser does not check the existence of a scientific name against an existing registry. That is done by the GBIF species lookup, which verifies the existence of a name in the GBIF backbone taxonomy. Since checklists are sometimes the source of new names, checking them against the backbone is of less importance here.

Potentially incorrect names

The type field indicates whether the name is a true scientific name (type = SCIENTIFIC) or not a scientific name of any kind (type = NO_NAME). It is important to understand that scientific names deviating from the above criteria are not necessarily incorrect: the name parser just gives you a (very) good idea about which names could be wrong.

The parsed and parsedpartially fields indicate whether the name parser was able to parse the name fully, which is not always the case. Parsing can fail because of spelling errors, or because taxonomic, nomenclatural or identification notes were added to the end of the name. In these cases the name will only be parsed partially (parsedpartially = TRUE) or not at all (parsed = FALSE).
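The three criteria can be combined into a single filter that lists the names worth a second look. The following is a minimal sketch, not real parser output: the data frame is a hand-written stand-in for the result of parsenames(), reduced to the columns discussed above, and its rows (including the well-formed "Vulpes vulpes") are illustrative assumptions. It uses base R so that it runs without extra packages.

```r
# Hand-written stand-in for the data frame returned by rgbif::parsenames().
# The values are illustrative assumptions, not actual parser output.
parsed_names <- data.frame(
  scientificname  = c("Vulpes vulpes", "Acmella agg.", "AseroÙ rubra"),
  type            = c("SCIENTIFIC", "INFORMAL", "SCIENTIFIC"),
  parsed          = c(TRUE, TRUE, TRUE),
  parsedpartially = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Keep every name that fails at least one of the three criteria:
# type = SCIENTIFIC, parsed = TRUE, parsedpartially = FALSE
names_to_review <- subset(
  parsed_names,
  type != "SCIENTIFIC" | !parsed | parsedpartially
)

# Show the names that need manual review
print(names_to_review$scientificname)
```

A name ending up in names_to_review is only flagged for review. Whether it actually needs to be changed remains a decision for the author of the checklist.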

Some examples:

For Acmella agg. the name parser returns:

scientificname	type	genusorabove	parsed	parsedpartially	canonicalname	canonicalnamewithmarker	canonicalnamecomplete	rankmarker
Acmella agg.	INFORMAL	Acmella	TRUE	FALSE	Acmella	Acmella	Acmella	agg.

Here, the output indicates that Acmella agg. is a scientific name with some informal addition (type = "INFORMAL"). The decision whether or not to change the name is up to the author of the checklist.

For AseroÙ rubra the name parser returns:

scientificname	type	genusorabove	parsed	parsedpartially	canonicalname	canonicalnamewithmarker	canonicalnamecomplete	rankmarker
AseroÙ rubra	SCIENTIFIC	Asero	Ù	TRUE	TRUE	Asero	Asero	Asero Ù

The output indicates that the name was parsed only partially (parsedpartially = TRUE). This is due to a typo, i.e. the species name should be Asero rubra. There are two options to correct the scientific name in this case:

In the raw data file (= permanently, recommended in this case)
In the R code, using recode:

# recode() takes old = new pairs: the misspelled name goes on the left
input_data %<>% mutate(scientific_name = recode(scientific_name,
  "AseroÙ rubra" = "Asero rubra"
))
