This repository contains a list of marine fauna found in the Hawaii Exclusive Economic Zone (EEZ), along with the code used to update taxonomy and consolidate species across all data sources.
A number of sources were used to compile this list:
- The Bishop Museum's Marine Invertebrates of the Hawaiian Islands Checklist, the primary source, which can be accessed here.
- Invert-E-Base's Marine Invertebrates of Kaneohe Bay Checklist found here.
- Ocean Biodiversity Information System's (OBIS) records for the Hawaii Exclusive Economic Zone hosted here.
- Lifewatch's records for the Hawaii Exclusive Economic Zone hosted here.
- ARMS Data
- Various professionals in their respective fields
- The 2022 Ecosystem Status Report for Hawaii (Appendix A1), for local names
- NOAA's Pacific Island Region Cetacean Research Program, for marine mammals
The Bishop Museum list used as the primary source came as raw text data stored in `.htm` files. These files were last updated in 2001 according to their website. The first step in creating the list was parsing the raw text into a more user-friendly format.
- Extract the text data from the `.htm` files.
- Use regular expressions to extract species names that follow the standard binomial format (Genus species).
- Convert the resulting vector of names into a dataframe that can be manipulated with `tidyverse` tools.
- Collapse sp. and cf. entries to genus-level identifications.
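
The parsing steps above can be sketched roughly as follows. This is a minimal Python illustration, not the repository's own code (which is described as using `tidyverse` tools); the function names `html_to_text` and `extract_names`, the regular expression, and the list of qualifiers are all assumptions made for the example.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Return the visible text of an .htm document as one string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Assumed pattern: a capitalized genus followed by a lowercase word.
NAME_RE = re.compile(r"\b([A-Z][a-z]+)\s+([a-z]+)\b")

# Assumed open-nomenclature qualifiers that are collapsed to genus level.
QUALIFIERS = {"sp", "spp", "cf"}


def extract_names(text):
    """Extract 'Genus species' names, collapsing sp./cf. entries to the genus."""
    names = []
    for genus, epithet in NAME_RE.findall(text):
        if epithet in QUALIFIERS:
            names.append(genus)
        else:
            names.append(f"{genus} {epithet}")
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(names))
```

A pattern this simple will also match ordinary phrases that happen to look like a binomial (for example, a capitalized author name followed by "and"), so in a sketch like this the later taxonomic lookup acts as the filter for false positives.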
Once the Bishop Museum data was parsed into the dataframe format, we were then able to merge data from the supplementary sources with the primary source to produce the end result.
- Look up each taxon using the World Register of Marine Species (WoRMS) API.
- If the taxon was found, save several columns of interest from the returned record.
- If the taxon wasn't found, use fuzzy searching to attempt a second lookup.
- Combine the fuzzy-matched results with the exact-match results.
- Repeat for all data sources.
- Combine all data sources to produce a comprehensive species list of marine invertebrates for the Hawaii EEZ.
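
The lookup-and-merge workflow above can be sketched as follows. This is a hedged Python illustration rather than the repository's implementation: `resolve` and `combine` are hypothetical names, the two lookup callables stand in for whatever exact and fuzzy WoRMS requests the real code makes, and deduplicating on `valid_AphiaID` is an assumption about how sources might be consolidated.

```python
from typing import Callable, Iterable, Optional

# A lookup takes a scientific name and returns a WoRMS-style record, or None.
Lookup = Callable[[str], Optional[dict]]


def resolve(names: Iterable[str], exact_lookup: Lookup,
            fuzzy_lookup: Lookup, origin: str) -> list:
    """Look up each name exactly, falling back to a fuzzy lookup."""
    rows = []
    for name in names:
        record = exact_lookup(name)
        if record is None:
            record = fuzzy_lookup(name)
        if record is None:
            continue  # still unresolved; skipped in this sketch
        rows.append({
            "scientificname": name,  # the name as it appeared in the source
            "AphiaID": record.get("AphiaID"),
            # Assumption: worms_name maps to the record's scientificname field.
            "worms_name": record.get("scientificname"),
            "valid_AphiaID": record.get("valid_AphiaID"),
            "status": record.get("status"),
            "rank": record.get("rank"),
            "origin": origin,
        })
    return rows


def combine(*sources: list) -> list:
    """Merge rows from all sources, one row per valid_AphiaID (assumed key)."""
    merged = {}
    for rows in sources:
        for row in rows:
            key = row["valid_AphiaID"]
            if key not in merged:
                merged[key] = dict(row)
            elif row["origin"] not in merged[key]["origin"].split("; "):
                # Record every source list that contributed this taxon.
                merged[key]["origin"] += "; " + row["origin"]
    return list(merged.values())
```

Passing the lookups in as callables keeps the control flow testable without network access; in practice each callable would wrap a request to the WoRMS API.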
Almost all data and code needed to reproduce the marine invertebrate list can be found in this repository. The exception is the OBIS dataset, which was too large to store in a GitHub repository. This dataset must be downloaded directly from their data explorer located here.
Overview of the variables output in `HawaiiMarineFaunaList.csv`:
Variable (group) | Description |
---|---|
scientificname | Original scientific name given to the WoRMS API to look up |
AphiaID | The Aphia ID associated with the original scientific name |
worms_name | The name returned from the WoRMS API lookup |
valid_AphiaID | The Aphia ID associated with the scientific name returned from WoRMS |
status | Current taxonomic status of the original scientific name |
rank | Taxonomic ranking of the returned listing |
kingdom:species | Taxonomic hierarchy for the valid Aphia ID |
origin | The original source list where the data came from. |
endemic | Binary classification detailing if species is endemic to Hawaii (Limited data) |
deepsea | Binary classification detailing if species is deep sea (Limited to corals) |
distribution | Data regarding the distribution of the species (If known) |
localName | Known names used locally to identify the entry |
common | Common name for the species |
fishery | Info regarding which fishery the species is associated with |
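
As a quick way to explore the output file, the sketch below tallies rows by any of the columns listed above. It is an illustration only: `load_rows` and `count_by` are hypothetical helpers, and the file is assumed to be a comma-separated file with a header row.

```python
import csv
from collections import Counter


def load_rows(handle):
    """Read a CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def count_by(rows, column):
    """Count rows per value of a column, grouping empty cells together."""
    return Counter(row.get(column) or "(missing)" for row in rows)


# Example usage (assumes the file is in the working directory and UTF-8 encoded):
# with open("HawaiiMarineFaunaList.csv", newline="", encoding="utf-8") as f:
#     rows = load_rows(f)
# print(count_by(rows, "rank").most_common())
# print(count_by(rows, "origin").most_common())
```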
If you find any problems with the list and/or would like to contribute in any way, please feel free to create a pull request, open an issue, or send me a message.