Commit

Minor text additions to improve article clarity.
In response to @mhesselbarth's comments in Issue #37 I have edited the Summary and body of the article.

To address the question about the `set_*()` functions, I have indicated that the intention is to support upload, but that the functionality (on the Neotoma Database side) is not present yet:

> The `set_*()` functions are intended as a precursor to utilities to upload data directly to Neotoma, although this functionality is not yet available.

The question about how the package conforms to a `tidyverse` model is explained here:

> This package conforms to a `tidyverse` [@wickham2019tidyverse] approach for data management, with methods and data objects that are suited to piping using the `%>%` (or `|>`) pipe convention and with implementations for filtering and other common `dplyr` methods.  The package also now uses "long" `data.frames` and `tibbles` by default, using the `toWide()` function to transform data into "wide" tables for use with common ecological data packages such as `vegan`.  Data objects in the `neotoma2` package now more closely resemble the underlying data model within Neotoma (https://open.neotomadb.org/db_schema) than in the previous package.

This change also adds a BibTeX entry for the `vegan` package.
SimonGoring committed Aug 29, 2023
1 parent 9e7fe22 commit 7905670
Showing 2 changed files with 10 additions and 2 deletions.
8 changes: 8 additions & 0 deletions paper.bib
@@ -52,6 +52,14 @@ @article{uhen2021earthlife
number={2},
year={2021}
}
+
+@Manual{oksanen2022vegan,
+title = {vegan: Community Ecology Package},
+author = {Jari Oksanen and Gavin L. Simpson and F. Guillaume Blanchet and Roeland Kindt and Pierre Legendre and Peter R. Minchin and R.B. O'Hara and Peter Solymos and M. Henry H. Stevens and Eduard Szoecs and Helene Wagner and Matt Barbour and Michael Bedward and Ben Bolker and Daniel Borcard and Gustavo Carvalho and Michael Chirico and Miquel {De Caceres} and Sebastien Durand and Heloisa Beatriz Antoniazi Evangelista and Rich FitzJohn and Michael Friendly and Brendan Furneaux and Geoffrey Hannigan and Mark O. Hill and Leo Lahti and Dan McGlinn and Marie-Helene Ouellette and Eduardo {Ribeiro Cunha} and Tyler Smith and Adrian Stier and Cajo J.F. {Ter Braak} and James Weedon},
+year = {2022},
+note = {R package version 2.6-2},
+url = {https://CRAN.R-project.org/package=vegan},
+}
@article{williams2018neotoma,
title={The Neotoma Paleoecology Database, a multiproxy, international, community-curated data resource},
4 changes: 2 additions & 2 deletions paper.md
@@ -27,15 +27,15 @@ bibliography: paper.bib

# Summary

-The `neotoma2` R package is a tool to access and manipulate data from the Neotoma Paleoecology Database (https://www.neotomadb.org) within the R environment. Neotoma is a community curated paleoecological data resource [@williams2018neotoma], containing nearly 9 million unique observations of paleoecological proxies with global coverage from 37 constituent databases. The package uses the Neotoma API v2.0 [@goring2023api] as a tool to import records from the Neotoma database, allowing researchers to examine taxonomic, spatial and temporal patterns across space and time over the last 5.4 million years. The R package allows researchers to download, and create new records using `get_*()` (e.g., `get_sites()`) and `set_*()` functions (e.g., `set_sites()`) respectively. This provides researchers with the opportunity to develop dynamic workflows that include data generated locally, and not yet submitted to the Neotoma database.
+The `neotoma2` R package is a tool to access and manipulate data from the Neotoma Paleoecology Database (https://www.neotomadb.org) within the R environment. Neotoma is a community curated paleoecological data resource [@williams2018neotoma], containing nearly 9 million unique observations of paleoecological proxies with global coverage from 37 constituent databases. The package uses the Neotoma API v2.0 [@goring2023api] as a tool to import records from the Neotoma database, allowing researchers to examine taxonomic, spatial and temporal patterns across space and time over the last 5.4 million years. The R package allows researchers to download, and create new records using `get_*()` (e.g., `get_sites()`) and `set_*()` functions (e.g., `set_sites()`) respectively. This provides researchers with the opportunity to develop dynamic workflows that include data generated locally, and not yet submitted to the Neotoma database. The `set_*()` functions are intended as a precursor to utilities to upload data directly to Neotoma, although this functionality is not yet available.

The `neotoma2` R package has been under dynamic development for over a year, but has been used for teaching and training [@Goring2023APD]. This release of the `neotoma2` R package is a clean release of the package, with all of the core features provided and extensive test coverage implemented.

# Statement of Need

The `neotoma` R package [@goring2015neotoma] leveraged the Neotoma Paleoecology Database v1.0 API and had been one of the primary tools for researchers working with data from Neotoma [@wang2023plants; @kujawa2016effects; @byun2021extensive]. Changes to the underlying database and a rebuilding of the API required new data objects within the R package to more closely align to the Neotoma data model [@grimm2008neotoma]. Additionally, the original v1.0 API that was accessed by the `neotoma` package was deprecated in 2020, meaning the `neotoma` package could no longer access data from Neotoma.

-The broad user community for Neotoma [@williams2018neotoma; @goring2018nexus] requires a toolset that can access and manage data for each of the more than 40 dataset types within Neotoma and so extensive metadata must be accessed for each record. This package conforms to a `tidyverse` [@wickham2019tidyverse] approach for data management, with data objects that more closely resemble the underlying data model within Neotoma (https://open.neotomadb.org/db_schema). Most importantly the `neotoma2` package provides a toolset for paleoecologists, ecologists, conservation ecologists, archaeologists, and others, to access and examine the broad range of fossil data contained within the Neotoma Paleoecology Database.
+The broad user community for Neotoma [@williams2018neotoma; @goring2018nexus] requires a toolset that can access and manage data for each of the more than 40 dataset types within Neotoma and so extensive metadata must be accessed for each record. This package conforms to a `tidyverse` [@wickham2019tidyverse] approach for data management, with methods and data objects that are suited to piping using the `%>%` (or `|>`) pipe convention and with implementations for filtering and other common `dplyr` methods. The package also now uses "long" `data.frames` and `tibbles` by default, using the `toWide()` function to transform data into "wide" tables for use with common ecological data packages such as `vegan` [@oksanen2022vegan]. Data objects in the `neotoma2` package now more closely resemble the underlying data model within Neotoma (https://open.neotomadb.org/db_schema) than in the previous package. Most importantly the `neotoma2` package provides a toolset for paleoecologists, ecologists, conservation ecologists, archaeologists, and others, to access and examine the broad range of fossil data contained within the Neotoma Paleoecology Database.

Data from the Neotoma Database can be accessed through the public API (https://api.neotomadb.org), through a Postgres database snapshot with a database client [@williams2018neotoma], or through the EarthLife Consortium API [@uhen2021earthlife]. The R package simplifies many of the operations required to assemble and manipulate datasets, and provides functions that support researchers as part of their analytic workflows by linking Neotoma directly to packages within R used for ecological and earth science research, data visualization and statistical analysis.

1 comment on commit 7905670

@mhesselbarth
👍