This repository has been archived by the owner on Sep 9, 2022. It is now read-only.

use case: real world eg of data cleaning with gbif data #22

Closed
sckott opened this issue Dec 30, 2016 · 4 comments

Comments

@sckott
Collaborator

sckott commented Dec 30, 2016

from https://doi.org/10.1093/jxb/erw451

Occurrence data for each taxon were downloaded from the Global Biodiversity Information Facility (GBIF, http://www.gbif.org) using the RGBIF package in R (Chamberlain et al., 2016; data accessed 1 and 2 July 2016). Occurrence data for the Zambezian C3–C4 within Alloteropsis semialata were taken from Lundgren et al. (2015, 2016). All occurrence data were cleaned by removing any anomalous latitude or longitude points, points falling outside of a landmass, and any points close to GBIF headquarters in Copenhagen, Denmark, which may result from erroneous geolocation. To avoid repeated occurrences, latitude and longitude decimal degree values were rounded to two decimal places, and any duplicates at this resolution were removed. These filters are commonly applied to data extracted from GBIF (Zanne et al., 2014).
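For orientation, the cleaning steps quoted above can be sketched roughly as follows. This is a language-neutral illustration in plain Python, not scrubr or rgbif code and not the paper's actual method. The function names, the 10 km radius, the treatment of (0, 0) as anomalous, and the headquarters coordinates are all assumptions made for the example. The "outside of a landmass" filter is left out because it needs a land polygon dataset.

```python
import math

# Approximate location of the GBIF Secretariat in Copenhagen.
# Assumed value for illustration only; verify before real use.
GBIF_HQ = (55.70, 12.56)  # (latitude, longitude)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def clean_occurrences(points, hq_radius_km=10.0, digits=2):
    """Filter (lat, lon) pairs following the steps described in the paper.

    hq_radius_km and the (0, 0) rule are hypothetical choices; the paper
    does not say what counts as "anomalous" or "close".
    """
    seen = set()
    cleaned = []
    for lat, lon in points:
        # Drop incomplete coordinates.
        if lat is None or lon is None:
            continue
        # Drop impossible coordinates.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue
        # Drop (0, 0), a common placeholder for missing data (assumption).
        if lat == 0 and lon == 0:
            continue
        # Drop points close to GBIF headquarters.
        if haversine_km(lat, lon, *GBIF_HQ) <= hq_radius_km:
            continue
        # Round to the chosen resolution and drop duplicates at that resolution.
        key = (round(lat, digits), round(lon, digits))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(key)
    return cleaned
```

Calling `clean_occurrences` on a list of `(lat, lon)` tuples returns the surviving points, rounded to two decimal places, in their original order. In the vignette itself the same steps would be written in R against the data frame returned by rgbif.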

@SriramRamesh

Hi, I am interested in biodiversity data cleaning for GSOC 17. Can I create a function for this use case?

@sckott
Collaborator Author

sckott commented Feb 27, 2017

hi @SriramRamesh I wasn't thinking of a separate function for this, but an example to put in a vignette and/or README.

we may want to add additional functions to this pkg if warranted

@SriramRamesh

Did you mean that we can mention this kind of usage in the README so that users can customize to their dataset?

@sckott
Collaborator Author

sckott commented Feb 27, 2017

no.

the idea is to make a vignette like https://github.com/ropensci/scrubr/blob/master/inst/vign/scrubr_vignette.Rmd that has one or more use cases like the one described above, i.e. a set of code replicating as closely as possible what they did in the paper

a second issue is that, in the process of doing that, we may find we need additional functions, which we can talk about if that comes up
