Standardize data.frame's across data sources #38
Comments
@hlapp sorry, hadn't looked at this yet. Will do so today.
Rundown of what data objects the functions currently output:
Common fields among functions that could be standardized:
The above aren't real columns in the outputs yet, but they are the ones I think could be standard across most of the data sources in this package. Part of what makes this hard is that we have a diverse set of data sources, from morphological trait data, to nativity status, to molecular data. I think a way forward could be to provide a suite of functions that transform the data to standardize column names, etc., so that outputs can be easily combined across taxa and data sources, at least across the standard fields. Other fields could be included as additional columns at the end.
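A minimal sketch of what one such transformation function might look like, in base R. Everything named here is an assumption made for illustration: the `standard_fields` vector, the `standardize_columns()` helper, and the toy data frames are hypothetical and are not part of the package.

```r
# Hypothetical standard fields; the thread does not settle on an actual list.
standard_fields <- c("name", "authority", "source")

# Rename columns using a named character vector (old name -> standard name),
# add any missing standard fields as NA, and move standard fields to the front.
# Non-standard columns are kept as additional columns at the end.
standardize_columns <- function(df, mapping, fields = standard_fields) {
  hits <- names(df) %in% names(mapping)
  names(df)[hits] <- unname(mapping[names(df)[hits]])
  for (f in setdiff(fields, names(df))) {
    df[[f]] <- rep(NA_character_, nrow(df))
  }
  extras <- setdiff(names(df), fields)
  df[, c(fields, extras), drop = FALSE]
}

# Toy inputs standing in for two differently shaped data sources (made-up values).
traits <- data.frame(species = c("Quercus robur", "Pinus contorta"),
                     height_m = c(30, 40),
                     stringsAsFactors = FALSE)
nativity <- data.frame(taxon = "Quercus robur",
                       author = "L.",
                       native = TRUE,
                       stringsAsFactors = FALSE)

traits_std <- standardize_columns(traits, c(species = "name"))
nativity_std <- standardize_columns(nativity,
                                    c(taxon = "name", author = "authority"))
traits_std$source <- "traits"
nativity_std$source <- "nativity"

# Combine across sources on the standard fields only.
combined <- rbind(traits_std[standard_fields], nativity_std[standard_fields])
```

Restricting the `rbind()` to the standard fields keeps the combined result rectangular, while each standardized data frame still carries its source-specific columns for anyone who needs them.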
You mean columns among data frames?
yes
@hlapp I don't know what's available to add
@sckott do you mean the classification, or family and genus for taxa that are species?
@hlapp I mean I favor leaving authority off the name, and having it in a separate column, if provided. If the data record has its lowest ID at family, e.g., then I don't know what best practice is. Perhaps we'd leave
Right now, we haven't thought about the outputs of each function. I believe all are data.frames. I'll look at each to see what they currently output and what is shared among them, and then see what standard format we can use that will also make combining outputs easier.
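One way to check what is shared among the outputs, sketched in base R. The `outputs` list below is a made-up stand-in for the data.frames returned by different functions; its column names are hypothetical.

```r
# Toy stand-ins for the outputs of three functions (column names are made up).
outputs <- list(
  a = data.frame(name = "x", value = 1, stringsAsFactors = FALSE),
  b = data.frame(name = "y", source = "s", stringsAsFactors = FALSE),
  c = data.frame(name = "z", value = 2, source = "t", stringsAsFactors = FALSE)
)

col_names <- lapply(outputs, names)

# Columns present in every output.
shared <- Reduce(intersect, col_names)

# Number of outputs in which each column name appears.
counts <- table(unlist(col_names))
```

Columns that appear in most but not all outputs, visible in `counts`, are natural candidates for standard fields that some sources would simply leave as `NA`.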