Consider an extensible metadata model? #31
The original motivation for BB was mirroring external data sets that generally already have a full metadata record elsewhere. BB's minimal metadata was intended to be just enough to tell users where the data actually came from, prompt them to cite it if appropriate, and link them to the documentation (that full metadata record).
to give a user a skeleton record from an EML/ISO/DIF/whatever record, which they then edit as appropriate? (And accommodate some extra optional fields within that, too.)
Yeah, that makes sense. I could still imagine that it might be useful for a user to add their own keywords or tags, e.g. to associate data with a project, but of course there are other places they could record that. An automated import of the minimal data might make sense (e.g. the DataCite API will generate a citation for anything with a DOI), but it probably involves supporting too many different formats. I'm happy to close this as out of scope, but it's your call.
Chipping away at this - I have implemented a 'source generator' function for Zenodo data sets (
👏 Very cool! I think having even one of these is a nice proof of concept. A user who commonly accesses data through a specific platform could then at least template off your example more easily. And given the ease of depositing data in Zenodo, it seems like a good one to start with. Nice work!
I like the minimal metadata required by bb_source() that we can search from the bb_data_sources() table. For large collections though, I wonder if it would make sense to support some additional optional fields that users could specify to make it easier to search their collections later, e.g. a keyword field, or file type, etc?

Going further: much ink has been spilled over metadata descriptions for scientific data, but I am curious whether it would be worth cribbing from some of those. For example, bowerbird could adopt https://schema.org/Dataset or DCAT2 as the basis for its metadata representation. I imagine most fields would still be optional, but this would allow for somewhat greater expressiveness. Perhaps more relevantly, these fields could be auto-populated when importing data from sources that already expose metadata in these formats (e.g. Zenodo, data.gov, and many others serve schema.org/Dataset metadata descriptions).
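To make the auto-population idea concrete, here is a minimal, language-neutral sketch in Python (not bowerbird's R API). It assumes the source serves a schema.org/Dataset record as JSON-LD. The function name `skeleton_from_schema_org`, the mapping `FIELD_MAP`, and the skeleton field names are all hypothetical, chosen only for illustration, and the example record is made up.

```python
import json

# Assumed mapping from schema.org/Dataset properties to skeleton fields.
# The field names on the right are hypothetical, not bowerbird's own.
FIELD_MAP = {
    "name": "name",
    "description": "description",
    "url": "doc_url",
    "identifier": "id",
    "license": "license",
    "keywords": "keywords",
}


def skeleton_from_schema_org(jsonld_text):
    """Build a partly filled source record from a schema.org/Dataset JSON-LD string."""
    record = json.loads(jsonld_text)
    if record.get("@type") != "Dataset":
        raise ValueError("expected a schema.org/Dataset record")

    # Start with every field empty so the user can see what is left to fill in.
    skeleton = {field: None for field in FIELD_MAP.values()}
    for prop, field in FIELD_MAP.items():
        if prop in record:
            skeleton[field] = record[prop]

    # schema.org allows keywords as a comma-separated string or as a list.
    keywords = skeleton["keywords"]
    if isinstance(keywords, str):
        skeleton["keywords"] = [k.strip() for k in keywords.split(",") if k.strip()]
    return skeleton


# Made-up record, for illustration only.
example = """
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "Example sea ice extent",
  "url": "https://example.org/datasets/sea-ice",
  "keywords": "sea ice, Antarctica"
}
"""
print(skeleton_from_schema_org(example))
```

Fields that the source does not provide stay empty, so the user still edits the skeleton by hand, and every field beyond the minimal set remains optional.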