Next steps: Slinky as a layer of enhancement on top of static holdings #44

Open
amoeba opened this issue Sep 17, 2021 · 1 comment
Labels
idea Ideas for the project

amoeba commented Sep 17, 2021

[I'm filing this not because we're moving on to next steps already but just to file it and let people chime in with ideas]

We can always find ways to improve the metadata we have, but most metadata are written once, possibly checked and tweaked by a moderation team, and then left set in stone. What if we could extend the ways Slinky already improves metadata (i.e., co-reference resolution, minting/finding party identifiers) beyond what we're doing now?

I got to thinking about this after one of our recent mobilization calls, and a recent example prompted me to write this ticket. Take the metadata record at https://search.dataone.org/view/urn%3Auuid%3A84f4e415-53c3-55e9-bb6d-3ee34419595d. It's a JSON-LD record from NPDC. The abstract starts:

Data from Polarstern cruise PS94 in the Arctic in 2015 with chief scientist Ursula Schauer.

There are a few really key elements in this free-text description that we could extract into linked data to make for a much richer landing page: (1) Polarstern, (2) PS94, (3) Arctic, (4) 2015, (5) Ursula Schauer, (6) Ursula Schauer as a Chief Scientist, and (7) the role of Chief Scientist.
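To make the idea concrete, here's a rough sketch of pulling those elements out of the abstract and expressing them as JSON-LD. Everything in it is an assumption for illustration: the regex patterns are hand-tuned to this one sentence (a real version would use a proper NER model and entity linking), the `extract` and `to_jsonld` helpers are hypothetical, and the choice of schema.org terms (`about`, `spatialCoverage`, `temporalCoverage`, and the `Role` pattern for Chief Scientist) is just one possible mapping, not something Slinky does today.

```python
import json
import re

ABSTRACT = ("Data from Polarstern cruise PS94 in the Arctic in 2015 "
            "with chief scientist Ursula Schauer.")

# Hypothetical patterns, tuned to this single sentence for illustration only.
PATTERNS = {
    "vessel": r"from (?P<v>[A-Z][a-z]+) cruise",
    "cruise": r"cruise (?P<v>[A-Z]{2}\d+)",
    "region": r"in the (?P<v>[A-Z][a-z]+)",
    "year": r"\b(?P<v>(?:19|20)\d{2})\b",
    "chief_scientist": r"chief scientist (?P<v>[A-Z][a-z]+ [A-Z][a-z]+)",
}


def extract(text):
    """Return a dict of the elements each pattern finds in the text."""
    found = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            found[key] = match.group("v")
    return found


def to_jsonld(found):
    """Map extracted elements onto schema.org terms (an assumed mapping)."""
    doc = {"@context": "https://schema.org/", "@type": "Dataset"}
    if "region" in found:
        doc["spatialCoverage"] = {"@type": "Place", "name": found["region"]}
    if "year" in found:
        doc["temporalCoverage"] = found["year"]
    about = [{"@type": "Thing", "name": found[k]}
             for k in ("vessel", "cruise") if k in found]
    if about:
        doc["about"] = about
    if "chief_scientist" in found:
        # schema.org Role pattern: the property is repeated inside the Role.
        doc["contributor"] = {
            "@type": "Role",
            "roleName": "Chief Scientist",
            "contributor": {"@type": "Person",
                            "name": found["chief_scientist"]},
        }
    return doc


if __name__ == "__main__":
    print(json.dumps(to_jsonld(extract(ABSTRACT)), indent=2))
```

The plain-string names here would still need to be resolved to identifiers (e.g. an ORCID for the person or a gazetteer entry for the region) before this counts as real linked data.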

Extracting and linking information like this would be a really nice enhancement for a lot of metadata records, but especially our science-on-schema ones which will tend to be more minimal. We might also think about how we preserve any enhancements in our Data Package exports.

Specific things we could build on top:

  • Named entity extraction (with semantic linkages)
  • FIPS codes or gazetteer linking for arbitrary spatial bounding boxes (dataset X has coverage in FIPS codes A, B, and C)
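For the bounding box item, the core operation could be as simple as a rectangle overlap test against a gazetteer. A minimal sketch, assuming a tiny in-memory gazetteer: the region codes `A` through `D` and their coordinates are made-up placeholders (not real FIPS codes), `BBox`, `overlaps`, and `regions_for` are hypothetical names, and boxes that cross the antimeridian are not handled.

```python
from typing import Dict, List, NamedTuple


class BBox(NamedTuple):
    west: float
    south: float
    east: float
    north: float


def overlaps(a: BBox, b: BBox) -> bool:
    """True if the two boxes share any area or touch along an edge."""
    return (a.west <= b.east and b.west <= a.east
            and a.south <= b.north and b.south <= a.north)


# Placeholder gazetteer: codes and coordinates are invented for illustration.
GAZETTEER: Dict[str, BBox] = {
    "A": BBox(0, 0, 10, 10),
    "B": BBox(10, 0, 20, 10),
    "C": BBox(20, 0, 30, 10),
    "D": BBox(0, 20, 10, 30),
}


def regions_for(dataset_box: BBox, gazetteer: Dict[str, BBox]) -> List[str]:
    """Return the sorted codes of every region the dataset box overlaps."""
    return sorted(code for code, box in gazetteer.items()
                  if overlaps(dataset_box, box))


if __name__ == "__main__":
    dataset_x = BBox(west=5, south=5, east=25, north=8)
    print(regions_for(dataset_x, GAZETTEER))
```

A real version would use actual region polygons and a spatial index rather than bounding boxes and a linear scan, since a box-on-box test over-reports regions whose rectangles overlap but whose true shapes do not.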
@amoeba amoeba added the idea Ideas for the project label Sep 17, 2021
@amoeba amoeba self-assigned this Sep 17, 2021

mpsaloha commented Sep 18, 2021 via email
