neon to nmdc-schema mappings #928

Closed
turbomam opened this issue May 31, 2023 · 5 comments

@turbomam
Collaborator

turbomam commented May 31, 2023

  • I will define a neon.schema prefix in the nmdc-schema, with an obvious placeholder expansion
  • @bmeluch and others will create an assets/neon_nmdc_term_mappings.tsv document in the nmdc-schema repo with four/five columns
    • nmdc_class
    • nmdc_slot
    • neon_table
    • neon_column
    • notes?
  • I will write some code that asserts mappings in slot_usages like
mappings:
- neon.schema:<neon_table>.<neon_column>
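
A minimal sketch of how that last step might look, assuming the TSV uses exactly the column names listed above. The function name `build_slot_usage_mappings` and the sample row (including the NEON table and column names) are hypothetical and only illustrate the shape of the data; this is not the code that was eventually written for the nmdc-schema.

```python
import csv
import io
from collections import defaultdict

# Prefix name taken from the plan above; its expansion is still a placeholder.
NEON_PREFIX = "neon.schema"


def build_slot_usage_mappings(tsv_text):
    """Group TSV rows into {nmdc_class: {nmdc_slot: {"mappings": [...]}}}."""
    usages = defaultdict(dict)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Build the mapping in the neon.schema:<neon_table>.<neon_column> form.
        curie = f"{NEON_PREFIX}:{row['neon_table']}.{row['neon_column']}"
        slot_usage = usages[row["nmdc_class"]].setdefault(
            row["nmdc_slot"], {"mappings": []}
        )
        # Skip exact duplicates so repeated rows do not repeat the mapping.
        if curie not in slot_usage["mappings"]:
            slot_usage["mappings"].append(curie)
    return dict(usages)


# Hypothetical row: the NEON table and column names are invented for illustration.
SAMPLE_TSV = (
    "nmdc_class\tnmdc_slot\tneon_table\tneon_column\tnotes\n"
    "ProcessedSample\tdna_concentration\texample_neon_table\texample_neon_column\tillustrative only\n"
)

mappings = build_slot_usage_mappings(SAMPLE_TSV)
```

The nested dictionary mirrors the slot_usage layout, so each class entry could be serialized to YAML and merged into the schema by whatever tooling the repo already uses.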
@turbomam turbomam self-assigned this May 31, 2023
@turbomam turbomam mentioned this issue Jun 1, 2023
2 tasks
@turbomam
Collaborator Author

turbomam commented Jun 9, 2023

Nice work, @brynnz22 and @bmeluch

Are you requesting that dna_concentration and dna_absorb1 should be attributes of the Extraction process? I suggest associating those slots with the ProcessedSample that actually has those characteristics. Maybe we should relocate sample_mass too.

Think about it this way: if a schema had a Cooking process, and a served_to slot that pointed to a Person, then a satisfaction slot should be associated with the Person, not the Cooking.

It doesn't matter how our data source models the data, especially if it's coming from relational tables, which are notorious for mixing concerns. It's our responsibility to determine what entities have what qualities, and the nmdc-schema supports doing that in a way that reflects reality better than most relational databases.

Sometimes you can double-check your intentions by looking at the other things that use those slots. dna_concentration and dna_absorb1 are only associated with Biosample, which is a MaterialEntity, not a PlannedProcess.
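
That double-check can be sketched over a toy, dictionary-shaped schema. Everything here is an assumption made for illustration: `TOY_SCHEMA` only encodes what this thread states (the two DNA slots on Biosample, sample_mass on Extraction), Extraction's parent is assumed to be PlannedProcess, and the helper names are hypothetical. It does not read the real nmdc-schema.

```python
# Toy stand-in for a schema: class name -> parent class and directly used slots.
TOY_SCHEMA = {
    "MaterialEntity": {"is_a": None, "slots": []},
    "PlannedProcess": {"is_a": None, "slots": []},
    "Biosample": {
        "is_a": "MaterialEntity",
        "slots": ["dna_concentration", "dna_absorb1"],
    },
    # Assumption: Extraction is modeled as a kind of PlannedProcess.
    "Extraction": {"is_a": "PlannedProcess", "slots": ["sample_mass"]},
}


def ancestors(schema, class_name):
    """Return the class itself followed by its is_a chain up to the root."""
    chain = []
    current = class_name
    while current is not None:
        chain.append(current)
        current = schema[current]["is_a"]
    return chain


def classes_using_slot(schema, slot_name):
    """List the classes that directly use the given slot."""
    return sorted(name for name, cls in schema.items() if slot_name in cls["slots"])


def slot_is_only_on(schema, slot_name, root):
    """True when every class using the slot descends from the given root class."""
    users = classes_using_slot(schema, slot_name)
    return bool(users) and all(root in ancestors(schema, user) for user in users)


dna_users = classes_using_slot(TOY_SCHEMA, "dna_concentration")
dna_is_material = slot_is_only_on(TOY_SCHEMA, "dna_concentration", "MaterialEntity")
mass_is_material = slot_is_only_on(TOY_SCHEMA, "sample_mass", "MaterialEntity")
```

In this toy data, sample_mass is the one slot that sits on a process rather than on a material, which is the same mismatch the suggestion to relocate it points at.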

@brynnz22
Contributor

brynnz22 commented Jun 9, 2023

@turbomam We were discussing which ones it belongs to, and ProcessedSample makes a lot of sense. I think since sample_mass was in the Extraction class, we thought those should go in there too. But it makes sense to move those slots into ProcessedSample!

@aclum
Contributor

aclum commented Jun 30, 2023

Are we done with this?

@aclum
Contributor

aclum commented Jul 14, 2023

@turbomam are we done with this?

@aclum aclum closed this as completed Sep 12, 2023
@turbomam
Collaborator Author

Might want to revisit this some day
