Add slots to Biosample class to disambiguate standard MG metadata vs long-read MG metadata #1937

Open
pkalita-lbl opened this issue Apr 22, 2024 · 3 comments

@pkalita-lbl
Collaborator

Montana correctly pointed out in this comment that our implementation of long-read metagenomics was somewhat incomplete.

The changes implemented for microbiomedata/submission-schema#168 added a new JgiMgLrInterface class. It reuses slots that are also used by the JgiMgInterface class. That makes sense from a pure LinkML perspective, but it overlooks an important point about how submission data is brought into MongoDB, where it must conform to nmdc-schema.

In the submission data, one sample's metadata might be spread across multiple submission-schema class instances (e.g. a SoilInterface instance and a JgiMgInterface instance), linked together by the unique sample name. When going into Mongo, those instances get collapsed into one instance of the nmdc-schema Biosample class. The issue is that if, in the submission-schema data, one sample has both a JgiMgInterface instance and a JgiMgLrInterface instance, the slot values for one will overwrite the other when squashing into a Biosample instance.
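
To make the collision concrete, here is a minimal sketch that assumes the collapse behaves like a plain dictionary merge keyed by sample name. The `collapse` helper, the `samp_name` key, and the numeric values are hypothetical illustrations, not the actual translation code.

```python
from collections import defaultdict


def collapse(rows_by_interface):
    """Merge per-interface rows into one flat record per sample name.

    Hypothetical stand-in for the submission -> Biosample translation.
    """
    merged = defaultdict(dict)
    for rows in rows_by_interface.values():
        for row in rows:
            # A later interface silently overwrites same-named slots.
            merged[row["samp_name"]].update(row)
    return dict(merged)


# One sample with both short-read and long-read MG metadata (made-up values).
submission = {
    "JgiMgInterface": [{"samp_name": "S1", "dna_absorb1": 1.8}],
    "JgiMgLrInterface": [{"samp_name": "S1", "dna_absorb1": 2.0}],
}

biosamples = collapse(submission)
print(biosamples["S1"])
```

Under these assumptions only one `dna_absorb1` value survives for sample `S1`; the value from whichever interface is merged first is lost.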

This is the reason why we currently need to have pairs of slots like dna_absorb1 and rna_absorb1 instead of just absorb1. With the introduction of long-read MG metadata, these need to become triples of slots (e.g. rna_absorb1, dna_absorb1, and the new dna_lr_absorb1).
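
As a rough illustration of why distinct slot names avoid the overwrite, the sketch below renames long-read slots before merging. The `SLOT_RENAMES` table and `disambiguate` helper are hypothetical; only the slot name dna_lr_absorb1 comes from this issue, and in practice the distinction would live in the schema rather than in a rename step.

```python
# Hypothetical per-interface rename rules; only dna_lr_absorb1 is named in this issue.
SLOT_RENAMES = {
    "JgiMgLrInterface": {"dna_absorb1": "dna_lr_absorb1"},
}


def disambiguate(interface_name, row):
    """Return a copy of row with interface-specific slot names applied."""
    renames = SLOT_RENAMES.get(interface_name, {})
    return {renames.get(slot, slot): value for slot, value in row.items()}


# Made-up values for one sample that has both kinds of MG metadata.
short_read = disambiguate("JgiMgInterface", {"samp_name": "S1", "dna_absorb1": 1.8})
long_read = disambiguate("JgiMgLrInterface", {"samp_name": "S1", "dna_absorb1": 2.0})

# With distinct slot names, merging no longer loses either value.
biosample = {**short_read, **long_read}
print(biosample)
```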

pkalita-lbl self-assigned this Apr 22, 2024
pkalita-lbl transferred this issue from microbiomedata/nmdc-server Apr 22, 2024
@mslarae13
Contributor

Checking with Alicia if NMDC needs to store these slots. If so, which ones?

microbiomedata/issues#413 (comment)

@pkalita-lbl
Collaborator Author

Removing this from Sprint 35. Not adding to a future sprint right now because it sounds like we need further input before proceeding.

@mslarae13
Contributor

Decision was made on 06/12 during the metadata meeting

From @aclum in microbiomedata/issues#413 (comment)

would like to keep dna_isolate_meth and map it to a slot on NMDC's Extraction class.

We want to track dna_isolate_meth in NMDC, but it is the only one of these slots we need to keep.
We need to:

  • Add dna_isolate_meth_long and rename dna_isolate_meth to dna_isolate_meth_short

POST BERK
