Coding: SSSOM Output #62

Open
joeflack4 opened this issue Feb 24, 2022 · 11 comments

@joeflack4

joeflack4 commented Feb 24, 2022

Basic description

Feature request. Allow an option for mapping outputs to be in SSSOM format.
Action: Generate mapping outputs.
Output: Rather than the existing output format, this would output mappings in SSSOM.

Related

#30

Additional information

Implementation details

Can add a new CLI option, e.g. --output-format, with options such as 'standard' (which I guess would be what you have now), and 'sssom'.
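
A minimal sketch of what such an option could look like, assuming the CLI is built with `argparse`. The parser and the `build_parser` helper are hypothetical and are not taken from the OMOP2OBO codebase; only the flag name and the two choices come from the proposal above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: the real OMOP2OBO CLI may be organized differently.
    parser = argparse.ArgumentParser(description="Generate mapping outputs (sketch)")
    parser.add_argument(
        "--output-format",
        choices=["standard", "sssom"],
        default="standard",
        help="'standard' keeps the existing output; 'sssom' writes SSSOM.",
    )
    return parser


# argparse exposes --output-format as the attribute output_format.
args = build_parser().parse_args(["--output-format", "sssom"])
print(args.output_format)
```

Defaulting to 'standard' would keep existing behavior unchanged for anyone who does not pass the flag.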

Resources

(@matentzn: Can you comment on the main difference between codebases (1) and (2)?)

  1. https://github.com/mapping-commons/sssom-py - Source code.
  2. https://github.com/mapping-commons/sssom - Source code. And good documentation in the README.md. Good summary of the main 3-6 fields ([subject, predicate, object] x [id, label]).
  3. https://mapping-commons.github.io/sssom/spec/ - Full specification.
  4. https://mapping-commons.github.io/sssom/Mapping/ - Full description of all required and optional fields.
@matentzn

@callahantiff I see in various places of the obo-verse that you have been working on an updated release of OMOP2OBO. Just out of curiosity, is an SSSOM output of the data anywhere on the horizon?

@joeflack4
Author

@callahantiff Just as an FYI, Tiffany, I also worked on an OMOP Vocab to FHIR converter recently. Maybe useful for you to know about, not sure: https://github.com/jhu-bids/omop-vocab-on-fhir/tree/main

@callahantiff
Owner

Hi @matentzn and @joeflack4! Apologies for the late response, I am just getting back from ISMB (lots of great ontology talks). I would love to have SSSOM as an included output type that we can mention in the paper (I am finally working on that again too). When I worked on this before, I ran into a bunch of questions when trying to align 1:M mappings. Perhaps I can post a few example translations of different types of mappings into SSSOM here and we can discuss?

Thanks so much for still being interested in the mappings and not losing hope. I really, really appreciate it! Thank you @joeflack4 for including your example as well, that will be very helpful!

@matentzn

@callahantiff excellent, all ears, but you also always have the opportunity to just call us and we can go through it in a face2face. Anything is fine :)

@cmungall

Looking at these with @justaddcoffee

It seems a lot of these are not simple pairwise mappings as handled by SSSOM.

However, we are currently talking about postcomposition strategies for SSSOM, and many of these seem intended for this:

        codeset_id: 900000030
        concept_id: 23808
              code: 60400003
        codeSystem: SNOMED
       ontology_id: HP_0011010 | HP_0004398 | HP_0002239 | HP_0031368 | HP_0004796
    ontology_label: chronic | peptic ulcer | gastrointestinal hemorrhage | intestinal perforation | gastrointestinal obstruction
  mapping_category: Manual One-to-Many Concept
     mapping_logic: AND(0, 1, NOT(2), NOT(3), NOT(4))
  mapping_evidence: OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:snomed_13200003 | OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:umls_C0030920 | OBO_LABEL-OMOP_ANCESTOR_LABEL:peptic_ulcer | CONCEPT_SIMILARITY:HP_0004398_0.301 | Hand Mapping
        isExcluded: FALSE
includeDescendants: FALSE
     includeMapped: TRUE
           item_id:
        annotation: Exported from OMOP2OBO and bulk imported to N3C
        created_by: c91cf525-aa2e-4ad8-b6d0-f83122ee48b5
        created_at: 2022-10-27T01:48:35.238Z

E.g. 900000030 maps to peptic ulcer AND has-qualifier some chronic
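
To make the logic column easier to read, here is a rough sketch that substitutes each integer in mapping_logic with the matching entry of ontology_id. It assumes the integers are zero-based positions into the pipe-delimited ontology_id list, which fits the reading above (0 = chronic, 1 = peptic ulcer) but is not confirmed by OMOP2OBO documentation. The `expand_logic` helper is hypothetical.

```python
import re


def expand_logic(mapping_logic: str, ontology_id: str) -> str:
    """Replace positional indices in mapping_logic with ontology IDs.

    Assumption: each integer is a zero-based index into the
    pipe-delimited ontology_id field.
    """
    ids = [part.strip() for part in ontology_id.split("|")]
    # Single pass, so digits inside the substituted IDs are not re-matched.
    return re.sub(r"\b\d+\b", lambda m: ids[int(m.group())], mapping_logic)


expanded = expand_logic(
    "AND(0, 1, NOT(2), NOT(3), NOT(4))",
    "HP_0011010 | HP_0004398 | HP_0002239 | HP_0031368 | HP_0004796",
)
print(expanded)
```

A bare operator with no indices, such as a mapping_logic of just OR, passes through unchanged, so that case would need separate handling.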

@callahantiff
Owner

callahantiff commented Oct 31, 2022

@cmungall -- that would be fantastic! I apologize for not reaching out sooner, but the complex mappings like these are where I got stuck when trying to write a converter from the current format to SSSOM. I think it's not something that can be quickly fixed or easily dealt with. That said, perhaps there are patterns we can apply? I would love to help you guys with this. Are there meetings I can crash to help? If we can figure this out I am still very happy to write the conversion code.

@justaddcoffee -- thanks so much for being so open to using these mappings. Even the super gnarly ones like the one shown above, I really, really, really appreciate it!!

@cmungall

Thanks for the fast response!

This is the relevant issue in sssom - mapping-commons/sssom#108

Here is another one, for the concept "Chronic peptic ulcer with hemorrhage but without obstruction". These make great use cases to drive our work on complex SSSOM mappings:

        codeset_id: 900000140
        concept_id: 30770
              code: 81518000
        codeSystem: SNOMED
       ontology_id: HP_0011010 | HP_0004398 | HP_0002239 | HP_0004796
    ontology_label: chronic | peptic ulcer | gastrointestinal hemorrhage | gastrointestinal obstruction
  mapping_category: Manual One-to-Many Concept
     mapping_logic: AND(0, 1, 2, NOT(3))
  mapping_evidence: OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:snomed_13200003 | OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:umls_C0030920 | OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:umls_C0017181 | OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:snomed_74474003 | OBO_LABEL-OMOP_ANCESTOR_LABEL:peptic_ulcer | OBO_hasExactSynonym-OMOP_ANCESTOR_LABEL:gastrointestinal_haemorrhage | OBO_LABEL-OMOP_ANCESTOR_LABEL:gastrointestinal_hemorrhage | CONCEPT_SIMILARITY:HP_0004398_0.419 | Hand Mapping
        isExcluded: FALSE
includeDescendants: FALSE
     includeMapped: TRUE
           item_id:
        annotation: Exported from OMOP2OBO and bulk imported to N3C
        created_by: c91cf525-aa2e-4ad8-b6d0-f83122ee48b5
        created_at: 2022-10-27T01:48:35.238Z
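
To show why records like this do not fit simple pairwise mappings, here is a deliberately naive sketch that flattens one into one row per ontology term, using SSSOM column names (subject_id, predicate_id, object_id, object_label, comment). The `flatten` helper, the skos:relatedMatch placeholder predicate, the SNOMED: and HP: prefix conventions, and the use of comment to carry the logic are all assumptions for illustration; none of this is an agreed SSSOM pattern for complex mappings.

```python
import csv
import io

# Fields copied from the record above; the other columns are omitted.
record = {
    "code": "81518000",
    "codeSystem": "SNOMED",
    "ontology_id": "HP_0011010 | HP_0004398 | HP_0002239 | HP_0004796",
    "ontology_label": "chronic | peptic ulcer | gastrointestinal hemorrhage"
                      " | gastrointestinal obstruction",
    "mapping_logic": "AND(0, 1, 2, NOT(3))",
}


def split_field(value):
    """Split a pipe-delimited field into trimmed parts."""
    return [part.strip() for part in value.split("|")]


def flatten(record, predicate_id="skos:relatedMatch"):
    """Emit one pairwise row per ontology term (lossy by design)."""
    subject_id = f"{record['codeSystem']}:{record['code']}"
    ids = split_field(record["ontology_id"])
    labels = split_field(record["ontology_label"])
    rows = []
    for index, (obj_id, obj_label) in enumerate(zip(ids, labels)):
        rows.append({
            "subject_id": subject_id,
            "predicate_id": predicate_id,  # placeholder, not a curated choice
            "object_id": obj_id.replace("_", ":", 1),  # HP_0004398 -> HP:0004398
            "object_label": obj_label,
            "comment": f"operand {index} of {record['mapping_logic']}",
        })
    return rows


rows = flatten(record)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]),
                        delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Note that the negated operand (gastrointestinal obstruction) comes out looking like an ordinary positive mapping unless a consumer parses the comment, which is the information loss that motivates the postcomposition discussion.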

@justaddcoffee

@callahantiff no, thank you! We're excited about using these new mappings in N3C. We got great results using v1, and v2 has something like 20x more mappings.

@cmungall

Here is another, for the disease concept "Sickle cell-hemoglobin SS disease", using OR:

        codeset_id: 900000001
        concept_id: 22281
              code: 127040003
        codeSystem: SNOMED
       ontology_id: HP_0001903 | HP_0002664 | HP_0010566 | HP_0001871 | HP_0001877
    ontology_label: anemia | neoplasm | hamartoma | abnormality of blood and blood-forming tissues | abnormal erythrocyte morphology
  mapping_category: Automatic One-to-Many Ancestor
     mapping_logic: OR
  mapping_evidence: OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:umls_C0162119 | OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:snomed_271737000 | OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:umls_C0002871 | OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:snomed_165397008 | OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:umls_C0027651 | OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:umls_C0018552 | OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:umls_C0018939 | OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:umls_C0391870 | OBO_LABEL-OMOP_ANCESTOR_LABEL:anemia | OBO_hasExactSynonym-OMOP_ANCESTOR_LABEL:anaemia | OBO_LABEL-OMOP_ANCESTOR_LABEL:hamartoma | CONCEPT_SIMILARITY:HP_0001903_0.347
        isExcluded: FALSE
includeDescendants: FALSE
     includeMapped: TRUE
           item_id:
        annotation: Exported from OMOP2OBO and bulk imported to N3C
        created_by: c91cf525-aa2e-4ad8-b6d0-f83122ee48b5
        created_at: 2022-10-27T01:48:35.238Z

It seems the OMOP hierarchy may be overloading is-a where in Monarch we would make disease-to-phenotype (d2p) links, and the mappings are recapitulating those?

@callahantiff
Owner

I am so glad and really excited about v2, I really hope they will prove helpful!

@callahantiff
Owner

@cmungall - for the second example you posted: if I am following you and interpreting this right, I think it might actually be a function of the lazy logic that I applied in the first round. In the first pass, I erred on the side of inclusivity (very naive) and allowed all ancestors to be mapped. Really this was me looking for all possible alignments. I have since realized that might not be the best approach because, as you can see, things get really general, really fast. The updated logic I have been working on takes the level of ancestry (i.e., the number of steps above a concept) for each ancestor and lets the user pass a threshold, which can be a number of ancestors at a certain level or any ancestor at the lowest level. This should do a better job of finding more meaningful mappings. Does that sound like a better strategy to you? If so, I can probably regenerate the automatic ancestors before you guys do anything with the mappings that fall into that category.
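
One possible reading of that threshold, sketched below: each ancestor carries its distance from the concept, and the caller either passes a maximum level or gets only the closest level by default. The `filter_ancestors` helper, the `max_level` parameter, and the example identifiers are hypothetical and do not reflect the actual OMOP2OBO implementation.

```python
def filter_ancestors(ancestor_levels, max_level=None):
    """Keep ancestors at or below a level threshold.

    ancestor_levels maps an ancestor identifier to its number of steps
    above the concept. With max_level=None, only ancestors at the lowest
    (closest) level present are kept.
    """
    if not ancestor_levels:
        return []
    if max_level is None:
        max_level = min(ancestor_levels.values())
    return sorted(a for a, level in ancestor_levels.items() if level <= max_level)


# Made-up identifiers and levels, for illustration only.
levels = {"EX:parent_a": 1, "EX:parent_b": 1, "EX:grandparent": 2, "EX:root": 5}

closest = filter_ancestors(levels)               # lowest level only
within_two = filter_ancestors(levels, max_level=2)
print(closest)
print(within_two)
```

With a filter like this, very general ancestors far above the concept would be dropped unless the user explicitly widens the threshold.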
