OMIM ingest: what is/should be the source of truth for gene 2 disease in OMIM #708

Open
Tracked by #710
matentzn opened this issue Apr 29, 2022 · 10 comments

@matentzn
Member

@kevinschaper is the OMIM ingest done? How did we decide to use OMIM morbidmap vs. MedGen gene2disease?

@kevinschaper
Member

Kent finished it before he left and I'm not really that familiar with the details, but we can definitely revisit any decisions in there. It's an oddball right now because it's creating nodes. I assume it should be an edge-only ingest, and we should update the edges in the mapping step to catch human genes (though I'm less sure about the NucleicAcidEntity nodes).

@matentzn
Member Author

OK, maybe it is not a priority right now, but we should consolidate the various OMIM ingests we have across Monarch a bit, at least conceptually.

@matentzn
Member Author

matentzn commented Apr 30, 2022

I am particularly concerned about the source files being used. @cmungall told me that there are multiple ways the OMIM g2ds can be obtained, e.g. from MedGen or directly from morbidmap, and no one knows which is really the best solution here. (Should we try several and diff them to see what the difference is? Not sure.)
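A minimal sketch of what such a diff could look like, assuming each source has already been reduced to a set of (gene, disease) identifier pairs. The function name `diff_g2d` and the identifiers in the toy data are hypothetical and are not taken from any Monarch ingest or from real OMIM/MedGen content:

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (gene_id, disease_id)


def diff_g2d(a: Set[Pair], b: Set[Pair]) -> Dict[str, Set[Pair]]:
    """Split two sets of gene-disease pairs into shared and source-specific parts."""
    return {
        "only_a": a - b,   # pairs found only in the first source
        "only_b": b - a,   # pairs found only in the second source
        "shared": a & b,   # pairs found in both sources
    }


# Hypothetical toy data, for illustration only.
morbidmap_pairs = {("HGNC:1", "OMIM:100001"), ("HGNC:2", "OMIM:100002")}
medgen_pairs = {("HGNC:2", "OMIM:100002"), ("HGNC:3", "OMIM:100003")}

result = diff_g2d(morbidmap_pairs, medgen_pairs)
for name, pairs in result.items():
    print(name, sorted(pairs))
```

In practice the hard part is likely to be normalizing gene and disease identifiers so that pairs from different files are actually comparable; the set arithmetic itself is trivial once that is done.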

@matentzn
Member Author

This is not really a ticket I can take on effectively. I can advise, but it's probably better to assign someone else.

@monicacecilia
Member

is the OMIM ingest done?

@matentzn - Yes, it is.


we should consolidate the various OMIM ingests we have across Monarch a bit, at least conceptually.

@matentzn - I agree, and I would like us to revisit this conversation if that hasn't already been done. Could we please restart this convo during the data call on 2024-02-01?


@cmungall @kevinschaper 👀👆🏽

@matentzn
Member Author

If I think about this correctly:

  1. g2ds have, since our push over the summer, been coming from the HPOA pipeline. These include the OMIM g2ds (@kevinschaper, right?)
  2. OMIM IDs, with occasional links to genes as necessary for "defining the disease", come from https://github.com/monarch-initiative/omim, which I think is also OK
  3. We are exploring, or rather should be exploring, moving our g2d ingest uniformly to GenCC (https://thegencc.org/)

@sagehrke
Member

sagehrke commented Feb 1, 2024

Related to #707
@madanucd

@madanucd
Contributor

G2D associations from OMIM currently reach the Monarch Knowledge Graph (KG) along the following path: the data originates in OMIM, passes through Medgen, proceeds to HPOA-G2D, and is finally integrated into the Monarch KG.

A comparison of the G2D associations in these sources, as of March 2024, shows complete coverage. The UpSet plot and Venn diagram show that HPOA-G2D contains all associations from Medgen, and that Medgen in turn contains all associations from OMIM.

[Images: UpSet plot and Venn diagram of G2D associations across OMIM, Medgen, and HPOA-G2D]

However, a closer look at Medgen's sources adds a caveat. While Medgen does capture the G2D associations from OMIM, its downloadable files (as of March 2024) attribute these associations to intermediary sources. Because of this indirect pathway, we need to verify periodically that OMIM's contributions are fully accounted for in the dataset.

| Medgen Sources | G2D edges |
| --- | --- |
| GeneMap | 6763 |
| GeneMap; GeneReviews | 379 |
| NCBI curation | 8 |
| GeneMap; NCBI curation | 8 |
| GeneReviews | 182 |
| GeneReviews; NCBI curation | 7 |
| GeneMap; NCBI curation; OMIM | 1 |
| GeneMap; OMIM | 2 |
| GeneTests | 4 |
| OMIM | 2 |
| GeneMap; GeneTests | 2 |
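
As a rough illustration of how these counts could be summarized, the sketch below tallies edges per individual source. It assumes the semicolon-separated labels are combinations of individual sources, so an edge with several sources is counted once for each of them. The counts are copied from the table above; the variable names are made up for this example.

```python
from collections import Counter

# Counts copied from the table above (Medgen source combination -> G2D edges).
source_counts = {
    "GeneMap": 6763,
    "GeneMap; GeneReviews": 379,
    "NCBI curation": 8,
    "GeneMap; NCBI curation": 8,
    "GeneReviews": 182,
    "GeneReviews; NCBI curation": 7,
    "GeneMap; NCBI curation; OMIM": 1,
    "GeneMap; OMIM": 2,
    "GeneTests": 4,
    "OMIM": 2,
    "GeneMap; GeneTests": 2,
}

# Assumption: "; " separates individual sources within a combination label.
per_source = Counter()
for combo, n_edges in source_counts.items():
    for source in combo.split("; "):
        per_source[source] += n_edges

total_edges = sum(source_counts.values())

for source, n_edges in per_source.most_common():
    print(f"{source}: {n_edges} of {total_edges} edges")
```

Under that assumption, only a handful of edges carry an explicit OMIM label, which is the reason for the periodic verification suggested above.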

@sagehrke
Member

@julesjacobsen @cmungall FYI 👀 ⬆️

@matentzn
Member Author

Very nice analysis.

@monicacecilia transferred this issue from monarch-initiative/monarch-ingest May 22, 2024