genes are getting into the disease file #14

Closed
mellybelly opened this issue Mar 14, 2015 · 16 comments

Comments

@mellybelly

For example:
OMIM_107730 Apolipoprotein B; APOB
This is a subclass of DO:disease and of Orphanet_121386 (no label in the file, but it is Orphanet's gene entry for apolipoprotein B).

I think this is coming from the MGI file; genes need to be pruned out upstream.

@sbello

sbello commented Mar 16, 2015

OMIM:107730 is a gene + phenotype record, so it includes diseases even though it is named for a gene. This is why it is in the MGI disease cluster file. OMIM is working on breaking these apart into separate records.
Sue

@cmungall cmungall reopened this Mar 16, 2015
@cmungall
Member

Reopening. May have been premature to filter these from mondo

@cmungall
Member

OK, we have entries such as OMIM:107730, which are apparently combined G+P entries that may be split in the future (we also have HPO annotations for OMIM:107730 in Monarch).

Then we also have cases like this one:
http://monarchinitiative.org/disease/OMIM:516000
'Complex I, Subunit Nd1'
which seems much more in the gene camp, and which MGI (sensibly) excludes from omimclusters.
But note that, as with the entry above, we do have phenotype data for it.

How should we treat these in mondo? I think we need to at least provide a label. But even a classification under 'disease' is potentially confusing.

@mellybelly
Author

@pnrobinson @drseb should these be migrated to a disease class? for OMIM:516000 possibly LEBER OPTIC ATROPHY?

for OMIM:107730 perhaps HYPOBETALIPOPROTEINEMIA? Even if OMIM has genes and diseases mixed up, that doesn't mean we have to annotate to them, especially as many of these terms could come from DO, Orphanet, or DC. We should not annotate phenotypes to genes, methinks.

@cmungall
Member

Or if we really want to say 'mutations in this gene cause this
phenotype' then we should use NCBIGene


@drseb
Member

drseb commented Mar 27, 2015

Are we sure that for every gene G that has some phenotype (HPX) associated with it, there is a corresponding OMIM phenotype entry (OMIMX)?
Is OMIMX annotated with HPX?
Is OMIMX linked to G?

If we have to answer any of these questions with no, we are losing information.

Just my two cents…

Seb
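
The three questions above can be turned into a mechanical check. The following is only a rough sketch under stated assumptions: the gene symbol, OMIM ID, and HP labels are made-up placeholders, and `lost_links` is a hypothetical helper, not part of any Monarch or HPO tooling. It reports every gene-to-phenotype pair that would disappear if annotations were kept only on phenotype entries.

```python
def lost_links(gene_phenotypes, disease_genes, disease_phenotypes):
    """Return (gene, phenotype) pairs that no linked disease entry carries."""
    lost = []
    for gene, hps in gene_phenotypes.items():
        # Gather phenotypes reachable through disease entries linked to this gene.
        reachable = set()
        for disease, genes in disease_genes.items():
            if gene in genes:
                reachable |= disease_phenotypes.get(disease, set())
        # Anything annotated on the gene but not reachable would be lost.
        lost.extend((gene, hp) for hp in sorted(hps - reachable))
    return lost


# Placeholder data, for illustration only.
gene_phenotypes = {"GENE_A": {"HP:a", "HP:b"}}
disease_genes = {"OMIM:X": {"GENE_A"}}
disease_phenotypes = {"OMIM:X": {"HP:a"}}

lost = lost_links(gene_phenotypes, disease_genes, disease_phenotypes)
```

An empty result would mean all three questions can be answered with yes for the data at hand.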


@sbello

sbello commented Mar 27, 2015

The entry 535000 is a gene (* prefix) record in OMIM, all of the related phenotypes have separate OMIM records. At MGI we exclude all * records from our disease load.
Sue
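
A prefix-based exclusion like the one described here might look roughly as follows. This is a sketch, not MGI's actual load code: `disease_load` is a hypothetical name, the records are written as prefix plus MIM number purely for illustration, and the gene record uses 516000, the ID this thread settles on for the example.

```python
# OMIM prefix conventions: '*' gene, '+' gene and phenotype combined,
# '#' phenotype with known molecular basis, '%' phenotype with unknown basis.
GENE_PREFIX = "*"


def disease_load(records):
    """Keep every record except pure gene ('*') records."""
    return [r for r in records if not r.startswith(GENE_PREFIX)]


# '*516000' is the gene record; '+107730' is the combined gene + phenotype record.
records = ["*516000", "+107730"]
loaded = disease_load(records)
```

Under this rule the combined `+` record survives the filter, which would explain why OMIM:107730 shows up in the disease cluster file even though it is named for a gene.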

@drseb
Member

drseb commented Mar 27, 2015

The entry 535000 is a gene (* prefix) record in OMIM, all of the related phenotypes have separate OMIM records. At MGI we exclude all * records from our disease load.

I assume you mean 516000.

This OMIM entry has HPO annotations and is linked to ND1 (or MT-ND1, entrez 4535)
I usually use OMIM genemap, mim2gene, and Orphanet to link between diseases and genes or map between OMIM-gene-entry and Entrez-gene.

I do not see any other OMIM entry linked to ND1 in genemap.
I don’t see any Orphanet-entry WITH phenotype-data linked to that gene.

So the links between that gene and the currently annotated HP-terms would be lost IMHO.

Seb

@sbello

sbello commented Mar 27, 2015

Yes, sorry about that: 516000.

@nlwashington
Collaborator

note that the following omim entries (just 25) have annotations from the HPO group. they are the "combined" gene and phenotype entries, but what that really means is that each one is a genomic location that happens to have some phenotypes associated with it:
100650,107680,107730,107741,109270,114835,116790,124060,132810,138300,141800,141900,147892,151430,152200,152780,159555,168820,173470,177400,182870,211100,222745,309850,314200
perhaps these should just be migrated to the relevant disease.

some are easy to map (1:1); others are not.

100650 --> 610251
107680 --> 105200 or 604091 (but there are also other diseases/phenotypes here that don't have omim ids: "ApoA-I and apoC-III deficiency, combined" and "Corneal clouding, autosomal recessive")
107730 --> 144010, 615558
107741 --> 104310, 611771, 269600, 603075 (plus Hyperlipoproteinemia, type III and {Myocardial infarction susceptibility} )
109270 --> A LOT OF THINGS.
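
To make the 1:1 versus 1:many distinction concrete, here is a rough sketch of how such a migration could be applied to annotations. It is only an illustration: the two mapping rows are copied from the list above, while the `migrate` helper and the HP labels are hypothetical placeholders. Entries with a single target are moved automatically; entries with several candidate targets are set aside for a curator.

```python
# Two rows taken from the mapping above; the rest are omitted.
MIGRATION = {
    "OMIM:100650": ["OMIM:610251"],
    "OMIM:107730": ["OMIM:144010", "OMIM:615558"],
}


def migrate(annotations, mapping):
    """Split (entry, phenotype) annotations into migrated and needs-review."""
    migrated, needs_review = [], []
    for entry, hp in annotations:
        targets = mapping.get(entry)
        if targets is None:
            # Not a combined entry: keep the annotation unchanged.
            migrated.append((entry, hp))
        elif len(targets) == 1:
            # Unambiguous 1:1 mapping: move the annotation to the disease.
            migrated.append((targets[0], hp))
        else:
            # Several candidate diseases: a curator has to choose.
            needs_review.append((entry, hp, targets))
    return migrated, needs_review


# Placeholder phenotype labels, for illustration only.
annotations = [("OMIM:100650", "HP:a"), ("OMIM:107730", "HP:b")]
migrated, needs_review = migrate(annotations, MIGRATION)
```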

@nlwashington
Collaborator

i have also opened an issue in the hpo tracker here: https://sourceforge.net/p/obo/human-phenotype-requests/438/

@nlwashington
Collaborator

OMIM:124060 is also a gene causing all kinds of trouble.

@nlwashington
Collaborator

and then there are things like OMIM:601894, which are very clearly just diseases but end up getting typed as genes somewhere in the pipeline. i've checked our code and output ttl, and this typing is not coming from the data, so it must be coming from the ontologies.

@cmungall
Member

I will:

  1. use Monarch's omim.ttl to generate a 'blacklist' of genes (can be done by the SO class)
  2. subtract these from any downstream application of omimclusters.obo
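
The two steps above could be sketched roughly as follows. This is a minimal illustration, not the code used for mondo: the triples are made-up stand-ins for what omim.ttl might contain, `build_gene_blacklist` and `subtract_blacklist` are hypothetical helper names, and using SO:0000704 (the Sequence Ontology 'gene' class) as the typing class is an assumption.

```python
# Assumed typing class: Sequence Ontology 'gene'.
GENE_CLASS = "SO:0000704"


def build_gene_blacklist(triples):
    """Step 1: collect every subject that is typed as a gene."""
    return {s for s, p, o in triples if p == "rdf:type" and o == GENE_CLASS}


def subtract_blacklist(cluster_ids, blacklist):
    """Step 2: drop blacklisted IDs from a cluster list, keeping order."""
    return [i for i in cluster_ids if i not in blacklist]


# Toy (subject, predicate, object) triples; the typings are illustrative only.
triples = [
    ("OMIM:516000", "rdf:type", GENE_CLASS),
    ("OMIM:144010", "rdf:type", "DOID:4"),
]
clusters = ["OMIM:516000", "OMIM:144010"]

blacklist = build_gene_blacklist(triples)
filtered = subtract_blacklist(clusters, blacklist)
```

A real version would read the types from omim.ttl with an RDF parser rather than from an in-memory list, and would apply the subtraction wherever omimclusters.obo is consumed.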

@cmungall
Member

This issue was moved to monarch-initiative/monarch-disease-ontology-RETIRED#14

@nicolevasilevsky
Member

This issue was moved to monarch-initiative/monarch-disease-ontology-RETIRED#22
