Add MeSH to disease file #15

Closed
mellybelly opened this issue Mar 14, 2015 · 10 comments

Comments

@mellybelly

see required classes in directory
https://github.com/monarch-initiative/human-disease-ontology/tree/master/src/contrib/MesH.contrib

cmungall added a commit that referenced this issue Mar 15, 2015
cmungall added a commit that referenced this issue Mar 15, 2015
cmungall added a commit that referenced this issue Mar 15, 2015
…ing or by DO xrefs.

Issue #15
todo - compare against tudors list
@cmungall
Member

@tudorgroza not sure I understand these files

why is this one in 1-n?

D013724 (Teratoma) :: [0005566, 0005563, 0003307]

Mappings highlighted:

 / DOID:4 ! disease
  is_a DOID:14566 ! disease of cellular proliferation
   is_a DOID:162 ! cancer
    is_a DOID:0050687 ! cell type cancer
     is_a DOID:2994 ! germ cell cancer
      is_a DOID:3095 ! germ cell and embryonal cancer
       is_a DOID:3307 ! teratoma ***
        is_a DOID:2660 ! cystic teratoma
        is_a DOID:5563 ! malignant teratoma ***
        is_a DOID:5565 ! adult teratoma
        is_a DOID:5566 ! mature teratoma *** 
        is_a DOID:5567 ! ovarian germ cell teratoma

I guess this is just because mesh has dodgy synonyms?

btw, not all DOIDs are zero-padded, annoyingly
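A minimal sketch of one way to work around the padding mismatch when comparing IDs: reduce each DOID to its integer value before matching. It assumes two IDs that differ only in leading zeros denote the same class, and the helper names `doid_key` and `same_doid` are hypothetical, not part of any DO tooling.

```python
import re

# Assumption: a DOID is an optional "DOID:" prefix followed by digits only,
# and leading zeros carry no meaning (DOID:0005566 == DOID:5566).
_DOID = re.compile(r"^(?:DOID:)?(\d+)$")


def doid_key(curie: str) -> int:
    """Return the numeric part of a DOID as an int, ignoring zero padding."""
    match = _DOID.match(curie.strip())
    if match is None:
        raise ValueError(f"not a DOID: {curie!r}")
    return int(match.group(1))


def same_doid(a: str, b: str) -> bool:
    """True when two DOIDs differ at most in prefix and zero padding."""
    return doid_key(a) == doid_key(b)
```

Comparing on the integer key leaves the original strings untouched, so whichever form a file uses can be preserved on output.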

@cmungall
Member

These are the meshes in the list that I didn't find DO mappings for

https://github.com/monarch-initiative/human-disease-ontology/blob/master/src/experimental/mesh-missing-from-doid.tsv

Should see if @tudorgroza maps some I don't

Then pass to DO folks for addition?

@mellybelly
Author

Should all MeSH diseases (or at least the union of those from the CTD file and the common disease work) have an xref in DO? Do we need this so as not to use MeSH directly? What if a MeSH term maps to a DC term but not a DO term? There are a lot of unmatched terms there, though I do see a few that likely didn't match because the strings were not identical.

For example, 'adult T-cell leukemia' in DO has exact synonym 'Adult T-cell leukemia/lymphoma' which matches to MeSH 'Lymphoma, T-Cell' on your list of unmatched terms. The DO term has an xref to the NCI term 'Adult T-Cell Leukemia/Lymphoma' (Code C3184).

(though DO's use of exact synonyms is perhaps a bit greedy)
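As a rough illustration of how some near-miss string matches could be caught, the sketch below normalizes labels before comparing them: it un-inverts MeSH-style "Head, Modifier" headings, lowercases, and collapses punctuation. The function name `normalize_label` and the single-comma inversion rule are assumptions made for this example. It only handles case, punctuation, and inversion differences, so it would not by itself resolve the 'Adult T-cell leukemia/lymphoma' case above.

```python
import re


def normalize_label(label: str) -> str:
    """Normalize a disease label for approximate string comparison."""
    # Assumption: a label with exactly one comma is an inverted heading,
    # e.g. "Lymphoma, T-Cell" stands for "T-Cell Lymphoma".
    parts = [p.strip() for p in label.split(",")]
    if len(parts) == 2:
        label = f"{parts[1]} {parts[0]}"
    label = label.lower()
    # Replace every run of non-alphanumeric characters with one space.
    label = re.sub(r"[^a-z0-9]+", " ", label)
    return " ".join(label.split())
```

With this rule, `normalize_label("Lymphoma, T-Cell")` and `normalize_label("T-cell lymphoma")` produce the same key, so the two labels would be treated as candidates for the same term. Any match found this way would still need manual review.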

cmungall added a commit that referenced this issue Mar 15, 2015
Warning: some results are kind of odd
E.g. MESH:Starvation is a DOID nutrition deficiency disease

Issue #15
@tudorgroza

@cmungall that mapping comes from DO (originally in the format DOID:0005566::D013724 | DOID:0005563::D013724 | DOID:0003307::D013724) and hence I assume the DO people tried to relate terms at different levels of specificity (i.e., the more specific DO terms to the more general MeSH term).
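To make the 1-n case concrete, here is a small sketch that parses a line in the `DOID::MeSH | DOID::MeSH` format quoted above and groups the DO IDs by MeSH ID. The function names `parse_xref_line` and `group_by_mesh` are hypothetical, and the sketch assumes `|` separates pairs and `::` separates the DO ID from the MeSH ID, as in the quoted example.

```python
from collections import defaultdict


def parse_xref_line(line: str) -> list[tuple[str, str]]:
    """Split 'DOID:x::MESH | DOID:y::MESH' into (doid, mesh) pairs."""
    pairs = []
    for chunk in line.split("|"):
        chunk = chunk.strip()
        if not chunk:
            continue
        doid, sep, mesh = chunk.partition("::")
        if not sep:
            raise ValueError(f"missing '::' in {chunk!r}")
        pairs.append((doid.strip(), mesh.strip()))
    return pairs


def group_by_mesh(pairs: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Collect the DO IDs that point at each MeSH ID, in input order."""
    grouped = defaultdict(list)
    for doid, mesh in pairs:
        grouped[mesh].append(doid)
    return dict(grouped)


line = "DOID:0005566::D013724 | DOID:0005563::D013724 | DOID:0003307::D013724"
grouped = group_by_mesh(parse_xref_line(line))
# MeSH IDs with more than one DO xref are the 1-n cases.
one_to_many = {mesh: doids for mesh, doids in grouped.items() if len(doids) > 1}
```

Here `one_to_many` holds D013724 (Teratoma) with its three DO IDs, which is the entry questioned earlier in the thread.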

W.r.t. the padding - is there something I can do to fix it?

@tudorgroza

@cmungall I couldn't find additional mappings for the list you've compiled there.

@cmungall
Member

@tudorgroza - I see, those were the xrefs already in DO. My list should subsume these.

@mellybelly the CTD list subsumes mesh_list.txt, with the exception of MESH:D012203 Rh isoimmunization, which is in DO already as an xref so we don't need an extra class

@cmungall
Member

Didn't remember to tag the commit, but mondo now has mesh classes, either as equivalents to existing mondo classes, or sometimes as their own class with subclass axioms to existing mondos.

This can lead to some oddities, e.g.

[Term]
id: MESH:D004806
name: Ependymoma
is_a: DOID:5075 ! myxopapillary ependymoma
is_a: DOID:5500 ! cellular ependymoma
is_a: DOID:5505 ! papillary ependymoma
is_a: DOID:5507 ! clear cell ependymoma
is_a: DOID:5889 ! DOID:5889

because there is no generic ependymoma in mondo. But this is better than alternatives

@cmungall
Member

Adding MESH causes real problems. Leaving aside the weird junk it brings in, it artificially inflates the diseases we care about

For example, in the latest build we have distinct entries:

[Term]
id: OMIM:602404
name: Parkinson Disease 3, Autosomal Dominant
namespace: MGI_disease_ontology
synonym: "PARK3" EXACT []
synonym: "PARKINSON DISEASE 3, AUTOSOMAL DOMINANT; PARK3" EXACT []
is_a: DOID:14330 ! Parkinson's disease
is_a: DOID:630 ! genetic disease
is_a: Orphanet:2828 ! Young adult-onset Parkinsonism

[Term]
id: MESH:C537176
name: Parkinson disease 3
is_a: MESH:D010300 ! Parkinson Disease

it makes a mess of the PD graph:
[image: mdo-doid-14330]

It's not as easy to merge entries like this as it seems, but I'll try.

However, I'd rather get rid of MESH and map CTD diseases upfront

cmungall added a commit that referenced this issue Mar 18, 2015
cmungall added a commit that referenced this issue Mar 18, 2015
cmungall added a commit that referenced this issue Mar 18, 2015
@cmungall
Member

Disregard my previous comment. I was not aligning mesh to the correct target. Fixed this. PD graph now looks better (still oddities from ordo, but at least all mesh is now tucked in with the relevant OMIM or DOID or Orphanet class). Hmm, PD3 is still a problem. But at least the others are mostly incorporated correctly.

[image: mdo-doid-14330]

@nicolevasilevsky
Member

This issue was moved to monarch-initiative/monarch-disease-ontology-RETIRED#23
