Review synonyms of current Open Targets therapeutic areas #481

Closed
paolaroncaglia opened this issue Jun 26, 2019 · 9 comments

Comments

@paolaroncaglia
Collaborator

paolaroncaglia commented Jun 26, 2019

@d0choa and Sandra M. from Open Targets (OT) mentioned some issues with synonyms of EFO high-level terms that are currently OT therapeutic areas (TAs). In some cases the scope is incorrect; in others the synonym is not pertinent. These issues may be due to old BioPortal imports, and the broader matter was raised previously (#276). In this ticket, I aim to fix the synonyms of the 26 EFO terms that are currently tagged as OT TAs (https://github.com/opentargets/platform-therapeutic-areas/blob/master/tas.tsv).

Zoe collected synonyms of TAs in the "TA synonyms" tab here: https://docs.google.com/spreadsheets/d/1PNcCn8RP9LPKAFp5TtmCAwocDZwkAVEp6HijMRNkYMk/edit#gid=1787586044

At present, all synonyms in the “TA synonyms” tab of the OT TAs spreadsheet (https://docs.google.com/spreadsheets/d/1PNcCn8RP9LPKAFp5TtmCAwocDZwkAVEp6HijMRNkYMk/edit#gid=1787586044) are in EFO as exact synonyms. I suggest the following strategy, as discussed with Zoe:
1. Check whether each synonym currently marked as exact is really and unequivocally exact. If yes, highlight the cell in green.
2. For non-green cells: was the synonym in EFO2 already?
   - If yes, we can edit it as necessary (i.e. change its scope or delete it) and our changes won’t be overwritten after the next release. If the scope needs to be changed, highlight the cell in yellow and add a B/N/R (for broad/narrow/related synonym). If the synonym is wrong and needs to be deleted, highlight the cell in red.
   - If the synonym was not in EFO2 already, ask MONDO to fix the synonym. Highlight the cell in grey.

I'd prioritise the deletion of wrong synonyms, and address other edits later.
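
The triage above can be sketched as a small decision function. This is only an illustration of the colour-coding logic, not a script used in the actual review; the names `Action` and `triage` and the three boolean inputs are assumptions made for the example, and the answers themselves still come from manual curation.

```python
from enum import Enum


class Action(Enum):
    """Spreadsheet highlight colours from the strategy above."""
    KEEP = "green"       # really and unequivocally exact
    RESCOPE = "yellow"   # was in EFO2; scope should become B/N/R
    DELETE = "red"       # was in EFO2; synonym is wrong
    ASK_MONDO = "grey"   # not in EFO2; needs fixing upstream in MONDO


def triage(is_truly_exact: bool, was_in_efo2: bool, is_wrong: bool) -> Action:
    """Map a curator's answers for one synonym to a highlight colour.

    All three arguments are hypothetical inputs a curator would supply.
    """
    if is_truly_exact:
        return Action.KEEP
    if not was_in_efo2:
        # Edits made in EFO would be overwritten, so the fix goes to MONDO.
        return Action.ASK_MONDO
    return Action.DELETE if is_wrong else Action.RESCOPE
```

Sorting the spreadsheet rows so that `Action.DELETE` comes first would match the priority stated above: delete wrong synonyms first, address other edits later.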

@d0choa

d0choa commented Jun 26, 2019

The term that brought this to our attention was "parasitic infection" (EFO:0001067), which is not currently a therapeutic area.

@paolaroncaglia
Collaborator Author

@d0choa thanks. For the sake of prioritisation, we'll have to keep the review of other synonyms for later :-) (we already have a ticket for that, #276), but I'll add your example there. Thanks!

@paolaroncaglia
Collaborator Author

paolaroncaglia commented Jun 26, 2019

Tricky case number 1:
MONDO:0005046 'immune system disease' has exact synonym 'immune system disease', and we inherited it from them. I opened a MONDO ticket to ask if they have a check in place to avoid exact synonyms being identical to term labels. monarch-initiative/mondo#749
(Update 27/6/19: MONDO developers are looking into this. When it's fixed, after the first MONDO and EFO releases, EFO will "lose" this synonym and any other of the same kind that we may have inherited from MONDO.)
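
For illustration, a check of the kind asked about in the MONDO ticket could look like the sketch below. It is not MONDO's or EFO's actual QC code: the function name and the in-memory dictionary layout are assumptions, and a real check would read the terms from the ontology file instead.

```python
def redundant_exact_synonyms(terms):
    """Return (term_id, synonym) pairs where an exact synonym repeats the label.

    `terms` maps a term ID to {"label": str, "exact_synonyms": list of str}.
    This layout is a hypothetical stand-in for data loaded from the ontology.
    The comparison ignores case and surrounding whitespace.
    """
    hits = []
    for term_id, term in terms.items():
        label = term["label"].strip().casefold()
        for synonym in term["exact_synonyms"]:
            if synonym.strip().casefold() == label:
                hits.append((term_id, synonym))
    return hits


# The MONDO case described above; the dictionary structure is illustrative.
example_terms = {
    "MONDO:0005046": {
        "label": "immune system disease",
        "exact_synonyms": ["immune system disease"],
    },
}
print(redundant_exact_synonyms(example_terms))
```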

@paolaroncaglia
Collaborator Author

paolaroncaglia commented Jun 26, 2019

Tricky case number 2:
When we switched from EFO2 to EFO3, "Bioportal_provenance has been removed from EFO3 entirely. This has cleaned up annotations, removing the imported fingerprints from bioportal provenance from previous importing." So, the bioportal_provenance annotation shouldn’t exist anymore, but any synonyms that came from there and were labeled as synonyms in EFO2 will have been carried into EFO3. Zoe and I think that many of those synonyms aren't helpful, and we may want to clean them up. We may wish to explore whether there's a way of pulling them all out and deleting them (I found examples that are absent in MONDO, so they wouldn't be reinstated after each new MONDO import). We'll need to keep this for later (I'll move it to the bigger ticket). For now, I might delete them for OT TAs if they are narrow or incorrect.
(Update 27/6/19: as discussed with Simon and Zoe today, we will not address narrow synonyms at this stage as they are not a priority; they will be addressed as part of #276 when we get to it.)
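
A rough sketch of the idea above, that a legacy synonym absent from MONDO can be deleted in EFO without being reinstated by the next MONDO import, is shown below. The function name and the placeholder strings are assumptions for the example; the two input lists would have to be extracted from EFO and MONDO for a given term.

```python
def deletable_legacy_synonyms(efo_synonyms, mondo_synonyms):
    """Return EFO synonyms of a term that MONDO does not also carry.

    Both arguments are plain lists of strings for the same term
    (hypothetical inputs). Matching ignores case. Synonyms that MONDO also
    has are excluded, since deleting them in EFO alone would not stick.
    """
    mondo_keys = {s.casefold() for s in mondo_synonyms}
    return sorted(s for s in set(efo_synonyms) if s.casefold() not in mondo_keys)


# Placeholder strings, not real ontology synonyms.
efo = ["legacy synonym A", "legacy synonym B", "shared synonym"]
mondo = ["Shared Synonym"]
print(deletable_legacy_synonyms(efo, mondo))
```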

@paolaroncaglia
Collaborator Author

paolaroncaglia commented Jun 26, 2019

I made a first pass over all synonyms of TA terms and highlighted 8 that need to be deleted.
All other synonyms are either exact or narrow. Most are just permutations of the same words and, I'd say, don't add much for text mining.
Of the 8 bad synonyms above:
- 4 I deleted from EFO ("Other postoperative functional disorders", "nevus comedonicus (disorder)", "Pilosebaceous Nevoid disorder", "acne nevus"; all were incorrect synonyms of digestive system disease);
- 4 come from MONDO; I asked them to fix these (monarch-initiative/mondo#750), and they will.

@paolaroncaglia
Collaborator Author

@d0choa
I looked at all exact synonyms of EFO terms that are currently tagged as therapeutic areas in Open Targets.
I deleted a handful that were incorrect and that came from EFO.
I asked MONDO to delete another handful of incorrect synonyms that EFO inherited from them; they are working on it and I'm confident that it will be done before the next release.
All other exact synonyms of TA terms are either correct or non-exact (mostly narrow, but not incorrect). So, text mining would still map to a correct term in most if not all cases, for the TA terms. Therefore we resolved that we will not address narrow synonyms for now (so we can prioritise your other recent tickets). Rather, we will address them as part of #276 when we get to that.
Thanks,
Paola

@paolaroncaglia
Collaborator Author

paolaroncaglia commented Jun 27, 2019

@d0choa @simonjupp @zoependlington
Just to confirm,
Does the text-mining pipeline in Open Targets use only labels + exact synonyms of EFO terms?
Or does it use labels + all synonyms?
Thanks!
(Note for self: move to Done when I have a reply.)
(Update 2/7/19: David asked and is waiting for a reply; but the question may not be applicable after all because EFO2 only had alternative_term that was a mixed bag containing all scopes of synonyms. If so, it would still be important to know going forward.)

@MichaelaEBI

EPMC uses labels and 'hasExactSynonym' matches for Open Targets text mining. I hope this answers the question. If not, then it might be best to get in touch with EPMC directly.
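
To make the consequence concrete, here is a minimal sketch of a matching dictionary built from labels plus exact synonyms only. It is not EPMC's pipeline: the function name, the data layout and the placeholder synonym strings are all assumptions. The point is that a synonym wrongly scoped as exact ends up in the dictionary, while narrow synonyms are ignored.

```python
def build_lexicon(terms, scopes=("hasExactSynonym",)):
    """Map each matchable string (lower-cased) to the set of term IDs it names.

    `terms` maps a term ID to {"label": str, "synonyms": {scope: list of str}}.
    This layout is hypothetical. Only the label and the synonym scopes listed
    in `scopes` are included.
    """
    lexicon = {}
    for term_id, term in terms.items():
        names = [term["label"]]
        for scope in scopes:
            names.extend(term["synonyms"].get(scope, []))
        for name in names:
            lexicon.setdefault(name.casefold(), set()).add(term_id)
    return lexicon


# The label is from this thread; the synonym strings are made-up placeholders.
example_terms = {
    "EFO:0001067": {
        "label": "parasitic infection",
        "synonyms": {
            "hasExactSynonym": ["example exact synonym"],
            "hasNarrowSynonym": ["example narrow synonym"],
        },
    },
}
lexicon = build_lexicon(example_terms)
print(sorted(lexicon))
```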

@paolaroncaglia
Collaborator Author

@MichaelaEBI
Thanks! Yes, that answers my question.


4 participants