Drug names used as IDs in website causing ambiguous links #600
Comments
I was able to confirm some of these examples. There are definitely name duplicates in DGIdb, but they have different concept_ids. I wasn't able to confirm the counts posted by @yarong-lifemap, though.
RAUWOLFIA SERPENTINA: same name in ChEMBL, but two different ChEMBL IDs
TISAGENLECLEUCEL: same name, but some claims matched to ChEMBL and the others to Wikidata
Here are the corresponding claims:
We should check that, with the upcoming V5 therapy normalizer improvements, these claims are all grouped into the same drug concept (if appropriate). We should also revisit linking out to drugs by name. Linking out by concept_id or internal DGIdb drug_id may be better.
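To illustrate why name-based links collide while concept_id-based links do not, here is a minimal sketch. The `/drugs/` path, the helper names, and the placeholder IDs are all assumptions for illustration, not DGIdb's actual routes or data.

```python
from urllib.parse import quote

# Hypothetical (name, concept_id) pairs; the IDs are placeholders, not real DGIdb records.
DRUGS = [
    ("RAUWOLFIA SERPENTINA", "chembl:EXAMPLE_A"),
    ("RAUWOLFIA SERPENTINA", "chembl:EXAMPLE_B"),
    ("TISAGENLECLEUCEL", "chembl:EXAMPLE_C"),
    ("TISAGENLECLEUCEL", "wikidata:EXAMPLE_D"),
]


def link_by_name(name):
    """Build a link keyed on the drug name (assumed route)."""
    return "/drugs/" + quote(name, safe="")


def link_by_concept_id(concept_id):
    """Build a link keyed on the concept_id (assumed route)."""
    return "/drugs/" + quote(concept_id, safe="")


# Entries sharing a name collapse onto one link; concept_ids stay distinct.
name_links = {link_by_name(name) for name, _ in DRUGS}
id_links = {link_by_concept_id(cid) for _, cid in DRUGS}
print(len(DRUGS), len(name_links), len(id_links))
```

With four entries, the name-keyed set holds only two links, so two entries become unreachable, while the concept_id-keyed set keeps all four.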
There are 3,658 name-based duplicate entries in DRUGS (causing the number of distinct drugs by name to be 10,776 instead of 14,449). Examples include:
[DAROTROPIUM BROMIDE, 3 times]
[TISAGENLECLEUCEL, 3 times]
[CYCLOPHOSPHAMIDE, 3 times]
[RAUWOLFIA SERPENTINA, 4 times]
[TRASTUZUMAB DERUXTECAN, 3 times]
[INDOCYANINE GREEN, 3 times]
As a result, it's not possible to navigate to the correct drug entry.
For example: searching for "U-50488 METHANE SULFONATE" shows two results, both linking to the same page (which, I can only assume, reflects only one of the entries).
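For anyone trying to reproduce the duplicate counts, a rough sketch of grouping entries by name follows. The record layout, the `name_duplicates` helper, and the placeholder IDs are assumptions. How "duplicate entries" was counted in the figures above is not specified, so this only shows one possible grouping.

```python
from collections import defaultdict


def name_duplicates(records):
    """Return {name: set of concept_ids} for names that map to more than one concept_id."""
    ids_by_name = defaultdict(set)
    for name, concept_id in records:
        # Normalize case and whitespace so trivially different spellings group together.
        ids_by_name[name.strip().upper()].add(concept_id)
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


# Hypothetical records; the concept_ids are placeholders, not real DGIdb data.
records = [
    ("U-50488 METHANE SULFONATE", "example:1"),
    ("U-50488 METHANE SULFONATE", "example:2"),
    ("CYCLOPHOSPHAMIDE", "example:3"),
    ("cyclophosphamide ", "example:4"),
    ("INDOCYANINE GREEN", "example:5"),
]

dupes = name_duplicates(records)
distinct_names = {name.strip().upper() for name, _ in records}
print(len(records), len(distinct_names), sorted(dupes))
```

Here five records yield three distinct names, and the two names with more than one concept_id are reported as ambiguous.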