
imProving Agent returning results with drug names like "Pubchem.compound:6710690" instead of "Pharmakon1600-01504273" with lots of examples like this. I.e. CURIEs being returned, not names #568

Open
TranslatorIssueCreator opened this issue Oct 3, 2023 · 9 comments
Labels: response labels (task around getting better labels to show), UI - display (confusion on or overlooking information)


@TranslatorIssueCreator

Type: Bug Report

URL: https://ui.transltr.io/main/results?l=VPS13B%20(Human)&i=NCBIGene:157680&t=1&q=9c390038-73fc-4d7d-8c63-c08d85eda8b0

ARS PK: 9c390038-73fc-4d7d-8c63-c08d85eda8b0

Steps to reproduce:

Search for drugs that upregulate VPS13B

Screenshots:

@gglusman

gglusman commented Oct 3, 2023

A better example: "Pubchem.compound:151537" instead of "4'-Epidoxorubicin (hydrochloride)".
Or: "Pubchem.compound:9841834" instead of "Istaroxime".

@sandrine-muller-research
Collaborator

sandrine-muller-research commented Oct 4, 2023

I queried the compound Pubchem.compound:151537 through the NodeNorm endpoint and found that the label for this compound (as decided by NodeNorm) is:
"(7S,9S)-7-[(2S,4S,5R,6S)-4-amino-5-hydroxy-6-methyloxan-2-yl]oxy-6,9,11-trihydroxy-9-(2-hydroxyacetyl)-4-methoxy-8,10-dihydro-7H-tetracene-5,12-dione"
Here is my guess at what is happening, though I'll need confirmation from the UI team (@dnsmith124): when the label is too long, the decision was made to show the ID instead. The user is asking whether another rule could be used.

The second issue here is that NodeNorm is not choosing the optimal label. This is a known issue.
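
A minimal sketch of the lookup described above, assuming the `get_normalized_nodes` route with a `curie` query parameter and a response keyed by CURIE with the preferred label under `id.label`. The function names and the `base_url` argument are hypothetical, and the sample payload is trimmed to the fields used here; this is not the actual UI or ARS code.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_normalized(curie, base_url):
    """Fetch the NodeNorm record for one CURIE (network call).

    base_url is supplied by the caller; the route name and the 'curie'
    query parameter are assumptions about the NodeNorm API.
    """
    query = urlencode({"curie": curie})
    url = f"{base_url.rstrip('/')}/get_normalized_nodes?{query}"
    with urlopen(url) as resp:
        return json.load(resp)


def preferred_label(payload, curie):
    """Return the preferred label for a CURIE, or None if it is missing."""
    record = payload.get(curie)
    if not record:
        # Assumed behaviour: unknown CURIEs map to null or are absent.
        return None
    return record.get("id", {}).get("label")


# Illustrative payload, trimmed to the fields used above. The label is the
# one quoted in this comment; the overall shape is an assumption.
SAMPLE = {
    "PUBCHEM.COMPOUND:151537": {
        "id": {
            "identifier": "PUBCHEM.COMPOUND:151537",
            "label": (
                "(7S,9S)-7-[(2S,4S,5R,6S)-4-amino-5-hydroxy-6-methyloxan-2-yl]"
                "oxy-6,9,11-trihydroxy-9-(2-hydroxyacetyl)-4-methoxy-8,10-"
                "dihydro-7H-tetracene-5,12-dione"
            ),
        }
    },
    "EXAMPLE:0": None,
}

label = preferred_label(SAMPLE, "PUBCHEM.COMPOUND:151537")
```

Against a live service, `preferred_label(fetch_normalized(curie, base_url), curie)` would perform the same check for any CURIE in the result list.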

sandrine-muller-research added the "response labels" (task around getting better labels to show) label Oct 4, 2023
sandrine-muller-research added the "UI - display" (confusion on or overlooking information) label Oct 4, 2023
@dnsmith124
Collaborator

@gprice1129 can you speak to whether the backend is doing this with the names? The UI's frontend simply displays the names provided, and in the results I'm seeing from the example the 'Pubchem' terms are being given as the names for these results.

@Genomewide

@sandrine-muller-research where does your preferred name come from?

@gprice1129
Collaborator

@dnsmith124 @sandrine-m @MarkDWilliams the backend just takes the names we are given by the ARS. The ARS should be converting these names from CURIEs to whatever name is decided as the "best" one by NodeNorm.

@sandrine-muller-research
Collaborator

@Genomewide from the NodeNorm PROD endpoint:
image
@MarkDWilliams does the ARS do anything on top of NodeNorm to decide the best label for the compound?

@sstemann

sstemann commented Apr 8, 2024

This still happens, and I don't know if there is a solution. @gaurav https://ui.test.transltr.io/main/results?l=VPS13B%20(Human)&i=NCBIGene:157680&t=1&r=0&q=d9bc14f5-c11a-4625-aef7-1ed76c3f7179

@gaurav

gaurav commented Jul 26, 2024

Here's how we're doing on NodeNorm CI:

  • PUBCHEM.COMPOUND:151537 is still a long name ("(7S,9S)-7-[(2S,4S,5R,6S)-4-amino-5-hydroxy-6-methyloxan-2-yl]oxy-6,9,11-trihydroxy-9-(2-hydroxyacetyl)-4-methoxy-8,10-dihydro-7H-tetracene-5,12-dione"), but we don't have any alternate names in that clique.
  • PUBCHEM.COMPOUND:6710690 is still a non-ideal name ("Pharmakon1600-01504273")
  • PUBCHEM.COMPOUND:9841834 now has a preferred label of "Istaroxime" -- woo!

I'm tracking non-good preferred names in this spreadsheet as well as TranslatorSRI/Babel#306, but that work won't help these two cliques, because none of the other identifiers have a good label for this identifier. So we will probably need to pull in additional sources of labels and identifiers to fully fix this. I'm going to come back to this in Hammerhead, but unless there's a good source we're missing this will likely go unfixed this year.

(There's another ticket where we're discussing other solutions, such as having the UI display the CURIE -- "PUBCHEM.COMPOUND:151537" instead of "(7S,9S)-..." -- see #759)
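
To make the display fallback discussed above concrete, here is a rough sketch of one possible rule: use the first candidate label that is short enough, and otherwise show the CURIE. The function name and the `max_len` threshold are hypothetical, and this is not how Babel, NodeNorm, or the UI actually choose names.

```python
def choose_display_name(curie, candidate_labels, max_len=60):
    """Return the first non-empty label no longer than max_len, else the CURIE.

    max_len is an arbitrary illustrative threshold, not a value used by
    any Translator component.
    """
    for label in candidate_labels:
        cleaned = (label or "").strip()
        if cleaned and len(cleaned) <= max_len:
            return cleaned
    return curie


# The long systematic name quoted above is the only label in its clique,
# so the sketch falls back to the CURIE.
long_name = (
    "(7S,9S)-7-[(2S,4S,5R,6S)-4-amino-5-hydroxy-6-methyloxan-2-yl]"
    "oxy-6,9,11-trihydroxy-9-(2-hydroxyacetyl)-4-methoxy-8,10-"
    "dihydro-7H-tetracene-5,12-dione"
)
shown_long = choose_display_name("PUBCHEM.COMPOUND:151537", [long_name])

# A short label is used as-is.
shown_short = choose_display_name("PUBCHEM.COMPOUND:9841834", ["Istaroxime"])
```

A length check alone would still pass "Pharmakon1600-01504273" through unchanged, which is why the comment above points to additional label sources rather than a display rule as the full fix.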

@sstemann

sstemann commented Aug 7, 2024

It still happens a lot and is most obvious on a new query, since Improve and Unsecret are the fastest to return.

image
