Type-ahead wonky. Would have thought atherosclerosis would have been a top result from such a service. #85

Closed
TranslatorIssueCreator opened this issue Dec 22, 2022 · 10 comments
Assignees
Labels
autocomplete UI - feature request future enhancement identified by user UI - term selection identification of the specific node and context to be selected for a query
Milestone

Comments

@TranslatorIssueCreator

Type: Bug Report

URL: https://ui.transltr.io/results?loading=true

ARS PK: 8c2e9da6-0d5a-4338-8716-c005c897f36e

Steps to reproduce:

I was trying to remember how to spell atherosclerosis, and the type-ahead is good. But the initial results for 'Athe' or 'Ather' don't give anything other than Arterial Fatty at first. Then, while I was capturing the details for this bug report, the type-ahead stopped returning anything: a POST to https://nodenorm.transltr.io/1.3/get_normalized_nodes gave me "503 Service Temporarily Unavailable" (nginx).

Screenshots:

@sierra-moxon
Member

see also #42

@dnsmith124
Collaborator

The core issue here appears to be that when a search term has relatively few characters, the NR and NN return many terms, but many of them (in some cases, most of them) are not diseases and so are filtered out. As a result, the user sees relatively few results for terms that seem like they should return many.

I've implemented the following changes in the develop branch to try to address this particular issue:

  • increased number of results that the autocomplete requests from the name resolver from 40 to 100
  • clamped the number of results that can display to the user to 40 to keep the autocomplete window parsable by the user

This should give the user more of a chance to get the disease they want in their 'unspecific' search term!
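For anyone following along, here is a minimal sketch of the "request more, filter, then clamp" approach described above. It is not the UI's actual code: the candidate shape (a dict with `label` and `types`), the `biolink:Disease` type string, and the function name are assumptions made for illustration. Only the 100 and 40 limits come from this comment.

```python
REQUEST_LIMIT = 100  # results requested from the name resolver (previously 40)
DISPLAY_LIMIT = 40   # maximum number of suggestions shown to the user


def select_suggestions(candidates, wanted_type="biolink:Disease",
                       display_limit=DISPLAY_LIMIT):
    """Keep only candidates of the wanted type, in their original order,
    and clamp the list to display_limit entries.

    `candidates` is a hypothetical list of dicts such as
    {"label": "...", "types": ["biolink:Disease", ...]}.
    """
    matching = [c for c in candidates if wanted_type in c.get("types", [])]
    return matching[:display_limit]
```

Requesting more candidates than are displayed leaves headroom for the type filter: even if most of the 100 results are not diseases, there is a better chance that enough diseases survive to fill the list.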

@sierra-moxon
Member

Tested "ath" and "ather" on https://ui.test.transltr.io/; both return some form of "Atherosclerosis" 👍 :). However, it's not immediately clear what the difference is between "Atherosclerosis" and "Atherosclerosis susceptibility": "ath" returns "Atherosclerosis susceptibility" and "ather" returns "Atherosclerosis", but only "Atherosclerosis" returns any results. "Atherosclerosis susceptibility" just spins (I gave up after 4 mins).

Image

Image

Maybe no autocomplete request should be sent unless a minimum number of characters is entered (e.g. more than 4 or 5)?
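A rough sketch of what such a guard could look like, assuming the check runs on the raw input text before any request is made. The function name and the default threshold of 4 are hypothetical; the right cutoff is exactly what is being asked here.

```python
MIN_CHARS = 4  # hypothetical threshold; send a request only for longer input


def should_request_autocomplete(text, min_chars=MIN_CHARS):
    """Return True only when the trimmed input has more than min_chars characters."""
    return len(text.strip()) > min_chars
```

With this default, "athe" would not trigger a request but "ather" would, so the threshold trades fewer noisy short-prefix results against later feedback for the user.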

  • Need to figure out why "Atherosclerosis susceptibility" does not return any results but is included in the autocomplete. (It may be that there is no synonym/xref/mapping in MONDO for this term, which may come from UMLS.)

@sierra-moxon
Member

sierra-moxon commented Feb 16, 2023

from TAQA:

  • users who try very common (and short) text strings get thousands of results and no disease.
  • step one: get a CURIE from NameResolver; step two: get a canonical name for that CURIE from NodeNorm. Sometimes this name is nowhere near what the user asked for.

Would it help if we could tell NameResolver what type we are looking for? Yes, this would help! (Gaurav and David S.)

@sierra-moxon
Member

sierra-moxon commented Feb 16, 2023

Gaurav: yep, filtering NameRes by Biolink type is now an issue at TranslatorSRI/NameResolution#39.

Second issue, not solved: matching at NodeNorm returns different things. Should the UI show synonyms? Yep. Sometimes this is a "feature" that lets users search by old names, and any incorrect synonyms will instantly get user attention, which is good. The UI has no explicit way of determining whether something is a synonym except "does it contain this search string?" That seems OK as a first step: if it's in the search, then display it as a "match result."
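To make the "does it contain this search string?" idea concrete, here is one possible sketch. The function name, the inputs, and the "(matched: ...)" display format are all assumptions for illustration, not how the UI does it.

```python
from typing import Sequence


def label_for_display(search: str, canonical_name: str,
                      synonyms: Sequence[str]) -> str:
    """Return the label to show for one autocomplete entry.

    If the canonical name already contains the search text, show it as-is.
    Otherwise, show the first synonym containing the search text as the
    "match result" next to the canonical name.
    """
    needle = search.strip().casefold()
    if not needle or needle in canonical_name.casefold():
        return canonical_name
    for synonym in synonyms:
        if needle in synonym.casefold():
            return f"{canonical_name} (matched: {synonym})"
    # No synonym explains the match; fall back to the canonical name.
    return canonical_name
```

Showing the matched synonym would explain to the user why a canonical name that looks unrelated to their input appears in the list.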

@sstemann sstemann added the UI - feature request future enhancement identified by user label Mar 9, 2023
@sstemann

sstemann commented Mar 9, 2023

UI team is also working on fixes for this issue

@sstemann sstemann added the UI - term selection identification of the specific node and context to be selected for a query label Mar 9, 2023
@cbizon cbizon added this to the July 31 milestone May 26, 2023
@gaurav

gaurav commented Jul 14, 2023

We should be able to eliminate all of these issues once we deploy the new NameRes (currently being rebuilt, with about 10 hours remaining). The current dev NameRes is sorting atherosclerosis disorder below a bunch of other atherosclerosis-related MONDOs (see http://name-resolution-sri-dev.apps.renci.org/lookup?string=Athe&offset=0&limit=10&biolink_type=Disease&only_prefixes=MONDO%7CHP), but that’s exactly what this rebuild hopes to fix.
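For reference, the lookup URL above can be assembled from its query parameters like this. The parameter names and values are taken from that URL; the helper function itself is only an illustrative sketch, and it builds the URL without sending any request.

```python
from urllib.parse import urlencode

LOOKUP_URL = "http://name-resolution-sri-dev.apps.renci.org/lookup"


def build_lookup_url(string, offset=0, limit=10, biolink_type="Disease",
                     only_prefixes=("MONDO", "HP")):
    """Build a NameRes lookup URL restricted by Biolink type and CURIE prefix."""
    params = {
        "string": string,
        "offset": offset,
        "limit": limit,
        "biolink_type": biolink_type,
        # Prefixes are joined with "|", which urlencode escapes as %7C.
        "only_prefixes": "|".join(only_prefixes),
    }
    return f"{LOOKUP_URL}?{urlencode(params)}"


print(build_lookup_url("Athe"))
```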

@gaurav

gaurav commented Jul 21, 2023

Definitely looking better with the current NameRes with filtering:

| Search term | Notes |
| --- | --- |
| athe | atherosclerosis (MONDO:0005311) is the third result |
| ather | atherosclerosis (MONDO:0005311) is the second result after atherosclerosis susceptibility (MONDO:0007169). |
| atherosclerosis | atherosclerosis (MONDO:0005311) finally becomes the first result. We've seen this before (TranslatorSRI/NameResolution#85) and are trying to work out a solution, but having atherosclerosis as the second result isn't too bad IMO. |

I'll track this and close it after the new NameRes has been successfully incorporated into the UI.

@gglusman

I just tried it on test. For 'athe', atherosclerosis was the 3rd option. For 'ather', it was the first one.

@gaurav

gaurav commented Jul 28, 2023

We're now back to atherosclerosis susceptibility staying as the top result until you get to atherosclerosis, when it finally becomes the first result. In order to improve gene results, we're hoping to sort by CURIE suffix (in this case, 0007169 vs 0005311) between identically scoring results, which might boost atherosclerosis to the top. Otherwise, I think it would be acceptable to close this issue as atherosclerosis in second place isn't too bad. What do y'all think?
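As a sketch of the proposed tie-break: sort by score first, then by the numeric CURIE suffix among results with identical scores. The result shape and the scores below are made up for illustration, and this is not NameRes's actual ranking code.

```python
def suffix_key(curie):
    """Sort key for the part of a CURIE after the last colon.

    Numeric suffixes sort by value and come before non-numeric ones.
    """
    suffix = curie.rsplit(":", 1)[-1]
    if suffix.isdigit():
        return (0, int(suffix), "")
    return (1, 0, suffix)


def rank(results):
    """Order results by descending score, breaking ties by CURIE suffix."""
    return sorted(results, key=lambda r: (-r["score"], suffix_key(r["curie"])))


# Hypothetical identical scores for the two terms discussed in this issue.
results = [
    {"curie": "MONDO:0007169", "label": "atherosclerosis susceptibility", "score": 10.0},
    {"curie": "MONDO:0005311", "label": "atherosclerosis", "score": 10.0},
]
print([r["label"] for r in rank(results)])
```

Under this ordering, 0005311 sorts ahead of 0007169 whenever the two scores are equal, which is the boost hoped for above. It changes nothing if the scores differ.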

7 participants