autocomplete return inconsistent results #42
Comments
I'm not sure if this is totally a name-resolver issue. Basically the issue is that in the UI, "x li" with a space doesn't return very many results. When I search in name-resolver there are many results coming back, including many diseases. One thing I did notice is that there is a higher proportion of non-disease results for this string as opposed to "x-li". (Note that name-resolver doesn't know about types.) So maybe there is some interaction where the UI asks for N results, but only 1 of those N is a disease with this little info. If we want to pursue this, I think that there are two ways forward:
@gprice1129 do you have insight into how the UI talks to name-resolver?
@cbizon You can check out the code for the autocomplete bar here: https://github.com/NCATSTranslator/ui-fe/blob/develop/src/Utilities/autocompleteFunctions.js The main function is getAutocompleteTerms() on line 4. Essentially we send the input text to the name resolver, then format the returned object into an array of CURIEs to send to the Node Normalizer. Then we throw out any non-diseases that are returned, and what remains is what the user sees.
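For anyone following along without opening the repo, here is a rough sketch of the flow described above, in Python rather than the UI's JavaScript. It is not the code in autocompleteFunctions.js: the function names, the shape of the stub data, and the `biolink:Disease` type label are all assumptions made for illustration, and the network calls are replaced with hard-coded dictionaries.

```python
from typing import Dict, List

# Assumed type label for diseases; the real filter in the UI may differ.
DISEASE_TYPE = "biolink:Disease"


def to_curies(lookup_results: List[dict]) -> List[str]:
    """Pull the CURIE out of each name-resolver hit, keeping rank order."""
    return [hit["curie"] for hit in lookup_results]


def keep_diseases(curies: List[str], types_by_curie: Dict[str, List[str]]) -> List[str]:
    """Drop every CURIE that the normalizer did not type as a disease."""
    return [c for c in curies if DISEASE_TYPE in types_by_curie.get(c, [])]


# Stub for the name-resolver response (hypothetical identifiers).
lookup_results = [
    {"curie": "EX:1", "label": "example disease"},
    {"curie": "EX:2", "label": "example gene"},
    {"curie": "EX:3", "label": "another example disease"},
]

# Stub for the Node Normalizer response: CURIE -> list of types.
types_by_curie = {
    "EX:1": ["biolink:Disease"],
    "EX:2": ["biolink:Gene"],
    "EX:3": ["biolink:Disease"],
}

suggestions = keep_diseases(to_curies(lookup_results), types_by_curie)
print(suggestions)
```

The point of the sketch is that the disease filter runs after the name resolver has already cut its result list, so the number of suggestions can be much smaller than the number of hits requested.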
Thanks @dnsmith124. How many results does the autocomplete pull back? Does it go back for more if a bunch get filtered out with the non-disease filter?
@cbizon Right now it only pulls back 20 results, and has no functionality built in to go back for more. The implementation is about as simple as could be due to the time constraints around the initial launch of the MVP.
Sure, that makes sense. So one option would be to ask for a larger number of results, say 100. I'm not too sure what that would do to the time of the call though.
I'm going to do some testing to figure out how effective that change would be. In some cases I think a larger set of results from the name resolver would help, as the problem sometimes lies with too many of the returned results not being diseases. In other cases, though, we'll have to figure out another solution: sometimes the name resolver only returns a few results, in which case asking for more won't really help much.
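One possible shape for the "go back for more" idea raised above, as a sketch only. It assumes the lookup can be paged with an offset and a limit, which has not been confirmed in this thread for name-resolver; `fetch_page`, `collect_diseases`, and the `is_disease` flag are hypothetical names, and the ranked list is synthetic.

```python
from typing import Callable, List


def collect_diseases(
    fetch_page: Callable[[int, int], List[dict]],
    wanted: int,
    page_size: int = 20,
    max_pages: int = 5,
) -> List[dict]:
    """Keep requesting pages until enough diseases are found or hits run out."""
    found: List[dict] = []
    for page in range(max_pages):
        hits = fetch_page(page * page_size, page_size)
        found.extend(h for h in hits if h["is_disease"])
        # Stop when we have enough, or when the source has no more hits.
        if len(found) >= wanted or len(hits) < page_size:
            break
    return found[:wanted]


# Synthetic ranked results: every 10th hit is a disease.
RANKED = [{"curie": f"EX:{i}", "is_disease": i % 10 == 0} for i in range(100)]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    """Stand-in for a paged name-resolver call (assumed interface)."""
    return RANKED[offset:offset + limit]


first_page_only = [h for h in fake_fetch(0, 20) if h["is_disease"]]
paged = collect_diseases(fake_fetch, wanted=5)
print(len(first_page_only), [h["curie"] for h in paged])
```

The `max_pages` cap bounds the extra latency, which speaks to the call-time concern. When the source itself only has a handful of hits, the short-page check ends the loop early, so this does nothing for the few-results case.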
Yep - I wonder if you have some examples of the few-results case? It kind of sounds like maybe in those cases there just isn't a good match?
Yep, I'm working on compiling a series of tests in a spreadsheet now. I'll share it here and in Slack when I'm done!
Here's a link to the first several tests: https://docs.google.com/spreadsheets/d/1Xnh9RwSXOZp6rPs1gNXx5Sb_-aWVS72BJBF609a8Ctw/edit?usp=sharing I set the limit to 100 for these tests, rather than 20. Initially it seems like for very short terms (2-4 characters) the full 100 results are often returned, but very few if any are diseases, so they are thrown out. This sometimes leads to the odd behavior of gradually getting more results as you increase the length of your search term, which feels a bit counterintuitive. For example, 'gene' places no results in the autocomplete, whereas 'genetic' returns 13 and 'genetic a' returns 23. In those cases the ability to request specific types would certainly be helpful, but it would have to happen at the Name Resolver level, otherwise I don't think it would work. I'll continue to update that doc as I perform more tests.
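To illustrate why the type filter would need to live at the Name Resolver level, here is a toy comparison of filtering after the limit is applied (roughly what the UI does today) versus filtering before it. The data is invented: 100 non-disease hits ranked ahead of 30 disease hits, loosely mimicking a short term like 'gene'. It does not reproduce the spreadsheet numbers, and both function names are hypothetical.

```python
from typing import List, Tuple

Hit = Tuple[str, bool]  # (label, is_disease)


def filter_after_limit(ranked: List[Hit], limit: int) -> List[str]:
    """Truncate to the limit first, then drop non-diseases (client-side filter)."""
    return [name for name, is_disease in ranked[:limit] if is_disease]


def filter_before_limit(ranked: List[Hit], limit: int) -> List[str]:
    """Drop non-diseases first, then truncate (filter applied at the source)."""
    return [name for name, is_disease in ranked if is_disease][:limit]


# Invented ranking: non-diseases crowd out the top 100 positions.
ranked: List[Hit] = [(f"gene-{i}", False) for i in range(100)]
ranked += [(f"disease-{i}", True) for i in range(30)]

after = filter_after_limit(ranked, 100)
before = filter_before_limit(ranked, 100)
print(len(after), len(before))
```

With the filter applied after truncation, the diseases never make it into the window, no matter what the client does with the 100 hits it received.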
I looked into how name-resolver handles punctuation. Basically, it removes any non-alphanumeric characters, but doesn't tokenize on them. So searching for @gaurav do you have any insights? |
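A small sketch of the difference between removing punctuation and tokenizing on it, using the two search strings from the top of this issue. This is not name-resolver's actual code: the regular expressions are a guess at the behavior described, and it assumes whitespace is kept as the token separator.

```python
import re
from typing import List


def strip_punctuation(text: str) -> List[str]:
    """Delete non-alphanumeric characters, then split on whitespace only."""
    cleaned = re.sub(r"[^0-9A-Za-z\s]", "", text)
    return cleaned.split()


def tokenize_on_punctuation(text: str) -> List[str]:
    """Treat every run of non-alphanumeric characters as a token boundary."""
    return [token for token in re.split(r"[^0-9A-Za-z]+", text) if token]


for query in ("x-li", "x li", "beta-sito"):
    print(query, strip_punctuation(query), tokenize_on_punctuation(query))
```

Under these assumptions "x-li" collapses to the single token "xli" when punctuation is only removed, while "x li" stays as two tokens, which would explain the two strings matching different sets of names.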
friends and family testing revealed a similar/same issue as the one found here, documented in #85 |
Hi @gaurav - was this issue handled in the latest release of name-resolver? :) Maybe we could close it if so?
I think this is fixed on NameRes RENCI and ITRB-CI by TranslatorSRI/NameResolution#33. That should be pushed to ITRB-Test by mid next week and to ITRB-Prod soon thereafter. I'm basing this on trying "beta-sito" on ITRB-Prod (returns no results) and on RENCI/ITRB-CI (returns a bunch of results, including PUBCHEM.COMPOUND:222284 (beta-Sitosterol)). If you know of additional synonyms I can test (and, more importantly, add to my test suite!), please let me know! Otherwise, I think we can close this until someone finds another synonym that's broken.
from TAQA: Chris: newest version of NN will let you search by type! :D -- rolling out soon. |
I think this should be fixed now:
Okay to close @sierra-moxon? |
see issue #40