Feature Request: Improve model search functionality #241

Closed
anamika-yadav99 opened this issue Apr 16, 2022 · 12 comments

Comments

@anamika-yadav99
Contributor

anamika-yadav99 commented Apr 16, 2022

Is your feature request related to a problem? Please describe.

Yes. Every once in a while a user makes a typo or misspells a model name. In such cases the model search function returns an empty table, which can be confusing for the user.

Describe the solution you'd like.

I'm still exploring for a good solution. Perhaps a simple approximate string matching algorithm could return the closest matching model name, both in the ModelSearch function and in the fetch command.

Describe alternatives you've considered

No response

Additional context.

No response

@anamika-yadav99 anamika-yadav99 added the enhancement New feature or request label Apr 16, 2022
@anamika-yadav99
Contributor Author

@GemmaTuron would this be a feature you would like to have?

@dauinh
Contributor

dauinh commented Apr 17, 2022

Hi @miquelduranfrigola @anamika-yadav99, can I work on this feature as well?

@dauinh
Contributor

dauinh commented Apr 17, 2022

I found this article to be helpful. Do you think we can do a 'did you mean' and implement a Trie to find the correct command?
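
To make the Trie idea concrete, here is a minimal sketch of prefix-based suggestions. It is only an illustration and is not taken from the linked article or from Ersilia: the `Trie` class, its `insert` and `complete` methods, and the command list in the usage note are all hypothetical. Note that a plain trie completes prefixes; it does not by itself correct a typo in the middle of a word.

```python
class Trie:
    """A minimal trie node; each node maps a character to a child node."""

    def __init__(self):
        self.children = {}
        self.is_word = False

    def insert(self, word):
        # Walk down the trie, creating nodes as needed.
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def complete(self, prefix):
        # Follow the prefix; if it is not in the trie, there are no suggestions.
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Collect every stored word below the prefix node.
        results = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)
```

For example, after inserting an illustrative command list such as `catalog`, `fetch`, and `close`, calling `complete("c")` returns the two commands starting with "c", which could feed a "did you mean" message.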

@anamika-yadav99
Contributor Author

anamika-yadav99 commented Apr 17, 2022

Hi @dauinh, thanks for the help. The article was insightful. I'm thinking along similar lines: compare the distance between the input model name and the model names in the hub or stored locally, then return the closest matching model name. This is essentially just a string matching algorithm. Let's wait to hear what @miquelduranfrigola and @GemmaTuron have to say on this.
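
As a rough illustration of "return the closest matching model name", the sketch below uses `difflib.get_close_matches` from the Python standard library. This is a swapped-in technique for illustration only: difflib scores similarity with a sequence-matching ratio rather than an edit distance, and nothing here is Ersilia code. The function `suggest_model`, the `cutoff` default, and the model names are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical model names standing in for the hub or local catalog.
MODEL_NAMES = [
    "malaria-activity",
    "antibiotic-activity",
    "herg-toxicity",
    "solubility",
]


def suggest_model(query, names=MODEL_NAMES, cutoff=0.6):
    """Return the name most similar to `query`, or None if none passes `cutoff`."""
    matches = get_close_matches(query.lower(), names, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A search that would otherwise return an empty table could call `suggest_model` on the raw input and, when the result is not None, print a "did you mean ...?" hint.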

@dauinh
Contributor

dauinh commented Apr 19, 2022

> Hi @dauinh, thanks for the help. The article was insightful. I'm thinking along similar lines: compare the distance between the input model name and the model names in the hub or stored locally, then return the closest matching model name. This is essentially just a string matching algorithm. Let's wait to hear what @miquelduranfrigola and @GemmaTuron have to say on this.

Oh, I see what you mean... Thank you for clarifying!

@GemmaTuron
Member

Hi @anamika-yadav99

This sounds fantastic. My only concern is whether searching for close matches would slow down the package too much, but we can certainly explore options. Can you add this to your application as a task?

@miquelduranfrigola
Member

This is interesting overall. Because we are using Airtable to store our models, my immediate guess is that Airtable may offer some functionality for fuzzy string matching, or something similar, in its text-based search. If this is the case, we should certainly take this avenue.

@anamika-yadav99
Contributor Author

anamika-yadav99 commented May 1, 2022

Hi @miquelduranfrigola, I looked into the Airtable API. Unfortunately, it doesn't support fuzzy search. Fuzzy search can easily be achieved with the fuzzywuzzy Python library, but I guess you wouldn't want another dependency in Ersilia. So I have modified the search function in Ersilia to include an algorithm that calculates the distance between the CLI input and the model catalog data imported from the hub. Searching through the table doesn't take much time; the time complexity of the algorithm is O(mn). I think a similar solution would work for the model fetch command as well. Shall I proceed with this approach?
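
The distance computation itself is not shown in this thread. A minimal dependency-free sketch, assuming it is a Levenshtein edit distance (the thread does not name the algorithm) and assuming m and n are the lengths of the two strings being compared, could look like the following. The names `edit_distance`, `closest_models`, and `max_distance`, and any model names passed in, are hypothetical and not part of Ersilia.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming: O(len(a) * len(b)) time."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def closest_models(query, names, max_distance=2):
    """Return the names within `max_distance` edits of `query`, closest first."""
    scored = [(edit_distance(query.lower(), name.lower()), name) for name in names]
    return [name for dist, name in sorted(scored) if dist <= max_distance]
```

Each comparison costs O(mn), so scanning a catalog of k names costs k such comparisons. That is likely acceptable for a catalog of modest size, which is consistent with the observation above that the search does not take long.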

@miquelduranfrigola
Member

Hi @anamika-yadav99, thanks. This sounds like a good approach. Let's go for it. I am more interested in the catalog command than in the fetch command, so let's start with catalog?

@anamika-yadav99
Contributor Author

@miquelduranfrigola I'm almost done with the catalog command. I'll raise the PR by tomorrow.

@anamika-yadav99
Contributor Author

Hi @miquelduranfrigola, I'm sorry for the delay. I was caught up in some college work that took longer than expected. I have raised a PR for the catalog command. Looking forward to your feedback.

@miquelduranfrigola
Member

Hi @anamika-yadav99, this is great. I have approved PR #262 and am closing the issue now. Many thanks!
