First, thanks for the great library! I'm using it in a hobby password manager project of mine. It's the only Rust library I've found that actually works nicely for searching password items.
The current search functions, search and search_tokens, only return the result item ids. This is fine for most situations, as the results are already sorted by the score.
However, an ability to get the scores along with the ids would allow sorting items with equal scores in a custom way. For instance, in my password manager, I would want to sort items with equal scores alphabetically.
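To illustrate the kind of tie-breaking meant here, below is a minimal sketch. It assumes a hypothetical scored search that yields `(id, score)` pairs; the crate does not return scores as of this issue, so the input data and the helper name `sort_by_score_then_id` are made up for illustration.

```rust
/// Sorts (id, score) pairs by descending score, breaking ties
/// alphabetically by id.
fn sort_by_score_then_id(results: &mut [(String, f64)]) {
    results.sort_by(|a, b| {
        // Higher score first; fall back to the id when scores are equal.
        b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0))
    });
}

fn main() {
    // Hypothetical output of a scored search (not an existing simsearch API).
    let mut results = vec![
        ("Sample item 28".to_string(), 1.0),
        ("Sample item 1".to_string(), 1.0),
        ("Other entry".to_string(), 0.4),
        ("Sample item 21".to_string(), 1.0),
    ];

    sort_by_score_then_id(&mut results);

    for (id, score) in &results {
        println!("{score:.2} {id}");
    }
}
```

With scores exposed, this ordering is fully determined by the data, regardless of how the library stores items internally.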
As a side note, the biggest issue I have right now is that search result order is not stable. Searching with the same term on the same data set multiple times returns items with equal scores in different orders:
```rust
use simsearch::SimSearch;

fn main() {
    // Generate items
    let items: Vec<_> = (1..=50).map(|n| format!("Sample item {}", n)).collect();

    // Index every item, using the item text as its own id
    let mut ss = SimSearch::new();
    for i in &items {
        ss.insert(i, i);
    }

    // Run the same search three times and print the top two results
    for i in 0..3 {
        println!("Run #{i}");
        let res = ss.search("sample item");
        for id in &res[0..2] {
            println!(" {id}");
        }
        println!("---");
    }
}
```
Running this prints out:
Run #0
Sample item 28
Sample item 21
---
Run #1
Sample item 1
Sample item 29
---
Run #2
Sample item 37
Sample item 38
---
I think the reason for this instability is the use of HashMap internally. Perhaps it could be fixed by changing the hasher implementation, but ultimately I think returning the scores is a more versatile solution.
Thank you for taking the time to look into this small crate. You've given a strong example of why returning results with scores is necessary. But would it be sufficient to replace HashMap with IndexMap, which should make the result order stable?
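As a rough sketch of why insertion-ordered storage would help: if scores are produced in insertion order (which IndexMap iteration guarantees) and then ranked with a stable sort, items with equal scores keep their insertion order on every run. The example below uses a plain slice as a stand-in for that insertion-ordered iteration; `rank_stable` and the scores are hypothetical and do not reflect the crate's actual internals.

```rust
/// Ranks ids by descending score. `scored` is assumed to be in
/// insertion order, as iteration over an IndexMap would be.
fn rank_stable(scored: &[(&str, f64)]) -> Vec<String> {
    let mut ranked: Vec<&(&str, f64)> = scored.iter().collect();
    // `sort_by` is a stable sort: entries with equal scores keep
    // their relative (insertion) order.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked.into_iter().map(|&(id, _)| id.to_string()).collect()
}

fn main() {
    // Hypothetical scores, listed in insertion order.
    let scored = [
        ("Sample item 1", 1.0),
        ("Sample item 2", 1.0),
        ("Other entry", 0.3),
        ("Sample item 3", 1.0),
    ];

    // Every run produces the same order for the tied items.
    for run in 0..3 {
        println!("Run #{run}: {:?}", rank_stable(&scored));
    }
}
```

This would make the order reproducible, but ties would follow insertion order rather than, say, alphabetical order, so it does not replace returning the scores.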