🤖 Hello,

The similarity scores returned by the [...]

To filter search results based on a threshold percentage, you would need to normalize the scores so they fall within a range of 0 to 1 (or 0% to 100%). However, the normalization depends on the distance metric used and on the range of scores that your dataset and embeddings produce.

Here's a general approach to filtering the results by a threshold:

    def filter_results_by_score(results, threshold):
        # Assumes each score is already normalized to [0, 1], higher = more similar
        return [(doc, score) for doc, score in results if score >= threshold]

    # Use the function
    filtered_results = filter_results_by_score(results, 0.5)  # for a 50% threshold

In this code, [...]

Please note that this is a general approach and might need to be adjusted to your use case and to the characteristics of your data and embeddings.

As for the distance metric, the default one used by the [...]

I hope this helps! Let me know if you have any other questions.
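The normalization step mentioned above is not spelled out, so here is one possible sketch. It assumes the raw scores are distances (lower means more similar, which would fit values like 1.63) and rescales them with min-max normalization over a single result set. The helper names `normalize_distances` and `filter_by_similarity` and the sample data are hypothetical and not part of any library.

```python
from typing import List, Tuple, TypeVar

Doc = TypeVar("Doc")


def normalize_distances(results: List[Tuple[Doc, float]]) -> List[Tuple[Doc, float]]:
    """Min-max scale the distances of one result set into similarities in [0, 1].

    Assumption: a lower raw score means a closer match (a distance).
    The closest hit maps to 1.0 and the farthest hit maps to 0.0.
    """
    if not results:
        return []
    distances = [score for _, score in results]
    lo, hi = min(distances), max(distances)
    if hi == lo:
        # All hits are equally distant, so treat them all as best matches.
        return [(doc, 1.0) for doc, _ in results]
    return [(doc, (hi - score) / (hi - lo)) for doc, score in results]


def filter_by_similarity(
    results: List[Tuple[Doc, float]], threshold: float
) -> List[Tuple[Doc, float]]:
    """Keep hits whose normalized similarity is at least `threshold`."""
    return [
        (doc, sim) for doc, sim in normalize_distances(results) if sim >= threshold
    ]


# Hypothetical (document, distance) pairs standing in for real search results.
raw = [("doc_a", 0.42), ("doc_b", 1.63), ("doc_c", 0.95)]
kept = filter_by_similarity(raw, 0.5)  # keeps doc_a and doc_c
```

Note that this kind of scaling is relative to the result set: a 50% threshold here means "in the closer half of the distance range of these hits", not an absolute measure of relevance. If the metric has a known fixed range, scaling against that range instead would give thresholds that are comparable across queries.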
-
I've run into a problem with the scores returned by similarity search with score. What puzzles me is the nature of the scores themselves: they are not percentages but unbounded decimal values such as 1.63.
Has anyone found a way to make these scores more interpretable? Specifically, I'd like to filter the search results by a threshold percentage, say around 50 percent, so that only the most relevant records in the dataset are kept.
Are there any workarounds or techniques you've found effective for this? I'd appreciate any insights from the community.