.Net: minRelevancyScore is "backwards" for Weaviate #6274

Closed
matthewbolanos opened this issue May 15, 2024 · 2 comments
Labels
.NET Issue or Pull requests regarding .NET code triage

Comments

@matthewbolanos
Member

For Weaviate, this does not return any results...

var matches = await memoryStore.GetNearestMatchAsync(
    collectionName: "scene",
    embedding: await textEmbeddingGenerationService.GenerateEmbeddingAsync(completePrompt),
    minRelevanceScore: 0
);

But this one does

var matches = await memoryStore.GetNearestMatchAsync(
    collectionName: "scene",
    embedding: await textEmbeddingGenerationService.GenerateEmbeddingAsync(completePrompt),
    minRelevanceScore: 1
);
@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code triage labels May 15, 2024
@github-actions github-actions bot changed the title minRelevancyScore is "backwards" for Weaviate .Net: minRelevancyScore is "backwards" for Weaviate May 15, 2024
@dmytrostruk
Member

dmytrostruk commented May 15, 2024

@matthewbolanos From Weaviate's point of view this is correct: the default distance metric is cosine, and the distance value has the range 0 <= d <= 2, where 0 means identical vectors and 2 means opposing vectors:
https://weaviate.io/developers/weaviate/config-refs/distances#available-distance-metrics

On the other hand, when you use the IMemoryStore abstraction, you don't know which connector is in use, which suggests that at the abstraction level the score should be aligned across all connectors. For Weaviate, for example, we would need to convert the distance range 0 <= d <= 2 to a relevance score range 1 >= s >= 0, and vice versa. cc: @westey-m
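
To make the conversion mentioned above concrete, here is a minimal sketch of one possible linear mapping between a cosine distance in [0, 2] and a relevance score in [0, 1]. The class and method names are hypothetical and are not part of Semantic Kernel or the Weaviate connector; the linear formula is also an assumption, since other mappings (for example 1 - d, which yields cosine similarity in [-1, 1]) are equally plausible.

```csharp
using System;

// Hypothetical helper for illustration only; not an existing Semantic Kernel API.
public static class CosineScoreConverter
{
    // Maps a cosine distance d in [0, 2] to a relevance score in [0, 1],
    // so that d = 0 (identical vectors) gives 1 and d = 2 (opposing vectors) gives 0.
    public static double DistanceToRelevance(double distance)
    {
        if (double.IsNaN(distance) || distance < 0.0 || distance > 2.0)
        {
            throw new ArgumentOutOfRangeException(nameof(distance), "Cosine distance must be in [0, 2].");
        }

        return 1.0 - (distance / 2.0);
    }

    // Inverse mapping: turns a minimum relevance score in [0, 1]
    // into the maximum cosine distance in [0, 2] that still satisfies it.
    public static double RelevanceToDistance(double relevance)
    {
        if (double.IsNaN(relevance) || relevance < 0.0 || relevance > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(relevance), "Relevance must be in [0, 1].");
        }

        return 2.0 * (1.0 - relevance);
    }
}
```

Under this assumed mapping, a caller passing minRelevanceScore: 0 would translate to a maximum distance of 2 (accept everything), and minRelevanceScore: 1 would translate to a maximum distance of 0 (identical vectors only), which matches the direction users of the abstraction would expect.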

And this is just for the cosine metric; I think users will want other metrics supported as well, for example l2-squared with range 0 <= d < ∞.

It's not a problem to fix this for Weaviate at this point, but I'm a little worried that it may change the behavior of users' applications if they have already found a good threshold for their scenarios.

I'm wondering whether the new memory abstraction and connector implementations would be a good time for such changes?

@westey-m
Contributor

Agreed, we should definitely try to address this as part of the new memory connectors work. Normalization is one option, but it will be difficult to normalize across some of the disparate ranges, e.g. turning -∞ < d < ∞ into 0 to 1 or vice versa.
Alternatively, we should consider making it clear which distance metric is in use and what the appropriate ranges are when specifying a required distance / similarity range. The interface would then be the same across different databases for the same distance metric, but different distance metrics would be treated differently from each other.
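
As a rough illustration of the metric-explicit alternative, the sketch below pairs a threshold with the distance metric it applies to and validates it against that metric's range. All type and member names are hypothetical and do not come from Semantic Kernel. The cosine and l2-squared ranges are the ones quoted in this thread; associating the unbounded range -∞ < d < ∞ with a dot-product distance is an assumption.

```csharp
using System;

// Hypothetical types for illustration only; not an existing Semantic Kernel API.
public enum DistanceMetric
{
    Cosine,     // 0 <= d <= 2 (range quoted in this thread)
    L2Squared,  // 0 <= d < infinity (range quoted in this thread)
    Dot         // -infinity < d < infinity (assumed to be the unbounded case mentioned above)
}

public readonly struct DistanceThreshold
{
    public DistanceMetric Metric { get; }
    public double MaxDistance { get; }

    public DistanceThreshold(DistanceMetric metric, double maxDistance)
    {
        var (min, max) = RangeFor(metric);
        if (double.IsNaN(maxDistance) || maxDistance < min || maxDistance > max)
        {
            throw new ArgumentOutOfRangeException(
                nameof(maxDistance),
                $"Threshold for {metric} must be between {min} and {max}.");
        }

        Metric = metric;
        MaxDistance = maxDistance;
    }

    // Returns the range of distance values for a metric.
    public static (double Min, double Max) RangeFor(DistanceMetric metric) => metric switch
    {
        DistanceMetric.Cosine => (0.0, 2.0),
        DistanceMetric.L2Squared => (0.0, double.PositiveInfinity),
        DistanceMetric.Dot => (double.NegativeInfinity, double.PositiveInfinity),
        _ => throw new ArgumentOutOfRangeException(nameof(metric)),
    };

    // A result matches when its distance does not exceed the threshold.
    public bool Accepts(double distance) => distance <= MaxDistance;
}
```

With a design along these lines, the same `new DistanceThreshold(DistanceMetric.Cosine, 0.5)` would mean the same thing for every database that uses cosine distance, while an out-of-range value such as 3.0 for cosine would be rejected at construction time instead of silently returning no results.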

CC @markwallace-microsoft

@matthewbolanos closed this as not planned May 28, 2024

4 participants