Add Similarity Metric Used to leaderboard #766

aminst · 2024-05-20T00:06:43Z

Hi, thanks for this awesome benchmark.
Is it possible to add the similarity metric used by each model to the benchmark? From what I understand, the similarity metric a model was built around determines which metric people should use when storing the generated embeddings in a vector database for later similarity searches.
I believe this would help people to easily choose what similarity metric to use when storing the embeddings.
I can help and add this if it's valuable. Thanks!

KennethEnevoldsen · 2024-05-20T14:29:24Z

@tomaarsen what are your thoughts on adding this to the leaderboard? My guess is that almost all models would use cosine similarity, in which case it wouldn't add much information.

tomaarsen · 2024-05-21T06:27:03Z

@KennethEnevoldsen I do think it makes sense to show this in the leaderboard for all tasks - I think we currently only say it for STS:

Metric: Spearman correlation based on cosine similarity

But the other tasks primarily (exclusively?) use cosine similarity too. Some models/tasks perform a bit better with the (non-normalized) dot product, as it favors longer passages, but they're few and far between and not high on the leaderboard.
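
To illustrate why the choice matters, here is a minimal sketch with made-up 2-d vectors (not real model output). It assumes that longer passages tend to get larger-norm embeddings, which is what lets the non-normalized dot product favor them; the document names and values are hypothetical.

```python
import math

def dot(a, b):
    # Non-normalized dot product: grows with the norm of either vector.
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Dot product of the unit-length vectors: only the angle matters.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

query = [1.0, 0.0]

# Hypothetical embeddings, chosen only to show the effect.
docs = {
    "short_on_topic": [0.9, 0.1],  # small norm, almost parallel to the query
    "long_off_topic": [3.0, 3.0],  # large norm, 45 degrees away from the query
}

best_by_cosine = max(docs, key=lambda name: cosine(query, docs[name]))
best_by_dot = max(docs, key=lambda name: dot(query, docs[name]))

print("cosine picks:", best_by_cosine)
print("dot picks:   ", best_by_dot)
```

With these toy values the two metrics rank the documents in opposite order, which is the kind of mismatch that can appear when embeddings are queried with a metric other than the intended one.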

Tom Aarsen

KennethEnevoldsen · 2024-05-21T08:45:29Z

From my understanding, @aminst refers to the intended distance metric of the model itself (@aminst do correct me if I am wrong) and not the task?

However, I do agree that a model might have been trained with a different metric in mind, and assuming a distance metric seems problematic. I would ideally allow the model to supply the distance metric and then we just report the score (e.g. spearman correlation) for whatever distance metric the model selects.
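
A rough sketch of that idea, in plain Python rather than the benchmark's actual code: the model declares a metric name, the evaluator looks up the matching function and reports the Spearman correlation for it. `SIMILARITY_FUNCTIONS`, `sts_score`, and the toy pairs are hypothetical names and data for illustration only.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical registry: the model supplies one of these names.
SIMILARITY_FUNCTIONS = {"cosine": cosine, "dot": dot}

def rank(values):
    # 1-based ranks; tied values share the mean of their positions.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(rank(x), rank(y))

def sts_score(pairs, gold, metric_name):
    # Score sentence pairs with whatever metric the model declared.
    sim = SIMILARITY_FUNCTIONS[metric_name]
    return spearman([sim(a, b) for a, b in pairs], gold)

# Toy embedding pairs with made-up gold similarity labels.
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([1.0, 0.0], [4.0, 4.0]),
    ([1.0, 0.0], [0.0, 1.0]),
]
gold = [5.0, 3.0, 0.0]

cosine_score = sts_score(pairs, gold, "cosine")
dot_score = sts_score(pairs, gold, "dot")
print(cosine_score, dot_score)
```

On this toy data the same embeddings get a noticeably different score depending on the metric, so reporting the score under the model's declared metric avoids penalizing a model for a metric it was never meant to use.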

aminst · 2024-05-21T14:33:19Z

@KennethEnevoldsen
Yes, that is exactly what I meant. It would be great if the leaderboard also showed the distance metric the model used during training. It would also help people avoid misusing the embeddings with a different metric.
The use case I have in mind is the following. Does it make sense?

1. Somebody wants to convert their data into vector embeddings and store them in a vector database for later retrieval and semantic search.
2. The person uses the leaderboard to find a model to use.
3. They then have to search manually for the distance metric to use, which the leaderboard itself could provide.
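
As a sketch of where the metric enters this workflow: a vector store typically fixes the metric when the collection is created, so it has to be known before any embedding is stored. `TinyVectorStore` below is a hypothetical in-memory stand-in, not the API of any real vector database, and the vectors are made up.

```python
import math

class TinyVectorStore:
    """Hypothetical in-memory store; the metric is fixed at creation time."""

    def __init__(self, metric):
        if metric not in ("cosine", "dot"):
            raise ValueError(f"unsupported metric: {metric!r}")
        self.metric = metric
        self._items = []

    def _prepare(self, vector):
        # For cosine, normalize to unit length so a plain dot product
        # at query time equals the cosine similarity.
        if self.metric == "cosine":
            norm = math.sqrt(sum(x * x for x in vector))
            return [x / norm for x in vector]
        return list(vector)

    def add(self, key, vector):
        self._items.append((key, self._prepare(vector)))

    def search(self, query, k=1):
        q = self._prepare(query)
        scored = [(sum(a * b for a, b in zip(q, v)), key) for key, v in self._items]
        scored.sort(reverse=True)
        return [key for _, key in scored[:k]]

# Made-up embeddings; in practice these would come from the chosen model.
embeddings = {"doc_a": [0.8, 0.2], "doc_b": [2.0, 2.5]}

cosine_store = TinyVectorStore(metric="cosine")
dot_store = TinyVectorStore(metric="dot")
for key, vec in embeddings.items():
    cosine_store.add(key, vec)
    dot_store.add(key, vec)

query = [1.0, 0.0]
cosine_hit = cosine_store.search(query)
dot_hit = dot_store.search(query)
print(cosine_hit, dot_hit)
```

The two stores return different top hits for the same data and query, which is why having the intended metric listed next to the model would save the manual lookup in step 3.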

tomaarsen · 2024-05-21T15:27:41Z

Ohh, I see! Yes, that would indeed be optimal. I realised something similar with Sentence Transformers, so in Sentence Transformers v3 it will be possible to configure the similarity function in the model configuration. This will then be used when calling the new SentenceTransformer.similarity or SentenceTransformer.similarity_pairwise methods.

Additionally, ST models will start reporting their similarity function in the model card automatically, e.g. here.

That should help, at least with ST-based models.

Tom Aarsen

KennethEnevoldsen · 2024-05-22T07:43:55Z

It sounds like this is something that we might consider adding after the additions to ST3. I will leave the issue open, but atm. we probably won't add it in.

KennethEnevoldsen · 2024-06-05T18:16:00Z

I have opened an issue about using a custom similarity function within the benchmark, but the model's own similarity metric we will probably leave to the model card.

edit: will close for now, but feel free to re-open the discussion if you believe that there is more to add.

KennethEnevoldsen added the "leaderboard" label (issues related to the leaderboard) May 20, 2024

KennethEnevoldsen changed the title ~~Similarity Metric Used~~ Add Similarity Metric Used to leaderboard May 22, 2024

KennethEnevoldsen mentioned this issue Jun 5, 2024

Use the model similarity instead of benchmark defined similarity #885

Open

KennethEnevoldsen closed this as completed Jun 5, 2024
