
Improving Search relevancy through Generic Second stage reranker #248

Closed
aliasneo1 opened this issue Aug 8, 2023 · 33 comments
Assignees
Labels
backlog, Features, neural-search, v2.12.0

Comments

@aliasneo1

Issue Description:
We are currently utilizing a neural retriever based on the bi-encoder vector search method. However, it has come to our attention that the performance of the bi-encoder approach is suboptimal when compared to the cross-encoder method, as highlighted in the referenced research paper (link).

Desired Solution:
We propose the integration of both Cross-Encoder and Bi-Encoder methods to enhance retrieval performance, particularly in scenarios involving large datasets. Cross-Encoders demonstrate superior performance, but they encounter scalability challenges with extensive datasets. To address this, a hybrid approach can be employed in scenarios like Information Retrieval and Semantic Search. Here's the suggested process:

Initiate retrieval using an efficient Bi-Encoder to identify the top 100 most similar sentences for a given query.
Subsequently, employ a Cross-Encoder to re-rank the initial 100 matches. This involves computing scores for each (query, hit) pairing.
By incorporating a Cross-Encoder-based re-ranker after the initial retrieval, a notable enhancement in final results for users can be achieved.
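
As an illustration, the retrieve-then-rerank flow above could be sketched in Python. The scoring functions below are hypothetical stand-ins (token overlap instead of real model scores): in practice the first stage would be a bi-encoder/ANN lookup and the second stage a cross-encoder model.

```python
# Two-stage retrieval sketch: cheap first-stage scoring over the whole corpus,
# expensive second-stage re-scoring over only the top candidates.
# Both scorers are hypothetical stand-ins for real bi-/cross-encoder models.

def bi_encoder_score(query: str, doc: str) -> float:
    # Stand-in for a vector-similarity score: Jaccard overlap of tokens.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q | d), 1)

def cross_encoder_score(query: str, doc: str) -> float:
    # Stand-in for a (query, doc) cross-encoder: rewards an exact phrase match
    # that a bag-of-words first stage cannot distinguish.
    return bi_encoder_score(query, doc) + (1.0 if query.lower() in doc.lower() else 0.0)

def retrieve_and_rerank(query: str, corpus: list[str],
                        retrieve_k: int = 100, final_k: int = 10) -> list[str]:
    # Stage 1: score every document with the cheap bi-encoder, keep retrieve_k.
    candidates = sorted(corpus, key=lambda doc: bi_encoder_score(query, doc),
                        reverse=True)[:retrieve_k]
    # Stage 2: re-rank only those candidates with the expensive cross-encoder.
    return sorted(candidates, key=lambda doc: cross_encoder_score(query, doc),
                  reverse=True)[:final_k]

corpus = [
    "the cat sat on the mat",
    "machine learning for search",
    "search relevance with machine learning models",
]
results = retrieve_and_rerank("machine learning", corpus, retrieve_k=3, final_k=2)
```

The key design point is that the cross-encoder cost is bounded by `retrieve_k`, not by the corpus size.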

Considered Alternatives:
We have evaluated several alternative solutions and features in pursuit of improved retrieval performance. However, none have proven as effective as the combined Cross-Encoder and Bi-Encoder approach proposed above.

Additional Context:
For a more comprehensive understanding, any supplementary context or relevant screenshots related to this feature request will be provided as necessary. Your consideration of this enhancement would be greatly appreciated.

@msfroh

msfroh commented Aug 9, 2023

@navneet1v -- Let's move this to the neural-search repo

@macohen

macohen commented Aug 9, 2023

@opensearch-project/admin I think we need an OpenSearch maintainer or admin to move this to https://github.com/opensearch-project/neural-search

@prudhvigodithi prudhvigodithi transferred this issue from opensearch-project/OpenSearch Aug 9, 2023
@navneet1v
Collaborator

@aliasneo1 thanks for opening this GitHub issue. This seems very interesting. I have a couple of questions related to this:

  1. Where in the overall query flow of OpenSearch do we want the re-ranking to happen? Can you please add some details around that?

@msfroh msfroh removed the untriaged label Aug 16, 2023
@navneet1v navneet1v added the Features Introduces a new unit of functionality that satisfies a requirement label Aug 25, 2023
@GauravTech1986

GauravTech1986 commented Sep 8, 2023

Waiting eagerly for this feature! Do we have an estimated release date for it? I believe this is an important use case for search

@navneet1v navneet1v added help wanted Extra attention is needed backlog All the backlog features should be marked with this label labels Sep 15, 2023
@austintlee

@navneet1v @ylwu-amzn What do you guys think about adding support for this in the RAG pipeline? This is a perfect use case for RAG. We can add a search response processor (similar to the Kendra re-ranker) that makes use of cross encoders.

What are some good candidates for pre-trained models to bring into ml-commons? Any suggestions?

@navneet1v
Collaborator

What do you guys think about adding support for this in the RAG pipeline?

what do you mean by this?

We can add a search response processor (similar to the Kendra re-ranker) that makes use of cross encoders.

I think there has been some thoughts given to this, but if you want to start coming up with a proposal for this feel free to do that. As per my understanding we want to build this feature in Neural search. So if you are interested feel free to create a proposal for that.

What are some good candidates for pre-trained models to bring into ml-commons? Any suggestions?

No idea around this. This needs to be researched.

@austintlee

A desired solution was already stated above.

Initiate retrieval using an efficient Bi-Encoder to identify the top 100 most similar sentences for a given query.
Subsequently, employ a Cross-Encoder to re-rank the initial 100 matches.

Since KNN does the first part, we need a search processor that does the second part (re-ranking). We have a reranker that uses Amazon Kendra. What I had in mind is a reranker that uses a cross encoder which executes in a search response processor. I'm hoping that this can help improve the results that the RAG processor feeds to LLMs. So, in this use case, the reranker would run after the hybrid search processor runs.

ml-commons might be a better place for this since cross encoders can run on BM25 (or sparse vector) results (independent of semantic search/KNN).

@navneet1v
Collaborator

@austintlee in terms of improving search relevance, we are targeting putting features in the Neural Search plugin, or we should put a basic re-ranking interface in core. ML Commons can provide the model that does the re-ranking, but in terms of providing the interface for re-ranking, ML Commons is not the right option. It's closely related to search.

@austintlee

I don't think of ml-commons as just a model serving layer, although I think we do want to use it for serving cross encoders. I mainly want to see this functionality come to life. Even if we put this in neural-search, it's still going to be a search processor, right?

@navneet1v
Collaborator

I don't think of ml-commons as just a model serving layer

Actually, it is. That is why I was thinking of building the re-ranking feature outside of ML Commons, so that users can write their own re-rankers that use models not served by ML Commons.

So here is what I was thinking:

  1. Define a standard/extensible Re-ranking Search Results Processor interface in OpenSearch core, to be used by users to create more specific re-rankers using a library, a remote model, cross encoders, etc.
  2. The Neural Search plugin extends that interface to create a re-ranker that uses a model provided by ML Commons.
  3. Other plugins, external or internal to the OpenSearch Project, can use the extensible re-ranker interface to query an external re-ranking service to do the re-ranking for them, with more personalized, user-based re-ranking.
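
A language-agnostic sketch of that layering (in Python here, though the actual OpenSearch interfaces would be Java); every class and method name below is hypothetical, not an existing API:

```python
from abc import ABC, abstractmethod

class RerankProcessor(ABC):
    """Step 1: a generic re-ranking interface a core search pipeline could define."""

    @abstractmethod
    def rescore(self, query: str, hits: list[dict]) -> list[float]:
        """Return a new score for each (query, hit) pair."""

    def process(self, query: str, hits: list[dict]) -> list[dict]:
        # Shared logic: apply the new scores and re-sort the hits.
        for hit, score in zip(hits, self.rescore(query, hits)):
            hit["_score"] = score
        return sorted(hits, key=lambda h: h["_score"], reverse=True)

class MLCommonsRerankProcessor(RerankProcessor):
    """Step 2: a Neural Search implementation backed by an ML Commons model."""

    def __init__(self, model_id: str, predict):
        self.model_id = model_id
        self.predict = predict  # stand-in for an ML Commons predict call

    def rescore(self, query, hits):
        return [self.predict(self.model_id, query, h["_source"]["text"]) for h in hits]

# Step 3 would be an external plugin providing its own RerankProcessor
# subclass that calls a remote, user-personalized re-ranking service instead.
```

The point of the split is that the sort-and-replace mechanics live in the shared interface, while each implementation only decides where new scores come from.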

@navneet1v
Collaborator

Even if we put this in neural-search, it's still going to be a search processor, right?

Yes, the initial thought is exactly that. But I was stepping back to look at the feature more broadly and see if we can do better.

Plus, there is one big question we need to answer before we start working: does re-ranking using a Cross-Encoder improve search relevance? If yes, by how much, and if we compare it with techniques like Normalization and Score Combination, what is the difference?

@navneet1v
Collaborator

@austintlee I have added this in the VectorDB hot backlog. @vamshin lets see if we can prioritize this.

@austintlee

Plus, there is one big question we need to answer before we start working: does re-ranking using a Cross-Encoder improve search relevance? If yes, by how much, and if we compare it with techniques like Normalization and Score Combination, what is the difference?

I know this is an important question, and we want to get at least a rough idea of how much improvement, if any, this will get us before we take it on. But at the same time, there clearly seems to be appetite from the community for making this feature available so people can experiment on their own data.

@austintlee

I'm super interested in the vector db roadmap. Can we have a public meeting where we can discuss hot topics, what's coming, etc?

@navneet1v
Collaborator

navneet1v commented Oct 17, 2023

I'm super interested in the vector db roadmap. Can we have a public meeting where we can discuss hot topics, what's coming, etc?

@austintlee
Thanks for showing interest. We are actually working on getting it up and running. I will check with @vamshin where we are on that.

FYI this is the public roadmap: https://github.com/orgs/opensearch-project/projects/145

@samuel-oci

Plus, there is one big question we need to answer before we start working: does re-ranking using a Cross-Encoder improve search relevance? If yes, by how much, and if we compare it with techniques like Normalization and Score Combination, what is the difference?

@navneet1v curious, are you saying that you see re-ranking and normalization as mutually exclusive? Or maybe I misunderstood?
Also, I have seen some solutions that have up to 3 layers of re-ranking, with each one applying additional sorting and filtering. Are we planning to support multiple tiers of re-ranking?
Should we just treat the normalization and score combination step as orthogonal to re-ranking? Re-ranking seems like just another step that can come in the response processing pipeline in a specific order.

@austintlee

That's how I am approaching this: another search response processor. Maybe there is a processor that is tailored to re-ranking, but it would just be an extension of a search processor.

@HenryL27
Contributor

HenryL27 commented Nov 1, 2023

Plus there is one big question that we need to ans before we start working is, does re-ranking using Cross Encoder improve Search relevance, if yes by how much and if we compare it with techniques like Normalization and Score Combination what is the diff.

@navneet1v Some metrics from experiments using a reranker in python-land on a customer dataset:

|             | recall@5 | recall@100 | MRR      | avg. pos. of first gold |
|-------------|----------|------------|----------|-------------------------|
| no reranker | 0.607456 | 0.975877   | 0.326642 | 8.526316                |
| reranker    | 0.692982 | 0.975877   | 0.557356 | 2.842105                |

These experiments were run using hybrid search (min-max norm, linear combination, weighted (0.111, 0.889) towards neural over BM25) with embedding model "thenlper/gte-small" and reranker "BAAI/bge-reranker-large", reranking the top 100 documents. There are about 1000 documents in the index and 19 (question, docs) pairs (each of the 19 questions matches about 1-3 relevant documents).
I have tried a couple of other hybrid search configurations (though not exhaustively) and haven't been able to beat these. So I'd say that, at least on this particular dataset, yes, reranking is worth it.
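
For reference, the metrics in the table above can be computed as follows. This is a generic sketch, not the script used for the experiment; `ranked` is a result list of document ids and `gold` is the set of relevant ids for a query.

```python
def recall_at_k(ranked: list[str], gold: set[str], k: int) -> float:
    # Fraction of relevant documents that appear in the top k results.
    return len(set(ranked[:k]) & gold) / len(gold)

def reciprocal_rank(ranked: list[str], gold: set[str]) -> float:
    # 1 / position of the first relevant ("gold") document; 0 if none is found.
    for pos, doc_id in enumerate(ranked, start=1):
        if doc_id in gold:
            return 1.0 / pos
    return 0.0

def mrr(runs: list[tuple[list[str], set[str]]]) -> float:
    # MRR = mean of reciprocal ranks over all (ranked, gold) query runs.
    return sum(reciprocal_rank(r, g) for r, g in runs) / len(runs)
```

The "avg. pos. of first gold" column is simply the mean of the positions that `reciprocal_rank` inverts, averaged over queries.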

@navneet1v
Collaborator

navneet1v commented Nov 2, 2023

@HenryL27 thanks for sharing the results. This looks really awesome. I think this is good evidence. I have already put this task in the hot backlog, meaning it will be picked up soon.

@vamshin do you have some timeline in mind around this?

@austintlee

Can we target this for 2.12?

@HenryL27
Contributor

HenryL27 commented Nov 2, 2023

@navneet1v @vamshin Is someone working on this? If not, I'll take it

@vamshin
Member

vamshin commented Nov 2, 2023

@HenryL27 thanks for your interest. We are looking into this. We will need your support with RFC/Code reviews.

@HenryL27
Contributor

HenryL27 commented Nov 2, 2023

@vamshin so yes, someone is working on this? What's the timeline? Any sub-issues I can pick up? I don't see any issues yet.

@vamshin
Member

vamshin commented Nov 2, 2023

@HenryL27 we started scoping this work and are using this GitHub issue as a feature request. As a first step, we will do an RFC to get community feedback on the approaches. Our idea is to build a more generic (multi-stage) reranker capable of passing metadata (user context) and OpenSearch results to a remote connector for reranking results.

Let me get back on the timelines. Happy to collaborate.

@HenryL27
Contributor

HenryL27 commented Nov 2, 2023

@vamshin We have a customer who wants this now - can we scope this down to a simple 1-stage reranker and then expand it later?

Here's the API spec I have in mind

APIs

Create Rerank Pipeline

```
PUT /_search/pipeline/<pipeline-name>
{
  "response_processors": [
    {
      "neural-rerank": {
        "top_k": int (how many to rerank),
        "model_id": id of cross-encoder,
        "context_field": str (source field to compare to query)
      }
    }
  ]
}
```

Query Rerank Pipeline

```
POST index/_search
{
  "query": {...},
  "ext": {
    "rerank": {
      "query_text": str (query text to compare)
    }
  }
}
```

or alternatively

```
POST index/_search
{
  "query": {...},
  "ext": {
    "rerank": {
      "query_text_path": str (path in the search body to the query text)
    }
  }
}
```

For example, with a neural query we might have

`"query_text_path": "query.neural.embedding.query_text"`

The rerank processor will evaluate the "top_k" search results, and then sort them based on the new scores. Documents outside of the top k will be left in place and not evaluated, meaning that they might conceivably have higher "_score" values than the reranked documents, while being lower on the list. This will also override any sorts in the search query, although I think the use case for sorts and semantic reranking are largely non-overlapping.
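
The "top_k reranked, everything else left in place" behavior described above could look like this in outline (a Python sketch, not the actual processor; `rescore` stands in for the cross-encoder call and is hypothetical):

```python
def apply_rerank(hits: list[dict], top_k: int, rescore) -> list[dict]:
    # Rescore and re-sort only the first top_k hits.
    head = hits[:top_k]
    for hit in head:
        hit["_score"] = rescore(hit)
    head.sort(key=lambda h: h["_score"], reverse=True)
    # Hits beyond top_k keep their original scores and positions, so they may
    # end up with a higher _score than a reranked hit while sitting below it.
    return head + hits[top_k:]

hits = [{"id": "a", "_score": 3.0}, {"id": "b", "_score": 2.0}, {"id": "c", "_score": 1.0}]
reranked = apply_rerank(hits, top_k=2, rescore=lambda h: 1.0 if h["id"] == "b" else 0.5)
# "c" keeps its original _score of 1.0, higher than "a"'s new 0.5, yet stays last.
```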

Upload Cross Encoder Model

POST /_plugins/_ml/models/_upload
{
  "name": "model name",
  "version": "1.0.0 or something",
  "description": "description",
  "model_format": "TORCH_SCRIPT",
  "function_name": "TEXT_SIMILARITY",
  "model_content_hash_value": "hash browns",
  "url": "https://url-of-model"
}

This is not a new API or anything, and all the other model-based APIs should still work for the cross encoder model/function name with minimal work to integrate.

Basically, a simple search response processor that a user can plug in to whatever search request they have, that looks a lot like how a neural search works so should be familiar. That would solve probably 90% of use cases.

@HenryL27
Contributor

HenryL27 commented Nov 3, 2023

@vamshin RFC draft before I post it for real

@dylan-tong-aws

All, I'm catching up on this thread. This is on our product roadmap. We're going to implement a generic second-stage re-ranking search pipeline. You will be able to integrate a second-stage re-ranker like a cross-encoder via the AI connectors available in ml-commons. This functionality will be integrated with the neural search and LTR experience.

I made a forum post a while back to collect community feedback.

@austintlee

@dylan-tong-aws Henry is essentially signing up to do the work described in that post.

@vamshin
Member

vamshin commented Nov 4, 2023

@HenryL27 please feel free to publish RFC and we can let community provide the feedback. Thanks

@HenryL27
Contributor

HenryL27 commented Nov 4, 2023

See #485

@vamshin
Member

vamshin commented Dec 11, 2023

@HenryL27 @navneet1v Is it ok to change the title to "Generic Second stage reranker to Improve search relevancy"?

@navneet1v
Collaborator

I don't see any concern. Let me change this.

@navneet1v navneet1v changed the title Improving Retrieval Performance through Combined Cross-Encoder and Bi-Encoder Approach Improving Search relevancy through Generic Second stage reranker Dec 11, 2023
@vamshin vamshin added v2.12.0 Issues targeting release v2.12.0 and removed v2.13.0 labels Feb 21, 2024
@vamshin
Member

vamshin commented Feb 21, 2024

Closing this issue as it's released in 2.12.0.

@vamshin vamshin closed this as completed Feb 21, 2024