
BedrockRerank using newly available Cohere embedding model #13562

Open
austinmw opened this issue Nov 18, 2023 · 10 comments
Labels
🔌: aws · Ɑ: embeddings · 🤖:enhancement · 🤖:question

Comments


austinmw commented Nov 18, 2023

Feature request

Hi, the new Cohere embedding models are now available on Amazon Bedrock. How can we use them for their reranking capability (instead of just embedding via the BedrockEmbeddings class)?

Motivation

These models perform well for reranking.


dosubot bot commented Nov 18, 2023

🤖

Hello @austinmw! Great to see you again in the LangChain community. I'm diving into your issue about using the new Cohere embedding models on Amazon Bedrock for reranking. Expect a detailed response soon.

@dosubot dosubot bot added the Ɑ: embeddings, 🤖:enhancement, 🤖:question, and 🔌: aws labels Nov 18, 2023

dosubot bot commented Nov 18, 2023

🤖

Hi @austinmw, great to see you back on the LangChain repository! I appreciate your continuous interest and contributions.

To utilize the reranking capability of the new Cohere embedding models available on Amazon Bedrock in the LangChain framework, you would need to modify the _embedding_func method in the BedrockEmbeddings class.

Currently, this method is only set up to handle embedding tasks. To add reranking functionality, you would need to modify the input and output handling to match the requirements of the reranking task.

Here's an example of how you might add a _reranking_func method, modeled on _embedding_func, to handle reranking:

import json
import os
from typing import List

def _reranking_func(self, query: str, documents: List[str]) -> List[str]:
    """Call out to a hypothetical Bedrock reranking endpoint."""
    # Replace newlines, which can negatively affect performance.
    query = query.replace(os.linesep, " ")
    documents = [doc.replace(os.linesep, " ") for doc in documents]

    # Format the input body for the provider.
    provider = self.model_id.split(".")[0]
    _model_kwargs = self.model_kwargs or {}
    input_body = {**_model_kwargs}
    if provider == "cohere":
        input_body.setdefault("input_type", "rerank")
        input_body["query"] = query
        input_body["documents"] = documents
    else:
        # Covers other providers, such as "amazon".
        input_body["inputText"] = query
        input_body["documents"] = documents
    body = json.dumps(input_body)

    try:
        # Invoke the Bedrock API.
        response = self.client.invoke_model(
            body=body,
            modelId=self.model_id,
            accept="application/json",
            contentType="application/json",
        )
        # Parse the response; a provider-specific branch could go here,
        # but both providers are assumed to return the same key.
        response_body = json.loads(response.get("body").read())
        return response_body.get("reranked_documents")
    except Exception as e:
        raise ValueError(f"Error raised by inference endpoint: {e}")

This function takes a query and a list of documents as input, and returns a list of reranked documents. The reranking is done by the Cohere model on the Bedrock platform.

Please note that this is a hypothetical example and the actual implementation may vary based on the specific requirements of the reranking task and the specific API of the Cohere model on the Bedrock platform.

Additionally, the LangChain framework does support reranking functionality. This is evident in the MapRerankDocumentsChain class in the map_rerank.py file. The MapRerankDocumentsChain class combines documents by mapping a chain over them and then reranking the results. This is done by calling an LLMChain on each input document. The LLMChain is expected to have an OutputParser that parses the result into both an answer (answer_key) and a score (rank_key). The answer with the highest score is then returned.

The reranking functionality is implemented in the _process_results method, where the results are sorted based on the score (rank_key), and the answer with the highest score is returned.

Here is the relevant code snippet:

from typing import Dict, List, Sequence, Tuple, Union, cast

from langchain_core.documents import Document

def _process_results(
    self,
    docs: List[Document],
    results: Sequence[Union[str, List[str], Dict[str, str]]],
) -> Tuple[str, dict]:
    typed_results = cast(List[dict], results)
    # Sort (result, doc) pairs by descending score, then keep the best.
    sorted_res = sorted(
        zip(typed_results, docs), key=lambda x: -int(x[0][self.rank_key])
    )
    output, document = sorted_res[0]
    extra_info = {}
    if self.metadata_keys is not None:
        for key in self.metadata_keys:
            extra_info[key] = document.metadata[key]
    if self.return_intermediate_steps:
        extra_info["intermediate_steps"] = results
    return output[self.answer_key], extra_info
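For example, with rank_key="score" and answer_key="answer", results of [{"answer": "A", "score": "3"}, {"answer": "B", "score": "9"}] would sort "B" to the front, and the method would return "B".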

I hope this helps! Let me know if you have any other questions.


@austinmw changed the title from "Cohere Reranker on Bedrock" to "BedrockRerank using newly available Cohere embedding model" on Nov 18, 2023

dentro-innovation commented Feb 2, 2024

@austinmw Are you sure the reranking models are also available on Bedrock?

Cohere's webpage states that Rerank isn't available there yet (https://cohere.com/deployment-options/aws):

[Screenshot of Cohere's AWS deployment page showing Rerank as not yet available on Bedrock]


austinmw commented Feb 2, 2024

Hi @dentro-innovation, the "Cohere Rerank" model is not available on Bedrock; however, the latest Cohere embedding models (cohere-embed-english-v3.0 and cohere-embed-multilingual-v3.0) both appear on the Hugging Face MTEB Reranking leaderboard (ranked 22 and 25, respectively), so my assumption was that they could be configured for reranking use cases.

@dentro-innovation

I wasn't aware that you could do reranking with an embedding model; I thought the two were different. But you're right, there are embedding models listed on the MTEB leaderboard for reranking.

Do you have any resource that explains the relationship between embedding and reranking models? Or can you explain it in your own words, @austinmw? I'd greatly appreciate it, as a web search and an LLM query didn't really resolve my confusion.


austinmw commented Feb 2, 2024

I have some questions myself, to be honest. My assumption is that you would:

  1. embed the query
  2. embed each retrieved document
  3. calculate the cosine similarity between the embedded query and each embedded document
  4. sort the retrieved documents according to these similarity scores

But it's possible I could be missing some intricacy of this model and how it should be used for reranking purposes. I posted a question on the Hugging Face model page.
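A minimal sketch of that four-step procedure, assuming LangChain's BedrockEmbeddings class (the cohere.embed-english-v3 model ID is an assumption; verify it against the Bedrock model catalog for your region):

import numpy as np
from langchain_community.embeddings import BedrockEmbeddings

# Assumed model ID; check the Bedrock model catalog for your region.
embeddings = BedrockEmbeddings(model_id="cohere.embed-english-v3")

def rerank_by_cosine(query: str, docs: list) -> list:
    """Steps 1-4 above: embed the query and documents, score each document
    by cosine similarity to the query, and return them sorted best-first."""
    q = np.array(embeddings.embed_query(query))                         # step 1
    d = np.array(embeddings.embed_documents(docs))                      # step 2
    scores = d @ q / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))    # step 3
    return sorted(zip(docs, scores), key=lambda p: p[1], reverse=True)  # step 4

Note that Cohere's v3 embedding models distinguish search_query from search_document input types, so whether this matches the model's intended reranking usage is exactly the open question here.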

@drobbins-ancile

I have some questions myself, to be honest. My assumption is that you would:

  1. embed the query
  2. embed each retrieved document
  3. calculate the cosine similarity between the embedded query and each embedded document
  4. sort the retrieved documents according to these similarity scores

But it's possible I could be missing some intricacy of this model and how it should be used for reranking purposes. I posted a question on the Hugging Face model page.

Did you get an answer to this question? It's still not super clear to me after reading the response you got on the HF model page.

We are trying to determine whether the Cohere reranker simply generates embeddings of the query/document and calculates a cosine similarity, or whether it does something else. If it's the former, we are looking to create an app that will do the same process and compare the results across different embedding models.

@Kevin-McIsaac

I have some questions myself, to be honest. My assumption is that you would:

  1. embed the query
  2. embed each retrieved document
  3. calculate the cosine similarity between the embedded query and each embedded document
  4. sort the retrieved documents according to these similarity scores

My understanding is that there are two approaches; see cross-encoders-as-reranker:

  1. Bi-Encoding: This is what is described above, and it is what happens when you use embeddings and a vector database to look up documents by cosine similarity.
  2. CrossEncoder: Here the question is paired with each of the retrieved results and passed into a sentence transformer, which calculates the similarity. This is different from the above, and I think it is what the Cohere reranker implements.

My guess is we need to add a Bedrock Cohere reranker to the LangChain CrossEncoderReranker.

The closest I could find to this is the [HuggingFaceCrossEncoder](https://python.langchain.com/docs/integrations/document_transformers/cross_encoder_reranker/), so we would need a BedRockCohereCrossEncoder().
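For reference, the cross-encoder path from that docs page looks roughly like this; the model runs locally as a stand-in until Bedrock offers a hosted reranker (BAAI/bge-reranker-base is the example model from the docs):

from langchain.retrievers.document_compressors import CrossEncoderReranker
from langchain_community.cross_encoders import HuggingFaceCrossEncoder
from langchain_core.documents import Document

# Each (query, document) pair is scored jointly by the model, rather than
# comparing two independently computed embeddings as in the bi-encoder case.
model = HuggingFaceCrossEncoder(model_name="BAAI/bge-reranker-base")
compressor = CrossEncoderReranker(model=model, top_n=3)

docs = [Document(page_content=t) for t in ["Paris is the capital of France.",
                                           "Berlin is the capital of Germany."]]
reranked = compressor.compress_documents(docs, "What is the capital of France?")

A hypothetical BedRockCohereCrossEncoder would implement the same score(text_pairs) interface as HuggingFaceCrossEncoder, but call a Bedrock endpoint instead.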


rajib76 commented May 15, 2024

@austinmw My understanding of cross-encoder reranking (with an LLM) is as below. You do not use the embedding model but the language model.

You send the question and each document (already retrieved through the embedding model) to the language model and ask it to answer "YES" or "NO". Get the logprob of the answer, convert it to a probability, and then rerank based on a threshold probability. If you use the embedding model for reranking, that is no different from a plain semantic match; you could just as well do a semantic match/cosine similarity with the embedding model. Why would we need to wrap that as a reranker?
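A schematic of that yes/no-logprob approach; get_yes_logprob is a hypothetical helper standing in for whichever logprob-capable completion API you use:

import math
from typing import List, Tuple

def get_yes_logprob(query: str, document: str) -> float:
    """Hypothetical helper: prompt the LLM with the query and document,
    constrain the answer to "YES" or "NO", and return the log-probability
    the model assigned to the "YES" token."""
    raise NotImplementedError

def rerank_by_relevance(query: str, docs: List[str],
                        threshold: float = 0.5) -> List[Tuple[str, float]]:
    # Convert each logprob to a probability, drop documents below the
    # threshold, and sort the rest best-first.
    scored = [(doc, math.exp(get_yes_logprob(query, doc))) for doc in docs]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)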


peebles commented May 25, 2024

First, we need Amazon to add the SageMaker Cohere Rerank model to Bedrock (and support on-demand per-token/??? pricing!). I for one really hope they do this soon.
