[ai-agents] Introduce the re-rank agent with MMR ranking #502
Conversation
Using this agent, is it possible to, say, query the vector store for 100 semantically similar results but return only the 10 most diverse from that larger set?
I had the same thought; let me add this feature and expose a "max" property.
This comment is maybe a little late, but a question I had looking at this: I haven't had time to dig into whether there's a best practice for which similarity metrics to use for relevance and diversity. However, I wonder if we want to let the user optionally choose between cosine and BM25. My thought process: given that our primary use case is genAI and most of our customers aren't doing those preprocessing steps, BM25 may not actually work as well as it should, and they may be better off using cosine similarity for relevance.
Currently for MMR we need both BM25 and cosine similarity. With this new agent you can retrieve more documents from the database and keep only a selection that is "diverse enough" but still "relevant". Additional note: with LangStream it is pretty easy to pre-process the data before inserting the documents into the vector database, and you can apply the same preprocessing to normalise the "query" in your chat-completion pipeline. Let's follow up on Slack, or you can open a "Discussion" or a GH ticket; this PR has been closed, so nobody will find it easily.
Could you explain more about why we need both? I read the original MMR paper, and it says you can use the same similarity function for both relevance and diversity.
Cosine similarity is not enough to reduce redundant documents. With IDF we can ensure that we are not passing duplicate content to the LLM, since tokens have a cost.
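A toy illustration of the point above (this is not the agent's exact scoring, just a sketch of why token-level weighting helps): an IDF-weighted token overlap scores verbatim duplicates at the top while discounting documents that only share common words, so duplicates can be filtered before they cost prompt tokens.

```python
import math
from collections import Counter

def idf_weights(docs):
    # Inverse document frequency over a small corpus of tokenised docs.
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((n + 1) / (count + 1)) + 1 for tok, count in df.items()}

def weighted_overlap(a, b, idf):
    # IDF-weighted token overlap: 1.0 for exact duplicates,
    # low for documents that only share frequent (low-IDF) words.
    shared = set(a) & set(b)
    union = set(a) | set(b)
    num = sum(idf.get(t, 0.0) for t in shared)
    den = sum(idf.get(t, 0.0) for t in union)
    return num / den if den else 0.0

docs = [["the", "cat", "sat"], ["the", "cat", "sat"], ["the", "dog", "ran"]]
idf = idf_weights(docs)
dup_score = weighted_overlap(docs[0], docs[1], idf)      # exact duplicate -> 1.0
distinct_score = weighted_overlap(docs[0], docs[2], idf)  # only "the" shared -> low
```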
Summary
Description of MMR
In order to perform MMR you need two functions: one to compute "diversity" and one to compute "relevance".
We use BM25 + IDF to compute "relevance", and the average "cosine similarity" to compute "diversity".
There are a few parameters to tune the algorithm.
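The greedy selection can be sketched as follows. This is a simplified illustration, not the agent's implementation: relevance here is plain cosine-to-query for brevity (the agent uses BM25 + IDF), diversity is the average cosine similarity to the already-selected documents, and `lam` is the usual MMR trade-off weight.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def mmr_select(query_emb, docs, lam=0.5, max_docs=10):
    """Greedy MMR over docs, a list of (text, embedding) pairs.

    At each step, pick the document maximising
    lam * relevance - (1 - lam) * diversity,
    where diversity is the average similarity to documents already picked.
    """
    selected, remaining = [], list(docs)
    while remaining and len(selected) < max_docs:
        best, best_score = None, -float("inf")
        for doc in remaining:
            relevance = cosine(query_emb, doc[1])
            if selected:
                diversity = sum(cosine(doc[1], s[1]) for s in selected) / len(selected)
            else:
                diversity = 0.0
            score = lam * relevance - (1 - lam) * diversity
            if score > best_score:
                best, best_score = doc, score
        selected.append(best)
        remaining.remove(best)
    return selected

# A duplicate of the top hit is skipped in favour of a less similar document.
docs = [("d1", [1.0, 0.0]), ("d1-dup", [1.0, 0.0]), ("d2", [0.6, 0.8])]
picked = [text for text, _ in mmr_select([1.0, 0.0], docs, lam=0.3, max_docs=2)]
```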
How do I use this feature ?
You need to provide a "query" and a set of "documents", along with the "embeddings" for the query and for each document.
This is easy in the standard chatbot pipeline: we compute the embeddings of the query before performing the vector search, and the vector search can return both the text and the embeddings stored in the database.
With the "max" parameter you can limit the number of documents returned. This is useful when you want to fetch many documents from the vector database but use only a few of them to build the prompt.
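For example, a pipeline step along these lines retrieves a large candidate set and keeps only the most diverse few. This is an illustrative sketch: the configuration keys (`field`, `query-text`, `query-embeddings`, `text-field`, `embeddings-field`, `max`) are my reading of the agent introduced here and may not match the final names exactly.

```yaml
- name: "re-rank documents with MMR"
  type: "re-rank"
  configuration:
    algorithm: "MMR"
    max: 10                                      # keep only the 10 most diverse
    field: "value.related_documents"             # candidates from the vector search
    output-field: "value.related_documents"
    query-text: "value.question"
    query-embeddings: "value.question_embeddings"
    text-field: "record.text"
    embeddings-field: "record.embeddings"
```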
Parameters
Notes
You must pre-compute embeddings for the query and for all of the documents. This is usually easy because you typically perform a vector search before this step, so you already have both the text and the embeddings vector for each document.