# Basic RAG using Lexy

## Introduction

Let's go through a basic implementation of Retrieval-Augmented Generation (RAG) using Lexy. RAG is the process of using a retriever to find relevant documents and including them in the prompt as context for a language model.

In this example, we'll use Lexy to store and retrieve documents describing characters from the TV show House of the Dragon. We'll then use those documents to construct a prompt that GPT-4 can use to answer questions about the characters.

### OpenAI API Key

Note that this example requires an OpenAI API key. You can add your API key as an environment variable using the `.env` file in the root directory of the Lexy repository. See [How do I add a new environment variable](../faq.md#how-do-i-add-a-new-environment-variable) on the [FAQ page](../faq.md) for more details.

```shell title=".env"
OPENAI_API_KEY=your_secret_api_key
```

After adding the environment variable, rebuild your containers by running the following on the command line:

```shell
make update-dev-containers
```
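
Before running the rest of the example, it can be worth confirming that the key is actually visible to Python in whichever environment you run it. The sketch below is a generic check using only the standard library. The helper name `has_openai_key` is hypothetical and is not part of Lexy.

```python
import os


def has_openai_key(env=None) -> bool:
    """Return True if OPENAI_API_KEY is set to a non-empty value."""
    # `env` defaults to the real process environment; passing a dict makes
    # the check easy to try out without touching real settings.
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not has_openai_key():
    print("OPENAI_API_KEY is not set; add it to .env and rebuild the containers.")
```

If the message is printed, double-check the `.env` file and rebuild the containers again.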

## Sample data

Our data is in the `sample_data/documents` directory. Let's load it and take a look at the first three lines.

In [None]:
with open("../sample_data/documents/hotd.txt") as f:
    lines = f.read().splitlines()
    
lines[:3]

## Add documents to Lexy

Now let's instantiate a Lexy client and create a new collection for our documents.

In [None]:
from lexy_py import LexyClient

lexy = LexyClient()

# create a new collection
collection = lexy.create_collection(
    collection_id="house_of_the_dragon", 
    description="House of the Dragon characters"
)
collection

We can add documents to our new collection using the `add_documents` method.

In [None]:
collection.add_documents([
    {"content": line} for line in lines
])
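
Each document in the call above is a plain dictionary with a `content` key. If your own text file might contain blank lines, one option is to filter them out before building the list. This is a minimal sketch with made-up sample lines; `build_documents` is a hypothetical helper, not a Lexy function.

```python
def build_documents(lines):
    """Turn raw text lines into {"content": ...} dicts, skipping blank lines."""
    return [{"content": line.strip()} for line in lines if line.strip()]


# Made-up sample lines, for illustration only
sample_lines = [
    "First character description.",
    "",
    "   ",
    "Second character description.",
]
docs = build_documents(sample_lines)
print(docs)
```

The resulting list has the same shape as the one passed to `add_documents` above.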

## Create an Index and Binding

We'll create a binding to embed each document, and an index to store the resulting embeddings. We're going to use the OpenAI embedding model `text-embedding-3-small` to embed our documents. See the [OpenAI API documentation](https://platform.openai.com/docs/guides/embeddings) for more information on the available embedding models.

In [None]:
# create an index
index_fields = {
    "embedding": {"type": "embedding", "extras": {"dims": 1536, "model": "text.embeddings.openai-3-small"}}
}
index = lexy.create_index(
    index_id="hotd_embeddings", 
    description="Text embeddings for House of the Dragon collection",
    index_fields=index_fields
)

To embed each document and store the result in our index, we'll create a Binding, which connects our Collection and Index using a Transformer. The `transformers` property shows a list of available Transformers. In this example, we'll use `text.embeddings.openai-3-small`.

In [None]:
# list of available transformers
lexy.transformers

In [None]:
# create a binding
binding = lexy.create_binding(
    collection_id="house_of_the_dragon",
    index_id="hotd_embeddings",
    transformer_id="text.embeddings.openai-3-small"
)
binding

## Retrieve documents

Now we can query our index using similarity search to retrieve the most relevant documents for a given query. Let's test this out by retrieving the 2 most relevant documents for the query "Rhaenyra Targaryen".

In [None]:
index.query("Rhaenyra Targaryen", k=2)
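
To build some intuition for what similarity search does, here is a small self-contained sketch that ranks a few toy vectors by cosine similarity to a query vector. It only illustrates the general idea and is not how Lexy implements `index.query`. The 3-dimensional vectors and the `top_k` helper are invented for the example; the embeddings in our index have 1536 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k vectors most similar to the query, best first."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy "embeddings" (hypothetical values, 3 dimensions for readability)
toy_vectors = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.0, 0.0], toy_vectors, k=2))
```

Here `doc_a` and `doc_b` point in nearly the same direction as the query, so they rank ahead of `doc_c`.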

## Context for GPT-4

These documents may not be super useful on their own, but we can provide them as context to a language model to generate a response. Let's construct a prompt for GPT-4 to answer questions about House of the Dragon.

### Construct a prompt

With RAG, we construct our prompt **dynamically** using our retrieved documents. Given a question, we'll first retrieve the documents that are most relevant, and then include them in our prompt as context. Below is a basic template for our prompt.

In [None]:
system_prompt = (
    "You are an exceptionally intelligent AI assistant. Answer the following "
    "questions using the context provided. PLEASE CITE YOUR SOURCES. Be concise."
)

question_template = """\
Question: 
{question}

Context:
{context}
"""

As an example, let's construct a prompt for the question "who is the dragon ridden by Daemon Targaryen?".

In [None]:
# retrieve most relevant documents
question_ex = "who is the dragon ridden by Daemon Targaryen?"
results_ex = index.query(query_text=question_ex)

# format results as context
context_ex = "\n".join([
    f'[doc_id: {er["document_id"]}] {er["document.content"]}' for er in results_ex
])

# construct prompt
prompt_ex = question_template.format(question=question_ex, context=context_ex)
print(prompt_ex)

### Chat completion

Now we can use this prompt to generate a response using GPT-4. To call the OpenAI API, we'll reuse the OpenAI client that Lexy's transformers module already provides.

In [None]:
# import OpenAI client
from lexy.transformers.openai import openai_client

In [None]:
# generate response
oai_response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt_ex}
    ]
)
print(oai_response.choices[0].message.content)

We can see that GPT-4 has used the context we provided to answer the question, and has specifically cited the first two documents in our search results.

Let's put everything together into two functions: `construct_prompt` will construct a prompt given a user question, and `chat_completion` will prompt a completion from GPT-4.

In [None]:
def construct_prompt(question: str,
                     result_template: str = "[doc_id: {r[document_id]}] {r[document.content]}",
                     **query_kwargs):
    """Retrieve the documents most relevant to `question` and format them into a prompt."""
    # retrieve most relevant results
    results = index.query(query_text=question, **query_kwargs)
    # format results for context
    context = "\n".join([
        result_template.format(r=r) for r in results
    ])
    # format prompt
    return question_template.format(question=question, context=context)

def chat_completion(message: str,
                    system: str = system_prompt,
                    **chat_kwargs):
    """Send `message` to GPT-4 with the given system prompt and return the response."""
    # generate response
    return openai_client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": message}
        ],
        **chat_kwargs,
    )
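
The default `result_template` relies on a detail of Python's format-string syntax: inside `{r[...]}`, the text between the brackets is used as a string key without quotes, so `{r[document.content]}` looks up `r["document.content"]`. The sketch below demonstrates this with a hand-written result dictionary. The values are placeholders, not real Lexy output.

```python
result_template = "[doc_id: {r[document_id]}] {r[document.content]}"

# Hand-written stand-in for one search result (placeholder values)
fake_result = {
    "document_id": "abc123",
    "document.content": "Example document text.",
}

# The bracketed text is treated as a dictionary key, dots included
line = result_template.format(r=fake_result)
print(line)
```

This is why a custom template can reference any returned field simply by putting its name in brackets.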

Now let's try asking GPT-4 some more questions.

In [None]:
q = "which one is the blue dragon?"
oai_response = chat_completion(message=construct_prompt(q))
print(oai_response.choices[0].message.content)

In [None]:
q = "who rides Vhagar?"
oai_response = chat_completion(message=construct_prompt(q))
print(oai_response.choices[0].message.content)

In [None]:
q = "who is the second son of King Viserys?"
oai_response = chat_completion(message=construct_prompt(q))
print(oai_response.choices[0].message.content)

In [None]:
q = "who is the heir to the throne?"
oai_response = chat_completion(message=construct_prompt(q))
print(oai_response.choices[0].message.content)

## Using metadata as context

By now, you're probably thinking "Wow, is there anything Lexy *can't* do?" Well, the answer is NO; Lexy can do literally everything! To prove it, let's look at an example where we might want to use document metadata as context for our prompts.

Let's ask which of the Targaryen dragons is the largest. With the documents currently in our collection, we should get the correct answer, Balerion.

In [None]:
q = "which is the largest Targaryen dragon?"
oai_response = chat_completion(message=construct_prompt(q))
print(oai_response.choices[0].message.content)

But what if we want to add new documents to our collection, and those documents contain new or contradictory information? In this case, we can include additional metadata in our prompt that the language model can use to arrive at an answer.

First, let's add a new document to our collection. Because the binding we created earlier has status set to ON, our new document will automatically be embedded and added to our index. Its `updated_at` value will be later than that of the documents we added earlier, which marks it as the most recent.

In [None]:
# add a new document
collection.add_documents([
    {"content": "Lexy was by far the largest of the Targaryen dragons, and was ridden by AGI the Conqueror."}
])

Now let's ask the same question as before, but this time we'll include the `updated_at` field in our prompt. We'll use the `return_fields` parameter to return the document's `updated_at` field along with our search results, and we'll update our `result_template` to include its value. Let's take a look at our new prompt.

In [None]:
new_result_template = \
    "[doc_id: {r[document_id]}, updated_at: {r[document.updated_at]}] {r[document.content]}"

new_prompt = construct_prompt(
    question="which is the largest Targaryen dragon?", 
    result_template=new_result_template, 
    return_fields=["document.content", "document.updated_at"]
)
print(new_prompt)

We can see that our prompt now includes the `updated_at` field for each document. Now let's update our system prompt to tell GPT-4 to use the latest document when faced with conflicting information.

In [None]:
new_system_prompt = (
    "You are an exceptionally intelligent AI assistant. Answer the following "
    "questions using the context provided. PLEASE CITE YOUR SOURCES. Be concise. "
    "If the documents provided contain conflicting information, use the most "
    "recent document as determined by the `updated_at` field."
)

Now let's ask GPT-4 again.

In [None]:
q = "which is the largest Targaryen dragon?"
oai_response = chat_completion(
    message=construct_prompt(
        question=q, 
        result_template=new_result_template, 
        return_fields=["document.content", "document.updated_at"]
    ),
    system=new_system_prompt
)
print(oai_response.choices[0].message.content)
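
Relying on the system prompt leaves the choice of document to the model. If you want the ordering to be explicit, one option is to sort the retrieved results by `updated_at` before formatting them, so the newest document appears first in the context. This is a sketch under assumptions: the result dictionaries and their ISO 8601 timestamp strings are invented here, and the actual format of `document.updated_at` returned by Lexy may differ.

```python
from datetime import datetime


def newest_first(results, field="document.updated_at"):
    """Sort result dicts by an ISO 8601 timestamp field, most recent first."""
    return sorted(
        results,
        key=lambda r: datetime.fromisoformat(r[field]),
        reverse=True,
    )


# Invented results for illustration; timestamps are assumed to be ISO 8601 strings
fake_results = [
    {
        "document_id": "old",
        "document.updated_at": "2024-01-01T10:00:00",
        "document.content": "Older text.",
    },
    {
        "document_id": "new",
        "document.updated_at": "2024-02-01T10:00:00",
        "document.content": "Newer text.",
    },
]
ordered = newest_first(fake_results)
print([r["document_id"] for r in ordered])
```

A helper like this could be applied to the search results inside `construct_prompt` before they are joined into the context string.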

## Next steps

In this tutorial we learned how to implement basic RAG using Lexy. Specifically, we've seen how to use Lexy to store and retrieve documents, and how to include those documents and their metadata as context for a language model.

While this is a simple example, the basic principles are powerful. As we'll see, they can be applied to build far more complex AI applications. In the coming examples you'll learn:

- How to parse and store custom metadata along with your documents.
- How to use Lexy to summarize documents, and then leverage those summaries to retrieve the most relevant documents.
- How to use document filters and custom Transformers to build flexible pipelines for your data.
- How to ingest and process file-based documents (including PDFs and images) for use in your AI applications.