# Retrieval augmented generation using Elasticsearch and OpenAI

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/colab-notebooks-examples/integrations/openai/intro.ipynb)


This notebook demonstrates how to: 
- Index the OpenAI Wikipedia vector dataset into Elasticsearch 
- Embed a question with the OpenAI [`embeddings`](https://platform.openai.com/docs/api-reference/embeddings) endpoint
- Perform [kNN search](https://www.elastic.co/guide/en/elasticsearch/reference/current/knn-search.html) on the Elasticsearch index using the encoded question
- Send the top search results to the OpenAI [Chat Completions](https://platform.openai.com/docs/guides/gpt/chat-completions-api) API endpoint for retrieval augmented generation (RAG)

## Install packages and import modules 

In this example, we use `wget` to download the Wikipedia vector dataset and the `pandas` library to read the data into a dataframe.

In [None]:
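# The code in this notebook uses the pre-1.0 `openai` interface
# (`openai.Embedding`, `openai.ChatCompletion`), so pin the package below 1.0.
!python3 -m pip install -qU "openai<1" pandas wget elasticsearch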

## Create Elastic Cloud deployment
If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?fromURI=%2Fhome) for a free trial.

* Go to the [Create deployment](https://cloud.elastic.co/deployments/create) page
  * Select Create deployment

## Connect to Elasticsearch

To get started with Elasticsearch, we need to connect to our Elastic deployment using the [Python client](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/index.html). Since we are using an Elastic Cloud deployment, we will use the Cloud ID to identify it.

Instantiate the [Elasticsearch Python client](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/index.html) by providing the Cloud ID and password of your deployment.

In [None]:
from getpass import getpass
from elasticsearch import Elasticsearch

CLOUD_ID = getpass("Elastic deployment Cloud ID")
CLOUD_PASSWORD = getpass("Elastic deployment Password")
client = Elasticsearch(
    cloud_id=CLOUD_ID,
    basic_auth=("elastic", CLOUD_PASSWORD),
)

In [None]:
print(client.info())

## Download the dataset and extract to dataframe

In [None]:
import wget
embeddings_url = 'https://cdn.openai.com/API/examples/data/vector_database_wikipedia_articles_embedded.zip'
wget.download(embeddings_url)

In [None]:
import zipfile

with zipfile.ZipFile("vector_database_wikipedia_articles_embedded.zip", "r") as zip_ref:
    zip_ref.extractall("data")

In [None]:
import pandas as pd
wikipedia_dataframe = pd.read_csv("data/vector_database_wikipedia_articles_embedded.csv")
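
The indexing code later in this notebook calls `json.loads` on the vector columns, which suggests that the CSV stores each embedding as a JSON-formatted string rather than as a list. The sketch below illustrates that conversion on a tiny made-up CSV using only the standard library; the sample values and the three-dimensional vector are assumptions for illustration, not rows from the real dataset.

```python
import csv
import io
import json

# Tiny stand-in for the real CSV; the values are invented for illustration.
sample_csv = io.StringIO(
    'id,title,title_vector\n'
    '1,April,"[0.1, -0.2, 0.3]"\n'
)

rows = list(csv.DictReader(sample_csv))
raw_vector = rows[0]["title_vector"]    # still a string after reading the CSV
parsed_vector = json.loads(raw_vector)  # now a list of floats

print(type(raw_vector).__name__, type(parsed_vector).__name__, len(parsed_vector))
```

The same idea applies to the dataframe: each vector cell needs to be parsed before it is sent to a `dense_vector` field.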

## Create Index with mapping
Let's now create an Elasticsearch index with mappings for the downloaded Wikipedia dataset, adding `dense_vector` fields for `title_vector` and `content_vector`.

In [None]:
index_mapping = {
    "properties": {
        # 1536 dimensions matches the embeddings used in this dataset
        "title_vector": {
            "type": "dense_vector",
            "dims": 1536,
            "index": True,
            "similarity": "cosine"
        },
        "content_vector": {
            "type": "dense_vector",
            "dims": 1536,
            "index": True,
            "similarity": "cosine"
        },
        "text": {"type": "text"},
        "title": {"type": "text"},
        "url": {"type": "keyword"},
        "vector_id": {"type": "long"}
    }
}

In [None]:
client.indices.create(index="wikipedia_vector_index", mappings=index_mapping)
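
To make the `"similarity": "cosine"` setting more concrete, here is a minimal pure-Python sketch of cosine similarity. The `es_cosine_score` helper reflects our reading of the Elasticsearch documentation, which describes the `_score` for cosine similarity as `(1 + cosine) / 2`; treat that formula and the helper names as assumptions to verify against the documentation for your Elasticsearch version.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def es_cosine_score(a, b):
    """Map cosine similarity from [-1, 1] to [0, 1] (assumed scoring formula)."""
    return (1 + cosine_similarity(a, b)) / 2


same_direction = es_cosine_score([1.0, 0.0], [2.0, 0.0])
orthogonal = es_cosine_score([1.0, 0.0], [0.0, 3.0])
opposite = es_cosine_score([1.0, 0.0], [-1.0, 0.0])
print(same_direction, orthogonal, opposite)
```

Vector length does not affect the result: only the direction of the two vectors matters.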

## Index data to Elasticsearch index
We will use the [Elasticsearch Python bulk helpers](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/client-helpers.html) to index the data into the Elasticsearch index.

First, we convert the pandas dataframe into an iterable of actions that the bulk helper accepts.


In [None]:
import json

def dataframe_to_bulk_actions(df):
    """Yield one bulk index action per dataframe row."""
    for _, row in df.iterrows():
        yield {
            "_index": "wikipedia_vector_index",
            "_id": row["id"],
            "_source": {
                "url": row["url"],
                "title": row["title"],
                "text": row["text"],
                # The vectors are read from the CSV as strings, so parse them into lists.
                "title_vector": json.loads(row["title_vector"]),
                "content_vector": json.loads(row["content_vector"]),
                "vector_id": row["vector_id"],
            },
        }

As the dataframe is large, we will index the data in batches of `100`.

In [None]:
from elasticsearch import helpers

start = 0
end = len(wikipedia_dataframe)
batch_size = 100
for batch_start in range(start, end, batch_size):
    batch_end = min(batch_start + batch_size, end)
    batch_dataframe = wikipedia_dataframe.iloc[batch_start:batch_end]
    actions = dataframe_to_bulk_actions(batch_dataframe)
    helpers.bulk(client, actions)
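
The batching arithmetic in the loop above can be checked in isolation. The sketch below uses a hypothetical helper, `batch_bounds`, that yields the same `(start, end)` pairs the loop passes to `iloc`; the row count of 250 is an arbitrary example, not the size of the dataset.

```python
def batch_bounds(total_rows, batch_size):
    """Yield (start, end) pairs that cover range(total_rows) in order."""
    for batch_start in range(0, total_rows, batch_size):
        # The last batch may be shorter than batch_size.
        yield batch_start, min(batch_start + batch_size, total_rows)


bounds = list(batch_bounds(250, 100))
print(bounds)
```

The final pair is shorter than the others, which is why the loop clamps `batch_end` with `min`.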

Let's test the index with a simple match query.

In [None]:
print(client.search(index="wikipedia_vector_index", query={
    "match": {
      "text": {
        "query": "Hummingbird"
      }
    }
}))

## Encode a question with OpenAI embedding model

Set the OpenAI API key and create an embedding for the question using the `text-embedding-ada-002` model.

In [None]:
import openai
OPENAI_API_KEY = getpass("Enter OpenAI API key")

openai.api_key = OPENAI_API_KEY

In [None]:
EMBEDDING_MODEL = "text-embedding-ada-002"

question = 'How wide is the Atlantic Ocean?'
modelOpenAi = openai.Embedding.create(input=question, model=EMBEDDING_MODEL)

## Perform kNN search

In [None]:
def pretty_response(response):
    for hit in response['hits']['hits']:
        id = hit['_id']
        score = hit['_score']
        title = hit['_source']['title']
        text = hit['_source']['text']
        pretty_output = (f"\nID: {id}\nTitle: {title}\nSummary: {text}\nScore: {score}")
        print(pretty_output)

In [None]:
response = client.search(
    index="wikipedia_vector_index",
    knn={
        "field": "content_vector",
        "query_vector": modelOpenAi["data"][0]["embedding"],
        "k": 10,
        "num_candidates": 100,
    },
)
pretty_response(response)
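
To build intuition for what `k` means, the following sketch ranks a handful of toy vectors by cosine similarity and keeps the best `k`. This is an exact, brute-force search written for illustration only; Elasticsearch's kNN search on indexed `dense_vector` fields is approximate, and `num_candidates` controls how many candidates it considers on each shard. The document names and two-dimensional vectors are invented.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_knn(query_vector, documents, k):
    """Return up to k (similarity, doc_id) pairs, most similar first."""
    scored = (
        (cosine_similarity(query_vector, vector), doc_id)
        for doc_id, vector in documents.items()
    )
    return heapq.nlargest(k, scored)


# Invented toy vectors standing in for real 1536-dimensional embeddings.
toy_documents = {
    "ocean": [0.9, 0.1],
    "sea": [0.8, 0.3],
    "desert": [-0.7, 0.6],
}

top_two = brute_force_knn([1.0, 0.0], toy_documents, k=2)
print([doc_id for _, doc_id in top_two])
```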

## Use Chat Completions API for retrieval augmented generation

Now we can send the question and the text to OpenAI's Chat Completions API.

Using an LLM together with a retrieval model is known as retrieval augmented generation (RAG). We use Elasticsearch to do what it does best: retrieve relevant documents. Then we use the LLM to do what it does best, tasks like generating summaries and answering questions, with the retrieved documents as context.

The model will generate a response to the question, using the top kNN hit as context. Use the `messages` list to shape your prompt to the model.

In [None]:
# Use the question and the text of the top kNN hit as context.
query = question
text = response["hits"]["hits"][0]["_source"]["text"]

summary = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Answer the following question: " + query + " by using the following text: " + text + " Please print each sentence on a new line."},
    ]
)

choices = summary.choices

for choice in choices:
    print("------------------------------------------------------------")
    print(choice.message.content)
    print("------------------------------------------------------------")

### Code explanation

Here's what that code does:

- Uses OpenAI's `gpt-3.5-turbo` model to generate a response
- Sends a conversation containing a system message and a user message to the model
- The system message sets the assistant's role as "helpful assistant"
- The user message contains the question in `query` and the input text in `text`
- The response from the model is stored in `summary`, and the generated messages are available in `summary.choices`
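
The example above passes only the top hit to the model. As a possible variation, the sketch below joins the text of several hits into one context string. The helper name `hits_to_context`, its parameters, and the placeholder response are assumptions for illustration; the dictionary only mirrors the `hits.hits[]._source` shape that `pretty_response` reads.

```python
def hits_to_context(response, max_hits=3, max_chars=1000):
    """Join the title and truncated text of the top hits into one string."""
    passages = []
    for hit in response["hits"]["hits"][:max_hits]:
        source = hit["_source"]
        # Truncate each passage so the prompt stays within a manageable size.
        passages.append(f"{source['title']}: {source['text'][:max_chars]}")
    return "\n\n".join(passages)


# Placeholder data shaped like an Elasticsearch search response.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"title": "Doc A", "text": "Placeholder passage one."}},
            {"_source": {"title": "Doc B", "text": "Placeholder passage two."}},
            {"_source": {"title": "Doc C", "text": "Placeholder passage three."}},
        ]
    }
}

context = hits_to_context(sample_response, max_hits=2)
print(context)
```

Whether more context improves the answer depends on the question and on the model's context window, so it is worth experimenting with `max_hits` and `max_chars`.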

## Next steps

That was just one example of how to combine Elasticsearch with OpenAI's models to enable retrieval augmented generation. RAG lets you avoid the costly and complex process of training or fine-tuning models by using out-of-the-box models enhanced with additional context.

Use this as a blueprint for your own experiments.

To adapt the conversation for different use cases, customize the system message to define the assistant's behavior or persona. Adjust the user message to specify the task, such as summarization or question answering, along with the desired format of the response.
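
One way to make that customization easier is to build the `messages` list in a small function. The sketch below is a hypothetical helper, not part of the OpenAI library: `build_messages`, its parameter names, and the example persona are all assumptions. It only constructs the list of dictionaries, so it runs without an API key.

```python
def build_messages(
    question,
    context,
    system_prompt="You are a helpful assistant.",
    task="Answer the following question",
):
    """Build a chat `messages` list from a question and retrieved context."""
    user_content = (
        f"{task}: {question}\n\n"
        f"Use the following text:\n{context}"
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]


messages = build_messages(
    question="How wide is the Atlantic Ocean?",
    context="Placeholder text retrieved from Elasticsearch.",
    system_prompt="You are a concise geography tutor.",
    task="Answer in one sentence",
)
print(messages[1]["content"])
```

The resulting list can be passed as the `messages` argument of the chat completion call shown earlier.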