# Chatbot with LangChain conversational chain and OpenAI 🤖💬

In this notebook we'll build a chatbot that can respond to questions about custom data, such as policies of an employer.

The chatbot uses LangChain's `ConversationalRetrievalChain` and has the following capabilities:
- Answer questions asked in natural language
- Run vector (kNN) search in Elasticsearch to find documents that answer the question
- Extract and summarize the answer using an OpenAI LLM
- Maintain conversational memory for follow-up questions

## Requirements 🧰
For this example, you will need:

- Python 3.8.1 or later (required by `langchain==0.0.245`)
- An Elastic deployment with at least a **2GB machine learning node**
  - We'll be using [Elastic Cloud](https://www.elastic.co/guide/en/cloud/current/ec-getting-started.html) for this example (available with a [free trial](https://cloud.elastic.co/registration))
- An OpenAI account and API key

### Create Elastic Cloud deployment

If you don't have an Elastic Cloud deployment, follow these steps to create one.
1. Go to https://cloud.elastic.co/registration and sign up for a free trial
2. Select **Create Deployment**
3. Add a 2GB machine learning node to the deployment

## Install packages 📦

First we `pip install` the packages we need for this example.

In [1]:
%pip install -U langchain==0.0.245 jq openai elasticsearch tiktoken

Collecting langchain==0.0.245
  Using cached langchain-0.0.245-py3-none-any.whl (1.4 MB)
Collecting jq
  Using cached jq-1.4.1-cp310-cp310-macosx_11_0_arm64.whl
Collecting openai
  Using cached openai-0.27.8-py3-none-any.whl (73 kB)
Collecting elasticsearch
  Using cached elasticsearch-8.9.0-py3-none-any.whl (395 kB)
Collecting tiktoken
  Using cached tiktoken-0.4.0-cp310-cp310-macosx_11_0_arm64.whl (761 kB)
Collecting PyYAML>=5.4.1
  Using cached PyYAML-6.0.1-cp310-cp310-macosx_11_0_arm64.whl (169 kB)
Collecting tenacity<9.0.0,>=8.1.0
  Using cached tenacity-8.2.2-py3-none-any.whl (24 kB)
Collecting numpy<2,>=1
  Using cached numpy-1.25.2-cp310-cp310-macosx_11_0_arm64.whl (14.0 MB)
Collecting pydantic<2,>=1
  Using cached pydantic-1.10.12-cp310-cp310-macosx_11_0_arm64.whl (2.5 MB)
Collecting aiohttp<4.0.0,>=3.8.3
  Using cached aiohttp-3.8.5-cp310-cp310-macosx_11_0_arm64.whl (343 kB)
Collecting langsmith<0.1.0,>=0.0.11
  Using cached langsmith-0.0.20-py3-none-any.whl (32 kB)
[output truncated]

## Initialize clients 🔌

Next we input credentials with `getpass`. `getpass` is part of the Python standard library and is used to securely prompt for credentials.

In [2]:
from getpass import getpass

ELASTIC_CLOUD_ID = getpass("Elastic Cloud ID: ")
ELASTIC_USERNAME = getpass("Elastic username: ")
ELASTIC_PASSWORD = getpass("Elastic password: ")
OPENAI_API_KEY = getpass("OpenAI API key: ")

With these credentials we can now initialize the Elasticsearch Python client.

In [52]:
from elasticsearch import Elasticsearch

elasticsearch_client = Elasticsearch(
    cloud_id=ELASTIC_CLOUD_ID,
    basic_auth=(ELASTIC_USERNAME, ELASTIC_PASSWORD)
)

print('Connected to Elasticsearch\n', elasticsearch_client.info())

Connected to Elasticsearch
 {'name': 'instance-0000000001', 'cluster_name': 'c8c1f5348a8647989c409f4090d2d6f4', 'cluster_uuid': 'X3GD-4JrSK65K3RQCcBbSA', 'version': {'number': '8.10.0-SNAPSHOT', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': 'c26f88d66b60a8016e98b9b7ba7bae6c7c852213', 'build_date': '2023-08-04T11:42:34.579618687Z', 'build_snapshot': True, 'lucene_version': '9.7.0', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'}


## Create index 🗄️

We'll create an Elasticsearch index to store documents along with the generated vector embeddings. This will allow us to execute vector search when retrieving documents for our query.

Since we're using OpenAI's `text-embedding-ada-002` model, we need a 1536-dimensional [dense_vector](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html) field to store the embeddings.

In [22]:
mappings = {
    "properties": {
        # "text" (rather than "keyword") analyzes the passages for full-text search
        "text": { "type": "text" },
        "vector": {
            "type": "dense_vector",
            "dims": 1536,
            "index": True,
            "similarity": "cosine"
        }
    }
}

elasticsearch_client.indices.create(
    index='workplace-docs',
    mappings=mappings
)

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'workplace-docs'})
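
To make the `"similarity": "cosine"` setting more concrete, here is a minimal pure-Python sketch of cosine similarity between two vectors. The function name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions; Elasticsearch computes its own score internally, so treat this only as an intuition aid.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors stand in for the 1536-dimensional embeddings.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # same direction: 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # orthogonal: 0.0
```

Because cosine similarity depends only on direction, two passages whose embeddings point the same way score highly regardless of vector length.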

## Load and process documents 📄

Time to load some data! We'll be using the workplace search example data, which is a list of employee documents and policies.

In [11]:
import json
from urllib.request import urlopen

url = "https://raw.githubusercontent.com/elastic/elasticsearch-labs/main/example-apps/workplace-search/example-data/data.json"

response = urlopen(url)

workplace_docs = json.loads(response.read())

print(f'Successfully loaded {len(workplace_docs)} documents')

Successfully loaded 15 documents


As we're chatting with our bot, it will run semantic searches on the index to find the relevant documents. In order for this to be accurate, we need to split the full documents into small chunks (also called passages). This way the semantic search will find the passage within a document that most likely answers our question.

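We'll use LangChain's `CharacterTextSplitter` to split each document's text into chunks of about 800 characters, with up to 400 characters of overlap between consecutive chunks. The splitter only breaks on its separator (a blank line by default), so a passage with no separator inside it can exceed the limit, which is what the warnings in the output report.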

In [60]:
from langchain.text_splitter import CharacterTextSplitter

metadata = []
content = []

for doc in workplace_docs:
    content.append(doc["content"])
    metadata.append({
        "name": doc["name"],
        "summary": doc["summary"]
    })

text_splitter = CharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=400
)
docs = text_splitter.create_documents(content, metadatas=metadata)

print(f"Split {len(workplace_docs)} documents into {len(docs)} passages")

Created a chunk of size 866, which is longer than the specified 800
Created a chunk of size 1120, which is longer than the specified 800


Split 15 documents into 73 passages
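
The warnings above are easier to understand with a simplified model of separator-based chunking. The sketch below is an approximation written for illustration, not LangChain's actual implementation; the function name `split_with_overlap` and the toy input are assumptions. It merges blank-line-separated pieces greedily and carries trailing pieces forward as overlap.

```python
def split_with_overlap(text, chunk_size=800, chunk_overlap=400, separator="\n\n"):
    """Greedily merge separator-delimited pieces into overlapping chunks."""
    pieces = [p for p in text.split(separator) if p.strip()]
    chunks = []
    window = []  # pieces that make up the chunk currently being built

    def length(parts):
        return len(separator.join(parts))

    for piece in pieces:
        # Close the current chunk if adding this piece would overflow it.
        if window and length(window + [piece]) > chunk_size:
            chunks.append(separator.join(window))
            # Keep trailing pieces as overlap, dropping from the front until
            # the carried-over text fits both the overlap and the size budget.
            while window and (
                length(window) > chunk_overlap
                or length(window + [piece]) > chunk_size
            ):
                window.pop(0)
        window.append(piece)

    if window:
        chunks.append(separator.join(window))
    return chunks


# Three 300-character paragraphs followed by one 1000-character paragraph.
sample = "\n\n".join(["a" * 300, "b" * 300, "c" * 300, "d" * 1000])
print([len(chunk) for chunk in split_with_overlap(sample)])  # [602, 602, 1000]
```

The last chunk is 1000 characters long because a single paragraph is never cut in the middle, which mirrors the "longer than the specified 800" messages.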


Let's generate the embeddings and index the documents with them.

In [26]:
from langchain.embeddings import OpenAIEmbeddings

# Get the embeddings from OpenAI
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)

# Extract page_content from the documents
texts = [doc.page_content for doc in docs]

text_embeddings = embeddings.embed_documents(texts)

print(f'Generated {len(text_embeddings)} embeddings')

# Persist the passage documents into Elasticsearch
actions = []
for i, passage in enumerate(docs):
    actions.append({"index": {}})
    actions.append({
        "text": passage.page_content,
        "vector": text_embeddings[i],
        "metadata": passage.metadata
    })

bulk_response = elasticsearch_client.bulk(
    operations=actions,
    index="workplace-docs"
)

print(f'Indexed {len(bulk_response["items"])} documents')

Generated 73 embeddings
Indexed 73 documents
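
The bulk request above alternates an action line (`{"index": {}}`) with the document to index. Note that the printed count is the number of items in the response, and the bulk API returns one item per action whether or not it succeeded. As a hedged sketch, the helpers below (hypothetical names `build_bulk_operations` and `failed_items`) show the same interleaving and one way to pick out failed items, assuming the standard bulk response shape with a top-level `errors` flag and per-item `error` objects.

```python
def build_bulk_operations(texts, vectors, metadatas):
    """Interleave one action line and one source line per passage."""
    if not (len(texts) == len(vectors) == len(metadatas)):
        raise ValueError("texts, vectors and metadatas must have equal length")
    operations = []
    for text, vector, metadata in zip(texts, vectors, metadatas):
        operations.append({"index": {}})  # action: index with an auto-generated _id
        operations.append({"text": text, "vector": vector, "metadata": metadata})
    return operations


def failed_items(bulk_response):
    """Return the per-item results that carry an error."""
    if not bulk_response["errors"]:
        return []
    failures = []
    for item in bulk_response["items"]:
        # Each item is keyed by its action type, e.g. {"index": {...}}.
        result = next(iter(item.values()))
        if "error" in result:
            failures.append(result)
    return failures


# Hand-written response used only to exercise the helper.
example_response = {
    "errors": True,
    "items": [
        {"index": {"status": 201}},
        {"index": {"status": 400, "error": {"type": "mapper_parsing_exception"}}},
    ],
}
print(len(failed_items(example_response)))  # 1
```

In the notebook, `failed_items(bulk_response)` could be checked right after the `bulk` call before trusting the indexed count.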


## Chat with the chatbot 💬

Let's initialize our chatbot. We'll define Elasticsearch as a store for retrieving documents, OpenAI as the LLM to interpret questions and summarize answers, then we'll pass these to the conversational chain.

In [61]:
from langchain.vectorstores.elastic_vector_search import ElasticKnnSearch
from langchain.llms import OpenAI
from langchain.chains import ConversationalRetrievalChain

store = ElasticKnnSearch(
    es_connection=elasticsearch_client,
    index_name="workplace-docs",
    embedding=embeddings
)

retriever = store.as_retriever()

llm = OpenAI(openai_api_key=OPENAI_API_KEY)

chat = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    return_source_documents=True
)
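
For intuition about what the retriever asks Elasticsearch to do, here is a sketch of a kNN search clause for the `vector` field defined in the mapping. The helper name `build_knn_clause` and the `num_candidates=50` default are assumptions, and the request that `ElasticKnnSearch` actually builds may differ in detail; `k=4` simply mirrors the four supporting documents listed per answer in the chat output.

```python
def build_knn_clause(query_vector, k=4, num_candidates=50, field="vector"):
    """Build the `knn` option of an Elasticsearch search request."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if num_candidates < k:
        raise ValueError("num_candidates must be greater than or equal to k")
    return {
        "field": field,                      # dense_vector field to search
        "query_vector": list(query_vector),  # embedding of the question
        "k": k,                              # nearest neighbours to return
        "num_candidates": num_candidates,    # candidates considered per shard
    }


# A toy 3-dimensional vector stands in for a real 1536-dimensional embedding.
knn = build_knn_clause([0.1, 0.2, 0.3])
print(knn["k"], knn["num_candidates"])  # 4 50

# With a live cluster the clause could be passed to the Python client, e.g.:
# elasticsearch_client.search(index="workplace-docs", knn=knn, source=["text", "metadata"])
```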

Now we can ask our chatbot questions!

See how the chat history is passed as context for each question.

In [62]:
# Define a convenience function for Q&A
def ask(question, history):
    # Use the history that was passed in rather than the global list
    result = chat({"question": question, "chat_history": history})
    sources = [doc.metadata["name"] for doc in result["source_documents"]]
    print("QUESTION: ", question,
          "\nANSWER:  ", result["answer"],
          "\nSUPPORTING DOCUMENTS: ", sources
    )
    # Record the exchange so follow-up questions can refer back to it
    history.append((question, result["answer"]))

chat_history = []

ask("What does NASA stand for?", chat_history)
ask("Which countries are part of it?", chat_history)
ask("Who are the team's leads?", chat_history)


QUESTION:  What does NASA stand for? 
ANSWER:    NASA stands for North America South America. 
SUPPORTING DOCUMENTS:  ['Sales Organization Overview', 'Code Of Conduct', 'Code Of Conduct', 'Swe Career Matrix']
QUESTION:  Which countries are part of it? 
ANSWER:    The countries in the North America South America region are the United States, Canada, Mexico, as well as Central and South America. 
SUPPORTING DOCUMENTS:  ['Sales Organization Overview', 'Sales Organization Overview', 'Sales Organization Overview', 'Fy2024 Company Sales Strategy']
QUESTION:  Who are the team's leads? 
ANSWER:    The leads of the North America South America (NASA) team are Laura Martinez (Area Vice-President of North America) and Gary Johnson (Area Vice-President of South America). 
SUPPORTING DOCUMENTS:  ['Sales Organization Overview', 'Sales Organization Overview', 'Sales Organization Overview', 'Swe Career Matrix']
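
Follow-up questions such as "Which countries are part of it?" only work because the chain first rewrites them into standalone questions using the chat history. The sketch below illustrates that idea with a list of `(question, answer)` tuples like `chat_history`. The function names and the prompt wording are illustrative assumptions, not LangChain's exact template.

```python
def format_chat_history(history):
    """Render (question, answer) tuples as alternating speaker lines."""
    lines = []
    for question, answer in history:
        lines.append(f"Human: {question}")
        lines.append(f"Assistant: {answer}")
    return "\n".join(lines)


def build_condense_prompt(history, follow_up):
    """Build a prompt asking an LLM to make the follow-up self-contained."""
    return (
        "Given the following conversation and a follow-up question, "
        "rephrase the follow-up question to be a standalone question.\n\n"
        f"Chat history:\n{format_chat_history(history)}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


history = [("What does NASA stand for?", "NASA stands for North America South America.")]
print(build_condense_prompt(history, "Which countries are part of it?"))
```

An LLM given this prompt can resolve "it" to the NASA region before the retriever runs, which is why the second and third answers stay on topic.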


Try other questions, or clear the workplace data and ask again, and observe how the responses change.

## (Optional) Clean up 🧹

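Once we're done, we can delete the Elasticsearch index. If the index no longer exists, for example because this cell was already run, the call raises the `NotFoundError` shown in the output below.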

In [63]:
elasticsearch_client.indices.delete(index='workplace-docs')

NotFoundError: NotFoundError(404, 'index_not_found_exception', 'no such index [workplace-docs]', workplace-docs, index_or_alias)
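
One way to make the clean-up step safe to re-run is to pass `ignore_unavailable=True`, which asks Elasticsearch not to fail when the target index is missing. The wrapper name `delete_index_if_present` is a hypothetical helper introduced here for illustration.

```python
def delete_index_if_present(client, index_name):
    """Delete an index, treating a missing index as a no-op."""
    # ignore_unavailable=True suppresses the error for a missing index.
    return client.indices.delete(index=index_name, ignore_unavailable=True)


# Usage with the notebook's client (requires a live deployment):
# delete_index_if_present(elasticsearch_client, "workplace-docs")
```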