### Chatbots with LlamaIndex  
LlamaIndex serves as a bridge between your data and Large Language Models (LLMs), providing a toolkit that enables you to establish a query interface around your data for a variety of tasks, such as question-answering and summarization.

In this tutorial, we'll walk you through building a context-augmented chatbot.

#### Installing Packages

In [6]:
!pip install -q openai
!pip install -q llama-index
!pip install -q llama-index-experimental
!pip install -q pypdf
!pip install -q docx2txt

#### Importing Packages

In [10]:
import os
import openai

#os.environ["OPENAI_API_KEY"] = "<the key>"
openai.api_key = os.environ["OPENAI_API_KEY"]

import sys
import shutil
import glob
import logging
from pathlib import Path

import warnings
warnings.filterwarnings('ignore')

import pandas as pd

## LlamaIndex readers
from llama_index.core import SimpleDirectoryReader

## LlamaIndex Index Types
from llama_index.core import ListIndex
from llama_index.core import VectorStoreIndex
from llama_index.core import TreeIndex
from llama_index.core import KeywordTableIndex
from llama_index.core import SimpleKeywordTableIndex
from llama_index.core import DocumentSummaryIndex
from llama_index.core import KnowledgeGraphIndex
from llama_index.experimental.query_engine import PandasQueryEngine


## LlamaIndex Context Managers
from llama_index.core import StorageContext
from llama_index.core import load_index_from_storage
from llama_index.core.response_synthesizers import get_response_synthesizer
from llama_index.core.response_synthesizers import ResponseMode
from llama_index.core.schema import Node

## LlamaIndex Templates
from llama_index.core.prompts import PromptTemplate
from llama_index.core.prompts import ChatPromptTemplate
from llama_index.core.base.llms.types import ChatMessage, MessageRole

## LlamaIndex Callbacks
from llama_index.core.callbacks import CallbackManager
from llama_index.core.callbacks import LlamaDebugHandler

In [11]:
import logging

#logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

#### Defining Models

In [39]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

#model="gpt-4o"
model="gpt-4o-mini"

Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
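Settings.llm = OpenAI(temperature=0, 
                      model=model, 
                      #max_tokens=512
                      # Extra sampling options are passed through additional_kwargs,
                      # using the lowercase parameter names of the OpenAI API.
                      additional_kwargs={"presence_penalty": -2, "top_p": 1},
                     )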

#### Defining Folders

In [18]:
DOCS_DIR = "../Data/"
PERSIST_DIR = "../Index/ChatExample/"

print(f"Current dir: {os.getcwd()}")

if not os.path.exists(DOCS_DIR):
    os.mkdir(DOCS_DIR)
docs = sorted(os.listdir(DOCS_DIR))
print(f"Files in {DOCS_DIR}")
for doc in docs:
    print(doc)

Current dir: /home/renato/Documents/Repos/GenAI4Humanists/Notebooks
Files in ../Data/
1.pdf
WarrenCommissionReport.txt
axis_report.pdf
california_housing_train.csv
hdfc_report.pdf
hr.sqlite
icici_report.pdf
kafka_metamorphosis.txt
knowledge_card.pdf
loftq.pdf
longlora.pdf
lyft_2021.pdf
metagpt.pdf
metra.pdf
paul_graham_essay.txt
selfrag.pdf
swebench.pdf
uber_2021.pdf
values.pdf
vr_mcl.pdf
zipformer.pdf


#### (Optional) Downloading an example text file

In [None]:
#!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O '../Data/paul_graham_essay.txt'


### Creating a Vector Index    


#### Deleting existing Indexes  

(Only if you want to recreate the index)

In [19]:
if not os.path.exists(PERSIST_DIR):
    print(f"Creating Directory {PERSIST_DIR}")
    os.makedirs(PERSIST_DIR)  # also creates the parent ../Index/ if it is missing
else:
    print(f"Re-Creating Directory {PERSIST_DIR}")
    shutil.rmtree(PERSIST_DIR)
    os.makedirs(PERSIST_DIR)
    print(os.listdir(PERSIST_DIR))

Creating Directory ../Index/ChatExample/


#### Generic Function to create indexes (with all txt files)

In [20]:
def create_retrieve_index(index_path, docs_path, index_type):
    """Build an index of type `index_type` from the .txt files in `docs_path`
    and persist it to `index_path`, or load it from there if it already exists."""
    if not os.path.exists(index_path):
        print(f"Creating Directory {index_path}")
        os.makedirs(index_path)
    if not os.listdir(index_path):
        # Empty directory: nothing has been persisted yet, so build the index.
        print("Loading Documents...")
        required_exts = [".txt"]
        documents = SimpleDirectoryReader(required_exts=required_exts, 
                                          input_dir=docs_path).load_data()
        print("Creating Index...")
        index = index_type.from_documents(documents,
                                          show_progress=True,
                                          )
        print("Persisting Index...")
        index.storage_context.persist(persist_dir=index_path)
        print("Done!")
    else:
        # A persisted index exists: reload it instead of re-embedding the documents.
        print("Reading from Index...")
        index = load_index_from_storage(storage_context=StorageContext.from_defaults(persist_dir=index_path))
        print("Done!")
    return index

#### Creating Vector Store Index  

In [21]:
VECTORINDEXDIR = PERSIST_DIR + 'VectorStoreIndex'
vectorstoreindex = create_retrieve_index(VECTORINDEXDIR, DOCS_DIR, VectorStoreIndex)

Creating Directory ../Index/ChatExample/VectorStoreIndex
Loading Documents...
Creating Index...


Parsing nodes:   0%|          | 0/3 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/207 [00:00<?, ?it/s]

Persisting Index...
Done!
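
`create_retrieve_index` follows a build-or-load pattern: the expensive step (embedding the documents) runs only when the persist directory is empty, and later runs read the stored result. The sketch below shows the same pattern with the standard library only. The names `build_or_load`, `expensive_build`, and `index.json` are made up for illustration, and a JSON file stands in for the files LlamaIndex actually persists.

```python
import json
import os
import tempfile


def build_or_load(cache_dir, build_fn, filename="index.json"):
    """Return (data, was_built): load from cache_dir if present, else build and persist."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, filename)
    if os.path.exists(path):
        # A persisted copy exists: skip the expensive build step.
        with open(path, encoding="utf-8") as fh:
            return json.load(fh), False
    data = build_fn()
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh)
    return data, True


calls = []


def expensive_build():
    # Hypothetical stand-in for "load documents and compute embeddings".
    calls.append(1)
    return {"chunks": ["a", "b", "c"]}


with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "ToyIndex")
    first, built_first = build_or_load(cache, expensive_build)
    second, built_second = build_or_load(cache, expensive_build)

print(built_first, built_second, len(calls))
```

The second call returns the same data without invoking `expensive_build` again, which is why re-running the notebook cell above prints "Reading from Index..." once the index has been persisted.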


## Retrieving your data  

#### Retrieving from [Vector Store Index](https://docs.llamaindex.ai/en/stable/api_reference/query/retrievers/vector_store.html)  


In [27]:
query_engine = vectorstoreindex.as_query_engine(similarity_top_k=3,
                                                retriever_mode="embedding",
                                                #response_mode="accumulate",
                                                response_mode="compact",
                                                #response_mode="tree_summarize",
                                                verbose=True)
response = query_engine.query("Who is Gregor?")
print(response)

Gregor is a character in the provided text who experiences a sudden and mysterious transformation, leading to physical and emotional changes that affect his interactions with his family members.
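
To make `similarity_top_k=3` more concrete, here is a minimal sketch of top-k retrieval by cosine similarity. It is not how LlamaIndex is implemented: the "embedding" is a toy bag-of-words vector rather than the dense vectors produced by `text-embedding-ada-002`, and the chunks are made-up sentences, not text from the indexed files.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts (real embeddings are dense float vectors)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=3):
    """Score every chunk against the query and return the k best (score, chunk) pairs."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up chunks for illustration only.
chunks = [
    "Gregor Samsa woke one morning transformed into an insect.",
    "Lisp was discovered by McCarthy.",
    "The commission investigated the assassination.",
    "Gregor's sister brought him food.",
]
results = top_k("Who is Gregor?", chunks, k=2)
for score, chunk in results:
    print(f"{score:.3f}  {chunk}")
```

Only the best-scoring chunks are passed on to the LLM, so a larger `similarity_top_k` gives the model more context at the cost of a longer prompt.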


In [31]:
query_engine = vectorstoreindex.as_query_engine(similarity_top_k=3,
                                                retriever_mode="embedding",
                                                #response_mode="accumulate",
                                                #response_mode="compact",
                                                response_mode="tree_summarize",
                                                verbose=True)

response = query_engine.query("Who is Paul McCartney?")
print(response)

Paul McCartney is not mentioned in the provided context information.
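
The `tree_summarize` response mode used above combines the retrieved chunks bottom-up: groups of chunks are summarized, then the summaries are summarized, until a single answer remains. The sketch below shows that recursion under simplifying assumptions: a fixed `fan_in` replaces the library's packing of text into the context window, and `fake_summarize` is a hypothetical stand-in for the LLM call.

```python
def tree_summarize(chunks, summarize, fan_in=2):
    """Summarize groups of `fan_in` texts level by level until one text remains."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(chunks)
    while True:
        # One level of the tree: each group of texts becomes a single summary.
        level = [
            summarize(level[i:i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
        if len(level) == 1:
            return level[0]


def fake_summarize(texts):
    # Stand-in for an LLM call: brackets make the tree shape visible.
    return "(" + " + ".join(texts) + ")"


answer = tree_summarize(["c1", "c2", "c3"], fake_summarize)
print(answer)  # ((c1 + c2) + (c3))
```

By contrast, `compact` packs as many chunks as fit into each prompt and refines the answer sequentially, which usually means fewer LLM calls for a small number of chunks.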


In [42]:
query_engine = vectorstoreindex.as_query_engine(similarity_top_k=3,
                                                retriever_mode="embedding",
                                                #response_mode="accumulate",
                                                #response_mode="compact",
                                                response_mode="tree_summarize",
                                                verbose=True)

response = query_engine.query("how many stars in the galaxy of auron-b?")
print(response)

There are approximately 400 billion stars in the galaxy of Auron-B.


## Creating a Simple Interactive Chatbot for our Index  

Each chat mode retrieves context in its own way; see this overview of [retrieval modes](https://www.actalyst.ai/blog/optimizing-enterprise-chatbots-with-llamaindex-chat-modes-in-a-rag-system).

![](https://assets-global.website-files.com/652165b2d773cd686ae910a6/652e2b0154e3af353e2d9422_actalyst-llalaindex-chatmodes.png)
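
The two modes used in the cells that follow, `context` and `condense_question`, differ mainly in what they send to the retriever. The sketch below is a simplified model of one turn in each mode, not the library's code: `retrieve`, `llm`, `condense`, and `query_engine` are hypothetical callables supplied by the caller, and messages are plain `(role, text)` tuples.

```python
def context_mode_turn(message, history, retrieve, llm):
    """'context'-style turn: retrieve with the raw message, then chat with full history."""
    context = "\n".join(retrieve(message))
    system = f"Context information is below.\n{context}"
    reply = llm([("system", system), *history, ("user", message)])
    history.extend([("user", message), ("assistant", reply)])
    return reply


def condense_question_turn(message, history, condense, query_engine):
    """'condense_question'-style turn: rewrite to a standalone question, then query."""
    standalone = condense(history, message)
    reply = query_engine(standalone)
    history.extend([("user", message), ("assistant", reply)])
    return reply


# Fake components that only record what they were asked.
seen = {}


def fake_retrieve(query):
    seen["retrieval_query"] = query
    return ["chunk about " + query]


def fake_llm(messages):
    seen["llm_messages"] = messages
    return "answer"


def fake_condense(history, message):
    return f"{message} (given {len(history)} earlier messages)"


def fake_query_engine(query):
    seen["engine_query"] = query
    return "answer"


h1 = []
context_mode_turn("Who is he?", h1, fake_retrieve, fake_llm)
h2 = [("user", "Who is Gregor?"), ("assistant", "A character.")]
condense_question_turn("Where does he live?", h2, fake_condense, fake_query_engine)
print(seen["retrieval_query"])
print(seen["engine_query"])
```

In this model, the `context` turn lets the LLM fall back on its general knowledge when the retrieved chunks are unhelpful, while the `condense_question` turn answers only through the query engine and can carry earlier topics into the rewritten question. Both effects are visible in the transcripts below.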

In [43]:
chat_engine = vectorstoreindex.as_chat_engine(chat_mode="context", 
                                              verbose=False)
chat_engine.reset()
chat_engine.chat_repl()

===== Entering Chat REPL =====
Type "exit" to exit.



Human:  When Gregor transformed into an insect?


Assistant: Gregor transformed into an insect in the story "The Metamorphosis" by Franz Kafka. The transformation occurs at the beginning of the story when Gregor Samsa wakes up one morning to find himself transformed into a giant insect. This sudden and inexplicable transformation sets the stage for the rest of the narrative, exploring themes of alienation, isolation, and the human condition.



Human:  Who is Paul Graham?


Assistant: Paul Graham is a computer scientist, entrepreneur, venture capitalist, and author. He is best known for co-founding the influential startup accelerator program Y Combinator (YC) in 2005, along with Jessica Livingston, Robert Morris, and Trevor Blackwell. Y Combinator has helped launch and support numerous successful tech startups, including Dropbox, Airbnb, Reddit, and Stripe.

Paul Graham is also known for his essays on technology, startups, and entrepreneurship, which are widely read and respected in the tech community. He has written on a variety of topics, sharing insights and advice based on his experiences as a programmer, entrepreneur, and investor.

In addition to his work with Y Combinator and his writing, Paul Graham has a background in computer science and programming. He studied philosophy at Cornell University and received a Master's degree in computer science from Harvard University. Prior to Y Combinator, he co-founded Viaweb, a web-based application company t

Human:  Do Paul Graham and Gregor know each other?


Assistant: There is no known connection between Paul Graham and Gregor from Franz Kafka's novella "The Metamorphosis." Paul Graham is a real-life figure in the tech industry, known for his work with Y Combinator and his writings on startups and technology. On the other hand, Gregor is a fictional character in Kafka's work, who undergoes a bizarre transformation into an insect.

As such, Paul Graham and Gregor are from different worlds—one from the realm of technology and entrepreneurship, and the other from the realm of literature and existential fiction. Their paths do not intersect in any known context.



Human:  exit


In [44]:
chat_engine = vectorstoreindex.as_chat_engine(chat_mode="condense_question", verbose=False)
chat_engine.reset()
chat_engine.chat_repl()

===== Entering Chat REPL =====
Type "exit" to exit.



Human:  How many stars in the Universe?


Assistant: The concept of stars in the Universe is vast and incomprehensible, with estimates suggesting there are billions of galaxies, each containing billions of stars. The exact number of stars in the Universe is difficult to quantify accurately.



Human:  In which universe lives Gregor?


Assistant: In a universe where there are billions of galaxies, each containing billions of stars, it is likely that there exists at least one path out of McCarthy's Lisp along which discoveredness is preserved.



Human:  Who is McCarthy?


Assistant: McCarthy is an individual who is connected to Lisp as he invented or discovered Lisp, which is a language defined by writing an interpreter in itself. In the universe where Gregor lives, McCarthy's Lisp serves as an answer to the question of what the minimum set of predefined operators needed to write an interpreter for a language in itself is. McCarthy's Lisp is also associated with the concept of discoveredness, as any changes made to McCarthy's Lisp in a discoveredness-preserving manner could potentially lead to the creation of a complete language with quality.



Human:  Please answer using the context only?


Assistant: Gregor lives in the universe of Franz Kafka's novella "The Metamorphosis." In this universe, the concept of discoveredness is associated with the character of Gregor Samsa, who undergoes a transformation into a giant insect.



Human:  exit


### Creating a [Customized Prompt](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/prompts/chat_prompts.py) Chatbot

In [50]:
TEXT_QA_SYSTEM_PROMPT = ChatMessage(
    content=(
        "You are an expert Q&A system that is trusted around the world.\n"
        "Always answer the query using the provided context information, "
        "and not prior knowledge.\n"
        "Some rules to follow:\n"
        "1. Never directly reference the given context in your answer.\n"
        "2. Avoid statements like 'Based on the context, ...' or "
        "'The context information ...' or anything along "
        "those lines."
    ),
    role=MessageRole.SYSTEM,
)

TEXT_QA_PROMPT_TMPL_MSGS = [
    TEXT_QA_SYSTEM_PROMPT,
    ChatMessage(
        content=(
            "Context information is below.\n"
            "---------------------\n"
            "{context_str}\n"
            "---------------------\n"
            "Given the context information and not prior knowledge, "
            "answer the query.\n"
            "Query: {query_str}\n"
            "Answer: "
        ),
        role=MessageRole.USER,
    ),
]

CHAT_TEXT_QA_PROMPT = ChatPromptTemplate(message_templates=TEXT_QA_PROMPT_TMPL_MSGS)

# Tree Summarize
TREE_SUMMARIZE_PROMPT_TMPL_MSGS = [
    TEXT_QA_SYSTEM_PROMPT,
    ChatMessage(
        content=(
            "Context information from multiple sources is below.\n"
            "---------------------\n"
            "{context_str}\n"
            "---------------------\n"
            "Given the information from multiple sources and not prior knowledge, "
            "answer the query.\n"
            "Query: {query_str}\n"
            "Answer: "
        ),
        role=MessageRole.USER,
    ),
]

CHAT_TREE_SUMMARIZE_PROMPT = ChatPromptTemplate(
    message_templates=TREE_SUMMARIZE_PROMPT_TMPL_MSGS
)


# Refine Prompt
CHAT_REFINE_PROMPT_TMPL_MSGS = [
    ChatMessage(
        content=(
            "You are an expert Q&A system that strictly operates in two modes "
            "when refining existing answers:\n"
            "1. **Rewrite** an original answer using the new context.\n"
            "2. **Repeat** the original answer if the new context isn't useful.\n"
            "Never reference the original answer or context directly in your answer.\n"
            "If the query is unrelated to the context, just answer: I don't know. \n"
            "When in doubt, just repeat the original answer.\n"
            "New Context: {context_msg}\n"
            "Query: {query_str}\n"
            "Original Answer: {existing_answer}\n"
            "New Answer: "
        ),
        role=MessageRole.USER,
    )
]


CHAT_REFINE_PROMPT = ChatPromptTemplate(message_templates=CHAT_REFINE_PROMPT_TMPL_MSGS)


# Table Context Refine Prompt
CHAT_REFINE_TABLE_CONTEXT_TMPL_MSGS = [
    ChatMessage(content="{query_str}", role=MessageRole.USER),
    ChatMessage(content="{existing_answer}", role=MessageRole.ASSISTANT),
    ChatMessage(
        content=(
            "We have provided a table schema below. "
            "---------------------\n"
            "{schema}\n"
            "---------------------\n"
            "We have also provided some context information below. "
            "{context_msg}\n"
            "---------------------\n"
            "Given the context information and the table schema, "
            "refine the original answer to better "
            "answer the question. "
            "If the context isn't useful, return the original answer."
        ),
        role=MessageRole.USER,
    ),
]


CHAT_REFINE_TABLE_CONTEXT_PROMPT = ChatPromptTemplate(message_templates=CHAT_REFINE_TABLE_CONTEXT_TMPL_MSGS)
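
The `{context_str}` and `{query_str}` placeholders in these templates are filled in at query time with the retrieved text and the user's question. The sketch below imitates that substitution with plain `str.format`. `ChatPromptTemplate` performs the substitution itself, so this is only an illustration of the resulting prompt, and the retrieved passages are made up.

```python
user_template = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)

# Made-up passages standing in for the retriever's output.
retrieved_chunks = ["First retrieved passage.", "Second retrieved passage."]

prompt = user_template.format(
    context_str="\n\n".join(retrieved_chunks),
    query_str="Who is Gregor?",
)
print(prompt)
```

Printing the filled-in prompt like this is a quick way to check that a customized template still contains every placeholder the response mode expects.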

In [51]:
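# Note: the transcript below suggests that the "context" chat mode does not apply
# text_qa_template (the McCartney answer draws on the model's prior knowledge).
chat_engine = vectorstoreindex.as_chat_engine(chat_mode="context",
                                              verbose=True,
                                              text_qa_template=CHAT_TEXT_QA_PROMPT)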
chat_engine.reset()
chat_engine.chat_repl()

===== Entering Chat REPL =====
Type "exit" to exit.



Human:  Who is Paul Graham, in just 10 words?


Assistant: Paul Graham is a programmer, entrepreneur, and essayist.



Human:  Who is Paul McCartney?


Assistant: Paul McCartney is a British musician, singer, and songwriter, best known for being a member of the Beatles.



Human:  exit


### Adding Memory and System Prompt

In [52]:
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=80000)
chat_engine = vectorstoreindex.as_chat_engine(
    chat_mode="context",
    memory=memory,
    system_prompt=(
            "Context information from multiple sources is below.\n"
            "---------------------\n"
            "{context_str}\n"
            "---------------------\n"
            "Given the information from multiple sources, "
            "answer the query.\n"
            "If the query is unrelated to the context, just answer: I don't know.\n"
            "Always start your answer with 'Dear Student'.\n"
            "Query: {query_str}\n"
            "Answer: "
    ),
)
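
The `token_limit` passed to `ChatMemoryBuffer` caps how much conversation history is sent back to the model on each turn. The sketch below shows the general idea of a token-limited buffer. It is a simplified stand-in rather than the library's logic: `count_tokens` and `trim_history` are hypothetical helpers, and whitespace-separated words are assumed in place of real model tokens.

```python
def count_tokens(text):
    # Assumption: whitespace-separated words stand in for model tokens.
    return len(text.split())


def trim_history(messages, token_limit):
    """Keep the most recent messages whose combined token count fits the limit."""
    kept, used = [], 0
    for role, content in reversed(messages):
        cost = count_tokens(content)
        if used + cost > token_limit:
            break  # everything older than this message is dropped as well
        kept.append((role, content))
        used += cost
    return list(reversed(kept))


history = [
    ("user", "Who is Paul Graham?"),
    ("assistant", "A programmer and essayist."),
    ("user", "And who is Paul McCartney?"),
    ("assistant", "I don't know"),
]
recent = trim_history(history, token_limit=9)
print(recent)
```

With a limit as generous as 80000 tokens, nothing is trimmed in a short session like the one below, so every earlier question remains available to the model.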

In [54]:
chat_engine.chat_repl()

===== Entering Chat REPL =====
Type "exit" to exit.



Human:  Who is Paul Graham?


Assistant: Dear Student
Paul Graham is a computer scientist, entrepreneur, and venture capitalist. He is known for co-founding Y Combinator, a startup accelerator, and for his work on programming languages like Lisp. Additionally, he is a prolific writer, known for his essays on a wide range of topics related to technology, startups, and entrepreneurship.




Human:  And who is Paul McCartney?


Assistant: Dear Student
I don't know



Human:  Who participated in the Warren Comission?


Assistant: Dear Student
The Warren Commission was composed of several members, including Chief Justice Earl Warren, Senator John Sherman Cooper, Representative Hale Boggs, Senator Richard Russell Jr., Representative Gerald Ford, Allen Dulles, and John J. McCloy. These individuals were tasked with investigating the assassination of President John F. Kennedy.




Human:  Did Paul Graham was also part?


Assistant: Dear Student
I don't know



Human:  Did Paul Graham was also part of the Graham Comission?


Assistant: Dear Student
I don't know



Human:  I mean, the Warren Comission?


Assistant: Dear Student
Paul Graham was not part of the Warren Commission. The Warren Commission, officially known as the President's Commission on the Assassination of President Kennedy, was established by President Lyndon B. Johnson in 1963 to investigate the assassination of President John F. Kennedy.




Human:  exit


In [55]:
memory.to_dict()

{'token_limit': 80000,
 'chat_store': {'store': {'chat_history': [{'role': <MessageRole.USER: 'user'>,
     'content': 'Who is Paul Graham?',
     'additional_kwargs': {}},
    {'role': <MessageRole.ASSISTANT: 'assistant'>,
     'content': 'Dear Student\nPaul Graham is a computer scientist, entrepreneur, and venture capitalist. He is known for co-founding Y Combinator, a startup accelerator, and for his work on programming languages like Lisp. Additionally, he is a prolific writer, known for his essays on a wide range of topics related to technology, startups, and entrepreneurship.\n',
     'additional_kwargs': {}},
    {'role': <MessageRole.USER: 'user'>,
     'content': 'And who is Paul McCartney?',
     'additional_kwargs': {}},
    {'role': <MessageRole.ASSISTANT: 'assistant'>,
     'content': "Dear Student\nI don't know",
     'additional_kwargs': {}},
    {'role': <MessageRole.USER: 'user'>,
     'content': 'Who participated in the Warren Comission?',
     'additional_kwargs': {}