### Chatbots with LlamaIndex  
LlamaIndex serves as a bridge between your data and Large Language Models (LLMs), providing a toolkit that enables you to establish a query interface around your data for a variety of tasks, such as question-answering and summarization.

In this tutorial, we'll walk you through building a context-augmented chatbot.

#### Installing Packages

In [None]:
!pip install -q openai
!pip install -q llama-index
!pip install -q llama-index-experimental
!pip install -q pypdf
!pip install -q docx2txt

#### Importing Packages

In [17]:
import os
#os.environ["OPENAI_API_KEY"] = "<the key>"

import sys
import shutil
import glob
import logging
from pathlib import Path

import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import openai

## LlamaIndex readers
from llama_index.core import SimpleDirectoryReader

## LlamaIndex Index Types
from llama_index.core import ListIndex
from llama_index.core import VectorStoreIndex
from llama_index.core import TreeIndex
from llama_index.core import KeywordTableIndex
from llama_index.core import SimpleKeywordTableIndex
from llama_index.core import DocumentSummaryIndex
from llama_index.core import KnowledgeGraphIndex
from llama_index.experimental.query_engine import PandasQueryEngine


## LlamaIndex Context Managers
from llama_index.core import StorageContext
from llama_index.core import load_index_from_storage
from llama_index.core.response_synthesizers import get_response_synthesizer
from llama_index.core.response_synthesizers import ResponseMode
from llama_index.core.schema import Node

## LlamaIndex Templates
from llama_index.core.prompts import PromptTemplate
from llama_index.core.prompts import ChatPromptTemplate
from llama_index.core.base.llms.types import ChatMessage, MessageRole

## LlamaIndex Callbacks
from llama_index.core.callbacks import CallbackManager
from llama_index.core.callbacks import LlamaDebugHandler

In [2]:
import logging

#logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

#### Defining Models

#### Defining Folders

In [3]:
DOCS_DIR = "./data/"
PERSIST_DIR = "./index/"

print(f"Current dir: {os.getcwd()}")

if not os.path.exists(DOCS_DIR):
    os.mkdir(DOCS_DIR)
# List the folder contents in alphabetical order
docs = sorted(os.listdir(DOCS_DIR))
print(f"Files in {DOCS_DIR}")
for doc in docs:
    print(doc)


Current dir: /home/renato/Documents/Repos/GenAI4Humanists/Notebooks
Files in ./data/
.ipynb_checkpoints
1.pdf
california_housing_train.csv
paul_graham_essay.txt
pg58031.text


### Downloading a sample document  
Skip this step if you are copying your own documents into the folder.

In [None]:
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O './data/paul_graham_essay.txt'


### Creating a Vector Index    


#### Deleting existing Indexes  

(Only if you want to recreate all indexes)

In [4]:
if not os.path.exists(PERSIST_DIR):
    print(f"Creating Directory {PERSIST_DIR}")
    os.mkdir(PERSIST_DIR)
else:
    print(f"Re-Creating Directory {PERSIST_DIR}")
    shutil.rmtree(PERSIST_DIR)
    os.mkdir(PERSIST_DIR)

Re-Creating Directory ./index/


#### Generic function to create or load an index (`.txt` files only)

In [5]:
def create_retrieve_index(index_path, docs_path, index_type):
    if not os.path.exists(index_path):
        print(f"Creating Directory {index_path}")
        os.mkdir(index_path)
    if os.listdir(index_path) == []:
        print("Loading Documents...")
        required_exts = [".txt"]
        documents = SimpleDirectoryReader(required_exts=required_exts, 
                                          input_dir=docs_path).load_data()
        print("Creating Index...")
        index = index_type.from_documents(documents,
                                          show_progress=True,
                                          )
        print("Persisting Index...")
        index.storage_context.persist(persist_dir=index_path)
        print("Done!")
    else:
        print("Reading from Index...")
        index = load_index_from_storage(storage_context=StorageContext.from_defaults(persist_dir=index_path))
        print("Done!")
    return index
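The `required_exts=[".txt"]` filter decides which files in `./data/` end up in the index. The sketch below approximates that selection in plain Python for the folder listing shown earlier. It is a simplification, not the `SimpleDirectoryReader` implementation: the helper name `select_files` is hypothetical, and skipping hidden entries and matching suffixes case-sensitively are assumptions made for illustration.

```python
from pathlib import Path


def select_files(names, required_exts):
    """Return the names whose suffix is listed in required_exts (hypothetical helper)."""
    wanted = set(required_exts)
    selected = []
    for name in names:
        if name.startswith("."):
            continue  # assumption: hidden entries such as .ipynb_checkpoints are skipped
        if Path(name).suffix in wanted:
            selected.append(name)
    return sorted(selected)


# The folder listing printed earlier in this notebook
names = [
    ".ipynb_checkpoints",
    "1.pdf",
    "california_housing_train.csv",
    "paul_graham_essay.txt",
    "pg58031.text",
]
picked = select_files(names, [".txt"])
print(picked)
```

Under these assumptions only `paul_graham_essay.txt` is selected: `pg58031.text` has the suffix `.text`, not `.txt`. That would be consistent with the `Parsing nodes: 0/1` progress line printed when the index is built.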

#### Creating Vector Store Index  

In [6]:
VECTORINDEXDIR = PERSIST_DIR + 'VectorStoreIndex'
vectorstoreindex = create_retrieve_index(VECTORINDEXDIR, DOCS_DIR, VectorStoreIndex)

Creating Directory ./index/VectorStoreIndex
Loading Documents...
Creating Index...


Parsing nodes:   0%|          | 0/1 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/22 [00:00<?, ?it/s]

Persisting Index...
Done!
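Conceptually, a vector index stores one embedding per text chunk and answers a query by returning the chunks whose embeddings are closest to the query's embedding. The sketch below shows that idea with a toy word-count "embedding" and cosine similarity. The functions `embed`, `cosine` and `top_k` and the sample chunks are illustrative assumptions; the real index calls an embedding model and a vector store.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real index would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Made-up chunks for illustration only
chunks = [
    "viaweb was a company that built online stores",
    "painting on big square canvases in new york",
    "y combinator funds startups in batches",
]
best = top_k("which company built online stores", chunks, k=1)
print(best)
```

Only the first chunk shares words with the query, so it ranks first. With real embeddings, similarity reflects meaning rather than exact word overlap.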


## Retrieving your data  

#### Retrieving from [Vector Store Index](https://docs.llamaindex.ai/en/stable/api_reference/query/retrievers/vector_store.html)  


In [7]:
query_engine = vectorstoreindex.as_query_engine(retriever_mode="embedding",
                                                response_mode="accumulate",
                                                verbose=True)
response = query_engine.query("Who was Paul Graham?")
print(response)

Response 1: Paul Graham was a person who, after moving to New York, became a de facto studio assistant for someone who liked to paint on big, square canvases. He later became involved in the world of technology and co-founded a company called Viaweb, which eventually became a model for Y Combinator's funding structure.
---------------------
Response 2: Paul Graham was involved in the founding of Y Combinator (YC) and played a significant role in its leadership. He eventually decided to hand over the reins of YC to someone else and transitioned away from his role there. After leaving YC, he explored painting and writing essays before returning to working on Lisp, a programming language with a unique history and approach.


In [8]:
query_engine = vectorstoreindex.as_query_engine(retriever_mode="embedding",
                                                response_mode="tree_summarize",
                                                verbose=True)

response = query_engine.query("Who is Paul McCartney?")
print(response)

Paul McCartney is not mentioned in the provided context information.


In [9]:
query_engine = vectorstoreindex.as_query_engine(retriever_mode="embedding",
                                                response_mode="compact",
                                                verbose=True)

response = query_engine.query("how many stars are in the galaxy?")
print(response)

There are estimated to be around 100 to 400 billion stars in our Milky Way galaxy.
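The three cells above differ only in `response_mode`. As a rough mental model, `accumulate` answers the query once per retrieved chunk and concatenates the answers, `compact` packs chunks into as few prompts as possible and refines the answer across them, and `tree_summarize` combines partial answers bottom-up until one remains. The sketch below imitates the three strategies with a stub model that only counts calls. Every name in it (`CountingLLM`, `pack`, `max_chars`, `fanout`) is a hypothetical stand-in, and the character-based packing is a simplification of token-based packing.

```python
class CountingLLM:
    """Stand-in for a model: counts calls instead of generating text."""

    def __init__(self):
        self.calls = 0

    def __call__(self, prompt):
        self.calls += 1
        return f"answer #{self.calls}"


def pack(chunks, max_chars):
    """Greedily group chunks so each group stays within max_chars."""
    groups, current, size = [], [], 0
    for chunk in chunks:
        if current and size + len(chunk) > max_chars:
            groups.append(current)
            current, size = [], 0
        current.append(chunk)
        size += len(chunk)
    if current:
        groups.append(current)
    return groups


def accumulate(query, chunks, llm):
    """One call per chunk; the separate answers are joined into one string."""
    answers = [llm(f"{chunk}\nQuery: {query}") for chunk in chunks]
    return "\n---------------------\n".join(
        f"Response {i}: {a}" for i, a in enumerate(answers, start=1)
    )


def compact(query, chunks, llm, max_chars):
    """One call per packed group; each call refines the previous answer."""
    answer = None
    for group in pack(chunks, max_chars):
        prompt = "\n".join(group) + f"\nQuery: {query}"
        if answer is not None:
            prompt += f"\nExisting answer: {answer}"
        answer = llm(prompt)
    return answer


def tree_summarize(query, chunks, llm, fanout=2):
    """Answer groups of `fanout` texts, then combine the answers recursively.

    Assumes fanout >= 2 so that every round shrinks the list.
    """
    texts = list(chunks)
    while True:
        groups = [texts[i:i + fanout] for i in range(0, len(texts), fanout)]
        texts = [llm("\n".join(g) + f"\nQuery: {query}") for g in groups]
        if len(texts) <= 1:
            return texts[0] if texts else None


chunks = ["a" * 40, "b" * 40, "c" * 40]
calls = {}
for name, run in [
    ("accumulate", lambda llm: accumulate("q", chunks, llm)),
    ("compact", lambda llm: compact("q", chunks, llm, max_chars=100)),
    ("tree_summarize", lambda llm: tree_summarize("q", chunks, llm, fanout=2)),
]:
    llm = CountingLLM()
    run(llm)
    calls[name] = llm.calls
print(calls)
```

The `Response 1: ... Response 2: ...` layout printed by the `accumulate` cell above matches the per-chunk behaviour sketched here.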


## Creating a Simple Interactive Chatbot for our Index  

Each chat mode has its own [retrieval mode](https://www.actalyst.ai/blog/optimizing-enterprise-chatbots-with-llamaindex-chat-modes-in-a-rag-system).

![](https://assets-global.website-files.com/652165b2d773cd686ae910a6/652e2b0154e3af353e2d9422_actalyst-llalaindex-chatmodes.png)

In [12]:
chat_engine = vectorstoreindex.as_chat_engine(chat_mode="context", verbose=False)
chat_engine.reset()
chat_engine.chat_repl()

===== Entering Chat REPL =====
Type "exit" to exit.



Human:  Who is Paul Graham?


Assistant: Paul Graham is a computer scientist, entrepreneur, and venture capitalist. He is best known as the co-founder of Y Combinator, a prominent startup accelerator based in Silicon Valley. Graham is also known for his work in programming languages, particularly his contributions to the development of the Lisp programming language. Additionally, he is a prolific writer and has authored numerous essays on topics ranging from technology and startups to art and creativity.



Human:  How many stars are in the universe?


Assistant: The exact number of stars in the universe is difficult to determine due to the vastness of space and the limitations of current technology. However, astronomers estimate that there are around 100 billion stars in our own Milky Way galaxy alone. Considering that there are billions of galaxies in the observable universe, the total number of stars is likely in the order of magnitude of hundreds of billions of billions (or hundreds of sextillions).



Human:  What is the essay about?


Assistant: The essay by Paul Graham discusses his personal experiences and reflections on his early interests in writing and programming, his transition to working on microcomputers, his college studies in philosophy and eventual switch to artificial intelligence (AI), and his involvement in various projects such as writing essays, developing spam filters, and starting an investment firm. Graham also reflects on the importance of working on things that may not be prestigious but hold genuine value and the significance of following one's interests and motivations in pursuing meaningful work.



Human:  exit


In [14]:
chat_engine = vectorstoreindex.as_chat_engine(chat_mode="condense_question", verbose=False)
chat_engine.reset()
chat_engine.chat_repl()

===== Entering Chat REPL =====
Type "exit" to exit.



Human:  Who is Paul Graham?


Assistant: Paul Graham is an individual who was involved in the tech industry, particularly in the development of software and online platforms. He co-founded a company called Viaweb, which focused on building online stores. Later on, he was also a part of Y Combinator, a startup accelerator, where he played a significant role in its operations and leadership. Additionally, Paul Graham is known for his interest in painting and writing essays, showcasing a diverse range of interests and talents.



Human:  How many stars are in the universe?


Assistant: Paul Graham is an individual who transitioned from working on painting and art-related projects to becoming involved in the tech industry. He co-founded Viaweb, a company that developed software for building online stores, and later became a co-founder of Y Combinator, a startup accelerator. Graham's interest in technology led him to explore the potential of the World Wide Web in the mid-1990s, where he recognized the opportunity to create online stores. Additionally, he developed a new dialect of Lisp called Arc. Alongside his tech ventures, Graham is also known for his passion for writing essays, which he began publishing online, recognizing the power of the internet as a platform for sharing ideas. His experiences in various fields have shaped his perspective on pursuing work that may not initially be prestigious but holds potential for discovery and genuine interest.



Human:  What is the essay about?


Assistant: The topic of Paul Graham's essays is varied, covering subjects such as startups, programming languages like Lisp, entrepreneurship, and reflections on his own experiences. His essays reflect his diverse range of interests and talents by showcasing his ability to delve into different fields such as technology, business, and art, while also sharing personal anecdotes and insights. This demonstrates his multidisciplinary approach to writing and thinking, highlighting his curiosity and expertise across various domains.



Human:  exit
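The two transcripts answer the same questions differently. A plausible reading is that `context` mode retrieves text for each message and then lets the model answer with that text plus the chat history, whereas `condense_question` mode first rewrites the history and the new message into one standalone question and sends only that question to the index. The sketch below contrasts the two flows with stub functions. All names here (`retrieve`, `llm`, `query_engine`, `seen`) are hypothetical stand-ins, not the library's internals.

```python
def context_chat(message, history, retrieve, llm):
    """'context' style: retrieve with the raw message, answer with context + history."""
    context = retrieve(message)
    reply = llm(system=f"Context:\n{context}", history=history, message=message)
    history.extend([("user", message), ("assistant", reply)])
    return reply


def condense_question_chat(message, history, query_engine, llm):
    """'condense_question' style: rewrite to a standalone question, then query the index."""
    standalone = llm(
        system="Rewrite the last message as a standalone question.",
        history=history,
        message=message,
    )
    reply = query_engine(standalone)  # the answer always comes from the index
    history.extend([("user", message), ("assistant", reply)])
    return reply


# Stubs that record what each component receives
seen = {}


def retrieve(text):
    seen["retriever_input"] = text
    return "essay excerpt"


def llm(system, history, message):
    if system.startswith("Rewrite"):
        return f"(standalone form of: {message})"
    return f"reply to {message!r} using {len(history)} prior messages"


def query_engine(question):
    seen["query_engine_input"] = question
    return f"index answer for {question}"


history = []
context_chat("Who is Paul Graham?", history, retrieve, llm)
condense_question_chat("How many stars are in the universe?", history, query_engine, llm)
print(seen)
```

In the second flow the model never answers the user directly; it only rewrites the question. That is consistent with the second transcript, where the question about stars is answered from the essay rather than from general knowledge.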


### Creating a [Customized Prompt](https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/prompts/chat_prompts.py) Chatbot

In [21]:
TEXT_QA_SYSTEM_PROMPT = ChatMessage(
    content=(
        "You are an expert Q&A system that is trusted around the world.\n"
        "Always answer the query using the provided context information, "
        "and not prior knowledge.\n"
        "Some rules to follow:\n"
        "1. Never directly reference the given context in your answer.\n"
        "2. Avoid statements like 'Based on the context, ...' or "
        "'The context information ...' or anything along "
        "those lines."
    ),
    role=MessageRole.SYSTEM,
)

TEXT_QA_PROMPT_TMPL_MSGS = [
    TEXT_QA_SYSTEM_PROMPT,
    ChatMessage(
        content=(
            "Context information is below.\n"
            "---------------------\n"
            "{context_str}\n"
            "---------------------\n"
            "Given the context information and not prior knowledge, "
            "answer the query.\n"
            "Query: {query_str}\n"
            "Answer: "
        ),
        role=MessageRole.USER,
    ),
]

CHAT_TEXT_QA_PROMPT = ChatPromptTemplate(message_templates=TEXT_QA_PROMPT_TMPL_MSGS)

# Tree Summarize
TREE_SUMMARIZE_PROMPT_TMPL_MSGS = [
    TEXT_QA_SYSTEM_PROMPT,
    ChatMessage(
        content=(
            "Context information from multiple sources is below.\n"
            "---------------------\n"
            "{context_str}\n"
            "---------------------\n"
            "Given the information from multiple sources and not prior knowledge, "
            "answer the query.\n"
            "Query: {query_str}\n"
            "Answer: "
        ),
        role=MessageRole.USER,
    ),
]

CHAT_TREE_SUMMARIZE_PROMPT = ChatPromptTemplate(
    message_templates=TREE_SUMMARIZE_PROMPT_TMPL_MSGS
)


# Refine Prompt
CHAT_REFINE_PROMPT_TMPL_MSGS = [
    ChatMessage(
        content=(
            "You are an expert Q&A system that strictly operates in two modes "
            "when refining existing answers:\n"
            "1. **Rewrite** an original answer using the new context.\n"
            "2. **Repeat** the original answer if the new context isn't useful.\n"
            "Never reference the original answer or context directly in your answer.\n"
            "When in doubt, just repeat the original answer.\n"
            "New Context: {context_msg}\n"
            "Query: {query_str}\n"
            "Original Answer: {existing_answer}\n"
            "New Answer: "
        ),
        role=MessageRole.USER,
    )
]


CHAT_REFINE_PROMPT = ChatPromptTemplate(message_templates=CHAT_REFINE_PROMPT_TMPL_MSGS)


# Table Context Refine Prompt
CHAT_REFINE_TABLE_CONTEXT_TMPL_MSGS = [
    ChatMessage(content="{query_str}", role=MessageRole.USER),
    ChatMessage(content="{existing_answer}", role=MessageRole.ASSISTANT),
    ChatMessage(
        content=(
            "We have provided a table schema below. "
            "---------------------\n"
            "{schema}\n"
            "---------------------\n"
            "We have also provided some context information below. "
            "{context_msg}\n"
            "---------------------\n"
            "Given the context information and the table schema, "
            "refine the original answer to better "
            "answer the question. "
            "If the context isn't useful, return the original answer."
        ),
        role=MessageRole.USER,
    ),
]
CHAT_REFINE_TABLE_CONTEXT_PROMPT = ChatPromptTemplate(
    message_templates=CHAT_REFINE_TABLE_CONTEXT_TMPL_MSGS
)
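Each template above is a list of role-tagged messages with placeholders such as `{context_str}` and `{query_str}`. A template only works where the caller supplies exactly those variables. The sketch below mimics the filling step with plain `str.format`. The helper `format_messages` and the sample values are illustrative assumptions, and the real `ChatPromptTemplate` may treat missing variables differently.

```python
def format_messages(message_templates, **values):
    """Fill the {placeholders} in each (role, template) pair (hypothetical helper)."""
    return [(role, template.format(**values)) for role, template in message_templates]


qa_templates = [
    ("system", "Always answer the query using the provided context information."),
    (
        "user",
        "Context information is below.\n"
        "---------------------\n"
        "{context_str}\n"
        "---------------------\n"
        "Query: {query_str}\n"
        "Answer: ",
    ),
]

refine_templates = [
    (
        "user",
        "New Context: {context_msg}\n"
        "Query: {query_str}\n"
        "Original Answer: {existing_answer}\n"
        "New Answer: ",
    ),
]

# A question-answering step supplies context_str and query_str
messages = format_messages(
    qa_templates,
    context_str="(retrieved text goes here)",
    query_str="Who was Paul Graham?",
)
print(messages[1][1])

# The refine template needs different variables, so the same call fails
try:
    format_messages(refine_templates, context_str="x", query_str="y")
    missing_name = None
except KeyError as missing:
    missing_name = missing.args[0]
print(f"missing variable: {missing_name}")
```

The takeaway is that the question-answering and refine templates are not interchangeable: each one belongs to the step that provides its variables.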

In [24]:
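# text_qa_template expects {context_str} and {query_str}, which CHAT_TEXT_QA_PROMPT
# provides; CHAT_REFINE_PROMPT uses {context_msg} and {existing_answer} instead.
chat_engine = vectorstoreindex.as_chat_engine(chat_mode="context",
                                              verbose=True,
                                              text_qa_template=CHAT_TEXT_QA_PROMPT)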
chat_engine.reset()
chat_engine.chat_repl()

===== Entering Chat REPL =====
Type "exit" to exit.



Human:  Who is Paul Graham?


Assistant: Paul Graham is a computer scientist, entrepreneur, and venture capitalist. He is best known as the co-founder of Y Combinator, a startup accelerator that has helped launch numerous successful companies such as Dropbox, Airbnb, and Reddit. Graham is also a prolific writer and has published essays on a wide range of topics including startups, technology, and programming languages. Additionally, he has a background in computer science and has made significant contributions to the field, particularly in the area of programming languages.



Human:  How many stars in the universe?


Assistant: The exact number of stars in the universe is difficult to determine due to the vastness of space and the limitations of current technology. However, astronomers estimate that there are around 100 billion stars in our Milky Way galaxy alone. And considering that there are billions of galaxies in the observable universe, the total number of stars is likely in the order of magnitude of hundreds of billions of billions (or hundreds of sextillions). This number is truly mind-boggling and highlights the immense scale of the universe.



Human:  exit


### Adding Memory and System Prompt

In [41]:
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=80000)
chat_engine = vectorstoreindex.as_chat_engine(
    chat_mode="context",
    memory=memory,
    system_prompt=(
            "Context information from multiple sources is below.\n"
            "---------------------\n"
            "{context_str}\n"
            "---------------------\n"
            # Adjacent string literals are concatenated, so each piece needs
            # its own trailing space or newline.
            "Given the information from multiple sources, "
            "answer the query.\n"
            "If the query is unrelated to the context, just answer: I don't know.\n"
            "Always start your answer with 'Dear Student'.\n"
            "Query: {query_str}\n"
            "Answer: "
    ),
)
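The `token_limit` passed to the memory buffer caps how much chat history is sent back to the model on each turn. A minimal way to picture it: walk the history from newest to oldest and keep messages until the budget is spent. The sketch below does that with a crude whitespace token count. The helpers `count_tokens` and `recent_history` are hypothetical, and the sketch ignores details a real buffer handles, such as using a proper tokenizer or keeping user and assistant turns aligned.

```python
def count_tokens(text):
    """Crude token count: whitespace-separated words (a real buffer uses a tokenizer)."""
    return len(text.split())


def recent_history(messages, token_limit):
    """Return the longest run of most-recent messages that fits within token_limit."""
    kept, total = [], 0
    for role, content in reversed(messages):
        cost = count_tokens(content)
        if total + cost > token_limit:
            break  # older messages are dropped once the budget is exceeded
        kept.append((role, content))
        total += cost
    return list(reversed(kept))


history = [
    ("user", "Who is Paul Graham?"),
    ("assistant", "Dear Student, I don't know."),
    ("user", "Who is Jessica?"),
]

small = recent_history(history, token_limit=7)
large = recent_history(history, token_limit=80000)
print(small)
print(len(large))
```

With a budget of 7 only the latest message fits, while a large budget such as 80000 keeps the whole conversation, which is why follow-up questions can rely on earlier turns.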

In [45]:
chat_engine.chat_repl()

===== Entering Chat REPL =====
Type "exit" to exit.



Human:  Who is Paul Graham?


Assistant: Dear Student,
I don't know.



Human:  Who is Paul Graham?


Assistant: Dear Student,
Paul Graham is a computer scientist, entrepreneur, and venture capitalist. He is known for co-founding Viaweb, which was later acquired by Yahoo, and for being one of the co-founders of Y Combinator, a well-known startup accelerator. Graham is also a prolific writer and has written numerous essays on topics ranging from startups to programming languages.




Human:  Who is Jessica?


Assistant: Dear Student,
I don't know.



Human:  Who is Jessica Livingstone?


Assistant: Dear Student,
Jessica Livingston is a co-founder of Y Combinator, a startup accelerator. She is known for her work in compiling a book of interviews with startup founders and for her role in the early days of Y Combinator alongside Paul Graham and others.




Human:  exit


In [46]:
memory.to_dict()

{'token_limit': 80000,
 'chat_store': {'store': {'chat_history': [{'role': <MessageRole.USER: 'user'>,
     'content': 'Who is Paul Graham?',
     'additional_kwargs': {}},
    {'role': <MessageRole.ASSISTANT: 'assistant'>,
     'content': "Dear Student,\nI don't know.",
     'additional_kwargs': {}},
    {'role': <MessageRole.USER: 'user'>,
     'content': 'Who is Paul Graham?',
     'additional_kwargs': {}},
    {'role': <MessageRole.ASSISTANT: 'assistant'>,
     'content': 'Dear Student,\nPaul Graham is a computer scientist, entrepreneur, and venture capitalist. He is known for co-founding Viaweb, which was later acquired by Yahoo, and for being one of the co-founders of Y Combinator, a well-known startup accelerator. Graham is also a prolific writer and has written numerous essays on topics ranging from startups to programming languages.\n',
     'additional_kwargs': {}},
    {'role': <MessageRole.USER: 'user'>,
     'content': 'Who is Jessica?',
     'additional_kwargs': {}},
    