# Custom Knowledge Chatbot with LlamaIndex


In [43]:
!pip install --upgrade numpy



In [44]:
%pip install llama-index

Note: you may need to restart the kernel to use updated packages.


# Basic LlamaIndex Usage Pattern

In [3]:
import os

os.environ['OPENAI_API_KEY'] = "YOUR_OPENAI_API_KEY"  # placeholder; never commit a real key
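
A key typed into a notebook cell is easy to leak when the notebook is shared. One alternative is to export the key in the shell and only read it here. The following is a minimal sketch under that assumption; `require_api_key` is a hypothetical helper name, not part of LlamaIndex or the OpenAI client.

```python
import os


def require_api_key(name="OPENAI_API_KEY", env=None):
    """Return the key stored under `name`, or raise if it is missing or blank.

    Hypothetical helper for this notebook; `env` defaults to os.environ and
    can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set {name} in your environment before running this notebook.")
    return key
```

Calling `require_api_key()` at the top of the notebook fails early with a clear message instead of surfacing an authentication error on the first query.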

In [4]:
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader('./data').load_data()
index = GPTVectorStoreIndex.from_documents(documents)

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 1321 tokens


In [5]:
# Query your index!

response = index.query("What do you think of Facebook's LLaMa?")
print(response)

INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 1448 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 11 tokens



I think Facebook's LLaMa is a great step forward in democratizing access to large language models and advancing research in this subfield of AI. It is encouraging to see that they are making the model available at several sizes and providing a model card to detail how it was built in accordance with responsible AI practices. I am also glad to see that they are releasing the model under a noncommercial license to ensure integrity and prevent misuse.
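
The log lines above show embedding tokens being spent when the index is built, and both embedding and LLM tokens when it is queried. That is consistent with the usual retrieval flow: embed the document chunks once, embed the query, pick the closest chunks, and hand them to the LLM. The following is a toy sketch of the retrieval step only. It uses word counts in place of a learned embedding model and is not LlamaIndex's implementation; `embed`, `cosine`, and `top_k` are hypothetical names.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts (real indexes use an embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "LLaMA is a foundational large language model released by Meta",
    "ASOS Premier offers free delivery",
]
best = top_k("who released the LLaMA language model", chunks)
print(best)
```

In the real pipeline the selected chunks are inserted into the LLM prompt, which is why the query step reports far more LLM tokens than the question alone contains.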


# Wikipedia Example

In [48]:
import os

os.environ['OPENAI_API_KEY'] = "YOUR_OPENAI_API_KEY"  # placeholder; never commit a real key

In [49]:
from llama_index import download_loader

WikipediaReader = download_loader("WikipediaReader")

loader = WikipediaReader()
wikidocs = loader.load_data(pages=['Cyclone Freddy'])

# https://en.wikipedia.org/wiki/Cyclone_Freddy

In [50]:
index = GPTVectorStoreIndex.from_documents(wikidocs)

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 7883 tokens


In [52]:
response = index.query("What is cyclone freddy?")
print(response)

INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 3910 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 8 tokens




Cyclone Freddy is a very intense tropical cyclone that traversed the southern Indian Ocean for more than five weeks in February and March 2023. It is both the longest-lasting and highest-ACE-producing tropical cyclone ever recorded worldwide. Additionally, it is the third-deadliest tropical cyclone recorded in the Southern Hemisphere, only behind 2019's Cyclone Idai and the 1973 Flores cyclone. It caused catastrophic flooding, wind damage, and loss of life in Madagascar, Mauritius, Mozambique, and Malawi. In Malawi, the rains worsened immensely, while in Mauritius, strong winds and waves were observed along the northern coast of the island, with winds in Port Louis reaching 104 km/h (65 mph) and a peak gust of 154 km/h (96 mph) observed on Signal Mountain. Flooding and gale-force winds also affected the country, resulting in one fatality and at least 500 displaced families in a variety of shelters across Mauritius. Additionally, the Taiwanese-flagged fishing trawler LV Lien Sheng Fa 

# Customer Support Example

In [45]:
import os

os.environ['OPENAI_API_KEY'] = "YOUR_OPENAI_API_KEY"  # placeholder; never commit a real key

In [46]:
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader('./asos').load_data()
index = GPTVectorStoreIndex.from_documents(documents)

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 12584 tokens


In [47]:
response = index.query("What premier service options do I have in the UAE?")
print(response)

INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 1317 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 11 tokens



In the United Arab Emirates, you have the option of signing up for ASOS Premier, which gives you free Standard and Express delivery all year round when you spend over 150 AED. It costs 200 AED and is valid on the order you purchase it on.


# YouTube Video Example

In [53]:
import os

os.environ['OPENAI_API_KEY'] = "YOUR_OPENAI_API_KEY"  # placeholder; never commit a real key

In [54]:
YoutubeTranscriptReader = download_loader("YoutubeTranscriptReader")

loader = YoutubeTranscriptReader()
documents = loader.load_data(ytlinks=['https://www.youtube.com/watch?v=ILY3Q5AxPbc'])

In [55]:
index = GPTVectorStoreIndex.from_documents(documents)

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 1979 tokens


In [57]:
response = index.query("Who created the universe?")
print(response)

INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 2057 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 5 tokens



According to Hinduism, the universe was created through the actions of Lord Vishnu, who slept and let a lotus bloom from his navel. Lord Brahma is the creator, and Lord Shiva is the destroyer.


# Chatbot Class - Just include your index

In [58]:
import openai
import json

class Chatbot:
    def __init__(self, api_key, index):
        self.index = index
        openai.api_key = api_key
        self.chat_history = []

    def generate_response(self, user_input):
        # Recent history, kept for reference; the index query below uses only user_input
        prompt = "\n".join([f"{message['role']}: {message['content']}" for message in self.chat_history[-5:]])
        prompt += f"\nUser: {user_input}"
        # Query the index passed to the constructor, not the notebook-level global
        response = self.index.query(user_input)

        message = {"role": "assistant", "content": response.response}
        self.chat_history.append({"role": "user", "content": user_input})
        self.chat_history.append(message)
        return message
    
    def load_chat_history(self, filename):
        try:
            with open(filename, 'r') as f:
                self.chat_history = json.load(f)
        except FileNotFoundError:
            pass

    def save_chat_history(self, filename):
        with open(filename, 'w') as f:
            json.dump(self.chat_history, f)
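
In `generate_response`, the last five messages are joined into `prompt`, but the call to `query` receives only `user_input`, so the conversation history never reaches the model. The following is a minimal sketch of the same history window as a standalone function, which could be passed to the query instead. `build_prompt` and `window` are hypothetical names introduced here.

```python
def build_prompt(chat_history, user_input, window=5):
    """Join the last `window` messages and the new user turn into one prompt string."""
    lines = [f"{m['role']}: {m['content']}" for m in chat_history[-window:]]
    lines.append(f"user: {user_input}")
    return "\n".join(lines)


history = [
    {"role": "user", "content": "What is Llama"},
    {"role": "assistant", "content": "LLaMA is a large language model."},
]
prompt = build_prompt(history, "who created it?")
print(prompt)
```

Whether querying the index with the full prompt improves follow-up answers depends on the data: a longer query also changes which chunks are retrieved, so it is worth comparing both variants on your own knowledge base.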


In [59]:
documents = SimpleDirectoryReader('./data').load_data()
index = GPTVectorStoreIndex.from_documents(documents)

INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total LLM token usage: 0 tokens
INFO:llama_index.token_counter.token_counter:> [build_index_from_nodes] Total embedding token usage: 1321 tokens


In [63]:
# Swap out your index below for whatever knowledge base you want
bot = Chatbot(os.environ['OPENAI_API_KEY'], index=index)  # read the key from the environment
bot.load_chat_history("chat_history.json")

question_count = 0  # Initialize question counter

while question_count < 3:  # Change the desired number of questions here
    user_input = input("You: ")
    if user_input.lower() in ["bye", "goodbye"]:
        print("Bot: Goodbye!")
        break
    response = bot.generate_response(user_input)
    print(f"Bot: {response['content']}")
    question_count += 1  # Increment question counter

# Save on every exit path, not only when the user says goodbye
bot.save_chat_history("chat_history.json")


You: What is Llama


INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 1428 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 4 tokens


Bot: 
LLaMA is a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI. It is designed to be versatile and can be applied to many different use cases. It is trained on a large set of unlabeled data, which makes it ideal for fine-tuning for a variety of tasks.
You: who created it?


INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 1366 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 4 tokens


Bot: 
Meta created LLaMA (Large Language Model Meta AI).
You: When was it created?


INFO:llama_index.token_counter.token_counter:> [query] Total LLM token usage: 1363 tokens
INFO:llama_index.token_counter.token_counter:> [query] Total embedding token usage: 5 tokens


Bot: 
LLaMA was created in 2021.
