<a href="https://colab.research.google.com/github/harshdhamecha/qa-chatbot/blob/main/Q_A_Bot_AISensy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Background

This notebook demonstrates a Q&A chatbot built with Retrieval-Augmented Generation (RAG).

You can download the sample documents from my [GitHub](https://github.com/harshdhamecha/qa-chatbot/tree/main/sample). They are three AiSensy blog posts that I converted to PDFs.


# Setup

In [7]:
# Remove Colab's default sample_data directory if it exists

!rm -rf /content/sample_data

## Install Packages

In [1]:
!pip install --upgrade --quiet langchain-pinecone langchain-openai langchain langchain-community pinecone-client tiktoken

In [2]:
!pip install unstructured -q
!pip install "unstructured[local-inference]" -q

## API Keys

In [3]:
import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter OPENAI API Key\n")
os.environ['PINECONE_API_KEY'] = getpass.getpass("Enter Pinecone API Key\n")

Enter OPENAI API Key
··········
Enter Pinecone API Key
··········
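
Re-running the cell above prompts for both keys every time. Below is a minimal sketch of one alternative, assuming you would rather reuse a key that is already present in the environment. `ensure_api_key` is a hypothetical helper written for this illustration; it is not part of the original notebook or of any library.

```python
import getpass
import os


def ensure_api_key(name, prompt=None):
    """Return os.environ[name], asking for it interactively only when it is missing."""
    if not os.environ.get(name):
        # getpass hides the typed value, as in the cell above
        os.environ[name] = getpass.getpass(prompt or f"Enter {name}\n")
    return os.environ[name]
```

Calling `ensure_api_key("OPENAI_API_KEY")` and `ensure_api_key("PINECONE_API_KEY")` would then prompt only the first time in a session.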


# Data Preparation

In [4]:
# Document loaders live in langchain-community in recent LangChain releases
from langchain_community.document_loaders import DirectoryLoader

def load_documents(path):
    loader = DirectoryLoader(path)
    documents = loader.load()
    return documents

In [5]:
directory = '/content/'
documents = load_documents(directory)

In [6]:
len(documents)

3

In [7]:
# Text splitters live in the langchain-text-splitters package in recent LangChain releases
from langchain_text_splitters import RecursiveCharacterTextSplitter

def split_documents(documents, chunk_size=400, chunk_overlap=30):
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    chunks = text_splitter.split_documents(documents)
    return chunks

In [8]:
chunks = split_documents(documents)
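
`chunk_size` caps the length of each chunk (in characters by default) and `chunk_overlap` is the amount of text that adjacent chunks share, so content that straddles a boundary is not lost entirely. The sketch below illustrates only that windowing idea with a fixed-size character window. It is a deliberate simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, so its chunks will differ. `sliding_window_chunks` is a made-up name for this illustration.

```python
def sliding_window_chunks(text, chunk_size=400, chunk_overlap=30):
    """Cut text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts this far after the previous one
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


# Synthetic 1000-character text, used only to show the window arithmetic
sample = "".join(str(i % 10) for i in range(1000))
pieces = sliding_window_chunks(sample)

print([len(piece) for piece in pieces])
# The tail of one chunk is repeated at the head of the next
print(pieces[0][-30:] == pieces[1][:30])
```

With the defaults used in this notebook, each window starts 370 characters after the previous one and repeats its last 30 characters.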

## Embeddings

In [11]:
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model='text-embedding-ada-002')

## Create Vector Database

In [12]:
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(
    api_key=os.environ.get("PINECONE_API_KEY")
)

index_name = 'qa-langchain'

if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=1536,
        metric='euclidean',
        spec=ServerlessSpec(
            cloud='aws',
            region='us-west-2'
        )
    )

index = pc.Index(index_name)
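
The index `dimension` has to match the length of the embedding vectors, which is 1536 for `text-embedding-ada-002`. On the choice of `metric`: the sketch below checks, on made-up three-dimensional vectors, that Euclidean distance and cosine similarity rank neighbours identically once vectors are unit length, because the squared distance equals `2 - 2 * cosine`. It assumes the stored embeddings are normalized (OpenAI documents this for its embedding models); for unnormalized vectors the two metrics can disagree.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Dot product; equals cosine similarity only for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy vectors standing in for embeddings
query = normalize([1.0, 2.0, 0.5])
docs = {
    "a": normalize([1.0, 2.1, 0.4]),
    "b": normalize([-1.0, 0.3, 2.0]),
    "c": normalize([0.5, 1.0, 1.5]),
}

by_cosine = sorted(docs, key=lambda k: cosine_similarity(query, docs[k]), reverse=True)
by_euclid = sorted(docs, key=lambda k: euclidean_distance(query, docs[k]))
print(by_cosine, by_euclid)
```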

In [13]:
from langchain_pinecone import PineconeVectorStore

docsearch = PineconeVectorStore.from_documents(chunks, embeddings, index_name=index_name)

In [14]:
def get_similar_docs(query):
    return docsearch.similarity_search(query)
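
`similarity_search` embeds the query and returns the stored chunks whose vectors are closest to it (four by default, via its `k` parameter). The toy sketch below shows that nearest-neighbour step on hand-written two-dimensional vectors. It is an in-memory illustration only, not how Pinecone performs the search, and `top_k` and the chunk labels are invented for the example.

```python
import heapq
import math


def top_k(query_vec, stored, k=4):
    """Return the k (text, distance) pairs closest to query_vec by Euclidean distance."""
    scored = ((text, math.dist(query_vec, vec)) for text, vec in stored)
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Made-up (text, vector) pairs standing in for embedded chunks
stored = [
    ("chunk A", [0.9, 0.1]),
    ("chunk B", [0.1, 0.9]),
    ("chunk C", [0.7, 0.3]),
    ("chunk D", [0.5, 0.5]),
    ("chunk E", [0.2, 0.8]),
]

results = top_k([0.0, 1.0], stored, k=2)
print([text for text, _ in results])
```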

## Prompt Design

In [16]:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that provides answers based on the following context: {context}. "
            "Respond politely and humorously when an off-topic or out-of-context question is asked."
        ),
        ("human", "{input}")
    ]
)
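
The template has two placeholders: `{context}` receives the retrieved text and `{input}` receives the user's question. As a rough illustration of that substitution, here is a sketch that uses plain `str.format` as a stand-in; `render_messages` is a hypothetical helper and the system text is shortened, so this is not `ChatPromptTemplate` itself.

```python
# Shortened copies of the two message templates, for illustration only
SYSTEM_TEMPLATE = (
    "You are a helpful assistant that provides answers "
    "based on the following context: {context}."
)
HUMAN_TEMPLATE = "{input}"


def render_messages(context, user_input):
    """Fill both placeholders and return (role, text) pairs in message order."""
    return [
        ("system", SYSTEM_TEMPLATE.format(context=context)),
        ("human", HUMAN_TEMPLATE.format(input=user_input)),
    ]


messages = render_messages("DTC ticket booking", "What is new?")
for role, text in messages:
    print(f"{role}: {text}")
```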

# Model

In [15]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model='gpt-4o')

In [17]:
chain = prompt | llm
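
The `|` operator composes the two steps: the output of `prompt` becomes the input of `llm`, and the result is a single object with its own `invoke`. The toy class below mimics that idea with Python's `__or__` hook. It is only an illustration of the composition pattern, not LangChain's implementation; `Step`, `fill_prompt`, and `fake_llm` are invented names, and the "model" simply upper-cases its input.

```python
class Step:
    """Wrap a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Run this step first, then feed its output to the next step
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


fill_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['input']}")
fake_llm = Step(lambda text: text.upper())  # stand-in for a real model call

toy_chain = fill_prompt | fake_llm
result = toy_chain.invoke({"context": "bus tickets", "input": "what is new?"})
print(result)
```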

In [18]:
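def get_answer(query, chain):
    """Retrieve chunks similar to the query and ask the chain to answer from them."""
    # The retrieved list of Document objects is passed to the prompt as-is
    context = get_similar_docs(query)
    response = chain.invoke(
        {
            "input": query,
            "context": context
        }
    )
    return response.content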
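
Because `context` is a list of `Document` objects, the prompt receives the list's Python string representation, which typically includes metadata alongside the text. One possible refinement, sketched below under that assumption, is to join only the `page_content` of each chunk before calling the chain. `format_docs` is a hypothetical helper and `FakeDocument` is a stand-in class so the example runs without LangChain.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Stand-in exposing the page_content and metadata attributes used here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs, separator="\n\n"):
    """Join the text of each retrieved chunk, leaving metadata out of the prompt."""
    return separator.join(doc.page_content for doc in docs)


docs = [
    FakeDocument("first chunk"),
    FakeDocument("second chunk", {"source": "a.pdf"}),
]
print(format_docs(docs))
```

In `get_answer`, this would amount to passing `format_docs(get_similar_docs(query))` as the `context` value.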

# Test

In [19]:
query = "What is the recent update or feature launch by AISensy?"
answer = get_answer(query, chain)

In [20]:
print(answer)

It looks like you're interested in the latest and greatest! Recently, AiSensy enabled WhatsApp Ticket Booking for the Delhi Transport Corporation (DTC). Now you can book your DTC bus tickets via WhatsApp—how convenient is that? If only I could book a vacation through WhatsApp too! 🚌📱


In [21]:
# Out-of-context question
query = "How to play cricket?"
answer = get_answer(query, chain)

In [22]:
answer

"Ah, cricket! The glorious game that can turn a sunny afternoon into a thrilling saga. While I'd love to dive into the specifics of googlies and cover drives, I'm here to chat about WhatsApp content marketing strategies. But hey, if you ever need to turn a WhatsApp chat into a cricket discussion, I can help with that! 🏏😄"

Notice how creatively it handled the out-of-context question?