### Lesson 7: Chat


In [1]:
from dotenv import load_dotenv

load_dotenv()

True

In [2]:
from langchain.chains import ConversationalRetrievalChain, RetrievalQA
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate
from langchain_community.chat_models.cohere import ChatCohere
from langchain_community.embeddings.cohere import CohereEmbeddings
from langchain_community.vectorstores.chroma import Chroma

In [3]:
persist_directory = "./.chroma/"

In [4]:
embedding = CohereEmbeddings(model="embed-multilingual-light-v3.0")

In [5]:
vectordb = Chroma(
    persist_directory=persist_directory,
    embedding_function=embedding,
)

In [6]:
question = "What are major topics for this class?"

docs = vectordb.similarity_search(question, k=3)

In [9]:
print(docs[0].page_content)

middle of class, but because there won't be video you can safely sit there and make faces 
at me, and that won't show, okay?  
Let's see. I also handed out this - there were two handouts I hope most of you have, 
course information handout. So let me just say a few words about parts of these. On the 
third page, there's a section that says Online Resources.  
Oh, okay. Louder? Actually, could you turn up the volume? Testing. Is this better? 
Testing, testing. Okay, cool. Thanks.
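
The `similarity_search` call above returns the `k` chunks whose embeddings lie closest to the embedding of the question. The block below is a minimal sketch of that ranking idea in plain Python, assuming cosine similarity as the score. The metric Chroma actually applies depends on how the collection was configured, and `toy_docs`, `toy_query`, and `top_k` are hypothetical names used only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k vectors most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Hypothetical 3-dimensional "embeddings"; real ones have hundreds of dimensions.
toy_docs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [0.2, 1.0, 0.0]

nearest = top_k(toy_query, toy_docs, k=2)
print(nearest)
```

A vector store does the same kind of ranking, but over stored embeddings and usually with an index so it does not have to score every chunk.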


In [10]:
llm = ChatCohere(model="command-light", temperature=0)

llm.invoke("Hello world!")

AIMessage(content="Hello to you as well! I'm your AI-assistant chatbot, ready to help you with any task or to have a meaningful conversation. Feel free to ask me whatever you'd like, and we can get started!")

In [11]:
# Continuation lines start at column 0 so that no stray indentation
# ends up inside the prompt text.
template = """Use the following pieces of context to answer the question \
at the end. If you don't know the answer, just say that you don't know, \
don't try to make up an answer. Use three sentences maximum. \
Keep the answer as concise as possible. Always say "thanks for asking!" \
at the end of the answer.
{context}
Question: {question}
Helpful Answer:\n
"""

QA_CHAIN_PROMPT = PromptTemplate.from_template(template=template)
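
To make the role of `{context}` and `{question}` concrete, the block below sketches how a "stuff"-style chain might fill such a template: the retrieved chunks are joined into one string and substituted together with the question. It uses plain `str.format` rather than LangChain, and the separator, the shortened `toy_template`, and the chunk texts are assumptions for illustration.

```python
# Shortened stand-in for the template above (hypothetical wording).
toy_template = (
    "Use the following pieces of context to answer the question at the end.\n"
    "{context}\n"
    "Question: {question}\n"
    "Helpful Answer:"
)

# Hypothetical retrieved chunks; in the notebook these come from the retriever.
toy_chunks = ["Chunk one text.", "Chunk two text."]


def fill_prompt(template, chunks, question):
    """Join the chunks into one context string and fill both placeholders."""
    context = "\n\n".join(chunks)  # assumed separator between chunks
    return template.format(context=context, question=question)


filled = fill_prompt(toy_template, toy_chunks, "What is covered?")
print(filled)
```

Because every retrieved chunk is pasted into a single prompt, the number of chunks (`k`) and their size directly affect how much of the model's context window is used.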

In [12]:
question = "Is probability a topic from the lecture material that you can retrieve?"

In [13]:
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectordb.as_retriever(),
    return_source_documents=True,
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
)

In [14]:
result = qa_chain.invoke({"query": question})

print(result["result"])

I cannot retrieve the specific topic of probability in this context, as the provided context does not mention it. Having a good understanding of probability and statistics is essential in machine learning. It is used to understand the likelihood of different events occurring and is a fundamental concept in the field of data analysis. It is one of the topics that will be covered in a refresher session for those who need it in the discussion sections for the course on machine learning. 
If you need more information on probability or any other topic related to machine learning, feel free to ask! 
Thanks for asking!


#### Memory


In [15]:
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
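
Conceptually, a buffer memory just keeps every message of the conversation in order and hands the full list back on each new turn. The block below is a toy sketch of that idea, not LangChain's implementation; `ToyBufferMemory` and its method names are hypothetical.

```python
class ToyBufferMemory:
    """Keeps the whole conversation as an ordered list of (role, text) pairs."""

    def __init__(self):
        self.messages = []

    def save_turn(self, user_text, ai_text):
        # One turn adds two messages: the user's input and the model's reply.
        self.messages.append(("human", user_text))
        self.messages.append(("ai", ai_text))

    def load(self):
        # Return a copy so callers cannot mutate the stored history.
        return list(self.messages)


toy_memory = ToyBufferMemory()
toy_memory.save_turn("Is probability a topic?", "Yes, it is covered.")
toy_memory.save_turn("Why is it needed?", "It underpins the later material.")

history = toy_memory.load()
for role, text in history:
    print(f"{role}: {text}")
```

Since nothing is ever dropped or summarized, the history grows with every turn, which is the main trade-off of a plain buffer.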

#### ConversationalRetrievalChain


In [16]:
qa = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectordb.as_retriever(),
    memory=memory,
)

In [17]:
question = "Is probability a topic of the lecture material that you can retrieve?"

result = qa.invoke({"question": question})

In [18]:
print(result["answer"])

Certainly! Probability is most certainly a topic that can be discussed and retrieved by the databases I have access to. 

Probability makes it possible to predict the likelihood of occurrence of a spontaneous event or the repetition of an event that has a degree of uncertainty. The study of probability is the backbone of statistical analysis and is employed by data scientists and analysts, among other professionals. Would you like further clarification on the definition and applications of probability, or any specific information on what materials I can provide for you?


In [None]:
question = "What is the name of the topic I just asked?"

result = qa.invoke({"question": question})

In [20]:
print(result["answer"])

The topic of Probability is likely being discussed with regards to a mathematical context. Could you provide some additional information about the previous topic of discussion? This will help me to provide more relevant information.
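
A follow-up such as "What is the name of the topic I just asked?" cannot be answered by retrieval alone, because it only makes sense next to the earlier turn. `ConversationalRetrievalChain` addresses this by first asking the LLM to rewrite the follow-up, together with the chat history, into a standalone question, and then retrieving with that rewritten question. The block below sketches only the prompt-building part of that step; the template wording approximates the library's default and should be treated as an assumption, and `format_history` and `build_condense_prompt` are hypothetical helpers.

```python
# Approximation of a "condense question" prompt (wording is an assumption).
TOY_CONDENSE_TEMPLATE = (
    "Given the following conversation and a follow up question, "
    "rephrase the follow up question to be a standalone question.\n\n"
    "Chat History:\n{chat_history}\n"
    "Follow Up Input: {question}\n"
    "Standalone question:"
)


def format_history(turns):
    """Render (question, answer) pairs as alternating Human/Assistant lines."""
    return "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in turns)


def build_condense_prompt(turns, follow_up):
    """Build the prompt that asks the LLM for a standalone question."""
    return TOY_CONDENSE_TEMPLATE.format(
        chat_history=format_history(turns), question=follow_up
    )


toy_turns = [("Is probability a topic?", "Yes, probability is covered.")]
condense_prompt = build_condense_prompt(
    toy_turns, "What is the name of the topic I just asked?"
)
print(condense_prompt)
```

The quality of the final answer therefore depends on two LLM calls: the rewrite and the answer itself. A small model can stumble on either one.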


#### Create a chatbot that works on your documents


In [21]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores.docarray import DocArrayInMemorySearch

In [22]:
def load_db(file, chain_type, k):
    loader = PyPDFLoader(file_path=file)
    documents = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
    docs = text_splitter.split_documents(documents=documents)

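    # Name the embedding model explicitly, matching the one used earlier.
    embeddings = CohereEmbeddings(model="embed-multilingual-light-v3.0")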
    db = DocArrayInMemorySearch.from_documents(documents=docs, embedding=embeddings)

    retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": k})

    qa_chain = ConversationalRetrievalChain.from_llm(
        llm=ChatCohere(model="command-light", temperature=0),
        chain_type=chain_type,
        retriever=retriever,
        return_source_documents=True,
        return_generated_question=True,
    )

    return qa_chain
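
The `chunk_size=1000, chunk_overlap=150` settings in `load_db` mean that consecutive chunks share some text, so a sentence cut at a chunk boundary still appears whole in at least one chunk. The block below illustrates the overlap with a simplified fixed-window splitter. It is not the recursive, separator-aware algorithm that `RecursiveCharacterTextSplitter` uses, and `chunk_text` is a hypothetical helper.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


toy_text = "abcdefghijklmnopqrstuvwxyz"
toy_pieces = chunk_text(toy_text, chunk_size=10, chunk_overlap=3)
print(toy_pieces)
# → ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk begins with the last three characters of the previous one, which is the same effect the 150-character overlap has on the lecture PDF at a larger scale.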

In [None]:
import panel as pn
import param


class cbfs(param.Parameterized):
    chat_history = param.List([])
    answer = param.String("")
    db_query = param.String("")
    db_response = param.List([])

    def __init__(self, **params):
        super().__init__(**params)
        self.panels = []
        self.loaded_file = "docs/cs229_lectures/MachineLearning-Lecture01.pdf"
        self.qa = load_db(self.loaded_file, "stuff", 4)

    def call_load_db(self, count):
        if count == 0 or file_input.value is None:  # initial render, or no file selected
            return pn.pane.Markdown(f"Loaded File: {self.loaded_file}")
        else:
            file_input.save("temp.pdf")  # local copy
            self.loaded_file = file_input.filename
            button_load.button_style = "outline"
            self.qa = load_db("temp.pdf", "stuff", 4)
            button_load.button_style = "solid"
        self.clr_history()
        return pn.pane.Markdown(f"Loaded File: {self.loaded_file}")

    def convchain(self, query):
        if not query:
            return pn.WidgetBox(
                pn.Row("User:", pn.pane.Markdown("", width=600)), scroll=True
            )
        # Use .invoke() like the rest of the notebook; calling the chain directly is deprecated.
        result = self.qa.invoke({"question": query, "chat_history": self.chat_history})
        self.chat_history.extend([(query, result["answer"])])
        self.db_query = result["generated_question"]
        self.db_response = result["source_documents"]
        self.answer = result["answer"]
        self.panels.extend(
            [
                pn.Row("User:", pn.pane.Markdown(query, width=600)),
                pn.Row(
                    "ChatBot:",
                    pn.pane.Markdown(
                        self.answer,
                        width=600,
                        # `styles` (not `style`) matches the other panes in this class.
                        styles={"background-color": "#F6F6F6"},
                    ),
                ),
            ]
        )
        inp.value = ""  # clears loading indicator when cleared
        return pn.WidgetBox(*self.panels, scroll=True)

    @param.depends(
        "db_query",
    )
    def get_lquest(self):
        if not self.db_query:
            return pn.Column(
                pn.Row(
                    pn.pane.Markdown(
                        "Last question to DB:", styles={"background-color": "#F6F6F6"}
                    )
                ),
                pn.Row(pn.pane.Str("no DB accesses so far")),
            )
        return pn.Column(
            pn.Row(
                pn.pane.Markdown("DB query:", styles={"background-color": "#F6F6F6"})
            ),
            pn.pane.Str(self.db_query),
        )

    @param.depends(
        "db_response",
    )
    def get_sources(self):
        if not self.db_response:
            return
        rlist = [
            pn.Row(
                pn.pane.Markdown(
                    "Result of DB lookup:", styles={"background-color": "#F6F6F6"}
                )
            )
        ]
        for doc in self.db_response:
            rlist.append(pn.Row(pn.pane.Str(doc)))
        return pn.WidgetBox(*rlist, width=600, scroll=True)

    @param.depends("convchain", "clr_history")
    def get_chats(self):
        if not self.chat_history:
            return pn.WidgetBox(
                pn.Row(pn.pane.Str("No History Yet")), width=600, scroll=True
            )
        rlist = [
            pn.Row(
                pn.pane.Markdown(
                    "Current Chat History variable",
                    styles={"background-color": "#F6F6F6"},
                )
            )
        ]
        for exchange in self.chat_history:
            rlist.append(pn.Row(pn.pane.Str(exchange)))
        return pn.WidgetBox(*rlist, width=600, scroll=True)

    def clr_history(self, count=0):
        self.chat_history = []
        return

#### Create a chatbot


In [None]:
cb = cbfs()

file_input = pn.widgets.FileInput(accept=".pdf")
button_load = pn.widgets.Button(name="Load DB", button_type="primary")
button_clearhistory = pn.widgets.Button(name="Clear History", button_type="warning")
button_clearhistory.on_click(cb.clr_history)
inp = pn.widgets.TextInput(placeholder="Enter text here…")

bound_button_load = pn.bind(cb.call_load_db, button_load.param.clicks)
conversation = pn.bind(cb.convchain, inp)

jpg_pane = pn.pane.Image("./img/convchain.jpg")

tab1 = pn.Column(
    pn.Row(inp),
    pn.layout.Divider(),
    pn.panel(conversation, loading_indicator=True, height=300),
    pn.layout.Divider(),
)
tab2 = pn.Column(
    pn.panel(cb.get_lquest),
    pn.layout.Divider(),
    pn.panel(cb.get_sources),
)
tab3 = pn.Column(
    pn.panel(cb.get_chats),
    pn.layout.Divider(),
)
tab4 = pn.Column(
    pn.Row(file_input, button_load, bound_button_load),
    pn.Row(
        button_clearhistory,
        pn.pane.Markdown("Clears chat history. Can use to start a new topic"),
    ),
    pn.layout.Divider(),
    pn.Row(jpg_pane.clone(width=400)),
)
dashboard = pn.Column(
    pn.Row(pn.pane.Markdown("# ChatWithYourData_Bot")),
    pn.Tabs(
        ("Conversation", tab1),
        ("Database", tab2),
        ("Chat History", tab3),
        ("Configure", tab4),
    ),
)
dashboard