### Lesson 6: Question Answering


In [1]:
from dotenv import load_dotenv

load_dotenv()

True

In [2]:
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain_community.chat_models.cohere import ChatCohere
from langchain_community.embeddings.cohere import CohereEmbeddings
from langchain_community.vectorstores.chroma import Chroma

In [3]:
persist_directory = "./.chroma/"

In [4]:
embedding = CohereEmbeddings()

In [5]:
vectordb = Chroma(
    persist_directory=persist_directory,
    embedding_function=embedding,
)

In [6]:
vectordb._collection.count()

208

In [7]:
question = "What are major topics of the lectures given?"

docs = vectordb.max_marginal_relevance_search(query=question, k=3)

len(docs)

3
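
`max_marginal_relevance_search` balances relevance to the query against diversity among the returned chunks. The following is a minimal pure-Python sketch of that idea on toy 2-D vectors. It is not Chroma's implementation: `cosine`, `mmr_select`, and the toy data are hypothetical names introduced for illustration, and the `lambda_mult` weighting is an assumption about how the trade-off is typically expressed.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def mmr_select(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Greedily pick k document indices, penalizing similarity to earlier picks."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Highest similarity to anything already selected (0.0 on the first pick).
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy "embeddings": documents 0 and 1 are near-duplicates, document 2 is different.
toy_query = [1.0, 0.0]
toy_docs = [[1.0, 0.1], [1.0, 0.11], [0.8, -0.6]]

diverse = mmr_select(toy_query, toy_docs, k=2)  # skips the near-duplicate
relevance_only = mmr_select(toy_query, toy_docs, k=2, lambda_mult=1.0)
print(diverse, relevance_only)
```

With `lambda_mult=1.0` the redundancy penalty vanishes and the selection reduces to a plain similarity ranking.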

#### RetrievalQA Chain


In [8]:
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatCohere(temperature=0.05),
    retriever=vectordb.as_retriever(),
)

In [9]:
result = qa_chain.invoke({"query": question})

In [10]:
print(result["result"])

I'm happy to help you identify the major topics of lectures given, but could you provide some more details about the lectures? For example, if you provide the names or titles of the lectures, it would help me provide more specific and relevant information. 

Furthermore, if you could tell me more about the context of the lectures, such as the subject area or the course they are given in, I might be able to provide more tailored suggestions. 

For instance, if you provide details like "a series of lectures on the French Revolution given in a History course" or "a lecture on database management given in a Computer Science course", I would be able to generate more specific answers. 

With this additional information, I would be able to provide a more comprehensive list of major topics covered in the lectures. 

Please provide more details about the lectures you're interested in.
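
When no `chain_type` is passed, `RetrievalQA.from_chain_type` uses the "stuff" strategy: all retrieved chunks go into a single prompt and the model is called once. A rough sketch of that flow, with a hypothetical `fake_llm` stub standing in for `ChatCohere` and a simplified prompt that is not LangChain's actual default:

```python
def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a chat model: reports how much text it received.
    return f"(model saw {len(prompt)} characters)"


def stuff_answer(question, chunks, llm):
    """Join every retrieved chunk into one prompt and call the model once."""
    context = "\n\n".join(chunks)
    prompt = f"{context}\n\nQuestion: {question}\nHelpful Answer:"
    return prompt, llm(prompt)


# Invented chunk texts, for illustration only.
toy_chunks = ["chunk about prerequisites", "chunk about discussion sections"]
stuff_prompt, stuff_result = stuff_answer(
    "What are major topics of the lectures given?", toy_chunks, fake_llm
)
print(stuff_prompt)
print(stuff_result)
```

Because everything travels in one call, this approach is cheap, but the combined chunks must fit in the model's context window.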


#### Prompt


In [11]:
# Build prompt
template = """Use the following pieces of context to answer the question at the end. \
    If you don't know the answer, just say that you don't know, \
    don't try to make up an answer. Use three sentences maximum. \
    Keep the answer as concise as possible. Always say "thanks for asking!" \
    at the end of the answer. \
{context}
Question: {question}
Helpful Answer:\n
"""

QA_CHAIN_PROMPT = PromptTemplate.from_template(template=template)

In [12]:
QA_CHAIN_PROMPT

PromptTemplate(input_variables=['context', 'question'], template='Use the following pieces of context to answer the question at the end.     If you don\'t know the answer, just say that you don\'t know,     don\'t try to make up an answer. Use three sentences maximum.     Keep the answer as concise as possible. Always say "thanks for asking!"     at the end of the answer. {context}\nQuestion: {question}\nHelpful Answer:\n\n')
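
The runs of spaces in the `template` repr above come from the backslash line continuations: a backslash removes the line break inside a triple-quoted string but keeps the next line's indentation. The sketch below reproduces the effect with plain `str.format` (no LangChain needed) and shows one possible alternative using adjacent string literals. The shortened prompt text is illustrative, not the notebook's exact template.

```python
# Backslash-newline drops the line break, but the indentation that follows
# stays in the string, and {context} ends up on the same line as the instructions.
indented = """Use the context to answer. \
    If you don't know, say so. \
{context}
Question: {question}
Helpful Answer:"""

# Adjacent string literals avoid the stray spaces and make line breaks explicit.
clean = (
    "Use the context to answer. "
    "If you don't know, say so.\n"
    "{context}\n"
    "Question: {question}\n"
    "Helpful Answer:"
)

filled = clean.format(
    context="<retrieved chunks>",
    question="Is probability a topic from the lectures given?",
)
print(repr(indented))
print(filled)
```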

In [13]:
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatCohere(temperature=0.05),
    retriever=vectordb.as_retriever(),
    return_source_documents=True,
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
)

In [14]:
question = "Is probability a topic from the lectures given?"

In [15]:
result = qa_chain.invoke({"query": question})

In [16]:
print(result["result"])

Yes, probability is indeed a topic that will be covered in the lectures given later in the quarter. The instructor mentions that they will go over topics like convex optimization and hidden Markov models, which are extensions of the material, in the main lectures. 

Would you like to know more about the topics that will be covered in the lectures? Feel free to let me know if there's anything else you'd like to know about the course or the discussion sections related to it. 

Thanks for asking! 

Do you have any other questions about your coursework or studies?


In [17]:
print(result["source_documents"][0].page_content)

statistics for a while or maybe algebra, we'll go over those in the discussion sections as a 
refresher for those of you that want one.  
Later in this quarter, we'll also use the disc ussion sections to go over extensions for the 
material that I'm teaching in the main lectur es. So machine learning is a huge field, and 
there are a few extensions that we really want  to teach but didn't have time in the main 
lectures for.


#### RetrievalQA chain types


In [18]:
qa_chain_mr = RetrievalQA.from_chain_type(
    llm=ChatCohere(),
    retriever=vectordb.as_retriever(),
    chain_type="map_reduce",
)

In [19]:
result = qa_chain_mr.invoke({"query": question})

In [20]:
print(result["result"])

Yes, probability is a fundamental topic that is often covered in lectures given in universities and colleges. It is a branch of mathematics that deals with the measurement and quantification of uncertainty. Probability provides a framework for analyzing and understanding the likelihood or chance of events occurring.

Probability theory finds wide application in various fields, including mathematics, statistics, computer science, physics, engineering, finance, and economics. It serves as the foundation for statistical analysis, enabling researchers, scientists, and decision-makers to make informed choices based on data.

Specifically, in the field of computer science, probability is crucial for developing algorithms that make predictions, estimate probabilities, and handle uncertain or incomplete information. It is used in machine learning algorithms to train models and make data-driven decisions. Additionally, probability is essential for understanding the theoretical foundations of cr
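
Conceptually, `map_reduce` answers the question against each chunk separately and then asks the model to combine the partial answers. A simplified sketch of that control flow follows. The `fake_llm` stub, the prompt wording, and the chunk texts are hypothetical, and LangChain's real prompts differ.

```python
mr_calls = []


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a chat model: records each prompt it receives.
    mr_calls.append(prompt)
    return f"answer {len(mr_calls)}"


def map_reduce_answer(question, chunks, llm):
    # Map step: one independent model call per chunk.
    partials = [llm(f"Context: {chunk}\nQuestion: {question}") for chunk in chunks]
    # Reduce step: one final call that sees only the partial answers.
    combined = "\n".join(partials)
    return llm(f"Partial answers:\n{combined}\nQuestion: {question}")


mr_result = map_reduce_answer(
    "Is probability a topic from the lectures given?",
    ["chunk A", "chunk B", "chunk C"],
    fake_llm,
)
print(len(mr_calls), mr_result)
```

The final call never sees the original chunks, only the per-chunk answers, which is one way information can be lost or diluted with this chain type.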

In [None]:
qa_chain_rf = RetrievalQA.from_chain_type(
    llm=ChatCohere(),
    retriever=vectordb.as_retriever(),
    chain_type="refine",
)

result = qa_chain_rf.invoke({"query": question})

In [24]:
print(result["result"])

Thank you for providing additional context regarding the student's story and its relevance to the course. Let's use this opportunity to further refine the response to adequately address the question at hand while incorporating the new information:

"The student's success with applying MATLAB in their work is indeed noteworthy, but it also underscores the importance of genuinely understanding the foundational concepts in machine learning. By covering prerequisites such as probability and statistics in the discussion sections, you aim to ensure that your students have a robust base of knowledge to build upon. These fundamental concepts provide the necessary framework for comprehending advanced topics like machine learning and their practical applications.

The discussion sections, facilitated by TAs and recorded for convenience, offer a valuable opportunity for students to engage with the material, clarify doubts, and reinforce their learning. By dedicating the initial weeks of these sec
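
The `refine` chain type works sequentially: it drafts an answer from the first chunk, then asks the model to revise that draft once per remaining chunk. A simplified sketch, again with a hypothetical `fake_llm` stub and invented prompt wording rather than LangChain's actual prompts:

```python
rf_calls = []


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a chat model: records each prompt it receives.
    rf_calls.append(prompt)
    return f"answer {len(rf_calls)}"


def refine_answer(question, chunks, llm):
    # The first chunk produces the initial draft.
    answer = llm(f"Context: {chunks[0]}\nQuestion: {question}")
    # Every later chunk is shown together with the current draft.
    for chunk in chunks[1:]:
        answer = llm(
            f"Existing answer: {answer}\n"
            f"New context: {chunk}\n"
            f"Refine the existing answer to: {question}"
        )
    return answer


rf_result = refine_answer(
    "Is probability a topic from the lectures given?",
    ["chunk A", "chunk B", "chunk C"],
    fake_llm,
)
print(len(rf_calls), rf_result)
```

Since each later call is framed as "here is more context, refine the answer", this may explain why the output above opens by acknowledging additional context.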

#### RetrievalQA limitations

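A `RetrievalQA` chain is stateless: it does not preserve conversational history, so each query is answered without any knowledge of earlier questions or answers.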


In [21]:
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatCohere(temperature=0.05),
    retriever=vectordb.as_retriever(),
)

In [22]:
question = "Is probability a topic of the lectures given?"

result = qa_chain.invoke({"query": question})

In [23]:
print(result["result"])

Yes, the topic of probability is often covered in lectures given by university professors. Probability is a fundamental concept in mathematics that deals with the measurement and quantification of uncertainty. It is used to analyze and understand the likelihood or chance of events occurring.

In various fields of study, lectures may incorporate probability in different ways:

1. Mathematics Lectures: In mathematics courses, lectures on probability often delve into the theoretical foundations and concepts of probability theory. This includes covering topics such as probability spaces, events, random variables, probability distributions, and mathematical techniques for calculating probabilities.

2. Statistics Lectures: In statistics lectures, probability is a central topic. Lectures may focus on applying probability theory to practical scenarios, including experimental design, data analysis, and drawing conclusions from observed data. Topics such as statistical inference, confidence int

In [28]:
question = "why are those prerequisites needed?"

result = qa_chain.invoke({"query": question})

print(result["result"])

Sure, I'd be happy to help you understand why certain prerequisites are needed for a given task, assignment, or job position. Could you provide more details about the specific prerequisites you have in mind? This will allow me to provide a more tailored response. 

In general, prerequisites are requirements or conditions that must be met or fulfilled before proceeding with a task, project, course, or job role. They are often necessary to ensure that individuals have the necessary skills, knowledge, qualifications, or resources to undertake a particular endeavor effectively. 

Here are some common reasons why prerequisites are typically needed:

1. Skill and Knowledge Assessment: Prerequisites help assess an individual's proficiency in certain areas before embarking on a task or learning experience. They ensure that individuals have a baseline understanding of fundamental concepts or skills necessary for the task at hand. 

2. Foundation and Building Blocks: Prerequisites often represen

Note: the LLM response varies between runs. Some responses do mention probability, which may be gleaned from the retrieved documents. The point is simply that the chain has no access to past questions or answers, so it cannot tell what "those prerequisites" refers to. This is covered in the next lesson.
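
To make the limitation concrete, here is a toy illustration in plain Python. `fake_chain` is a hypothetical stand-in for `qa_chain` that only sees the text of the current query, and `ask_with_history` is a deliberately naive workaround that prepends earlier turns. It is not LangChain's memory mechanism, which the next lesson covers.

```python
def fake_chain(query: str) -> str:
    # Hypothetical stand-in for qa_chain: it sees only the text passed in.
    if "probability" in query.lower():
        return "about probability"
    return "unclear what is meant"


history = []


def ask_with_history(query: str) -> str:
    # Naive workaround: prepend earlier turns so a follow-up carries its referent.
    past = " ".join(f"Q: {q} A: {a}" for q, a in history)
    answer = fake_chain(f"{past} Q: {query}".strip())
    history.append((query, answer))
    return answer


# Stateless call: the follow-up question arrives with no referent.
stateless = fake_chain("why are those prerequisites needed?")

# The same follow-up, asked after the first question with history attached.
first = ask_with_history("Is probability a topic of the lectures given?")
follow_up = ask_with_history("why are those prerequisites needed?")
print(stateless, "|", first, "|", follow_up)
```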
