# Session 2 - Demo 2.4 - Data-Augmented Question Answering

We want to build a personal learning assistant. The parts we need are:

- user question (input)
- role prompting so the model acts as a learning assistant
- relevant context retrieved from a knowledge source
    - knowledge base/source (we are using lecture transcripts for simplicity)
- vector database to store the data source and support semantic search
- personalized response with source/citations (summarized output)


<a href="https://colab.research.google.com/github/dair-ai/pe-for-llms/blob/main/notebooks/session-2/demo-2.3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [57]:
import openai
import os
import IPython
from langchain.llms import OpenAI
from dotenv import load_dotenv

In [58]:
load_dotenv()

# API configuration
openai.api_key = os.getenv("OPENAI_API_KEY")

# for LangChain
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")

First, we import the LangChain components for splitting, embedding, and indexing the data that will serve as the source for augmenting generation.

In [59]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores.elastic_vector_search import ElasticVectorSearch
from langchain.vectorstores import Chroma
from langchain.docstore.document import Document
from langchain.prompts import PromptTemplate

As our data source, we will use a transcript of Karpathy's recent lecture on GPT.

In [60]:
# split text into chunks
with open('../data/kar-gpt.txt') as f:
    text_data = f.read()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0, separator=" ")
texts = text_splitter.split_text(text_data)

# embeddings obtained from OpenAI (an open-source embedding model would also work;
# note that FAISS is a vector index, not an embedding model)
embeddings = OpenAIEmbeddings()
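
To make the chunking step concrete, here is a minimal standard-library sketch of greedy, separator-based splitting. It is not LangChain's implementation of `CharacterTextSplitter`, which may differ in details; the function name `split_into_chunks` is hypothetical, and the sketch assumes no overlap between chunks, matching `chunk_overlap=0` above.

```python
def split_into_chunks(text: str, chunk_size: int = 1000, separator: str = " ") -> list:
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters.

    Hypothetical helper for illustration only; a single piece longer than
    chunk_size is kept whole as its own oversized chunk.
    """
    chunks = []
    current = []      # pieces collected for the chunk being built
    current_len = 0   # length of the chunk if the pieces were joined now
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces produced by repeated separators
        # cost of appending this piece, plus one separator if the chunk is non-empty
        added = len(piece) + (len(separator) if current else 0)
        if current and current_len + added > chunk_size:
            # the piece does not fit: close the current chunk and start a new one
            chunks.append(separator.join(current))
            current, current_len = [], 0
            added = len(piece)
        current.append(piece)
        current_len += added
    if current:
        chunks.append(separator.join(current))
    return chunks


sample = "the quick brown fox jumps over the lazy dog"
demo_chunks = split_into_chunks(sample, chunk_size=15)
print(demo_chunks)  # ['the quick brown', 'fox jumps over', 'the lazy dog']
```

Splitting on a space keeps words intact, but chunks can still cut across sentences, so a retrieved chunk may begin or end mid-thought.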

In [31]:
docsearch = Chroma.from_texts(texts, embeddings, metadatas=[{"source": str(i)} for i in range(len(texts))])

Using embedded DuckDB without persistence: data will be transient


In [61]:
query = "What is the course about?"
docs = docsearch.similarity_search(query)
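
The search above ranks stored chunks by how close their embedding vectors are to the query's embedding. The sketch below shows the idea with the standard library only. It is an illustration under stated assumptions, not Chroma's implementation: `embed` is a hypothetical bag-of-words stand-in for a real embedding model, `toy_similarity_search` is a hypothetical name, and the default of `k=4` is an assumption chosen to match the four sources cited in the outputs of this notebook.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_similarity_search(query: str, chunks: list, k: int = 4) -> list:
    """Return the k (index, chunk) pairs most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(chunk)), index)
              for index, chunk in enumerate(chunks)]
    # sort by descending score; ties are broken by the lower chunk index
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(index, chunks[index]) for _, index in scored[:k]]


toy_chunks = [
    "this course is about the transformer neural network",
    "install the package with pip",
    "the course requires python and some calculus",
]
top = toy_similarity_search("what is the course about", toy_chunks, k=2)
print([index for index, _ in top])  # [0, 2]
```

Word counts only match exact tokens, whereas learned embeddings can also match paraphrases, which is what makes the search "semantic".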

In [63]:
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.llms import OpenAI

In [65]:
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")
query = "What is the course about?"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)

{'output_text': ' This course is about understanding and appreciating how chat GPT works, and how to develop a transformer neural network. It requires proficiency in Python and some basic understanding of calculus and statistics.\nSOURCES: 1, 7, 107, 108'}
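
With `chain_type="stuff"`, all retrieved chunks are placed into a single prompt, and the model is asked to end its answer with a `SOURCES:` line. The sketch below illustrates both halves with plain Python: building the stuffed context and splitting the returned text into an answer and a list of source ids. The helper names and the `Content:`/`Source:` layout are assumptions made for illustration, not a confirmed description of LangChain internals.

```python
def stuff_documents(chunks_with_sources: list) -> str:
    """Join (source, content) pairs into one context string, labelling each chunk."""
    return "\n\n".join(
        f"Content: {content}\nSource: {source}"
        for source, content in chunks_with_sources
    )


def split_answer_and_sources(output_text: str) -> tuple:
    """Split model output into (answer, [source ids]) at the 'SOURCES:' marker."""
    answer, _, sources_part = output_text.partition("SOURCES:")
    # tolerate surrounding whitespace and a trailing period after the last id
    sources = [s.strip().rstrip(".") for s in sources_part.split(",") if s.strip()]
    return answer.strip(), sources


context = stuff_documents([("1", "first chunk"), ("7", "second chunk")])

# the output text returned by the chain above
example_output = (
    " This course is about understanding and appreciating how chat GPT works, "
    "and how to develop a transformer neural network. It requires proficiency in "
    "Python and some basic understanding of calculus and statistics."
    "\nSOURCES: 1, 7, 107, 108"
)
answer, sources = split_answer_and_sources(example_output)
print(sources)  # ['1', '7', '107', '108']
```

Because the source ids here are the chunk indices stored in `metadatas`, the parsed list can be mapped back to `texts` to show the user the passages behind an answer.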

In [87]:
template = """
Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES"). 
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.

=========
{summaries}
=========

Given the summary above, help answer the following question from the user:

Question: {question}
"""


# create a prompt template
PROMPT = PromptTemplate(template=template, input_variables=["summaries", "question"])

# query 
chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff", prompt=PROMPT)
query = "What is the course about?"
chain({"input_documents": docs, "question": query}, return_only_outputs=True)

{'output_text': '\nAnswer: This course is about training language models using Python and understanding the transformer neural network. It covers the basics of language modeling, such as multi-level perceptrons, and then focuses on the transformer neural network. It also covers fine tuning stages that can be used to perform tasks such as sentiment detection. SOURCES: 108, 1, 107, 7'}

In [114]:
from langchain import PromptTemplate, LLMChain
from langchain.chains import SimpleSequentialChain

llm = OpenAI(temperature=0.9)

agent_response = PromptTemplate(
    input_variables=["response"],
    template="""You are a personal learning assistant. 
    Just take the answer from the previous response {response} and output it as the final response.

    Agent:
    """
)

agent_chain = LLMChain(llm=llm, prompt=agent_response)

overall_chain = SimpleSequentialChain(chains=[chain, agent_chain], verbose=True)

query = "What is the course about?"
overall_chain({"input": {"input_documents": docs, "question": query}}, return_only_outputs=True)



> Entering new SimpleSequentialChain chain...

Answer: This course is about understanding and appreciating how the transformer neural network works under the hood. It requires a proficiency in Python and some basic understanding of calculus and statistics. It also helps to have seen the Makemore series on the same YouTube channel, which introduces the language modeling framework. The course focuses on the transformer neural network, which is a language model that models the sequence of words or characters or tokens more generally. SOURCES: 1, 2, 7, 21.

The course focuses on the transformer neural network, which is a language model that models the sequence of words or characters or tokens more generally.

> Finished chain.


{'output': '\nThe course focuses on the transformer neural network, which is a language model that models the sequence of words or characters or tokens more generally.'}
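
The run above shows the pattern behind a simple sequential chain: each step receives the previous step's single output as its single input. A minimal sketch of that pattern follows, using plain functions as hypothetical stand-ins for the LLM-backed chains; `run_sequential`, `fake_qa_step`, and `fake_agent_step` are illustrative names, not LangChain APIs.

```python
from typing import Callable, List


def run_sequential(steps: List[Callable[[str], str]], first_input: str) -> str:
    """Feed the output of each step into the next and return the final output."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


def fake_qa_step(question: str) -> str:
    """Hypothetical stand-in for the question-answering chain."""
    return f"Answer to '{question}'. SOURCES: 1, 7"


def fake_agent_step(response: str) -> str:
    """Hypothetical stand-in for the agent chain: keeps only the answer text."""
    return response.split("SOURCES:")[0].strip()


result = run_sequential([fake_qa_step, fake_agent_step], "What is the course about?")
print(result)  # Answer to 'What is the course about?'.
```

Since only one string travels between steps, anything the second step leaves out, such as the `SOURCES` line, is absent from the final output, as in the run above.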

Exercise: Add another chain that connects with the previous `agent_chain` to create another agent that tries to be helpful and follows up with a question if it helps to keep the conversation going.