# Retrieval-augmented Generation (RAG) Using LLMs 

In this notebook, we create a basic prototype of a retrieval-augmented generation system. This prototype demonstrates how traditional information retrieval methods can be combined with LLM-based text summarization to search through large documents or collections of documents.

We use the Google Pixel 7 release notes as the input document. The document is available in the `tensor-house-data` repository.

## Environment Setup and Initialization

In [2]:
#
# Initialize LLM provider
# (google-cloud-aiplatform must be installed)
#
from google.cloud import aiplatform
aiplatform.init(
    project='<< specify your project name here >>',
    location='us-central1'
)

## Question Answering Over a Large Document

In this section, we demonstrate how standalone questions can be answered. The input document is split into chunks, which are then embedded and indexed in a vector store. To answer a user question, the most relevant chunks are retrieved and passed to the LLM together with the question.
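
To make the split, index, retrieve, and prompt steps concrete, the following is a minimal, dependency-free sketch of the same flow. It is an illustration under stated assumptions, not what LangChain, Chroma, or Vertex AI do internally: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the list `index` stands in for a vector store, and all function names, the sample text, and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size, separator="\n\n"):
    """Split on the separator and greedily merge pieces up to chunk_size characters."""
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = f"{current}{separator}{piece}" if current else piece
        if current and len(candidate) > chunk_size:
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def embed(text):
    """Toy embedding (assumption): a term-count vector instead of a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, index, k=2):
    """Return the k chunks whose vectors are most similar to the question vector."""
    query_vector = embed(question)
    ranked = sorted(index, key=lambda item: cosine_similarity(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Place the retrieved chunks into a single prompt together with the question."""
    context = "\n\n".join(chunks)
    return f"Use the following context to answer the question.\n\n{context}\n\nQuestion: {question}\nAnswer:"


# Made-up sample text, three paragraphs separated by blank lines
document = "\n\n".join([
    "The camera supports zoom and macro photos.",
    "The battery lasts all day with adaptive charging.",
    "The display is bright and smooth.",
])

chunks = split_into_chunks(document, chunk_size=60)
index = [(chunk, embed(chunk)) for chunk in chunks]   # stand-in for the vector store

question = "How long does the battery last?"
top_chunks = retrieve(question, index, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In the actual pipeline below, the embedding model, the vector store, and the prompt assembly are handled by `VertexAIEmbeddings`, `Chroma`, and the `RetrievalQA` chain, respectively.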

In [27]:
from langchain.llms import VertexAI

from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

from langchain.document_loaders import UnstructuredHTMLLoader
from langchain.embeddings.vertexai import VertexAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma

#
# Load the input document
#
loader = UnstructuredHTMLLoader("../../tensor-house-data/search/news-feeds/pixel-7-release.html")
documents = loader.load()

#
# Splitting
#
text_splitter = CharacterTextSplitter(chunk_size=4000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
print(f'The input document has been split into {len(texts)} chunks\n')

#
# Indexing and storing
#
embeddings = VertexAIEmbeddings()
docsearch = Chroma.from_documents(texts, embeddings)

#
# Querying
#
llm = VertexAI(temperature=0.7, verbose=True)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=docsearch.as_retriever(search_kwargs={"k": 2}),
    return_source_documents=True
)

question = "What are the three coolest Pixel 7 features? Please provide a list with short summaries."
response = qa_chain({"query": question})

print(response['result'])

The input document has been split into 2 chunks

The three coolest Pixel 7 features are:

* Next-generation Super Res Zoom up to 8x on Pixel 7 and up to 30x on Pixel 7 Pro.
* Macro Focus, which delivers Pixel HDR+ photo quality from as close as three centimeters away.
* Photo Unblur, a Google Photos feature only on Pixel 7 and Pixel 7 Pro. Photo Unblur uses machine learning to improve your blurry pictures – even old ones.


## Conversational Retrieval

In this section, we prototype a conversational retrieval system. It keeps the chat history in memory and combines it with the retrieved chunks to answer each question, so that follow-up questions can refer to earlier turns.
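
One common way to combine chat history with retrieval is to first rewrite a follow-up question into a standalone question, and then run ordinary retrieval and answering on the rewritten question. The sketch below illustrates that idea in plain Python. It is a simplified illustration, not the chain's actual implementation: the prompt wording, the function names, and the `fake_llm` and `fake_retrieve` stubs are all hypothetical and exist only to make the example runnable without an LLM provider.

```python
def format_history(chat_history):
    """Render (role, text) pairs as a plain-text transcript."""
    return "\n".join(f"{role}: {text}" for role, text in chat_history)


def build_condense_prompt(chat_history, follow_up):
    """Prompt asking the LLM to rewrite a follow-up as a standalone question."""
    return (
        "Given the conversation below, rephrase the follow-up question "
        "as a standalone question.\n\n"
        f"Chat history:\n{format_history(chat_history)}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


def answer_turn(chat_history, question, llm, retrieve):
    """Answer one chat turn and append the turn to chat_history."""
    # The first turn has no history, so the question is already standalone
    standalone = llm(build_condense_prompt(chat_history, question)) if chat_history else question
    context = "\n\n".join(retrieve(standalone))
    answer = llm(f"Context:\n{context}\n\nQuestion: {standalone}\nAnswer:")
    chat_history.append(("Human", question))
    chat_history.append(("Assistant", answer))
    return answer


# Stubs (assumptions) that record calls instead of contacting a model or a vector store
prompts = []

def fake_llm(prompt):
    prompts.append(prompt)
    return f"response {len(prompts)}"

def fake_retrieve(query):
    return [f"chunk for: {query}"]


history = []
answer_turn(history, "What are the coolest features?", fake_llm, fake_retrieve)
answer_turn(history, "Explain the first one.", fake_llm, fake_retrieve)
```

With this design, the second turn triggers two LLM calls: one to condense the follow-up using the history, and one to answer the condensed question from the retrieved chunks.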

In [26]:
#
# Initialize new chat
#
llm = VertexAI(verbose=True)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chat = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=docsearch.as_retriever(search_kwargs={"k": 2}),
    memory=memory,
    verbose=False,
    return_generated_question=False
)

#
# Ask questions with continuous context
#

def print_last_chat_turn(chat_history):
    print(chat_history[-2].content, '\n')
    print(chat_history[-1].content, '\n')    

result = chat({"question": "What are the three coolest Pixel features?"})
print_last_chat_turn(result['chat_history'])

result = chat({"question": "Can you explain the first feature in more detail?"})
print_last_chat_turn(result['chat_history'])

What are the three coolest Pixel features? 

The three coolest Pixel features are:

1. Super Res Zoom up to 8x on Pixel 7 and up to 30x on Pixel 7 Pro.
2. Macro Focus, which delivers Pixel HDR+ photo quality from as close as three centimeters away.
3. Photo Unblur, a Google Photos feature only on Pixel 7 and Pixel 7 Pro. 

Can you explain the first feature in more detail? 

Super Res Zoom is a camera feature that uses machine learning to improve the quality of zoomed-in images. It works by taking multiple photos at different focal lengths and then stitching them together to create a single, high-resolution image. This results in images that are sharper and have less noise than images taken with a traditional zoom lens. 

