In [0]:
from getpass import getpass

OPENAI_KEY = getpass('Enter OpenAI API key: ')

In [0]:
import os

os.environ['OPENAI_API_KEY'] = OPENAI_KEY

**What is Retrieval-Augmented Generation (RAG)?**

RAG combines external knowledge retrieval with LLM text generation. Instead of relying solely on the model’s pre-trained knowledge, a RAG system retrieves relevant data from a knowledge base, augments the query with that data, and uses the LLM to generate a response grounded in it.


**Key Features of RAG**

*   Dynamic Knowledge Access: Fetch up-to-date data from external sources at query time.
*   Enhanced Contextual Responses: Ground the LLM output in the retrieved knowledge.
*   Customizable Knowledge Sources: Connect to databases, document stores, or APIs.


**Workflow of a RAG System** 
*   User Query →  
*   Retrieve Relevant Data from a Knowledge Base →  
*   Augment Query with Retrieved Data →  
*   Generate Final Response Using an LLM.
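
The four steps above can be sketched without any external service. This is a minimal illustration, not the LangChain implementation used later: word counts stand in for learned embeddings, `generate` is a placeholder for a real LLM call, and the names `KNOWLEDGE_BASE`, `bag_of_words`, `retrieve`, `augment`, and `generate` are all hypothetical.

```python
import math
import re
from collections import Counter

# Toy knowledge base (illustrative only).
KNOWLEDGE_BASE = [
    "FAISS is a library for fast similarity search over vectors.",
    "LangChain is a framework for building applications with large language models.",
    "Tokenization breaks text into words or subwords.",
]

def bag_of_words(text):
    """Lowercase word counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, k=2):
    """Step 2: rank documents by similarity to the query and keep the top k."""
    q = bag_of_words(query)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]

def augment(query, docs):
    """Step 3: place the retrieved text in the prompt next to the question."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"

def generate(prompt):
    """Step 4: placeholder for an LLM call; it only reports the prompt size."""
    return f"[LLM response to a {len(prompt)}-character prompt]"

query = "Which library does similarity search?"  # Step 1: user query
docs = retrieve(query)
prompt = augment(query, docs)
answer = generate(prompt)
print(prompt)
print(answer)
```

In the rest of this notebook, an OpenAI embedding model and a FAISS index take the place of `retrieve`, and an OpenAI completion model takes the place of `generate`.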


In [0]:
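# langchain and langchain-community provide the imports used below; tiktoken is needed by OpenAIEmbeddings
!pip install langchain langchain-community faiss-cpu openai tiktoken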



#### Import Libraries

In [0]:
from langchain.chains import RetrievalQA
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI


### Basic RAG Application

#### Setup the Vector Database

In [0]:
from langchain.schema import Document

In [0]:
documents = [
    {"text": "LangChain is a framework for building applications with large language models."},
    {"text": "Retrieval-Augmented Generation combines retrieval with text generation."},
    {"text": "FAISS is a vector database used for similarity searches."},
    {"text": "Transformers are deep learning models designed for sequence-to-sequence tasks."},
    {"text": "Tokenization is the process of breaking text into individual words or subwords."},
    {"text": "BERT is a pre-trained transformer model developed by Google for natural language understanding."},
    {"text": "GPT-3 is an autoregressive language model that uses deep learning to produce human-like text."},
    {"text": "Attention mechanisms allow models to focus on specific parts of the input sequence."},
    {"text": "Natural Language Processing enables computers to understand and process human languages."},
    {"text": "Word embeddings are vector representations of words capturing their meanings and relationships."}
]

document_objects = [Document(page_content=doc["text"]) for doc in documents]


#### Generate Embeddings

In [0]:

embeddings = OpenAIEmbeddings(api_key=OPENAI_KEY)

#### Load the Document into a FAISS Vector Store

In [0]:


vector_db = FAISS.from_documents(document_objects, embeddings)
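
Conceptually, a vector store keeps one embedding per document and returns the documents whose vectors lie closest to the query vector. The sketch below shows that idea with made-up three-dimensional vectors and an exhaustive Euclidean search. It is an illustration under those assumptions, not how FAISS is implemented: real embeddings have hundreds or thousands of dimensions, FAISS offers optimized index types, and `ToyVectorStore` with its `add` and `search` methods is hypothetical.

```python
import math

class ToyVectorStore:
    """Hypothetical in-memory store using exhaustive nearest-neighbour search."""

    def __init__(self):
        self._texts = []
        self._vectors = []

    def add(self, text, vector):
        """Store a text together with its (pre-computed) embedding."""
        self._texts.append(text)
        self._vectors.append(list(vector))

    def search(self, query_vector, k=1):
        """Return the k closest texts as (text, distance) pairs, nearest first."""
        scored = [
            (math.dist(query_vector, vector), text)
            for vector, text in zip(self._vectors, self._texts)
        ]
        scored.sort(key=lambda pair: pair[0])
        return [(text, distance) for distance, text in scored[:k]]

store = ToyVectorStore()
# The vectors are invented for the example; a real system would get them from an embedding model.
store.add("doc about vector search", [0.9, 0.1, 0.0])
store.add("doc about tokenization", [0.0, 0.8, 0.2])
store.add("doc about transformers", [0.1, 0.2, 0.9])

hits = store.search([1.0, 0.0, 0.0], k=2)
for text, distance in hits:
    print(f"{distance:.3f}  {text}")
```

The `k` argument plays the same role as the `k` entry in `search_kwargs` when a retriever is created from the vector store.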

#### Setup LLM and Retrieval Chain

In [0]:


llm = OpenAI(model='gpt-3.5-turbo-instruct', temperature=0.7)

# With no search_kwargs, the retriever returns the 4 most similar documents
retriever = vector_db.as_retriever()

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # "stuff" places all retrieved documents into a single prompt
    retriever=retriever,
    return_source_documents=True  # Also return the documents the answer was based on
)

#### Run the Query and Return the Answer with Its Source Documents

In [0]:
input_data = {"query": "Vector database used for similarity searches"}
result = qa_chain.invoke(input_data)  # invoke() replaces the deprecated direct call qa_chain(...)

#### Extract the Answer and the Source Documents That Support It

In [0]:
answer = result['result']
source_docs = result['source_documents']

In [0]:
print("Answer:", answer)
print("Source Documents:", source_docs)

Answer:  FAISS
Source Documents: [Document(metadata={}, page_content='FAISS is a vector database used for similarity searches.'), Document(metadata={}, page_content='Word embeddings are vector representations of words capturing their meanings and relationships.'), Document(metadata={}, page_content='LangChain is a framework for building applications with large language models.'), Document(metadata={}, page_content='BERT is a pre-trained transformer model developed by Google for natural language understanding.')]
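
The raw output above is hard to scan, and the answer carries a leading space. A small helper can tidy it up. This is a sketch: `format_result` is a hypothetical name, and `SimpleNamespace` objects stand in for LangChain `Document` instances so the example runs on its own; it only assumes each source exposes a `page_content` attribute, as the printed output shows.

```python
from types import SimpleNamespace

def format_result(result):
    """Return a readable report from a RetrievalQA-style result dict."""
    lines = [
        f"Query:  {result['query']}",
        f"Answer: {result['result'].strip()}",  # strip the leading space
        "Sources:",
    ]
    for i, doc in enumerate(result["source_documents"], start=1):
        lines.append(f"  {i}. {doc.page_content}")
    return "\n".join(lines)

# Stand-in data shaped like the chain output shown above.
sample = {
    "query": "Vector database used for similarity searches",
    "result": " FAISS",
    "source_documents": [
        SimpleNamespace(page_content="FAISS is a vector database used for similarity searches."),
        SimpleNamespace(page_content="Word embeddings are vector representations of words capturing their meanings and relationships."),
    ],
}

report = format_result(sample)
print(report)
```

Passing the real `result` dictionary from the chain instead of `sample` should work the same way, since it has the same three keys.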


![image.png](attachment:image.png)

### Enhance the RAG Application

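#### Control how many documents are retrieved and, optionally, summarize them for complex queries.

* Set the number of retrieved documents explicitly with `search_kwargs`; the default retriever above already returned four documents, and the cells below use three.
* Optionally, replace the default question-answering prompt with a summarization prompt to condense the retrieved information. The cells below keep the default prompt.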



In [0]:
# Limit the retriever to the top 3 documents (the default is 4)
retriever = vector_db.as_retriever(search_kwargs={"k": 3})

In [0]:
# Rebuild the QA chain so it uses the new retriever (the prompt is still the default QA prompt)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # "stuff" places all retrieved documents into a single prompt
    retriever=retriever,
    return_source_documents=True  # Also return the documents the answer was based on
)
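
The chain above reuses the default question-answering prompt, so it answers the query rather than summarizing the retrieved documents. One way to obtain a summary would be a custom prompt with `{context}` and `{question}` placeholders. In LangChain such a template is typically wrapped in a `PromptTemplate` and passed via `chain_type_kwargs={"prompt": ...}`, though that should be checked against the installed version. The sketch below uses plain string formatting to show what the model would receive; `SUMMARY_TEMPLATE` and `build_summary_prompt` are hypothetical names.

```python
# Hypothetical summarization template with the two placeholders a "stuff" prompt needs.
SUMMARY_TEMPLATE = (
    "Summarize the context below in two or three sentences, "
    "focusing on what is relevant to the question.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Summary:"
)

def build_summary_prompt(question, passages):
    """Join the retrieved passages and fill in the template."""
    context = "\n\n".join(passages)
    return SUMMARY_TEMPLATE.format(context=context, question=question)

# Passages copied from the documents defined earlier in this notebook.
passages = [
    "FAISS is a vector database used for similarity searches.",
    "Word embeddings are vector representations of words capturing their meanings and relationships.",
    "LangChain is a framework for building applications with large language models.",
]

prompt = build_summary_prompt("Vector database used for similarity searches", passages)
print(prompt)
```

Because every passage is placed in one prompt, this approach only suits cases where the retrieved text fits in the model's context window.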

In [0]:
input_data = {"query": "Vector database used for similarity searches"}
result = qa_chain.invoke(input_data)  # invoke() replaces the deprecated direct call qa_chain(...)

In [0]:
result

{'query': 'Vector database used for similarity searches',
 'result': ' FAISS',
 'source_documents': [Document(metadata={}, page_content='FAISS is a vector database used for similarity searches.'),
  Document(metadata={}, page_content='Word embeddings are vector representations of words capturing their meanings and relationships.'),
  Document(metadata={}, page_content='LangChain is a framework for building applications with large language models.')]}