### Retriever and Chain with LangChain

In [20]:
import os
from dotenv import load_dotenv

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_ollama import OllamaLLM
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain

In [8]:
load_dotenv()

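# load_dotenv() has already copied the key from .env into os.environ.
# Fail early with a clear message instead of a TypeError if it is missing.
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")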

In [9]:
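# Load the PDF; PyPDFLoader returns one Document per page
loader = PyPDFLoader("lec_6.pdf")
docs = loader.load()

# Split pages into chunks of up to 1000 characters, with up to 200 characters shared between neighbours
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
split_docs = text_splitter.split_documents(docs)

# Embed and index only the first 10 chunks to keep the demo cheap; drop the slice to index the whole PDF
db = FAISS.from_documents(split_docs[:10], OpenAIEmbeddings())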
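
To make the `chunk_size` and `chunk_overlap` arguments above concrete, here is a minimal sketch of overlapping character windows. The helper `split_with_overlap` is hypothetical and is not how `RecursiveCharacterTextSplitter` works internally (the real splitter prefers to break on separators such as paragraph and line breaks); it only illustrates how consecutive chunks can share text.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Hypothetical helper: fixed-width windows where neighbours share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


text = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(text, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

The overlap means a sentence that straddles a chunk boundary is likely to appear whole in at least one chunk, at the cost of embedding some text twice.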

In [10]:
query = "An attention function can be what?"
results = db.similarity_search(query, k=2)
results

[Document(id='3f470646-df81-45f9-a5a0-2c18d4abd515', metadata={'producer': 'Microsoft: Print To PDF', 'creator': 'PyPDF', 'creationdate': '2024-03-19T21:25:11-05:00', 'author': 'Shaurya Tripathi', 'moddate': '2024-03-19T21:25:11-05:00', 'title': 'Microsoft PowerPoint - 3_Self_Attention_Transformer', 'source': 'lec_6.pdf', 'total_pages': 56, 'page': 8, 'page_label': '9'}, page_content='3/19/2024\n9\nhttps://web.stanford.edu/class/archive/cs/cs224n/cs224n.1214/slides/cs224n-2021-lecture09-transformers.pdfAttention is a general Deep Learning techniqueGiven a set of vector values, and a vector query, attention is a technique to compute a weighted sum of the values,  dependent on the query .Intuition:•The weighted sum is a selective summary of the  information contained in the values, where the query  determines which values to focus on.•Attention is a way to obtain a fixed-size representation  of an arbitrary set of representations (the values),  dependent on some other representation (the
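
Conceptually, `similarity_search(query, k=2)` embeds the query and returns the `k` stored chunks whose vectors lie closest to it. The sketch below shows that idea with Euclidean distance on made-up 2-dimensional vectors; the helper name `top_k_by_l2`, the toy vectors, and the choice of distance metric are assumptions for illustration, and real embeddings have far more dimensions.

```python
import math


def top_k_by_l2(query_vec, doc_vecs, k=2):
    """Hypothetical helper: names of the k vectors nearest to query_vec by Euclidean distance."""
    scored = sorted((math.dist(query_vec, vec), name) for name, vec in doc_vecs.items())
    return [name for _, name in scored[:k]]


# Made-up "embeddings" for three chunks (illustrative values only)
toy_vectors = {
    "attention": [0.9, 0.1],
    "rnn": [0.6, 0.6],
    "cnn": [0.0, 1.0],
}
query_vec = [1.0, 0.0]

nearest = top_k_by_l2(query_vec, toy_vectors, k=2)
print(nearest)
```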

In [15]:
# Requires a local Ollama server with the llama2 model already pulled
llm = OllamaLLM(model="llama2")
llm

OllamaLLM(model='llama2')

In [17]:
prompt = ChatPromptTemplate.from_template(
    """
    Answer the following question based on the context I give you. Think step by step.
    <context> {context} </context>
    Question: {input}
    """
)

In [19]:
# Document chain: fills {context} in the prompt with the supplied documents and sends it to the LLM
document_chain = create_stuff_documents_chain(llm, prompt)

retriever = db.as_retriever()
retriever

VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x00000188CEE39BD0>, search_kwargs={})
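
The "stuff" strategy used by the document chain can be pictured as plain string formatting: the text of every supplied document is concatenated and dropped into the `{context}` slot of the prompt. The sketch below is a simplified stand-in, not LangChain's implementation; the helper `stuff_prompt`, the sample passages, and the blank-line separator are assumptions for illustration.

```python
TEMPLATE = (
    "Answer the following question based on the context I give you. "
    "Think step by step.\n"
    "<context> {context} </context>\n"
    "Question: {input}"
)


def stuff_prompt(page_contents, question):
    """Hypothetical helper: join all passages and place them in the prompt's context slot."""
    context = "\n\n".join(page_contents)  # assumed separator between passages
    return TEMPLATE.format(context=context, input=question)


passages = [
    "Attention computes a weighted sum of the values.",
    "The query determines which values to focus on.",
]
filled = stuff_prompt(passages, "What is attention?")
print(filled)
```

Because everything is placed into a single prompt, this approach only works while the combined passages fit in the model's context window, which is one reason to keep `k` and `chunk_size` modest.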

In [22]:
# Retrieval chain: fetch the most relevant chunks for the input, then pass them to the document chain

retrieval_chain = create_retrieval_chain(retriever, document_chain)

ans = retrieval_chain.invoke({
    "input": "What is attention in RNN"
})['answer']

print(ans)

The question you've provided is related to the context given earlier, which discusses attention in the context of Recurrent Neural Networks (RNNs). Attention in RNNs refers to a technique used to selectively focus on certain parts of the input sequence when processing it. This is done by computing a weighted sum of the input elements, where the weights are determined by the query vector. The attention mechanism allows the model to pay more attention to the most relevant parts of the input sequence, rather than treating all elements equally.

In RNNs, attention is typically used in the encoder-decoder architecture, where the encoder processes the input sequence and produces a context vector, and the decoder processes the context vector and generates the output sequence. The attention mechanism is applied to the input sequence to determine which parts are most relevant for the decoder to generate the output sequence.

The issues with recurrent attention, as mentioned in the context, incl