## Data Ingestion

In [1]:
from langchain_community.document_loaders import TextLoader

loader = TextLoader("speech.txt")

#### Loading the text document with the loader

In [2]:
text_documents = loader.load()
text_documents

[Document(page_content='Friends, Romans, countrymen, lend me your ears;\nI come to bury Caesar, not to praise him.\nThe evil that men do lives after them;\nThe good is oft interred with their bones;\nSo let it be with Caesar. The noble Brutus\nHath told you Caesar was ambitious:\nIf it were so, it was a grievous fault,\nAnd grievously hath Caesar answer’d it.\nHere, under leave of Brutus and the rest–\nFor Brutus is an honourable man;\nSo are they all, all honourable men–\nCome I to speak in Caesar’s funeral.\nHe was my friend, faithful and just to me:\nBut Brutus says he was ambitious;\nAnd Brutus is an honourable man.\nHe hath brought many captives home to Rome\nWhose ransoms did the general coffers fill:\nDid this in Caesar seem ambitious?\nWhen that the poor have cried, Caesar hath wept:\nAmbition should be made of sterner stuff:\nYet Brutus says he was ambitious;\nAnd Brutus is an honourable man.\nYou all did see that on the Lupercal\nI thrice presented him a kingly crown,

#### Loading environment

In [3]:
from dotenv import load_dotenv
import os

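# load_dotenv() reads a local .env file and copies its variables into os.environ
load_dotenv()
# os.environ only accepts strings, so assign the key only when it is actually set
gemini_api_key = os.getenv("GEMINI_API_KEY")
if gemini_api_key is not None:
    os.environ["GEMINI_API_KEY"] = gemini_api_key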

### Loading data from web

In [4]:
from langchain_community.document_loaders import WebBaseLoader
import bs4

loader = WebBaseLoader(web_path=("https://www.ibm.com/think/topics/artificial-intelligence"),
                      bs_kwargs=dict(parse_only=bs4.SoupStrainer(
                        class_=("article-heading-title", "rich-text text")
                      )),)
loader

<langchain_community.document_loaders.web_base.WebBaseLoader at 0x12e2f1d1550>

In [5]:
text_documents = loader.load()
text_documents

[Document(page_content='\n                        \n                        \n\n\n\n  \n    What is artificial intelligence (AI)?\n\n\n\n\n\n\n    \n\n\n                    \nArtificial intelligence (AI) is technology that enables computers and machines to simulate human learning, comprehension, problem solving, decision making, creativity and autonomy.\n\nApplications and devices equipped with AI can see and identify objects. They can understand and respond to human language. They can learn from new information and experience. They can make detailed recommendations to users and experts.\xa0They can act independently, replacing the need for human intelligence or intervention (a classic example being a self-driving car).\nBut in 2024, most AI researchers, practitioners and most AI-related headlines are focused on breakthroughs in generative AI\xa0(gen AI), a technology that can create original text, images, video and other content. To fully understand generative AI, it’s important to fi
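The scraped `page_content` above contains long runs of spaces, blank lines and non-breaking spaces (`\xa0`). One optional clean-up step before splitting is to normalize that whitespace. The helper below is a minimal sketch using only the standard library; `normalize_whitespace` and the `raw` sample string are hypothetical names introduced here, not part of LangChain. It also collapses paragraph breaks into single newlines, which may not suit every use case.

```python
import re


def normalize_whitespace(text):
    """Collapse the whitespace noise typical of scraped HTML text."""
    # Treat non-breaking spaces like ordinary spaces
    text = text.replace("\xa0", " ")
    # Collapse runs of spaces and tabs into a single space
    text = re.sub(r"[ \t]+", " ", text)
    # Collapse any run of whitespace that contains a newline into one newline
    text = re.sub(r"\s*\n\s*", "\n", text)
    return text.strip()


# Shortened, hypothetical sample in the style of the output above
raw = (
    "\n                        \n\n  \n    What is artificial intelligence (AI)?"
    "\n\n\n    \nArtificial intelligence (AI) is technology."
    "\xa0They can act independently."
)
cleaned = normalize_whitespace(raw)
print(cleaned)
```

With real data, the same function could be applied to each `doc.page_content` in `text_documents` before the splitting step.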

### Loading a PDF document

In [6]:
from langchain_community.document_loaders import PyPDFLoader

loaders = PyPDFLoader("attention.pdf")
docs = loaders.load()

In [7]:
docs

[Document(page_content='Attention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.com\nNoam Shazeer∗\nGoogle Brain\nnoam@google.com\nNiki Parmar∗\nGoogle Research\nnikip@google.com\nJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.com\nAidan N. Gomez∗†\nUniversity of Toronto\naidan@cs.toronto.edu\nŁukasz Kaiser ∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Transformer,\nbased solely on attention mechanisms, dispensing with recurrence and convolutions\nentirely. Experiments on two machine translation tasks show these models to\nbe superior in quality while being more parallelizable an

### Splitting documents with a text splitter

In [8]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
text_splitter

<langchain_text_splitters.character.RecursiveCharacterTextSplitter at 0x12e2dee4dd0>

In [9]:
splitted_docs = text_splitter.split_documents(docs)
splitted_docs

[Document(page_content='Attention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.com\nNoam Shazeer∗\nGoogle Brain\nnoam@google.com\nNiki Parmar∗\nGoogle Research\nnikip@google.com\nJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.com\nAidan N. Gomez∗†\nUniversity of Toronto\naidan@cs.toronto.edu\nŁukasz Kaiser ∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Transformer,\nbased solely on attention mechanisms, dispensing with recurrence and convolutions\nentirely. Experiments on two machine translation tasks show these models to\nbe superior in quality while being more parallelizable an
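To build intuition for `chunk_size=1000` and `chunk_overlap=200`, the sketch below slides a fixed window over a string so that each chunk repeats the last 200 characters of the previous one. This is a deliberately simplified illustration: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, so its real chunks will usually differ. The function name `sliding_window_chunks` and the `sample` string are hypothetical.

```python
def sliding_window_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into fixed-size chunks where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    # Each new chunk starts this many characters after the previous one
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


# Hypothetical 2500-character sample text
sample = "".join(str(i % 10) for i in range(2500))
chunks = sliding_window_chunks(sample)
print([len(c) for c in chunks])
```

The overlap means that a sentence cut at a chunk boundary still appears whole in at least one chunk as long as it is shorter than the overlap, at the cost of storing some text twice.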

### Vector Embeddings and vector store

In [None]:
from langchain_community.embeddings import SentenceTransformerEmbeddings
from langchain_community.vectorstores import FAISS

emb = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vct_store = FAISS.from_documents(documents=splitted_docs[:20],
                                embedding=emb)

vct_store

<langchain_community.vectorstores.faiss.FAISS at 0x12e6e2e7a90>

In [11]:
query = "What is applications of attentions  in model from Attention all you need"
result = vct_store.similarity_search(query=query, 
                                    k=3,)
result[0].page_content

'we found it beneﬁcial to linearly project the queries, keys and values htimes with different, learned\nlinear projections to dk, dk and dv dimensions, respectively. On each of these projected versions of\nqueries, keys and values we then perform the attention function in parallel, yielding dv-dimensional\noutput values. These are concatenated and once again projected, resulting in the ﬁnal values, as\ndepicted in Figure 2.\nMulti-head attention allows the model to jointly attend to information from different representation\nsubspaces at different positions. With a single attention head, averaging inhibits this.\n4To illustrate why the dot products get large, assume that the components of q and k are independent random\nvariables with mean 0 and variance 1. Then their dot product, q · k = ∑dk\ni=1 qiki, has mean 0 and variance dk.\n4'
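Conceptually, `similarity_search` embeds the query and returns the `k` stored chunks whose vectors lie closest to it. The sketch below shows that idea as a brute-force search, assuming Euclidean distance as the metric. The two-dimensional `toy_vectors` are hypothetical stand-ins for real embeddings, and `top_k` is an illustrative helper rather than FAISS's actual implementation, which uses optimized index structures.

```python
import math


def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k vectors closest to query_vec (Euclidean distance)."""
    scored = []
    for idx, vec in enumerate(doc_vecs):
        scored.append((math.dist(query_vec, vec), idx))
    # Smallest distance first; ties are broken by the lower index
    scored.sort()
    return [idx for _, idx in scored[:k]]


# Hypothetical 2-d "embeddings" for four chunks
toy_vectors = [[0.0, 1.0], [1.0, 0.0], [0.9, 0.1], [0.5, 0.5]]
nearest = top_k([1.0, 0.0], toy_vectors, k=3)
print(nearest)
```

In the notebook, `result[0]` corresponds to the first index returned here: the chunk judged most similar to the query.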

In [12]:
from langchain_community.llms import Ollama

llm = Ollama(model="gemma:2b")
llm

Ollama(model='gemma:2b')

### Building the prompt and creating the chain

In [13]:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    """Answer the following question based on the provided context.
    Think step by step before providing a detailed answer.
    <context>
    {context}
    </context>
    Question: {input}
    """
)

In [14]:
from langchain.chains.combine_documents import create_stuff_documents_chain

document_chain = create_stuff_documents_chain(llm, prompt)
document_chain

RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableLambda(format_docs)
}), config={'run_name': 'format_inputs'})
| ChatPromptTemplate(input_variables=['context', 'input'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'input'], template='Answer the following question based on the provided context.\n    Think step by step before providing a detailed answer.\n    <context>\n    {context}\n    </context>\n    Question: {input}\n    '))])
| Ollama(model='gemma:2b')
| StrOutputParser(), config={'run_name': 'stuff_documents_chain'})
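The "stuff" strategy in `document_chain` amounts to placing the text of every retrieved chunk into the `{context}` slot of the prompt before calling the model. The plain-Python sketch below mimics that formatting step only. It assumes chunks are joined by a blank line, and `TEMPLATE`, `stuff_prompt` and the example chunks are hypothetical names and data introduced for illustration.

```python
# Same wording as the notebook's prompt, without the leading indentation
TEMPLATE = """Answer the following question based on the provided context.
Think step by step before providing a detailed answer.
<context>
{context}
</context>
Question: {input}
"""


def stuff_prompt(chunks, question, separator="\n\n"):
    """Join all chunk texts and substitute them into the prompt template."""
    context = separator.join(chunks)
    return TEMPLATE.format(context=context, input=question)


example_prompt = stuff_prompt(
    ["Chunk about multi-head attention.", "Chunk about scaled dot-product attention."],
    "What is Self-Attention Mechanism",
)
print(example_prompt)
```

Because every chunk is pasted in verbatim, this approach only works while the combined text fits in the model's context window.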

In [15]:
retriever = vct_store.as_retriever()
retriever


VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000012E6E2E7A90>)

In [16]:
from langchain.chains import create_retrieval_chain
retrieval_chain = create_retrieval_chain(retriever, document_chain)
retrieval_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000012E6E2E7A90>), config={'run_name': 'retrieve_documents'})
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), config={'run_name': 'format_inputs'})
            | ChatPromptTemplate(input_variables=['context', 'input'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'input'], template='Answer the following question based on the provided context.\n    Think step by step before providing a detailed answer.\n    <context>\n    {context}\n    </context>\n    Question: {input}\n    '))])
            | Ollama(model='gemma:2b')
            | StrOutputParser(), config={'ru

In [31]:
response = retrieval_chain.invoke({"input": "What is Self-Attention Mechanism"})

In [32]:
response["answer"]

'Sure, here is a detailed answer to the question:\n\nSelf-attention is an attention mechanism that relates different positions of a single sequence in order to compute a representation of the sequence. It is used successfully in a variety of tasks including reading comprehension, abstractive summarization, textual entailment and learning task-independent sentence representations.'
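The `response` object is a dictionary, which is why the answer is read with `response["answer"]`. The sketch below imitates the overall flow of the retrieval chain in plain Python: look up chunks for the question, pass them to an answering step, and return the input, the retrieved context and the answer together. `make_retrieval_chain`, `fake_retrieve` and `fake_answer` are hypothetical stand-ins for the retriever and the LLM-backed document chain, not LangChain APIs.

```python
def make_retrieval_chain(retrieve, answer_from_docs):
    """Compose a retrieval step and an answering step into a single callable."""
    def invoke(inputs):
        # Step 1: fetch the chunks relevant to the question
        docs = retrieve(inputs["input"])
        # Step 2: let the answering step see both the question and the chunks
        answer = answer_from_docs({"input": inputs["input"], "context": docs})
        # Step 3: return everything so the caller can inspect the sources too
        return {**inputs, "context": docs, "answer": answer}
    return invoke


def fake_retrieve(question):
    # Stand-in for the vector store retriever
    return ["Self-attention relates different positions of a single sequence."]


def fake_answer(payload):
    # Stand-in for the prompt + LLM + output parser
    return f"Based on {len(payload['context'])} chunk(s): {payload['context'][0]}"


chain = make_retrieval_chain(fake_retrieve, fake_answer)
result = chain({"input": "What is Self-Attention Mechanism"})
print(result["answer"])
```

Keeping the retrieved chunks in the result makes it easy to check which passages the answer was based on.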