## 1. Data Ingestion

In [1]:
from langchain_community.document_loaders import TextLoader 
loader = TextLoader('transformer.txt')
docs = loader.load()
docs

[Document(metadata={'source': 'transformer.txt'}, page_content='The Transformer model, introduced in the paper “Attention is All You Need” by Vaswani et al. in 2017, marked a significant departure from previous deep learning architectures used for natural language processing (NLP).\n\nPrior to Transformers, models like Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs) were the go-to methods for handling sequential data. \n\nHowever, these models processed data step by step, leading to slower training times and limitations in capturing long-range dependencies. \n\nThe Transformer, in contrast, uses a mechanism called self-attention that enables it to process entire sequences of data in parallel, making it much faster and more effective at capturing complex relationships between words, regardless of their distance in the sequence.\n\nAt the heart of the Transformer architecture lies the self-attention mechanism. Self-attention allows each token in a sequence to

## 2. Data Transformation

In [2]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
text_chunks = text_splitter.split_documents(docs)
text_chunks

[Document(metadata={'source': 'transformer.txt'}, page_content='The Transformer model, introduced in the paper “Attention is All You Need” by Vaswani et al. in 2017, marked a significant departure from previous deep learning architectures used for natural language processing (NLP).\n\nPrior to Transformers, models like Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs) were the go-to methods for handling sequential data. \n\nHowever, these models processed data step by step, leading to slower training times and limitations in capturing long-range dependencies. \n\nThe Transformer, in contrast, uses a mechanism called self-attention that enables it to process entire sequences of data in parallel, making it much faster and more effective at capturing complex relationships between words, regardless of their distance in the sequence.'),
 Document(metadata={'source': 'transformer.txt'}, page_content='At the heart of the Transformer architecture lies the self-attent
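To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal sketch of overlapping chunking in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally: that splitter prefers to break on separators such as blank lines, which is why the first chunk above ends at a paragraph boundary. The function name `chunk_text` and the sample text are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical sample: 2500 characters of repeating digits
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample, chunk_size=1000, chunk_overlap=200)
print([len(c) for c in chunks])
# The last 200 characters of one chunk are the first 200 of the next
print(chunks[0][-200:] == chunks[1][:200])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.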

## 3. Embeddings

In [3]:
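# Note: this community class is deprecated in recent releases; langchain_ollama.OllamaEmbeddings is its replacement.
from langchain_community.embeddings import OllamaEmbeddings
embeddings = OllamaEmbeddings(model="gemma2:2b")  # if model is omitted, this class defaults to "llama2"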
embeddings

OllamaEmbeddings(base_url='http://localhost:11434', model='gemma2:2b', embed_instruction='passage: ', query_instruction='query: ', mirostat=None, mirostat_eta=None, mirostat_tau=None, num_ctx=None, num_gpu=None, num_thread=None, repeat_last_n=None, repeat_penalty=None, temperature=None, stop=None, tfs_z=None, top_k=None, top_p=None, show_progress=False, headers=None, model_kwargs=None)

## 4. VectorStore

In [None]:
from langchain_community.vectorstores import FAISS
vector_store = FAISS.from_documents(documents=text_chunks, embedding=embeddings)
vector_store

# Expose the vector store as a retriever for the retrieval chain built later
retriever = vector_store.as_retriever()

In [5]:
query = "There are reasonable limits to concurrent request"
result = vector_store.similarity_search(query)
result[0].page_content

'At the heart of the Transformer architecture lies the self-attention mechanism. Self-attention allows each token in a sequence to interact with every other token in the sequence, computing a weighted representation of the input sequence. \n\nThis mechanism helps the model understand contextual relationships between words even if they are far apart. \n\nFor example, in the sentence “The cat sat on the mat,” the word “cat” is highly relevant to “sat,” but less so to “mat.” Self-attention allows the model to assign higher attention to "cat" and "sat" than "cat" and "mat." \n\nThis ability to weigh token importance dynamically is what gives Transformers their superior performance in NLP tasks'
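The query above has nothing to do with the document, yet a chunk still comes back. That is expected: a similarity search returns the `k` nearest chunks (4 by default in `similarity_search`) and does not apply a relevance threshold. The sketch below shows the idea with a brute-force nearest-neighbour search in plain Python. The three-dimensional vectors and chunk labels are made up for illustration, and the use of Euclidean (L2) distance is an assumption based on what I understand to be the default of the LangChain FAISS wrapper.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_search(query_vec, index, k=4):
    """Return the k texts whose vectors are closest to query_vec."""
    ranked = sorted(index, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Hypothetical index of (text, embedding) pairs
toy_index = [
    ("self-attention chunk", [0.9, 0.1, 0.0]),
    ("RNN and LSTM chunk", [0.1, 0.9, 0.0]),
    ("positional encoding chunk", [0.0, 0.2, 0.9]),
]
toy_query = [0.8, 0.2, 0.1]

top = similarity_search(toy_query, toy_index, k=2)
print(top)
```

A real index avoids the full sort, but the ranking principle is the same: the closest vectors win, whether or not they are actually relevant to the query.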

In [6]:
import os
from dotenv import load_dotenv

# load_dotenv() reads .env into os.environ, so the keys do not need to be reassigned by hand.
load_dotenv()

# Assigning os.environ[key] = None raises a TypeError, so check for missing keys explicitly.
required_keys = ["OPENAI_API_KEY", "GROQ_API_KEY", "LANGSMITH_API_KEY", "LANGSMITH_PROJECT", "LANGSMITH_TRACING"]
missing = [key for key in required_keys if not os.getenv(key)]
if missing:
    raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")

In [7]:
from langchain_groq import ChatGroq
llm = ChatGroq(model="Gemma2-9b-It")
llm

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x129ee5310>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x129ee6420>, model_name='Gemma2-9b-It', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [8]:
# Send a prompt to the LLM and inspect the response
result = llm.invoke("What are AI agents?")
result

AIMessage(content="##  AI Agents: Autonomous Software for Your Digital World \n\nAI agents are essentially **software programs that can act autonomously to achieve specific goals**. They are designed to perceive their environment, make decisions, and take actions based on learned patterns and given instructions. \n\nThink of them as **virtual assistants** with a more sophisticated brain. \n\nHere's a breakdown:\n\n**Key Features:**\n\n* **Autonomy:** They operate independently, without constant human intervention.\n* **Goal-Oriented:** They are programmed with specific objectives and work towards achieving them.\n* **Environment Interaction:** They can sense and react to changes in their surroundings (data, user input, etc.).\n* **Decision Making:** They use algorithms and learned knowledge to choose the best course of action.\n* **Learning and Adaptation:** Many AI agents can learn from their experiences and improve their performance over time.\n\n**Examples:**\n\n* **Chatbots:**  Pro

In [9]:
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain

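# Note: the question is hard-coded here and there is no {input} placeholder,
# so the chain answers "What is transformer?" regardless of the query passed to it.
prompt = ChatPromptTemplate.from_messages(
    [("system", "What is transformer?:\n\n{context}")]
)

# Build a chain that "stuffs" all supplied documents into {context}
chain = create_stuff_documents_chain(llm, prompt)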
chain

RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableLambda(format_docs)
}), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
| ChatPromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='What is transformer?:\n\n{context}'), additional_kwargs={})])
| ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x129ee5310>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x129ee6420>, model_name='Gemma2-9b-It', model_kwargs={}, groq_api_key=SecretStr('**********'))
| StrOutputParser(), kwargs={}, config={'run_name': 'stuff_documents_chain'}, config_factories=[])

In [10]:
# Same construction as in the previous cell, bound to the name used by the retrieval chain
document_chain = create_stuff_documents_chain(llm, prompt)
document_chain

RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableLambda(format_docs)
}), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
| ChatPromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='What is transformer?:\n\n{context}'), additional_kwargs={})])
| ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x129ee5310>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x129ee6420>, model_name='Gemma2-9b-It', model_kwargs={}, groq_api_key=SecretStr('**********'))
| StrOutputParser(), kwargs={}, config={'run_name': 'stuff_documents_chain'}, config_factories=[])
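The repr above shows a `format_docs` step feeding `{context}`. As a rough sketch of what "stuffing" means, the plain-Python version below joins the text of each document with a blank line and substitutes the result into the template. `Doc` and `stuff_documents` are hypothetical stand-ins rather than LangChain names, and the two sample documents are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def stuff_documents(docs: list[Doc], template: str, separator: str = "\n\n") -> str:
    """Join all document texts and place them into the {context} slot."""
    context = separator.join(doc.page_content for doc in docs)
    return template.format(context=context)


template = "What is transformer?:\n\n{context}"
toy_docs = [
    Doc("Self-attention relates every token to every other token."),
    Doc("Transformers process sequences in parallel."),
]

stuffed_prompt = stuff_documents(toy_docs, template)
print(stuffed_prompt)
```

Because every document is concatenated into one prompt, this approach only suits a small number of chunks that fit in the model's context window.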

In [11]:
vector_store.similarity_search("What is a transformer?")

[Document(id='8431e022-bf50-4540-8daa-fcaf949b3a91', metadata={'source': 'transformer.txt'}, page_content='At the heart of the Transformer architecture lies the self-attention mechanism. Self-attention allows each token in a sequence to interact with every other token in the sequence, computing a weighted representation of the input sequence. \n\nThis mechanism helps the model understand contextual relationships between words even if they are far apart. \n\nFor example, in the sentence “The cat sat on the mat,” the word “cat” is highly relevant to “sat,” but less so to “mat.” Self-attention allows the model to assign higher attention to "cat" and "sat" than "cat" and "mat." \n\nThis ability to weigh token importance dynamically is what gives Transformers their superior performance in NLP tasks'),
 Document(id='a379f66d-1d36-4f4d-9148-7086ac12ebc4', metadata={'source': 'transformer.txt'}, page_content='The Transformer model, introduced in the paper “Attention is All You Need” by Vaswani

In [12]:
from langchain.chains import create_retrieval_chain

retrieval_chain = create_retrieval_chain(retriever, document_chain)
retrieval_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['FAISS', 'OllamaEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x128907590>, search_kwargs={}), kwargs={}, config={'run_name': 'retrieve_documents'}, config_factories=[])
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
            | ChatPromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='What is transformer?:\n\n{context}'), additional_kwargs={})])
            | ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x129ee5

In [13]:
result = retrieval_chain.invoke({"input": "Why transformer"})
result['answer']

'That\'s a great explanation of transformers! Here\'s a concise summary:\n\n**What is a Transformer?**\n\nA transformer is a type of neural network architecture specifically designed for processing sequential data, like text. It revolutionized natural language processing (NLP) due to its unique ability to understand relationships between words in a sentence, regardless of their distance.\n\n**Key Features:**\n\n* **Self-Attention:** This is the heart of the transformer. It allows each word in a sequence to "attend" to all other words, weighing their importance in understanding the context. Imagine it like each word having a conversation with all the other words, figuring out who\'s most relevant.\n* **Encoder-Decoder Structure:**  Transformers typically have two main parts:\n    * **Encoder:**  Processes the input sequence, understanding its meaning and relationships.\n    * **Decoder:**  Generates the output sequence (e.g., translating a sentence, writing a summary) based on the encod
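The answer above reads like a response to "What is a Transformer?" rather than "Why transformer". A likely reason is that the prompt template contains only `{context}`, so the `input` value is used for retrieval but never reaches the LLM. The sketch below mimics the `input` / `context` / `answer` flow of a retrieval chain in plain Python to show the difference. The retriever, the echoing "LLM", and both templates are hypothetical stand-ins, not LangChain objects.

```python
def make_retrieval_chain(retrieve, generate, template):
    """Build a callable that mimics a retrieval chain's input/context/answer flow."""
    def invoke(inputs: dict) -> dict:
        docs = retrieve(inputs["input"])  # the question drives retrieval
        context = "\n\n".join(docs)
        # Placeholders missing from the template are simply ignored by str.format
        prompt_text = template.format(context=context, input=inputs["input"])
        return {**inputs, "context": docs, "answer": generate(prompt_text)}
    return invoke


def toy_retrieve(question: str) -> list[str]:
    """Stand-in retriever: always returns the same two chunks."""
    return ["Self-attention relates all tokens.", "Training runs in parallel."]


def echo_llm(prompt_text: str) -> str:
    """Stand-in LLM: returns the prompt so we can see what the model would receive."""
    return prompt_text


template_without_question = "What is transformer?:\n\n{context}"
template_with_question = (
    "Answer the question using only this context:\n\n{context}\n\nQuestion: {input}"
)

result_without = make_retrieval_chain(toy_retrieve, echo_llm, template_without_question)(
    {"input": "Why transformer"}
)
result_with = make_retrieval_chain(toy_retrieve, echo_llm, template_with_question)(
    {"input": "Why transformer"}
)

print("Why transformer" in result_without["answer"])
print("Why transformer" in result_with["answer"])
```

In the notebook itself, one option would be to add a `("human", "{input}")` message to the list passed to `ChatPromptTemplate.from_messages`, so that the question is sent to the model alongside the retrieved context.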