In [8]:
# Web-based loader
import os

# Set USER_AGENT before importing WebBaseLoader: langchain_community checks
# the variable at import time and logs a warning if it is missing.
os.environ["USER_AGENT"] = "Mozilla/5.0 (compatible; MyLangchainBot/1.0; +https://example.com/bot)"

import bs4
from langchain_community.document_loaders import WebBaseLoader

# Load the HTML page, keeping only the post title, header, and content
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-title", "post-content", "post-header")
        )
    ),
)

text_documents = loader.load()


In [9]:
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001",
    google_api_key=os.getenv("GOOGLE_API_KEY")
)

In [10]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split into chunks of up to 1000 characters, with up to 200 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(text_documents)
documents[:5]

[Document(metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, page_content='LLM Powered Autonomous Agents\n    \nDate: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng\n\n\nBuilding agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview#\nIn a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:\n\nPlanning\n\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\nReflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refi
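
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a fixed-width sliding-window split. It is a simplification, not what `RecursiveCharacterTextSplitter` actually does (the real splitter prefers to break on separators such as paragraphs and sentences). The function name `split_with_overlap` and the sample string are made up for illustration.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Fixed-width sliding-window split (a simplification, not LangChain's algorithm)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window has reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
# ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

Each chunk repeats the last 4 characters of the previous one, which is why a sentence that straddles a chunk boundary can still be retrieved intact from one of the two neighbours.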

In [11]:
## Vector embeddings and vector store
# The embeddings come from the GoogleGenerativeAIEmbeddings instance created above,
# so no other embedding class needs to be imported here.
from langchain_community.vectorstores import Chroma


In [12]:
# Embed every chunk and index it in a Chroma collection
db = Chroma.from_documents(documents, embeddings)

In [31]:
query = "What is short-term memory?"
retrieved_results = db.similarity_search(query)
print(retrieved_results[0].page_content)

Short-Term Memory (STM) or Working Memory: It stores information that we are currently aware of and needed to carry out complex cognitive tasks such as learning and reasoning. Short-term memory is believed to have the capacity of about 7 items (Miller 1956) and lasts for 20-30 seconds.


Long-Term Memory (LTM): Long-term memory can store information for a remarkably long time, ranging from a few days to decades, with an essentially unlimited storage capacity. There are two subtypes of LTM:

Explicit / declarative memory: This is memory of facts and events, and refers to those memories that can be consciously recalled, including episodic memory (events and experiences) and semantic memory (facts and concepts).
Implicit / procedural memory: This type of memory is unconscious and involves skills and routines that are performed automatically, like riding a bike or typing on a keyboard.





Categorization of human memory.

We can roughly consider the following mappings:
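
As a rough illustration of what a similarity search does, the sketch below ranks a few hand-written toy vectors against a query vector. It assumes cosine similarity as the metric; the metric a real store uses depends on its configuration. The names `cosine_similarity`, `top_k`, `toy_index`, and the vectors themselves are invented for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 1) -> list[str]:
    """Return the k texts whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy 3-dimensional "embeddings" (hypothetical values, not real model output)
toy_index = [
    ("short-term memory", [0.9, 0.1, 0.0]),
    ("long-term memory", [0.1, 0.9, 0.0]),
    ("tool use", [0.0, 0.1, 0.9]),
]
toy_query = [0.8, 0.2, 0.0]
best = top_k(toy_query, toy_index, k=2)
print(best)
# ['short-term memory', 'long-term memory']
```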


In [2]:
## PDF reader
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader('temp_Privacy_Policy.pdf')
docs = loader.load()

In [3]:
docs

[Document(metadata={'producer': 'iText 2.1.7 by 1T3XT', 'creator': 'PyPDF', 'creationdate': '2025-04-21T19:59:08+05:30', 'moddate': '2025-04-21T19:59:08+05:30', 'source': 'temp_Privacy_Policy.pdf', 'total_pages': 2, 'page': 0, 'page_label': '1'}, page_content='FROM Harsha,\nTO Cognizant Technology Solutions India Private Limited\nI Read and Understood the below Candidate Privacy policy.\nBefore submitting your details, please read our  Notice which provides important Candidate Privacy\ninformation about the collection and use of your personal information for recruitment purposes, \nincluding information on your individual rights.\nIf your job application is unsuccessful, please let us know if you would like us to retain your details \nso that we can keep in touch with you about other future job opportunities at Cognizant and send you \nother useful recruitment related information. If you chose to sign up to receive this information from \nCognizant, we will use your personal informatio

In [14]:
## Design ChatPrompt Template
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_template("""
Answer the following question based only on the provided context. 
Think step by step before providing a detailed answer. 
I will tip you $1000 if the user finds the answer helpful. 
<context>
{context}
</context>
Question: {input}""")

In [21]:
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    temperature=0.7,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)

In [15]:
## Chain introduction
## Create a "stuff" documents chain: the documents are inserted into {context}

from langchain.chains.combine_documents import create_stuff_documents_chain

document_chain = create_stuff_documents_chain(llm, prompt)
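
The "stuff" strategy can be sketched in plain Python: join the text of every document and drop the result into the `{context}` slot of the prompt. The `Doc` class, the shortened `TEMPLATE`, the `stuff_prompt` helper, and the `"\n\n"` separator are all assumptions made for this illustration, not the library's internals.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str


# Shortened version of the prompt used in this notebook
TEMPLATE = (
    "Answer the following question based only on the provided context.\n"
    "<context>\n{context}\n</context>\n"
    "Question: {input}"
)


def stuff_prompt(docs: list[Doc], question: str, separator: str = "\n\n") -> str:
    """Concatenate all documents and fill the prompt template with them."""
    context = separator.join(d.page_content for d in docs)
    return TEMPLATE.format(context=context, input=question)


filled = stuff_prompt([Doc("Chunk A."), Doc("Chunk B.")], "What is in the chunks?")
print(filled)
```

Because every document lands in a single prompt, this approach only works while the combined text fits in the model's context window.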

In [16]:
"""
Retrievers: A retriever is an interface that returns documents given
an unstructured query. It is more general than a vector store.
A retriever does not need to be able to store documents, only to
return (or retrieve) them. Vector stores can be used as the backbone
of a retriever, but there are other types of retrievers as well.
https://python.langchain.com/docs/modules/data_connection/retrievers/
"""

retriever=db.as_retriever()
retriever

VectorStoreRetriever(tags=['Chroma', 'GoogleGenerativeAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x000001AD382CCF90>, search_kwargs={})
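
To show that a retriever is just "query in, documents out", here is a small sketch of a retriever that uses word overlap instead of embeddings. The `Retriever` protocol, its `retrieve` method, and `KeywordRetriever` are hypothetical names for this example; LangChain's own retrievers follow a different base class and method names.

```python
from typing import Protocol


class Retriever(Protocol):
    """Anything that maps a query string to a list of texts."""
    def retrieve(self, query: str) -> list[str]: ...


class KeywordRetriever:
    """Hypothetical retriever: ranks texts by shared words, no embeddings involved."""

    def __init__(self, texts: list[str]) -> None:
        self.texts = texts

    def retrieve(self, query: str) -> list[str]:
        query_words = set(query.lower().split())
        # Score each text by how many query words it contains
        scored = [(len(query_words & set(t.lower().split())), t) for t in self.texts]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [text for score, text in ranked if score > 0]


keyword_retriever: Retriever = KeywordRetriever([
    "Short-term memory lasts for 20-30 seconds.",
    "Long-term memory can last for decades.",
    "Agents can call external tools.",
])
hits = keyword_retriever.retrieve("how long does memory last")
print(hits)
# ['Long-term memory can last for decades.', 'Short-term memory lasts for 20-30 seconds.']
```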

In [17]:
"""
Retrieval chain: This chain takes in a user inquiry, which is then
passed to the retriever to fetch relevant documents. Those documents
(and the original inputs) are then passed to an LLM to generate a response.
https://python.langchain.com/docs/modules/chains/
"""
from langchain.chains import create_retrieval_chain

retrieval_chain = create_retrieval_chain(retriever, document_chain)
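
The retrieve-then-answer flow described in the docstring can be mimicked with two stub functions, which makes the data flow visible without calling a model. Everything here (`make_retrieval_chain`, `fake_retrieve`, `fake_combine`) is a hypothetical stand-in, and the stub "LLM" only formats a string.

```python
from typing import Callable


def make_retrieval_chain(
    retrieve: Callable[[str], list[str]],
    combine: Callable[[dict], str],
) -> Callable[[dict], dict]:
    """Return a callable that fetches context, then asks `combine` for an answer."""
    def run(inputs: dict) -> dict:
        context = retrieve(inputs["input"])               # step 1: fetch documents
        answer = combine({**inputs, "context": context})  # step 2: generate a response
        return {**inputs, "context": context, "answer": answer}
    return run


def fake_retrieve(query: str) -> list[str]:
    # Stand-in for a real retriever
    return ["Short-term memory lasts for 20-30 seconds."]


def fake_combine(inputs: dict) -> str:
    # Stand-in for the prompt + LLM call
    return f"Answer to {inputs['input']!r} using {len(inputs['context'])} document(s)."


toy_chain = make_retrieval_chain(fake_retrieve, fake_combine)
output = toy_chain({"input": "What is short-term memory?"})
print(output["answer"])
# Answer to 'What is short-term memory?' using 1 document(s).
```

With the real chain, the equivalent call should be `retrieval_chain.invoke({"input": "..."})`, with the reply read from the `"answer"` key of the returned dictionary.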

In [25]:

from langchain.chains import RetrievalQAWithSourcesChain

qa_chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True,
)

result = qa_chain.invoke({"question": "What is the purpose of the Privacy Policy?"})


In [32]:
result = qa_chain.invoke({"question": "What are the different types of prompts?"})

In [33]:
result["answer"], result["sources"]

('Based on the provided text, here are different types of prompts:\n\n*   Chain of Thought (CoT): Instructs the model to "think step by step" to decompose tasks into smaller steps. (https://lilianweng.github.io/posts/2023-06-23-agent/)\n*   Tree of Thoughts: Extends CoT by exploring multiple reasoning possibilities at each step, creating a tree structure. (https://lilianweng.github.io/posts/2023-06-23-agent/)\n*   ReAct: Integrates reasoning and acting within LLMs, prompting the model to think, act, and observe. (https://lilianweng.github.io/posts/2023-06-23-agent/)\n\n',
 'https://lilianweng.github.io/posts/2023-06-23-agent/')