
**📘 LangChain Learning Chapters**

**📖 Chapter 1: Introduction to LangChain**

What is LangChain?

Why use LangChain (vs direct API calls)?

Core building blocks: Prompts, LLMs, Chains, Tools, Agents.

👉 Mini Task: Print "Hello LangChain!" using Python.

**📖 Chapter 2: Prompts**

PromptTemplate – reusable prompts with variables.

ChatPromptTemplate – multi-role prompts (system, human, ai).

👉 Example:

Create prompt template for resume builder.

Translate English → Tamil.

**📖 Chapter 3: LLMs**

Connecting to LLMs:

OpenAI (ChatOpenAI)

Local (Ollama)

HuggingFace models

.invoke(), .predict(), .generate() methods.

👉 Example: Simple chatbot using OpenAI.

**📖 Chapter 4: Chains**

LLMChain – combine prompt + LLM → output.

SequentialChain – multiple steps in order.

TransformChain – custom function chains.

👉 Example:

Step 1 → Generate title

Step 2 → Generate blog intro

Step 3 → Generate blog body

**📖 Chapter 5: Memory**

Why memory is needed (chatbots, assistants).

ConversationBufferMemory – remembers the full message history.

ConversationSummaryMemory – summarizes old chats.

VectorStoreRetrieverMemory – long-term memory.

👉 Example: Personal assistant bot that remembers your name.

**📖 Chapter 6: Document Loading & Text Splitting**

Document Loaders → PDF, CSV, Text, Web.

TextSplitter → break large docs into chunks.

👉 Example: Load a PDF and split it into 500-character chunks.

**📖 Chapter 7: Embeddings & Vector Stores**

Convert text → vectors.

Embeddings models: OpenAI, HuggingFace.

Vector DBs: FAISS, Chroma, pgvector.

Similarity Search → Find top relevant docs.

👉 Example: Store a PDF in FAISS and search answers.

**📖 Chapter 8: Retrieval-Augmented Generation (RAG)**

Combine retriever + LLM.

User query → Retrieve → Feed into LLM → Answer.

👉 Example: PDF Q&A bot.

**📖 Chapter 9: Agents & Tools**

AgentExecutor → LLM decides which tool to use.

Built-in tools: Python REPL, SQL DB, API calls.

Custom tools → your own Python functions.

👉 Example:

Agent that checks today’s weather using API.

Agent that queries SQL database.

**📖 Chapter 10: Output Parsers**

Why parsing is needed (structured JSON, tables).

SimpleOutputParser

PydanticOutputParser (validate outputs).

👉 Example: Parse LLM output into JSON (name, age, skills).

**📖 Chapter 11: LangGraph (Advanced LangChain)**

State-based workflow.

Nodes, Edges, State.

Looping & branching logic.

👉 Example: Workflow – Load Doc → Extract Entities → Validate → Store.

**📖 Chapter 12: Deployment**

Run LangChain inside FastAPI or Django.

Build simple frontend (Streamlit / React).

Deploy to Vercel, AWS, or GCP.

👉 Example: Deploy PDF Q&A bot with FastAPI.

**📖 Chapter 2: Prompts → PromptTemplate, ChatPromptTemplate**

**Simple Prompt**

In [1]:
from langchain.prompts import PromptTemplate

In [2]:
# Step 1: Define the template
template = "Translate the following English text into Tamil:\n\n{text}"

In [3]:
# Step 2: Create the PromptTemplate object
prompt = PromptTemplate(input_variables=["text"], template=template)

In [4]:
# Step 3: Fill values
final_prompt = prompt.format(text="Good morning, how are you?")

print(final_prompt)

Translate the following English text into Tamil:

Good morning, how are you?


**Multiple Variables**

In [5]:
template = "Write a {tone} email to {recipient} about {topic}."

In [7]:
prompt = PromptTemplate(
    input_variables=["tone", "recipient", "topic"],
    template=template
)

In [8]:

final_prompt = prompt.format(
    tone="formal",
    recipient="HR Manager",
    topic="leave application"
)

print(final_prompt)

Write a formal email to HR Manager about leave application.
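To see what `input_variables` and `format` are doing, here is a plain-Python sketch that finds the placeholders in a template and fills them. It uses only the standard library; `template_variables` and `fill` are hypothetical helpers written for illustration, not LangChain functions.

```python
from string import Formatter

def template_variables(template):
    """Return the placeholder names found in a format-style template."""
    names = []
    for _, field_name, _, _ in Formatter().parse(template):
        if field_name and field_name not in names:
            names.append(field_name)
    return names

def fill(template, **values):
    """Fill the template, failing early if a placeholder has no value."""
    missing = [name for name in template_variables(template) if name not in values]
    if missing:
        raise KeyError(f"Missing values for: {missing}")
    return template.format(**values)

template = "Write a {tone} email to {recipient} about {topic}."
variables = template_variables(template)
filled = fill(template, tone="formal", recipient="HR Manager", topic="leave application")

print(variables)
print(filled)
```

Checking for missing values before formatting gives a clearer error than a bare `KeyError` from `str.format`.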


**ChatPromptTemplate**

In [10]:
from langchain.prompts import ChatPromptTemplate

In [13]:

# Step 1: Define multi-role prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful Tamil translator."),
    ("human", "Translate this text into Tamil: {text}")
])

In [15]:
# Step 2: Fill values
final_prompt = prompt.format_messages(text="How are you?")
print(final_prompt)

[SystemMessage(content='You are a helpful Tamil translator.', additional_kwargs={}, response_metadata={}), HumanMessage(content='Translate this text into Tamil: How are you?', additional_kwargs={}, response_metadata={})]
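The output above is a list with one message per role. A plain-Python sketch of the same idea is shown below; it does not use LangChain, and `fill_messages` is a hypothetical helper that only mimics what `format_messages` produces.

```python
# Plain-Python illustration only; `fill_messages` is a hypothetical helper,
# not part of LangChain.
def fill_messages(role_templates, **values):
    """Fill each (role, template) pair with the given values."""
    return [
        {"role": role, "content": template.format(**values)}
        for role, template in role_templates
    ]

role_templates = [
    ("system", "You are a helpful Tamil translator."),
    ("human", "Translate this text into Tamil: {text}"),
]

messages = fill_messages(role_templates, text="How are you?")
for message in messages:
    print(message["role"], "->", message["content"])
```

The system message has no placeholder, so it passes through unchanged; only the human message receives the `text` value.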


**📖 Chapter 3: Large Language Models (LLMs)**

**3.1 – What is an LLM in LangChain?**

LLM → A text-completion model: it takes a plain string and returns a string (for example, older completion-style models).

ChatModel → A conversational model (like Gemini, GPT-4 chat, etc.): it takes a list of role-based messages and returns a message.

👉 In LangChain, we usually use a ChatModel (e.g., ChatGoogleGenerativeAI) since it supports role-based messages (system, human, ai).


In [16]:
pip install langchain-google-genai google-generativeai



**ChatPromptTemplate + LLM**

In [None]:
import os
from langchain_google_genai import ChatGoogleGenerativeAI

In [18]:
# 👉 Set your Gemini API Key
os.environ["GOOGLE_API_KEY"] = "YOUR_GEMINI_API_KEY"

In [19]:
# Step 1: Define ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a resume builder assistant."),
    ("human", "Generate a 2-line summary for a {role} with skills in {skills}.")
])

In [20]:
# Step 2: Fill placeholders
final_prompt = prompt.format_messages(
    role="Python Developer",
    skills="Django, LangChain"
)

In [None]:
# Step 3: Connect to Gemini LLM
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")

In [None]:
# Step 4: Invoke the model with our prompt
response = llm.invoke(final_prompt)

In [None]:
print("👉 AI Resume Summary:")
print(response.content)

 Flow Recap:
ChatPromptTemplate → reusable structured prompt (system + human).

final_prompt → gets converted into SystemMessage + HumanMessage.

llm.invoke(final_prompt) → sends it to Gemini.

Gemini returns final text (resume summary).

**Combine PromptTemplate + LLM**

In [None]:
from langchain.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI

# Prompt template
template = "Summarize the following text in 2 lines:\n\n{text}"
prompt = PromptTemplate.from_template(template)

# Format with input
final_prompt = prompt.format(text="LangChain helps build LLM apps easily.")

# Gemini LLM
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")
response = llm.invoke(final_prompt)
print(response.content)


**What is invoke in LangChain?**

In LangChain, LLM objects (like ChatGoogleGenerativeAI, ChatOpenAI) have methods to send input and get output.

invoke() → Send one input and get one output (synchronous call).

ainvoke() → Same as invoke but async (for async Python).

batch() → Send a list of inputs and get a list of outputs.

stream() → Get output token by token (like live streaming).
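The four calling styles can be sketched without any API key. `EchoModel` below is a hypothetical stand-in that upper-cases its input instead of calling a real LLM; only the method names mirror the ones listed above.

```python
import asyncio

class EchoModel:
    """Hypothetical stand-in for a chat model; it upper-cases the input
    instead of calling a real LLM."""

    def invoke(self, text):
        # One input in, one output out (blocking call).
        return text.upper()

    async def ainvoke(self, text):
        # Same contract as invoke, but awaitable.
        await asyncio.sleep(0)  # placeholder for network I/O
        return self.invoke(text)

    def batch(self, texts):
        # A list of inputs in, a list of outputs out (same order).
        return [self.invoke(text) for text in texts]

    def stream(self, text):
        # Yield the output piece by piece instead of all at once.
        for word in self.invoke(text).split():
            yield word

model = EchoModel()
single = model.invoke("hello langchain")
many = model.batch(["one", "two"])
pieces = list(model.stream("streamed word by word"))
awaited = asyncio.run(model.ainvoke("async call"))

print(single)
print(many)
print(pieces)
print(awaited)
```

Inside a notebook an event loop is already running, so there you would write `await model.ainvoke(...)` directly instead of wrapping it in `asyncio.run`.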



**📖 Chapter 4: LangChain Chains**

❓ What is a Chain?

In the real world, we rarely ask the LLM just one question.

We want to combine multiple steps (e.g., prompt → LLM → parse → another LLM).

A Chain connects these steps together.

👉 Think of it like a pipeline.
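The pipeline idea can be sketched in plain Python: each step's output becomes the next step's input. The step functions here (`build_prompt`, `fake_llm`, `parse_output`) are hypothetical stand-ins, and `fake_llm` does not call any model.

```python
from functools import reduce

def make_pipeline(*steps):
    """Return a function that feeds each step's output into the next step."""
    def run(value):
        return reduce(lambda acc, step: step(acc), steps, value)
    return run

# Hypothetical steps standing in for prompt -> LLM -> parser.
def build_prompt(topic):
    return f"Give me a motivational quote about {topic}."

def fake_llm(prompt):
    # A real chain would call the model here.
    return f"  [model reply to: {prompt}]  "

def parse_output(raw):
    # Clean up the raw model text.
    return raw.strip()

pipeline = make_pipeline(build_prompt, fake_llm, parse_output)
result = pipeline("learning Python")
print(result)
```

Because the steps are ordinary functions, any one of them can be swapped out without touching the others, which is the main appeal of chaining.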



**1️⃣ Simple LLMChain**

The most basic chain: Prompt → LLM → Output

In [None]:

from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain_google_genai import ChatGoogleGenerativeAI


In [None]:
# LLM
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

In [None]:
# Prompt
template = "Give me a motivational quote about {topic}."
prompt = PromptTemplate(input_variables=["topic"], template=template)

In [None]:
# Chain
chain = LLMChain(llm=llm, prompt=prompt)


In [None]:
# Run
response = chain.run("learning Python")
print(response)

**2️⃣ SimpleSequentialChain (Multiple Steps)**

You can connect multiple prompts in sequence.

Example:

Step 1 → Generate a title

Step 2 → Generate a summary based on the title

In [None]:
from langchain.chains import SimpleSequentialChain

# Chain 1: Generate a title
prompt1 = PromptTemplate.from_template("Give me a creative title about {topic}.")
chain1 = LLMChain(llm=llm, prompt=prompt1)


In [None]:
# Chain 2: Generate a summary
prompt2 = PromptTemplate.from_template("Write a 2-line summary for the title: {title}")
chain2 = LLMChain(llm=llm, prompt=prompt2)

In [None]:
# Sequential chain
overall_chain = SimpleSequentialChain(chains=[chain1, chain2])


In [None]:
# Run
print(overall_chain.run("Artificial Intelligence in Healthcare"))

**3️⃣ SequentialChain (more powerful)**

Unlike SimpleSequentialChain, here we can pass multiple variables across steps.

In [None]:
from langchain.chains import SequentialChain

# Chain 1: Generate a title
prompt1 = PromptTemplate(input_variables=["topic"], template="Give me a creative title about {topic}.")
chain1 = LLMChain(llm=llm, prompt=prompt1, output_key="title")


In [None]:
# Chain 2: Generate summary using title + topic
prompt2 = PromptTemplate(
    input_variables=["title", "topic"],
    template="Write a 2-line summary for the blog '{title}' on topic {topic}."
)
chain2 = LLMChain(llm=llm, prompt=prompt2, output_key="summary")


In [None]:
# SequentialChain with multiple inputs/outputs
overall_chain = SequentialChain(
    chains=[chain1, chain2],
    input_variables=["topic"],
    output_variables=["title", "summary"]
)


In [None]:
result = overall_chain({"topic": "AI in Finance"})
print(result)

Difference from SimpleSequentialChain:

Supports dictionaries (multiple inputs/outputs)

More flexible for complex workflows
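The dictionary-passing idea can be sketched in plain Python: every step reads from a shared dict and writes its result under its own key, so later steps can use both the original input and earlier outputs. `run_sequential` and the two step functions are hypothetical, and this simplified version returns only the requested output keys.

```python
def run_sequential(steps, inputs, output_variables):
    """Run steps in order; each step reads the shared dict and adds one key."""
    state = dict(inputs)
    for output_key, step in steps:
        state[output_key] = step(state)
    return {key: state[key] for key in output_variables}

# Hypothetical steps; real chains would call an LLM instead.
def make_title(state):
    return f"A Guide to {state['topic']}"

def make_summary(state):
    # Reads both the original input and the previous step's output.
    return f"'{state['title']}' covers {state['topic']} in two lines."

steps = [("title", make_title), ("summary", make_summary)]
result = run_sequential(steps, {"topic": "AI in Finance"}, ["title", "summary"])
print(result)
```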

**4️⃣ RouterChain**

What if you want to route the input to different prompts?
E.g., if the topic is "science", use the science prompt; if it is "history", use the history prompt.

This is like an intelligent switchboard.

In [None]:
from langchain.chains.router import MultiPromptChain

# Different prompt templates.
# The router passes the user text under the key "input", so each template uses {input}.
science_template = "Explain {input} like a science teacher."
history_template = "Explain {input} like a history professor."


In [None]:
# Define destinations: the router LLM reads each description to pick a route
prompt_infos = [
    {"name": "science", "description": "Good for science questions", "prompt_template": science_template},
    {"name": "history", "description": "Good for history questions", "prompt_template": history_template},
]

In [None]:
# Default fallback (used when no destination matches)
default_chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template("Explain {input} briefly."))


In [None]:
# Router Chain: from_prompts builds the LLM-based router and the destination chains
router_chain = MultiPromptChain.from_prompts(
    llm,
    prompt_infos,
    default_chain=default_chain
)


In [None]:
print(router_chain.run("What is gravity?"))
print(router_chain.run("Who was Napoleon?"))
#✅ Each input is routed automatically to the matching chain.
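The switchboard idea can be sketched without an LLM. In the sketch below, simple keyword matching stands in for the routing decision that a real router chain asks the model to make; `ROUTE_KEYWORDS`, `choose_route`, and `route_prompt` are hypothetical names chosen for illustration.

```python
# Hypothetical keyword lists; a real router chain asks the LLM to pick a route.
ROUTE_KEYWORDS = {
    "science": ["gravity", "atom", "energy"],
    "history": ["napoleon", "empire", "war"],
}

PROMPTS = {
    "science": "Explain {question} like a science teacher.",
    "history": "Explain {question} like a history professor.",
    "default": "Explain {question} briefly.",
}

def choose_route(question):
    """Return the first route whose keyword appears in the question."""
    lowered = question.lower()
    for route, keywords in ROUTE_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return route
    return "default"

def route_prompt(question):
    """Pick a route and fill that route's prompt with the question."""
    route = choose_route(question)
    return route, PROMPTS[route].format(question=question)

for question in ["What is gravity?", "Who was Napoleon?", "What is a haiku?"]:
    print(route_prompt(question))
```

The third question matches no keyword, so it falls through to the default prompt, which is the role the fallback chain plays.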

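**🧠 Chapter 5: Memory in LangChain**

**1. What is Memory?**

Normally, when you talk to an LLM (like Gemini, GPT), it does not remember past conversations: every call is independent. Memory stores the earlier messages and adds them to the next prompt, so the model can refer back to them.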


**2. Types of Memory in LangChain**

LangChain gives different memory styles:

**ConversationBufferMemory**

Remembers the full chat history.

Example: “ChatGPT style memory.”

**ConversationBufferWindowMemory**

Remembers only the last N messages (like a sliding window).

Useful when you don’t want to overload the context.

**ConversationSummaryMemory**

Summarizes old chats, keeps only key points.

Helps when context is long but you still need memory.

**EntityMemory**

Remembers specific facts about people, places, things.

Example: If you say “My dog is Max,” later it remembers Max is your dog.
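The difference between full-history and sliding-window memory can be sketched in plain Python. `BufferMemory` and `WindowMemory` below are hypothetical, simplified classes; they only store and replay text and are not LangChain's implementations.

```python
from collections import deque

class BufferMemory:
    """Keeps every exchange (hypothetical, simplified)."""

    def __init__(self):
        self.exchanges = []

    def save(self, human, ai):
        self.exchanges.append((human, ai))

    def history(self):
        # This text would be prepended to the next prompt.
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.exchanges)

class WindowMemory(BufferMemory):
    """Keeps only the last k exchanges."""

    def __init__(self, k):
        # A bounded deque silently drops the oldest exchange when full.
        self.exchanges = deque(maxlen=k)

full = BufferMemory()
window = WindowMemory(k=2)

turns = [
    ("My favorite color is blue.", "Noted."),
    ("I like cricket.", "Nice."),
    ("I work as a data analyst.", "Got it."),
]
for human, ai in turns:
    full.save(human, ai)
    window.save(human, ai)

print("blue" in full.history())    # the full history still has the first turn
print("blue" in window.history())  # the window has dropped it
```

With `k=2`, the first exchange is gone by the third turn, so a later question about the favorite color could not be answered from the window alone.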

**Simple Conversation Memory**

In [None]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

# Memory object
memory = ConversationBufferMemory()

# Conversation chain with memory
conversation = ConversationChain(llm=llm, memory=memory, verbose=True)

conversation.run("Hello, my name is Vignesh.")
conversation.run("I live in Chennai.")
conversation.run("Do you remember my name?")


**5.1 – ConversationBufferMemory (Full History)**


In [None]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
from langchain_google_genai import ChatGoogleGenerativeAI

# LLM
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

# Memory (stores full conversation)
memory = ConversationBufferMemory()

# Chain with memory
conversation = ConversationChain(llm=llm, memory=memory, verbose=True)

print(conversation.run("Hello, I am Vignesh."))
print(conversation.run("I live in Chennai."))
print(conversation.run("What is my name?"))
print(conversation.run("Where do I live?"))


**5.2 – ConversationBufferWindowMemory (Sliding Window)**

In [None]:
from langchain.memory import ConversationBufferWindowMemory
from langchain.chains import ConversationChain

memory = ConversationBufferWindowMemory(k=2)  # remembers only last 2 exchanges

conversation = ConversationChain(llm=llm, memory=memory, verbose=True)

print(conversation.run("My favorite color is blue."))
print(conversation.run("I like cricket."))
print(conversation.run("I work as a data analyst."))
print(conversation.run("What is my favorite color?"))


**5.3 – ConversationSummaryMemory (Summarized)**

In [None]:
from langchain.memory import ConversationSummaryMemory
from langchain.chains import ConversationChain

memory = ConversationSummaryMemory(llm=llm)

conversation = ConversationChain(llm=llm, memory=memory, verbose=True)

print(conversation.run("I am Vignesh. I work in supply chain automation."))
print(conversation.run("I am building an AI RFQ tool."))
print(conversation.run("Can you remind me what I am working on?"))


**5.4 – EntityMemory (Facts about people/objects)**

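In [None]:
from langchain.memory import ConversationEntityMemory
from langchain.memory.prompt import ENTITY_MEMORY_CONVERSATION_TEMPLATE
from langchain.chains import ConversationChain

memory = ConversationEntityMemory(llm=llm)

# Entity memory fills an extra {entities} variable, so the chain needs
# the matching prompt instead of the default conversation prompt.
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    prompt=ENTITY_MEMORY_CONVERSATION_TEMPLATE,
    verbose=True
)

print(conversation.run("My dog’s name is Max."))
print(conversation.run("Max loves playing fetch."))
print(conversation.run("What is my dog’s name?"))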


**5.5 – Choosing the Right Memory**

Use ConversationBufferMemory if you need full history (short convos).

Use ConversationBufferWindowMemory if you want only recent history.

Use ConversationSummaryMemory for long chats.

Use ConversationEntityMemory for structured facts.

**📘 Chapter 6: Document Loading (LangChain)**

When building RAG or any knowledge-based chatbot, you need to load documents first (PDF, text, CSV, web pages, etc.) before embedding and querying.

LangChain gives you Document Loaders for this.

**6.1 🔹 What is a Document?**

In LangChain, a Document is a Python object that has:

page_content → the text inside

metadata → info like filename, page number, source, etc.
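The two fields described above can be pictured as a small data class. `SimpleDocument` is a hypothetical stand-in written for illustration; it is not LangChain's `Document` class, it only mirrors the `page_content` and `metadata` fields.

```python
from dataclasses import dataclass, field

@dataclass
class SimpleDocument:
    """Hypothetical stand-in mirroring the two fields described above."""
    page_content: str
    metadata: dict = field(default_factory=dict)

doc = SimpleDocument(
    page_content="LangChain helps build LLM apps easily.",
    metadata={"source": "example.txt", "page": 0},
)

print(doc.page_content[:9])    # the text itself
print(doc.metadata["source"])  # where the text came from
```

Keeping the source in `metadata` is what lets a later search result point back to the file or page it came from.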



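**6.2 🔹 Loading Documents**

In [None]:
#📂 Loading a Text File
# Loaders live in the langchain-community package (pip install langchain-community)
from langchain_community.document_loaders import TextLoader

loader = TextLoader("example.txt")
documents = loader.load()

print(documents[0].page_content[:200])  # first 200 chars
print(documents[0].metadata)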

In [None]:
#📄 Loading a PDF
# PyPDFLoader also needs the pypdf package (pip install pypdf)
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("example.pdf")
documents = loader.load()

print(len(documents))  # number of pages
print(documents[0].page_content[:200])


In [None]:
#🌍 Loading from a Website
# WebBaseLoader also needs beautifulsoup4 (pip install beautifulsoup4)
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://docs.langchain.com")
documents = loader.load()

print(documents[0].page_content[:200])


**6.3 🔹 Splitting Large Documents**

If a document is very big, you split it into chunks before embedding.



In [None]:
from langchain.text_splitter import CharacterTextSplitter

text_splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=500,
    chunk_overlap=50,
    length_function=len
)

docs = text_splitter.split_documents(documents)

print(len(docs))  # number of chunks
print(docs[0].page_content)
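To see what `chunk_size` and `chunk_overlap` mean, here is a simplified plain-Python sketch that cuts text into fixed-size windows sharing a few characters. `split_with_overlap` is a hypothetical helper; unlike `CharacterTextSplitter`, it ignores the separator and cuts purely by character position.

```python
def split_with_overlap(text, chunk_size=500, chunk_overlap=50):
    """Cut text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "abcdefghij" * 120  # 1200 characters of dummy text
chunks = split_with_overlap(sample, chunk_size=500, chunk_overlap=50)

print(len(chunks), [len(chunk) for chunk in chunks])
print(chunks[0][-50:] == chunks[1][:50])  # neighbours share 50 characters
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, which helps retrieval later.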


**6.4 🔹 Typical Flow**

Load documents (PDF, CSV, website, etc.)

Split into chunks

Embed into vector DB (next chapter)

Use RAG pipeline to answer questions



**Chapter 7 – Embeddings + Vector Stores**

This is the core of RAG (Retrieval-Augmented Generation).
We take the chunks from the PDF and convert them into vector embeddings (numerical representation of text). Then, we store them in a Vector DB (like Chroma, FAISS, Pinecone).

🧩 **Step 1: Install requirements**

In [None]:
pip install langchain langchain-chroma langchain-huggingface sentence-transformers


🧩 **Step 2: Create embeddings**

In [None]:
from langchain_huggingface import HuggingFaceEmbeddings

# Create embedding model
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Example text
text = "LangChain is a framework for developing applications powered by LLMs."

vector = embeddings.embed_query(text)
print("🔹 Vector length:", len(vector))
print("🔹 First 5 numbers:", vector[:5])
#👉 You’ll see a list of floating-point numbers (the vector). That’s how the model represents meaning.

🧩 **Step 3: Store in VectorDB (Chroma)**

In [None]:
from langchain_chroma import Chroma

# Assume `docs` is from Chapter 6 (PDF chunks)
db = Chroma.from_documents(docs, embeddings, persist_directory="./chroma_db")

print("✅ Vector DB created with", db._collection.count(), "documents")

🧩 **Step 4: Search from VectorDB**

In [None]:
# Query
query = "What is LangChain?"

# Get top 2 similar chunks
results = db.similarity_search(query, k=2)

for i, res in enumerate(results):
    print(f"\n🔹 Result {i+1}:")
    print(res.page_content)
    print("Metadata:", res.metadata)
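What a similarity search does can be sketched in plain Python: embed the query and every chunk, score each chunk by cosine similarity, and return the top `k`. The `embed` function below is a toy word-count embedding chosen only so the example runs without a model; real embedding models return dense float vectors, and a vector DB uses much faster indexes than this linear scan.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: word counts. Real models return dense float vectors."""
    cleaned = text.lower().replace("?", "").replace(".", "")
    return Counter(cleaned.split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def similarity_search(query, texts, k=2):
    """Return the k texts most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        texts,
        key=lambda text: cosine_similarity(query_vec, embed(text)),
        reverse=True,
    )
    return ranked[:k]

chunks = [
    "LangChain is a framework for LLM applications.",
    "Chennai is a city in Tamil Nadu.",
    "What LangChain is used for in practice.",
]
top = similarity_search("What is LangChain?", chunks, k=2)
for text in top:
    print(text)
```

The chunk about Chennai shares only one word with the query, so it scores lowest and is left out of the top two.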
