
In [1]:
import os
api = os.environ["GOOGLE_API_KEY"]  # set this beforehand; never commit a real key to a notebook

In [2]:
!pip install langchain-google-genai google-generativeai


In [3]:
from langchain_google_genai import GoogleGenerativeAI

api_key = api  # Get your free API key from https://ai.google.dev/

llm = GoogleGenerativeAI(model="gemini-1.5-pro-latest", google_api_key=api_key, temperature=0.1)

In [4]:
llm.invoke("What is the capital of England?")

'London\n'

# **Prompt** **Templates**

In [5]:
from langchain.prompts import PromptTemplate

# A template with a single placeholder, {name}
template = PromptTemplate(
    input_variables=["name"],
    template="Hello {name}, how can I assist you today?"
)

# Fill the placeholder to produce the final prompt string
formatted_prompt = template.format(name="Sayandeep")

In [6]:
print(formatted_prompt)

Hello Sayandeep, how can I assist you today?


In [9]:
# Using multiple variables

template = PromptTemplate(
    input_variables=["greet", "name"],
    template="{greet} !! Hello {name}, how can I assist you today?"
)

formatted_prompt = template.format(greet="Good Morning", name="Sayandeep")

print(formatted_prompt)


Good Morning !! Hello Sayandeep, how can I assist you today?
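
To make the substitution step concrete, here is a minimal plain-Python sketch of what filling a template involves. It is not LangChain's implementation; `render_template` is a hypothetical helper that only uses the standard library, and it assumes simple `{name}` placeholders.

```python
from string import Formatter


def render_template(template, **values):
    """Fill {placeholders} in template, failing loudly if a value is missing."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion).
    required = {field for _, field, _, _ in Formatter().parse(template) if field}
    missing = required - set(values)
    if missing:
        raise KeyError(f"Missing template variables: {sorted(missing)}")
    return template.format(**values)


greeting = render_template(
    "{greet} !! Hello {name}, how can I assist you today?",
    greet="Good Morning",
    name="Sayandeep",
)
print(greeting)
```

Checking for missing variables before formatting gives a clearer error than a bare `KeyError` from `str.format`.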


# **Few Shot Prompt Template**

Few-shot prompting is a technique where you give the model a few examples of the task before asking it to complete a new instance. The examples show the model the expected pattern and output format, which helps it generate better responses.

In [11]:
from langchain.prompts import FewShotPromptTemplate, PromptTemplate

# Few-shot examples
examples = [
    {"input": "The product is amazing! I love it.", "output": "Positive"},
    {"input": "It was a complete waste of money. I regret buying it.", "output": "Negative"},
]

# Example formatter
example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Review: {input}\nSentiment: {output}\n"
)

# Few-Shot Prompt Template
few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Classify the sentiment of the following reviews:\n",
    suffix="Review: {input}\nSentiment:",
    input_variables=["input"]
)

# Test Few-Shot Prompting
new_input = {"input": "The quality is decent, but I expected better."}
formatted_prompt = few_shot_prompt.format(**new_input)

In [12]:
print(formatted_prompt)

Classify the sentiment of the following reviews:


Review: The product is amazing! I love it.
Sentiment: Positive


Review: It was a complete waste of money. I regret buying it.
Sentiment: Negative


Review: The quality is decent, but I expected better.
Sentiment:


In [13]:
# Get AI response

response=llm.invoke(formatted_prompt)

In [14]:
# Print results
print(f"Prompt:\n{formatted_prompt}")
print(f"\nAI Response:\n{response}")

Prompt:
Classify the sentiment of the following reviews:


Review: The product is amazing! I love it.
Sentiment: Positive


Review: It was a complete waste of money. I regret buying it.
Sentiment: Negative


Review: The quality is decent, but I expected better.
Sentiment:

AI Response:
Neutral (or slightly negative)
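
The prompt printed above is just a prefix, the formatted examples, and the suffix joined together. The sketch below rebuilds it in plain Python to show that assembly step. `build_few_shot_prompt` is a hypothetical helper, not a LangChain function, and the blank-line separator is an assumption chosen to match the printed prompt.

```python
examples = [
    {"input": "The product is amazing! I love it.", "output": "Positive"},
    {"input": "It was a complete waste of money. I regret buying it.", "output": "Negative"},
]


def build_few_shot_prompt(prefix, examples, example_template, suffix, separator="\n\n", **values):
    """Join the prefix, each formatted example, and the formatted suffix."""
    parts = [prefix]
    parts += [example_template.format(**example) for example in examples]
    parts.append(suffix.format(**values))
    return separator.join(parts)


manual_prompt = build_few_shot_prompt(
    prefix="Classify the sentiment of the following reviews:\n",
    examples=examples,
    example_template="Review: {input}\nSentiment: {output}\n",
    suffix="Review: {input}\nSentiment:",
    input="The quality is decent, but I expected better.",
)
print(manual_prompt)
```

Ending the suffix with `Sentiment:` leaves the label as the only thing the model still has to write.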



# **Few-Shot Prompting with Chat Models (ChatPromptTemplate)**

In [21]:
from langchain.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate

# Define system and human message templates
system_message = SystemMessagePromptTemplate.from_template(
    "You are a helpful AI assistant that classifies text sentiment."
)
human_message = HumanMessagePromptTemplate.from_template(
    "Review: {review}\nSentiment:"
)

# Combine the system and human messages into a ChatPromptTemplate (no example turns are included here)
chat_prompt = ChatPromptTemplate.from_messages([system_message, human_message])

formatted_prompt = chat_prompt.format(review="I don't love this product! It's bad")
print(formatted_prompt)

System: You are a helpful AI assistant that classifies text sentiment.
Human: Review: I don't love this product! It's bad
Sentiment:


In [22]:
llm.invoke(formatted_prompt)

'Sentiment: Negative\n'
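
The chat prompt above contains only a system message and one human message. To make it few-shot, example turns go between them as alternating human and AI messages. The following is a plain-Python sketch of that message layout. `build_chat_messages` and the example pairs are illustrative assumptions, not part of LangChain.

```python
few_shot_examples = [
    ("The product is amazing! I love it.", "Positive"),
    ("It was a complete waste of money. I regret buying it.", "Negative"),
]


def build_chat_messages(system_text, examples, review):
    """Return (role, content) pairs with example turns placed before the real query."""
    messages = [("system", system_text)]
    for example_review, sentiment in examples:
        messages.append(("human", f"Review: {example_review}\nSentiment:"))
        messages.append(("ai", sentiment))
    messages.append(("human", f"Review: {review}\nSentiment:"))
    return messages


messages = build_chat_messages(
    "You are a helpful AI assistant that classifies text sentiment.",
    few_shot_examples,
    "I don't love this product! It's bad",
)
for role, content in messages:
    print(f"{role}: {content}")
```

Each example appears as a completed exchange, so the model sees the expected answer format before it reaches the final, unanswered review.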

# **Output Parsers in LangChain**
Output Parsers in LangChain help convert the raw output from a language model into structured data. They are useful when you need the output in a specific format, such as JSON, a list, or key-value pairs.



In [24]:
from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import PromptTemplate

# Define an output parser
output_parser = CommaSeparatedListOutputParser()

# Create a prompt template
prompt = PromptTemplate(
    template="List three programming languages, separated by commas.",
    input_variables=[],
    output_parser=output_parser
)

# Format the prompt
formatted_prompt = prompt.format()
print(formatted_prompt)

# Parse a sample response (hard-coded here; in practice, pass the model's output)
parsed_output = output_parser.parse("Python, JavaScript, C++")
print(parsed_output)


# The comma-separated string is converted into a Python list

List three programming languages, separated by commas.
['Python', 'JavaScript', 'C++']


In [25]:
# CommaSeparatedListOutputParser

output_parser = CommaSeparatedListOutputParser()
parsed_output = output_parser.parse("Red, Green, Blue")
print(parsed_output)


['Red', 'Green', 'Blue']
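
The same idea extends to JSON output. Below is a standard-library sketch of a parser for a response that should contain one JSON object. `parse_json_output` is a hypothetical helper rather than a LangChain class, and the sample response is made up for illustration.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker that models often wrap JSON in


def parse_json_output(raw):
    """Parse a model response that should contain a single JSON object."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    return json.loads(text)


# Hypothetical model response, hard-coded so the sketch runs without an API call
sample_response = FENCE + 'json\n{"sentiment": "Negative", "confidence": 0.9}\n' + FENCE
parsed = parse_json_output(sample_response)
print(parsed["sentiment"])
```

If the response is not valid JSON, `json.loads` raises `json.JSONDecodeError`, which a caller can catch in order to retry or report the failure.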


# **Document Loaders in LangChain**
Document loaders in LangChain read content from sources such as PDFs, Word files, Notion, web pages, databases, and cloud storage into `Document` objects. They are usually the first step in Retrieval-Augmented Generation (RAG) pipelines and other NLP tasks.
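
As a rough mental model, a loader reads a source and returns a list of objects that carry the text plus metadata about where it came from. The sketch below imitates that shape for a plain-text file using only the standard library. The `Document` dataclass and `load_text_file` are illustrative stand-ins that reuse the `page_content` attribute name seen in this notebook; they are not the LangChain classes.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Document:
    """Text plus metadata describing its origin."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_text_file(path):
    """Read a UTF-8 text file and return it as a one-element list of Documents."""
    text = Path(path).read_text(encoding="utf-8")
    return [Document(page_content=text, metadata={"source": str(path)})]


# Demo on a throwaway file so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "notes.txt"
    sample_path.write_text("LangChain loaders return Document objects.", encoding="utf-8")
    sample_docs = load_text_file(sample_path)

print(sample_docs[0].page_content)
```

Keeping the source path in `metadata` makes it possible to point back to the original file when a chunk is retrieved later.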

In [31]:
!pip install -U langchain-community


In [33]:
from langchain_community.document_loaders import PyPDFLoader  # requires the pypdf package

# Load a PDF
loader = PyPDFLoader("/content/Sayandeep_Resume.pdf")
documents = loader.load()

# Print the first page content
print(documents[0].page_content)


Sayandeep Sarkar +91- 7001932512
Bachelor of Engineering sarkarsayandeep093@gmail.com
in Mechanical Engineering GitHub
Jadavpur University, Kolkata, India LinkedIn
Summary
Currently a final year Undergrad at Jadavpur University, highly interested in Data Engineering, Data Science and
Software Development
SPECIALIST -Codeforces (1400*) (solved 350+ questions)
Solved 550+ Data Structures and Algorithms questions on Leetcode and overall 1000+ questions including Codeforces
and Gfg
Internship
Data Engineering Intern atIIM Shillong
Developed and maintained Data Engineering pipelines usingApache Airflow, ensuring efficient scheduling and
monitoring ofETL tasks. Worked extensively withApache Kafkato build real-time data streaming solutions,
enabling seamless data integration and processing. LeveragedPySpark for big data processing, optimizing
performance and scalability in large-scale data transformation tasks.
Roles and Responsibilities
Central Placement Coordinatorof Jadavpur University for

# **Text Splitters in LangChain**
Text Splitters in LangChain help break large documents into smaller chunks. This is useful for:
 * Efficient retrieval in RAG (Retrieval-Augmented Generation)
 * Chunking text for embeddings in vector databases
 * Processing large documents without exceeding token limits




In [40]:
# Character-based Text Splitter

from langchain.text_splitter import CharacterTextSplitter

text = """Artificial Intelligence (AI) is transforming industries worldwide.
From healthcare to finance, AI is improving efficiency and decision-making."""

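# Initialize splitter.
# Note: CharacterTextSplitter splits only on its separator (default "\n\n").
# This text has no blank line, so it stays one chunk even though chunk_size=5.
text_splitter = CharacterTextSplitter(chunk_size=5, chunk_overlap=2)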

# Split text
chunks = text_splitter.split_text(text)

# Print chunks
for i, chunk in enumerate(chunks):
    print(f"Chunk {i+1}:\n{chunk}\n")


Chunk 1:
Artificial Intelligence (AI) is transforming industries worldwide. 
From healthcare to finance, AI is improving efficiency and decision-making.
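
The single chunk above comes from the text containing no blank-line separator, not from the size settings being ignored. To see what `chunk_size` and `chunk_overlap` mean in isolation, here is a plain-Python sketch of fixed-size character chunking with overlap. `split_by_characters` is a hypothetical helper and deliberately simpler than LangChain's splitters.

```python
def split_by_characters(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


demo_chunks = split_by_characters("Artificial Intelligence", chunk_size=5, chunk_overlap=2)
for i, chunk in enumerate(demo_chunks):
    print(f"Chunk {i + 1}: {chunk!r}")
```

With `chunk_size=5` and `chunk_overlap=2`, the window advances three characters at a time, so the last two characters of each chunk reappear at the start of the next one.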



In [41]:
# Recursive Character Splitter (Better for Documents)

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=50, chunk_overlap=10)
chunks = text_splitter.split_text(text)

for i, chunk in enumerate(chunks):
    print(f"Chunk {i+1}:\n{chunk}\n")


Chunk 1:
Artificial Intelligence (AI) is transforming

Chunk 2:
industries worldwide.

Chunk 3:
From healthcare to finance, AI is improving

Chunk 4:
improving efficiency and decision-making.
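
The recursive strategy can be sketched in plain Python: try the coarsest separator first, pack pieces greedily up to the size limit, and fall back to a finer separator only for pieces that are still too long. This is a simplified illustration under stated assumptions, not LangChain's algorithm. `recursive_split` is a hypothetical helper, it applies no overlap, and it assumes the separator list ends with the empty string.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text so that every chunk has at most chunk_size characters."""
    text = text.strip()
    if len(text) <= chunk_size:
        return [text] if text else []
    sep, *finer = separators
    if sep == "":
        # Last resort: cut at fixed positions
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # the piece still fits into the current chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece  # start a new chunk with this piece
        else:
            # The piece alone is too long: retry it with finer separators
            chunks.extend(recursive_split(piece, chunk_size, tuple(finer)))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample_text = (
    "Artificial Intelligence (AI) is transforming industries worldwide.\n"
    "From healthcare to finance, AI is improving efficiency and decision-making."
)
recursive_chunks = recursive_split(sample_text, chunk_size=50)
for i, chunk in enumerate(recursive_chunks):
    print(f"Chunk {i + 1}: {chunk}")
```

Because this sketch has no overlap, its last chunk starts at "efficiency" instead of repeating "improving" as the LangChain output above does.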



# **Chains in LangChain using LCEL (LangChain Expression Language)**
Chains in LangChain allow you to connect multiple components (LLMs, retrievers, memory, etc.) into a single pipeline.

* LCEL (LangChain Expression Language) composes components with the `|` operator, so the output of one step becomes the input of the next.
* There is no need for dedicated chain classes: components are piped together much like function composition.
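
To show how piping with `|` can work in Python, here is a toy sketch built on operator overloading. The `Step` class and `fake_llm` are illustrative assumptions and far simpler than LangChain's runnables; no model is called.

```python
class Step:
    """Wrap a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b builds a new Step that runs a, then feeds its result to b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


fill_prompt = Step(lambda inputs: f"What is the capital of {inputs['country']}?")
fake_llm = Step(lambda prompt: f"[model would answer: {prompt}]")  # stand-in, no API call

toy_chain = fill_prompt | fake_llm
print(toy_chain.invoke({"country": "France"}))
```

Since `|` returns another `Step`, a chain of any length is still a single object with the same `invoke` method.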

In [43]:
from langchain_google_genai import GoogleGenerativeAI
from langchain.prompts import PromptTemplate

# Define a prompt
prompt = PromptTemplate.from_template("What is the capital of {country}?")

# Define an LLM
llm = GoogleGenerativeAI(model="gemini-1.5-pro-latest", google_api_key=api_key, temperature=0.1)

# Create the LCEL chain
chain = prompt | llm


# Run the chain

response=chain.invoke({"country":"France"})

print(response)


Paris



# **LCEL Chain with Memory (Conversation Chain)**

In [44]:
from operator import itemgetter
from langchain.memory import ConversationBufferMemory

# Create memory (by default the history is returned as one formatted string)
memory = ConversationBufferMemory(memory_key="chat_history")

# Define a prompt
prompt = PromptTemplate.from_template("Chat history: {chat_history}\nUser: {input}\nAI:")

# LCEL Chain: Memory → Prompt → LLM
# Each dict value is called with the chain input; LCEL wraps plain callables automatically.
chain = (
    {
        # Load the stored history on every call
        "chat_history": lambda _: memory.load_memory_variables({})["chat_history"],
        # Pass only the user's text, not the whole input dict
        "input": itemgetter("input"),
    }
    | prompt
    | llm
)

# Run the chain
response = chain.invoke({"input": "Hello!"})
print(response)

# Save memory
memory.save_context({"input": "Hello!"}, {"output": response})



Hi there! How can I help you today?



# **Memory in LangChain**
Memory in LangChain allows stateful interactions by storing past conversations or information.
* Keeps chat history for context
* Improves user experience in chatbots
* Works with various storage backends
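
At its simplest, a conversation buffer is a list of past turns that is rendered into the next prompt. The plain-Python sketch below shows that loop without calling a model. `BufferMemory`, `build_prompt`, and the sample AI reply are hypothetical and only illustrate the idea.

```python
class BufferMemory:
    """Keep every turn of the conversation in order."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) pairs

    def save_turn(self, user_text, ai_text):
        self.turns.append(("Human", user_text))
        self.turns.append(("AI", ai_text))

    def as_text(self):
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


def build_prompt(memory, user_text):
    """Prepend the stored history so the model can see earlier turns."""
    return f"Chat history:\n{memory.as_text()}\nUser: {user_text}\nAI:"


toy_memory = BufferMemory()
toy_memory.save_turn("My name is Alex.", "Nice to meet you, Alex!")  # made-up reply
next_prompt = build_prompt(toy_memory, "What is my name?")
print(next_prompt)
```

The model itself stays stateless: it can only answer "Alex" because the earlier turn is included in the prompt text.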

In [51]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

# Initialize Google Gemini Model
llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro-latest", google_api_key=api, temperature=0.1)

# Create Memory to Store Conversation History
memory = ConversationBufferMemory()

# Create Conversation Chain with Memory
conversation = ConversationChain(
    llm=llm,
    memory=memory
)


# Simulate a conversation
print(conversation.predict(input="Hello, who are you?"))
print(conversation.predict(input="Can you remember my name is Alex?"))
print(conversation.predict(input="What is my name?"))  # Should remember "Alex"


Hello! I am a large language model, trained by Google.  I don't have a name, per se, but you can call me AI if you like.  I exist as a computer program and have been trained on a massive dataset of text and code. This dataset included a wide range of information, from books and articles to websites and code repositories.  Because of this training, I can communicate in response to a wide range of prompts and questions, generating different creative text formats, like poems, code, scripts, musical pieces, email, letters, etc. I can even translate languages!  I don't have personal experiences or emotions like humans do, and my knowledge is based on the data I was trained on, which has a cutoff point.  So, for example, I wouldn't know about current events past my last training update.  Is there anything you'd like to know more about?
As a large language model, I don't have memory of past conversations. Each interaction we have starts fresh.  So, while I can't remember your name from a prev