<a href="https://colab.research.google.com/github/InduwaraGayashan001/Generative-AI/blob/main/LangChain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Installation

In [None]:
!pip install langchain_huggingface

In [None]:
!pip install -U langchain-community

In [None]:
!pip install -U bitsandbytes

# Calling an LLM Locally

In [None]:
from langchain_huggingface import HuggingFacePipeline
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, AutoModelForSeq2SeqLM
import torch

In [None]:
model_id = "google/flan-t5-large"
tokenizer = AutoTokenizer.from_pretrained(model_id)



In [None]:
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, load_in_8bit=True, device_map="auto")

In [None]:
# Use a distinct name so the imported `pipeline` function is not overwritten
hf_pipeline = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=128
)

Device set to use cuda:0


In [None]:
local_llm = HuggingFacePipeline(pipeline=hf_pipeline)


In [None]:
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=['country'],
    template="What is the capital of {country}?"
)

In [None]:
chain = LLMChain(llm=local_llm, prompt=prompt)

In [None]:
result = chain.run("Sri Lanka")
print(result)

colombo


# Calling an LLM with an Inference Key

## DeepSeek

In [None]:
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from google.colab import userdata
import re

deepseek_llm_endpoint = HuggingFaceEndpoint(
    repo_id="deepseek-ai/DeepSeek-R1",
    temperature=0,
    max_new_tokens=2,
    huggingfacehub_api_token=userdata.get('HF_TOKEN')
)
deepseek_llm = ChatHuggingFace(llm=deepseek_llm_endpoint)
# invoke() returns a message object; the reply text is in its .content attribute
response = deepseek_llm.invoke("What is the capital of India?")
# Remove the model's <think>...</think> reasoning block before printing
cleaned_response = re.sub(r"<think>.*?</think>", "", response.content, flags=re.DOTALL).strip()
print(cleaned_response)

The capital of India is **New Delhi**.

Here's a bit more detail for clarity:
1.  **New Delhi** is a distinct district within the larger **National Capital Territory of Delhi (NCT)**.
2.  It was officially designated as the capital of British India in **1911** and became the capital of independent India in **1947**.
3.  New Delhi houses the central government institutions, including the Parliament of India, the Rashtrapati Bhavan (President's residence), and the Supreme Court.

So, while people often say "Delhi" is the capital, the specific, official capital city is **New Delhi**.
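
The cleanup step in the cell above relies on two regex details: a non-greedy `.*?` and the `re.DOTALL` flag. The sketch below shows their effect on a made-up reply. The `raw_reply` string is hypothetical and only imitates the shape of a reasoning model's output; it is not real DeepSeek-R1 text.

```python
import re

# Hypothetical raw reply; real model output will differ.
raw_reply = (
    "<think>\nThe user asks for India's capital.\nThat is New Delhi.\n</think>\n\n"
    "The capital of India is **New Delhi**."
)

# re.DOTALL lets `.` match newlines, so the whole multi-line block is removed.
cleaned = re.sub(r"<think>.*?</think>", "", raw_reply, flags=re.DOTALL).strip()
print(cleaned)

# Without re.DOTALL, `.` stops at the first newline and nothing is removed.
not_cleaned = re.sub(r"<think>.*?</think>", "", raw_reply).strip()
print(not_cleaned.startswith("<think>"))
```

The non-greedy quantifier matters when a reply contains more than one reasoning block: each block is removed separately instead of everything between the first opening tag and the last closing tag.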


## Llama

In [None]:
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from google.colab import userdata

llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Llama-3.1-8B-Instruct",
    temperature=0.1,
    max_new_tokens=2,
    huggingfacehub_api_token=userdata.get('HF_TOKEN')
)
result = llm.invoke("What is the capital of India? Just give me the name of the city.")
print(result)

 Delhi.



# Prompt Templates

In [None]:
from langchain.prompts import PromptTemplate

prompt_template_1 = PromptTemplate(
    input_variables=['country'],
    template="What is the capital of {country}? Just give me the name of the city "
)

prompt1 = prompt_template_1.format(country="Sri Lanka")
print(prompt1)

What is the capital of Sri Lanka? Just give me the name of the city 


In [None]:
prompt_template_2 = PromptTemplate.from_template("What is the capital of {country}? Just give me the name of the city ")
prompt2 = prompt_template_2.format(country="Sri Lanka")
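
Both cells build the same kind of object: a string with named placeholders plus the list of variables it expects. `from_template` does not need `input_variables` because the names can be read from the string itself. The following plain-Python sketch illustrates that idea with the standard library only; it is an assumption-level illustration, and LangChain's own implementation may differ.

```python
from string import Formatter

template = "What is the capital of {country}? Just give me the name of the city."

# Collect the placeholder names that appear in the template string.
variables = sorted({name for _, name, _, _ in Formatter().parse(template) if name})
print(variables)

# Filling the template is ordinary string formatting.
filled = template.format(country="Sri Lanka")
print(filled)
```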

# Chain

In [None]:
from langchain.chains import LLMChain

chain = LLMChain(llm=llm, prompt=prompt_template_2)
result = chain.run("Sri Lanka")
print(result)


 Colombo


# Simple Sequential Chain

In [None]:
prompt_template_country = PromptTemplate(
    input_variables=['continent'],
    template="What is the largest country in {continent}? Just give me the name of the country."
)

country_chain = LLMChain(llm=llm, prompt=prompt_template_country)

prompt_template_city = PromptTemplate(
    input_variables=['country'],
    template="What is the capital of {country}? Just give me the name of the city. "
)

city_chain = LLMChain(llm=llm, prompt=prompt_template_city)

In [None]:
from langchain.chains import SimpleSequentialChain

chain = SimpleSequentialChain(chains=[country_chain, city_chain])
result = chain.run("Asia")
print(result)

 Moscow
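
A simple sequential chain passes the single output of one step as the single input of the next, which is how "Asia" ends up as "Moscow". The sketch below reproduces that flow in plain Python. `fake_llm`, `make_step`, and `run_sequence` are hypothetical helpers, and the canned replies only mirror the answers seen in this notebook; no model is called.

```python
from typing import Callable, List

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the model: looks the prompt up in a tiny table."""
    canned = {
        "What is the largest country in Asia? Just give me the name of the country.": " Russia\n",
        "What is the capital of Russia? Just give me the name of the city.": " Moscow\n",
    }
    return canned[prompt]

def make_step(template: str) -> Callable[[str], str]:
    # Each step formats its one input into a prompt and returns the trimmed reply.
    def step(value: str) -> str:
        return fake_llm(template.format(value=value)).strip()
    return step

def run_sequence(steps: List[Callable[[str], str]], first_input: str) -> str:
    value = first_input
    for step in steps:
        value = step(value)  # output of one step becomes input of the next
    return value

steps = [
    make_step("What is the largest country in {value}? Just give me the name of the country."),
    make_step("What is the capital of {value}? Just give me the name of the city."),
]
final_answer = run_sequence(steps, "Asia")
print(final_answer)
```

Trimming the reply between steps is a choice made in this sketch so that the second prompt reads cleanly; only the final value survives, so the intermediate country name is not available to the caller.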



# Sequential Chain

In [None]:
country_chain = LLMChain(llm=llm, prompt=prompt_template_country, output_key="country")
city_chain = LLMChain(llm=llm, prompt=prompt_template_city, output_key="capital")

In [None]:
from langchain.chains import SequentialChain

chain = SequentialChain(
    chains=[country_chain, city_chain],
    input_variables=["continent"],
    output_variables=["country", "capital"]
)
result = chain.invoke({"continent": "Asia"})
print(result)

{'continent': 'Asia', 'country': ' Russia\n', 'capital': ' Moscow\n'}
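
Unlike the simple variant, a sequential chain with `output_key` names keeps every intermediate value, so the result contains both `country` and `capital`. The sketch below imitates that with a dictionary that grows step by step. The lambdas are hypothetical stand-ins for LLM calls and return canned text copied from the output above.

```python
from typing import Callable, Dict, List, Tuple

# Each step is (output_key, function of all values collected so far).
Step = Tuple[str, Callable[[Dict[str, str]], str]]

steps: List[Step] = [
    ("country", lambda values: " Russia\n" if values["continent"] == "Asia" else " unknown\n"),
    ("capital", lambda values: " Moscow\n" if values["country"].strip() == "Russia" else " unknown\n"),
]

def run_named_sequence(steps: List[Step], inputs: Dict[str, str]) -> Dict[str, str]:
    values = dict(inputs)  # copy so the caller's dict is left untouched
    for output_key, fn in steps:
        values[output_key] = fn(values)
    return values

result = run_named_sequence(steps, {"continent": "Asia"})
print(result)

# The raw replies carry leading spaces and trailing newlines; trim them for display.
cleaned = {key: value.strip() for key, value in result.items()}
print(cleaned)
```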


# Agents and Tools

In [None]:
!pip install wikipedia


In [None]:
!pip install langchain_openai


In [None]:
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain_openai import ChatOpenAI
from google.colab import userdata

llm = ChatOpenAI(
    model="openai/gpt-4o-mini",
    api_key=userdata.get('GITHUB_TOKEN'),
    base_url="https://models.github.ai/inference"
)
tools = load_tools(["wikipedia", "llm-math"], llm=llm)

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

agent.run("Who is the current president of Sri Lanka?")

> Entering new AgentExecutor chain...
I need to find the most up-to-date information about the current president of Sri Lanka. This could be found on Wikipedia.
Action: wikipedia
Action Input: "current president of Sri Lanka"
Observation: Page: President of Sri Lanka
Summary: The president of Sri Lanka (Sinhala: ශ්‍රී ලංකා ජනාධිපති Śrī Laṅkā Janādhipati; Tamil: இலங்கை ஜனாதிபதி Ilaṇkai janātipati) is the head of state and head of government of the Democratic Socialist Republic of Sri Lanka. The president is the chief executive of the union government and the commander-in-chief of the Sri Lanka Armed Forces. The powers, functions and duties of prior presidential offices, in addition to their relation with the Prime minister and Government of Sri Lanka, have over time differed with the various constitutional documents since the creation of the office. The president appoints the Prime Minister of Sri Lanka who can command the confidence of the Parl

'The current president of Sri Lanka is Anura Kumara Dissanayake, who assumed office on September 23, 2024.'
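
The verbose trace follows a text protocol: the model writes a thought, then an `Action:` line naming a tool and an `Action Input:` line with its argument, and the executor feeds the tool result back as an `Observation:`. The sketch below shows one way such text could be parsed. `parse_action` and `ACTION_PATTERN` are hypothetical names, and LangChain's own parser is more thorough than this.

```python
import re
from typing import Optional, Tuple

ACTION_PATTERN = re.compile(r"Action:\s*(.*?)\s*\nAction Input:\s*(.*)", re.DOTALL)

def parse_action(llm_text: str) -> Optional[Tuple[str, str]]:
    """Return (tool_name, tool_input), or None when the text names no action."""
    match = ACTION_PATTERN.search(llm_text)
    if match is None:
        return None
    tool = match.group(1).strip()
    tool_input = match.group(2).strip().strip('"')
    return tool, tool_input

step_text = (
    "I need to find the most up-to-date information about the current president of Sri Lanka.\n"
    "Action: wikipedia\n"
    'Action Input: "current president of Sri Lanka" '
)
parsed = parse_action(step_text)
print(parsed)

# Text without an action (for example a final answer) yields None.
print(parse_action("Final Answer: done"))
```

Because the protocol is plain text, a reply that does not follow the expected layout cannot be mapped to a tool, which is why agent runs sometimes need a retry.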

# Memory

In [None]:
from langchain_openai import ChatOpenAI
from google.colab import userdata

llm = ChatOpenAI(
    model = "openai/gpt-4o-mini",
    api_key=userdata.get('GITHUB_TOKEN'),
    base_url="https://models.github.ai/inference"
)

In [None]:
from langchain.prompts import PromptTemplate

prompt_template_name = PromptTemplate(
    input_variables=['country'],
    template="What is the capital of {country}? Just give me the name of the city "
)

In [None]:
from langchain.chains import LLMChain

chain = LLMChain(llm=llm, prompt=prompt_template_name)
result = chain.run("Sri Lanka")
print(result)


Sri Jayawardenepura Kotte


In [None]:
result2= chain.run("India")
print(result2)

New Delhi


In [None]:
type(chain.memory)

NoneType

## ConversationBufferMemory

In [None]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import LLMChain

memory = ConversationBufferMemory()
chain = LLMChain(llm=llm, prompt=prompt_template_name, memory=memory)
result = chain.run("Sri Lanka")
print(result)


Sri Jayawardenepura Kotte.


In [None]:
result2 = chain.run("India")
print(result2)

New Delhi


In [None]:
print(chain.memory.buffer)

Human: Sri Lanka
AI: Sri Jayawardenepura Kotte.
Human: India
AI: New Delhi
Human: India
AI: New Delhi
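
The buffer is an append-only transcript: every call adds one `Human:`/`AI:` pair, so running the same cell twice adds the same pair twice, which may be why the India exchange appears twice above. The class below is a simplified, hypothetical stand-in that shows the idea; it is not LangChain's `ConversationBufferMemory`.

```python
from typing import List, Tuple

class BufferMemorySketch:
    """Hypothetical, simplified conversation buffer that keeps every turn."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    @property
    def buffer(self) -> str:
        # Render the turns in the same "Human: ... / AI: ..." layout as the output above.
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

memory = BufferMemorySketch()
memory.save("Sri Lanka", "Sri Jayawardenepura Kotte.")
memory.save("India", "New Delhi")
print(memory.buffer)
```

Since nothing is ever dropped, the transcript sent to the model grows with every turn.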


## ConversationChain

In [None]:
from langchain.chains import ConversationChain

convo = ConversationChain(llm=llm)
print(convo.prompt.template)

The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
{history}
Human: {input}
AI:



In [None]:
convo.run("Who won the first cricket world cup?")

"The first Cricket World Cup was held in 1975, and it was won by the West Indies. They defeated Australia in the final match, which took place at Lord's in London on June 21, 1975. The West Indies scored 291 runs, and Australia managed to score 274 runs, giving the West Indies a 17-run victory. This win marked the beginning of a period of dominance for the West Indies in international cricket! Would you like to know more about the tournament or the teams involved?"

In [None]:
convo.run("What is 5+5?")

"5 + 5 equals 10! It's a straightforward addition problem, but it's always fun to see how numbers can come together. If you're interested, I can provide more information about math concepts or even help with more complex equations. Just let me know!"

In [None]:
convo.run("Who was the captain of the winning team?")

"The captain of the West Indies during their victory in the first Cricket World Cup in 1975 was Clive Lloyd. He was an influential figure in West Indies cricket and played a crucial role in the tournament, particularly in the final where he scored 102 runs, providing a solid foundation for the team's total. If you're curious about Clive Lloyd's career or other notable players from that era, feel free to ask!"

In [None]:
print(convo.memory.buffer)

Human: Who won the first cricket world cup?
AI: The first Cricket World Cup was held in 1975, and it was won by the West Indies. They defeated Australia in the final match, which took place at Lord's in London on June 21, 1975. The West Indies scored 291 runs, and Australia managed to score 274 runs, giving the West Indies a 17-run victory. This win marked the beginning of a period of dominance for the West Indies in international cricket! Would you like to know more about the tournament or the teams involved?
Human: What is 5+5?
AI: 5 + 5 equals 10! It's a straightforward addition problem, but it's always fun to see how numbers can come together. If you're interested, I can provide more information about math concepts or even help with more complex equations. Just let me know!
Human: Who was the captain of the winning team?
AI: The captain of the West Indies during their victory in the first Cricket World Cup in 1975 was Clive Lloyd. He was an influential figure in West Indies cricket

## ConversationBufferWindowMemory

In [None]:
from langchain.memory import ConversationBufferWindowMemory

convo = ConversationChain(
    llm=llm,
    memory=ConversationBufferWindowMemory(k=1)
)
convo.run("Who won the first cricket world cup?")


"The first Cricket World Cup was held in 1975, and it was won by the West Indies. They defeated Australia in the final, which took place at Lord's in London on June 21, 1975. The West Indies, led by captain Clive Lloyd, scored 360 runs, and Australia could only manage 274 runs in response. This victory marked the beginning of the West Indies' dominance in international cricket during the late 1970s and 1980s. Would you like to know more about the World Cup or cricket history?"

In [None]:
convo.run("what is 5+5?")

"5 + 5 equals 10. It's a simple addition problem! If you have more math questions or need help with something else, feel free to ask!"

In [None]:
convo.run("Who was the captain of the winning team?")

"I don't have specific context about which event or sport you're referring to. Could you provide more details about the winning team you're asking about? That way, I can give you a more accurate answer!"

In [None]:
print(convo.memory.buffer)

Human: Who was the captain of the winning team?
AI: I don't have specific context about which event or sport you're referring to. Could you provide more details about the winning team you're asking about? That way, I can give you a more accurate answer!
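
With `k=1` only the most recent exchange is replayed to the model. When the captain question arrived, the only history available was the 5+5 exchange, so the cricket context was gone. The sketch below imitates that behaviour with a bounded `deque`. `WindowMemorySketch` is a hypothetical stand-in and the AI replies are shortened placeholders, not the real outputs.

```python
from collections import deque
from typing import Deque, Tuple

class WindowMemorySketch:
    """Hypothetical, simplified window memory that keeps the last k exchanges."""

    def __init__(self, k: int) -> None:
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))  # the oldest turn is dropped once k is exceeded

    @property
    def buffer(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

memory = WindowMemorySketch(k=1)
memory.save("Who won the first cricket world cup?", "West Indies, in 1975.")
memory.save("What is 5+5?", "10")

# This is all the history available when the third question is asked.
history_seen_by_third_question = memory.buffer
print(history_seen_by_third_question)

memory.save("Who was the captain of the winning team?", "I need more context.")
print(memory.buffer)
```

A larger `k` trades prompt length for recall: with `k=2` the cricket exchange would still be in the window when the captain question is asked.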


# Document Loaders

In [None]:
!pip install pypdf


In [None]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("/content/yolo7.pdf")
pages = loader.load()
print(pages[0].page_content)

YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object
detectors
Chien-Yao Wang1, Alexey Bochkovskiy, and Hong-Yuan Mark Liao1
1Institute of Information Science, Academia Sinica, Taiwan
kinyiu@iis.sinica.edu.tw, alexeyab84@gmail.com, and liao@iis.sinica.edu.tw
Abstract
YOLOv7 surpasses all known object detectors in both
speed and accuracy in the range from 5 FPS to 160 FPS
and has the highest accuracy 56.8% AP among all known
real-time object detectors with 30 FPS or higher on GPU
V100. YOLOv7-E6 object detector (56 FPS V100, 55.9%
AP) outperforms both transformer-based detector SWIN-
L Cascade-Mask R-CNN (9.2 FPS A100, 53.9% AP) by
509% in speed and 2% in accuracy, and convolutional-
based detector ConvNeXt-XL Cascade-Mask R-CNN (8.6
FPS A100, 55.2% AP) by 551% in speed and 0.7% AP
in accuracy, as well as YOLOv7 outperforms: YOLOR,
YOLOX, Scaled-YOLOv4, YOLOv5, DETR, Deformable
DETR, DINO-5scale-R50, ViT-Adapter-B and many other
object detectors in speed and 

# Multi-DataFrame Agents

In [None]:
!pip install langchain langchain_experimental
!pip install watermark
!pip install openai


In [None]:
import os
import warnings
warnings.filterwarnings("ignore")

In [None]:
from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent
from langchain_openai import ChatOpenAI
import pandas as pd

In [None]:
url = "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
df = pd.read_csv(url)
print(df.shape)
df.head()

(891, 12)


|   | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 | NaN | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 | NaN | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 | NaN | S |


In [None]:
llm = ChatOpenAI(
    model="openai/gpt-4o",
    api_key=userdata.get('GITHUB_TOKEN'),
    base_url="https://models.github.ai/inference"
)

In [None]:
agent = create_pandas_dataframe_agent(llm, df, verbose=True, allow_dangerous_code=True)

In [None]:
agent.run("How many rows are there?")

> Entering new AgentExecutor chain...
Thought: To find the number of rows in the dataframe `df`, I can use the `len()` function or the `.shape` attribute.
Action: python_repl_ast
Action Input: len(df)
891
I now know the final answer.
Final Answer: There are 891 rows in the dataframe `df`.

> Finished chain.


'There are 891 rows in the dataframe `df`.'

In [None]:
agent.run("How many people are older than 23")

> Entering new AgentExecutor chain...
Thought: To answer the question, I should filter the dataframe for rows where the "Age" column is greater than 23 and then count the number of those rows.

Action: python_repl_ast
Action Input: (df['Age'] > 23).sum()
468
Final Answer: 468 people are older than 23 in the dataframe.

> Finished chain.


'468 people are older than 23 in the dataframe.'

In [None]:
df1 = df.copy()

In [None]:
df1["Age"] = df1["Age"].fillna(df1["Age"].mean())

In [None]:
agent = create_pandas_dataframe_agent(llm, [df, df1], verbose=True, allow_dangerous_code=True)

In [None]:
agent.run("How many rows in the Age column are different")

> Entering new AgentExecutor chain...
Thought: To determine how many rows in the Age column are different between `df1` and `df2`, I should compare the Age column in both DataFrames row by row and count the differences.

Action: I will compare the Age column in `df1` and `df2` and calculate how many rows are different.
Action Input: 
```python
(df1['Age'] != df2['Age']).sum()
```
I will compare the Age column in `df1` and `df2` and calculate how many rows are different. is not a valid tool, try one of [python_repl_ast].
I need to use the Python shell tool to execute the command and determine how many rows in the Age column are different between `df1` and `df2`.

Action: python_repl_ast
Action Input: (df1['Age'] != df2['Age']).sum()
177
I now know the final answer.

Final Answer: There are 177 rows in the Age column that are different between `df1` and `df2`.

> Finished chain.


'There are 177 rows in the Age column that are different between `df1` and `df2`.'
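
The agent's answer can be sanity-checked by reasoning about the data: the second DataFrame differs from the first only where a missing `Age` was replaced by the mean, and a missing value never compares equal to a number. The count of differing rows should therefore equal the count of missing ages, which in pandas could be checked with `df["Age"].isna().sum()`. The dependency-free sketch below demonstrates the same reasoning on a toy list; the ages are made-up values, not the Titanic data.

```python
from statistics import mean

# Toy ages; None marks a missing value (the real dataset is not loaded here).
ages = [22.0, None, 26.0, None, 35.0]

# Fill the gaps with the mean of the known values, as the notebook does with fillna.
known = [age for age in ages if age is not None]
fill_value = mean(known)
filled = [fill_value if age is None else age for age in ages]

# A row differs when the original was missing or the two values are unequal.
differing = sum(1 for before, after in zip(ages, filled) if before is None or before != after)
missing = sum(1 for age in ages if age is None)
print(differing, missing)
```

Checking an agent's result against an independent calculation like this is a cheap guard, since the agent writes and runs its own code.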