<a href="https://colab.research.google.com/github/HirunaD/LangChain/blob/main/02_Simple_RAG_using_LangChain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Simple RAG Application using LangChain and OpenAI**

In [1]:
# Install the necessary packages
!pip install -qU langchain langchain-openai langchain-chroma

In [2]:
# Import necessary libraries
import os
from google.colab import userdata

**Initialize OpenAI LLM**

In [4]:
from langchain_openai import ChatOpenAI

# Set OpenAI API key
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')

# Initialize the ChatOpenAI model
llm = ChatOpenAI(
    model="gpt-4.1-nano",
    temperature=0
)

**Initialize Embedding Model**

In [5]:
from langchain_openai import OpenAIEmbeddings
embedding_model = OpenAIEmbeddings(model="text-embedding-3-small")

**Create and Embed Documents**

In [6]:
from langchain_core.documents import Document

# Define a list of documents with content and metadata
documents = [
    Document(
        page_content="The T20 World Cup 2024 is in full swing, bringing excitement and drama to cricket fans worldwide.India's team, captained by Rohit Sharma, is preparing for a crucial match against Ireland, with standout player Jasprit Bumrah expected to play a pivotal role in their campaign.The tournament has already seen controversy, particularly concerning the pitch conditions at Nassau County International Cricket Stadium in New York, which came under fire after a low-scoring game between Sri Lanka and South Africa.",
        metadata={"source": "cricket news"},
    ),
    Document(
        page_content="The world of football is buzzing with excitement as major tournaments and league matches continue to captivate fans globally.In the UEFA Champions League, the semi-final matchups have been set, with defending champions Real Madrid set to face Manchester City, while Bayern Munich will take on Paris Saint-Germain.Both ties promise thrilling encounters, featuring some of the best talents in world football.",
        metadata={"source": "football news"},
    ),
    Document(
        page_content="As election season heats up, the latest developments reveal a highly competitive atmosphere across several key races.The presidential election has seen intense campaigning from all major candidates, with recent polls indicating a tight race.Incumbent President Jane Doe is seeking re-election on a platform of economic stability and healthcare reform, while her main rival, Senator John Smith, focuses on education and climate change initiatives.",
        metadata={"source": "election news"},
    ),
    Document(
        page_content="The AI revolution continues to transform industries and reshape the global economy.Significant advancements in artificial intelligence have led to breakthroughs in healthcare, with AI-driven diagnostics improving patient outcomes and reducing costs.Autonomous systems are becoming increasingly prevalent in logistics and transportation, enhancing efficiency and safety.",
        metadata={"source": "ai revolution news"},
    ),
]

In [7]:
# Create a vector store using the documents and embedding model
from langchain_chroma import Chroma

vectorstore = Chroma.from_documents(
    documents,
    embedding=embedding_model,
)

**Perform Similarity Search**

In [8]:
# Search by raw text: the query is embedded and compared against the stored documents
results = vectorstore.similarity_search("test match")

for result in results:
  print("------------------------")
  print(result.page_content)
  print(result.metadata)

------------------------
The T20 World Cup 2024 is in full swing, bringing excitement and drama to cricket fans worldwide.India's team, captained by Rohit Sharma, is preparing for a crucial match against Ireland, with standout player Jasprit Bumrah expected to play a pivotal role in their campaign.The tournament has already seen controversy, particularly concerning the pitch conditions at Nassau County International Cricket Stadium in New York, which came under fire after a low-scoring game between Sri Lanka and South Africa.
{'source': 'cricket news'}
------------------------
The world of football is buzzing with excitement as major tournaments and league matches continue to captivate fans globally.In the UEFA Champions League, the semi-final matchups have been set, with defending champions Real Madrid set to face Manchester City, while Bayern Munich will take on Paris Saint-Germain.Both ties promise thrilling encounters, featuring some of the best talents in world football.
{'source'

In [9]:
results = vectorstore.similarity_search("machine learning")

for result in results:
  print("------------------------")
  print(result.page_content)
  print(result.metadata)

------------------------
The AI revolution continues to transform industries and reshape the global economy.Significant advancements in artificial intelligence have led to breakthroughs in healthcare, with AI-driven diagnostics improving patient outcomes and reducing costs.Autonomous systems are becoming increasingly prevalent in logistics and transportation, enhancing efficiency and safety.
{'source': 'ai revolution news'}
------------------------
The world of football is buzzing with excitement as major tournaments and league matches continue to captivate fans globally.In the UEFA Champions League, the semi-final matchups have been set, with defending champions Real Madrid set to face Manchester City, while Bayern Munich will take on Paris Saint-Germain.Both ties promise thrilling encounters, featuring some of the best talents in world football.
{'source': 'football news'}
------------------------
As election season heats up, the latest developments reveal a highly competitive atmosp

**Embed Query and Perform Similarity Search by Vector**

In [10]:
# Embed a query using the embedding model
query_embedding = embedding_model.embed_query("machine learning")

query_embedding[:10]

[-0.01218611653894186,
 -0.011387612670660019,
 0.002750069834291935,
 -0.04748116061091423,
 0.033656325191259384,
 0.0044632768258452415,
 0.004385810345411301,
 0.03296508267521858,
 -0.019092576578259468,
 0.02288248762488365]

In [11]:
# Print the length of the query embedding
len(query_embedding)

1536

In [12]:
# Search with the precomputed query vector instead of raw text
results = vectorstore.similarity_search_by_vector(query_embedding)

for result in results:
  print("------------------------")
  print(result.page_content)
  print(result.metadata)

------------------------
The AI revolution continues to transform industries and reshape the global economy.Significant advancements in artificial intelligence have led to breakthroughs in healthcare, with AI-driven diagnostics improving patient outcomes and reducing costs.Autonomous systems are becoming increasingly prevalent in logistics and transportation, enhancing efficiency and safety.
{'source': 'ai revolution news'}
------------------------
The world of football is buzzing with excitement as major tournaments and league matches continue to captivate fans globally.In the UEFA Champions League, the semi-final matchups have been set, with defending champions Real Madrid set to face Manchester City, while Bayern Munich will take on Paris Saint-Germain.Both ties promise thrilling encounters, featuring some of the best talents in world football.
{'source': 'football news'}
------------------------
As election season heats up, the latest developments reveal a highly competitive atmosp
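To make the ranking step less of a black box, here is a minimal sketch of vector similarity search in plain Python. It is not Chroma's implementation: the 3-dimensional `toy_vectors`, the `toy_query`, and the helper names are all hypothetical, and cosine similarity is an assumed scoring function (the metric a real Chroma collection uses depends on its configuration).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings"; the real ones above have 1536 dimensions.
toy_vectors = {
    "ai revolution news": [0.9, 0.1, 0.0],
    "football news": [0.1, 0.9, 0.1],
    "election news": [0.0, 0.2, 0.9],
}
toy_query = [0.8, 0.2, 0.1]

def rank_by_vector(query_vector, vectors, k=4):
    """Return the names of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vector, v), name) for name, v in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

ranking = rank_by_vector(toy_query, toy_vectors)
print(ranking)
```

The toy query points mostly along the first axis, so the "ai revolution news" vector ranks first, mirroring the order seen in the real output above.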

**Create Retriever**

In [13]:
# Create a retriever from the vector store
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 1},
)

# Perform batch retrieval using the retriever
batch_results = retriever.batch(["machine learning", "test match"])

for result in batch_results:
  print("------------------------")
  for doc in result:
    print(doc.page_content)
    print(doc.metadata)

------------------------
The AI revolution continues to transform industries and reshape the global economy.Significant advancements in artificial intelligence have led to breakthroughs in healthcare, with AI-driven diagnostics improving patient outcomes and reducing costs.Autonomous systems are becoming increasingly prevalent in logistics and transportation, enhancing efficiency and safety.
{'source': 'ai revolution news'}
------------------------
The T20 World Cup 2024 is in full swing, bringing excitement and drama to cricket fans worldwide.India's team, captained by Rohit Sharma, is preparing for a crucial match against Ireland, with standout player Jasprit Bumrah expected to play a pivotal role in their campaign.The tournament has already seen controversy, particularly concerning the pitch conditions at Nassau County International Cricket Stadium in New York, which came under fire after a low-scoring game between Sri Lanka and South Africa.
{'source': 'cricket news'}
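The nested loop above is needed because batch retrieval returns one list of documents per query, and `k` caps the length of each inner list. The sketch below imitates that shape with plain functions. Everything in it is a stand-in: `make_toy_retriever`, `toy_batch_retrieve`, and the word-overlap scoring are hypothetical and replace the embedding-based search purely for illustration.

```python
def make_toy_retriever(docs, k=1):
    """Build a function returning the k docs that share the most words with a query."""
    def retrieve(query):
        query_words = set(query.lower().split())

        def overlap(doc):
            # Number of distinct words the doc has in common with the query
            return len(query_words & set(doc.lower().split()))

        return sorted(docs, key=overlap, reverse=True)[:k]
    return retrieve

def toy_batch_retrieve(retrieve, queries):
    """Run the retriever once per query, giving a list of result lists."""
    return [retrieve(query) for query in queries]

toy_docs = [
    "cricket match report from the world cup",
    "machine learning advances in healthcare",
]
toy_retriever = make_toy_retriever(toy_docs, k=1)
toy_batch = toy_batch_retrieve(toy_retriever, ["machine learning", "test match"])

for result in toy_batch:
    print("------------------------")
    for doc in result:
        print(doc)
```

With `k=1`, each inner list holds exactly one document, which is why each query above prints a single match.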


**Create Prompt Template**

In [14]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Define a message template for the chatbot
message = """
Answer this question using the provided context only.

{question}

Context:
{context}
"""

# Create a chat prompt template from the message
prompt = ChatPromptTemplate.from_messages([("human", message)])
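Conceptually, the template's `{question}` and `{context}` placeholders are filled in when the chain runs. A rough analogy using Python's built-in `str.format` is sketched below; `toy_template` and the sample values are assumptions for illustration, and the real `ChatPromptTemplate` produces chat message objects rather than a bare string.

```python
# Same wording as the notebook's template, held in a separate illustrative variable.
toy_template = """
Answer this question using the provided context only.

{question}

Context:
{context}
"""

# Hypothetical values standing in for the user question and the retrieved text.
filled = toy_template.format(
    question="Who captains India?",
    context="India's team, captained by Rohit Sharma, is preparing for a crucial match against Ireland.",
)

print(filled)
```

Both placeholders must be supplied; leaving one out raises a `KeyError` with `str.format`, which is a useful reminder that the chain has to provide a value for each of them.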

**Chain Retriever and Prompt Template with LLM**

In [15]:
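# Build the RAG chain: the retriever fills {context}, the raw input passes through
# as {question}, and the formatted prompt is sent to the LLM
chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm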
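The piped chain can be read as three steps applied in order: build the inputs, format the prompt, call the model. The sketch below spells that data flow out with ordinary functions. It is only an approximation of what the pipe syntax does, and `toy_retrieve`, `toy_build_prompt`, `toy_llm`, and `run_toy_chain` are hypothetical stand-ins, not LangChain APIs.

```python
def toy_retrieve(question):
    """Stand-in for the retriever: always returns one fixed passage."""
    return ["The T20 World Cup 2024 is in full swing."]

def toy_build_prompt(inputs):
    """Stand-in for the prompt template: fill in question and context."""
    return (
        "Answer this question using the provided context only.\n\n"
        f"{inputs['question']}\n\n"
        f"Context:\n{inputs['context']}\n"
    )

def toy_llm(prompt_text):
    """Stand-in for the model: echoes the prompt instead of generating text."""
    return "ECHO: " + prompt_text

def run_toy_chain(question):
    # Step 1: the same input feeds both branches of the dict
    inputs = {"context": toy_retrieve(question), "question": question}
    # Step 2: format the prompt; Step 3: pass it to the model
    return toy_llm(toy_build_prompt(inputs))

toy_response = run_toy_chain("current state of 2024 t20 world cup")
print(toy_response)
```

Note that the context here is a Python list inserted into the prompt as-is. A common refinement is to join the retrieved passages into a single string before formatting.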

In [16]:
response = chain.invoke("current state of 2024 t20 world cup")

print(response.content)

The T20 World Cup 2024 is currently in full swing, with ongoing matches and excitement among cricket fans worldwide. India, captained by Rohit Sharma, is preparing for a crucial upcoming match against Ireland. Jasprit Bumrah is expected to play a key role in India's campaign. The tournament has also experienced some controversy regarding the pitch conditions at Nassau County International Cricket Stadium in New York, which faced criticism after a low-scoring game between Sri Lanka and South Africa.
