<a href="https://colab.research.google.com/github/Minahil-official/Quater-2/blob/main/Project_2_LangChain_RAG_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# **LangChain RAG with Google Gemini Flash and Pinecone**



### Installations

In [19]:
!pip install -qU langchain-pinecone langchain-google-genai

### Configuring the Pinecone API

In [3]:
from IPython.display import Markdown, display
from google.colab import userdata
from pinecone import Pinecone, ServerlessSpec
pinecone_api_key = userdata.get('Pinecon_API')
pc = Pinecone(api_key=pinecone_api_key)

### **Creating Index**

In [5]:
index_name = "rag-project"
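# Delete a previous copy of the index only if one exists;
# an unconditional delete raises an error on the first run.
if index_name in pc.list_indexes().names():
    pc.delete_index(index_name)

# The dimension must match the length of the embedding vectors (768 here).
pc.create_index(
    name=index_name,
    dimension=768,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)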

index = pc.Index(index_name)

### **Creating Embeddings Using an Embedding Model**

In [6]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
import os

GEMINI_KEY = userdata.get("GOOGLE_API_KEY_2")

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001", google_api_key=GEMINI_KEY)

In [7]:
vector = embeddings.embed_query("Building a Rag project! ")

In [8]:
vector[:5]

[0.005886509083211422,
 -0.01920737698674202,
 -0.01310189813375473,
 -0.03790365159511566,
 -0.003551947884261608]

### **Creating Vector Store with Pinecone**

In [9]:
from langchain_pinecone import PineconeVectorStore

vector_store = PineconeVectorStore(index=index, embedding=embeddings)

In [10]:
from langchain_core.documents import Document

document_1 = Document(
    page_content="I have chocolate chip pancake and scrambled eggs for breakfast",
    metadata={"source": "personal", "meal": "breakfast"}
)

document_2 = Document(
    page_content="LangChain is a framework for building applications with LLMs (Large Language Models), such as GPT. It provides tools to manage chains, agents, and memory for building more advanced AI applications.",
    metadata={"topic": "LangChain", "category": "AI Framework"}
)

document_3 = Document(
    page_content="Agentic AI refers to autonomous AI systems that are capable of decision-making, learning, and adaptation in real-world environments without needing constant human intervention.",
    metadata={"topic": "Agentic AI", "category": "Artificial Intelligence", "importance": "High"}
)

document_4 = Document(
    page_content="In the latest world news, a new tech company has developed an innovative AI that is able to solve real-world problems faster than previous models, pushing the boundaries of automation and efficiency.",
    metadata={"topic": "Tech News", "date": "2025-01-02", "company": "Innovative AI Company"}
)

document_5 = Document(
    page_content="The use of AI in healthcare is growing rapidly. Recent advancements in diagnostic AI tools are helping doctors identify diseases more accurately and faster, significantly improving patient outcomes.",
    metadata={"topic": "Healthcare AI", "category": "Medical Technology", "impact": "Positive"}
)

document_6 = Document(
    page_content="LangChain offers a wide range of tools to work with LLMs. This includes support for document search, question-answering systems, and memory management to make intelligent agents more effective in real-world tasks.",
    metadata={"topic": "LangChain", "category": "AI Tools", "use_case": "Advanced workflows"}
)

document_7 = Document(
    page_content="Agentic AI is becoming increasingly important in industries like autonomous vehicles, robotics, and smart cities, where real-time decision-making and adaptability are crucial for success.",
    metadata={"topic": "Agentic AI", "category": "Industry Applications", "industries": ["Autonomous Vehicles", "Robotics", "Smart Cities"]}
)

document_8 = Document(
    page_content="A new world record has been set for the fastest internet speed, with researchers breaking through previous limitations using advanced fiber-optic technology, promising faster and more efficient global communication.",
    metadata={"topic": "Tech News", "category": "Innovation", "record": "Fastest Internet Speed", "date": "2025-01-02"}
)

document_9 = Document(
    page_content="In a recent study, AI-powered chatbots have been shown to outperform human customer service agents in resolving technical issues, reducing wait times and improving customer satisfaction.",
    metadata={"topic": "AI in Customer Service", "category": "Business Technology", "impact": "High"}
)

document_10 = Document(
    page_content="LangChain continues to evolve with new integrations, including support for databases, APIs, and external data sources, enabling more complex and efficient workflows for AI applications.",
    metadata={"topic": "LangChain", "category": "Development", "features": ["Database Integration", "API Support", "External Data Sources"]}
)

documents = [
    document_1, document_2, document_3, document_4, document_5,
    document_6, document_7, document_8, document_9, document_10
]

In [11]:
len(documents)

10

### Adding Documents and Assigning IDs

In [12]:
from uuid import uuid4
uuid4()

UUID('c5cd4623-5fa4-45f7-a0db-85a489b75032')

In [13]:
uuids = [str(uuid4()) for _ in range(len(documents))]
vector_store.add_documents(documents=documents, ids=uuids)

['2641c889-08b7-4c61-8b05-cd31712c4e3c',
 '5f23815c-9479-4dd8-bd70-471ca6b41735',
 '13e31ea9-9102-4816-aec9-44d820150609',
 '7392720b-45ab-4aa7-8fd2-3186276a8a4d',
 'f95dfd7e-b4f0-481a-819a-c95c76202797',
 '826413ab-580a-403d-ad6a-b57cbf5404c2',
 '273e4def-213e-46f3-a39a-6e934195f68b',
 'cded9dde-1617-49bd-8ec8-0595a9e237ab',
 '0c1df380-fba2-4aa5-b880-4e0175d9eedb',
 'd823773a-da6a-40dd-9e3e-fdc679e4ee3a']
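
Because `uuid4()` is random, re-running the cell above stores the same ten documents again under new IDs. One possible alternative, sketched below, derives each ID from the document text with `uuid5`, so that adding the same text again reuses the same ID. The helper name `content_id` and the sample texts are illustrative assumptions, not part of the notebook.

```python
from uuid import NAMESPACE_URL, uuid5


def content_id(text: str) -> str:
    """Return a UUID derived from the text, so equal text yields an equal ID."""
    return str(uuid5(NAMESPACE_URL, text))


# Illustrative sample texts (hypothetical, shortened from the notebook's documents).
texts = [
    "LangChain is a framework for building applications with LLMs.",
    "Agentic AI refers to autonomous AI systems.",
]
ids = [content_id(t) for t in texts]
print(ids)
```

In the notebook, a list built this way from each document's `page_content` could be passed as `ids=` in place of the random `uuids`.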

### Performing Similarity Search

In [14]:
vector_result = vector_store.similarity_search("What is langchain", k=7)
print(vector_result[0].page_content)

LangChain is a framework for building applications with LLMs (Large Language Models), such as GPT. It provides tools to manage chains, agents, and memory for building more advanced AI applications.
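
Since the index was created with `metric='cosine'`, results are ranked by cosine similarity between the query embedding and the stored embeddings. The toy sketch below illustrates that ranking in plain Python. The 3-dimensional vectors and the names `cosine_similarity` and `top_k` are made up for illustration; the real index holds 768-dimensional embeddings, and this is not how Pinecone implements search internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Return the k (name, score) pairs with the highest cosine similarity."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional "embeddings" standing in for stored documents.
toy_vectors = {
    "langchain": [0.9, 0.1, 0.0],
    "breakfast": [0.0, 0.2, 0.9],
    "agentic_ai": [0.6, 0.6, 0.1],
}
query_vector = [1.0, 0.0, 0.0]

ranked = top_k(query_vector, toy_vectors, k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```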


### Generating Contextual Answers with Google Generative AI

### **Using Langchain**

In [15]:
from langchain_google_genai import ChatGoogleGenerativeAI

from langchain_core.prompts import PromptTemplate

In [16]:
def user_answer(question):
  # Retrieve the five documents most similar to the question.
  vector_result = vector_store.similarity_search(question, k=5)

  llm = ChatGoogleGenerativeAI(
      api_key=GEMINI_KEY,
      model="gemini-2.0-flash-exp",
      max_tokens=100,  # Caps the response length, so long answers are cut off.
      temperature=0.7,
  )

  # Declare both placeholders that appear in the template.
  prompt1 = PromptTemplate(
      input_variables=["vector_result", "question"],
      template="Using this data {vector_result} . Answer the following question: \n\n{question}",
  )
  # Pipe the filled-in prompt into the model.
  chain1 = prompt1 | llm

  final_answer = chain1.invoke({"vector_result": vector_result, "question": question})

  return final_answer
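
The function above interpolates the list of retrieved `Document` objects straight into the prompt, so the model receives their string representation rather than clean text. One possible refinement, sketched here under stated assumptions, joins the page contents into a numbered context block first. The `Doc` dataclass is a hypothetical stand-in for `langchain_core.documents.Document` so the example runs on its own, and `format_context` is an illustrative helper, not a LangChain function.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(docs):
    """Join the text of each document into a numbered context block."""
    lines = []
    for number, doc in enumerate(docs, start=1):
        # Fall back to "unknown" when a document has no "topic" metadata.
        topic = doc.metadata.get("topic", "unknown")
        lines.append(f"[{number}] ({topic}) {doc.page_content}")
    return "\n".join(lines)


sample = [
    Doc("LangChain is a framework for building applications with LLMs.", {"topic": "LangChain"}),
    Doc("I have chocolate chip pancake and scrambled eggs for breakfast", {"source": "personal"}),
]
context = format_context(sample)
print(context)
```

The resulting string could be passed as `vector_result` when invoking the chain, which keeps the prompt shorter and free of object internals.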

In [17]:
response = user_answer("What's the latest news?")

In [18]:
display(Markdown(response.content))

Based on the provided data, here's the latest news:

*   **Innovative AI Company Develops Advanced AI:** A new tech company has created an AI that solves real-world problems faster than previous models, advancing automation and efficiency (from document ID `7392720b-45ab-4aa7-8fd2-3186276a8a4d`).
*   **New World Record for Fastest Internet Speed:** Researchers