<a href="https://colab.research.google.com/github/OvaisMemon/LangChain/blob/main/Project_2_Langchain_RAG_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Step 1: Install the LangChain Pinecone and Google GenAI Packages**

In [None]:
%pip install -qU langchain-pinecone langchain-google-genai

**Step 2: Import the Necessary Packages and Set Up the Pinecone API**

In [None]:
from pinecone import Pinecone, ServerlessSpec
from google.colab import userdata

pinecone_api_key = userdata.get('PINECONE_API_KEY')

pc = Pinecone(api_key=pinecone_api_key)

**Step 3: Initialize Pinecone Index**

In [None]:
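import time

index_name = "langchain-test-index-01"

existing_indexes = [index_info["name"] for index_info in pc.list_indexes()]

if index_name not in existing_indexes:
    pc.create_index(
        name=index_name,
        dimension=768,  # must match the embedding model's output size
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
    # Index creation is asynchronous; wait until the index reports ready.
    while not pc.describe_index(index_name).status["ready"]:
        time.sleep(1)

index = pc.Index(index_name)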
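
The index above is created with `metric="cosine"`, so search results are ranked by cosine similarity between the query vector and the stored vectors. As a rough illustration of what that metric computes (this is a plain-Python sketch, not Pinecone's implementation; the function name `cosine_similarity` is chosen here for illustration):

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score close to 1, orthogonal vectors score 0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Because only the angle matters, the length of the vectors does not affect the score.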

**Step 4: Initialize the Google Embedding Model**

In [None]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings

Google_Api_Key = userdata.get('GEMINI_API_KEY')
embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001",
    google_api_key=Google_Api_Key,
)

**[The code below is for testing purposes only]**

In [None]:
vector = embeddings.embed_query("Hello world")
vector[:5]

[0.04703257977962494,
 -0.04019005596637726,
 -0.02902696467936039,
 -0.026809632778167725,
 0.01892058178782463]
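
Besides printing the first few values, it can help to confirm that the embedding length matches the `dimension=768` used when the index was created in Step 3, since a mismatch would make upserts fail. A small sketch of such a check (the helper name `check_dimension` is hypothetical, and the vector below is a stand-in rather than real model output):

```python
def check_dimension(vector, expected_dim=768):
    """Raise if the embedding length differs from the index dimension."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"embedding has {len(vector)} values, index expects {expected_dim}"
        )
    return True


# Stand-in vector; in the notebook this would be the result of embed_query.
fake_vector = [0.0] * 768
print(check_dimension(fake_vector))
```

In the notebook itself, calling `check_dimension(vector)` right after the test cell would perform the same check on the real embedding.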

**Step 5: Set Up the Vector Store**

In [None]:
from langchain_pinecone import PineconeVectorStore

vector_store = PineconeVectorStore(index=index, embedding=embeddings)

**Step 6: Add Items to Vector Store**

In [None]:
from uuid import uuid4

from langchain_core.documents import Document

document_1 = Document(
    page_content="I had chocolate chip pancakes and scrambled eggs for breakfast this morning.",
    metadata={"source": "tweet"},
)

document_2 = Document(
    page_content="The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees.",
    metadata={"source": "news"},
)

document_3 = Document(
    page_content="Building an exciting new project with LangChain - come check it out!",
    metadata={"source": "tweet"},
)

document_4 = Document(
    page_content="Robbers broke into the city bank and stole $1 million in cash.",
    metadata={"source": "news"},
)

document_5 = Document(
    page_content="Wow! That was an amazing movie. I can't wait to see it again.",
    metadata={"source": "tweet"},
)

document_6 = Document(
    page_content="Is the new iPhone worth the price? Read this review to find out.",
    metadata={"source": "website"},
)

document_7 = Document(
    page_content="The top 10 soccer players in the world right now.",
    metadata={"source": "website"},
)

document_8 = Document(
    page_content="LangGraph is the best framework for building stateful, agentic applications!",
    metadata={"source": "tweet"},
)

document_9 = Document(
    page_content="The stock market is down 500 points today due to fears of a recession.",
    metadata={"source": "news"},
)

document_10 = Document(
    page_content="I have a bad feeling I am going to get deleted :(",
    metadata={"source": "tweet"},
)

documents = [
    document_1,
    document_2,
    document_3,
    document_4,
    document_5,
    document_6,
    document_7,
    document_8,
    document_9,
    document_10,
]
uuids = [str(uuid4()) for _ in range(len(documents))]

vector_store.add_documents(documents=documents, ids=uuids)

['8408b887-116d-4799-82df-9f8d3efc6789',
 '6d15f871-31e9-4783-a015-1c6e6c0a2613',
 '54f4c927-119c-4b35-bb80-8172eebd303d',
 '6b8add1e-5924-4139-996d-1732a016b317',
 '1bfa1670-d074-47b6-9cdf-74d4a9c71a31',
 'bec5f13c-a76c-4f64-9ee2-bb582c1258d5',
 'cec66572-b672-46ba-8ec6-bbeb0894774c',
 'd5c9f555-b838-4d2a-814c-26dba4e22ba4',
 '94cb307d-8726-49b7-9efd-b36ba2b49a16',
 '84e5014e-bbea-4932-aa30-f5be9022bcf9']
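
IDs from `uuid4()` are random, so re-running the cell above stores the same ten documents again under new IDs; that is one plausible reason a search could return the same text twice. One possible alternative is to derive each ID from the document content with `uuid5`, so a re-run overwrites the existing records instead of adding copies. The sketch below assumes that approach; the namespace value and the helper name `stable_id` are illustrative choices, not part of the original notebook.

```python
import uuid

# Hypothetical namespace derived from the index name used in this notebook.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "langchain-test-index-01")


def stable_id(page_content: str, source: str) -> str:
    """Return the same UUID string every time for the same content and source."""
    return str(uuid.uuid5(NAMESPACE, f"{source}:{page_content}"))


samples = [
    ("The stock market is down 500 points today due to fears of a recession.", "news"),
    ("The top 10 soccer players in the world right now.", "website"),
]
ids = [stable_id(text, source) for text, source in samples]
print(ids)
```

With the notebook's `documents` list, the IDs could be built as `[stable_id(d.page_content, d.metadata["source"]) for d in documents]` and passed to `add_documents` through the same `ids` argument.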

**Step 7 (a): Perform Similarity Search on the Vector Store**

In [None]:
results = vector_store.similarity_search(
    "Langchain provides abstractions to make working with LLMs easy",
    filter={"source": "news"},
)
for result in results:
    print(f"*{result.page_content} [{result.metadata}]")

*The stock market is down 500 points today due to fears of a recession. [{'source': 'news'}]
*The stock market is down 500 points today due to fears of a recession. [{'source': 'news'}]
*The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees. [{'source': 'news'}]
*The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees. [{'source': 'news'}]


**Step 7 (b): Perform Similarity Search with Score on the Vector Store**

In [None]:
results = vector_store.similarity_search_with_score(
    "Langchain provides abstractions to make working with LLMs easy",
    filter={"source": "news"},
)
for result, score in results:
    # Print the score with six decimal places.
    print(f"* [SIM={score:.6f}] {result.page_content} [{result.metadata}]")

* [SIM=0.408264] The stock market is down 500 points today due to fears of a recession. [{'source': 'news'}]
* [SIM=0.408264] The stock market is down 500 points today due to fears of a recession. [{'source': 'news'}]
* [SIM=0.403576] The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees. [{'source': 'news'}]
* [SIM=0.403576] The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees. [{'source': 'news'}]
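
The output above lists each news item twice with identical scores, which suggests the same text is stored under more than one ID. If cleaning up the index is not an option, the results can be de-duplicated on the client side before use. A minimal sketch follows; `Doc` is a hypothetical stand-in for LangChain's `Document` so the example runs without any services, and `dedupe_by_content` is an illustrative helper name.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (page_content plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def dedupe_by_content(scored_results):
    """Keep the first (doc, score) pair for each distinct page_content."""
    seen = set()
    unique = []
    for doc, score in scored_results:
        if doc.page_content not in seen:
            seen.add(doc.page_content)
            unique.append((doc, score))
    return unique


stock = "The stock market is down 500 points today due to fears of a recession."
weather = "The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees."
sample = [
    (Doc(stock, {"source": "news"}), 0.408264),
    (Doc(stock, {"source": "news"}), 0.408264),
    (Doc(weather, {"source": "news"}), 0.403576),
    (Doc(weather, {"source": "news"}), 0.403576),
]

for doc, score in dedupe_by_content(sample):
    print(f"* [SIM={score:.6f}] {doc.page_content}")
```

The helper only reads `page_content`, so it should work unchanged on the `(Document, score)` pairs returned by `similarity_search_with_score`.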


**Step 8: Initialize the Chat Model**

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm_model = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash-exp",
    api_key=Google_Api_Key,
)

**Step 9: Augment the Model Prompt with the Retrieved Results**

In [None]:
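query = "Why is the stock market down?"
vector_results = vector_store.similarity_search(query)

# Pass only the retrieved text to the model, not the repr of the Document objects.
context = "\n".join(doc.page_content for doc in vector_results)
response = llm_model.invoke(
    f"Answer the user query using only the context below.\n\nContext:\n{context}\n\nQuery: {query}"
)

print(response.content)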

The stock market is down 500 points today due to fears of a recession.
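
For anything beyond a quick test, it may be worth building the augmented prompt in a small function that numbers each retrieved passage, includes its source, and tells the model what to do when nothing relevant was retrieved. The following is one possible sketch, not the notebook author's method; `Doc` is a hypothetical stand-in for LangChain's `Document`, and `build_prompt` and its wording are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (page_content plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_prompt(query, docs):
    """Combine the retrieved passages and the user query into one prompt string."""
    if docs:
        context = "\n".join(
            f"[{i}] {d.page_content} (source: {d.metadata.get('source', 'unknown')})"
            for i, d in enumerate(docs, start=1)
        )
    else:
        context = "(no relevant documents found)"
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


docs = [
    Doc(
        "The stock market is down 500 points today due to fears of a recession.",
        {"source": "news"},
    )
]
print(build_prompt("Why is the stock market down?", docs))
```

In the notebook, the returned string could be passed to `llm_model.invoke(...)` with `vector_results` as the `docs` argument.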

