<a href="https://colab.research.google.com/github/Shehbaz-Niazi/Agentic_AI-PROJECTS/blob/main/RAG_with_Pinecone_for_text_data.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Install Libraries**

In [None]:
%pip install -qU langchain
%pip install -qU langchain-google-genai
%pip install -qU langchain-pinecone

# **Pinecone API and Connection Setup**

In [None]:

from google.colab import userdata
from pinecone import Pinecone, ServerlessSpec
import os

os.environ["GOOGLE_API_KEY"] = userdata.get('GOOGLE_API_KEY')

pinecone_api_key = userdata.get('PINECONE_API_KEY')

pc = Pinecone(api_key=pinecone_api_key)

# **Creating the Pinecone Index**

In [None]:
import time

index_name = "langchain-myself-rag-project"  # change if desired

existing_indexes = [index_info["name"] for index_info in pc.list_indexes()]

if index_name not in existing_indexes:
    pc.create_index(
        name=index_name,
        dimension=768,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
    while not pc.describe_index(index_name).status["ready"]:
        time.sleep(1)

index = pc.Index(index_name)

# **Embeddings from Google Generative AI**

In [None]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings


embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
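
The index was created with `dimension=768`, so every vector written to it needs exactly 768 components. The sketch below shows one way to check that before any upsert. `assert_dimension` is a hypothetical helper, not part of LangChain or Pinecone, and the sample vector is a stand-in so the cell runs offline. In the notebook you would pass the result of `embeddings.embed_query(...)` instead.

```python
def assert_dimension(vector, expected_dim=768):
    """Raise ValueError if the vector length differs from the index dimension."""
    actual = len(vector)
    if actual != expected_dim:
        raise ValueError(
            f"Embedding has {actual} dimensions, but the index expects {expected_dim}."
        )
    return actual


# Stand-in vector (assumption) so the sketch runs without an API call.
# In the notebook: assert_dimension(embeddings.embed_query("any text"))
sample_vector = [0.0] * 768
checked_dim = assert_dimension(sample_vector)
print(checked_dim)
```

Running a check like this once, right after choosing the embedding model, is cheaper than debugging a rejected upsert later.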

# **Creating Documents**

In [None]:
from uuid import uuid4
from langchain_core.documents import Document

# Document 1: Tournament Overview
document_1 = Document(
    page_content="The ICC Champions Trophy 2017 was held in England and Wales from June 1 to June 18, 2017. It featured the top 8 ODI teams in the world competing for the title.",
    metadata={"source": "icc-website"},
)

# Document 2: Winner of the Tournament
document_2 = Document(
    page_content="Pakistan won the ICC Champions Trophy 2017 by defeating India in the final by 180 runs. Fakhar Zaman and Mohammad Amir were the standout performers for Pakistan.",
    metadata={"source": "news"},
)

# Document 3: Key Match - India vs Pakistan (Group Stage)
document_3 = Document(
    page_content="In the group stage match between India and Pakistan, India won by 124 runs via the DLS method. Yuvraj Singh and Virat Kohli played crucial innings for India.",
    metadata={"source": "match-report"},
)

# Document 4: Key Match - Pakistan vs Sri Lanka (Group Stage)
document_4 = Document(
    page_content="Pakistan defeated Sri Lanka by 3 wickets in a thrilling group stage match. Sarfaraz Ahmed led Pakistan to victory with a composed innings.",
    metadata={"source": "match-report"},
)

# Document 5: Key Match - England vs Bangladesh (Group Stage)
document_5 = Document(
    page_content="England defeated Bangladesh by 8 wickets in the group stage. Joe Root scored a century to guide England to an easy victory.",
    metadata={"source": "match-report"},
)

# Document 6: Semi-Final - England vs Pakistan
document_6 = Document(
    page_content="Pakistan defeated England by 8 wickets in the semi-final. Hasan Ali's bowling and Azhar Ali's batting were key to Pakistan's victory.",
    metadata={"source": "match-report"},
)

# Document 7: Semi-Final - India vs Bangladesh
document_7 = Document(
    page_content="India defeated Bangladesh by 9 wickets in the semi-final. Rohit Sharma and Virat Kohli scored half-centuries to secure a comfortable win.",
    metadata={"source": "match-report"},
)

# Document 8: Final - India vs Pakistan
document_8 = Document(
    page_content="In the final of the ICC Champions Trophy 2017, Pakistan defeated India by 180 runs. Fakhar Zaman scored a century, and Mohammad Amir took crucial wickets.",
    metadata={"source": "match-report"},
)

# Document 9: Top Performers of the Tournament
document_9 = Document(
    page_content="The top performers of the ICC Champions Trophy 2017 included Fakhar Zaman (Pakistan) with the bat and Hasan Ali (Pakistan) with the ball. Hasan Ali was named Player of the Tournament.",
    metadata={"source": "icc-website"},
)

# Document 10: Interesting Facts
document_10 = Document(
    page_content="The ICC Champions Trophy 2017 was the last edition of the tournament. It was replaced by the ICC World Test Championship and the ICC ODI Super League.",
    metadata={"source": "news"},
)

# List of Documents
documents = [
    document_1,
    document_2,
    document_3,
    document_4,
    document_5,
    document_6,
    document_7,
    document_8,
    document_9,
    document_10,
]



# **Vector Store Setup**

In [None]:
from langchain_pinecone import PineconeVectorStore

vector_store = PineconeVectorStore(index=index, embedding=embeddings)

# **Adding Documents in Vector Store**

In [None]:
uuids = [str(uuid4()) for _ in range(len(documents))]
vector_store.add_documents(documents=documents, ids=uuids)
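
Because `uuid4()` returns new random IDs on every call, re-running the cell above would most likely store the same ten documents again under different IDs, and later searches could return the same text more than once. One possible alternative, sketched here, derives each ID from the document itself so that a re-run overwrites the existing vector. `stable_id` is a hypothetical helper. Hashing the source together with the text is an assumption about what should count as "the same document".

```python
from uuid import NAMESPACE_URL, uuid5


def stable_id(page_content, source):
    """Return the same UUID string for the same (source, text) pair."""
    return str(uuid5(NAMESPACE_URL, f"{source}:{page_content}"))


# The same input always maps to the same ID; a different source changes it.
first = stable_id("Pakistan won the ICC Champions Trophy 2017.", "news")
second = stable_id("Pakistan won the ICC Champions Trophy 2017.", "news")
other = stable_id("Pakistan won the ICC Champions Trophy 2017.", "match-report")
print(first == second, first == other)
```

In the notebook this could replace the `uuid4` list with `[stable_id(d.page_content, d.metadata["source"]) for d in documents]`.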

# **Similarity Search (Basic)**

In [None]:
results = vector_store.similarity_search(
    "Give Some report of ICC Champion Trophy 2025",
    k=2,
    filter={"source": "icc-website"},
)
for res in results:
    print(f"* {res.page_content} [{res.metadata}]")

* The ICC Champions Trophy 2017 was held in England and Wales from June 1 to June 18, 2017. It featured the top 8 ODI teams in the world competing for the title. [{'source': 'icc-website'}]
* The ICC Champions Trophy 2017 was held in England and Wales from June 1 to June 18, 2017. It featured the top 8 ODI teams in the world competing for the title. [{'source': 'icc-website'}]
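
The `filter` argument is passed to Pinecone as a metadata filter. A plain `{"source": "icc-website"}` is shorthand for an equality test, and Pinecone's filter syntax also offers operators such as `$eq` and `$in`. The sketch below builds either form. `source_filter` is a hypothetical helper, and the `source` key simply mirrors the metadata used in this notebook.

```python
def source_filter(*sources):
    """Build a Pinecone-style metadata filter for one or more source labels."""
    if not sources:
        raise ValueError("At least one source is required.")
    if len(sources) == 1:
        # Explicit form of {"source": "<value>"}.
        return {"source": {"$eq": sources[0]}}
    # Match documents whose source is any of the given labels.
    return {"source": {"$in": list(sources)}}


single = source_filter("icc-website")
several = source_filter("icc-website", "news")
print(single)
print(several)
```

The returned dictionary can be passed as `filter=` to `similarity_search` to search across more than one source at once.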


# **Similarity Search (with Score)**

In [None]:
results = vector_store.similarity_search_with_score(
    "How many Run to Win Pakistan from india",
    k=1,
    filter={"source": "match-report"}
)
for res, score in results:
    print(f"* [SIM={score:.3f}] {res.page_content} [{res.metadata}]")

* [SIM=0.693] In the group stage match between India and Pakistan, India won by 124 runs via the DLS method. Yuvraj Singh and Virat Kohli played crucial innings for India. [{'source': 'match-report'}]
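
The index uses `metric="cosine"`, so the score of about 0.693 is a cosine similarity between the query embedding and the document embedding, where higher means more similar. As a small illustration of what that number measures, here is a pure-Python version on toy two-dimensional vectors. The vectors are made up for the example and are not real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector.")
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score close to 1; perpendicular ones score 0.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(round(same_direction, 6), orthogonal)
```

Only the direction of the vectors matters here, not their length, which is why scaling `[1.0, 2.0]` to `[2.0, 4.0]` leaves the similarity unchanged.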


# **LangChain LLM Connection with Gemini**

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # other params...
)

# **Final Answer Setup**

In [None]:
def answer_to_user(query: str):
    """Retrieve the two most similar documents and ask the LLM to answer from them."""
    # Vector search: fetch the top-2 matches for the query.
    vector_results = vector_store.similarity_search(query, k=2)
    print(len(vector_results))

    # Pass the retrieved documents and the user query to the model.
    final_answer = llm.invoke(
        f"Answer this user query: {query}. Here are some references to answer: {vector_results}"
    )

    return final_answer
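
The function above interpolates the raw list of retrieved documents into the prompt, so the model sees their Python representation. A possible refinement, sketched below, joins only the text and source of each document into a numbered reference list. `build_prompt` and `RetrievedDoc` are hypothetical names. `RetrievedDoc` is a stand-in with the same `page_content` and `metadata` attributes as a LangChain `Document`, used only so the example runs without a vector store.

```python
from dataclasses import dataclass, field


@dataclass
class RetrievedDoc:
    """Hypothetical stand-in exposing the two Document attributes used here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_prompt(query, docs):
    """Format the query and retrieved documents into a single prompt string."""
    context = "\n".join(
        f"[{i}] ({doc.metadata.get('source', 'unknown')}) {doc.page_content}"
        for i, doc in enumerate(docs, start=1)
    )
    return (
        "Answer the user query using only the references below.\n"
        f"Query: {query}\n"
        f"References:\n{context}"
    )


docs = [
    RetrievedDoc(
        "Pakistan defeated India by 180 runs in the final.",
        {"source": "match-report"},
    )
]
prompt = build_prompt("Who won the final?", docs)
print(prompt)
```

Inside `answer_to_user`, the resulting string could be passed to `llm.invoke` in place of the inline f-string.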

In [None]:
answer = answer_to_user("Tell me about Fakhar zaman icc champion trophy 2017")
print(answer.content)

2
In the ICC Champions Trophy 2017, Fakhar Zaman was one of Pakistan's top performers with the bat.  While Hasan Ali won the Player of the Tournament award,  Fakhar Zaman's batting contributions were significant to Pakistan's success in the tournament.  The provided text doesn't give specifics on his performance, only that he was a top performer.
