# RAG System with Pinecone and Google Gemini

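This notebook implements a Retrieval-Augmented Generation (RAG) system using:
- Pinecone as the vector database with managed embeddings
- Google Gemini as the LLM via the Google AI Studio API (the current `rag_query` cell calls OpenAI's `gpt-4o` instead and keeps the Gemini call commented out)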

## Install Required Libraries

In [11]:
!pip install pinecone google-generativeai



In [159]:
# rag_query below uses openai.OpenAI and the Responses API, which need the 1.x SDK
# (the notebook later reports openai 1.70.0), so do not pin the legacy 0.28 release.
!pip install --upgrade openai


## Import Libraries

In [160]:
import os
import pinecone
import google.generativeai as genai
import textwrap
import uuid
import json
# from sentence_transformers import SentenceTransformer
import openai

In [78]:
# Note: `genai.Client(...)` is part of the newer `google-genai` SDK (`from google import genai`).
# The `google.generativeai` package imported above has no `Client`; it is configured
# with `genai.configure(api_key=...)` in the next section.

## Set API Keys

You'll need API keys for Pinecone, Google AI Studio, and OpenAI.

In [144]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
OPENAI_API_KEY = user_secrets.get_secret("OPENAI_API_KEY")

openai.api_key = OPENAI_API_KEY

In [8]:
# Load the remaining keys from Kaggle Secrets instead of hard-coding them.
# The secret names below are assumptions; use whatever names you stored.
PINECONE_API_KEY = user_secrets.get_secret("PINECONE_API_KEY")
GOOGLE_API_KEY = user_secrets.get_secret("GOOGLE_API_KEY")

# Initialize the Pinecone client
pc = pinecone.Pinecone(api_key=PINECONE_API_KEY)

# Initialize Google Gemini
genai.configure(api_key=GOOGLE_API_KEY)
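
Outside Kaggle, the same keys can come from environment variables. A minimal sketch, assuming the variable names `PINECONE_API_KEY` and `GOOGLE_API_KEY`; the helper `require_env` is hypothetical and not part of either SDK:

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value

# Usage (assumed variable names):
# PINECONE_API_KEY = require_env("PINECONE_API_KEY")
# GOOGLE_API_KEY = require_env("GOOGLE_API_KEY")
```

Failing early with a named variable tends to be easier to debug than an authentication error raised later by the client.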

## Create or Connect to Pinecone Index

In [56]:
# Define index name
index_name = "gemini-rag-embeds"

# Check if the index already exists
if not pc.has_index(index_name):
    # Create a new index using create_index_for_model with a supported model
    pc.create_index_for_model(
        name=index_name,
        cloud="aws",
        region="us-east-1",
        embed={
            "model": "llama-text-embed-v2",
            "field_map": {"text": "text"}
        }
    )
    print(f"Created new Pinecone index: {index_name}")
else:
    print(f"Using existing Pinecone index: {index_name}")

# Connect to the index
index = pc.Index(index_name)

Created new Pinecone index: gemini-rag-embeds
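
A freshly created index may take a few seconds before it accepts writes. A hedged sketch of a readiness poll, assuming `pc.describe_index(name).status["ready"]` reports readiness as it does in Pinecone's Python SDK examples; `wait_until_ready` is a hypothetical helper:

```python
import time

def wait_until_ready(pc, name, timeout=60.0, poll_interval=1.0):
    """Poll the index description until it reports ready or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if pc.describe_index(name).status["ready"]:
            return True
        time.sleep(poll_interval)
    return False

# Usage:
# if not wait_until_ready(pc, index_name):
#     raise TimeoutError(f"Index {index_name} did not become ready in time")
```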


## Document Processing Functions

In [192]:


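def split_text(text, chunk_size=700):
    """
    Split text into chunks of roughly `chunk_size` characters, breaking only at line boundaries.
    """
    # Treat every non-blank line as a paragraph
    paragraphs = [p for p in text.split('\n') if p.strip()]
    chunks = []

    chunk = ''
    for para in paragraphs:
        # Start a new chunk when this paragraph would overflow the current one.
        # The `chunk` guard avoids emitting an empty chunk for an oversized first paragraph.
        if chunk and len(chunk) + len(para) > chunk_size:
            chunks.append(chunk.strip())
            chunk = ''
        chunk += para + '\n'

    if chunk:
        chunks.append(chunk.strip())

    return chunks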

def process_document(document, metadata=None, namespace="ns1", id_prefix="chunk"):
    """
    Split a document into chunks and upsert them into Pinecone, which embeds the
    `text` field server-side (managed embeddings).

    Records are named `{id_prefix}_{i}`; pass a distinct `id_prefix` per document,
    otherwise a later document overwrites the earlier one's records.
    """
    metadata = metadata or {}

    # Split document into chunks
    chunks = split_text(document)
    print(f"Document split into {len(chunks)} chunks")

    # Prepare records for Pinecone with managed embeddings
    records = []
    for i, chunk in enumerate(chunks):
        record = {
            "_id": f"{id_prefix}_{i}",
            "text": chunk,  # This matches the field_map in your index
            "category": metadata.get("source", "unknown"),
            "author": metadata.get("author", "unknown")
        }
        records.append(record)

    # Upload all chunks in a single request
    index.upsert_records(namespace, records)
    print(f"Uploaded {len(records)} chunks to Pinecone")
    return len(records)
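
For larger documents it may be safer to upload in batches, since Pinecone limits how many records a single `upsert_records` call accepts. A minimal sketch; the batch size of 96 is an assumption to verify against the current Pinecone limits, and `batched` and `upsert_in_batches` are hypothetical helpers:

```python
def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

def upsert_in_batches(index, namespace, records, batch_size=96):
    """Upload records batch by batch and return the number of requests made."""
    requests = 0
    for batch in batched(records, batch_size):
        index.upsert_records(namespace, batch)  # same call process_document uses
        requests += 1
    return requests

# Usage:
# upsert_in_batches(index, "ns1", records)
```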

## Add Sample Documents

Let's add some sample documents to our vector database. You can replace this with your own documents.

In [193]:
# Sample document - replace with your own content
sample_document = """
About Co-Ventech

Co-Ventech is a leading software development company specializing in SaaS-based solutions 
that transform business visions into digital realities. Since our inception in 2019, we've partnered 
with over 50 global clients, completing more than 200 projects with a 95% client retention rate. 
Our mission is to empower businesses by integrating advanced SaaS platforms, fortified 
cybersecurity measures, and seamless software quality assurance practices.

Our Products

1. Recruitinn – AI-Powered Recruitment Platform

Recruitinn revolutionizes the hiring process with AI-driven solutions. It enables businesses to 
discover, assess, and onboard top talent efficiently. Key features include:

● AI-based candidate screening
● Automated interview scheduling
● Real-time analytics and reporting
● Seamless integration with existing HR systems

Website: app.recruitinn.ai

2. SkillBuilder – Comprehensive Learning & Career Development Platform

SkillBuilder offers a holistic approach to professional development, providing:

● Curated courses across various domains
● Recorded sessions for flexible learning
● Live sessions with industry experts
● AI-based career counseling to guide career paths

Website: app.skillbuilder.online

3. Co-Vental – AI-Driven Staff Augmentation Platform

Co-Vental streamlines the process of connecting businesses with top-tier freelancers through:

● AI-based initial interviews to assess technical skills
● Subsequent human interviews for comprehensive evaluation
● Inclusion in a vetted talent pool
● Client interviews to ensure the perfect match

https://app.co-ventech.com/login
https://app.recruitinn.ai/
https://app.skillbuilder.online

Our Services

Software Development

We craft custom software solutions tailored to your business needs, including:

● Custom Software Development
● App Development
● Enterprise Application Solutions
● Cloud-based Software
● Digital Transformation Services

QA & Test Automation

Ensuring the quality and reliability of your software through:

● Functional Testing
● Automated Testing
● Security Testing
● Performance Testing
● Continuous Testing in CI/CD pipelines

UI/UX Designing

Creating intuitive and engaging user experiences with services like:

● UX Design
● User Interface Design
● Prototyping
● Responsive Design
● User-Centered Design

DevOps

Optimizing your development processes through:

● Process Automation
● CI/CD Pipeline Implementation
● Cloud DevOps Solutions
● Serverless Architecture
● Scalable Infrastructure

Cybersecurity

Protecting your digital assets with services such as:

● Threat Detection
● Penetration Testing
● Vulnerability Assessment
● Security Audits
● Incident Response
"""

# Process and add the document to Pinecone
document_metadata = {"source": "AI Introduction", "author": "Sample"}
num_vectors = process_document(sample_document, document_metadata)

Document split into 5 chunks
Uploaded 5 chunks to Pinecone
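
The chunk count follows from how `split_text` packs lines: it starts a new chunk whenever adding the next line would exceed `chunk_size` characters. A toy illustration with a small `chunk_size`, using `pack_lines`, a hypothetical standalone re-implementation of the same greedy idea (not the notebook's function itself):

```python
def pack_lines(text, chunk_size):
    """Greedily pack non-blank lines into chunks of roughly chunk_size characters."""
    chunks, current = [], ""
    for line in text.split("\n"):
        if not line.strip():
            continue  # blank lines only separate paragraphs
        if current and len(current) + len(line) > chunk_size:
            chunks.append(current.strip())
            current = ""
        current += line + "\n"
    if current:
        chunks.append(current.strip())
    return chunks

demo = "alpha\n\nbravo\ncharlie\n\ndelta"
print(pack_lines(demo, chunk_size=12))
# ['alpha\nbravo', 'charlie', 'delta']
```

Because a chunk closes only when the next line would overflow it, chunks can end anywhere in a section; a line longer than `chunk_size` simply becomes its own oversized chunk.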


In [194]:
print(index.describe_index_stats())

{'dimension': 1024,
 'index_fullness': 0.0,
 'metric': 'cosine',
 'namespaces': {'ns1': {'vector_count': 5}},
 'total_vector_count': 5,
 'vector_type': 'dense'}
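
Upserted records may not be visible immediately, so the count above can lag briefly after an upload. A hedged sketch of a poll; `get_count` is a hypothetical callable you supply (for example, one that reads the `ns1` vector count from `index.describe_index_stats()`), so the helper makes no assumption about the SDK's response shape:

```python
import time

def wait_for_count(get_count, expected, timeout=30.0, poll_interval=1.0):
    """Poll get_count() until it reaches `expected`; return the last value seen."""
    deadline = time.monotonic() + timeout
    count = get_count()
    while count < expected and time.monotonic() < deadline:
        time.sleep(poll_interval)
        count = get_count()
    return count
```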


## RAG Query Function

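Now let's create a function to perform RAG queries: Pinecone retrieves the most relevant chunks, and an LLM generates the answer from them. The Gemini call is kept as a commented-out alternative; the active code uses OpenAI's Responses API.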

In [167]:
import openai
print(f"OpenAI version: {openai.__version__}")


OpenAI version: 1.70.0


In [259]:
import openai

def rag_query(query, top_k=3):
    """
    Retrieve the top_k most relevant chunks from Pinecone (managed embeddings)
    and generate an answer from them with an LLM.
    """
    # Pinecone embeds the query text server-side and returns the nearest records
    results = index.search(
        namespace="ns1",
        query={
            "top_k": top_k,  # honor the caller's top_k
            "inputs": {"text": query}
        }
    )

    # Extract relevant contexts from the results
    contexts = []
    for hit in results["result"]["hits"]:
        if "text" in hit["fields"]:
            contexts.append(hit["fields"]["text"])

    # If no contexts were found, return a message
    if not contexts:
        return "No relevant information found in the database."

    # Combine contexts
    combined_context = "\n\n".join(contexts)

    # Create the prompt for the LLM
    prompt = f"""
    Based on the following information, please answer the question.
    
    Information:
    {combined_context}
    
    Question: {query}
    
    Answer:
    """

    # Gemini alternative (disabled):
    # model = genai.GenerativeModel('gemini-2.0-flash')
    # generation_config = {"temperature": 0.2, "top_p": 0.5, "top_k": 40, "max_output_tokens": 1000}
    # response = model.generate_content(prompt, generation_config=generation_config)
    # return response.text

    # Generate the response with OpenAI's Responses API (requires the openai 1.x SDK)
    client = openai.OpenAI(api_key=OPENAI_API_KEY)
    response = client.responses.create(
        model="gpt-4o",
        instructions=(
            "You are an assistant for Co-Ventech. Skip introductory phrases and answer directly. "
            "Answer only from the information provided. If it does not cover the question, "
            "politely decline and explain that you are designed to answer questions about Co-Ventech."
        ),
        input=prompt,
        temperature=0.2,
        top_p=0.95,
        max_output_tokens=1000
    )

    # output_text concatenates the text parts of the response
    return response.output_text
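
Since the Gemini call is commented out in `rag_query`, here is a hedged sketch of that path, based on those commented lines. It assumes the `google-generativeai` package, a key already passed to `genai.configure`, and the model name `gemini-2.0-flash` taken from the commented code; `build_prompt` and `answer_with_gemini` are hypothetical names:

```python
def build_prompt(contexts, query):
    """Assemble the RAG prompt from retrieved passages and the user question."""
    information = "\n\n".join(contexts)
    return (
        "Based on the following information, please answer the question.\n\n"
        f"Information:\n{information}\n\n"
        f"Question: {query}\n\n"
        "Answer:"
    )

def answer_with_gemini(contexts, query, model_name="gemini-2.0-flash"):
    """Generate an answer with Gemini; call genai.configure(api_key=...) first."""
    import google.generativeai as genai  # lazy import so build_prompt works alone

    model = genai.GenerativeModel(model_name)
    response = model.generate_content(
        build_prompt(contexts, query),
        generation_config={"temperature": 0.2, "max_output_tokens": 1000},
    )
    return response.text
```

Building the prompt by joining strings, rather than with an indented triple-quoted f-string, keeps stray leading spaces out of the text sent to the model.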

## Test the RAG System

Let's test our RAG system with some sample queries.

In [260]:
# Test query
from IPython.display import Markdown
test_query = "Do you know about Microsoft?"

response = rag_query(test_query, top_k=5)

print("Query:", test_query)
print("\nResponse:")
Markdown(response)

Query: Do you know about Microsoft?

Response:


No, I don't have information about Microsoft. My focus is on providing details related to Co-Ventech and its offerings.

In [229]:
reranked_results = index.search(
    namespace="ns1",
    query={
        "top_k": 6,
        "inputs": {
            'text': test_query
        }
    },
    rerank={
        "model": "bge-reranker-v2-m3",
        "top_n": 6,
        "rank_fields": ["text"]
    },
    fields=["category", "text"]
)

print(reranked_results)

{'result': {'hits': [{'_id': 'chunk_0',
                      '_score': 0.8966140151023865,
                      'fields': {'category': 'AI Introduction',
                                 'text': 'About Co-Ventech\n'
                                         'Co-Ventech is a leading software '
                                         'development company specializing in '
                                         'SaaS-based solutions \n'
                                         'that transform business visions into '
                                         'digital realities. Since our '
                                         "inception in 2019, we've partnered \n"
                                         'with over 50 global clients, '
                                         'completing more than 200 projects '
                                         'with a 95% client retention rate. \n'
                                         'Our mission is to empower businesses '
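
The reranked response uses the `result`, `hits`, `fields` layout visible in the output above, so passages can be filtered by score before they reach the prompt. A sketch; the 0.5 threshold is an arbitrary assumption and `select_contexts` is a hypothetical helper:

```python
def select_contexts(results, min_score=0.5, field="text"):
    """Return the `field` value of every hit scoring at least `min_score`."""
    contexts = []
    for hit in results["result"]["hits"]:
        fields = hit["fields"]
        if field in fields and hit["_score"] >= min_score:
            contexts.append(fields[field])
    return contexts

# Usage:
# contexts = select_contexts(reranked_results, min_score=0.5)
```

A score cutoff is one way to keep weakly related chunks out of the prompt for off-topic questions; a suitable value depends on the reranker and would need tuning on real queries.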
           

In [225]:
# Try another query
test_query = "What are the main applications of NLP?"
response = rag_query(test_query)

print("Query:", test_query)
print("\nResponse:")
print(textwrap.fill(response, width=100))

Query: What are the main applications of NLP?

Response:
Based on the provided information, the main applications of NLP (Natural Language Processing) are:
*   **AI-based candidate screening:** (Recruitinn) *   **AI-based career counseling:**
(SkillBuilder) *   **AI-based initial interviews to assess technical skills:** (Co-Vental)


## Add Your Own Documents

You can add your own documents to the system using the `process_document` function.

In [None]:
# Example of adding your own document
# my_document = """Your document text here..."""
# my_metadata = {"source": "My Document", "author": "Your Name"}
# process_document(my_document, my_metadata)
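
Because `process_document` names records `chunk_0`, `chunk_1`, and so on by default, a second document uploaded to the same namespace would overwrite the first one's records. One way around this, sketched under the assumption that each document has a stable name, is to derive an ID prefix from that name with `uuid.uuid5`; `chunk_ids` is a hypothetical helper:

```python
import uuid

def chunk_ids(document_name, n_chunks):
    """Build deterministic, document-scoped record IDs for n_chunks chunks."""
    doc_id = uuid.uuid5(uuid.NAMESPACE_URL, document_name).hex[:12]
    return [f"{doc_id}_chunk_{i}" for i in range(n_chunks)]

# Usage:
# ids = chunk_ids("My Document", len(chunks))
```

Deterministic IDs mean re-uploading the same document replaces its own records instead of creating duplicates.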

## Clean Up (Optional)

Delete the Pinecone index when you're done if you no longer need it.

In [None]:
# Uncomment to delete the index
# pc.delete_index(index_name)