# Task 1
#### Develop a Retrieval-Augmented Generation (RAG) model for a business Question Answering (QA) bot. Use a vector database such as Pinecone and a generative model such as the Cohere API (or any other available alternative). The QA bot should retrieve relevant information from a dataset and generate coherent answers.
Task Requirements:
1. Implement a RAG-based model that can handle questions related to a provided document or dataset.
2. Use a vector database (such as Pinecone) to store and retrieve document embeddings efficiently.
3. Test the model with several queries and show how well it retrieves relevant passages and generates accurate answers from the document.

In [None]:
# Setup and Requirements
!pip install pinecone-client cohere transformers torch

In [None]:
# Initialize pinecone
import os
from pinecone import Pinecone, ServerlessSpec
pc = Pinecone(api_key="axxxxxxxxxxxxxxxxxxx5")
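
Hardcoding API keys in a notebook makes them easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the keys are exported as environment variables; the helper name `require_env` and the variable names `PINECONE_API_KEY` and `COHERE_API_KEY` are illustrative choices, not something this notebook defines:

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running this cell.")
    return value

# Hypothetical usage (the variable names are assumptions):
# pc = Pinecone(api_key=require_env("PINECONE_API_KEY"))
# cohere_client = cohere.Client(api_key=require_env("COHERE_API_KEY"))
```

Failing early with a named variable tends to be easier to debug than an authentication error raised later by the client.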

In [None]:
# Create index
if 'rag' not in pc.list_indexes().names():
    pc.create_index(
        name='rag',
        dimension=384,
        metric='cosine',
        spec=ServerlessSpec(
            cloud='aws',
            region='us-east-1'
        )
    )

In [None]:
# Vector Database Setup
import pinecone
import cohere
from transformers import AutoTokenizer, AutoModel
import torch

index = pc.Index('rag')

# Initialize Cohere for text generation (alternatively, GPT-3/4 API can be used)
cohere_client = cohere.Client(api_key="bxxxxxxxxxxxxxxxxxxx")

# Load a pre-trained embedding model from Hugging Face (e.g., sentence-transformers)
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

In [None]:
import numpy as np

def generate_embeddings(texts):
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        model_output = model(**inputs)

    # Mean pooling over real tokens only: weight by the attention mask so that
    # padding positions do not dilute the average when texts are batched
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    summed = (model_output.last_hidden_state * mask).sum(dim=1)
    embeddings = summed / mask.sum(dim=1).clamp(min=1e-9)

    # Convert to a float32 NumPy array
    embeddings_np = embeddings.numpy().astype(np.float32)

    # L2 normalization (each row ends up with norm 1)
    norms = np.linalg.norm(embeddings_np, axis=1, keepdims=True)
    normalized_embeddings = embeddings_np / norms

    return normalized_embeddings
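
When several texts of different lengths are embedded in one batch, the tokenizer pads the shorter ones, and a plain mean over all positions would average in the padding vectors. A small pure-Python sketch of mask-weighted mean pooling followed by L2 normalization; the function names and toy vectors are made up for illustration, and the notebook itself works on PyTorch tensors:

```python
import math

def masked_mean_pool(token_vectors, attention_mask):
    """Average only the token vectors whose mask entry is 1."""
    dim = len(token_vectors[0])
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]

# Two real tokens followed by one padding position (toy values).
tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = masked_mean_pool(tokens, mask)  # the padding row is ignored
unit = l2_normalize(pooled)
print(pooled, unit)
```

With unit-length vectors, cosine similarity reduces to a dot product, which matches the `cosine` metric chosen for the index.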


In [None]:
# Example document segments
document_segments = [
    "IBM company offers cloud-based solutions.",
    "IBM specialize in artificial intelligence and machine learning.",
    "IBM company mission is to drive innovation in technology."
]

In [None]:
# Insert documents into Pinecone with their embeddings
for i, segment in enumerate(document_segments):
    embedding = generate_embeddings([segment])[0].tolist()
    index.upsert([(f"doc_{i}", embedding, {"text": segment})])  # Store the embedding in Pinecone
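
Upserting one vector per request is fine for three segments but gets slow for larger document sets. A sketch of batching, assuming records are dictionaries with `id`, `values`, and `metadata` keys; the helper names `build_records` and `chunked` and the batch size of 100 are illustrative assumptions:

```python
from itertools import islice

def build_records(segments, embeddings, prefix="doc"):
    """Pair each text segment with its embedding in an id/values/metadata record."""
    if len(segments) != len(embeddings):
        raise ValueError("segments and embeddings must have the same length")
    return [
        {
            "id": f"{prefix}_{i}",
            "values": [float(x) for x in vec],  # coerce to plain Python floats
            "metadata": {"text": text},
        }
        for i, (text, vec) in enumerate(zip(segments, embeddings))
    ]

def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch

# Hypothetical usage with this notebook's objects:
# records = build_records(document_segments, generate_embeddings(document_segments))
# for batch in chunked(records, 100):
#     index.upsert(vectors=batch)
```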

In [None]:
def retrieve_relevant_docs(query, top_k=3):
    print("Inside retrieve_relevant_docs function...")

    # Embed the query with the same model and normalization used for the documents
    query_embedding = generate_embeddings([query])[0].tolist()

    # Pass the vector as a flat list of floats and request the stored metadata
    results = index.query(vector=query_embedding, top_k=top_k, include_metadata=True)
    relevant_docs = []
    if 'matches' in results and results['matches']:
        for match in results['matches']:
            # The segment text was stored under the "text" metadata key at upsert time,
            # so there is no need to parse the ID and index into document_segments
            relevant_docs.append(match['metadata']['text'])
    else:
        print("No matches found in the query results.")

    return relevant_docs
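
Conceptually, the index query ranks stored vectors by cosine similarity to the query vector and returns the `top_k` best. A brute-force sketch of that ranking in plain Python, for illustration only; a managed vector database will typically not scan every vector like this, and the function names and toy vectors here are made up:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_matches(query_vec, stored, k=3):
    """Return up to k (id, score) pairs, highest similarity first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 2-dimensional "embeddings" keyed by document ID.
stored = {"doc_0": [1.0, 0.0], "doc_1": [0.0, 1.0], "doc_2": [1.0, 1.0]}
matches = top_k_matches([1.0, 0.2], stored, k=2)
print(matches)
```

Because this notebook's index holds only three segments and `top_k=3`, every query returns all three, so only the ranking order carries information.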


In [None]:
# Generate the answer using the relevant documents (text)
def generate_answer(query, relevant_docs):
    context = "\n".join(relevant_docs)
    prompt = f"Question: {query}\n\nContext:\n{context}\n\nAnswer:"

    response = cohere_client.generate(
        model="command-nightly",
        prompt=prompt,
        max_tokens=700,
        temperature=0.5
    )
    return response.generations[0].text.strip()
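
The prompt is where retrieval and generation meet, so its wording affects how closely the answer sticks to the retrieved text. A sketch of one possible prompt builder that numbers the passages, tells the model to rely only on them, and signals when there is nothing to ground on; the function name `build_prompt` and the instruction wording are assumptions, not part of the Cohere API:

```python
def build_prompt(query, relevant_docs):
    """Assemble a grounded prompt; return None when there is no usable context."""
    docs = [doc.strip() for doc in relevant_docs if doc and doc.strip()]
    if not docs:
        return None
    context = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(docs, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n\nAnswer:"
    )

prompt = build_prompt(
    "What does IBM offer?",
    ["IBM company offers cloud-based solutions."],
)
print(prompt)
```

Returning `None` for an empty context lets the caller decide whether to skip the generation call instead of asking the model to answer with no supporting text.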

In [None]:
# QA Bot Function
def qa_bot(query):
    # Retrieve relevant documents based on the query
    relevant_docs = retrieve_relevant_docs(query)

    # Generate a coherent answer using Cohere
    answer = generate_answer(query, relevant_docs)
    return answer, relevant_docs

In [None]:
# Example usage
query = "What is IBM company mission?"
answer, relevant_docs = qa_bot(query)
print("Answer:", answer)
# Check relevant documents for last question
print('\nRelevant documents :\n' , relevant_docs)

Inside retrieve_relevant_docs function...
Answer: IBM's mission is to drive innovation in technology, with a focus on artificial intelligence, machine learning, and cloud-based solutions.

Relevant documents :
 ['IBM company mission is to drive innovation in technology.', 'IBM specialize in artificial intelligence and machine learning.', 'IBM company offers cloud-based solutions.']


In [None]:
# Example usage
query = "What IBM company offers?"
answer, relevant_docs = qa_bot(query)
print("Answer:", answer)
# Check relevant documents for last question
print('\nRelevant documents :\n', relevant_docs)

Inside retrieve_relevant_docs function...
Answer: IBM company offers cloud-based solutions, and specializes in artificial intelligence and machine learning.

Relevant documents :
 ['IBM company offers cloud-based solutions.', 'IBM company mission is to drive innovation in technology.', 'IBM specialize in artificial intelligence and machine learning.']


In [None]:
# Example usage
query = "What IBM company specialize in?"
answer, relevant_docs = qa_bot(query)
print("Answer:", answer)
# Check relevant documents for last question
print('Relevant documents :\n', relevant_docs)

Inside retrieve_relevant_docs function...
Answer: IBM specializes in artificial intelligence, machine learning, and cloud-based solutions. The company's mission is to drive innovation in technology.
Relevant documents :
 ['IBM specialize in artificial intelligence and machine learning.', 'IBM company mission is to drive innovation in technology.', 'IBM company offers cloud-based solutions.']
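
To go beyond eyeballing three queries, retrieval quality can be scored against the segment each query is expected to rank first. A sketch, assuming a retriever callable with the same shape as `retrieve_relevant_docs` (query in, ranked list of texts out); the name `hit_rate_at_1`, the test cases, and the word-overlap stub retriever are illustrative and are not results from this notebook:

```python
def hit_rate_at_1(retriever, cases):
    """Fraction of (query, expected_text) cases whose expected text is ranked first."""
    if not cases:
        raise ValueError("cases must not be empty")
    hits = 0
    for query, expected in cases:
        ranked = retriever(query)
        if ranked and ranked[0] == expected:
            hits += 1
    return hits / len(cases)

OFFERS = "IBM company offers cloud-based solutions."
MISSION = "IBM company mission is to drive innovation in technology."

def stub_retriever(query):
    """Offline stand-in for retrieve_relevant_docs: ranks by shared words."""
    words = set(query.lower().replace("?", "").split())
    def overlap(doc):
        return len(words & set(doc.lower().rstrip(".").split()))
    return sorted([OFFERS, MISSION], key=overlap, reverse=True)

cases = [
    ("What is the IBM mission?", MISSION),
    ("Which cloud-based solutions does IBM offer?", OFFERS),
]
score = hit_rate_at_1(stub_retriever, cases)
print(f"hit rate @1: {score:.2f}")
```

In the notebook, `retrieve_relevant_docs` could be passed in place of `stub_retriever`, with cases written for the real document segments.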
