<a href="https://colab.research.google.com/github/Josogrephy/Auction/blob/master/RAG_Teaching_Material.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [6]:
# Install necessary libraries
!pip install sentence-transformers faiss-cpu



In [9]:
# Install the Google Generative AI library
!pip install -q -U google-generativeai

In [7]:
import numpy as np
from sentence_transformers import SentenceTransformer
import faiss

# 1. Prepare your documents
# For this example, let's use a list of strings.
# In a real scenario, you might load these from files.
my_documents = [
    "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France.",
    "The Great Wall of China is a series of fortifications made of stone, brick, tamped earth, wood, and other materials.",
    "The Colosseum is an oval amphitheatre in the centre of the city of Rome, Italy.",
    "Generative AI refers to artificial intelligence models capable of generating new content, such as text, images, audio, and video.",
    "RAG combines retrieval mechanisms with generative models to provide more contextually relevant and accurate responses."
]

# You can implement a chunking strategy here if your documents are large.
# For simplicity, we'll use the full sentences as documents.

# 2. Load a pre-trained Sentence Transformer model
# These models are excellent for creating semantically meaningful embeddings.
model = SentenceTransformer('all-MiniLM-L6-v2') # A popular and efficient model

# 3. Generate embeddings for your documents
document_embeddings = model.encode(my_documents)

# Check the shape of our embeddings
print("Shape of document embeddings:", document_embeddings.shape)
# This will show (number_of_documents, embedding_dimension)

# document_embeddings now holds the numerical representation of your texts.
# We can save these embeddings for later use.
np.save("my_document_embeddings.npy", document_embeddings)

print("Embeddings generated and saved!")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Shape of document embeddings: (5, 384)
Embeddings generated and saved!
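
The cell above mentions that large documents may need a chunking strategy before embedding. A minimal sketch of one such strategy follows, assuming fixed-size word windows with overlap. The function name `chunk_text`, its parameters, and the sample text are illustrative assumptions, not part of the original notebook; real pipelines often split on sentences or tokens instead.

```python
def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into word-based chunks of up to chunk_size words.

    Consecutive chunks share `overlap` words, so text near a boundary
    appears in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Hypothetical long document: 120 numbered words.
long_document = " ".join(f"word{i}" for i in range(120))
chunks = chunk_text(long_document, chunk_size=50, overlap=10)
print(len(chunks), "chunks")
```

Each chunk could then be passed to `model.encode` in place of a whole document, with the chunk list playing the role of `my_documents`.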


In [8]:
# Ensure you have run the previous code block or have the necessary libraries and data loaded.

# If you haven't run the previous block in the same session, load the embeddings:
# document_embeddings = np.load("my_document_embeddings.npy")
# model = SentenceTransformer('all-MiniLM-L6-v2') # if needed for query embedding later

# 1. Get the dimensionality of our embeddings
d = document_embeddings.shape[1] # Dimension of embeddings

# 2. Create a FAISS index
# IndexFlatL2 is a basic index that performs exact (brute-force) nearest-neighbour search
# and reports squared L2 (Euclidean) distances.
# For very large datasets, you might explore approximate FAISS indexes such as IndexIVFFlat.
index = faiss.IndexFlatL2(d)

# 3. Add the document embeddings to the index
index.add(document_embeddings)

# Check if the embeddings are added
print("Number of vectors in the FAISS index:", index.ntotal)

# We can save the FAISS index to disk
faiss.write_index(index, "my_faiss_index.index")

print("FAISS index created, populated, and saved!")

Number of vectors in the FAISS index: 5
FAISS index created, populated, and saved!
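
To make the flat index less of a black box, here is a dependency-free sketch of the computation an exact L2 search performs: compare the query with every stored vector and keep the k smallest squared distances. The function `flat_l2_search` and the toy vectors are hypothetical and only illustrate the idea; FAISS itself is far more optimized, and unlike this sketch it pads the result with -1 when fewer than k vectors exist.

```python
def flat_l2_search(vectors, query, k):
    """Return (distances, indices) of the k vectors closest to query.

    Distances are squared Euclidean distances, sorted ascending.
    """
    scored = []
    for idx, vec in enumerate(vectors):
        if len(vec) != len(query):
            raise ValueError("all vectors must have the query's dimension")
        sq_dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((sq_dist, idx))
    scored.sort()  # smallest distance first; ties broken by lower index
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


# Toy 2-dimensional "embeddings" (made-up values, not real model output).
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 4.0]]
toy_query = [0.9, 0.1]

toy_distances, toy_indices = flat_l2_search(toy_vectors, toy_query, k=2)
print("indices:", toy_indices)
print("squared distances:", toy_distances)
```

Because every vector is compared with the query, the cost grows linearly with the number of documents, which is why approximate indexes become attractive for large collections.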


In [10]:
import google.generativeai as genai
from google.colab import userdata

# --- Configuration ---
# Store your Gemini API key as a Colab secret named 'GEMINI_API_KEY'.
# Never paste the key into the notebook itself: a key committed to a shared
# notebook is exposed to everyone who can read it and should be revoked.
try:
    GEMINI_API_KEY = userdata.get('GEMINI_API_KEY')
    genai.configure(api_key=GEMINI_API_KEY)
except Exception as e:
    print(f"An error occurred during API key configuration: {e}")
    GEMINI_API_KEY = None


# --- Load pre-requisites (if not already in the environment) ---
# Ensure 'model', 'index', and 'my_documents' are loaded from previous steps.

# If you are in a new session, uncomment and run these:
# model = SentenceTransformer('all-MiniLM-L6-v2')
# document_embeddings = np.load("my_document_embeddings.npy")
# index = faiss.read_index("my_faiss_index.index")
# my_documents = [ # Re-define or load your documents
#     "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France.",
#     "The Great Wall of China is a series of fortifications made of stone, brick, tamped earth, wood, and other materials.",
#     "The Colosseum is an oval amphitheatre in the centre of the city of Rome, Italy.",
#     "Generative AI refers to artificial intelligence models capable of generating new content, such as text, images, audio, and video.",
#     "RAG combines retrieval mechanisms with generative models to provide more contextually relevant and accurate responses."
# ]


# 1. Define a user query
user_query = "What is the Eiffel Tower made of?"

# 2. Embed the user query
# It's crucial to use the SAME model for embedding the query as you used for the documents.
query_embedding = model.encode([user_query]) # Pass the query as a list

# 3. Search the FAISS index
k = 2 # Number of top relevant documents to retrieve
distances, indices = index.search(query_embedding, k)

# 'indices' holds the row numbers of the nearest documents in the original 'my_documents' list.
# 'distances' holds the corresponding squared L2 distances (IndexFlatL2 does not take the square root).
# These are distances, not similarity scores: smaller means more similar.

print(f"Query: {user_query}")
print(f"Retrieved document indices: {indices}")
print(f"Distances: {distances}")

# 4. Retrieve the actual document content
# FAISS pads the result with -1 when fewer than k vectors are available, so skip those.
retrieved_docs_content = [my_documents[i] for i in indices[0] if i != -1]

print("\n--- Retrieved Documents ---")
for i, doc in enumerate(retrieved_docs_content):
    print(f"Doc {i+1}: {doc}")

# 5. Prepare the context and prompt for Gemini
context_for_llm = "\n\n".join(retrieved_docs_content)

prompt_template = f"""Based ONLY on the following context, answer the question.
If the context doesn't contain the answer, say "I don't have enough information from the provided documents."

Context:
{context_for_llm}

Question: {user_query}

Answer:
"""

print("\n--- Prompt for LLM ---")
print(prompt_template)

# 6. Call the Gemini API (if the API key is available)
if GEMINI_API_KEY:
    try:
        llm_model = genai.GenerativeModel('gemini-1.5-flash-latest') # Or your preferred Gemini model
        response = llm_model.generate_content(prompt_template)

        print("\n--- LLM Response ---")
        print(response.text)
    except Exception as e:
        print(f"\nError during Gemini API call: {e}")
        print("Please ensure your API key is correct and you have API access.")
else:
    print("\nSkipping Gemini API call as API key is not configured.")

Query: What is the Eiffel Tower made of?
Retrieved document indices: [[0 1]]
Distances: [[0.6012749 1.2529941]]

--- Retrieved Documents ---
Doc 1: The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France.
Doc 2: The Great Wall of China is a series of fortifications made of stone, brick, tamped earth, wood, and other materials.

--- Prompt for LLM ---
Based ONLY on the following context, answer the question.
If the context doesn't contain the answer, say "I don't have enough information from the provided documents."

Context:
The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France.

The Great Wall of China is a series of fortifications made of stone, brick, tamped earth, wood, and other materials.

Question: What is the Eiffel Tower made of?

Answer:


Error during Gemini API call: 400 POST https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent?%24alt=json%3Benum-encoding%3Dint: User l
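
In the run above, the second retrieved document (the Great Wall) has nothing to do with the question, and its distance is roughly twice that of the first. One possible refinement, sketched below, is to drop results beyond a distance cutoff before building the context. The helper `filter_by_distance` and the cutoff of 1.0 are assumptions chosen for illustration; a suitable threshold depends on the embedding model and the data and would need tuning.

```python
def filter_by_distance(documents, indices, distances, max_distance):
    """Keep retrieved documents whose distance is at most max_distance.

    Negative indices are skipped; FAISS uses -1 to pad missing results.
    """
    kept = []
    for idx, dist in zip(indices, distances):
        if idx >= 0 and dist <= max_distance:
            kept.append(documents[idx])
    return kept


documents = [
    "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France.",
    "The Great Wall of China is a series of fortifications made of stone, brick, tamped earth, wood, and other materials.",
]

# Values copied from the search output above (first row of each array).
retrieved_indices = [0, 1]
retrieved_distances = [0.6012749, 1.2529941]

# ASSUMPTION: 1.0 is an illustrative cutoff, not a recommended value.
relevant_docs = filter_by_distance(
    documents, retrieved_indices, retrieved_distances, max_distance=1.0
)
context = "\n\n".join(relevant_docs)
print(context)
```

With this cutoff only the Eiffel Tower sentence reaches the prompt. If no document passes the filter, the context is empty and the prompt's fallback instruction ("I don't have enough information from the provided documents.") applies.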

