# 🧠 Study Buddy: Build Your Own RAG Chatbot with Gemini
Upload any PDF or text file (e.g., course notes, a Wikipedia export, or an article).

Ask questions like:
- “Summarize Chapter 2”
- “What is reinforcement learning?”
- “What’s the main takeaway from this section?”


In [None]:
# 🧩 Step 1: Install dependencies
!pip install -q google-generativeai PyPDF2 faiss-cpu

In [None]:
!pip install -q google-genai
from google.colab import userdata
from google import genai
import os

# Load the API key from Colab Secrets; genai.Client() reads GEMINI_API_KEY from the environment
os.environ['GEMINI_API_KEY'] = userdata.get('GEMINI_API_KEY')
client = genai.Client()

# Example: Ask a question about your code
my_code = "for i in range(10): print(i**2)"
prompt = f"Explain the complexity of this code: {my_code}"

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=prompt
)
print(response.text)

Let's break down the complexity of this code:

```python
for i in range(10):
    print(i**2)
```

The complexity of an algorithm is usually described using Big O notation, which characterizes how the runtime or space requirements grow as the input size grows.

In this specific case, there's a crucial detail: the loop runs for a **fixed number of iterations (10)**, not a number that depends on some variable input `n`.

### Time Complexity

*   **Loop:** The `for i in range(10)` loop will execute exactly 10 times.
*   **Inside the loop:**
    *   `i**2`: This is an exponentiation operation. For small integer values of `i` (from 0 to 9), this operation takes a constant amount of time.
    *   `print(value)`: Printing a single integer also takes a constant amount of time (it doesn't depend on the "size" of the input in terms of memory or calculations needed for *this specific line*).
*   **Total Operations:** Since the loop runs a fixed number of times (10) and each operation inside the lo

In [None]:
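# 🧠 Step 2: Import libraries
# Note: this rebinds the name `genai` from the google-genai client imported in the
# previous cell to the separate google-generativeai package used in the steps below.
import google.generativeai as genai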
from getpass import getpass
import PyPDF2
import faiss
import numpy as np
import re

In [23]:
# ⚙️ Step 3: Configure Gemini API
# strip() removes stray whitespace that can come along when a key is pasted
GEMINI_API_KEY = getpass("🔑 Enter your Gemini API key: ").strip()
genai.configure(api_key=GEMINI_API_KEY)

In [24]:
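# 🧾 Step 4: Upload your study material
from google.colab import files

uploaded = files.upload()  # dict mapping file name -> raw bytes

# Use the first uploaded file
file_name = next(iter(uploaded))

if file_name.lower().endswith(".pdf"):
    # files.upload() also saves the file to the working directory,
    # so PyPDF2 can open it by name
    reader = PyPDF2.PdfReader(file_name)
    # `or ""` guards against pages where no text is extracted
    text = "".join(page.extract_text() or "" for page in reader.pages)
else:
    text = uploaded[file_name].decode("utf-8")

print(f"✅ Loaded {len(text)} characters from {file_name}")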

Saving AI_SEAS_8525_DA4_Homework #1_Shaikh.pdf to AI_SEAS_8525_DA4_Homework #1_Shaikh.pdf
✅ Loaded 1873 characters from AI_SEAS_8525_DA4_Homework #1_Shaikh.pdf


In [25]:
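# 🪄 Step 5: Split text into chunks
def split_text(text, chunk_size=1000, overlap=200):
    """Split text into overlapping character windows of at most chunk_size."""
    if not 0 <= overlap < chunk_size:
        # Otherwise the window would never advance and the loop would not end
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    text = re.sub(r'\s+', ' ', text)  # collapse runs of whitespace
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

chunks = split_text(text)
print(f"📚 Split into {len(chunks)} chunks")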

📚 Split into 3 chunks
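
With the defaults in Step 5, the window advances `chunk_size - overlap = 800` characters per step, so each chunk repeats the last 200 characters of the one before it. The sketch below replays the same sliding-window idea on a short alphabet string so the overlap is visible; `sliding_chunks` and the toy sizes are illustrative names and values, not part of the notebook.

```python
def sliding_chunks(text, chunk_size, overlap):
    """Toy sliding window: fixed-size slices that share `overlap` characters."""
    step = chunk_size - overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

demo = sliding_chunks("abcdefghijklmnopqrstuvwxyz", chunk_size=10, overlap=4)
for chunk in demo:
    print(chunk)
# abcdefghij
# ghijklmnop
# mnopqrstuv
# stuvwxyz
# yz
```

Note that the final slice (`yz`) is already fully contained in the one before it. This approach can produce such a short trailing chunk whenever the text ends inside an overlap region; it is harmless for retrieval but costs one extra embedding call.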


In [26]:
# 🧩 Step 6: Create embeddings and index
embed_model = "models/gemini-embedding-001"

# Embed each chunk with one API call per chunk
embeddings = [
    genai.embed_content(model=embed_model, content=chunk)["embedding"]
    for chunk in chunks
]

# FAISS expects a 2-D float32 array of shape (num_chunks, dim)
embeddings = np.array(embeddings, dtype="float32")

# Exact (brute-force) search by L2 distance
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)
print("✅ Vector index built!")



BadRequest: 400 POST https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent?%24alt=json%3Benum-encoding%3Dint: API key not valid. Please pass a valid API key.
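
The 400 response above means the service rejected the key entered in Step 3, so the index was never built. The log does not say why. The earlier cell that read `GEMINI_API_KEY` from Colab Secrets did return a model response, so an empty or mistyped `getpass` entry is one possibility worth ruling out. A minimal sketch of a pre-flight check, assuming a hypothetical helper named `clean_api_key` (not part of any SDK):

```python
def clean_api_key(raw):
    """Hypothetical helper: trim a pasted key and reject empty input."""
    key = (raw or "").strip()
    if not key:
        raise ValueError("No API key provided")
    return key

# The placeholder string below is not a real key.
print(repr(clean_api_key("  example-key-123 \n")))
```

A check like this only catches empty or whitespace-padded input; a revoked or mistyped key would still be rejected by the API.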

In [None]:
# 💬 Step 7: Define RAG query function
def retrieve(query, k=3):
    """Return up to k chunks closest to the query in embedding space."""
    q_embed = genai.embed_content(model=embed_model, content=query)["embedding"]
    k = min(k, index.ntotal)  # never request more neighbours than are indexed
    _, idx = index.search(np.array([q_embed], dtype="float32"), k)
    # FAISS pads missing results with -1, which would otherwise select chunks[-1]
    return [chunks[i] for i in idx[0] if i >= 0]

def ask_study_buddy(query):
    """Answer a question using the retrieved chunks as context."""
    context = "\n\n".join(retrieve(query))
    prompt = (
        "You are Study Buddy, a helpful assistant for learning.\n"
        "Use the context below to answer the question concisely and clearly.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    model = genai.GenerativeModel("gemini-2.5-flash")
    response = model.generate_content(prompt)
    return response.text

# 🧪 Step 8: Try asking a question
question = "summarize the doc"
print(f"🤔 Q: {question}\n")
print("💡 A:", ask_study_buddy(question))
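
The retrieval step relies on `faiss.IndexFlatL2`, which performs exact nearest-neighbour search by squared Euclidean distance. The dependency-free sketch below mimics that behaviour on hand-made 2-D vectors so the ranking can be checked by eye. `l2_top_k`, `toy_vectors`, and `toy_chunks` are illustrative stand-ins, not the notebook's real embeddings.

```python
def l2_top_k(query, vectors, k):
    """Return (squared_distance, position) pairs for the k vectors nearest to query."""
    k = min(k, len(vectors))  # cannot return more neighbours than exist
    scored = [
        (sum((q - v) ** 2 for q, v in zip(query, vec)), pos)
        for pos, vec in enumerate(vectors)
    ]
    return sorted(scored)[:k]

toy_vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
toy_chunks = ["chunk A", "chunk B", "chunk C"]

hits = l2_top_k([0.9, 1.2], toy_vectors, k=2)
print([toy_chunks[pos] for _, pos in hits])
# ['chunk B', 'chunk A']
```

Brute-force search like this compares the query against every stored vector, which is perfectly adequate for a handful of chunks from a single document.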