# Rag-Python Glossary Q&A Bot
A lightweight Retrieval-Augmented Generation (RAG) assistant that answers questions using the official Python 3.12 Glossary.

### 1. IMPORTS

In [1]:
from PyPDF2 import PdfReader
import spacy
from sentence_transformers import SentenceTransformer
import chromadb
from openai import OpenAI
from dotenv import load_dotenv
import os



### 2. CONFIGURATION – OPENROUTER API KEY

In [2]:
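load_dotenv()  # read variables from a local .env file, if one exists
API_KEY = os.getenv("OPENROUTER_API_KEY")
if not API_KEY:
    # Assigning None into os.environ raises an unhelpful TypeError, so fail early instead
    raise RuntimeError("OPENROUTER_API_KEY is not set; add it to .env or the environment")
print("API key used successfully!")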

API key used successfully!


### 3. LOAD AND EXTRACT TEXT FROM PDF

In [3]:
print("Loading PDF ->")
reader = PdfReader("python_glossary.pdf")
# `or ""` guards against pages where no text could be extracted
text = "\n".join((page.extract_text() or "") + "\n" for page in reader.pages)
print(f"   Extracted text from {len(reader.pages)} pages")

Loading PDF ->
   Extracted text from 18 pages
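
Text extracted from a PDF usually keeps the page layout's hard line breaks, which can confuse sentence splitting in the next step. The sketch below shows one possible cleanup pass, assuming blank lines mark paragraph breaks. `normalize_pdf_text` is a hypothetical helper, not part of the notebook or of PyPDF2, and its de-hyphenation rule will also merge genuinely hyphenated words that happen to break across lines.

```python
import re


def normalize_pdf_text(raw: str) -> str:
    """Collapse layout whitespace left over from PDF extraction (illustrative only)."""
    # Rejoin words split across a line break: "itera-\ntor" -> "iterator".
    # Caveat: a real hyphenated word broken at the hyphen is merged too.
    joined = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Assumption: one or more blank lines separate paragraphs.
    paragraphs = re.split(r"\n\s*\n", joined)
    # Inside a paragraph, every run of whitespace becomes a single space.
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


sample = "An itera-\ntor is an object\nrepresenting a stream.\n\n\nNext entry."
print(normalize_pdf_text(sample))
```

If used, it would run on `text` before the string is passed to spaCy.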


### 4. CHUNK TEXT INTO SENTENCES USING spaCy

In [4]:
print("Chunking text into semantic chunks with spaCy...")

# Load spaCy model
nlp = spacy.load("en_core_web_sm")

doc = nlp(text)

chunks = []
current_chunk = ""
MAX_CHUNK_LEN = 350  # characters

for sent in doc.sents:
    sentence = sent.text.strip()

    # Skip very small or useless lines
    if len(sentence) < 5:
        continue

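    # If adding this sentence would exceed the limit, save the chunk and start a new one.
    # The `current_chunk` check prevents an empty string from being saved as the first chunk.
    if current_chunk and len(current_chunk) + len(sentence) > MAX_CHUNK_LEN:
        chunks.append(current_chunk.strip())
        current_chunk = sentence
    else:
        current_chunk += " " + sentence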

# Add last chunk
if current_chunk:
    chunks.append(current_chunk.strip())

print(f"   Created {len(chunks)} semantic chunks")

Chunking text into semantic chunks with spaCy...
   Created 172 semantic chunks
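
The greedy packing used above can be isolated from spaCy and tried on a plain list of strings. This is a minimal sketch of the same idea, not the notebook's exact cell: `pack_sentences` is a hypothetical name, and unlike the cell it counts the joining space toward the limit.

```python
def pack_sentences(sentences, max_len=350):
    """Greedily group consecutive sentences into chunks of at most max_len characters.

    A single sentence longer than max_len still becomes its own (oversized) chunk.
    """
    chunks = []
    current = ""
    for sentence in sentences:
        sentence = sentence.strip()
        if len(sentence) < 5:  # skip fragments, as the notebook cell does
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_len:
            chunks.append(current)  # close the full chunk
            current = sentence      # and start a new one with this sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


demo = ["Alpha beta gamma.", "Ok.", "Delta epsilon zeta.", "Eta theta iota."]
print(pack_sentences(demo, max_len=40))
# → ['Alpha beta gamma. Delta epsilon zeta.', 'Eta theta iota.']
```

With spaCy, the input would be `[sent.text for sent in doc.sents]`.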


### 5. LOAD HUGGING FACE EMBEDDING MODEL

In [5]:
print("Loading embedding model ->")
embedder = SentenceTransformer("all-MiniLM-L6-v2")

Loading embedding model ->


### 6. SET UP CHROMA VECTOR DATABASE (persistent local store)

In [6]:
print("Initializing ChromaDB ->")
client = chromadb.PersistentClient(path="glossary_db")
collection = client.get_or_create_collection(name="glossary")

if collection.count() == 0:
    print("   First run detected, embedding and saving chunks...")
    embeddings = embedder.encode(chunks, show_progress_bar=True).tolist()
    collection.add(
        documents=chunks,
        embeddings=embeddings,
        ids=[f"chunk_{i}" for i in range(len(chunks))]
    )
    print("   Database built and saved to 'glossary_db/'")
else:
    print("   Loading existing database from 'glossary_db/'")

Initializing ChromaDB ->
   Loading existing database from 'glossary_db/'
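
To make the retrieval step less of a black box, here is a brute-force version of what a nearest-neighbour query computes, on toy 2-D vectors. It is only an illustration: Chroma answers queries with an approximate index rather than a full scan, and the sketch assumes squared L2 distance, which is the distance the collection uses when none is configured. The helper names and the toy vectors are made up.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query, vectors, documents, k=2):
    """Return the k documents whose vectors lie closest to the query (exact scan)."""
    scored = sorted(
        (squared_l2(query, vec), doc) for vec, doc in zip(vectors, documents)
    )
    return [doc for _, doc in scored[:k]]


# Toy "embeddings"; real all-MiniLM-L6-v2 vectors have 384 dimensions.
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
documents = ["iterator", "decorator", "iterable"]

print(top_k([1.0, 0.0], vectors, documents, k=2))
# → ['iterator', 'iterable']
```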


### 7. INITIALIZE LLM VIA OPENROUTER (Llama 3.1 8B Instruct)

In [7]:
llm = OpenAI(base_url="https://openrouter.ai/api/v1", api_key=API_KEY)

### 8. INTERACTIVE Q&A LOOP

In [8]:
print("\n" + "=" * 70)
print("     Rag-Python Glossary Q&A Bot is READY!     ")
print("     Powered by Llama 3.1 8B Instruct       ")
print("     Type 'quit', 'exit', or 'bye' to stop  ")
print("=" * 70 + "\n")

while True:
    question = input("You: ").strip()

    if question.lower() in {"quit", "exit", "bye", "q"}:
        print("\nHappy learning! Come back anytime.")
        break

    if not question:
        continue

    print("   Searching glossary...", end="\r")

    # Retrieve relevant context
    query_embedding = embedder.encode([question]).tolist()
    results = collection.query(query_embeddings=query_embedding, n_results=5)
    context = "\n\n".join(results["documents"][0])

    print("   Generating answer...     ")

    # Call LLM with context
    response = llm.chat.completions.create(
        model="meta-llama/llama-3.1-8b-instruct",
        messages=[
            {
                "role": "system",
                "content": "You are a precise and helpful Python documentation assistant. "
                           "Answer questions using ONLY the provided glossary context. "
                           "Be concise, accurate, and professional."
            },
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {question}"
            }
        ],
        temperature=0.1,
        max_tokens=500,
    )

    answer = response.choices[0].message.content
    print(f"RAG-Bot: {answer}\n")


     Rag-Python Glossary Q&A Bot is READY!     
     Powered by Llama 3.1 8B Instruct       
     Type 'quit', 'exit', or 'bye' to stop  



You: How to use print command?


   Generating answer...     
RAG-Bot: The `print` command is used to output text or values to the screen. It can be used in various ways, such as:

* `print("Hello, World!")` to print a string
* `print(123)` to print an integer
* `print(food)` to print the contents of a list or other iterable

In the context of the provided examples, `print` is used to output the elements of the `food` list, like this: `print(piece)`.

Note that the `print` command is not explicitly defined in the provided context, but it is a built-in function in Python that is commonly used for outputting values.



You: quit



Happy learning! Come back anytime.
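
In the sample session the bot admits that `print` is "not explicitly defined in the provided context" and then answers from general knowledge anyway. One possible adjustment is to move prompt assembly into a small function and tell the model what to do when the context has no answer. This is a sketch only: `build_messages` and the final sentence of `SYSTEM_PROMPT` are assumptions added for illustration, and a stricter prompt may reduce, but will not necessarily eliminate, off-context answers.

```python
SYSTEM_PROMPT = (
    "You are a precise and helpful Python documentation assistant. "
    "Answer questions using ONLY the provided glossary context. "
    "Be concise, accurate, and professional. "
    # Hypothetical addition, not in the original prompt:
    "If the context does not contain the answer, say so instead of guessing."
)


def build_messages(context_chunks, question):
    """Assemble the chat messages from retrieved chunks and the user's question."""
    context = "\n\n".join(context_chunks) or "(no matching glossary entries)"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]


messages = build_messages(["A generator is ...", "An iterator is ..."], "What is a generator?")
print(messages[1]["content"])
```

Inside the loop, the call would become `llm.chat.completions.create(model=..., messages=build_messages(results["documents"][0], question), ...)`.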
