# Task 5: RAG

Goal: retrieval plus answers with citations.
In this task, the answer can be simple (template-based), with no LLM.
The emphasis is on *grounding*: every answer must point back to the document and unit it came from.


In [None]:
# === Setup (run this cell first) ===
import os, json, re, math, random
from pathlib import Path

PROJECT_ROOT = Path("/content")  # In Colab, students will upload the project folder or mount Drive
DOCS_DIR = PROJECT_ROOT / "documents"
QA_PATH = PROJECT_ROOT / "qa" / "qa_benchmark_40.json"

print("Docs dir:", DOCS_DIR)
print("QA path:", QA_PATH)

# Tip: If you're using Google Drive:
# from google.colab import drive
# drive.mount('/content/drive')
# PROJECT_ROOT = Path('/content/drive/MyDrive/<YOUR_PROJECT_FOLDER>')


In [None]:
!pip -q install sentence-transformers faiss-cpu pymupdf
import fitz, re, numpy as np, faiss
from sentence_transformers import SentenceTransformer


## 1) Build the index as in Task 4 (or import the code)
**TODO:** provide a function `retrieve(query) -> top_chunks`.
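
To make the `retrieve(query) -> top_chunks` contract concrete, here is a dependency-free sketch. It assumes hand-made 3-dimensional toy vectors in place of real embeddings, and the names `cosine`, `retrieve_toy`, and `toy_chunks` are hypothetical helpers for illustration only. It is not the assignment's implementation, which uses a sentence-embedding model and a FAISS index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_toy(query_vec, chunks, k=2):
    """Return the k chunks most similar to query_vec, best first."""
    scored = [
        {"score": cosine(query_vec, c["vec"]), "doc": c["doc"],
         "unit_id": c["unit_id"], "text": c["text"]}
        for c in chunks
    ]
    scored.sort(key=lambda h: h["score"], reverse=True)
    return scored[:k]

# Hypothetical chunks with hand-made "embeddings" (assumption, not real data).
toy_chunks = [
    {"doc": "a.pdf", "unit_id": 0, "text": "toy chunk A0", "vec": [1.0, 0.0, 0.0]},
    {"doc": "a.pdf", "unit_id": 1, "text": "toy chunk A1", "vec": [0.0, 1.0, 0.0]},
    {"doc": "b.pdf", "unit_id": 0, "text": "toy chunk B0", "vec": [0.8, 0.6, 0.0]},
]

toy_hits = retrieve_toy([1.0, 0.1, 0.0], toy_chunks, k=2)
for h in toy_hits:
    print(round(h["score"], 3), h["doc"], h["unit_id"])
```

Each hit carries a score together with the `doc` and `unit_id` needed for a citation, which is the same record shape the rest of this task relies on.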


In [None]:
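def extract_text(pdf_path):
    """Extract the text of every page and collapse runs of whitespace."""
    # The context manager closes the PDF even if extraction raises.
    with fitz.open(str(pdf_path)) as doc:
        txt = "\n".join(page.get_text() for page in doc)
    return re.sub(r"\s+", " ", txt).strip()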

def segment_units_simple(raw: str):
    """Split text into units that start at FRAZA, TERM, Dialog, or Gabim."""
    # Use a non-capturing group: a capturing group inside the lookahead makes
    # re.split return the keyword as an extra item, which duplicated it
    # at the start of every unit ("FRAZA FRAZA ...").
    parts = re.split(r"(?=\b(?:FRAZA|TERM|Dialog|Gabim)\b)", raw)
    units = [p.strip() for p in parts if p.strip()]
    # Keep only units longer than 120 characters.
    return [u for u in units if len(u) > 120]

pdfs = sorted(DOCS_DIR.glob("MIL-ENG-00*.pdf"))
records = []
for pdf in pdfs:
    raw = extract_text(pdf)
    units = segment_units_simple(raw)
    for i,u in enumerate(units):
        records.append({"doc": pdf.name, "unit_id": i, "text": u})

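# Fail early with a clear message instead of an opaque shape error below.
if not records:
    raise RuntimeError(f"No units found; check that the PDFs exist in {DOCS_DIR}")

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
# With normalized embeddings, inner product equals cosine similarity.
emb = model.encode([r["text"] for r in records], normalize_embeddings=True).astype("float32")
index = faiss.IndexFlatIP(emb.shape[1])
index.add(emb)
print("Units indexed:", index.ntotal)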

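def retrieve(query, k=5):
    """Return up to k records most similar to the query, best first."""
    q = model.encode([query], normalize_embeddings=True).astype("float32")
    scores, idx = index.search(q, k)
    out = []
    for s, i in zip(scores[0], idx[0]):
        # FAISS pads results with -1 when the index holds fewer than k vectors;
        # without this check, records[-1] would silently return the last record.
        if i < 0:
            continue
        out.append({"score": float(s), **records[int(i)]})
    return out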

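The keyword split in `segment_units_simple` can be checked on a toy string before running it on real PDFs. The sketch below is an illustration under assumptions: the sample text and the names `KEYWORDS`, `pattern`, and `toy_units` are made up for the example. It relies on documented `re.split` behavior: when the pattern contains a capturing group, the captured text is also returned as a separate item, so a non-capturing group `(?:...)` is used inside the lookahead.

```python
import re

KEYWORDS = ("FRAZA", "TERM", "Dialog", "Gabim")
# Zero-width lookahead: split *before* each keyword without consuming it.
pattern = r"(?=\b(?:" + "|".join(KEYWORDS) + r")\b)"

# Hypothetical sample text (not taken from the course documents).
toy = "Intro text. FRAZA 1 first phrase. TERM 2 a term. Dialog 3 a short dialog."

toy_units = [p.strip() for p in re.split(pattern, toy) if p.strip()]
for u in toy_units:
    print(repr(u))
```

Each unit starts with its keyword exactly once. Splitting on empty matches like this requires Python 3.7 or later.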

## 2) Answer with a citation (template)
Build a function `answer(query)` that:
1) takes the top-3 chunks
2) returns a short answer in Albanian
3) adds a citation: (Doc: ..., unit: ...)

**TODO:** If the score is low -> refuse.
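
The three steps and the refusal rule can be sketched independently of the index. The example below assumes hits are dictionaries with `score`, `doc`, `unit_id`, and `text` keys. The helper `answer_from_hits`, the constant `REFUSAL`, and the sample hits are hypothetical names and data for illustration. Citing every hit that clears the threshold is one possible design, not a requirement of the task.

```python
REFUSAL = "Nuk e gjej dot informacionin në dokumente."

def answer_from_hits(hits, min_score=0.25, max_chars=180):
    """Build a cited answer from retrieval hits, or refuse if the best hit is weak."""
    if not hits or hits[0]["score"] < min_score:
        return REFUSAL
    best = hits[0]
    snippet = best["text"][:max_chars]
    if len(best["text"]) > max_chars:
        snippet += "..."  # mark truncation only when text was actually cut
    # Cite every hit that clears the threshold, not only the best one.
    cites = "; ".join(
        f"(Doc: {h['doc']}, unit: {h['unit_id']})"
        for h in hits
        if h["score"] >= min_score
    )
    return f"{snippet}\nCitim: {cites}"

# Hypothetical hits (assumed scores and placeholder text).
strong = [
    {"score": 0.62, "doc": "a.pdf", "unit_id": 3, "text": "placeholder chunk text"},
    {"score": 0.10, "doc": "b.pdf", "unit_id": 1, "text": "weakly related text"},
]
weak = [{"score": 0.05, "doc": "b.pdf", "unit_id": 1, "text": "weakly related text"}]

print(answer_from_hits(strong))
print(answer_from_hits(weak))
```

The threshold `min_score=0.25` is a starting point only; a sensible value depends on the embedding model and the documents, so it is worth checking against questions with known answers.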


In [None]:
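def answer(query, k=3, min_score=0.25):
    """Return a short, cited answer in Albanian, or a refusal if retrieval is weak."""
    hits = retrieve(query, k=k)
    # Refuse when nothing was retrieved or the best cosine score is below the threshold.
    if not hits or hits[0]["score"] < min_score:
        # "I cannot find this information in the documents. Please consult the original material."
        return "Nuk e gjej dot informacionin në dokumente. Ju lutem konsultoni materialin origjinal."
    # Simple grounded answer: quote the start of the best chunk and cite it.
    cited = hits[0]
    # Append "..." only when the text was actually truncated.
    snippet = cited["text"][:180] + ("..." if len(cited["text"]) > 180 else "")
    resp = (
        "Përgjigje e bazuar në dokumente:\n"  # "Answer based on the documents:"
        f"- Fragmenti më relevant sugjeron: {snippet}\n"  # "The most relevant fragment suggests: ..."
        f"Citim: (Doc: {cited['doc']}, unit: {cited['unit_id']})"  # "Citation: ..."
    )
    return resp

# "What does 'Roger' mean on the radio?"
print(answer("Çfarë do të thotë 'Roger' në radio?"))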


## 3) Mini-evaluation with 10 questions from the benchmark
**TODO:** count how many times the model refuses vs. answers.
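
One way to count refusals is to compare each response against the start of the refusal message. The sketch below assumes refusals begin with the prefix "Nuk e gjej dot", as in the refusal string returned by `answer`. The helper `refusal_stats` and the sample responses are hypothetical and only illustrate the bookkeeping.

```python
from collections import Counter

def refusal_stats(answers, refusal_prefix):
    """Count refused vs. answered responses and compute the refusal rate."""
    counts = Counter(
        "refused" if a.startswith(refusal_prefix) else "answered" for a in answers
    )
    total = len(answers)
    rate = counts["refused"] / total if total else 0.0
    return {
        "refused": counts["refused"],
        "answered": counts["answered"],
        "refusal_rate": rate,
    }

# Hypothetical responses (placeholders, not real benchmark output).
sample_answers = [
    "Nuk e gjej dot informacionin në dokumente.",
    "Përgjigje e bazuar në dokumente: ...",
    "Përgjigje e bazuar në dokumente: ...",
    "Përgjigje e bazuar në dokumente: ...",
]

stats = refusal_stats(sample_answers, "Nuk e gjej dot")
print(stats)
```

In the notebook, the list of responses would come from calling `answer(q["question"])` for each sampled question. A refusal count alone does not say whether refusing was correct, so it helps to inspect a few refused and answered questions by hand.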


In [None]:
import json, random
with open(QA_PATH, "r", encoding="utf-8") as f:
    qa = json.load(f)["questions"]
sample = random.sample(qa, k=min(10, len(qa)))

for q in sample:
    print("\nQ:", q["question"])
    print(answer(q["question"]))


## Submission
- Implementation of `retrieve` and `answer`
- Demonstration with questions
- Explanation of the refusal logic.
