# Embedding‑Based Retrieval Prompt Lab
Goal: build a mini *Retrieval‑Augmented Generation* pipeline **inside one prompt**.
We will:
1. Create an in‑memory FAISS index of short docs.
2. Embed the user question and retrieve the top‑k context passages.
3. Compose an *augmented prompt* and feed it to an open model.
4. Measure answer quality with and without retrieval.

## 1. Install & import

In [None]:
!pip -q install --upgrade sentence-transformers faiss-cpu transformers

## 2. Toy knowledge base

In [None]:
documents=[
    {'id':0,'text':'The Eiffel Tower is located in Paris and was completed in 1889.'},
    {'id':1,'text':'The capital of Japan is Tokyo.'},
    {'id':2,'text':'Python is a popular programming language created by Guido van Rossum.'},
    {'id':3,'text':'The mitochondrion is the powerhouse of the cell.'},
]

### Build vector index

In [None]:
from sentence_transformers import SentenceTransformer
import faiss, numpy as np

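embedder=SentenceTransformer('all-MiniLM-L6-v2')
# Encode all documents in one batch; FAISS expects a float32 array of shape (n_docs, dim).
embs=np.asarray(embedder.encode([d['text'] for d in documents]), dtype='float32')
# Exact (brute-force) index; search() reports squared L2 distances.
index=faiss.IndexFlatL2(embs.shape[1])
index.add(embs)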


## 3. Retrieval helper

In [None]:
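def retrieve(query, k=2):
    """Return the texts of the k documents nearest to the query."""
    k=min(k, len(documents))  # FAISS pads missing results with index -1
    q_emb=embedder.encode([query]).astype('float32')
    D,I=index.search(q_emb,k)  # D: squared L2 distances, I: document row indices
    return [documents[i]['text'] for i in I[0] if i>=0]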
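
To make the `index.search` call less opaque, here is a minimal pure-Python sketch of what an exact flat L2 index does conceptually: compute the squared Euclidean distance from the query to every stored vector and keep the `k` smallest. The name `flat_l2_search` and the 2-D toy vectors are illustrative assumptions, not part of FAISS.

```python
def flat_l2_search(vectors, query, k):
    """Brute-force nearest-neighbour search by squared L2 distance.

    Hypothetical helper for illustration; returns (distances, indices)
    sorted from nearest to farthest, analogous to one row of D and I.
    """
    scored = []
    for idx, vec in enumerate(vectors):
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, idx))
    scored.sort()  # ascending distance; ties broken by lower index
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


# Toy 2-D "embeddings" (made-up numbers, not real model output).
toy_vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
dists, ids = flat_l2_search(toy_vectors, query=(1.0, 0.5), k=2)
print(ids, dists)
```

The cost grows linearly with the number of stored vectors, which is fine for a toy knowledge base of a few documents.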


## 4. Compose augmented prompt

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer
lm_name='gpt2'
tok=AutoTokenizer.from_pretrained(lm_name)
lm=AutoModelForCausalLM.from_pretrained(lm_name)

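def answer(question, use_retrieval=True):
    # Format each retrieved passage as a bullet line; skip retrieval for the baseline.
    context=''.join(f'- {c}\n' for c in retrieve(question)) if use_retrieval else ''
    prompt=(
        'Answer the question as best you can.\n'+
        ('\nContext:\n'+context if use_retrieval else '')+
        f'\nQuestion: {question}\nAnswer:'
    )
    inp=tok(prompt, return_tensors='pt')
    # GPT-2 has no pad token, so reuse EOS to avoid a warning from generate().
    out=lm.generate(**inp, max_new_tokens=40, pad_token_id=tok.eos_token_id)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens=out[0][inp['input_ids'].shape[1]:]
    return tok.decode(new_tokens, skip_special_tokens=True).strip()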
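
It can help to inspect the prompt string on its own, without loading a model. The sketch below reproduces the same layout (instruction, optional context bullets, question) in a standalone function. `build_prompt` is a hypothetical helper introduced here for illustration; the notebook itself builds the string inline inside `answer`.

```python
def build_prompt(question, contexts=None):
    """Assemble the instruction, optional context bullets, and the question."""
    parts = ['Answer the question as best you can.\n']
    if contexts:
        parts.append('\nContext:\n')
        parts.extend(f'- {c}\n' for c in contexts)
    parts.append(f'\nQuestion: {question}\nAnswer:')
    return ''.join(parts)


# Context passage copied from the toy knowledge base above.
demo = build_prompt(
    'When was the Eiffel Tower finished?',
    ['The Eiffel Tower is located in Paris and was completed in 1889.'],
)
print(demo)
```

Ending the prompt with `Answer:` nudges a plain causal language model to continue with an answer rather than another question.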


## 5. Compare

In [None]:
q='When was the Eiffel Tower finished?'
print('Without retrieval:', answer(q, False))
print('With retrieval:', answer(q, True))

### Exercise
1. Add more docs & retrieve the *top‑k=3* passages (for example, `retrieve(question, k=3)`).
2. Swap the encoder (e.g., `sentence-transformers/all-mpnet-base-v2`).
3. Measure exact match accuracy across 10 trivia questions.
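
For the third exercise, one possible starting point is the sketch below. It assumes a simple normalization (lowercasing, stripping punctuation, collapsing whitespace) before comparing strings. The helper names `normalize` and `exact_match_accuracy` and the sample strings are hypothetical, not part of any library used above.

```python
import re
import string


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans('', '', string.punctuation))
    return re.sub(r'\s+', ' ', text).strip()


def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError('predictions and references must have the same length')
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Placeholder strings for illustration only; these are not real model outputs.
preds = ['1889.', 'tokyo', 'Guido']
golds = ['1889', 'Tokyo', 'Guido van Rossum']
score = exact_match_accuracy(preds, golds)
print(f'exact match: {score:.2f}')
```

In the notebook, `preds` would come from calling `answer(q, use_retrieval)` on each question. Because a model like GPT-2 tends to produce free-form continuations, strict exact match may score very low; a looser check such as `normalize(r) in normalize(p)` is one alternative to consider.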