# Optional Advanced: Citation Management and Verification in RAG

For research, it is not enough for the LLM to generate plausible answers; every claim needs a source the reader can check. Reliable citation is **crucial for trust**.

### 1. Getting the Model to Cite Sources
- Format the prompt to enumerate the retrieved documents (e.g. `[1]`, `[2]`, ...).
- Ask the LLM: *"Answer the question and cite the source by number for each claim."*
- Some chat-tuned models (for example, Qwen and Llama-2-Chat) will try to comply if prompted clearly.

```text
Prompt Example:
You have these sources:
[1] Lecanemab Phase 3 trial... 
[2] 2024 review paper comparing Alzheimer drugs...
Question: What are the side effects of lecanemab? Cite your answer with [1], [2], etc.
```
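The prompt above can also be assembled programmatically from whatever the retriever returns. The following is a minimal sketch, assuming the retrieved documents arrive as a plain list of strings; the function name `build_citation_prompt` is hypothetical and not part of any library.

```python
def build_citation_prompt(question, sources):
    """Number each retrieved source and ask for a citation after each claim."""
    # enumerate(..., start=1) gives the [1], [2], ... labels used in the prompt.
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(sources, start=1))
    return (
        "You have these sources:\n"
        f"{numbered}\n"
        f"Question: {question} "
        "Cite the source number, e.g. [1], after each claim."
    )


# Placeholder source snippets taken from the prompt example above.
sources = [
    "Lecanemab Phase 3 trial...",
    "2024 review paper comparing Alzheimer drugs...",
]
prompt = build_citation_prompt("What are the side effects of lecanemab?", sources)
print(prompt)
```

Keeping the numbering in one place means the same list index can later be used to look up the source a citation marker refers to.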


### 2. Citation Hallucination: A Serious Risk
- LLMs may **make up** citations. For instance, they might put "[1]" on the wrong fact, or invent a plausible but non-existent reference.
- They may quote or paraphrase in a way that does not match the original source exactly, so each cited claim needs to be checked against the source text.
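One cheap first check is whether every citation marker points at a source that was actually provided. The sketch below assumes citations appear as `[n]` markers, as in the prompt format above; the function names and the sample answer are hypothetical illustrations, not real model output.

```python
import re


def extract_citation_numbers(answer):
    """Return the set of integers that appear as [n] markers in the answer."""
    return {int(n) for n in re.findall(r"\[(\d+)\]", answer)}


def find_invalid_citations(answer, num_sources):
    """Return cited numbers that do not correspond to any provided source."""
    cited = extract_citation_numbers(answer)
    return sorted(n for n in cited if not 1 <= n <= num_sources)


# Hypothetical answer: only two sources were provided, but [3] is cited.
answer = (
    "ARIA-E was reported in the trial [1]. "
    "A second claim cites a source that was never provided [3]."
)
invalid = find_invalid_citations(answer, num_sources=2)
print(invalid)  # [3]
```

This only catches markers that are out of range. A marker attached to the wrong fact still needs a content check against the cited text.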

### 3. Techniques to Verify Citations
- **String matching:** Search the cited document for the exact phrase/claim the LLM produced.
- **Fuzzy matching:** Use similarity scoring (e.g., Levenshtein distance, `difflib.SequenceMatcher`, or a library such as `rapidfuzz`) to see if a paraphrased fact from the LLM is present in the cited source.
- **Automated routines vs. manual spot-checks:** For high-stakes answers (theses, law, medicine), *manual verification* is needed. In informal or exploratory contexts, automated routines help scale the process.
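As a rough illustration of the string-matching option, the sketch below lowercases both texts and collapses whitespace before testing for a substring. The helper names are hypothetical, and the normalization is an assumption; real pipelines may also need to handle punctuation or Unicode differences.

```python
import re


def normalize(text):
    """Lowercase the text and collapse runs of whitespace to a single space."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def phrase_in_source(phrase, source_text):
    """Return True if the normalized phrase occurs verbatim in the source."""
    return normalize(phrase) in normalize(source_text)


source_text = (
    "In the Phase 3 trial, ARIA-E occurred in 12.6% of patients "
    "treated with lecanemab."
)

# A direct quote matches even with different casing or spacing.
print(phrase_in_source("ARIA-E occurred in 12.6%  of patients", source_text))  # True

# A paraphrase does not, which is where fuzzy matching comes in.
print(phrase_in_source("about 12% of lecanemab patients", source_text))  # False
```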

```python
# Example: fuzzy matching of an LLM claim against the cited document
import difflib

llm_claim = "about 12% of lecanemab patients experienced ARIA-E (brain swelling)"
source_text = "In the Phase 3 trial, ARIA-E occurred in 12.6% of patients treated with lecanemab. ARIA-E refers to amyloid-related imaging abnormalities, i.e., brain swelling."

sm = difflib.SequenceMatcher(None, llm_claim.lower(), source_text.lower())
# ratio() returns a value in [0, 1]; it is printed here as a percentage.
# A cutoff such as 0.8 (80%) is a starting point to tune, not a fixed rule:
# comparing a short claim with a much longer passage pulls the score down.
print(f"Match score: {sm.ratio()*100:.1f}%")
```
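Because `SequenceMatcher.ratio()` depends on the combined length of both strings, a short claim compared with a whole passage tends to score low even when the fact is present. One possible refinement, sketched here under the assumption that a naive regex sentence split is good enough, is to score the claim against each source sentence and keep the best match. The function name `best_sentence_match` is hypothetical.

```python
import difflib
import re


def best_sentence_match(claim, source_text):
    """Return (score, sentence) for the source sentence most similar to the claim."""
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", source_text)
    sentences = [s.strip() for s in parts if s.strip()]
    if not sentences:
        return 0.0, ""
    scored = [
        (difflib.SequenceMatcher(None, claim.lower(), s.lower()).ratio(), s)
        for s in sentences
    ]
    return max(scored, key=lambda pair: pair[0])


llm_claim = "about 12% of lecanemab patients experienced ARIA-E (brain swelling)"
source_text = (
    "In the Phase 3 trial, ARIA-E occurred in 12.6% of patients treated with "
    "lecanemab. ARIA-E refers to amyloid-related imaging abnormalities, i.e., "
    "brain swelling."
)

score, sentence = best_sentence_match(llm_claim, source_text)
print(f"Best sentence score: {score * 100:.1f}%")
print(f"Closest sentence: {sentence}")
```

Character-level similarity still cannot tell whether "about 12%" is a fair paraphrase of "12.6%", so a score in the middle range is a prompt for manual review rather than a verdict.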


> **Summary:** Robust research RAG systems use prompts that **require** citations, and they check those citations (at least by string or fuzzy matching) to flag possible hallucinations.
- In future sessions, we could explore how to enforce citations programmatically and automate more robust verification pipelines.
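To make the summary concrete, here is a minimal sketch of a check that walks through an answer sentence by sentence and flags citations that are out of range or score below a similarity threshold. It assumes `[n]` markers appear before the sentence-ending punctuation. The function names and the `threshold=0.6` default are illustrative assumptions to be tuned on real data.

```python
import difflib
import re


def split_claims(answer):
    """Yield (claim_text, cited_numbers) for each sentence carrying [n] markers."""
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        numbers = [int(n) for n in re.findall(r"\[(\d+)\]", sentence)]
        claim = re.sub(r"\s*\[\d+\]", "", sentence).strip()
        if claim and numbers:
            yield claim, numbers


def flag_citations(answer, sources, threshold=0.6):
    """Return (claim, number, score, ok) tuples; ok is False for suspect citations."""
    report = []
    for claim, numbers in split_claims(answer):
        for n in numbers:
            if not 1 <= n <= len(sources):
                # The marker points at a source that was never provided.
                report.append((claim, n, 0.0, False))
                continue
            score = difflib.SequenceMatcher(
                None, claim.lower(), sources[n - 1].lower()
            ).ratio()
            report.append((claim, n, score, score >= threshold))
    return report


# Hypothetical inputs for illustration only.
sources = ["ARIA-E occurred in 12.6% of patients treated with lecanemab."]
answer = (
    "ARIA-E occurred in 12.6% of patients treated with lecanemab [1]. "
    "This sentence cites a source that was never provided [5]."
)
report = flag_citations(answer, sources)
for claim, n, score, ok in report:
    print(f"[{n}] ok={ok} score={score:.2f} :: {claim}")
```

Flagged items are candidates for manual review, not proof of hallucination.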

### Reflection
How would ensuring reliable citations change your *trust* in using LLM-powered assistants for research? Do you see yourself using a RAG system for serious work if citation verification is robust?

```python
from utils import create_answer_box
create_answer_box('📝 **Your Answer:** Verifiable citation would matter to me because ...', question_id='opt_citation_importance')
```