<a href="https://colab.research.google.com/github/nfpaiva/ml-ai-experiments/blob/main/notebooks/ai-act-chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🤖📚 Talking to the EU AI Act: A Didactic RAG Demo

This notebook demonstrates how to build a **simple Retrieval-Augmented Generation (RAG) system** to interact with large documents — using the **EU Artificial Intelligence Act (AI Act)** as an example.

As organizations move toward compliance with the AI Act and other regulatory frameworks, understanding how to **search, interpret, and explain** legal and technical documents becomes a critical skill.

---

### 🎯 What You’ll Learn and Explore

- ✅ How to load, chunk, and embed long documents (like the full 144-page AI Act PDF)
- ✅ How to build a question-answering system that **retrieves** relevant text and **generates** natural language answers
- ✅ How to apply **open-source models** like FLAN-T5-XL for domain-specific QA
- ✅ How prompt design, chunk size, and retrieval parameters impact results
- ✅ How RAG systems can help bridge the gap between:
  - Legal documents  
  - Business requirements  
  - Technical artifacts (like model cards)

---

### 💡 Why This Matters in the Real World

- 📜 The **AI Act affects most companies** developing or deploying AI/ML in the EU, with obligations phasing in over the coming years
- 🏢 Companies will need to **reference regulations, model cards, data sheets, and policies** — often written in dense legal or technical language
- 🛠️ A RAG-based assistant can:
  - Help internal teams understand obligations faster
  - Reduce compliance risk
  - Make onboarding and documentation review easier
  - Support fairness, explainability, and transparency goals

---

✨ By the end of this notebook, you’ll have built a working AI assistant that can answer natural-language questions grounded in a real regulatory document — a technique you can extend to many domains.


🔧 1. Install Dependencies

In [None]:
!pip install -q langchain chromadb sentence-transformers pypdf unstructured
!pip install -q transformers accelerate
!pip install -q langchain-huggingface
!pip install -q langchain-community
!pip install -q pymupdf

📄 2. Download the EU AI Act PDF

In [None]:
# Download the Official Journal version of the AI Act, Regulation (EU) 2024/1689
!wget -O eu_ai_act.pdf "https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=OJ:L_202401689"

📚 3. Load + Chunk the PDF

In [None]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_core.documents import Document
import re

loader = PyMuPDFLoader("eu_ai_act.pdf")
pages = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
docs = splitter.split_documents(pages)

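def clean_text(text):
    """Collapse newlines and repeated whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()

# Clean each chunk but keep its metadata (source file, page number) for later inspection
docs = [
    Document(page_content=clean_text(doc.page_content), metadata=doc.metadata)
    for doc in docs
]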

print(f"Loaded {len(docs)} document chunks.")

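To build intuition for `chunk_size` and `chunk_overlap`, here is a minimal sketch of overlapping character windows. It is a simplification, not the splitter's actual algorithm: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and spaces, while the hypothetical `sliding_window_chunks` helper below cuts at fixed character positions.

```python
def sliding_window_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows that share an overlap.

    Illustrative only: a real splitter would try to cut on natural boundaries.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Toy input: 1200 characters of repeating digits
sample = "".join(str(i % 10) for i in range(1200))
chunks = sliding_window_chunks(sample)

print(len(chunks), [len(c) for c in chunks])
# The tail of one chunk reappears at the head of the next
print(chunks[0][-100:] == chunks[1][:100])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.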

🧠 4. Create Embeddings + Vector Store

In [None]:
from langchain_community.vectorstores import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

# Small sentence-embedding model that turns each chunk into a vector
embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
# In-memory Chroma vector store built from the cleaned chunks
db = Chroma.from_documents(docs, embedding)

# Return the 3 most similar chunks for each query
retriever = db.as_retriever(search_kwargs={"k": 3})

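Conceptually, the retriever embeds the query and returns the `k` chunks whose vectors are closest to it. The sketch below illustrates that idea with hand-written toy vectors and cosine similarity. The helper names (`cosine_similarity`, `top_k`) and the vectors are made up for illustration, and the vector store's actual distance metric and index structure are implementation details not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=3):
    """Return the indices of the k chunk vectors most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), idx)
        for idx, vec in enumerate(chunk_vecs)
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [idx for _, idx in scored[:k]]


# Toy 3-dimensional "embeddings" (real ones have hundreds of dimensions)
toy_chunk_vecs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query_vec = [0.8, 0.2, 0.0]

best = top_k(toy_query_vec, toy_chunk_vecs, k=2)
print(best)
```

Raising `k` gives the model more context to draw on but also more irrelevant text to wade through, which is why it is worth experimenting with.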
🤖 5. Load **FLAN-T5-XL** to generate answers from retrieved document chunks

🧠 About the Language Model: FLAN-T5-XL

This demo uses a powerful open-source model: [**FLAN-T5-XL**](https://huggingface.co/google/flan-t5-xl) from Google.

It’s part of the **FLAN-T5** family — models fine-tuned to follow instructions well.  
We're using it here to answer questions about the EU AI Act using retrieved legal context.

📦 Model Specs
- **Name**: `google/flan-t5-xl`
- **Size**: ~3 billion parameters
- **Max Input Length**: 2048 tokens
- **Strengths**:  
  - Instruction following  
  - Question answering  
  - Summarization  
  - Lightweight enough for Colab GPUs (if used carefully)

🔗 [View full model card on Hugging Face →](https://huggingface.co/google/flan-t5-xl)

---

⏳ **Note**: The model may take **about 2 minutes to load** on the first run. That's normal: the notebook downloads several GB of weights and initializes the model on the GPU.



In [None]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
from langchain_huggingface import HuggingFacePipeline

model_id = "google/flan-t5-xl"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_id,
    device_map="auto",         # Places the model on the GPU when one is available
    torch_dtype="auto"         # Loads weights in the dtype stored in the checkpoint
)

flan_pipe = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=False,           # Greedy decoding for repeatable answers
)

llm = HuggingFacePipeline(pipeline=flan_pipe)


🔁 6. Build RAG Chain

In [None]:
from langchain.chains import RetrievalQA

qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

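`RetrievalQA.from_chain_type` defaults to the "stuff" chain type, which places all retrieved chunks into a single prompt before calling the model. The sketch below mimics that flow in plain Python so the moving parts are visible. The names `answer_with_stuffing`, `fake_retrieve`, and `fake_generate` are hypothetical stand-ins, not LangChain APIs, and the prompt template is an assumption rather than the library's own.

```python
def answer_with_stuffing(query, retrieve, generate):
    """Retrieve chunks, stuff them into one prompt, and generate an answer."""
    chunks = retrieve(query)
    context = "\n\n".join(chunks)
    prompt = f"Context:\n{context}\n\nQuestion: {query}\n\nAnswer:"
    return generate(prompt)


# Hypothetical stand-ins for the retriever and the language model
captured_prompts = []


def fake_retrieve(query):
    return [
        "Chunk A about prohibited practices.",
        "Chunk B about high-risk systems.",
    ]


def fake_generate(prompt):
    captured_prompts.append(prompt)  # keep the prompt so it can be inspected
    return f"(model output for a prompt of {len(prompt)} characters)"


result = answer_with_stuffing("What is prohibited?", fake_retrieve, fake_generate)
print(captured_prompts[0])
print(result)
```

In the notebook, the chain itself can be called with `qa_chain.invoke({"query": "..."})`, and the answer is returned under the `"result"` key.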

💬 7. Talk to the AI Act!

In [None]:
# `retriever` and `llm` were created in the cells above

while True:
    query = input("Ask a question about the EU AI Act (or type 'exit'): ")
    if query.strip().lower() in ['exit', 'quit']:
        break

    # Fetch the top-k chunks most similar to the question
    retrieved_docs = retriever.invoke(query)

    # print("\n🔍 Retrieved Context:")
    # for i, doc in enumerate(retrieved_docs):
    #     print(f"\n--- Chunk {i+1} ---\n{doc.page_content[:1000]}")

    context = "\n\n".join(doc.page_content for doc in retrieved_docs)

    prompt = (
        "You are an expert assistant helping users understand the EU AI Act.\n"
        "Using only the context below, provide a clear and complete answer to the question.\n"
        "Summarize the answer in your own words. If needed, include key phrases or quotes.\n"
        "Do not copy full legal clauses verbatim unless specifically asked.\n"
        "If the context does not contain the answer, say 'Not found in the provided context.'\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n\nAnswer:"
    )

    response = llm.invoke(prompt)
    print("\n🧠 Answer:", response)
