# 🔍 LLM-based Document Compression

## 🎯 Goal

Use an **LLM as an intelligent filter** to decide **which retrieved documents are actually relevant** to a question, which reduces the amount of context the LLM has to process in the final generation step.

---

## 📦 What is a Document Compressor?

An LLM-based **compressor**:

* Evaluates the retrieved documents **one by one**
* Checks whether the content is **useful, accurate, relevant, and complete** with respect to the question
* Returns `True` if the document is relevant, otherwise `False`

💡 This approach goes **further** than plain Top-K retrieval or reranking: instead of ordering documents by a score and keeping a fixed number, it makes an explicit keep-or-discard decision based on **each document's full semantic relevance** to the question.
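
The keep-or-discard idea can be sketched without any LLM at all. In the minimal example below, `compress_documents` and `keyword_judge` are hypothetical names introduced only for illustration, and the keyword check is a crude stand-in for the model's judgment; in the real pipeline the judge would be an LLM call.

```python
from typing import Callable, List


def compress_documents(
    question: str,
    documents: List[str],
    judge: Callable[[str, str], bool],
) -> List[str]:
    """Keep only the documents that the judge marks as relevant."""
    return [doc for doc in documents if judge(question, doc)]


def keyword_judge(question: str, document: str) -> bool:
    """Hypothetical stand-in for the LLM: True if both texts share a long word."""
    question_words = {w.strip("?.,").lower() for w in question.split()}
    document_words = {w.strip("?.,").lower() for w in document.split()}
    # Ignore short words such as "the" or "is", which carry little meaning
    shared = {w for w in question_words & document_words if len(w) > 4}
    return bool(shared)


sample_docs = [
    "The owner is Guivanni",
    "Pizza Salami costs 10$",
    "We close the restaurant at 10p.m each day",
]
kept = compress_documents("Who is the owner?", sample_docs, keyword_judge)
print(kept)  # ['The owner is Guivanni']
```

Note that the number of documents kept is not fixed in advance: the filter may keep all of them, some, or none.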

```python
from langchain.schema import Document
from langchain_community.document_loaders import DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.vectorstores import Chroma

from dotenv import load_dotenv

# Load environment variables (e.g. OPENAI_API_KEY) from a local .env file, if one exists
load_dotenv()

# Load every .txt file under ./data
loader = DirectoryLoader("./data", glob="**/*.txt")
docs = loader.load()

# Split the documents into small, slightly overlapping chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=120,
    chunk_overlap=20,
    length_function=len,
    is_separator_regex=False
)

embedding_function = OpenAIEmbeddings()
model = ChatOpenAI()

chunks = text_splitter.split_documents(docs)

# Index the chunks in a Chroma vector store and expose it as a retriever
db = Chroma.from_documents(chunks, embedding_function)
retriever = db.as_retriever()
```



## Prepare a prompt so that the model evaluates the documents

```python
from langchain.prompts import PromptTemplate

DOCUMENT_EVALUATOR_PROMPT = PromptTemplate(
    input_variables=['document', 'question'],
    template="""You are an AI language model assistant. Your task is to evaluate the provided document to determine if it is suited to answer the given user question. Assess the document for its relevance to the question, the completeness of information, and the accuracy of the content.

    Original question: {question}
    Document for Evaluation: {document}
    Evaluation Result: <<'True' if the document is suited to answer the question, 'False' if it is not>>

    Note: Conclude with a 'True' or 'False' based on your analysis of the document's relevance, completeness, and accuracy in relation to the question.
    """
)
```

```python
# Create a list of Document objects
from langchain.schema import Document

documents = [
    Document(page_content="The owner is Guivanni"),
    Document(page_content="Pizza Salami costs 10$"),
    Document(page_content="We close the restaurant at 10p.m each day")
]

model = ChatOpenAI()

# Prompt -> model -> plain string output
compression_chain = DOCUMENT_EVALUATOR_PROMPT | model | StrOutputParser()
```

```python
# Pass the document text (page_content), not the Document object itself
compression_chain.invoke(
    {"question": "Who is the owner of the restaurant", "document": documents[0].page_content}
)
```

```text
"Evaluation Result: True\n\nThe document clearly states that the owner of the restaurant is Guivanni, which directly addresses the user's question. It is relevant, complete, and accurate in providing the requested information."
```

---

## 🔁 Dynamic function: `evaluate_documents`

```python
import re

def evaluate_documents(inputs):
    question = inputs["question"]
    docs = inputs["documents"]

    results = []
    for doc in docs:
        response = compression_chain.invoke({
            "question": question,
            "document": doc.page_content
        })
        # The model replies with text such as "Evaluation Result: True\n\n...",
        # so look for a standalone "true" instead of comparing the whole string
        is_relevant = bool(re.search(r"\btrue\b", response.lower()))
        results.append(is_relevant)

    # Keep only the documents judged relevant
    filtered_docs = [doc for doc, keep in zip(docs, results) if keep]
    return {"documents": filtered_docs}
```
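
Turning the model's free-text answer into a boolean is the fragile step here: as the `Evaluation Result: True ...` response above shows, the verdict usually arrives wrapped in an explanation. The sketch below is one possible way to make that step more robust. `parse_verdict` is a hypothetical helper, not part of LangChain, and it assumes the model follows the `Evaluation Result:` label from the prompt or, failing that, ends with its verdict.

```python
import re
from typing import Optional

# Verdict that directly follows the label requested by the prompt
_LABELLED = re.compile(r"evaluation result:\W*(true|false)\b", re.IGNORECASE)
# Any standalone true/false token
_ANY = re.compile(r"\b(true|false)\b", re.IGNORECASE)


def parse_verdict(response: str) -> Optional[bool]:
    """Return True/False for a model response, or None if no verdict is found."""
    match = _LABELLED.search(response)
    if match is not None:
        return match.group(1).lower() == "true"
    # Fall back to the last standalone true/false, since the prompt asks
    # the model to conclude with its verdict
    candidates = _ANY.findall(response)
    if not candidates:
        return None
    return candidates[-1].lower() == "true"


print(parse_verdict("Evaluation Result: True\n\nThe document clearly states..."))
print(parse_verdict("Evaluation Result: False\n\nIt would not be true to say..."))
print(parse_verdict("I cannot decide."))
```

Returning `None` for an unparseable response leaves the caller free to choose between dropping the document and keeping it as a precaution.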

```python
import re

def evaluate_documents(inputs: dict):
    documents = inputs.get("documents", [])
    question = inputs.get("question")

    DOCUMENT_EVALUATOR_PROMPT = PromptTemplate(
        input_variables=['document', 'question'],
        template="""You are an AI language model assistant. Your task is to evaluate the provided document to determine if it is suited to answer the given user question. Assess the document for its relevance to the question, the completeness of information, and the accuracy of the content.

    Original question: {question}
    Document for Evaluation: {document}
    Evaluation Result: <<'True' if the document is suited to answer the question, 'False' if it is not>>

    Note: Conclude with a 'True' or 'False' based on your analysis of the document's relevance, completeness, and accuracy in relation to the question.
    """
    )
    model = ChatOpenAI()

    compression_chain = DOCUMENT_EVALUATOR_PROMPT | model | StrOutputParser()

    results = []

    for document in documents:
        evaluation_result = compression_chain.invoke(
            {"document": document.page_content, "question": question}
        )
        # A standalone "true" anywhere in the response counts as a positive verdict
        result = bool(re.search(r"\btrue\b", evaluation_result.lower()))
        print(result)

        results.append(result)  # we end up with a list of booleans

    # Keep only the documents whose verdict is True
    filtered_documents = [doc for doc, res in zip(documents, results) if res]

    return filtered_documents
```

```python
_input = {
    "documents": [
        Document(page_content="The owner is Guivanni"),
        Document(page_content="Pizza Salami costs 10$"),
        Document(page_content="We close the restaurant at 10p.m each day")
    ],
    "question": "Who is the owner of the restaurant?"
}

results = evaluate_documents(_input)
print(results)
```

```text
True
False
False
[Document(metadata={}, page_content='The owner is Guivanni')]
```




---

## ✅ Advantages and limits of the Document Compressor

| ✅ Advantages                              | ❌ Limits                                                  |
| ----------------------------------------- | --------------------------------------------------------- |
| Better precision when filtering documents | More expensive (calls the LLM n times, once per document) |
| Removes irrelevant documents              | Slow on large volumes                                     |
| Detects uninformative content             | Requires prompt tuning                                    |
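
One way to soften the latency limit is to evaluate the documents concurrently, since each LLM call mostly waits on the network. The sketch below uses only the standard library; `filter_concurrently` and `fake_judge` are hypothetical names, and `fake_judge` merely stands in for the real LLM call. This does not reduce the number of calls or their cost, only the wall-clock time. LangChain runnables also expose a `batch` method that may serve a similar purpose.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def filter_concurrently(
    question: str,
    documents: List[str],
    judge: Callable[[str, str], bool],
    max_workers: int = 4,
) -> List[str]:
    """Evaluate documents in parallel threads and keep the relevant ones, in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so verdicts line up with documents
        verdicts = list(pool.map(lambda doc: judge(question, doc), documents))
    return [doc for doc, keep in zip(documents, verdicts) if keep]


def fake_judge(question: str, document: str) -> bool:
    """Hypothetical stand-in for an LLM call."""
    return "owner" in document.lower()


relevant = filter_concurrently(
    "Who is the owner?",
    ["The owner is Guivanni", "Pizza Salami costs 10$"],
    fake_judge,
)
print(relevant)  # ['The owner is Guivanni']
```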

---

## 🔚 Conclusion

**LLM-based compression** is a powerful tool for:

* Improving the efficiency and accuracy of a RAG pipeline
* Filtering out useless or misleading documents
* Reducing the computational load on the generation step

🔜 **Next lesson**: **Routing**: how to direct different queries to different paths in the pipeline using agents or dynamic templates.
