# 🧩 **LangChain Expression Language (LCEL) — Deep Dive**

## 🎓 **What Is LCEL**

* LCEL (**LangChain Expression Language**) is a **declarative syntax** for **building data-processing chains**.
* It is modeled on the Unix shell **pipe**, where the output of one command becomes the input of the next.
* In Python, LCEL gets this behavior through **operator overloading**: every component implements `__or__` (and `__ror__`), so `a | b` builds a chain instead of computing a bitwise OR.
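
To make the operator-overloading point concrete, here is a minimal sketch in plain Python. The `Step` class is hypothetical and only records the order in which objects are piped; it is not how LangChain implements its components.

```python
class Step:
    """Hypothetical class whose instances record the order they were piped in."""

    def __init__(self, *names):
        self.names = list(names)

    def __or__(self, other):
        # Python calls this for `self | other`; the class decides what `|` means.
        return Step(*self.names, *other.names)


# For integers, `|` keeps its built-in meaning: bitwise OR.
print(6 | 3)  # 7

# For Step objects, `|` now means "append to the pipeline".
pipeline = Step("prompt") | Step("model") | Step("parser")
print(pipeline.names)  # ['prompt', 'model', 'parser']
```

The same mechanism lets a library return a chain object from `|` instead of a number.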

---

## ⚙️ **First Practical Example**

### 📌 **Goal**

Build a **simple chain**:

1. Prompt →
2. Chat model (OpenAI) →
3. Output parser.

---

### ✅ **Example Notebook**

* File: `00_lcel.ipynb`
* Contains every step needed to reproduce the basic chain.

---

### 🗂️ **Imported Components**

* `ChatOpenAI` → the chat model instance.
* `ChatPromptTemplate` → defines the **dynamic prompt**.
* `StrOutputParser` → converts the model output into a plain string.
* The OpenAI **API key** is read from an environment variable (`OPENAI_API_KEY`, loaded here with `load_dotenv()`).

---

## ✏️ **Creating the Prompt**

```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "Tell me a short joke about {topic}."
)
```

* **Variable:** `topic`
* Example value: `ice cream`

---

## 🔗 **Creating the Chain**

```python
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI()  # reads OPENAI_API_KEY from the environment
output_parser = StrOutputParser()

# `prompt` is the ChatPromptTemplate created in the previous step
chain = prompt | llm | output_parser
```

* `|` is the overloaded pipe operator.
* **Order:** prompt → model → parser

---

## ⚙️ **Running the Chain**

```python
response = chain.invoke({"topic": "ice cream"})
print(response)
```

* **Example output** (the model's answer varies between runs):
  *“Why did the ice cream truck break down? Because it had too many scoops on board!”*


In [1]:
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
import os

In [3]:
prompt = ChatPromptTemplate.from_template("tell me a short jike about {topic}")
print(prompt)

input_variables=['topic'] input_types={} partial_variables={} messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['topic'], input_types={}, partial_variables={}, template='tell me a short jike about {topic}'), additional_kwargs={})]


In [4]:
model = ChatOpenAI()
output_parser = StrOutputParser()

chain = prompt | model | output_parser

chain.invoke({"topic": "ice cream"})

'Why did the ice cream cone go to therapy?\nBecause it wanted to find its true self-scoop!'

---

## 🧩 **How It Works Under the Hood**

1. **Overloading the pipe operator (`|`)**

   * LangChain defines a **`Runnable` interface**.
   * The prompt, the model, and the parser are all `Runnable` objects.
   * The pipe builds a **sequence of Runnables** (a `RunnableSequence`) chained together.

2. **The `invoke` method**

   * Every `Runnable` has `invoke()`.
   * You can invoke:

     * The **whole chain**.
     * Each individual step.
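
The idea that each step can be invoked alone, and that the chain is just those calls in order, can be sketched without any API key. Everything below is a hypothetical stand-in (`FakeMessage`, `fake_prompt`, `fake_model`, `fake_parser`), chosen only to mirror the shape of the data passed between steps.

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for HumanMessage / AIMessage."""
    role: str
    content: str


def fake_prompt(inputs: dict) -> list[FakeMessage]:
    # Plays the role of ChatPromptTemplate: dict in, list of messages out.
    return [FakeMessage("human", f"Tell me a short joke about {inputs['topic']}.")]


def fake_model(messages: list[FakeMessage]) -> FakeMessage:
    # Plays the role of the chat model: messages in, one AI message out.
    return FakeMessage("ai", f"(a joke answering: {messages[-1].content})")


def fake_parser(message: FakeMessage) -> str:
    # Plays the role of StrOutputParser: keeps only the text.
    return message.content


# Invoking each step by hand...
messages = fake_prompt({"topic": "ice cream"})
ai_message = fake_model(messages)
text = fake_parser(ai_message)


# ...gives the same result as running the steps as one sequence.
def run_chain(inputs, steps=(fake_prompt, fake_model, fake_parser)):
    value = inputs
    for step in steps:
        value = step(value)
    return value


assert run_chain({"topic": "ice cream"}) == text
print(text)
```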

---

### 🔍 **Internal Example**

* When invoked, the **prompt** returns a `ChatPromptValue` that wraps the formatted messages:

  ```python
  ChatPromptValue(messages=[HumanMessage(content="Tell me a short joke about ice cream.")])
  ```

* When invoked, the **model** returns a single `AIMessage`:

  ```python
  AIMessage(content="Why did the ice cream truck...")
  ```

In [6]:
print(type(prompt))

<class 'langchain_core.prompts.chat.ChatPromptTemplate'>


In [5]:
prompt.invoke({"topic": "ice cream"})

ChatPromptValue(messages=[HumanMessage(content='tell me a short jike about ice cream', additional_kwargs={}, response_metadata={})])

In [7]:
from langchain_core.messages.human import HumanMessage

messages = [HumanMessage(content="tell me a short joke about ice cream")]
model.invoke(messages)

AIMessage(content='Why did the ice cream truck break down?\nIt had too many "licks" and not enough "cylinders"!', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 15, 'total_tokens': 41, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-Bt8MQtcd9Nktn7YmxKpG3xUdyBxiB', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='run--3bf5362f-0589-4c0e-b73d-721ed43e4c9d-0', usage_metadata={'input_tokens': 15, 'output_tokens': 26, 'total_tokens': 41, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

---

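## ⚡ **Class Inheritance**

* `ChatOpenAI` → `BaseChatModel` → `BaseLanguageModel` → `RunnableSerializable` → `Runnable` (intermediate classes may differ between library versions).
* Because every step ultimately inherits from `Runnable`, they all share the same execution logic: `invoke()` and `|`.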

---

## ✅ **Key Takeaway**

| 📝 | **LCEL lets you combine modular building blocks (`Runnables`) into chains and run them in sequence, in a simple and readable way.** |
| -- | --------------------------------------------------------------------------------------------------------------------------------- |

---

## ▶️ **Next Step**

In the next module:

* **You will build your own version** of an LCEL expression.
* You will learn to define custom operators and dynamic pipes.

---

# 🧠 **Building a Custom LangChain Expression Language (LCEL)**

## 🧩 Goal

Build a **simplified, custom version** of LCEL's behavior, that is, the **pipeline logic (`|`)** LangChain uses to chain operations.

---

## 🧱 **Key Concepts**

* In Python, `|` is by default the **bitwise OR** operator; it has no built-in pipe meaning like in a Unix shell.
* LangChain **overloads `|`** (through `__or__` and `__ror__`) to connect components (prompts, models, parsers).
* To replicate this, we create a **base class `CRunnable`** and a **`CRunnableSequence` class**.

---

## 🔧 **Step 1: Create the Base Class `CRunnable`**

In [8]:
from abc import ABC, abstractmethod

class CRunnable(ABC):
    def __init__(self):
        self.next = None

    # decorator from the abc module:
    # it marks process() as a method that every subclass
    # is required to implement
    @abstractmethod
    def process(self, data):
        """
        This method must be implemented by subclasses to define 
        data processing behavior.
        """
        pass

    def invoke(self, data):
        processed_data = self.process(data)
        if self.next is not None:
            return self.next.invoke(processed_data)
        return processed_data
    
    # special method that overloads the | (pipe) operator:
    # a | b --> a.__or__(b)
    # it lets CRunnable support the syntax obj1 | obj2,
    # which returns a CRunnableSequence containing both
    def __or__(self, other):
        return CRunnableSequence(self, other)
    

class CRunnableSequence(CRunnable):
    def __init__(self, first, second):
        super().__init__()
        self.first = first
        self.second = second

    def process(self, data):
        return data
    
    def invoke(self, data):
        first_result = self.first.invoke(data)
        return self.second.invoke(first_result)

---

## 🔁 **Step 2: Build the `CRunnableSequence` Class**

```python
class CRunnableSequence(CRunnable):
    def __init__(self, first, second):
        super().__init__()
        self.first = first
        self.second = second

    def invoke(self, data):
        # Run the first runnable
        result1 = self.first.invoke(data)
        # Pass its result to the second runnable
        result2 = self.second.invoke(result1)
        return result2

    def process(self, data):
        # A sequence has no transformation of its own: process() is a no-op
        return data
```
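
Because `|` is left-associative, a chain of three steps is really a sequence wrapped inside another sequence. The sketch below restates the two classes in condensed form (without the unused `next` attribute) and adds a hypothetical `Add` step, purely to make the nesting visible.

```python
from abc import ABC, abstractmethod


class CRunnable(ABC):
    """Condensed version of the base class above."""

    @abstractmethod
    def process(self, data):
        ...

    def invoke(self, data):
        return self.process(data)

    def __or__(self, other):
        return CRunnableSequence(self, other)


class CRunnableSequence(CRunnable):
    def __init__(self, first, second):
        self.first = first
        self.second = second

    def process(self, data):
        return data

    def invoke(self, data):
        return self.second.invoke(self.first.invoke(data))


class Add(CRunnable):
    """Hypothetical step used only for this illustration."""

    def __init__(self, amount):
        self.amount = amount

    def process(self, data):
        return data + self.amount


a, b, c = Add(1), Add(10), Add(100)
chain = a | b | c

# (a | b) is built first, then that sequence is piped into c.
assert isinstance(chain.first, CRunnableSequence)
assert chain.first.first is a and chain.first.second is b
assert chain.second is c

print(chain.invoke(0))  # 111
```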

---

## ⚙️ **Step 3: Define Custom Runnables**

### ➕ Add 10

```python
class AddTen(CRunnable):
    def process(self, data):
        return data + 10
```

### ✖️ Multiply by 2

```python
class MultiplyByTwo(CRunnable):
    def process(self, data):
        return data * 2
```

### 🔤 Convert to String

```python
class ToString(CRunnable):
    def process(self, data):
        return str(data)
```

In [9]:
class AddTen(CRunnable):
    def process(self, data):
        print("AddTen: ", data)
        return data +10
    

class MultiplyByTwo(CRunnable):
    def process(self, data):
        print("Multiply by 2: ", data)
        return data * 2
    

class ConvertToString(CRunnable):
    def process(self, data):
        print("Convert to string: ", data)
        return f"Result: {data}"

In [10]:
a = AddTen()
b = MultiplyByTwo()
c = ConvertToString()

chain = a | b | c

In [11]:
result = chain.invoke(10)

print(result)

AddTen:  10
Multiply by 2:  20
Convert to string:  40
Result: 40


---

## 🔗 **Step 4: Compose the Chain**

```python
chain = AddTen() | MultiplyByTwo() | ToString()
```

* `|` builds a runnable sequence connected by pipes, as in LangChain.
* Internally it runs:

  1. `data + 10`
  2. `* 2`
  3. `str()`

---

## ▶️ **Running the Chain**

```python
result = chain.invoke(10)
print(result)  # prints: 40 (result is the string "40")
```

🧠 **Step by step:**

* 10 + 10 = **20**
* 20 × 2 = **40**
* str(40) = **"40"**

---

## ✅ **What We Learned**

| Component   | Role                                              |
| ----------- | ------------------------------------------------- |
| `CRunnable` | Base interface for every block in the chain       |
| `__or__()`  | Overloads `\|` to create pipes between runnables  |
| `invoke()`  | Public method that runs the pipeline              |
| `process()` | Method each step overrides with its own logic     |

---

## 🏆 **Conclusion**

You have built a custom **mini LCEL engine** in Python!
With this approach you can chain any kind of transformation in a clean, readable way.

---

## 🔜 **Next Module**

In the next module we look at the **most important runnables LangChain provides**, such as:

* `RunnableLambda`
* `RunnablePassthrough`
* `RunnableMap` (an alias of `RunnableParallel`)
* `RunnableSequence` (used internally by `|`)


# 🧠 LangChain: The Most Important Runnables (Deep Dive)

## 📚 What Is a Runnable?

A `Runnable` is a **composable, executable unit** in LangChain. Every `Runnable` exposes a standard interface (`invoke`, `stream`, `batch`, and so on) and works with the **`|` operator** to build pipelines.

---

## 🔁 `RunnablePassthrough`

### 🔹 Description

A **Runnable that leaves its input unchanged**: it simply hands it to the next step.

### 🔸 Typical use

Useful for debugging, or inside a `RunnableParallel` to forward the original input alongside computed values.

In [12]:
from langchain_core.runnables import RunnablePassthrough

chain = RunnablePassthrough() | RunnablePassthrough() | RunnablePassthrough()

chain.invoke("hello")

'hello'

---

## 🧮 `RunnableLambda`

### 🔹 Description

Lets you **plug an arbitrary Python function** into a `Runnable` chain.

### 🔸 Example

In [13]:
def input_to_upper(input: str):
    output = input.upper()

    return output

In [14]:
from langchain_core.runnables import RunnableLambda

chain = RunnablePassthrough() | RunnableLambda(input_to_upper) | RunnablePassthrough()
chain.invoke("hello")

'HELLO'
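
To see why wrapping a function is enough to make it chainable, here is how an equivalent adapter might look in the mini engine from the custom-LCEL section. `CRunnableLambda` is a hypothetical name and this is only a sketch of the idea, not LangChain's `RunnableLambda` implementation.

```python
from abc import ABC, abstractmethod


class CRunnable(ABC):
    """Condensed version of the base class from the custom-LCEL section."""

    @abstractmethod
    def process(self, data):
        ...

    def invoke(self, data):
        return self.process(data)

    def __or__(self, other):
        return CRunnableSequence(self, other)


class CRunnableSequence(CRunnable):
    def __init__(self, first, second):
        self.first = first
        self.second = second

    def process(self, data):
        return data

    def invoke(self, data):
        return self.second.invoke(self.first.invoke(data))


class CRunnableLambda(CRunnable):
    """Hypothetical adapter: turns any one-argument function into a step."""

    def __init__(self, func):
        self.func = func

    def process(self, data):
        return self.func(data)


# Two ordinary functions become pipeline steps without writing new subclasses.
chain = CRunnableLambda(str.split) | CRunnableLambda(len)
result = chain.invoke("the dog loves pizza")
print(result)  # 4
```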

---

## 🌿 `RunnableParallel`

### 🔹 Description

Runs several Runnables **on the same input** and collects the results in a dictionary, one key per Runnable. With `invoke`, the branches run concurrently in a thread pool.
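
The fan-out behavior can be sketched with the standard library alone. `run_parallel` is a hypothetical helper written for this illustration; it mimics the observable behavior (same input to every branch, results keyed by name) and makes no claim about LangChain's internals.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(steps: dict, value):
    """Call every function in `steps` with the same `value`; return {key: result}."""
    with ThreadPoolExecutor() as pool:
        futures = {key: pool.submit(func, value) for key, func in steps.items()}
        return {key: future.result() for key, future in futures.items()}


result = run_parallel(
    {"x": lambda v: v, "y": lambda v: v["input2"]},
    {"input": "hello", "input2": "goodbye"},
)
print(result)
# {'x': {'input': 'hello', 'input2': 'goodbye'}, 'y': 'goodbye'}
```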

### 🔸 Basic usage

In [15]:
from langchain_core.runnables import RunnableParallel

chain = RunnableParallel({"x": RunnablePassthrough(), "y": RunnablePassthrough()})

chain.invoke("hello")

{'x': 'hello', 'y': 'hello'}

In [16]:
chain.invoke({"input": "hello", "input2": "goodbye"})

{'x': {'input': 'hello', 'input2': 'goodbye'},
 'y': {'input': 'hello', 'input2': 'goodbye'}}

In [17]:
chain = RunnableParallel({"x": RunnablePassthrough(), "y": lambda z: z["input2"]})

chain.invoke({"input": "hello", "input2": "goodbye"})

{'x': {'input': 'hello', 'input2': 'goodbye'}, 'y': 'goodbye'}

## Nested Chains

In [18]:
def find_keys_to_uppercase(input: dict):
    output = input.get("input", "not found").upper()
    return output

In [19]:
chain = RunnableParallel({"x": RunnablePassthrough() | RunnableLambda(find_keys_to_uppercase), 
                          "y": lambda z: z["input2"]})


chain.invoke({"input": "hello", "input2": "goodbye"})

{'x': 'HELLO', 'y': 'goodbye'}

In [21]:
chain = RunnableParallel({"x": RunnablePassthrough()})

def assign_func(_):
    return 100

def multiply(input):
    return input * 10

In [22]:
chain.invoke({"input": "hello", "input2": "goodbye"})

{'x': {'input': 'hello', 'input2': 'goodbye'}}

In [23]:
chain = RunnableParallel({"x": RunnablePassthrough()}).assign(extra=RunnableLambda(assign_func))

result = chain.invoke({"input": "hello", "input2": "goodbye"})
print(result)

{'x': {'input': 'hello', 'input2': 'goodbye'}, 'extra': 100}
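
What `.assign()` adds to the result can be sketched in plain Python: the extra steps receive the dictionary produced so far, and their results are merged into it. The `assign` function below is a hypothetical helper for illustration only.

```python
def assign(base_step, **extra_steps):
    """Return a step whose output is base_step's dict plus one key per extra step."""

    def step(value):
        base = base_step(value)
        # Each extra step receives the dict produced so far, not the raw input.
        extras = {key: func(base) for key, func in extra_steps.items()}
        # Later keys win, so an extra step can also overwrite an existing key.
        return {**base, **extras}

    return step


def base(value):
    return {"x": value}


chain_sketch = assign(base, extra=lambda _: 100)
output = chain_sketch({"input": "hello", "input2": "goodbye"})
print(output)
# {'x': {'input': 'hello', 'input2': 'goodbye'}, 'extra': 100}
```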


## Combine multiple chains

In [24]:
def extractor(input: dict):
    return input.get("extra", "Key not found")


def cupper(upper: str):
    return str(upper).upper()

new_chain = RunnableLambda(extractor) | RunnableLambda(cupper)

In [25]:
new_chain.invoke({"extra": "test"})

'TEST'

In [26]:
final_chain = chain | new_chain

final_chain.invoke({"input": "hello", "input2": "goodbye"})

'100'

---

## ✅ Key Points

| Runnable              | Main purpose                                                                      |
| --------------------- | --------------------------------------------------------------------------------- |
| `RunnablePassthrough` | Forwards the input unchanged                                                      |
| `RunnableLambda`      | Wraps arbitrary Python functions                                                  |
| `RunnableParallel`    | Produces several outputs from the same input                                      |
| `RunnableAssign`      | Adds or overwrites keys in the input dictionary (this is what `.assign()` builds) |

---

## 🔜 Next Lesson

In the next module:
➡️ **Real-world examples** of RAG pipelines with LangChain, combining `LCEL`, retrieval, parsing, and evaluation.

# 🚀 **Real-World RAG Pipeline with LangChain and LCEL**

## 🧠 Lesson Goal

Build a real **RAG pipeline** using:

* Dynamic prompts
* The `ChatOpenAI` model
* A retriever backed by `Chroma`
* Embeddings from `OpenAIEmbeddings`
* An output parser
* LCEL (LangChain Expression Language)

---

## 🔐 Initial Setup

In [27]:
from langchain_core.prompts import ChatPromptTemplate 

import os
from dotenv import load_dotenv
load_dotenv()

True

In [28]:
prompt = ChatPromptTemplate.from_template("Tell me an interesting fact about {topic}")

prompt_val = prompt.invoke({"topic": "dog"})

print(prompt_val)

messages=[HumanMessage(content='Tell me an interesting fact about dog', additional_kwargs={}, response_metadata={})]


In [29]:
print(prompt_val.to_messages())

[HumanMessage(content='Tell me an interesting fact about dog', additional_kwargs={}, response_metadata={})]


In [30]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-3.5-turbo")

result = model.invoke(prompt_val)
result

AIMessage(content='Dogs have an incredible sense of smell, with some breeds being able to detect scents up to 100,000 times better than humans. This ability has led to them being used in a variety of jobs such as search and rescue, bomb detection, and even medical scent detection to detect illnesses like cancer.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 62, 'prompt_tokens': 14, 'total_tokens': 76, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-BtAszOhBoGxnOFleDI0EIw04RGc7G', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='run--e855543a-5000-48d8-b356-55558e79e7fc-0', usage_metadata={'input_tokens': 14, 'output_tokens': 62, 'total_tokens': 76, 'input_token_details': {'audio

In [31]:
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

output_parser.invoke(result)

'Dogs have an incredible sense of smell, with some breeds being able to detect scents up to 100,000 times better than humans. This ability has led to them being used in a variety of jobs such as search and rescue, bomb detection, and even medical scent detection to detect illnesses like cancer.'

## Now let's do this with LCEL

In [32]:
prompt = ChatPromptTemplate.from_template("Tell me an interesting fact about {topic}")

model = ChatOpenAI()

output_parser = StrOutputParser()

basicchain = model | output_parser 

In [33]:
basicchain.invoke("hello!")

'Hello! How can I assist you today?'

In [34]:
chain = prompt | model | output_parser

chain.invoke({"topic": "dog"})

'Dogs have an incredible sense of smell, with some breeds being able to detect scents at concentrations as low as one part per trillion. This makes them invaluable in search and rescue operations, detecting drugs and explosives, and even diagnosing medical conditions such as cancer in humans.'

---

## 🧱 Building the RAG Prompt

```python
prompt = ChatPromptTemplate.from_template(
    "Answer the question based only on the context below.\n\nContext:\n{context}\n\nQuestion:\n{question}"
)
```

* **Required variables**: `context`, `question`
* 🧩 The prompt instructs the model to answer from the **retrieved context only**, not from its internal knowledge. This is an instruction, not a guarantee.

---

## 📥 Inputs and Outputs

| Step            | Input                               | Output                                     |
| --------------- | ----------------------------------- | ------------------------------------------ |
| Prompt Template | `{"context": ..., "question": ...}` | `ChatPromptValue` (wraps a `HumanMessage`) |
| ChatOpenAI      | `ChatPromptValue` or message list   | An `AIMessage`                             |
| StrOutputParser | An `AIMessage`                      | `str` (text only)                          |

---

## 📚 Embeddings and Vector Store

```python
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

documents = [
    Document(page_content="Dogs love to eat meat and kibble."),
    Document(page_content="Cats enjoy eating fish and chicken.")
]

# Embed each document and index it in a Chroma collection
vectorstore = Chroma.from_documents(
    documents=documents,
    embedding=OpenAIEmbeddings()
)

retriever = vectorstore.as_retriever()
```

* The documents are **embedded** and stored in a local Chroma collection.
* `retriever.invoke(query)` returns the documents most similar to the query.
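
What "most similar" means can be shown with a toy example. The sketch below assumes a word-count "embedding" and cosine similarity; real embedding models produce dense vectors and Chroma uses its own index, so treat this only as an illustration of the ranking idea. All function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. Real embeddings are dense vectors from a model."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents whose toy embedding is closest to the query's."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:k]


docs = ["the dog loves to eat pizza", "the cat loves to eat lasagna"]
top = retrieve("What does the dog like to eat?", docs)
print(top)  # ['the dog loves to eat pizza']
```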

In [36]:
from langchain.schema import Document
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma 
from langchain_core.runnables import RunnablePassthrough


embedding_function = OpenAIEmbeddings()

docs = [
    Document(
        page_content="the dog loves to eat pizza", metadata={"source": "animal.txt"}
    ),
    Document(
        page_content="the cat loves to eat lasagna", metadata={"source": "animal.txt"}
    )
]


db = Chroma.from_documents(docs, embedding_function)
retriever = db.as_retriever()

In [37]:
# get_relevant_documents() is deprecated in recent LangChain releases; prefer retriever.invoke() (next cell)
retriever.get_relevant_documents("What does the dog want to eat?")


[Document(metadata={'source': 'animal.txt'}, page_content='the dog loves to eat pizza'),
 Document(metadata={'source': 'animal.txt'}, page_content='the cat loves to eat lasagna')]

In [38]:
retriever.invoke("What does the dog want to eat?")

[Document(metadata={'source': 'animal.txt'}, page_content='the dog loves to eat pizza'),
 Document(metadata={'source': 'animal.txt'}, page_content='the cat loves to eat lasagna')]

In [39]:
# a simple prompt for RAG
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI()

In [40]:
from operator import itemgetter

retrieval_chain = (
    {
        "context": (lambda x: x["question"]) | retriever,
        # "question": lambda x: x["question"]
        "question": itemgetter("question")
    }
    | prompt
    | model
    | StrOutputParser()
)

In [41]:
retrieval_chain.invoke({"question": "What does the dog like to eat?"})

'The dog loves to eat pizza.'

In [42]:
# the same chain can also be written like this,
# taking the input as a plain string

retrieval_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

In [43]:
retrieval_chain.invoke("What does the dog like to eat?")

'The dog loves to eat pizza.'

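✅ More readable
✅ Same behavior (provided the chain is invoked with a `str`): the dict on the left of `|` is coerced into a `RunnableParallel`, so the input string reaches both the retriever and `RunnablePassthrough`.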
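
That the two variants hand the prompt the same inputs can be checked with stand-ins. `fake_retriever` and `build_inputs` below are hypothetical; `build_inputs` only imitates the "call every branch with the same input" behavior of a dict of steps.

```python
from operator import itemgetter


def fake_retriever(query: str) -> list[str]:
    """Hypothetical stand-in for the Chroma retriever."""
    return [f"document matching: {query}"]


def build_inputs(steps: dict, value) -> dict:
    # Imitates a dict of steps: call each one with the same input.
    return {key: step(value) for key, step in steps.items()}


question = "What does the dog like to eat?"

# Variant 1: the chain input is a dict, so each branch must pick out "question".
from_dict = build_inputs(
    {
        "context": lambda x: fake_retriever(x["question"]),
        "question": itemgetter("question"),
    },
    {"question": question},
)

# Variant 2: the chain input is the question string itself.
from_str = build_inputs(
    {"context": fake_retriever, "question": lambda q: q},
    question,
)

assert from_dict == from_str
print(from_str)
```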

---

## 🧪 Running It

```python
response = retrieval_chain.invoke("What do dogs like to eat?")
print(response)
# Output: an answer generated from the retrieved document context
```

---

## 💡 Key Concepts

| Concept                 | Explanation                                             |
| ----------------------- | ------------------------------------------------------- |
| **Retriever**           | Embeds the query and returns the most similar documents |
| **Prompt Template**     | Uses `{context}` and `{question}` as placeholders       |
| **StrOutputParser**     | Extracts the text content of the AI message             |
| **RunnableLambda**      | Wraps a function to transform the input along the way   |
| **RunnablePassthrough** | Forwards the input unchanged (e.g. `question`)          |
| **LCEL Pipeline**       | Connects components with `\|`                           |

---

## 📦 Simplified Diagram

```
Input (string: the question)
    │
    ├── context ──> Retriever ──┐
    │                          │
    └── question ──────────────┘
           │
        Prompt Template
           │
        ChatOpenAI
           │
     StrOutputParser
           │
         Answer
```

---

## 🔜 Next Step

📌 **Adding chat history** to RAG chains:

* Take the **conversational context** into account
* Improve the experience of real-world chatbots

# 💬 **Managing Chat History in LangChain (with RAG + LCEL)**

## ❓ The Problem

* In **multi-turn chats**, the user may ask **context-dependent questions**:

  > User: What do dogs like to eat?
  > Bot: Dogs love pizza.
  > User: Really?

* The question `"Really?"` is meaningless on its own.

* If we run a **vector search** on `"Really?"` alone, the query carries no information about dogs or food, so the documents that come back are **not reliably relevant** → wrong answer.
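
The effect is easy to reproduce with a toy relevance score. The sketch assumes a simple word-overlap measure (hypothetical `overlap_score`), which is far cruder than embedding similarity but shows the same failure: the bare follow-up shares nothing with any document, while a standalone rewrite does.

```python
def overlap_score(query: str, document: str) -> int:
    """Toy relevance score: number of distinct words shared by query and document."""

    def words(text: str) -> set[str]:
        return set(text.lower().replace("?", "").replace(",", "").split())

    return len(words(query) & words(document))


docs = ["the dog loves to eat pizza", "the cat loves to eat lasagna"]

follow_up = "Really?"
standalone = "Does the dog really love to eat pizza?"

follow_up_scores = [overlap_score(follow_up, d) for d in docs]
standalone_scores = [overlap_score(standalone, d) for d in docs]

print(follow_up_scores)   # [0, 0]
print(standalone_scores)  # [5, 3]
```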

In [44]:
from langchain.schema import Document
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
import os


embedding_function = OpenAIEmbeddings()

docs = [
    Document(
        page_content="the dog loves to eat pizza", metadata={"source": "animal.txt"}
    ),
    Document(
        page_content="the cat loves to eat lasagna", metadata={"source": "animal.txt"}
    )
]


db = Chroma.from_documents(docs, embedding_function)


retriever = db.as_retriever()

In [45]:
retriever.invoke("What exactly?")

[Document(metadata={'source': 'animal.txt'}, page_content='the dog loves to eat pizza'),
 Document(metadata={'source': 'animal.txt'}, page_content='the dog loves to eat pizza'),
 Document(metadata={'source': 'animal.txt'}, page_content='the cat loves to eat lasagna'),
 Document(metadata={'source': 'animal.txt'}, page_content='the cat loves to eat lasagna')]

Each document appears twice here, most likely because `Chroma.from_documents` was run twice in the same session (cells `In [36]` and `In [44]`) against the same default collection, so both documents were indexed a second time.

---

## 🧠 Solution: **Rewriting the Question**

👉 An LLM is used to **rewrite the follow-up** as a standalone (self-contained) question before retrieval.

### 📘 Rewriting prompt


In [46]:
from langchain.prompts.prompt import PromptTemplate

rephrase_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
{chat_history}

Follow Up Input: {question}
Standalone question:
"""

REPHRASE_TEMPLATE = PromptTemplate.from_template(rephrase_template)
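
To see what the model actually receives, the template can be filled in by hand with plain `str.format`. The `format_history` helper is hypothetical: the notebook passes the message objects directly and lets the template stringify them, whereas this sketch renders one readable line per turn.

```python
rephrase_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
{chat_history}

Follow Up Input: {question}
Standalone question:
"""


def format_history(turns: list[tuple[str, str]]) -> str:
    """Hypothetical helper: render (role, text) pairs as one line per turn."""
    return "\n".join(f"{role}: {text}" for role, text in turns)


history = [("Human", "What does the dog like to eat?"), ("AI", "Tuna!")]

filled = rephrase_template.format(
    chat_history=format_history(history),
    question="No, really?",
)
print(filled)
```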

In [47]:
from langchain_core.messages import AIMessage, HumanMessage
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

rephrase_chain = REPHRASE_TEMPLATE | ChatOpenAI(temperature=0) | StrOutputParser()

In [48]:
rephrase_chain.invoke(
    {
        "question": "No, really?",
        "chat_history": [
            HumanMessage(content="What does the dog like to eat?"),
            AIMessage(content="Tuna!")
        ]
    }
)

'Is that really what the dog likes to eat?'


## 🧱 Retrieval Chain (RAG)

### RAG Prompt

---

## 🔗 Final Combination: Rewriting + Retrieval + Generation

In [49]:
from langchain_core.prompts import ChatPromptTemplate

template = """Answer the question based only on the following context:
{context}

Question: {question}
"""


ANSWER_PROMPT = ChatPromptTemplate.from_template(template)

In [50]:
from langchain_core.runnables import RunnablePassthrough

retrieval_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | ANSWER_PROMPT
    | ChatOpenAI(temperature=0)
    | StrOutputParser()
)

In [51]:
final_chain = rephrase_chain | retrieval_chain

In [52]:
final_chain.invoke(
    {
        "question": "No, really?",
        "chat_history": [
            HumanMessage(content="What does the dog like to eat?"),
            AIMessage(content="Tuna!")
        ]
    }
)

'Yes, based on the provided context, it is stated that the dog loves to eat pizza.'

---

## 🧪 Chat with returning documents

In [54]:
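# Step 1: fetch documents for the standalone question and keep the question itself
retrieved_documents = {"docs": retriever, "question": RunnablePassthrough()}

# Step 2: build the prompt inputs from the dict produced by step 1
final_inputs = {
    "context": lambda x: "\n".join(doc.page_content for doc in x["docs"]),
    # note: RunnablePassthrough() forwards the whole step-1 dict, not just x["question"]
    "question": RunnablePassthrough()
}

# Step 3: return the answer together with what was retrieved
answer = {
    "answer": final_inputs | ANSWER_PROMPT | ChatOpenAI() | StrOutputParser(),
    # again the whole step-1 dict, which is why "docs" is nested in the output
    "docs": RunnablePassthrough()
}

final_chain = rephrase_chain | retrieved_documents | answer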

In [55]:
result = final_chain.invoke(
    {
        "question": "No, really?",
        "chat_history": [
            HumanMessage(content="What does the dog like to eat?"),
            AIMessage(content="Tuna!")
        ],
    }
)



print(result)

{'answer': 'Yes, the dog loves to eat pizza.', 'docs': {'docs': [Document(metadata={'source': 'animal.txt'}, page_content='the dog loves to eat pizza'), Document(metadata={'source': 'animal.txt'}, page_content='the dog loves to eat pizza'), Document(metadata={'source': 'animal.txt'}, page_content='the cat loves to eat lasagna'), Document(metadata={'source': 'animal.txt'}, page_content='the cat loves to eat lasagna')], 'question': 'Is that really what the dog likes to eat?'}}
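
The nested `'docs': {'docs': [...], 'question': ...}` in this output follows from forwarding the whole input dict under a key. The plain-Python sketch below, with hypothetical `fan_out` and `passthrough` stand-ins, contrasts that with picking individual keys via `itemgetter`, which the earlier retrieval chain already uses for `question`.

```python
from operator import itemgetter


def fan_out(steps: dict, value) -> dict:
    # Imitates a dict of steps: every branch receives the same input.
    return {key: step(value) for key, step in steps.items()}


def passthrough(value):
    # Imitates RunnablePassthrough: returns its input unchanged.
    return value


step1_output = {
    "docs": ["the dog loves to eat pizza"],
    "question": "Is that really what the dog likes to eat?",
}

# Forwarding the whole dict nests it under each key.
nested = fan_out({"docs": passthrough, "question": passthrough}, step1_output)

# Selecting one key per branch keeps the structure flat.
flat = fan_out(
    {"docs": itemgetter("docs"), "question": itemgetter("question")},
    step1_output,
)

print(nested["docs"] == step1_output)  # True
print(flat == step1_output)            # True
```

Under the same assumption, swapping `RunnablePassthrough()` for `itemgetter("docs")` and `itemgetter("question")` in the chain above should yield a flat `docs` list, though the recorded output in this notebook was produced with the passthrough version.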


---

## 🧩 Flow in Detail

```mermaid
graph LR
A[chat_history + question] --> B[Rephrase Chain]
B --> C[Rephrased Question]
C --> D[Retriever]
C --> E[Prompt RAG]
D --> F[Context]
F --> E
E --> G[ChatOpenAI]
G --> H[StrOutputParser]
H --> I[Final Answer]
```

---

## ✅ Benefits of Rewriting

| Problem               | What the pipeline offers                |
| --------------------- | --------------------------------------- |
| Ambiguous questions   | Rewritten into self-contained questions |
| Ineffective retrieval | Better semantic similarity              |
| LLM hallucinations    | RAG prompt grounded in the context only |
| Debugging and tracing | Structured output: answer + documents   |

---

## 📌 Conclusion

* Using the **chat history** gives retrieval the context that follow-up questions lack.
* Rewriting + Retrieval + Generation = a complete pipeline for multi-turn RAG.
* LangChain and LCEL let you express this logic in a modular, readable way.

---

## 🧪 Practice Tip

> Edit the documents, change the question, and inspect both the retrieved documents and the generated answer.
> 🔄 The more you experiment, the better you will understand how LCEL works!
