```{contents}
```
## Observability Hooks

**Observability hooks** are **integration points** that let you **capture signals** (logs, metrics, traces, events) from inside an LLM pipeline **at runtime**.
They allow you to *see what the system is doing while it’s doing it*.

In LangChain-based systems, observability hooks are implemented using:

* Callback handlers
* Tracing hooks
* Metrics hooks
* Streaming hooks

They are foundational to **production-grade AI systems**.

```
Runnable Execution
   ├── Observability Hooks
   │     ├── Logs
   │     ├── Metrics
   │     └── Traces
   ↓
Monitoring / Debugging / Alerting
```

---

### Why Observability Hooks Matter

Without hooks:

* You don’t know why answers are wrong
* You can’t debug latency
* Failures are invisible
* Costs are opaque

With hooks:

* Full execution visibility
* Root-cause analysis
* SLA monitoring
* Safe production rollout

---

### What Observability Hooks Capture

| Signal  | Purpose              |
| ------- | -------------------- |
| Logs    | What happened        |
| Metrics | How often / how long |
| Traces  | Where time was spent |
| Events  | Lifecycle changes    |
| Streams | Partial outputs      |

---

### Architecture View

![Image](https://last9.ghost.io/content/images/2024/12/monitoring_stack.webp)

![Image](https://canada1.discourse-cdn.com/flex007/uploads/langchain/optimized/2X/8/8107939d30e448eea2471b6d89f60f681dc20a86_2_1024x434.png)

![Image](https://framerusercontent.com/images/H9paF4dK6nmB3pyEm9263g8KY.png?height=702\&width=1200)


---

### Logging Hook (Simple Observability)

#### Logging Hook via RunnableLambda



```python
import logging
from langchain_core.runnables import RunnableLambda

logging.basicConfig(level=logging.INFO)

def log_input(x):
    # Log the input, then pass it through unchanged
    logging.info("Input received: %s", x)
    return x

chain = (
    RunnableLambda(log_input)
    | RunnableLambda(lambda x: x.upper())
)

chain.invoke("observability")
```

```
INFO:root:Input received: observability
'OBSERVABILITY'
```



**Observed**

* Input logged
* Output produced normally

This is a **manual observability hook**.
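The same idea can be packaged as a reusable wrapper, so that each step logs its input, output, and latency without repeating the logging calls. The following is a minimal sketch using only the standard library; the `observed` decorator, the `pipeline` logger name, and `to_upper` are hypothetical names chosen for illustration, not part of LangChain.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")  # hypothetical logger name


def observed(step_name):
    """Wrap a one-argument pipeline step so input, output, and latency are logged."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(x):
            logger.info("%s input: %r", step_name, x)
            start = time.perf_counter()
            try:
                result = func(x)
            except Exception:
                # Log the failure with its traceback, then let it propagate.
                logger.exception("%s failed", step_name)
                raise
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("%s output: %r (%.1f ms)", step_name, result, elapsed_ms)
            return result
        return wrapper
    return decorator


@observed("uppercase")
def to_upper(text):
    return text.upper()


result = to_upper("observability")
```

Because the decorated function is still a plain callable, it should be usable anywhere `log_input` is used above, for example as `RunnableLambda(to_upper)`.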

---

### Callback Handler Hook (Most Common)

#### Custom Observability Callback



```python
import time
from langchain_core.callbacks import BaseCallbackHandler

class ObservabilityHook(BaseCallbackHandler):
    def __init__(self):
        # Start times keyed by run ID, so nested runs don't overwrite each other
        self.starts = {}

    def on_chain_start(self, serialized, inputs, *, run_id=None, **kwargs):
        self.starts[run_id] = time.time()
        print("Chain started")

    def on_llm_end(self, response, **kwargs):
        print("LLM finished")

    def on_chain_end(self, outputs, *, run_id=None, **kwargs):
        start = self.starts.pop(run_id, None)
        if start is not None:
            print(f"Chain completed in {time.time() - start:.2f}s")
```




---

#### Attach Hook to a Chain


```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "Explain {topic} briefly"
)

llm = ChatOpenAI()

chain = prompt | llm

# Pass the handler at invoke time so it receives chain-level events as well as
# LLM events; callbacks set on the ChatOpenAI constructor only observe the model.
chain.invoke(
    {"topic": "observability hooks"},
    config={"callbacks": [ObservabilityHook()]},
)
```



**Observed**

* Chain start
* LLM completion
* Total execution time

---

### Token Streaming Hook (Real-Time Observability)

#### Streaming Token Hook

```python
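from langchain_core.callbacks import BaseCallbackHandler
from langchain_openai import ChatOpenAI

class TokenStreamHook(BaseCallbackHandler):
    def on_llm_new_token(self, token, **kwargs):
        # Called once per generated token while the model is streaming
        print(token, end="", flush=True)

llm = ChatOpenAI(
    streaming=True,
    callbacks=[TokenStreamHook()]
)

llm.invoke("Explain observability hooks")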
```

**Observed**

* Tokens emitted live
* Real-time UX + observability
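A streaming hook can also measure how quickly the first token arrives, which is often the latency users notice most. The sketch below is framework-agnostic and runs offline: `StreamTimer` and `fake_token_stream` are hypothetical names, and the fake stream stands in for a real model. In a LangChain setup, `on_token` could be called from `on_llm_new_token`.

```python
import time


class StreamTimer:
    """Records time-to-first-token and token count for one streamed response."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = clock()
        self.first_token_s = None
        self.token_count = 0

    def on_token(self, token):
        # Only the first token sets the time-to-first-token value.
        if self.first_token_s is None:
            self.first_token_s = self._clock() - self._start
        self.token_count += 1

    def summary(self):
        return {
            "tokens": self.token_count,
            "time_to_first_token_s": self.first_token_s,
            "total_s": self._clock() - self._start,
        }


def fake_token_stream():
    """Stand-in for a model stream so the sketch runs without an API key."""
    for token in ["Hooks ", "emit ", "signals."]:
        yield token


timer = StreamTimer()
for tok in fake_token_stream():
    timer.on_token(tok)
stats = timer.summary()
```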

---

### Tracing Hook (Automatic Observability)

#### Enable Tracing Hooks

```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=your_key
export LANGCHAIN_PROJECT=observability-demo
```

```python
chain.invoke({"topic": "hooks"})
```

**Observed**

* Full execution trace
* Step-by-step timing
* Token usage
* Retry/fallback visibility

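This is **automatic observability**: once these variables are set, LangChain sends traces to LangSmith without any changes to the chain code.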
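To make concrete what a trace records, here is a toy span recorder built on a context manager. It is only an illustration of the idea (nested, timed steps with parent links) and is not how LangSmith or LangChain implement tracing; `ToyTracer` and the step names are hypothetical.

```python
import time
from contextlib import contextmanager


class ToyTracer:
    """Collects nested, timed spans in memory."""

    def __init__(self):
        self.spans = []    # finished spans, in completion order
        self._stack = []   # names of the spans that are currently open

    @contextmanager
    def span(self, name):
        parent = self._stack[-1] if self._stack else None
        self._stack.append(name)
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the span even if the wrapped step raised.
            self._stack.pop()
            self.spans.append({
                "name": name,
                "parent": parent,
                "duration_s": time.perf_counter() - start,
            })


tracer = ToyTracer()
with tracer.span("chain"):
    with tracer.span("prompt"):
        formatted = "Explain {topic} briefly".format(topic="hooks")
    with tracer.span("llm"):
        answer = formatted.upper()  # placeholder for a model call

span_names = [s["name"] for s in tracer.spans]
```

Inner spans finish first, so `span_names` lists `prompt` and `llm` before `chain`; the `parent` field is what lets a trace viewer rebuild the step-by-step tree.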

---

### Metrics Hook (Latency / Counters)

#### Metrics Hook Example

```python
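from langchain_core.callbacks import BaseCallbackHandler

class MetricsHook(BaseCallbackHandler):
    def on_chain_end(self, outputs, **kwargs):
        print("Metric: chain.success = 1")

    # The chain-level error callback is named on_chain_error
    def on_chain_error(self, error, **kwargs):
        print("Metric: chain.failure = 1")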
```

In production, a hook like this would forward its counters to a metrics backend instead of printing them, for example:

* Prometheus
* CloudWatch
* Datadog
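Before wiring up a real backend, it can help to see what a metrics hook accumulates. The following in-memory registry is a minimal sketch, assuming counters plus raw latency samples are enough; `MetricsRegistry` and the metric names are hypothetical, and a real deployment would more likely use the client library of the chosen backend.

```python
import math
from collections import Counter, defaultdict


class MetricsRegistry:
    """In-memory counters and latency samples."""

    def __init__(self):
        self.counters = Counter()
        self.latencies = defaultdict(list)

    def increment(self, name, value=1):
        self.counters[name] += value

    def observe(self, name, seconds):
        self.latencies[name].append(seconds)

    def percentile(self, name, pct):
        """Nearest-rank percentile of the recorded samples, or None if empty."""
        samples = sorted(self.latencies[name])
        if not samples:
            return None
        rank = max(1, math.ceil(pct / 100 * len(samples)))
        return samples[rank - 1]


metrics = MetricsRegistry()

# Simulated outcomes: (succeeded, latency in seconds)
for succeeded, latency in [(True, 0.12), (True, 0.30), (False, 0.95), (True, 0.18)]:
    metrics.increment("chain.success" if succeeded else "chain.failure")
    metrics.observe("chain.latency", latency)

p50 = metrics.percentile("chain.latency", 50)
p95 = metrics.percentile("chain.latency", 95)
total = metrics.counters["chain.success"] + metrics.counters["chain.failure"]
failure_rate = metrics.counters["chain.failure"] / total
```

The `increment` and `observe` calls are the parts a callback handler would make from `on_chain_end` and `on_chain_error`; the percentile query is what a dashboard or alert would read.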

---

### Where Observability Hooks Attach

| Layer     | Hook Point       |
| --------- | ---------------- |
| Runnable  | Input/output     |
| Chain     | Start/end        |
| LLM       | Tokens / latency |
| Tool      | Invocation       |
| Retriever | Query time       |
| Agent     | Decisions        |

---

### Observability Hooks vs Callbacks vs Tracing

| Concept            | Role             |
| ------------------ | ---------------- |
| Callback Handler   | Mechanism        |
| Observability Hook | Purpose          |
| Tracing            | Visualization    |
| Logging            | Human-readable   |
| Metrics            | Machine-readable |

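Hooks are the **bridge**: callbacks are how signals are captured, and logs, metrics, and traces are where those signals end up.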

---

### Real-World Example (Production Agent)

**IT Support Assistant**

* Hook logs every ticket query
* Hook traces RAG latency
* Hook streams tokens to UI
* Hook counts fallback usage
* Hook alerts on error spikes
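The last item, alerting on error spikes, can be sketched as a sliding window over recent outcomes. This is one simple approach under assumed settings; `ErrorSpikeDetector`, the window size, and the threshold are hypothetical values for illustration, not figures from a real deployment.

```python
from collections import deque


class ErrorSpikeDetector:
    """Flags when the error rate over the most recent calls exceeds a threshold."""

    def __init__(self, window=20, threshold=0.25, min_samples=5):
        self.outcomes = deque(maxlen=window)  # True means the call failed
        self.threshold = threshold
        self.min_samples = min_samples

    def record(self, is_error):
        """Store one outcome and return whether an alert should fire now."""
        self.outcomes.append(bool(is_error))
        return self.should_alert()

    def error_rate(self):
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self):
        # Require a minimum number of samples so one early failure does not alert.
        return (len(self.outcomes) >= self.min_samples
                and self.error_rate() > self.threshold)


detector = ErrorSpikeDetector(window=10, threshold=0.3, min_samples=5)
outcomes = [False, False, True, False, True, True, True, False]
alerts = [detector.record(outcome) for outcome in outcomes]
```

With these inputs the detector stays quiet for the first four calls and starts alerting on the fifth, when two of five recent calls have failed.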

---

### Best Practices

* Keep hooks lightweight
* Never block execution
* Redact sensitive data
* Separate logs vs metrics
* Enable tracing selectively in prod
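Two of these practices, never blocking execution and redacting sensitive data, can be combined with the standard library's queue-based logging. In the sketch below the hook only enqueues records, and a background listener thread delivers them to the sink. `RedactEmails`, `ListHandler`, and the logger name are hypothetical, and the email pattern is deliberately simple rather than a complete PII filter.

```python
import logging
import logging.handlers
import queue
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class RedactEmails(logging.Filter):
    """Replace email addresses in the message before it is queued."""

    def filter(self, record):
        record.msg = EMAIL_RE.sub("[REDACTED]", record.getMessage())
        record.args = None  # the message is already fully formatted
        return True


class ListHandler(logging.Handler):
    """Collects messages in memory; stands in for a slow remote sink."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


log_queue = queue.Queue()
sink = ListHandler()
listener = logging.handlers.QueueListener(log_queue, sink)
listener.start()  # the sink is written to from a background thread

hook_logger = logging.getLogger("hooks.demo")
hook_logger.setLevel(logging.INFO)
hook_logger.propagate = False
queue_handler = logging.handlers.QueueHandler(log_queue)
queue_handler.addFilter(RedactEmails())
hook_logger.addHandler(queue_handler)

# The calling thread only redacts and enqueues; it never waits on the sink.
hook_logger.info("Ticket from %s: %s", "alice@example.com", "VPN is down")

listener.stop()  # processes any queued records, then joins the worker thread
```

Redaction runs in the calling thread here so that unredacted text never reaches the queue; if the pattern matching became expensive, it could be moved to the listener side at the cost of raw data sitting in the queue briefly.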

---

### Mental Model

Observability hooks are **sensors** inside your LLM system.

```
System runs → hooks observe → signals emitted → insight gained
```

---

### Key Takeaways

* Observability hooks expose internal behavior
* Implemented via callbacks, tracing, streaming
* Essential for debugging, performance, reliability
* Mandatory for production LLM systems