
```{contents}
```

## Runnable Interface

### What is the Runnable Interface

The **Runnable interface** is the **core execution abstraction in LangChain**.
Every executable component in LangChain (LLMs, prompts, chains, retrievers, parsers, agents) is implemented as a **Runnable**.

> A Runnable is **anything that can be invoked, composed, streamed, batched, retried, or executed asynchronously** in a uniform way.

---

### Why Runnable Exists

Before Runnable:

* Chains were rigid
* Limited composition
* Poor async & streaming support

With Runnable:

* Everything is composable
* Declarative pipelines
* Built-in async, streaming, retries
* One mental model for all components

---

### Runnable Mental Model

```
Input
  ↓
Runnable
  ↓
Output
```

And multiple Runnables can be composed:

```
Runnable → Runnable → Runnable
```
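
To make the composition idea concrete, here is a toy sketch in plain Python of how a `|` operator can chain steps. The `Step` class is hypothetical and is not LangChain's implementation; it only mirrors the `invoke` and `|` shape described above.

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a Runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # Composing two steps yields a new step that runs them in order.
        return Step(lambda value: other.invoke(self.invoke(value)))


add_one = Step(lambda x: x + 1)
double = Step(lambda x: x * 2)

pipeline = add_one | double | add_one
result = pipeline.invoke(3)  # (3 + 1) * 2 + 1
print(result)
```

The composed object has the same `invoke` method as its parts, which is what lets pipelines nest inside other pipelines.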

---

### Core Runnable Capabilities

Every Runnable supports:

* `invoke()` – sync execution
* `ainvoke()` – async execution
* `stream()` – streams output chunks as they are produced (tokens, for chat models)
* `batch()` – batch processing
* `with_retry()` – retries
* `with_fallbacks()` – fallback logic
* `with_config()` – binds config such as tags and metadata, used for tracing

---

### Basic Runnable Example

#### LLM as a Runnable



```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

result = llm.invoke("Explain LangChain in one sentence")
print(result.content)
```

Output:

```
LangChain is a framework designed for developing applications that leverage language models (LLMs) by facilitating their integration, chaining, and interaction with other data sources and tools.
```




`ChatOpenAI` is a **Runnable**.

---

### Prompt as a Runnable



```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    ("human", "{input}")
])

messages = prompt.invoke({"input": "What is RAG?"})
```




`ChatPromptTemplate` is also a **Runnable**.

---

### Runnable Composition (LCEL)

#### The `|` Operator (LCEL)



```python
chain = prompt | llm
```




This creates a **RunnableSequence**.

Execution flow:

```
dict → prompt → messages → llm → AIMessage
```

---

### Full Runnable Chain Demonstration



```python
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

chain = prompt | llm | parser

result = chain.invoke({"input": "Explain RAG"})
print(result)
```

Output (truncated):

```
RAG stands for Retrieval-Augmented Generation, which is a framework used in natural language processing, particularly in the context of language models and information retrieval. The main idea behind RAG is to combine the strengths of both retrieval-based methods and generation-based methods to improve the accuracy and relevance of generated responses.

### Key Components of RAG:

1. **Retrieval**:
   - In the first step of the RAG process, relevant documents or pieces of information are retrieved from a large corpus or knowledge base. This is typically done using a retrieval model that identifies content that is semantically similar to the input query.
   - The retrieval component can use various techniques, including traditional information retrieval methods (like TF-IDF, BM25) or more advanced neural models (like dense embeddings).

2. **Augmentation**:
   - The retrieved documents are then used to provide additional context and information to the generative model. This helps ensure
```



Each component is a Runnable.

---

### RunnableLambda (Custom Logic)

#### Wrap any Python function as a Runnable



```python
from langchain_core.runnables import RunnableLambda

uppercase = RunnableLambda(lambda x: x.upper())

uppercase.invoke("hello")
```

Output:

```
'HELLO'
```



Useful for:

* Pre-processing
* Post-processing
* Glue logic

---

### RunnablePassthrough

#### Pass input through unchanged (or assign values)



```python
from langchain_core.runnables import RunnablePassthrough

chain = (
    RunnablePassthrough.assign(
        length=lambda x: len(x["input"])
    )
)

chain.invoke({"input": "LangChain"})
```

Output:

```python
{'input': 'LangChain', 'length': 9}
```



Used heavily in:

* Agents
* Prompt variable injection

---

### RunnableParallel

#### Execute multiple Runnables in parallel



```python
from langchain_core.runnables import RunnableParallel

parallel = RunnableParallel(
    answer=prompt | llm,
    length=lambda x: len(x["input"])
)

parallel.invoke({"input": "Explain LangChain"})
```

Raw output (truncated):

```
{'answer': AIMessage(content='LangChain is a framework designed to facilitate the development of applications that integrate with large language models (LLMs). It provides a suite of tools, components, and utilities that allow developers to build sophisticated natural language processing workflows and applications with ease. Here are some key aspects of LangChain:\n\n1. **Modular Components**: LangChain is built around modular components that can be used independently or combined to create complex systems. These components include things like language models, prompt templates, chains, and agents.\n\n2. **Language Models**: LangChain supports multiple LLMs, providing a uniform interface for interacting with different models, whether they are hosted by companies like OpenAI, Hugging Face, or others.\n\n3. **Prompt Management**: The framework includes tools for creating and managing prompts, enhancing the way developers can structure their interactions with language models. This can help 
```

Output shape:

```python
{
  "answer": AIMessage(...),
  "length": 17
}
```

---

### Streaming with Runnables

```python
for chunk in llm.stream("Explain LangChain"):
    print(chunk.content, end="")
```

Works for:

* Chat UIs
* SSE
* WebSockets

---

### Async Execution

```python
await llm.ainvoke("Explain RAG")
```

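Every Runnable supports async. Top-level `await` works in a notebook or inside an `async` function; in a plain script, run the coroutine with `asyncio.run(...)`.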

---

### Batch Execution

```python
llm.batch([
    "What is LangChain?",
    "What is RAG?"
])
```

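Used for bulk inference: inputs run concurrently and results come back in input order. Embedding models are not Runnables; they batch through their own `embed_documents()` method.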

---

### Retry & Fallback

#### Retry

```python
safe_llm = llm.with_retry(stop_after_attempt=3)  # up to 3 attempts in total
safe_llm.invoke("Explain LangChain")
```

#### Fallback

```python
backup_llm = ChatOpenAI(model="gpt-4o")  # any second model; this name is only an example
fallback_llm = llm.with_fallbacks([backup_llm])
```

---

### Runnable Configuration & Tracing

```python
chain.invoke(
    {"input": "Explain LangChain"},
    config={"tags": ["demo"], "metadata": {"env": "dev"}}
)
```

Used by:

* LangSmith
* Observability tools

---

### Runnables in Agents

Internally, agents are **Runnable graphs**. A simplified single pass through the agent loop:

```
Input
 ↓
Prompt Runnable
 ↓
LLM Runnable
 ↓
Tool Runnable
 ↓
Parser Runnable
```

This is why agents:

* Are composable
* Are debuggable
* Support streaming

---

### Runnables in RAG

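```python
# Assumes `retriever` exists (e.g. vectorstore.as_retriever()); the prompt needs a {context} slot.
rag_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using this context:\n{context}"),
    ("human", "{input}")
])
rag_chain = (
    {"context": retriever, "input": RunnablePassthrough()}
    | rag_prompt
    | llm
)
```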

Retriever, prompt, LLM → all Runnables.

---

### What is NOT a Runnable

* Plain strings
* Raw Python objects
* SDK responses (without wrapping)

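Everything must be **wrapped** or **adapted** into a Runnable, for example with `RunnableLambda`. Inside an LCEL pipe, plain functions and dicts are coerced automatically.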

---

### Common Runnable Types (Cheat Sheet)

| Runnable            | Purpose            |
| ------------------- | ------------------ |
| LLM / ChatModel     | Model invocation   |
| PromptTemplate      | Prompt formatting  |
| Retriever           | Document retrieval |
| OutputParser        | Structured parsing |
| RunnableLambda      | Custom logic       |
| RunnableParallel    | Fan-out            |
| RunnablePassthrough | Input wiring       |

---

### Common Mistakes

* Mixing sync and async incorrectly
* Forgetting that prompt output ≠ string
* Manual chaining instead of LCEL
* Overusing custom glue code

---

### Interview-Ready Summary

> “The Runnable interface is LangChain’s execution backbone. Every component—LLMs, prompts, retrievers, agents—is a Runnable, enabling uniform invocation, streaming, batching, retries, and composable pipelines using LCEL.”

---

### Rule of Thumb

If it **executes**, it should be a **Runnable**.
If it **doesn’t compose**, you’re doing it wrong.

