```{contents}
```
## Stuff Chain


A **Stuff Chain** is a LangChain abstraction that **“stuffs” all input documents or text chunks into a single prompt** and sends them to the LLM **in one call**.

> There is **no splitting, no iteration, no aggregation**: everything is passed at once.

It is the **simplest** document-processing chain in LangChain.

---

### Why Stuff Chain Exists

The Stuff Chain is meant for:

* Small inputs
* Quick summarization
* Simple RAG demos
* Latency-sensitive workflows (a single LLM call)

It assumes:

* The entire input **fits within the model’s context window**
* The model can reason over all content at once

---

### Conceptual Flow

```
All Documents
   ↓
Concatenate (stuff)
   ↓
Single Prompt
   ↓
LLM
   ↓
Final Output
```

---

### How Stuff Chain Works Internally

1. Take all documents
2. Concatenate their contents
3. Inject into a single prompt variable
4. Call the LLM once

No intermediate steps are stored.
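
The four steps above can be sketched in plain Python. This is a minimal illustration, not LangChain's implementation: `stuff_documents`, `fake_llm`, and the blank-line separator are assumptions made for the example, and `fake_llm` stands in for a real model call.

```python
from typing import Callable

PROMPT_TEMPLATE = "Answer the question using the following context:\n{context}"


def stuff_documents(docs: list[str], llm: Callable[[str], str],
                    separator: str = "\n\n") -> str:
    """Hypothetical helper mirroring the four steps of a stuff chain."""
    context = separator.join(docs)                    # steps 1-2: take and concatenate
    prompt = PROMPT_TEMPLATE.format(context=context)  # step 3: fill one prompt variable
    return llm(prompt)                                # step 4: a single LLM call


calls: list[str] = []


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: records each prompt it receives."""
    calls.append(prompt)
    return f"(model reply to a {len(prompt)}-character prompt)"


answer = stuff_documents(["First chunk.", "Second chunk.", "Third chunk."], fake_llm)
print(len(calls))  # number of model calls made for three documents
```

However many documents are passed in, `fake_llm` is called once, and that call sees every chunk.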

---

### Basic Stuff Chain Demonstration

#### Step 1: Define the Prompt



```python
from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template(
    "Answer the question using the following context:\n{context}"
)
```




---

#### Step 2: Create the Stuff Chain



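```python
from langchain_classic.chains.summarize import load_summarize_chain
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

chain = load_summarize_chain(
    llm=llm,
    chain_type="stuff",
    prompt=prompt,
    # The summarize loader stuffs documents into "text" by default;
    # point it at the {context} placeholder used by the custom prompt.
    document_variable_name="context",
)
```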



---

#### Step 3: Invoke the Chain



```python
from langchain_core.documents import Document

# Create sample documents
documents = [
    Document(page_content="Artificial Intelligence is transforming industries worldwide. Machine learning algorithms enable computers to learn from data and improve their performance over time without explicit programming."),
    Document(page_content="Natural Language Processing allows machines to understand and generate human language. Applications include chatbots, translation services, and sentiment analysis."),
    Document(page_content="Computer vision enables machines to interpret and understand visual information from the world. It powers technologies like facial recognition, autonomous vehicles, and medical image analysis."),
]

# The chain returns a dict; the model's answer is under "output_text"
result = chain.invoke({"input_documents": documents})
print(result["output_text"])
```

Output from one run (the prompt contains no question, so the model appears to have supplied its own):

```
What are some applications of Artificial Intelligence mentioned in the context?

Some applications of Artificial Intelligence mentioned include chatbots, translation services, sentiment analysis, facial recognition, autonomous vehicles, and medical image analysis.
```




All documents are stuffed into `{context}`.

---

### Internal Execution (Simplified)

```
context = join(doc1, doc2, doc3, ...)
response = LLM(prompt(context))
```

---

### Common Use Cases

* Small document summarization
* FAQ answering
* Short reports
* RAG with few chunks
* Prototyping and demos

---

### Stuff Chain vs MapReduce

| Aspect        | Stuff Chain | MapReduce  |
| ------------- | ----------- | ---------- |
| Scalability   | Low         | High       |
| Latency       | Low         | Medium     |
| Parallelism   | ❌           | ✅          |
| Context usage | High        | Controlled |
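
Much of the trade-off in this table comes down to how many LLM calls each strategy makes. The sketch below counts calls with a stand-in model. `CountingLLM`, `run_stuff`, and `run_map_reduce` are hypothetical names, and the map-reduce version is deliberately simplified (no collapse step, and the map calls run sequentially even though they are independent and could run in parallel).

```python
class CountingLLM:
    """Stand-in model that counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        return f"summary of {len(prompt)} characters"


def run_stuff(docs: list[str], llm: CountingLLM) -> str:
    # One call whose prompt grows with the total size of all documents.
    return llm("Summarize:\n" + "\n\n".join(docs))


def run_map_reduce(docs: list[str], llm: CountingLLM) -> str:
    # Map: one small, independent call per document.
    partials = [llm("Summarize:\n" + doc) for doc in docs]
    # Reduce: one more call over the partial summaries.
    return llm("Combine these summaries:\n" + "\n".join(partials))


docs = ["doc one", "doc two", "doc three", "doc four"]

stuff_llm = CountingLLM()
run_stuff(docs, stuff_llm)

map_reduce_llm = CountingLLM()
run_map_reduce(docs, map_reduce_llm)

print(stuff_llm.calls, map_reduce_llm.calls)
```

With four documents the stuff path makes one call and the simplified map-reduce path makes five (four map calls plus one reduce call).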

---

### Stuff Chain vs Refine

| Aspect                | Stuff | Refine |
| --------------------- | ----- | ------ |
| Order sensitivity     | Low   | High   |
| Accuracy on long docs | Low   | High   |
| Latency               | Low   | High   |

---

### Limitations of Stuff Chain

* ❌ Breaks if input exceeds context window
* ❌ Expensive for large documents
* ❌ No partial fault tolerance
* ❌ Legacy abstraction

---

### Stuff Chain in RAG Pipelines

Stuff chain is commonly used when:

* Retriever returns **few, small chunks**
* You want a **single grounded answer**

Example:

```
Retriever → Stuff Chain → Answer
```
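
A toy version of this pipeline, assuming a keyword-overlap retriever in place of a real vector store. `CORPUS`, `retrieve`, and `answer` are names invented for this sketch, and the model is a stand-in that returns its own prompt so the stuffed context is visible.

```python
from typing import Callable

CORPUS = [
    "Natural Language Processing powers chatbots and translation services.",
    "Computer vision powers facial recognition and autonomous vehicles.",
    "Machine learning algorithms learn from data.",
]


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank chunks by how many query words they share."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda chunk: len(query_words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def answer(query: str, llm: Callable[[str], str], k: int = 2) -> str:
    """Retriever -> stuff -> answer, with exactly one model call."""
    chunks = retrieve(query, CORPUS, k=k)
    context = "\n\n".join(chunks)  # the "stuff" step
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm(prompt)


# Stand-in model: echoes the prompt instead of calling a real LLM.
print(answer("what powers chatbots", lambda prompt: prompt, k=1))
```

Keeping `k` small is what makes the stuff step safe here: the prompt size is bounded by the number and size of retrieved chunks.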

---

### Stuff Chain vs LCEL (Modern Approach)

#### Stuff Chain (Legacy)



```python
from langchain_classic.chains.summarize import load_summarize_chain

load_summarize_chain(llm=llm, chain_type="stuff")
```

Output (truncated):

```
StuffDocumentsChain(verbose=False, llm_chain=LLMChain(verbose=False, prompt=PromptTemplate(input_variables=['text'], input_types={}, partial_variables={}, template='Write a concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'), llm=ChatOpenAI(...), ...
```



#### LCEL Equivalent (Conceptual)





LCEL provides:

* Better composability
* Streaming
* Validation
* Future-proof design

---

### When to Use Stuff Chain

* Input size is guaranteed small
* Latency matters more than scalability
* Simple QA or summarization
* Learning LangChain basics

---

### When NOT to Use Stuff Chain

* Large documents
* Unbounded user input
* Production RAG systems
* Compliance-critical systems

---

### Best Practices

* Limit number of retrieved documents
* Enforce chunk size
* Use low temperature
* Validate token counts
* Add fallbacks for overflow
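
Several of these practices can be combined into a small guard that runs before the chain is invoked. The sketch below is one way to do it under stated assumptions: `estimate_tokens` uses a rough four-characters-per-token heuristic rather than a real tokenizer, and `fit_to_budget` and `choose_strategy` are hypothetical helpers, not LangChain APIs.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def fit_to_budget(docs: list[str], max_docs: int, token_budget: int) -> list[str]:
    """Keep at most max_docs documents, stopping before the budget overflows."""
    kept: list[str] = []
    used = 0
    for doc in docs[:max_docs]:         # limit the number of documents
        cost = estimate_tokens(doc)
        if used + cost > token_budget:  # validate token counts before the call
            break
        kept.append(doc)
        used += cost
    return kept


def choose_strategy(docs: list[str], max_docs: int, token_budget: int) -> str:
    """Fallback for overflow: stuff only when every document fits."""
    kept = fit_to_budget(docs, max_docs, token_budget)
    return "stuff" if len(kept) == len(docs) else "map_reduce"


docs = ["a" * 40, "b" * 40, "c" * 40]  # roughly 10 estimated tokens each
print(choose_strategy(docs, max_docs=5, token_budget=100))
print(choose_strategy(docs, max_docs=5, token_budget=25))
```

The first call selects `stuff` because all three documents fit; the second selects `map_reduce` because the third document would exceed the budget. For real limits, a model-specific tokenizer would give more reliable counts than the character heuristic.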

---

### Interview-Ready Summary

> “A Stuff Chain in LangChain concatenates all input documents into a single prompt and processes them in one LLM call. It is fast and simple but limited by context window size and has largely been superseded by LCEL for production systems.”

---

### Rule of Thumb

* **Few small documents → Stuff**
* **Many large documents → MapReduce or Refine**
* **Modern production → LCEL-based RAG**
