DAG-based (Directed Acyclic Graph) solutions:
Workflows are explicitly laid out as a sequence of tasks in a graph structure that has no cycles (i.e., you can’t come back around to a previous task). This is great for predictable, linear (or branching) pipelines—like ETL jobs or Airflow workflows—because you have a clear, static order of steps.

Agentic architecture:
Instead of predefining an entire workflow graph, you have a more dynamic “agent” that decides which step to take next based on the current context and goal. Rather than strictly following a graph of tasks, the agent can observe intermediate outputs and adjust its strategy on the fly, including calling different tools or subroutines, updating its own plan, or taking new paths altogether if needed.

- A DAG-based solution has a predetermined path: Task A flows into Task B, into Task C, etc. It might allow branching, but the overall structure is fixed.
- An agentic approach lets the system “choose” what to do next at runtime. Each step can vary depending on the agent’s reasoning, available tools, or new information gleaned partway through the workflow.

- In a DAG, each task often just takes an input from upstream tasks, does its job, and passes the output downstream.
- In an agentic system, the agent maintains a form of memory or persistent state and can evaluate partial results in a loop. It can re-check or refine previous reasoning.

- DAGs shine when the flow is well-known in advance: e.g., you need to transform data, then validate, then load, in a fixed sequence.
- An agentic approach is more suitable when the path to the final answer is uncertain or can change based on intermediate discoveries. The agent can decide to call additional APIs, explore different sub-tasks, or re-run certain steps to improve the result.

- In DAG-based solutions, the flow is “procedure-oriented”: you list out steps to get from start to finish.
- In an agentic architecture, you define a goal, and the agent works out the steps toward it at runtime (typically through planning or chain-of-thought style reasoning).
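
The contrast above can be sketched in plain Python. This is only an illustration under stated assumptions: the task names, the `choose_next_step` policy, and the integer quality score are all hypothetical, and a real agent would consult an LLM rather than a hard-coded rule.

```python
from graphlib import TopologicalSorter

# DAG style: dependencies are declared up front, so the order is fixed
# before anything runs. Each key maps a task to the tasks it depends on.
dag = {"transform": {"extract"}, "validate": {"transform"}, "load": {"validate"}}
dag_order = list(TopologicalSorter(dag).static_order())


def choose_next_step(state: dict) -> str:
    """Hypothetical policy; a real agent would consult an LLM here."""
    if "draft" not in state:
        return "write"
    if state["quality"] < 80 and state["attempts"] < 5:
        return "revise"  # repeats a step, which a DAG cannot express
    return "done"


def run_agent() -> list:
    """Agentic style: the next step is chosen from the current state."""
    state = {"quality": 0, "attempts": 0}
    trace = []
    while (step := choose_next_step(state)) != "done":
        trace.append(step)
        if step == "write":
            state["draft"] = "v1"
            state["quality"] = 50
        else:
            state["quality"] += 20
        state["attempts"] += 1
    return trace
```

The DAG order is known before execution, while the agent's trace depends on what each step produced, including how many revisions were needed.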

LangGraph is a framework for building agentic and multi-agent workflows as graphs. It supports cycles, controllability, and persistence.

Cycles

Traditionally, a DAG-based workflow doesn’t allow loops—it’s a Directed Acyclic Graph by definition. With agentic flows, you can revisit previous steps, refine them, or run iterative loops (“cycles”) until you reach some confidence level or result. This is key for more open-ended tasks where you might need multiple “tries” or repeated reasoning steps.

Controllability

In an agentic system, you can guide (or “control”) the flow of decisions at a higher level. Rather than letting the agent blindly keep trying, you can define rules, constraints, or checkpoints. This ensures the agent doesn’t go off-track and can be nudged or constrained in certain directions, depending on your use case.

Persistence

Persistence means the system can “remember” what happened before—even if you stop and restart the process. For example, if your agent is building a piece of content or executing a multi-step workflow, it can store intermediate results, partial solutions, or the agent’s own state in a database or memory. This way, you can resume where you left off, audit what happened so far, or trace how the solution evolved over time.




Streaming Across Multiple Nodes in LangGraph

Using LangGraph’s state-machine (graph) with streaming output requires careful handling of node transitions and function outputs. Below, we outline best practices for streaming across multiple nodes, how to transition between nodes without breaking the stream, and key configuration considerations.

How LangGraph Streaming Works

LangGraph provides .stream() and .astream() methods (sync and async) to stream outputs as a graph executes. You can specify a streaming mode to control what gets streamed. Common modes include:
- messages – streams LLM output token-by-token (as AIMessageChunk objects) with metadata (useful for real-time token streaming)
- updates – streams only the state updates after each node (e.g. the keys/values returned by that node)
- values – streams the full state after each step (all state variables)
- custom – streams custom data emitted from inside nodes via LangGraph’s StreamWriter
- debug – streams verbose debug events for each step

For example, stream_mode="messages" yields each LLM token as it is generated, together with metadata identifying the node that produced it. In contrast, stream_mode="updates" yields the state update (the node’s output) after each node finishes. You can also combine modes (e.g. stream_mode=["messages","updates"]) to get both token streams and state updates; in that case the iterator yields tuples like ("messages", <token>) and ("updates", <state-delta>). The key point is that graph.astream produces one continuous asynchronous iterator of events spanning the entire graph execution, from the first node to the last, as long as it is properly configured.
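
When modes are combined, the consuming code has to unpack the (mode, payload) tuples. The helper below is a hypothetical sketch of that step; `split_by_mode` is not part of LangGraph, and the sample events are hand-written stand-ins (real "messages" payloads carry AIMessageChunk objects, not plain strings).

```python
from collections import defaultdict


def split_by_mode(events):
    """Group (mode, payload) tuples from a multi-mode stream by mode."""
    grouped = defaultdict(list)
    for mode, payload in events:
        grouped[mode].append(payload)
    return dict(grouped)


# Stand-ins for what the iterator might yield with
# stream_mode=["messages", "updates"].
sample_events = [
    ("messages", ("Hel", {"langgraph_node": "writer"})),
    ("messages", ("lo", {"langgraph_node": "writer"})),
    ("updates", {"writer": {"answer": "Hello"}}),
]

grouped = split_by_mode(sample_events)
# Each "messages" payload is a (chunk, metadata) pair.
text = "".join(chunk for chunk, _meta in grouped["messages"])
```

In a live application you would usually act on each event as it arrives instead of collecting them, but the unpacking is the same.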

State management: LangGraph uses a shared state dictionary (or TypedDict) that persists through the graph. Each node can read from and return updates to this state. The .astream() call manages this state behind the scenes, applying each node’s return values to the state and carrying it into subsequent nodes. Thus, data produced by one node (e.g. an LLM’s answer or a decision flag) can be used by later nodes. You don’t need to manually pass state between nodes – the graph does it. For instance, a conditional node can inspect state (populated by previous nodes) to decide the next path.

Transitioning Between Nodes Without Interrupting Streaming

**When a graph transitions from one node to the next, the stream remains open** – you should continue iterating over the graph.astream generator. In other words, you typically write something like:
```python
async for event in graph.astream(input_data, stream_mode="messages"):
    handle(event)  # event could be a token or state update, etc.
```
This loop will not terminate until the graph finishes or you explicitly break. To keep the stream active across node boundaries, do not break out of the loop early. The graph.astream will yield events for Node A, then Node B, then Node C, sequentially.

**However, note that streaming is inherently tied to the nodes’ execution. If Node A is an LLM that streams tokens and Node B is a quick deterministic function, you will see a burst of token events for Node A, then likely a pause (or just a state update event) for Node B, then perhaps more tokens if Node C calls an LLM. This is normal: the stream hasn’t closed; it is simply waiting for Node B to complete. LangGraph does not yet support truly simultaneous streaming from one node into the next (piping chunk-by-chunk between nodes). In fact, attempting to forward each token from one node to another in real time isn’t supported: “streaming output from one node to another isn’t currently supported”. Instead, each node runs to completion (while possibly streaming its own output) before the graph proceeds.**

Best practices for smooth transitions: Design your graph so that any long-running or LLM-intensive steps produce streaming output to avoid long silent gaps. For example, if Node B is an I/O-bound or slow operation with no streaming, you might use a custom stream event to notify progress (e.g., using StreamWriter to emit a "Processing..." message). If a node simply routes to the next step (like a condition node), it should be quick – the stream will resume as soon as the next node starts producing output. In summary, as long as you let the graph.astream run to completion, it will handle multiple nodes in sequence. Any “break” in the streaming flow usually indicates either the graph execution ended prematurely or an intermediate node isn’t emitting anything (which can be expected if that node doesn’t involve an LLM or custom stream events).

Using return vs yield in Node Functions

Always use return to output from a node function, not yield. Each function in a LangGraph graph should be a normal function (or async function) that returns one of the following:
- a dictionary of state updates (for regular nodes),
- a node name or list of node names (for routing functions attached via add_conditional_edges), or
- the special END value (returned by a routing function to terminate the graph).

Using return allows the graph to capture the output and then transition to the next node or finish. In contrast, using yield inside a node turns it into a generator function, which LangGraph does not run as a normal node. One developer ran into this issue when attempting to yield LLM tokens from inside a node: “The problem is, when I have yield in this node, it becomes a generator so I can’t run this function directly. In the LangGraph workflow... I am stuck here.” A node defined as a generator will not behave as intended in the graph, effectively breaking the workflow.
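
One way to catch this mistake early is to check functions before registering them as nodes. The guard below is a hypothetical helper, not a LangGraph feature; it relies only on the standard library's inspect module.

```python
import inspect


def assert_plain_node(fn):
    """Raise early if fn is a (sync or async) generator function."""
    if inspect.isgeneratorfunction(fn) or inspect.isasyncgenfunction(fn):
        raise TypeError(
            f"{fn.__name__} uses yield; return a state update and stream "
            "through the graph's streaming utilities instead"
        )
    return fn


def good_node(state: dict) -> dict:
    return {"output": state["input"].upper()}


def bad_node(state: dict):
    yield {"output": state["input"]}  # the yield makes this a generator function


assert_plain_node(good_node)  # passes silently
try:
    assert_plain_node(bad_node)
    rejected = False
except TypeError:
    rejected = True
```

Calling the guard at graph-construction time turns a confusing runtime failure into an immediate, readable error.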

How to stream from a node properly: If you want to stream partial results from within a node, do not use Python’s yield. Instead, leverage LangGraph’s streaming facilities:
- **If the node calls an LLM (via LangChain ChatOpenAI or similar), simply call the model’s .invoke/ainvoke normally. LangGraph’s .stream/.astream will capture the token-wise output via callbacks, so you get streaming without writing a generator yourself**. In other words, let the LLM and LangGraph handle token streaming – your node can just return the final message or updated state.
- If the node performs a custom task and you want to emit intermediate data, use the StreamWriter. LangGraph provides get_stream_writer(), which returns a callable for sending custom streaming events. For example:
```python
from langgraph.config import get_stream_writer

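def tool_node(state: dict) -> dict:
    # Nodes receive the current graph state rather than a bare argument.
    writer = get_stream_writer()
    # custom_data_stream, finalize_result, and the "arg" key are placeholders
    # for your own helpers and state fields.
    for chunk in custom_data_stream(state["arg"]):
        writer(chunk)        # stream out chunk as a 'custom' event
    result = finalize_result()
    return {"output": result}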
```
When running the graph, call graph.stream(..., stream_mode="custom") to receive those chunks. This approach keeps the node function a normal function (with a final return), while still streaming data during execution.

By following these patterns, you transition between nodes without ending the overall stream. Each node either returns a result or writes to the stream via the provided writer. Do not attempt to yield a next node identifier or partial state – always return your node’s outcome so the graph engine can do its job. For conditional nodes, for example:
```python
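from langgraph.graph import END

def decide_next(state):
    # Routing function: returns the name of the next node, or END to stop.
    if state["use_tool"]:
        return "ToolNode"
    return END  # END is the "__end__" sentinel exported by langgraph.graph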
```
Here, returning "ToolNode" tells the graph to continue to that node, whereas returning END stops the graph. Using return (not yield) ensures the graph knows the function is finished and can move on.

Managing Conditional Transitions in Streaming

Conditional edges in LangGraph allow branching to different nodes based on a function’s output. If you have a routing function that chooses the next step, make sure it returns a value that add_conditional_edges can resolve: either a node name or, when you pass a path_map, one of its keys. A common mistake is returning a value that isn’t mapped to a node, which can cause the graph to stop or raise an error. Likewise, if the function returns END unintentionally, the graph terminates immediately, which also stops the stream. Double-check that your condition logic only returns END when you truly intend to finish the workflow.
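
A small wrapper can make unmapped return values fail loudly instead of silently ending the run. This is a sketch under assumptions: `checked_router`, `route`, and the node names are hypothetical, and END is replaced by a local stand-in string so the example runs without LangGraph installed.

```python
def checked_router(route_fn, path_map: dict):
    """Wrap a routing function so unmapped return values raise an error."""
    def wrapper(state):
        key = route_fn(state)
        if key not in path_map:
            raise ValueError(
                f"{route_fn.__name__} returned {key!r}; "
                f"expected one of {sorted(path_map)}"
            )
        return key
    return wrapper


END = "__end__"  # stand-in for langgraph.graph.END


def route(state: dict) -> str:
    return "tool" if state.get("use_tool") else "finish"


# Keys are what the routing function returns; values are the target nodes.
path_map = {"tool": "ToolNode", "finish": END}
safe_route = checked_router(route, path_map)
```

In a real graph, `safe_route` and `path_map` would be passed together as the routing function and mapping arguments of add_conditional_edges.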

Another potential pitfall is trying to stream from within the condition function. Typically, a condition node just performs a check and returns a branch key; it usually shouldn’t produce user-facing output. If you do need an LLM to decide the next node (for example, an AI deciding an action), consider whether you want to stream its reasoning or not. Often, such “decision” LLM calls are run without streaming (to avoid confusing the end-user) and perhaps logged via debug events. If you do stream during a condition node, be aware that it will send tokens to the client as it’s deciding – which might not be desirable. In most cases, keep condition nodes simple and fast. Let the main content-generating nodes handle the streaming of tokens or messages that the user sees.

Stream Modes and Configuration Considerations

**Choosing the right stream_mode is important for multi-node graphs. For most interactive applications where you want to show the AI’s answer token-by-token, use stream_mode="messages". This mode emits tokens from any LLM call in your graph, not just the final node, so if you have multiple LLM nodes, each will stream its output in turn. If you only want to stream the final answer (suppressing earlier LLM token streams), you need to adjust the design: either don’t stream on earlier nodes, or filter the events, for example by the node name carried in each event’s metadata (the langgraph_node key). With LangGraph alone, the messages mode streams all LLM calls.** On the other hand, if you want to monitor the state evolution instead of raw tokens, stream_mode="updates" can be useful: it yields a dict of just the keys that changed after each node runs. This is helpful for debugging or for non-LLM data flows.
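
Filtering message events down to one node can be sketched as follows. The helper `tokens_from_node` is hypothetical, and the sample events use plain strings in place of AIMessageChunk objects; the assumption is that each messages-mode event is a (chunk, metadata) pair whose metadata includes a langgraph_node entry.

```python
def tokens_from_node(message_events, node_name: str):
    """Yield only the chunks whose metadata names the given node."""
    for chunk, metadata in message_events:
        if metadata.get("langgraph_node") == node_name:
            yield chunk


# Hand-written stand-ins for events from a two-LLM-node graph.
sample = [
    ("Thinking", {"langgraph_node": "QuestionNode"}),
    ("The ", {"langgraph_node": "AnswerNode"}),
    ("answer", {"langgraph_node": "AnswerNode"}),
]

final_text = "".join(tokens_from_node(sample, "AnswerNode"))
```

Note that this helper is a generator on the consuming side of the stream, which is fine; the restriction on yield applies only to node functions inside the graph.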

You should also be aware of the output_keys parameter on .stream/.astream (if you only want certain state fields in the streamed output) and the interrupt_before/interrupt_after options. The interrupt options do not control what is streamed: they pause graph execution before or after the named nodes (typically for human-in-the-loop review, together with a checkpointer), which also ends the current stream at that point. In most cases you can leave them unset so that the stream covers the whole graph.

Finally, consider the underlying LLM configuration. Ensure the LLM or chain you call is set up for streaming. If you use LangChain’s ChatOpenAI, it supports streaming via the callback mechanism (which LangGraph utilizes). As one LangGraph maintainer noted, you must use an LLM that supports streaming; otherwise, .astream(..., stream_mode="messages") will appear to deliver the output all at once. In practice, ChatOpenAI with the LangGraph tracer works out-of-the-box, but if you use a custom LLM or an older invocation method, make sure it’s compatible with streaming.

RunnableConfig and async issues: If your graph or nodes are asynchronous, be mindful of Python version differences. For Python 3.11+, LangGraph propagates the streaming callbacks automatically via context variables. In Python versions below 3.11, you must manually pass the RunnableConfig (which contains callback info) into any async LLM calls inside your nodes. In other words, if you write an async node that calls await llm.ainvoke(...), accept the config object in the node and pass it through: await llm.ainvoke(prompt, config). LangGraph’s streaming uses an internal tracer (callback handler), and without contextvars propagation (in older Python), that context can be lost unless explicitly forwarded. The LangGraph docs emphasize this: “pass the RunnableConfig in the node function and into model.ainvoke(..., config); this is required for Python < 3.11”. Forgetting this step can result in no tokens being streamed from the LLM (the model will still return a final result, but you won’t see incremental chunks). If you’re using Python 3.11+ or synchronous model.invoke, this is handled for you; just be aware of it if you encounter a situation where streaming events aren’t appearing.

Also, prefer the async .astream with async node functions when your application is itself asynchronous (for example, a web server forwarding tokens to clients). LangGraph also supports sync nodes and a sync .stream(), but a blocking call holds up the calling thread while it runs, and a model call that does not stream (or that buffers its output) means .stream can only emit after that node finishes. With an async LLM call that streams tokens (and proper config), .astream can deliver those token events as they happen without blocking the event loop. In short, use asynchronous LLM calls for streaming so that other tasks (like sending tokens to the client) aren’t blocked.

Common Pitfalls and Tips

To ensure a smooth streaming experience across multiple nodes, keep in mind these common mistakes and how to avoid them:
- Using yield in node functions: As discussed, turning a node into a generator will break the graph execution. Always return final results from nodes. Use LangGraph’s streaming utilities for intermediate data instead of Python generators.
- Incorrect conditional return values: Make sure routing functions return valid node names (or path_map keys) or END. Returning an unmapped value or mistakenly returning END will stop the graph early, making it seem like streaming stopped. Double-check your path_map and return values in conditional edges.
- Not enabling streaming on the LLM: If you use a custom LLM or chain, ensure it emits tokens via callbacks. For instance, ChatOpenAI in LangChain may need streaming=True, or must be invoked in a way that supports streaming. If the model/chain doesn’t stream, LangGraph will only get a final output to emit. The symptom is .astream(..., stream_mode="messages") yielding one chunk (the full message) at the end. Use models known to support streaming (e.g. ChatOpenAI from LangChain, which LangGraph leverages for token callbacks).
- Forgetting to compile or connect nodes: Always add edges between your nodes, define an entry point (an edge from START, or set_entry_point), and call compile() on the graph builder before running. If a node isn’t actually reached (due to a missing edge or a misconfigured entry point), your stream may appear to “hang” or end early. Ensure the graph structure covers all transitions you expect.
- Breaking out of the stream loop prematurely: If you stop iterating over graph.astream before it’s done, you’ll truncate the output. For example, avoid logic that stops when you think you have “enough” tokens, unless that’s intentional. Typically, let the graph run to completion (or until a known terminating condition in the state is reached via normal graph logic).
- Misunderstanding stream_mode output: Using the wrong mode can give the impression of broken streaming. For instance, stream_mode="values" only emits after each step completes (no token-level data), so an LLM node may still stream internally but you won’t see it token-by-token; you’d just get the final accumulated value at step end. If you expected a steady token flow, this looks “stuck” until the end of the node. The fix is to use messages mode for token streaming. Similarly, if you use a combination of modes, ensure your code handles the tuple outputs. A mistake like not unpacking the tuple (mode, data) could lead to confusion and possibly break your consumption loop.
- Not passing RunnableConfig in async nodes (Python < 3.11): As noted, forgetting this can silence token events. The node will still return a result, but you lose the incremental pieces.

In summary, to stream across multiple nodes: define each node to return its result (no Python yields), set up conditional transitions with correct return values, and call the graph’s astream with an appropriate stream_mode (often "messages" for LLM tokens). The streaming will naturally continue from one node to the next as long as the graph has not ended. If a transition seems to break the flow, inspect whether the graph stopped or if it’s just a non-streaming step. By following these practices – and using LangGraph’s built-in streaming tools – you can maintain a responsive, token-by-token output even in complex multi-step workflows.

Example scenario: Imagine a graph with three nodes: QuestionNode (prompts an LLM for an analysis), DecisionNode (decides if a follow-up is needed), and AnswerNode (prompts an LLM for the final answer). Using stream_mode="messages", when you run graph.astream(...), you’ll first receive tokens from QuestionNode’s LLM as it thinks (streamed in real-time). Once that node finishes, DecisionNode runs – it might quickly return either “AnswerNode” or END. During this brief step you might get an updates event (if using that mode) or no tokens (since it’s just logic). If it chose AnswerNode, the graph proceeds and you then start receiving tokens from AnswerNode’s LLM call. From the client perspective, it’s one continuous stream of AI output, with a slight pause between the first part and second part. The stream remains open the whole time – you didn’t have to restart anything – and it closes only after AnswerNode finishes and the graph ends. This illustrates how LangGraph manages state and control flow internally while you handle a single streaming output loop.

References
- LangGraph Streaming Guide (python.langchain.com, langchain-ai.github.io): streaming modes and usage of .stream/.astream
- LangGraph Conditional Edges (langchain-ai.github.io): how to return the next node or END in a path function
- Stack Overflow (stackoverflow.com): caution against using yield in LangGraph node functions
- LangGraph How-To (langchain-ai.github.io): using get_stream_writer() for custom streaming inside a node
- LangChain/LangGraph Discussion (reddit.com): limitation of streaming directly between nodes
- LangGraph Streaming Tokens Example (langchain-ai.github.io): importance of passing RunnableConfig in async calls (Python < 3.11)
