Best practice for fast LLM failover and accessing sys.chat_history in Chatflow #37332

sharpdima · 2026-06-11T09:53:51Z

sharpdima
Jun 11, 2026

Self Checks

I have searched for existing issues search for existing issues, including closed ones.
I confirm that I am using English to submit this report (我已阅读并同意 Language Policy).
[FOR CHINESE USERS] 请务必使用英文提交 Issue，否则会被关闭。谢谢！:)
Please do not modify this template :) and fill in all the required fields.

Content

Hi Dify Team,

I'm building a Chatflow and need some advice on optimizing the failover speed between LLMs.

Currently, I use a primary LLM node (DeepSeek API) with a "Fail Branch" routing to a fallback model (Claude). Due to regional network instability, the primary API sometimes drops. However, the native LLM node takes too long to timeout and trigger the fail branch, resulting in a poor user experience.

To solve this, I replaced the native LLM node with an HTTP Request node (calling the DeepSeek API directly). By setting the HTTP connection timeout to 2 seconds, the failover to the fallback model works instantly and perfectly!

The Problem:
By using the HTTP Request node, I lost access to conversational memory. When I try to build the message payload in a Code node right before the HTTP node, the sys.chat_history variable is not available in the variable selector to be passed as an array.

My Questions:

Is there a way to access or format sys.chat_history so it can be passed into an HTTP Request / Code node?

Or, alternatively, is there a way to set a very short, node-specific connection timeout (e.g., 2s) directly on the native LLM node so it fails over quickly without waiting for the global daemon timeout?

@sharpdima · 2026-06-11T10:02:05Z

dosubot[bot]
Bot Jun 11, 2026

Hey @sharpdima, great question — your HTTP Request + short timeout approach for fast failover is clever. Here's what I found on both of your questions:

Accessing chat history in Code/HTTP Request nodes:

Unfortunately, there is no built-in sys.chat_history system variable in Dify Chatflow [1]. This is a commonly requested feature. The available system variables (sys.query, sys.conversation_id, sys.dialogue_count, etc.) don't include the actual message history.

There are two main workarounds:

Variable Assigner approach [2] — Store each Q&A pair into a conversation-scoped array variable using a Variable Assigner node + Code node for array merging. Then pass that accumulated array into your HTTP Request node's payload. You'd reference the user question from the Start node and the LLM answer from a previous Answer node, appending them each turn.
API-based retrieval [3] — Use sys.conversation_id (which is available) in a Code node to call Dify's own /messages API endpoint, retrieving the conversation history programmatically and formatting it for your HTTP request payload.

Setting a short timeout on the native LLM node:

There's no per-node connection timeout setting exposed in the UI. The relevant environment variables are:

HTTP_REQUEST_MAX_CONNECT_TIMEOUT (default 10s) — for HTTP Request nodes [4]
PLUGIN_DAEMON_TIMEOUT (default 600s) — for LLM node plugin communication [5]
API_TOOL_DEFAULT_CONNECT_TIMEOUT (default 10s) — connection timeout reaching LLM endpoints [6]

These are global (not per-node), so lowering them would affect all LLM nodes. You can't set a 2-second timeout on just one LLM node without your HTTP Request approach.

Recommended approach:

Your HTTP Request strategy is actually the best path for fast failover. Combine it with workaround #1 above (Variable Assigner storing history in a conversation variable array) to reconstruct chat context. Pass that array into a Code node that formats the OpenAI-compatible messages payload, then feed the output into your HTTP Request node. This gives you both the 2s timeout failover and conversation memory.

Alternatively, Dify does support per-node retry configuration with a "First Token Timeout" mechanism [7] — if the first token doesn't arrive within a configured timeout, it retries. Combined with the Fail Branch, this might help, but it still won't give you the sub-2-second failover you're achieving with the HTTP Request approach.

To reply, just mention @dosu.

Docs are dead. Just use Dosu.

0 replies

richardchen874-sys · 2026-06-14T07:20:40Z

richardchen874-sys
Jun 14, 2026

This is a very practical failover pattern.

For production Chatflow, I’d probably keep your HTTP Request approach for the primary DeepSeek call, because a 2s connection timeout is much better for UX than waiting for the native LLM node timeout.

For the memory issue, I’d avoid trying to depend on hidden sys.chat_history. A cleaner setup could be:

store compact conversation state in a conversation variable
format only the last N turns into OpenAI-compatible messages
keep the HTTP Request node for fast DeepSeek timeout
route to Claude fallback when DeepSeek fails
log timeout / fallback rate separately so you can see how often the primary provider drops

This also makes the workflow more provider-independent. For DeepSeek or any OpenAI-compatible API, fast failover + compact memory + fallback routing is usually more reliable than depending on one native provider node.

I’m working around DeepSeek / OpenAI-compatible API access for small AI builders, and this kind of failover + stability problem is exactly what matters once a chatbot moves from testing to real users.

0 replies

ishita-0301 · 2026-06-16T06:07:48Z

ishita-0301
Jun 16, 2026

I like the HTTP node workaround, but losing conversation context is definitely a tradeoff. It would be great if "sys.chat_history" were exposed in a format that could be passed directly into Code or HTTP nodes for custom integrations.

Also, fast failover is really important for production agents. Beyond switching models quickly, I've found it useful to have runtime safeguards that detect repeated failures or retry loops. I've been following the open source FailproofAI project for ideas in this area: https://github.com/FailproofAI/failproofai.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Best practice for fast LLM failover and accessing sys.chat_history in Chatflow #37332

Uh oh!

{{title}}

Uh oh!

Replies: 3 comments

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Best practice for fast LLM failover and accessing sys.chat_history in Chatflow #37332

Uh oh!

sharpdima Jun 11, 2026

Self Checks

Content

Replies: 3 comments

Uh oh!

dosubot[bot] Bot Jun 11, 2026

Uh oh!

richardchen874-sys Jun 14, 2026

Uh oh!

ishita-0301 Jun 16, 2026

sharpdima
Jun 11, 2026

dosubot[bot]
Bot Jun 11, 2026

richardchen874-sys
Jun 14, 2026

ishita-0301
Jun 16, 2026