Replies: 3 comments
-
|
Hey @sharpdima, great question — your HTTP Request + short timeout approach for fast failover is clever. Here's what I found on both of your questions: Accessing chat history in Code/HTTP Request nodes: Unfortunately, there is no built-in There are two main workarounds:
Setting a short timeout on the native LLM node: There's no per-node connection timeout setting exposed in the UI. The relevant environment variables are:
These are global (not per-node), so lowering them would affect all LLM nodes. You can't set a 2-second timeout on just one LLM node without your HTTP Request approach. Recommended approach: Your HTTP Request strategy is actually the best path for fast failover. Combine it with workaround #1 above (Variable Assigner storing history in a conversation variable array) to reconstruct chat context. Pass that array into a Code node that formats the OpenAI-compatible messages payload, then feed the output into your HTTP Request node. This gives you both the 2s timeout failover and conversation memory. Alternatively, Dify does support per-node retry configuration with a "First Token Timeout" mechanism [7] — if the first token doesn't arrive within a configured timeout, it retries. Combined with the Fail Branch, this might help, but it still won't give you the sub-2-second failover you're achieving with the HTTP Request approach. To reply, just mention @dosu. Docs are dead. Just use Dosu. |
Beta Was this translation helpful? Give feedback.
-
|
This is a very practical failover pattern. For production Chatflow, I’d probably keep your HTTP Request approach for the primary DeepSeek call, because a 2s connection timeout is much better for UX than waiting for the native LLM node timeout. For the memory issue, I’d avoid trying to depend on hidden
This also makes the workflow more provider-independent. For DeepSeek or any OpenAI-compatible API, fast failover + compact memory + fallback routing is usually more reliable than depending on one native provider node. I’m working around DeepSeek / OpenAI-compatible API access for small AI builders, and this kind of failover + stability problem is exactly what matters once a chatbot moves from testing to real users. |
Beta Was this translation helpful? Give feedback.
-
|
I like the HTTP node workaround, but losing conversation context is definitely a tradeoff. It would be great if "sys.chat_history" were exposed in a format that could be passed directly into Code or HTTP nodes for custom integrations. Also, fast failover is really important for production agents. Beyond switching models quickly, I've found it useful to have runtime safeguards that detect repeated failures or retry loops. I've been following the open source FailproofAI project for ideas in this area: https://github.com/FailproofAI/failproofai. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Self Checks
Content
Hi Dify Team,
I'm building a Chatflow and need some advice on optimizing the failover speed between LLMs.
Currently, I use a primary LLM node (DeepSeek API) with a "Fail Branch" routing to a fallback model (Claude). Due to regional network instability, the primary API sometimes drops. However, the native LLM node takes too long to timeout and trigger the fail branch, resulting in a poor user experience.
To solve this, I replaced the native LLM node with an HTTP Request node (calling the DeepSeek API directly). By setting the HTTP connection timeout to 2 seconds, the failover to the fallback model works instantly and perfectly!
The Problem:
By using the HTTP Request node, I lost access to conversational memory. When I try to build the message payload in a Code node right before the HTTP node, the sys.chat_history variable is not available in the variable selector to be passed as an array.
My Questions:
Is there a way to access or format sys.chat_history so it can be passed into an HTTP Request / Code node?
Or, alternatively, is there a way to set a very short, node-specific connection timeout (e.g., 2s) directly on the native LLM node so it fails over quickly without waiting for the global daemon timeout?
Beta Was this translation helpful? Give feedback.
All reactions