Experience with local LLMs #7758

shuber42 · 2026-06-08T09:19:04Z

shuber42
Jun 8, 2026

Hello,

I am currently trying to implement a solution based on Paperclip, but the results so far are very disappointed: Neither the Ollama nor Hermes adapter were able to properly call the tools, but only explained them.
Using OpenCode with qwen3.5:35b does not yield any results as well, I was able to let the CEO create one agent, but after this nothing happens, besides a loop that has no outcome (the transcript of the run is just saying 'step started ses_$hash').

I am therefore wondering what experience other people have made?
Has anyone a solid local setup that works reliably?

Tobi-Adesoye · 2026-06-08T16:59:26Z

Tobi-Adesoye
Jun 8, 2026

Optimizing Local LLM Tool-Calling & Loop Recovery

The infinite step started ses_$hash loop typically triggers when local models wrap tool arguments in conversational prose or markdown blocks rather than emitting pure, raw JSON schemas. Smaller context configurations often fail to balance system prompt constraints with complex tool definitions.
To stabilize a local Ollama/Hermes/Qwen setup on Paperclip, try these three adjustments:

Inject Strict Formatting Guidelines: Append explicit instructions to your baseline agent prompts forcing the model to output only the tool-call object:

[CRITICAL] Do not explain the tools. Do not output introductory text. Output valid JSON matching the schema directly.

Implement an In-Process Fallback Wrapper: If you are running custom nodes, introduce a regex parsing layer to strip conversational noise before it hits the execution engine:

python
Clean accidental markdown or prose wraps around local tool JSON strings
import re
def extract_clean_json(raw_output: str) -> str:
match = re.search(r"{.*}", raw_output, re.DOTALL)
return match.group(0) if match else raw_output

Scale Context & Penalties: If using Ollama, adjust your parameters in your configuration profile to tighten token predictability:
Set temperature to 0.0 (forces deterministic execution).
Ensure your local model context (num_ctx) is set to at least 8192 to prevent memory thrashing during multi-agent loops.

1 reply

shuber42 Jun 9, 2026
Author

Thanks for your suggestions!

I implemented your first and last suggestion.
Specifically, I set the temperature value of OpenCode to false in the configuration of the model and created a custom Modelfile that has the parameter temperature set to 0. And I added the instruction to the AGENTS.md file. This makes a huge difference and it works way better than before. I can't quite follow your second suggestion: Where exactly can I place middle layer code like this in paperclip? What exactly are custom nodes? I know that you can run a process (i.e. custom code) as an agent, is this what you refer to?

And nevertheless I end up in the state of "Missing disposition recovery blocked" pretty often. It seems like the model is just not finishing the Paperclip workflow as intended (not doing something with the issue, like changing the state). Do you have any more suggestions how to stabilize this?

Thank you!

Tobi-Adesoye · 2026-06-09T06:19:44Z

Tobi-Adesoye
Jun 9, 2026

@shuber42 Fantastic to hear that locking down the deterministic parameters and prompt guidelines smoothed out the local execution loop! OpenCode at `temperature=0` handles structured syntax constraints significantly better. To answer your questions on architecture and handling that "Missing disposition recovery blocked" bottleneck: 1. Where to Inject Middle-Layer Code in Paperclip When I mentioned "custom nodes," I was referring exactly to what you noted: custom execution runtimes, background processes, or tools exposed as executable primitives to your agents. In Paperclip’s framework, instead of rewriting core source engine code, you can inject that JSON extraction regex directly into the "Tool Execution Hook" or your "Custom Agent's Output Parser block". If you run your process as a custom agent script, you can include this helper function to strip away any conversational noise or accidental markdown wraps before the payload hits Paperclip's handler: python import re def extract_clean_json(raw_output: str) -> str: Cleans accidental conversational noise or markdown wraps around local tool JSON strings. match = re.search(r"{.*}", raw_output, re.DOTALL) return match.group(0) if match else raw_output Integration inside your Agent Execution Loop raw_response = local_llm.generate(prompt) clean_payload = extract_clean_json(raw_response) Pass clean_payload safely to Paperclip's dispatch handler 2. Solving "Missing disposition recovery blocked" This error triggers because the local LLM successfully processes the core engineering task, but its context window gets saturated, or it lacks the final logical push to trigger the structural termination API (e.g., calling update_issue_status(state="resolved")). It drops out of the loop without telling Paperclip it's finished. To stabilize this and force a clean exit disposition, try these two additions: - The "Final Step" Self-Correction Prompt: Update your AGENTS.md instructions with an explicit exit-condition sequence: [DISPOSITION CRITERIA] Once your analytical task or code modification is complete, you MUST explicitly invoke the final state mutation tool (e.g., updating issue state or submitting a disposition status). Do not end your execution block in conversational prose without executing a closing tool call. - *Inject an Explicit finish_task Tool:* Local models often struggle when the state update requires deep payload formatting. Exposing a dead-simple, argument-free primitive like complete_agent_session() or close_active_ticket() gives smaller local models an easy, highly visible target to call when they hit the end of their logic. Let me know if adding that explicit termination prompt forces OpenCode to clean up its states properly!

…

On Tue, Jun 9, 2026 at 7:01 AM shuber42 ***@***.***> wrote: Thanks for your suggestions! I implemented your first and last suggestion. Specifically, I set the temperature value of OpenCode to false in the configuration of the model and created a custom Modelfile that has the parameter temperature set to 0. And I added the instruction to the AGENTS.md file. This makes a huge difference and it works way better than before. I can't quite follow your second suggestion: Where exactly can I place middle layer code like this in paperclip? What exactly are custom nodes? I know that you can run a process (i.e. custom code) as an agent, is this what you refer to? And nevertheless I end up in the state of "Missing disposition recovery blocked" pretty often. It seems like the model is just not finishing the Paperclip workflow as intended (not doing something with the issue, like changing the state). Do you have any more suggestions how to stabilize this? Thank you! — Reply to this email directly, view it on GitHub <#7758?email_source=notifications&email_token=AQSPCWFPX6742KPLIP2CT6T466R3PA5CNFSNUABIM5UWIORPF5TWS5BNNB2WEL2ENFZWG5LTONUW63SDN5WW2ZLOOQXTCNZSGMYTSNRYUZZGKYLTN5XKOY3PNVWWK3TUUVSXMZLOOSWGM33PORSXEX3DNRUWG2Y#discussioncomment-17231968>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AQSPCWHHNZ436YAHNG4X5KT466R3PAVCNFSM6AAAAACZ6ZMRIOVHI2DSMVQWIX3LMV43URDJONRXK43TNFXW4Q3PNVWWK3TUHMYTOMRTGE4TMOA> . Triage notifications, keep track of coding agent tasks and review pull requests on the go with GitHub Mobile for iOS <https://github.com/notifications/mobile/ios/AQSPCWHXJ2YEK2G5OKOMGHT466R3PA5CNFSNUABIM5UWIORPF5TWS5BNNB2WEL2ENFZWG5LTONUW63SDN5WW2ZLOOQXTCNZSGMYTSNRYUZZGKYLTN5XKOY3PNVWWK3TUUVSXMZLOOSVGM33PORSXEX3JN5ZQ> and Android <https://github.com/notifications/mobile/android/AQSPCWHVF6QB4LF47IHKNMT466R3PA5CNFSNUABIM5UWIORPF5TWS5BNNB2WEL2ENFZWG5LTONUW63SDN5WW2ZLOOQXTCNZSGMYTSNRYUZZGKYLTN5XKOY3PNVWWK3TUUVSXMZLOOSXGM33PORSXEX3BNZSHE33JMQ>. Download it today! You are receiving this because you commented.Message ID: ***@***.***>

1 reply

shuber42 Jun 10, 2026
Author

So if I understand you correctly, this is only valid for custom agent scripts, but not for out of the box solutions like OpenCode?

Thanks for the suggestion! I let it run over night now and the results are better, but not where I would like them. Of around three dozen tasks, there are two hands full in the disposition state. But to be fair, I only added the termination prompt, so maybe I can look into the finish_task tool to improve disposition stability.
I wonder what local models you can recommend for Paperclip?

Tobi-Adesoye · 2026-06-10T06:32:07Z

Tobi-Adesoye
Jun 10, 2026

Hi shuber42, You’ve got it exactly right. Out-of-the-box, turn-key solutions like OpenCode usually handle their loops internally, so these explicit agent-loop interventions are primarily for when you are configuring custom orchestration scripts or connecting raw agent frameworks. It’s great to hear that adding the termination prompt cut down the loop failures overnight! Moving from complete infinite-step stalls to having tasks land in the "disposition state" is a huge structural step forward. This means the model is successfully attempting to wrap up; it’s just stumbling on the landing gear. Moving forward, implementing that explicit `finish_task` tool is definitely your best next lever. It shifts the burden of ending a task from conversational text generation (which local models tend to stretch out) to a deterministic function call. Regarding local models that play nicely with Paperclip’s multi-agent architecture and heartbeat loop: 1. Qwen 2.5 (14B / 32B / 72B - Instruct) **Why it fits:** Qwen is arguably the gold standard for local tool-calling right now. It has excellent native support for function schemas and respects system prompts tightly. If you have the hardware to run the 32B or 72B variants (even heavily quantized), your task disposition stability will skyrocket. 2. DeepSeek-V2.5 (or Llama-3-Instruct fine-tunes) **Why it fits:** These models handle dense instruction-following and structured ticketing/markdown text generation exceptionally well. They are far less prone to conversational drift, which directly cuts down on those infinite tool-arg packaging loops. 3. Llama-3.1-8B-Instruct (For lighter tasks) **Why it fits:** If you are restricted to a lower VRAM budget, the 8B variant works, but *only* if you use the `finish_task` tool approach. It needs that rigid, functional guardrail to cleanly exit Paperclip tasks without trailing off into conversational filler. Keep me posted on how the `finish_task` tool updates handle those remaining tickets in disposition. It sounds like you are incredibly close to a rock-solid local setup! Best, Tobi

…

On Wed, Jun 10, 2026 at 7:18 AM shuber42 ***@***.***> wrote: So if I understand you correctly, this is only valid for custom agent scripts, but not for out of the box solutions like OpenCode? Thanks for the suggestion! I let it run over night now and the results are better, but not where I would like them. Of around three dozen tasks, there are two hands full in the disposition state. But to be fair, I only added the termination prompt, so maybe I can look into the finish_task tool to improve disposition stability. I wonder what local models you can recommend for Paperclip? — Reply to this email directly, view it on GitHub <#7758?email_source=notifications&email_token=AQSPCWDW64GG7FDFXZXGLRT47D4VFA5CNFSNUABIM5UWIORPF5TWS5BNNB2WEL2ENFZWG5LTONUW63SDN5WW2ZLOOQXTCNZSGQ3DSMRTUZZGKYLTN5XKOY3PNVWWK3TUUVSXMZLOOSWGM33PORSXEX3DNRUWG2Y#discussioncomment-17246923>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AQSPCWHU3ZLDCH3FQVVGKR347D4VFAVCNFSM6AAAAACZ6ZMRIOVHI2DSMVQWIX3LMV43URDJONRXK43TNFXW4Q3PNVWWK3TUHMYTOMRUGY4TEMY> . Triage notifications, keep track of coding agent tasks and review pull requests on the go with GitHub Mobile for iOS <https://github.com/notifications/mobile/ios/AQSPCWA6SL5WJTV7UXAC5MD47D4VFA5CNFSNUABIM5UWIORPF5TWS5BNNB2WEL2ENFZWG5LTONUW63SDN5WW2ZLOOQXTCNZSGQ3DSMRTUZZGKYLTN5XKOY3PNVWWK3TUUVSXMZLOOSVGM33PORSXEX3JN5ZQ> and Android <https://github.com/notifications/mobile/android/AQSPCWEWETJBZEON4IHFH4347D4VFA5CNFSNUABIM5UWIORPF5TWS5BNNB2WEL2ENFZWG5LTONUW63SDN5WW2ZLOOQXTCNZSGQ3DSMRTUZZGKYLTN5XKOY3PNVWWK3TUUVSXMZLOOSXGM33PORSXEX3BNZSHE33JMQ>. Download it today! You are receiving this because you commented.Message ID: ***@***.***>

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Experience with local LLMs #7758

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Replies: 3 comments 2 replies

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Experience with local LLMs #7758

Uh oh!

Uh oh!

shuber42 Jun 8, 2026

Replies: 3 comments · 2 replies

Uh oh!

Tobi-Adesoye Jun 8, 2026

Uh oh!

shuber42 Jun 9, 2026 Author

Uh oh!

Tobi-Adesoye Jun 9, 2026

Uh oh!

shuber42 Jun 10, 2026 Author

Uh oh!

Tobi-Adesoye Jun 10, 2026

shuber42
Jun 8, 2026

Replies: 3 comments 2 replies

Tobi-Adesoye
Jun 8, 2026

shuber42 Jun 9, 2026
Author

Tobi-Adesoye
Jun 9, 2026

shuber42 Jun 10, 2026
Author

Tobi-Adesoye
Jun 10, 2026