feat: Phase 0 architecture optimize and LLM heartbeat #38
Merged
Extend claw_tool_t with required_caps (SWARM_CAP_* bitmap) and flags (CLAW_TOOL_LOCAL_ONLY) so the swarm RPC layer can match tools to capable nodes and refuse to delegate local-only tools.

Update claw_tool_register() signature and all 29 call sites:
- GPIO tools: SWARM_CAP_GPIO
- LCD tools: SWARM_CAP_LCD
- Audio tools: SWARM_CAP_SPEAKER
- Net tools: SWARM_CAP_INTERNET
- System/sched/skill tools: CLAW_TOOL_LOCAL_ONLY

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
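As a rough illustration of how the two new fields could drive a delegation decision, here is a minimal sketch. The struct layout, the bit values, and the helper tool_can_delegate() are assumptions made for this example; only the names claw_tool_t, required_caps, flags, SWARM_CAP_* and CLAW_TOOL_LOCAL_ONLY come from the commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values are illustrative; the real SWARM_CAP_* values may differ. */
#define SWARM_CAP_GPIO       (1u << 0)
#define SWARM_CAP_LCD        (1u << 1)
#define SWARM_CAP_SPEAKER    (1u << 2)
#define SWARM_CAP_INTERNET   (1u << 3)

#define CLAW_TOOL_LOCAL_ONLY (1u << 0)

/* Hypothetical, reduced view of claw_tool_t: only the new fields matter here. */
typedef struct {
    const char *name;
    uint32_t required_caps;   /* SWARM_CAP_* bitmap a node must provide */
    uint32_t flags;           /* e.g. CLAW_TOOL_LOCAL_ONLY */
} claw_tool_t;

/* Hypothetical helper: may this tool be delegated to a node with node_caps? */
bool tool_can_delegate(const claw_tool_t *tool, uint32_t node_caps)
{
    if (tool->flags & CLAW_TOOL_LOCAL_ONLY) {
        return false;   /* local-only tools never leave this node */
    }
    /* Every required capability bit must be present on the node. */
    return (node_caps & tool->required_caps) == tool->required_caps;
}
```

Under these assumptions, a GPIO tool is delegable only to nodes advertising SWARM_CAP_GPIO, while a scheduler tool flagged CLAW_TOOL_LOCAL_ONLY is refused regardless of the remote node's capabilities.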
Extend heartbeat from 16 to 20 bytes with role and active_tasks fields. Add enum swarm_role (WORKER/THINKER/COORDINATOR/OBSERVER) with automatic self-detection based on capabilities.

Replace first-match node selection with load-aware strategy that picks the online node with lowest load among those matching the required capability bitmap.

Add exponential-backoff RPC retry (3 attempts, 500ms/1s/2s) and refuse to delegate tools marked CLAW_TOOL_LOCAL_ONLY.

Replace tool_name_to_cap() prefix matching with tool registry lookup via claw_tool_find()->required_caps, falling back to prefix heuristic for unregistered tools.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
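The load-aware selection and the retry schedule described above can be sketched as follows. This is not the project's code: struct swarm_node, swarm_pick_node() and swarm_retry_delay_ms() are hypothetical names, and the values given to SWARM_RPC_MAX_RETRIES and SWARM_RPC_RETRY_BASE_MS are inferred from the "3 attempts, 500ms/1s/2s" wording rather than taken from the source tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Values inferred from "3 attempts, 500ms/1s/2s"; treat them as assumptions. */
#define SWARM_RPC_MAX_RETRIES    3
#define SWARM_RPC_RETRY_BASE_MS  500

/* Hypothetical node record holding only what the selection needs. */
struct swarm_node {
    uint32_t caps;     /* SWARM_CAP_* bitmap advertised by the node */
    uint8_t  online;   /* non-zero while heartbeats are arriving */
    uint8_t  load;     /* load figure reported in the heartbeat */
};

/* Pick the online node with the lowest load that has all required caps.
 * Returns the node index, or -1 if no node qualifies. */
int swarm_pick_node(const struct swarm_node *nodes, size_t count,
                    uint32_t required_caps)
{
    int best = -1;

    for (size_t i = 0; i < count; i++) {
        if (!nodes[i].online) {
            continue;
        }
        if ((nodes[i].caps & required_caps) != required_caps) {
            continue;
        }
        if (best < 0 || nodes[i].load < nodes[best].load) {
            best = (int)i;
        }
    }
    return best;
}

/* Delay before retry number `attempt` (0-based): base, 2x base, 4x base. */
uint32_t swarm_retry_delay_ms(unsigned attempt)
{
    return (uint32_t)SWARM_RPC_RETRY_BASE_MS << attempt;
}
```

A caller would loop up to SWARM_RPC_MAX_RETRIES times, sleeping swarm_retry_delay_ms(attempt) after each failed attempt.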
Replace the routing skeleton with a working service registry and type-based message dispatch. Services register with a type_mask bitmap and their own message queue; gateway delivers incoming messages to all matching consumers.

Add GW_MSG_AI_REQ message type for future AI request queuing.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
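A minimal sketch of type_mask dispatch, assuming a fixed-size registry table. The function names, the table size, the bit values, and GW_MSG_EXAMPLE are hypothetical; a plain counter stands in for each service's message queue so the example stays free of RTOS dependencies.

```c
#include <stdint.h>

#define GW_MAX_SERVICES 8            /* hypothetical registry size */

/* Bit values are illustrative. Only GW_MSG_AI_REQ is named in the commit. */
#define GW_MSG_EXAMPLE  (1u << 0)    /* hypothetical second message type */
#define GW_MSG_AI_REQ   (1u << 1)

struct gw_service {
    uint32_t type_mask;   /* message types this service wants */
    int *queue_depth;     /* stand-in for the service's own message queue */
};

static struct gw_service s_services[GW_MAX_SERVICES];
static int s_service_count;

/* Register a consumer. Returns 0 on success, -1 if the table is full. */
int gw_register_service(uint32_t type_mask, int *queue_depth)
{
    if (s_service_count >= GW_MAX_SERVICES) {
        return -1;
    }
    s_services[s_service_count].type_mask = type_mask;
    s_services[s_service_count].queue_depth = queue_depth;
    s_service_count++;
    return 0;
}

/* Deliver one message to every service whose mask matches its type.
 * Returns the number of consumers that received it. */
int gw_dispatch(uint32_t msg_type)
{
    int delivered = 0;

    for (int i = 0; i < s_service_count; i++) {
        if (s_services[i].type_mask & msg_type) {
            (*s_services[i].queue_depth)++;   /* "enqueue" */
            delivered++;
        }
    }
    return delivered;
}
```

Because delivery fans out to all matching consumers, two services that both subscribe to GW_MSG_AI_REQ each receive a copy of the message.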
The serial shell processed backspace byte-by-byte, requiring multiple presses to delete a single CJK character (3-byte UTF-8).

Fix all line-editing operations to be UTF-8 aware:
- Backspace: walk back over continuation bytes, delete entire sequence, erase correct column count (2 for CJK/emoji)
- Delete key: detect UTF-8 lead byte to determine sequence length
- Left/right arrows: skip complete UTF-8 sequences
- Character input: read all continuation bytes atomically before inserting, preventing partial-character echo

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
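The byte-walking that these operations rely on might look like the sketch below. All helper names are hypothetical, and the column rule (2 columns for 3- and 4-byte sequences) is a simplifying assumption for illustration; the shell's actual width logic is not shown in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* True for UTF-8 continuation bytes, which have the form 10xxxxxx. */
int utf8_is_cont(uint8_t b)
{
    return (b & 0xC0) == 0x80;
}

/* Backspace / left arrow: byte index where the character that ends
 * just before `pos` begins. */
size_t utf8_prev_start(const char *buf, size_t pos)
{
    if (pos == 0) {
        return 0;
    }
    pos--;
    while (pos > 0 && utf8_is_cont((uint8_t)buf[pos])) {
        pos--;   /* walk back over continuation bytes to the lead byte */
    }
    return pos;
}

/* Delete key / right arrow: sequence length implied by a lead byte. */
size_t utf8_seq_len(uint8_t lead)
{
    if (lead < 0x80) {
        return 1;
    }
    if ((lead & 0xE0) == 0xC0) {
        return 2;
    }
    if ((lead & 0xF0) == 0xE0) {
        return 3;
    }
    if ((lead & 0xF8) == 0xF0) {
        return 4;
    }
    return 1;   /* invalid lead byte: treat as a single byte */
}

/* Assumed heuristic: 3- and 4-byte sequences occupy two terminal columns. */
int utf8_display_cols(size_t seq_len)
{
    return seq_len >= 3 ? 2 : 1;
}
```

With the buffer holding "a" followed by a 3-byte CJK character, a backspace at the end moves the cursor from byte 4 to byte 1, removes three bytes, and erases two columns.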
The single AI worker thread dropped task callbacks when busy, starving lower-frequency tasks. With 3 tasks (10s, 15s, 30s), the 10s GPIO task monopolized the worker while the other two never executed.

Add a pending flag to sched_ai_ctx_t. When the worker is busy, mark the task as pending instead of discarding it. After each task completes, the worker scans all contexts in round-robin order and immediately executes the next pending task before sleeping on the semaphore.

This ensures all scheduled tasks eventually execute regardless of their interval, with zero additional memory or threads.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
The goto next_task loop never returned to sem_take because timer callbacks continuously set pending=1 during AI calls. With a 10s GPIO task taking ~5s per AI call, the worker drained one pending just as the next arrived, spinning forever and flooding the console.

Remove the goto loop. The worker now processes exactly one task per sem_take wakeup, then sets worker_busy=0 and sleeps. The callback does sem_give when marking pending, so the worker wakes up promptly for the next queued task without spinning.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
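The resulting "one task per wakeup" behaviour can be modelled without an RTOS, as sketched below. An integer stands in for the counting semaphore, and sched_on_timer(), sched_worker_step(), the runs counter, and the reduced sched_ai_ctx_t are hypothetical. Giving the semaphore only on a 0 to 1 transition of the pending flag is also an assumption of this sketch.

```c
#include <stddef.h>

#define SCHED_MAX_TASKS 3

/* Hypothetical reduced sched_ai_ctx_t: just the pending flag and a counter. */
typedef struct {
    int pending;   /* set by the timer callback, cleared by the worker */
    int runs;      /* how many times the task has executed */
} sched_ai_ctx_t;

static sched_ai_ctx_t s_ctx[SCHED_MAX_TASKS];
static int s_sem;       /* stand-in for the worker's counting semaphore */
static size_t s_next;   /* round-robin cursor */

/* Timer callback: mark the task pending and wake the worker (sem_give). */
void sched_on_timer(size_t task)
{
    if (!s_ctx[task].pending) {
        s_ctx[task].pending = 1;
        s_sem++;
    }
}

/* One worker wakeup: run at most one pending task, then go back to sleep.
 * Returns the index of the task that ran, or -1 if nothing was runnable. */
int sched_worker_step(void)
{
    if (s_sem == 0) {
        return -1;   /* a real worker would block in sem_take here */
    }
    s_sem--;

    for (size_t n = 0; n < SCHED_MAX_TASKS; n++) {
        size_t i = (s_next + n) % SCHED_MAX_TASKS;

        if (s_ctx[i].pending) {
            s_ctx[i].pending = 0;
            s_ctx[i].runs++;                      /* the AI call would go here */
            s_next = (i + 1) % SCHED_MAX_TASKS;   /* resume after this task */
            return (int)i;
        }
    }
    return -1;
}
```

Because the cursor advances past the task that just ran, a fast task that becomes pending again cannot overtake slower tasks that are already waiting.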
Reflect the recent architecture optimizations across all docs:

architecture.md (en/zh):
- Gateway: service registry with type_mask dispatch, AI_REQ type
- Scheduler: round-robin pending queue, task starvation prevention
- AI Engine: tool capability declarations (SWARM_CAP_*, LOCAL_ONLY)
- Swarm: 20B heartbeat with role/load, load-aware node selection, exponential-backoff RPC retry, required_caps matching
- Resource budget: updated to measured 43% usage (100KB free heap)

tuning.md (en/zh):
- ESP32-C3 memory section: measured runtime data, NET_RESP_MAX reduction (16KB->4KB), heap-allocated sched buffers
- Add SWARM_RPC_MAX_RETRIES and SWARM_RPC_RETRY_BASE_MS params

CLAUDE.md:
- Key Paths: gateway and tools descriptions updated

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
Implement the "cheap checks first" pattern used by OpenClaw and other Claw projects. When no events are pending, the heartbeat tick performs a lightweight LLM ping (max_tokens=1, ~200B request) instead of skipping entirely.

New ai_ping() in ai_engine sends a minimal API request without acquiring s_api_lock, so it never blocks interactive ai_chat() calls. Any HTTP response (including 4xx) counts as "online"; only network failures count as "offline".

State transitions (online<->offline) are logged and delivered to IM/console. heartbeat_llm_online() exposes the current state for other modules to query.

Ping thread uses a 4KB stack (vs 8KB for full heartbeat AI thread), keeping memory overhead minimal.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
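The online/offline rule and the edge-triggered notification can be sketched as below. The convention that a positive value is an HTTP status and a non-positive value is a transport failure is an assumption, as are llm_ping_is_online(), heartbeat_update_llm_state(), and the signature given to heartbeat_llm_online(); the commit names that last function but does not show its prototype.

```c
#include <stdbool.h>

/* Assumed convention: > 0 is an HTTP status code (any response at all),
 * <= 0 is a network-level failure (DNS, connect, timeout). */
bool llm_ping_is_online(int http_status)
{
    return http_status > 0;   /* even a 4xx proves the endpoint is reachable */
}

static bool s_llm_online;     /* this sketch starts out as "offline" */

/* Record a ping result. Returns true only when the state changed, which is
 * the moment a caller would log it and notify IM/console. */
bool heartbeat_update_llm_state(int http_status)
{
    bool now = llm_ping_is_online(http_status);
    bool changed = (now != s_llm_online);

    s_llm_online = now;
    return changed;
}

/* Query the last known state (signature assumed). */
bool heartbeat_llm_online(void)
{
    return s_llm_online;
}
```

Reporting only on transitions keeps a long outage from producing one notification per heartbeat tick.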
Move ai_boot_test_thread() inside the same #ifdef guards that protect its only call site (CONFIG_RTCLAW_AI_BOOT_TEST && no IM).

Remove http_get_test() from net_service.c: dead code with zero callers, leftover from early bring-up debugging.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
Summary
- required_caps (SWARM_CAP_* bitmap) and flags (CLAW_TOOL_LOCAL_ONLY) fields for swarm routing decisions
- required_caps matching
- ai_ping() (max_tokens=1, ~200B) with online/offline state change notifications
- Remove http_get_test(), guard ai_boot_test_thread() with matching #ifdef

Test plan
- meson compile passes on vexpress-a9 (zero warnings)
- make build-esp32c3-qemu passes (zero warnings)
- scripts/check-patch.sh --staged passes on all commits
- make run-esp32c3-qemu

🦞 Generated with Claude Code