LILBOT AGENT

Clean-room Local Coding Agent / Windows-first / DeepSeek-ready

LilBot is a futuristic local coding-agent lab: agent loop, tool bus, permission gate, sandbox, memory core, skills, subagents, and MCP-style adapters.

Current Status

LilBot is now past the empty-shell stage. The project has a working local agent loop, OpenAI-compatible provider layer, tool registry, workspace sandbox, permission manager, durable memory, markdown skills, subagents, MCP-style adapters, a Windows-first TUI, and a growing compatibility surface inspired by and LilBot.

The main quality focus has shifted from adding more compatible names to making core capabilities enforceable and durable:

Subagent allowed_tools are enforced at runtime, including explicit empty allowlists and Claude-style tool names such as Read and Grep.
Custom subagents now use a five-gate allowed-tool flow: gates 1-3 reject unsafe creation, while gates 4-5 deny unsafe runtime tool calls and record transcript evidence.
Forked skills now execute through the subagent runtime instead of only rendering prompt text back into the parent conversation.
Subagent transcripts are persisted under .lilbot/subagent-transcripts/ and exposed through transcript_handle.
Subagent lifecycle now has a configurable concurrency cap, persisted task restart resume, transcript-cursor reads, structured dashboard progress, and optional subagent-level worktree isolation.
agent_open, Agent, and Task now expose dynamic tool descriptions: active built-in/custom agents, when-to-use guidance, full tool allowlists, active subagent status, and continue-existing-agent guidance are rendered into the tool schemas seen by the parent model.
The deterministic delegation planner remains as a tested probe/reference for question bursts, code exploration, research, mixed public facts, and writing fallback, while the main loop now relies on the parent model reading the dynamic Agent tool prompt to choose subagents during normal tool-calling.
Delegation routing now has a regression matrix and a local probe script for simple prompts, no-question-mark Chinese question bursts, code exploration, research, mixed public facts, and semantic-planner writing fallback.
EnterPlanMode and ExitPlanMode persist plan lifecycle and approval state.
Pending plan approval now blocks write and execution tools through the central tool registry until the plan is approved or rejected.
Windows shell execution now runs through a PowerShell safety analyzer that classifies separators, redirection, subprocess boundaries, background launches, destructive commands, and unsafe delete/move targets before permission prompts.
EnterWorktree and ExitWorktree probe git worktree support, return an honest unsupported result, and support cleanup/remove for created worktrees.
Worktree isolation now has branch naming and WorktreeMergeBack dry-run or merge execution for bringing a worktree branch back into the target branch.
LSP phase 2 is available through symbols, definitions, workspace symbols, references, diagnostics, and rename preview tools. LilBot uses a local language server when one is installed, otherwise it falls back to Python AST, regex symbols, grep evidence, and project-map context.
Batch 1 workspace cleanup is implemented: apply_patch has a pure-Python fallback, run_tests writes log artifacts under .lilbot/test-artifacts/, and project_map now detects frameworks, entrypoints, package managers, and key source files.
The dashboard Trace panel starts with a WELCOME banner, the left Agent info card shows a wider grouped tool/skill inventory, the Work panel shows runtime, active tool, subagent, transcript, and worktree status, and Windows /copy / F2 uses the native Unicode clipboard format for Chinese text instead of clip.exe.
Composer slash commands now have an explicit fast path: local and local-ui commands such as /clear, /tokens, /plan, and /do are handled locally, while prompt commands such as /review intentionally enter the Agent Loop.
The current test suite covers these enforcement and lifecycle paths.

Current verified baseline:

python -m pytest
109 passed, 6 skipped

Whole Architecture

now we only have the CLI version, in the future, I will find some guy to coorperate with me to develop a coller software version like this:

Why `python -m lilbot`

-m means run a Python module as a program.

When you run:

python -m lilbot

Python does this:

current conda/python environment
  -> find package named lilbot
  -> execute lilbot/__main__.py
  -> __main__.py calls lilbot.cli:main()

Why this is good on Windows:

It uses the exact python from your active conda environment.
It avoids hardcoding script paths.
It works before installing a global lilbot.exe command.
It is the standard way to run package-style CLIs during development.

Later we can also expose:

lilbot

through pyproject.toml, but python -m lilbot is the cleanest dev command.

Flight Deck

LilBot is aiming for a terminal cockpit, not a boring command prompt.

┌──────────────────────────────────────────────────────────────┐
│ Agent LilBot-agent-code - deepseek-v4-flash     ready  v0.1  │
├─────────────────────────────┬────────────────────────────────┤
│                             │                                │
│       L I L B O T           │      Work / Tool Stream         │
│   local coding agent        │      permissions / memory       │
│                             │      subagents / mcp            │
├─────────────────────────────┴────────────────────────────────┤
│ Composer: write a task, use /, or run ! command safely        │
└──────────────────────────────────────────────────────────────┘

Current default renderer: prompt_toolkit full-screen dashboard.

Classic fallback renderer: Rich, available with --classic.

Python can absolutely build a CLI/TUI as polished as TypeScript tools. Terminals receive ANSI escape sequences, keyboard events, mouse events, and layout redraws. Python libraries like prompt_toolkit, Rich, and Textual can drive those just as well as Node libraries.

System Map

flowchart TB
    User["User / Windows Terminal"]:::human
    TUI["LilBot TUI\nprompt_toolkit fullscreen\nRich classic fallback"]:::ui
    Loop["Agent Loop\nreason -> tool -> observe -> continue"]:::core
    Provider["Provider Layer\nDeepSeek V4 / OpenAI-compatible / local rule model"]:::model
    Registry["Tool Registry\nschemas + handlers + result model"]:::tool
    Gate["Permission Gate\nask / accept-all / deny-all"]:::guard
    Sandbox["Workspace Sandbox\npath boundary + shell boundary"]:::guard
    Memory["Memory Core\nproject JSONL + search"]:::memory
    Skills["Skill Deck\nmarkdown prompt capsules"]:::skill
    Subagents["Subagent Bay\ncoder / reviewer / researcher / planner"]:::agent
    MCP["MCP Dock\nexternal tools via JSON-RPC style adapter"]:::mcp
    Files["Workspace Files"]:::data

    User --> TUI --> Loop
    Loop <--> Provider
    Loop --> Registry
    Registry --> Gate --> Sandbox
    Sandbox --> Files
    Registry --> Memory
    Registry --> Skills
    Registry --> Subagents
    Registry --> MCP

    classDef human fill:#111827,stroke:#00e5ff,color:#ffffff,stroke-width:2px
    classDef ui fill:#061A2E,stroke:#36c5f0,color:#ffffff,stroke-width:2px
    classDef core fill:#1E1B4B,stroke:#a78bfa,color:#ffffff,stroke-width:2px
    classDef model fill:#312E81,stroke:#c084fc,color:#ffffff,stroke-width:2px
    classDef tool fill:#042F2E,stroke:#2dd4bf,color:#ffffff,stroke-width:2px
    classDef guard fill:#451A03,stroke:#f59e0b,color:#ffffff,stroke-width:2px
    classDef memory fill:#052E16,stroke:#22c55e,color:#ffffff,stroke-width:2px
    classDef skill fill:#3B0764,stroke:#e879f9,color:#ffffff,stroke-width:2px
    classDef agent fill:#172554,stroke:#60a5fa,color:#ffffff,stroke-width:2px
    classDef mcp fill:#4C0519,stroke:#fb7185,color:#ffffff,stroke-width:2px
    classDef data fill:#0F172A,stroke:#94a3b8,color:#ffffff,stroke-width:2px

Implementation Map

Layer	Main Files	Current State
CLI and runtime wiring	`lilbot/cli.py`, `lilbot/__main__.py`	Builds config, provider, registry, sandbox, memory, skills, subagents, MCP, TUI, and the typed slash-command fast path.
Agent loop	`lilbot/core/agent.py`, `lilbot/core/delegation.py`, `lilbot/core/events.py`, `lilbot/core/prompts.py`	Runs provider turns, executes tools, tracks usage, compacts history, and exposes live Agent tool schemas so the parent model can choose subagents during normal tool-calling.
Provider layer	`lilbot/llm/providers.py`	Supports the local rule model and OpenAI-compatible providers such as DeepSeek.
Tool bus	`lilbot/tools/registry.py`, `lilbot/tools/builtin.py`	Registers schemas and handlers for workspace, git, shell, memory, skills, subagents, tasks, automation, MCP, web, LSP/navigation, worktree merge-back, document/media probes, compatibility aliases, and central plan-approval gating.
Safety boundary	`lilbot/sandbox/workspace.py`, `lilbot/sandbox/permissions.py`	Enforces workspace path boundaries and ask/accept-all/deny-all permission modes.
Memory	`lilbot/memory/store.py`	Persists project memory as JSONL with list/search/delete helpers.
Skills	`lilbot/skills/registry.py`, `lilbot/skills/bundled/`	Loads inline and forked markdown skills, Claude-style frontmatter, aliases, companion files, allowed tools, agent hints, and model hints.
Subagents	`lilbot/subagents/manager.py`, `lilbot/subagents/render.py`	Provides built-in and custom agents, dynamic Agent tool descriptions, five-gate custom allowed-tool validation, runtime tool allowlists, concurrency limits, restart resume, structured final reports, cancellation, progress events, transcript handles, and optional worktree isolation.
MCP adapter	`lilbot/mcp/manager.py`	Reads `.lilbot/mcp.json` and provides phase-1 server/tool/resource integration.
TUI	`lilbot/tui/classic.py`, `lilbot/tui/dashboard.py`, `lilbot/tui/windows_console.py`	Provides Rich classic fallback and a prompt_toolkit dashboard with Trace, expanded Agent inventory, structured Work status, permission popups, and transcript/progress visibility.

Capability Progress

Area	Done	Next Gap
Workspace tools	File reads, directory listing, search, git status/diff/log/show/blame, bounded handles, diagnostics, pure-Python patch fallback, test log artifacts, and framework-aware `project_map`.	Richer test classification and artifact retrieval UX.
Code navigation	`lsp_symbols`, `lsp_definition`, `lsp_workspace_symbols`, `lsp_references`, `lsp_diagnostics`, and `lsp_rename_preview`, with local LSP when available and AST/regex/grep fallback when unavailable.	Persistent warm LSP server sessions, references quality for dynamic languages, and safe rename apply.
Skill ecosystem	Bundled skills, `SKILL.md` folders, metadata parsing, `load_skill`, inline skills, forked skill execution through subagents.	Source precedence, hooks, path-filtered skills, safer shell expansion.
Subagents	Built-in roles, custom agents, five-gate allowed-tool protection, Claude tool-name compatibility, dynamic `Agent` tool prompt parity, delegation matrix probes, concurrency cap, restart resume, progress events, structured dashboard status, durable transcript handles, optional worktree isolation.	Exact model-state resume, per-agent resource quotas, richer cancellation semantics.
Planning lifecycle	`update_plan`, checklists, goals, `EnterPlanMode`, `ExitPlanMode`, persisted approval state, write/execute gating while approval is pending.	Better approval UX and plan review surfaces in the TUI.
Worktree lifecycle	`EnterWorktree` / `ExitWorktree` with explicit unsupported fallback and cleanup/remove; subagents can request managed `worktree` isolation; `WorktreeMergeBack` can preflight or merge a source branch back.	Conflict UI, merge-back artifact summaries, stronger cleanup diagnostics.
Shell and PowerShell	Permission-gated shell execution, background jobs, PowerShell safety metadata, destructive command classification, and hard blocks for unsafe delete/move targets.	Expand analyzer coverage for advanced PowerShell AST cases and richer remediation hints.
External integrations	Web search/fetch, GitHub via `gh`, MCP phase-1 adapter, automation records.	Deeper MCP resource discovery, stronger GitHub workflows, real automation scheduler.
Analysis/media/docs	RLM Python sessions, pandoc/OCR/image probes.	Artifact handles, richer document/spreadsheet/presentation workflows.
Composer commands	Typed slash registry with `local`, `local-ui`, and `prompt` command modes; `/clear`, `/tokens`, `/plan`, `/do`, and `/review` are routed without ad hoc composer hard-coding.	User/project contributed slash commands from skills and MCP prompt discovery.

Agent Loop

sequenceDiagram
    autonumber
    participant U as User
    participant UI as LilBot TUI
    participant A as Agent Loop
    participant L as DeepSeek / Provider
    participant R as Tool Registry
    participant P as Permission Gate
    participant S as Sandbox
    participant W as Workspace

    U->>UI: prompt or slash command
    UI->>A: user message
    A->>L: messages + tool schemas
    L-->>A: text or tool calls
    alt tool call requested
        A->>R: execute tool
        R->>P: ask permission if needed
        P-->>R: allow / deny
        R->>S: workspace-scoped action
        S->>W: read / write / search / shell
        W-->>S: result
        S-->>R: safe output
        R-->>A: tool result
        A->>L: observation
    else direct answer
        A-->>UI: final text
    end
    UI-->>U: cockpit output

Permission Gate

stateDiagram-v2
    [*] --> InspectTool
    InspectTool --> NoApprovalNeeded: read/list/search
    InspectTool --> ApprovalNeeded: write/edit/bash
    ApprovalNeeded --> AllowOnce: y
    ApprovalNeeded --> AlwaysAllow: a
    ApprovalNeeded --> DenyOnce: n
    ApprovalNeeded --> AlwaysDeny: d
    AlwaysAllow --> PersistRule
    AlwaysDeny --> PersistRule
    AllowOnce --> ExecuteTool
    NoApprovalNeeded --> ExecuteTool
    DenyOnce --> StopTool
    PersistRule --> ExecuteOrStop
    ExecuteOrStop --> ExecuteTool: allow
    ExecuteOrStop --> StopTool: deny
    ExecuteTool --> [*]
    StopTool --> [*]

Memory / Skills / Subagents

flowchart LR
    Prompt["User Intent"]:::input

    subgraph MemoryCore["Memory Core"]
        M1["memory_save"]
        M2["memory_search"]
        M3["memory_list"]
        M4["memory_delete"]
    end

    subgraph SkillDeck["Skill Deck"]
        S1["review.md"]
        S2["plan.md"]
        S3["commit.md"]
        S4["summarize.md"]
    end

    subgraph AgentBay["Subagent Bay"]
        A1["coder"]
        A2["reviewer"]
        A3["researcher"]
        A4["planner"]
    end

    Prompt --> MemoryCore
    Prompt --> SkillDeck
    Prompt --> AgentBay

    MemoryCore --> Context["Injected Context"]
    SkillDeck --> Rendered["Rendered Prompt Capsule"]
    AgentBay --> Result["Background or inline result"]

    Context --> Loop["Agent Loop"]
    Rendered --> Loop
    Result --> Loop

    classDef input fill:#020617,stroke:#00e5ff,color:#ffffff
    classDef default fill:#111827,stroke:#64748b,color:#ffffff

Teams / Teammates

Subagents are one-shot: spawn → run → collect. Teams add long-running teammates that stay alive across turns, talk to each other, and share a task board — modeled on multi-agent "swarm" coordination but built on LilBot's own threaded subagent runtime (gates, transcripts, and persistence are all reused).

State lives per-project under .lilbot/teams/<slug>/:

config.json — team + members
tasks.json — shared task board (assignee + blocks / blocked_by)
mailbox/<name>.json — per-agent inbox (file-locked, concurrency-safe)

How a team runs

flowchart TD
    Lead["Lead (you)"]:::input
    Lead -->|team_create| Team["Team: bugfix"]
    Lead -->|"Agent(team_name, name, subagent_type)"| Impl["impl (implementer)\nlong-running thread"]
    Lead -->|"Agent(team_name, name, subagent_type)"| Rev["rev (review)\nlong-running thread"]

    Impl -->|"send_message to=lead + idle"| Box["mailbox/"]
    Rev  -->|"send_message to=lead + idle"| Box
    Box  -->|"drained at next loop turn"| Note["&lt;team-notification&gt; injected"]
    Note --> Lead
    Lead -->|"send_message wakes a teammate"| Impl

    classDef input fill:#020617,stroke:#00e5ff,color:#ffffff
    classDef default fill:#111827,stroke:#64748b,color:#ffffff

A teammate runs one full agent turn (same tool loop + gates as a subagent), reports to lead, then goes idle and polls its mailbox. The lead never blocks: teammate messages are drained at the top of each agent-loop turn and injected as <team-notification> coordination signals.

Tools (the model uses these autonomously)

Tool	Purpose
`team_create` / `team_delete` / `team_list`	manage teams
`Agent(team_name=, name=, subagent_type=)`	spawn a long-running teammate (vs. one-shot when `team_name` is omitted)
`send_message`	message a teammate by name, `lead`, or `*` (broadcast); wakes idle teammates
`team_task_create` / `team_task_list` / `team_task_get` / `team_task_update`	shared task board with dependencies

Every teammate automatically gets the coordination tools (send_message, team_task_*) on top of its role's tools, scoped to its identity via the tool context — so it knows who it is and which team it belongs to.

Slash commands

/team list                 # teams, members, live status, last activity
/team new NAME             # create a team locally
/team msg NAME TEXT        # send a message (wakes the teammate)
/team rm NAME              # delete a team

The Flight Deck's Work pane (F5) shows a live Teammates panel: name [status] activity tools=N tok=….

Worktree isolation (opt-in)

Pass isolation: "worktree" when spawning a teammate to give it its own git worktree under .lilbot/worktrees/, so concurrent teammates don't edit the same files. If the workspace is not a git repo, it degrades gracefully to the shared workspace. Worktrees are removed on team_delete.

An isolated teammate auto-accepts writes (accept-all) because its PathSandbox confines every file operation to its own worktree — changes can't reach the main tree until the lead reviews/merges them. A non-isolated teammate instead inherits the lead's permission mode, so in the default ask mode it cannot write unless you approve (or run with accept-all).

Example

"Build a team: have an implementer fix the null check in auth.py, then a reviewer verify the change."

The lead creates a team, spawns impl and (after impl reports back) rev, collects both results via auto-injected notifications, and answers you — without ever polling.

Learn more

Deep dive (source-grounded, interview/teaching grade): docs/TEAMS_EXPLAINED.md
Live demo: python experiment/teams_demo.py (stub, no network) or --real (DeepSeek)

MCP Dock

flowchart TB
    Config[".lilbot/mcp.json"]
    Manager["MCPManager"]
    ServerA["server: filesystem"]
    ServerB["server: browser"]
    ServerC["server: custom lab tool"]
    ToolCall["mcp_call(server, tool, args)"]
    Result["tool result → Agent Loop"]

    Config --> Manager
    Manager --> ServerA
    Manager --> ServerB
    Manager --> ServerC
    ToolCall --> Manager
    Manager --> Result

    class Config file
    class Manager core
    class ServerA,ServerB,ServerC server
    class ToolCall toolcall
    class Result result

    classDef file fill:#172554,stroke:#60a5fa,color:#fff
    classDef core fill:#1e1b4b,stroke:#a78bfa,color:#fff
    classDef server fill:#042f2e,stroke:#2dd4bf,color:#fff
    classDef toolcall fill:#451a03,stroke:#f59e0b,color:#fff
    classDef result fill:#052e16,stroke:#22c55e,color:#fff

Windows Quick Start

Python 3.10 is OK. The project is tested with Python 3.10.20 on Windows.

cd F:\Experiment_laborotory\collection-lilbot-source-code-main\LilBot-agent-code
conda activate LilBot
pip install -r requirements.txt
pip check
python -m lilbot

Use the legacy printed interface only when debugging:

python -m lilbot --classic

If box lines or Chinese text look wrong, force UTF-8 for the current PowerShell tab:

chcp 65001
$OutputEncoding = [System.Text.UTF8Encoding]::new()
[Console]::InputEncoding = [System.Text.UTF8Encoding]::new()
[Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()
python -m lilbot

Recommended terminal:

Windows Terminal + Cascadia Mono / JetBrains Mono

DeepSeek

Do not commit API keys. Set the key only in your shell or in Windows user environment variables.

For local development, LilBot also auto-loads .env from the project root. The file is ignored by Git.

DEEPSEEK_API_KEY=sk-...
LILBOT_PROVIDER=deepseek
LILBOT_MODEL=deepseek-v4-flash
LILBOT_BASE_URL=https://api.deepseek.com

$env:DEEPSEEK_API_KEY="sk-..."
python -m lilbot --provider deepseek --model deepseek-v4-flash

One-shot real API smoke test:

$env:DEEPSEEK_API_KEY="sk-..."
python -m lilbot --provider deepseek --model deepseek-v4-flash --print "Reply exactly: LilBot OK"

Endpoint:

https://api.deepseek.com

Command Deck

Slash commands are intercepted in the Composer before the normal Agent Loop. They are registered with an explicit execution type:

local: deterministic local read/config action, no model call.
local-ui: deterministic UI/session action, no model call unless the command explicitly includes a task to send onward.
prompt: expands into a prompt and intentionally enters the Agent Loop.

Command	Type	Purpose
`/help [command]`	`local`	Show all commands or one command's metadata
`/clear`	`local-ui`	Clear Trace and reset the local conversation
`/copy`	`local-ui`	Copy the Trace panel to clipboard
`/theme`	`local-ui`	Show theme preview
`/model [flash\|pro]`	`local`	View or switch DeepSeek model
`/tools`	`local`	List registered tools
`/skills`	`local`	List skills
`/skill NAME ARGS`	`prompt`	Render a skill and run it through Agent
`/memory list/search/save/delete`	`local`	Manage memory
`/agents`	`local`	List subagent types and tasks
`/agent TYPE PROMPT`	`local`	Run a focused subagent task
`/mcp`	`local`	List MCP-style server config
`/permissions ask/accept-all/deny-all`	`local`	Switch permission mode
`/tokens`	`local`	Show local token/context usage
`/plan [task]`	`local-ui`	Enter Plan Mode; with `task`, send a planning prompt to Agent
`/do [approved\|rejected]`	`local-ui`	Exit Plan Mode and persist approval state
`/review [focus]`	`prompt`	Ask Agent to review the current git diff
`/display`	`local`	Show terminal and font diagnostics
`/exit`	`local-ui`	Quit

Slash Command Theory

Slash commands exist for the "fast lane" cases where calling the LLM would be wasteful or impossible. /clear is a UI/session reset, /tokens is a local usage read, and /plan without a task is a deterministic state change into EnterPlanMode. These commands return in milliseconds and do not consume model tokens.

Prompt commands are the explicit exception. /review and /skill are still slash commands, but their job is to turn a short command into a structured Agent request. /plan design auth module first enters Plan Mode locally, then sends the task text to the Agent for planning. The dashboard uses the same registry metadata to decide whether a slash command should appear as a popup or as a normal Trace/Agent turn.

Dashboard interaction notes:

Trace is the main conversation and tool-execution stream.
Select text in Trace to copy, or use /copy / F2. On Windows this writes CF_UNICODETEXT, so Chinese Trace content should paste without mojibake.
Right-click paste and Ctrl+V are supported in the Composer.
The top bar shows approximate context usage, for example ctx 03%.
During model work, the footer switches to a wave animation.
The Work panel shows runtime status, active tool state, recent subagents, last progress event, transcript handles, worktree branch, and worktree state.

Manual Subagent Concurrency Test

Use this quick local probe to verify queueing without calling a real model:

$env:LILBOT_SUBAGENT_MAX_CONCURRENT='3'
@'
from pathlib import Path
import threading, time
from lilbot.core.events import ProviderTurn
from lilbot.subagents import SubAgentManager

release = threading.Event()
def provider(messages, tools):
    release.wait(20)
    return ProviderTurn(content="done")

manager = SubAgentManager(provider, Path(".lilbot/agents"), max_concurrent=3)
tasks = [manager.open("writer", f"manual concurrency {i}", background=True) for i in range(6)]
time.sleep(0.3)
print(manager.runtime_status())
release.set()
for task in tasks:
    while not task.terminal:
        time.sleep(0.05)
print(manager.runtime_status())
'@ | python -

Expected first print: running is 3, queued is 3. Expected second print: all six tasks are terminal.

Manual Delegation Matrix Probe

Use this local probe to inspect routing without calling a model or the network:

python experiment\delegation_matrix.py
python experiment\delegation_matrix.py "谁是2025年NBA冠军 那谁是那一年的FMVP呢 哦对还有NBA是哪一个国家的比赛呀"

Expected shape for the second command: a deterministic plan with three researcher probes named auto_question_01, auto_question_02, and auto_question_03. A prompt such as 请创作一篇古风的1000字散文... should show deterministic_plan: null plus semantic_planner_if_no_plan: true, meaning the host did not hard-code that genre but the model-side delegation planner should be consulted.

Lifecycle Theory

LilBot treats long-running agent work as a small lifecycle system rather than a single function call. A subagent task moves through queued, running, and a terminal state such as completed, failed, or cancelled. Every meaningful transition is also appended to a JSONL transcript. This makes the dashboard and tools read the same source of truth:

SubAgentTask state
  -> persisted in .lilbot/subagent-tasks.json
  -> mirrored by transcript events in .lilbot/subagent-transcripts/*.jsonl
  -> projected into Work panel progress rows

Restart resume is deliberately conservative. If LilBot restarts while a task is non-terminal, the task is recovered as queued, marked with recovered=True, and scheduled again after the runtime is configured. This resumes from the assignment prompt and transcript evidence, not from hidden model token state. That keeps the behavior honest while still avoiding the old failure-only recovery path.

Transcript cursors are line-based. agent_transcript returns events after a cursor and a new cursor, so a dashboard or future GUI can poll progress without re-reading the whole transcript.

Question-burst delegation is a separate planner rule. When one user message contains three or more unrelated questions, LilBot opens focused researcher subagents up to LILBOT_SUBAGENT_MAX_CONCURRENT and the current step budget. If there are more questions than available subagent slots, LilBot groups the extra questions into compact ordered question groups instead of dropping them. Normal broad code/research/planning tasks keep a more conservative budget so they still leave room for parent synthesis.

Delegation Theory

/Claude-style delegation does not try to hard-code every possible task category in the host runtime. The product pattern is:

agent descriptions + tool prompt
  -> model decides whether Agent is useful
  -> runtime validates allowed/disallowed tools, recursion, permissions,
     lifecycle, transcripts, and isolation

Clean-room notes from the local /LilBot source audit:

AgentTool/prompt.ts dynamically renders available agent types, their whenToUse descriptions, and tool limits into the tool prompt, so the parent model can decide when to spawn one or several agents.
loadAgentsDir.ts loads project/user custom agents from agent files with descriptions, prompts, tool allow/disallow lists, models, and permission mode.
agentToolUtils.ts and tool constants hard-code safety boundaries: recursive agent tools are disallowed, async/background tools are constrained, and custom agent tool lists are resolved before runtime execution.
forkSubagent.ts is a lifecycle/runtime path, not a keyword classifier: it inherits context through a directive and still prevents recursive forking.

LilBot now follows that split more closely. ToolRegistry.schemas() asks the subagent manager for live render context, then lilbot/subagents/render.py injects lines into agent_open, Agent, and Task:

- researcher: Use for web research ... (Tools: web_search, fetch_url, ...)
- explore: Use for codebase mapping ... (Tools: project_map, read_file, ...)

The parent model sees these descriptions during normal tool-calling, including active subagent status and guidance to continue an existing agent before launching a duplicate. The runtime remains the source of truth: unknown agent types, recursive subagent tools, write/execute tools, plan-control tools, custom allowlists, transcripts, worktree isolation, and lifecycle state are still enforced by the subagent manager and registry gates.

delegation.py remains useful as a deterministic probe and planning reference. It is covered by a regression matrix for question bursts, no-question-mark Chinese clauses, code exploration, mixed public facts, and writing fallback. That lets us test routing theory without forcing the host runtime to hard-code every future genre or task category.

LSP Theory

LilBot's code navigation tools follow a "semantic first, evidence fallback" rule:

local language server available
  -> use LSP request
  -> normalize result into path/line/character records
otherwise
  -> Python AST for Python symbols and syntax diagnostics
  -> regex symbol extraction for common languages
  -> grep-style reference evidence

The current LSP surface includes:

Tool	Purpose	Fallback
`lsp_symbols`	Document/project symbols	AST/regex scan
`lsp_definition`	Symbol definition	AST/regex definitions, then grep evidence
`lsp_workspace_symbols`	Workspace symbol search	AST/regex project scan
`lsp_references`	Reference lookup	grep-style references
`lsp_diagnostics`	Diagnostics	Python syntax diagnostics
`lsp_rename_preview`	Rename edit preview	reference candidates only

Rename preview does not write files. It returns candidate edits so a later phase can add permission-gated apply semantics with conflict checks.

Worktree Theory

Worktree isolation gives a subagent a separate checkout under .lilbot/worktrees/ so it can inspect or modify files without sharing the main workspace directory. Branch naming matters because merge-back needs a durable source branch, not just a detached checkout.

The current flow is:

EnterWorktree or subagent isolation=worktree
  -> probe git worktree support
  -> create a named branch/worktree when supported
  -> run work in the worktree sandbox
  -> optionally cleanup/remove the worktree
  -> WorktreeMergeBack dry-run shows source -> target diff
  -> WorktreeMergeBack dry_run=false merges after permission and clean-tree checks

Unsupported systems return structured unsupported results rather than pretending worktree isolation happened.

Update Log

2026-06-17

Added typed Composer slash commands with local, local-ui, and prompt modes, so deterministic UI/session commands bypass the Agent Loop.
Added /clear, /tokens, /plan, /do, and /review.
- /clear resets the local conversation and clears Trace.
- /tokens reads local usage/context estimates without a model call.
- /plan enters persisted Plan Mode locally; /plan <task> also sends the task to Agent for planning.
- /do exits Plan Mode and records approval state.
- /review is a prompt command that asks Agent to inspect the current diff.
Updated dashboard composer routing to use slash command metadata: local commands render in the command popup, while prompt commands render as normal Trace/Agent turns.
Added regression tests proving local slash commands do not call agent.run_turn() and prompt commands call run_prompt() intentionally.

2026-06-16

Added subagent restart resume: persisted non-terminal tasks recover as queued and are automatically scheduled after tool/context configuration.
Added transcript cursor reads through agent_transcript and progress metadata in subagent projections.
Added worktree branch naming for managed subagent worktrees and EnterWorktree branch/ref options.
Added WorktreeMergeBack / worktree_merge_back for merge-back preflight and permission-gated execution.
Added LSP phase 2 tools: lsp_workspace_symbols, lsp_references, lsp_diagnostics, and lsp_rename_preview.
Updated the Work panel to show subagent last event, event count, resume count, transcript handle, worktree branch, and worktree path.
Added dynamic Agent tool prompt parity: agent_open, Agent, and Task now render live agent types, when-to-use guidance, full tool allowlists, active subagent status, and continue-existing-agent guidance into the tool schemas seen by the parent model.
Kept deterministic delegation routing as a testable probe/reference instead of host-runtime keyword control: the matrix covers no-question-mark Chinese question bursts, code exploration, research, mixed fact scopes, and semantic writing fallback.
Expanded the left Agent dashboard card so the information page shows more of the tool and skill inventory instead of leaving a large blank area.
Re-audited the local /LilBot AgentTool source and documented the split between dynamic model-side agent selection and host-side safety, permission, lifecycle, transcript, and isolation enforcement.
Added tests/test_delegation_matrix.py plus experiment/delegation_matrix.py to probe the full delegation route for simple prompts, no-question-mark Chinese question bursts, code exploration, research, mixed fact scopes, and semantic-planner writing fallback.
Fixed Windows /copy / F2 Chinese clipboard mojibake by writing native CF_UNICODETEXT instead of piping text through clip.exe.

2026-06-15

Enforced plan approval for write/execute tools.
Added PowerShell safety analysis before shell permission prompts.
Added custom subagent five-gate allowed-tool protection.
Added forked skill execution through subagents.
Added durable subagent transcripts, concurrency limits, restart recovery, optional worktree isolation, and structured dashboard subagent status.
Added LSP phase 1 symbols/definition tools and Batch 1 workspace cleanup: pure-Python patch fallback, test log artifacts, and framework-aware project_map.

Next Development Focus

Recommended next batch:

Agent listing attachment and continuation UX.
- Mirror the optional agent-list attachment path so changing custom agents does not always mutate the tool schema.
- Add a stronger "continue existing agent" workflow around agent_eval follow-up messages and dashboard transcript handles.
- Add tests that assert the parent can reuse an existing matching subagent instead of opening a duplicate.
Persistent LSP sessions.
- Keep language servers warm across calls instead of one short request per lookup.
- Add server lifecycle controls, cache invalidation, and richer diagnostics collection.
Worktree merge UX.
- Add merge-back artifact summaries.
- Add conflict reporting and cleanup diagnostics.
- Add safer branch naming policy for repeated task names.
Product-level hooks and lifecycle.
- Add pre-tool/post-tool hooks.
- Surface plan approval and PowerShell risk in the dashboard more clearly.
- Add richer task/test artifact retrieval from handles.
Subagent resource controls.
- Add per-agent time/tool/output quotas.
- Add clearer cancellation semantics for queued vs running tasks.
- Explore exact resume from structured conversation checkpoints.

Roadmap

flowchart LR
    V01["v0.1\nfullscreen dashboard\nDeepSeek link\ncore tools"] --> V02["v0.2\nlive work panel\nstreaming transcript\nbetter logo motion"]
    V02 --> V03["v0.3\nstronger sandbox\npatch editor\npermission memory"]
    V03 --> V04["v0.4\nreal MCP sessions\nsubagent worktrees\nskill marketplace"]
    V04 --> V05["v1.0\nLilBot mission control"]

Repository Upload

Remote:

git remote -v

Push:

git push -u origin main

If GitHub asks for login, use Git Credential Manager or GitHub CLI:

gh auth login
gh auth setup-git
git push -u origin main

Name		Name	Last commit message	Last commit date
Latest commit History 55 Commits
docs		docs
experiment		experiment
lilbot		lilbot
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
AUTHORS		AUTHORS
CHANGELOG.md		CHANGELOG.md
README.md		README.md
SPEC_TEAMS_AND_TEAMMATES.md		SPEC_TEAMS_AND_TEAMMATES.md
Untitled		Untitled
hello.txt		hello.txt
hello002.txt		hello002.txt
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

LILBOT AGENT

Clean-room Local Coding Agent / Windows-first / DeepSeek-ready

Current Status

Whole Architecture

Why python -m lilbot

Flight Deck

System Map

Implementation Map

Capability Progress

Agent Loop

Permission Gate

Memory / Skills / Subagents

Teams / Teammates

How a team runs

Tools (the model uses these autonomously)

Slash commands

Worktree isolation (opt-in)

Example

Learn more

MCP Dock

Windows Quick Start

DeepSeek

Command Deck

Slash Command Theory

Manual Subagent Concurrency Test

Manual Delegation Matrix Probe

Lifecycle Theory

Delegation Theory

LSP Theory

Worktree Theory

Update Log

2026-06-17

2026-06-16

2026-06-15

Next Development Focus

Roadmap

Repository Upload

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Why `python -m lilbot`

Packages