GitHub - nullpointexception-i/agent-sphere

This project is an AI Agent orchestration platform. An LLM-driven decision engine, combined with capabilities (built-in tools, MCP protocol, CLI execution, browser automation, etc.), implements a basic closed loop of Perception → Planning → Execution → Feedback.

Supported model providers are configurable: OpenAI, DeepSeek, QuickRouter (an API relay service), BigModel (Zhipu AI), and LiteLLM.

1. Quick Start for Development

See: QUICK_START.md

2. Architecture

2.1 Overall Structure

2.2 Core Components

2.2.1 SessionRunner (ReAct Engine)

Manages the complete execution lifecycle of an AI session, implementing the Plan → Act → Observe → Learn loop:

Alignment with the ReAct pattern:

2.2.2 Capability Layer

Capability Type	Implementation	Description	Examples
MCP (Model Context Protocol)	MCP Server client	Standard protocol, connects to any MCP Server	Jira, GitHub, Slack, databases
Builtin (built-in tools)	SPI: `CapabilityBuiltinToolSpi`	Java SPI extension	WebFetch, WebRead, Chrome, Todowrite, DocWrite
Chrome Browser	Chrome Extension bridge	DOM operations + real-time visual feedback	Navigate, click, fill forms, screenshot
CLI (command line)	`ProcessBuilder` execution	Local or remote shell	Git operations, build/deploy, system administration
Skill (composite skills)	Multi-step task orchestration	LLM-driven task decomposition	Cross-system workflows

2.2.3 Chrome Extension (Browser Bridge)

3. Algorithm — Core Algorithms

3.1 ReAct Execution Loop

The core loop of AgentSphere follows the ReAct (Reasoning + Acting) pattern, combining the LLM's reasoning ability with tool execution ability:

Message structure:

[
  {role: "system",    content: "You are a browser assistant..."},
  {role: "user",      content: "Help me check the weather in Guangzhou"},
  {role: "assistant", tool_calls: [{id: "call_1", name: "navigate", args: "..."}]},
  {role: "tool",      tool_call_id: "call_1", content: '{"tabId": 42, "url": "..."}'},
  {role: "assistant", content: "The weather in Guangzhou tomorrow is..."},
  {role: "user",      content: "What should I prepare for going out tomorrow"},
  ...
]
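
The way such a message list grows during a run can be sketched in a few lines of Java. This is a minimal illustration, not the project's `SessionRunner`: the `Message`, `ToolCall`, and `Llm` types and the `run` method are hypothetical names introduced here, and the model is replaced by a scripted stand-in. The loop limit mirrors the `runner.max-loop-count` setting from the configuration reference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ReactLoopSketch {
    // Hypothetical message types; the project's real classes are not shown in this README.
    record ToolCall(String id, String name, String args) {}

    record Message(String role, String content, List<ToolCall> toolCalls, String toolCallId) {
        static Message user(String text) { return new Message("user", text, List.of(), null); }
        static Message assistant(String text, List<ToolCall> calls) {
            return new Message("assistant", text, calls, null);
        }
        static Message tool(String callId, String result) {
            return new Message("tool", result, List.of(), callId);
        }
    }

    /** Stand-in for an LLM client: maps the conversation so far to the next assistant message. */
    interface Llm { Message next(List<Message> messages); }

    static String run(Llm llm, Map<String, Function<String, String>> tools,
                      List<Message> messages, int maxLoopCount) {
        for (int turn = 0; turn < maxLoopCount; turn++) {
            Message reply = llm.next(messages);               // reason
            messages.add(reply);
            if (reply.toolCalls().isEmpty()) {
                return reply.content();                       // a plain answer ends the run
            }
            for (ToolCall call : reply.toolCalls()) {         // act
                Function<String, String> tool = tools.get(call.name());
                String result = tool == null ? "{\"error\":\"unknown tool\"}" : tool.apply(call.args());
                messages.add(Message.tool(call.id(), result)); // observe
            }
        }
        throw new IllegalStateException("max loop count reached: " + maxLoopCount);
    }

    public static void main(String[] args) {
        // Scripted model: request one tool call, then answer once a tool result is present.
        Llm llm = msgs -> msgs.get(msgs.size() - 1).role().equals("tool")
                ? Message.assistant("It will be sunny.", List.of())
                : Message.assistant(null, List.of(new ToolCall("call_1", "navigate", "{}")));
        Map<String, Function<String, String>> tools = Map.of("navigate", a -> "{\"tabId\": 42}");
        List<Message> messages = new ArrayList<>(List.of(Message.user("Help me check the weather in Guangzhou")));
        System.out.println(run(llm, tools, messages, 128));
    }
}
```

Each tool result is appended with the `tool_call_id` of the call that produced it, which is what lets the next LLM turn match observations to its own requests.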

Multi-turn tool call example:

3.2 Multi-level Memory System

AgentSphere implements a multi-level memory system covering the full chain from persistence to runtime caching:

Memory Level Details

Level	Storage	Lifecycle	Capacity	Purpose
L1: KernelContext	ConcurrentHashMap	During run (TTL 30min)	1 per session	Tool list, model route
L2: Messages	ArrayList	During run	Dozens of turns	LLM input/output
L3: LLM Interaction	PostgreSQL	Permanent	Configurable	Debugging & audit
L4: Tool Call	PostgreSQL	Permanent	Unlimited	Replay, observation
L5: Compact Record	PostgreSQL	Permanent	Cumulative	Context compression
L6: Session	PostgreSQL	Permanent	1 per session	Metadata
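
As an illustration of the L1 row, a per-session cache with a time-to-live on top of `ConcurrentHashMap` might look like the sketch below. The class name, the injectable clock, and the choice not to refresh the TTL on reads are assumptions made for this example; the README does not show how KernelContext handles expiry.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

class KernelContextCacheSketch<V> {
    private record Entry<T>(T value, long expiresAtMillis) {}

    private final ConcurrentHashMap<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be exercised without sleeping

    KernelContextCacheSketch(Duration ttl, LongSupplier clock) {
        this.ttlMillis = ttl.toMillis();
        this.clock = clock;
    }

    /** Stores one value per session; a later put for the same session replaces it. */
    void put(String sessionId, V value) {
        entries.put(sessionId, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    /** Returns the cached value, or empty if it is missing or its TTL has elapsed. */
    Optional<V> get(String sessionId) {
        Entry<V> e = entries.get(sessionId);
        if (e == null) return Optional.empty();
        if (clock.getAsLong() >= e.expiresAtMillis()) {
            entries.remove(sessionId, e); // remove only the entry that was seen to be expired
            return Optional.empty();
        }
        return Optional.of(e.value());
    }

    public static void main(String[] args) {
        KernelContextCacheSketch<String> cache =
                new KernelContextCacheSketch<>(Duration.ofMinutes(30), System::currentTimeMillis);
        cache.put("session-1", "tool list + model route");
        System.out.println(cache.get("session-1").isPresent());
    }
}
```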

3.2.1 Context Assembly

HistoryLoader is responsible for loading historical messages from persistent storage and assembling them into the LLM context:

Tool result compression flow:

3.2.2 Context Compaction

Triggered when the estimated tokens of messages exceed maxInputTokens × budget-ratio:

Full compression chain flow:

3.2.3 Tool Call Record State Machine

Each record contains:

callId — Tool call ID generated by the LLM (e.g., call_abc123)
argumentsJson — Original input arguments
compressedArguments — Compressed version of input JSON (write-time compression)
artifact — Original return result
compressedArtifact — Compressed version of result JSON (write-time compression)
Used by HistoryLoader for replay, observation panel display, and auditing
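
The fields listed above, together with a status, can be modelled as an immutable record with guarded transitions. The sketch below is an assumption-heavy illustration: `PENDING` and `SUCCEEDED` appear elsewhere in this README, while `FAILED`, the transition table, and the `pending`/`complete` helpers are hypothetical and the real state machine may contain more states.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

class ToolCallRecordSketch {
    enum Status { PENDING, SUCCEEDED, FAILED } // FAILED is an assumed name

    // Assumed transitions: a call starts PENDING and ends in exactly one terminal state.
    private static final Map<Status, Set<Status>> NEXT = Map.of(
            Status.PENDING, EnumSet.of(Status.SUCCEEDED, Status.FAILED),
            Status.SUCCEEDED, EnumSet.noneOf(Status.class),
            Status.FAILED, EnumSet.noneOf(Status.class));

    record ToolCallRecord(String callId, String argumentsJson, String compressedArguments,
                          String artifact, String compressedArtifact, Status status) {

        /** Creates the record when the LLM requests the call; no result exists yet. */
        static ToolCallRecord pending(String callId, String argumentsJson, String compressedArguments) {
            return new ToolCallRecord(callId, argumentsJson, compressedArguments, null, null, Status.PENDING);
        }

        /** Returns a copy in a terminal state, rejecting transitions the table does not allow. */
        ToolCallRecord complete(Status next, String artifact, String compressedArtifact) {
            if (!NEXT.get(status).contains(next)) {
                throw new IllegalStateException("illegal transition " + status + " -> " + next);
            }
            return new ToolCallRecord(callId, argumentsJson, compressedArguments,
                    artifact, compressedArtifact, next);
        }
    }

    public static void main(String[] args) {
        ToolCallRecord record = ToolCallRecord.pending("call_abc123", "{\"url\":\"...\"}", "{\"url\":\"...\"}")
                .complete(Status.SUCCEEDED, "{\"tabId\":42}", "{\"tabId\":42}");
        System.out.println(record.status());
    }
}
```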

3.2.4 Tool Result Compression Strategy

jsonCompress(node, depth, maxValueChars) {
  if (depth > 5) return "[deep nested]";

  if (node instanceof Map) {
    // Recursively compress each value
    return node.mapValues(v -> jsonCompress(v, depth+1, maxValueChars))
  }

  if (node instanceof List) {
    // Small array: compress each element
    if (node.size() <= 5) return node.map(v -> jsonCompress(v, depth+1, maxValueChars))
    // Large array: keep first 3 + total count (e.g. _count: 13 for a 13-element array)
    return { _count: node.size(), _showing: 3, items: [...] }
  }

  if (node instanceof String) {
    if (node.length() <= maxValueChars) return node
    // Long string: first 100 + ellipsis + last 50
    return node[0..100] + "...[+ N chars]...\n" + node[-50..-1]
  }

  return node // Number, Boolean pass-through
}
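
For readers who want to experiment with this strategy, here is a runnable Java sketch over plain `Map`/`List`/`String` values (for example, what a JSON library produces when binding to `Object`). It follows the pseudocode above with three stated assumptions: the kept items of a large array are themselves compressed, `N` is taken to be the number of omitted characters, and strings of 150 characters or fewer are returned unchanged even when `maxValueChars` is smaller, so that the head and tail never overlap.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JsonCompressSketch {
    static final int MAX_DEPTH = 5, SMALL_LIST = 5, KEEP_ITEMS = 3, HEAD = 100, TAIL = 50;

    static Object compress(Object node, int depth, int maxValueChars) {
        if (depth > MAX_DEPTH) return "[deep nested]";

        if (node instanceof Map<?, ?> map) {
            // Recursively compress each value, preserving key order.
            Map<String, Object> out = new LinkedHashMap<>();
            map.forEach((k, v) -> out.put(String.valueOf(k), compress(v, depth + 1, maxValueChars)));
            return out;
        }

        if (node instanceof List<?> list) {
            int keep = list.size() <= SMALL_LIST ? list.size() : KEEP_ITEMS;
            List<Object> items = new ArrayList<>();
            for (Object v : list.subList(0, keep)) items.add(compress(v, depth + 1, maxValueChars));
            if (keep == list.size()) return items;
            // Large array: keep the first items plus the total count.
            Map<String, Object> summary = new LinkedHashMap<>();
            summary.put("_count", list.size());
            summary.put("_showing", keep);
            summary.put("items", items);
            return summary;
        }

        if (node instanceof String text) {
            if (text.length() <= maxValueChars || text.length() <= HEAD + TAIL) return text;
            int omitted = text.length() - HEAD - TAIL;
            return text.substring(0, HEAD) + "...[+ " + omitted + " chars]...\n"
                    + text.substring(text.length() - TAIL);
        }

        return node; // numbers, booleans and null pass through
    }

    public static void main(String[] args) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("tabId", 42);
        result.put("links", List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13));
        System.out.println(compress(result, 0, 120));
    }
}
```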

3.3 Model Routing and Fallback

AgentSphere provides a multi-level model fault-tolerance mechanism to ensure high availability of LLM calls.

Routing Configuration

Fallback Execution Flow

Note: The compression budget calculation is based on the actual route's maxInputTokens, detected within the execute callback. See the formula below for details.
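
The general idea of ordered fallback with an execute callback can be sketched as follows. This is a simplified illustration under assumptions, not the project's routing code: the `Route` record, `callWithFallback`, and the model names are hypothetical, and real routing would likely distinguish retryable from non-retryable errors. Passing the route into the callback is what allows per-route values such as `maxInputTokens` to be read at call time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class ModelFallbackSketch {
    // Hypothetical route description; only maxInputTokens is named in this README.
    record Route(String model, long maxInputTokens) {}

    /** Tries each route in order and returns the first successful result. */
    static <T> T callWithFallback(List<Route> routes, Function<Route, T> execute) {
        List<RuntimeException> failures = new ArrayList<>();
        for (Route route : routes) {
            try {
                return execute.apply(route); // the callback sees the route actually in use
            } catch (RuntimeException e) {
                failures.add(e);             // remember the failure, then try the next route
            }
        }
        IllegalStateException all = new IllegalStateException("all " + routes.size() + " routes failed");
        failures.forEach(all::addSuppressed);
        throw all;
    }

    public static void main(String[] args) {
        List<Route> routes = List.of(new Route("primary-model", 128_000), new Route("backup-model", 1_000_000));
        String reply = callWithFallback(routes, route -> {
            if (route.model().equals("primary-model")) throw new IllegalStateException("simulated outage");
            return "answered by " + route.model();
        });
        System.out.println(reply);
    }
}
```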

Compression Budget Calculation

budget = maxInputTokens × budget-ratio (default 0.7)

Example:
  Route: GLM-4.1V-Thinking-Flash, maxInputTokens=1_000_000
  → budget = 1_000_000 × 0.7 = 700_000 tokens
  → When messages exceed 700K tokens → trigger compaction

Dynamic adjustment:
  budget-ratio: 0.5  → Triggers earlier (preserves more context quality)
  budget-ratio: 0.8  → Triggers later (saves compression overhead)
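
The arithmetic above is small enough to check in code. The sketch below is illustrative only: the class and method names are invented here, and how the project estimates the token count of its messages is not shown in this README, so the estimate is simply passed in.

```java
class CompactionBudgetSketch {
    /** budget = maxInputTokens x budget-ratio, rounded to the nearest whole token. */
    static long budget(long maxInputTokens, double budgetRatio) {
        if (budgetRatio <= 0 || budgetRatio > 1) {
            throw new IllegalArgumentException("budget-ratio must be in (0, 1]: " + budgetRatio);
        }
        return Math.round(maxInputTokens * budgetRatio);
    }

    /** Compaction triggers only when the estimate strictly exceeds the budget. */
    static boolean shouldCompact(long estimatedTokens, long maxInputTokens, double budgetRatio) {
        return estimatedTokens > budget(maxInputTokens, budgetRatio);
    }

    public static void main(String[] args) {
        long maxInputTokens = 1_000_000; // the route from the example above
        System.out.println(budget(maxInputTokens, 0.7));
        System.out.println(shouldCompact(650_000, maxInputTokens, 0.7));
        System.out.println(shouldCompact(650_000, maxInputTokens, 0.5));
    }
}
```

With the same 650,000-token estimate, a ratio of 0.7 does not trigger compaction while a ratio of 0.5 does, which is the "triggers earlier" behaviour described above.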

Timeout Parameters

Parameter	Default	Description
`llm.connect-timeout`	30s	Timeout for connecting to LLM API
`llm.read-timeout`	60s	Timeout for reading response
`llm.stream-read-timeout`	120s	Stream read timeout
`llm.stream-timeout`	120s	Total timeout for streaming calls
`runner.turn-timeout`	180s	Total timeout for a single LLM turn
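
To make the first two rows concrete, here is how comparable timeouts can be set with the JDK's `java.net.http` client. This is a sketch under assumptions: the README does not say which HTTP client the project uses, the endpoint URL is a placeholder, and `HttpRequest.timeout` bounds the wait for the response rather than each individual read, so it only approximates `llm.read-timeout`.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

class LlmTimeoutSketch {
    static final Duration CONNECT_TIMEOUT = Duration.ofSeconds(30); // llm.connect-timeout default
    static final Duration READ_TIMEOUT = Duration.ofSeconds(60);    // llm.read-timeout default

    /** Client-level setting: how long to wait while establishing the connection. */
    static HttpClient client() {
        return HttpClient.newBuilder().connectTimeout(CONNECT_TIMEOUT).build();
    }

    /** Request-level setting: how long to wait for the response to this request. */
    static HttpRequest chatRequest(URI endpoint, String jsonBody) {
        return HttpRequest.newBuilder(endpoint)
                .timeout(READ_TIMEOUT)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder endpoint; nothing is sent in this sketch.
        HttpRequest request = chatRequest(URI.create("https://llm.example.invalid/v1/chat/completions"), "{}");
        System.out.println(client().connectTimeout() + " " + request.timeout());
    }
}
```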

3.4 Browser Operation Flow

3.5 Multi-tab Management

3.6 Timeout and Cancellation Chain

3.7 Session Following

4. Administration — Operations and Management

4.1 Configuration Reference

Config Item	Default	Description
`session.idle-timeout`	30m	Session idle timeout
`session.max-concurrent-runs`	10	Maximum concurrent executions
`runner.max-loop-count`	128	Maximum loop count per run
`runner.turn-timeout`	180s	Single LLM turn timeout
`runner.compaction.budget-ratio`	0.7	Compaction trigger threshold (ratio of maxInputTokens)
`llm.connect-timeout`	30s	LLM API connection timeout
`llm.read-timeout`	60s	LLM API read timeout
`llm.stream-timeout`	120s	Total streaming call timeout
`tool.max-parallel`	3	Maximum parallel tool executions
`tool.execution-timeout`	60s	Single batch tool execution timeout
`tool.submit-timeout`	30s	Tool submission timeout
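
How `tool.max-parallel` and `tool.execution-timeout` might interact for one batch of tool calls is sketched below. The class and method names are hypothetical, and a fixed platform-thread pool stands in for the project's virtual-thread `FiberSet`. One JDK detail is relevant when reading the sketch: the `Future` objects returned by an `ExecutorService` interrupt the running thread on `cancel(true)`, whereas plain `CompletableFuture.cancel(true)` ignores the flag according to its Javadoc.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class ToolBatchSketch {
    /** Runs one batch of tool calls; an entry is null if that call failed or timed out. */
    static List<String> runBatch(List<Callable<String>> tools, int maxParallel, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        try {
            // invokeAll waits up to the timeout, then cancels (and interrupts) unfinished tasks.
            List<Future<String>> futures = pool.invokeAll(tools, timeoutMillis, TimeUnit.MILLISECONDS);
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.isCancelled() ? null : resultOrNull(f));
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("interrupted while waiting for tools", e);
        } finally {
            pool.shutdownNow();
        }
    }

    private static String resultOrNull(Future<String> f) throws InterruptedException {
        try {
            return f.get(); // the future is already done, so this does not block
        } catch (ExecutionException e) {
            return null;    // the tool itself threw
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> tools = new ArrayList<>();
        tools.add(() -> "navigate: ok");
        tools.add(() -> "getContent: ok");
        System.out.println(runBatch(tools, 3, 60_000));
    }
}
```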

4.2 Observability

AgentSphere provides a three-tier observation system:

4.2.1 Real-time Events (SSE Events)

Real-time push of LLM call chain:

content_token     → "The weather in Guangzhou tomorrow..."
reasoning_token   → "🤔 The user is asking about weather, I need to open a weather website"
                  → "⚙️ navigate: calling..."
                  → "⚙️ navigate: succeeded ✅"
                  → "⚙️ getContent: calling..."
                  → "⚙️ getContent: succeeded ✅"
                  → "⏹️ Run cancelled" or "✅ Run completed"

SSE Event	Trigger	Frontend Effect
`content_token`	LLM text generation	Typewriter effect
`reasoning_token`	LLM reasoning, tool status	Reasoning panel
`browser_operation`	Chrome operation command	Extension execution
`run_running`	Run starts	Status indicator
`run_completed`	Run completes	Completion notification
`run_failed`	Run fails	Error prompt
`tool_call_started`	Tool PENDING	Tool call list
`tool_call_succeeded`	Tool completes	✅ icon
`tool_call_failed`	Tool fails	❌ icon
`compaction_running`	Compaction starts	Reasoning panel
`compaction_completed`	Compaction completes	Reasoning panel
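
On the wire, each of these events is a `text/event-stream` frame: an `event:` line, one or more `data:` lines, and a blank line. The sketch below formats and parses such frames in plain Java. It is a simplified illustration, assuming plain-text payloads and complete frames; the `Event` record is hypothetical, and `id:`, `retry:`, and comment lines are ignored.

```java
import java.util.ArrayList;
import java.util.List;

class SseFrameSketch {
    record Event(String name, String data) {}

    /** Serializes one event in the text/event-stream wire format. */
    static String format(Event e) {
        StringBuilder sb = new StringBuilder("event: ").append(e.name()).append('\n');
        for (String line : e.data().split("\n", -1)) {
            sb.append("data: ").append(line).append('\n'); // one data line per payload line
        }
        return sb.append('\n').toString();                 // a blank line terminates the frame
    }

    /** Parses a buffer of complete frames back into events. */
    static List<Event> parse(String stream) {
        List<Event> events = new ArrayList<>();
        for (String frame : stream.split("\n\n")) {
            String name = "message"; // default event type in the SSE specification
            List<String> data = new ArrayList<>();
            for (String line : frame.split("\n")) {
                if (line.startsWith("event:")) name = stripOneSpace(line.substring(6));
                else if (line.startsWith("data:")) data.add(stripOneSpace(line.substring(5)));
            }
            if (!data.isEmpty()) events.add(new Event(name, String.join("\n", data)));
        }
        return events;
    }

    private static String stripOneSpace(String value) {
        return value.startsWith(" ") ? value.substring(1) : value;
    }

    public static void main(String[] args) {
        String wire = format(new Event("reasoning_token", "navigate: calling..."))
                + format(new Event("content_token", "The weather in Guangzhou tomorrow..."));
        parse(wire).forEach(System.out::println);
    }
}
```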

4.2.2 Run Activity API

Provides complete tool call history querying:

GET /api/v1/instance/runs/{runId}/activities?offset=0&limit=20

Response:
{
  "total": 20,
  "records": [
    { "activityType": "llm_interaction",
      "modelName": "deepseek-v4-flash",
      "interactionType": "CHAT_REPLY",
      "durationMs": 2588,
      "requestBody": "{...}",
      "responseBody": "{...}",
      "success": true },
    { "activityType": "tool_call",
      "toolName": "builtin_5",
      "displayName": "builtin.CapabilityBuiltinToolChrome",
      "argumentsJson": "{...}",
      "artifact": "{...}",
      "status": "SUCCEEDED" }
  ]
}
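
A client-side request for this endpoint could be built as in the sketch below, using the JDK's `java.net.http` API. Several details are assumptions: the base URL is the backend address from the deployment steps, `runId` is treated as a number, and the Bearer token header follows the API Security row of the tech stack table rather than any documented contract for this endpoint.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class RunActivityRequestSketch {
    /** Builds the paged GET request; it is not sent here. */
    static HttpRequest activities(String baseUrl, long runId, int offset, int limit, String bearerToken) {
        URI uri = URI.create(baseUrl + "/api/v1/instance/runs/" + runId
                + "/activities?offset=" + offset + "&limit=" + limit);
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Bearer " + bearerToken)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = activities("http://localhost:8080", 1, 0, 20, "<token>");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To execute it, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and advance `offset` by `limit` until `total` records have been read.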

4.2.3 Session Panel

View	Content
Run List	View historical runs by session, showing userMessage + assistantReply
Tool Call List	Latest tool call records for the current session (sorted by creation time descending)
Todo List	Todo checklist for the current session, with status tracking
Operation Log	Historical operation records in the Chrome Extension popup

4.3 Logging System

Logger	Level	Purpose
`ControllerLogAspect`	INFO	API request/response logging
`ChromeCallbackController`	WARN	Browser operation failures
`FiberSet`	WARN	Tool timeout/failure
`SessionRunner`	INFO	Execution turns and status
`LlmInteractionPersistListener`	DEBUG	LLM interaction record persistence
`RuntimeEventListener`	DEBUG	Tool call lifecycle events

4.4 Key Deployment Steps

# 1. Build the backend
cd agent-sphere
mvn compile -pl agent-sphere-bootstrap -am

# 2. Start the backend
mvn spring-boot:run -pl agent-sphere-bootstrap

# 3. Start the frontend
cd agent-sphere-ui
npm run dev

# 4. Load the Chrome Extension
# Chrome → chrome://extensions → Developer mode → Load unpacked
# Select the agent-sphere-chrome-extension directory

# 5. Configure URLs
# Click the extension icon → Settings Tab
# Frontend URL: http://localhost:8000
# Backend URL:  http://localhost:8080

4.5 Architecture Decision Records (ADR)

Decision	Solution	Reason
SSE vs WebSocket	Server-Sent Events	One-way push requires no client confirmation, natively supported by browsers
fetch+ReadableStream vs EventSource	fetch + ReadableStream	EventSource cannot carry Authorization headers in MV3 Service Worker
Virtual Threads	Java 21 Virtual Threads	Simplifies concurrency model, one virtual thread per tool
Chrome Extension standalone deployment	Independent project	Decoupled from Web UI, permission isolation
Multi-emitter SSE	`List<SseEmitter>` per session	Web UI and Extension share the same SSE channel
FiberSet cancel(true)	`CompletableFuture.cancel(true)`	Effectively interrupts blocking virtual threads on timeout
Tool result write-time compression	`RuntimeEventListener` compresses then writes to `compressed_artifact`	HistoryLoader reads without re-compression, reducing redundant computation
Token budget-based compaction trigger	`shouldCompact` inside `runTurn`'s execute callback	Uses the actual called model route's maxInputTokens for accuracy
Compaction cursor	`compactedUptoRunId` marks compacted runs	HistoryLoader skips compacted runs, only loads subsequent ones
Compaction protection loop	Max 3 retries	Prevents infinite loops when compaction fails due to network fluctuations

4.6 Capability Extension

Adding a New Built-in Tool

@Component
public class CapabilityBuiltinToolMyTool implements CapabilityBuiltinToolSpi {
    @Override
    public BuiltinToolEnum getToolType() { return BuiltinToolEnum.MY_TOOL; }

    @Override
    public ToolInfoVO getInfo() {
        ToolInfoVO info = new ToolInfoVO();
        info.setName(BuiltinToolConstants.NAME_PREFIX + "MyTool");
        info.setDescription("Description for LLM");
        info.setParamSchema(ToolSchemaUtil.generateParamSchema(MyToolDTO.class));
        info.setResponseSchema(ToolSchemaUtil.generateParamSchema(MyToolResultVO.class));
        return info;
    }

    @Override
    public ExecuteResult execute(ExecuteContext ctx) {
        MyToolDTO dto = (MyToolDTO) ctx;
        // Implementation logic
        return new MyToolResultVO(/* result */);
    }
}

5. Project Structure

6. Tech Stack

Domain	Technology
Backend Runtime	Java 21, Spring Boot 3.4, Virtual Threads
Database	PostgreSQL, Flyway migrations
Cache/Distributed Lock	Redis (Redisson)
Frontend	React, UmiJS, Ant Design Pro
Chrome Extension	Manifest V3, Service Worker, Content Script
Real-time Communication	SSE (Server-Sent Events), multi-emitter broadcast
Tool Protocol	MCP (Model Context Protocol, Streamable HTTP)
API Security	Bearer Token, @WithTenant multi-tenancy
LLM Integration	SPI provider abstraction, automatic fallback routing

7. MCP Integration Example

AgentSphere supports connecting to any external service via the MCP protocol. Taking Jira as an example:

# 1. Deploy the Jira MCP Server
npx @roovet/jira-mcp --port 3100

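# 2. Add the MCP capability in the AgentSphere admin console
# (host assumed to be the backend from section 4.4; add an Authorization header if your deployment requires a Bearer token)
curl -X POST http://localhost:8080/api/v1/capability/mcp \
  -H 'Content-Type: application/json' \
  -d '{"name":"Jira MCP","serverUrl":"http://localhost:3100","serverType":"streamable-http"}'

# 3. Bind it to an Agent instance
curl -X POST http://localhost:8080/api/v1/instance/instance-capabilities \
  -H 'Content-Type: application/json' \
  -d '{"instanceId":1,"capabilityType":"mcp","capabilityId":1}'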

# 4. Users simply send instructions in the chat
# "Help me check my unfinished tasks on Jira"
# → LLM calls MCP tool → Jira API → returns result

8. License

MIT License
