A Production-Grade Multi-Tenant Agent Harness
Progressive disclosure of Skills, Tools and MCP — built on LangGraph + deepagents
DeepAgentForce is a production-grade Agent Harness — a runtime that gives agents a place to live, run, and scale across real multi-user environments.
It is not just another chatbot interface. The focus is on the operating layer:
- How an agent selects and invokes skills at scale (progressive disclosure, not a flat list)
- How dozens or hundreds of tools/MCP servers stay out of the context window until needed
- How multiple users share the same platform while keeping sessions, knowledge, configs, and skills fully isolated
- How RAG, long-term user memory, tools, and custom logic connect into one coherent runtime
Inspired by the hermes-agent architecture but rebuilt natively on LangGraph + LangChain:
| Layer | Skills | Built-in Tools | Extra / MCP Tools |
|---|---|---|---|
| Always in context | Category overview only | Full list (15 tools, small schema) | Bridge tools only (3 stubs) |
| On demand — tier 1 | skills_list(category) → name+description |
— | tool_search(query) → BM25 matches |
| On demand — tier 2 | skill_view(name) → full SKILL.md |
— | tool_describe(name) → param schema |
| Execution | shell → run skill script |
Direct call (already bound) | tool_invoke(name, args) → proxy |
Why it matters: A deployment with 19 skills, 50 MCP tools, and 10 custom tools keeps the same constant context overhead — only 3 bridge stubs are ever in the context window regardless of how many tools you add.
Threshold gate (tool_disclosure.py):
extra_tools schema tokens < 10% context → bind directly (no overhead)
extra_tools schema tokens ≥ 10% context → switch to tool_search bridge
Connect any MCP server — the same config format as Claude Desktop:
{
"mcpServers": {
"slack": { "command": "npx", "args": ["-y", "@modelcontextprotocol/server-slack"] },
"github": { "url": "https://my-mcp-server/github", "headers": { "Authorization": "Bearer ..." } }
}
}- Supports
stdio,streamable_http, andssetransports - Multi-tenant: global shared config + per-tenant override file
- Tools auto-prefixed
mcp__<server>__<tool>to avoid conflicts - Full async support —
tool_invokeusesainvokefor MCP tools - Managed via Web UI (add, test connection, enable/disable)
Every resource is scoped to tenant_uuid:
| Resource | Isolation |
|---|---|
| Chat sessions | thread_id = tenant_uuid + session_id |
| RAG knowledge base | Separate ChromaDB collection per tenant |
| Skills | Shared built-in + private data/skill/<uuid>/ directory |
| MCP configs | Global read-only + data/mcp_servers_<uuid>.json |
| Custom tools | data/agent_tools_custom/<uuid>/ |
| User profile | PageRank-based preference graph per user |
| File outputs | data/outputs/<uuid>/ |
End users can upload Python tools via the Web UI. They run in an isolated subprocess — the main process never imports user code:
subprocess+rlimit: CPU 10s, file 16 MB, memory 1 GB (Linux)- Wall-clock timeout: 20s
- Environment variable filtering: strips any variable containing
KEY,TOKEN,SECRET,PASSWORD,CREDENTIAL - New session + temp cwd per invocation
- Proxy tools via
pydantic.create_modelpreserve the original arg schema
Built on LangGraph + LangChain + deepagents — not a custom runtime:
| Capability | Implementation |
|---|---|
| Agent graph | LangGraph create_deep_agent + MemorySaver |
| Multi-turn memory | LangGraph MemorySaver (per thread_id) |
| Tool binding | LangChain BaseTool / StructuredTool |
| LLM access | init_chat_model — any OpenAI-compatible endpoint |
| Streaming | astream(stream_mode="messages") |
| Observability | WebSocket event callbacks (tool start/end, answer phase) |
Not a plugin — a first-class runtime capability:
- ChromaDB local persistence, per-tenant collection isolation
- PDF / DOCX / TXT / CSV / Markdown ingestion
- Optional BM25 hybrid retrieval, reranking, and query rewrite
- Accessible via the
rag-queryskill from any conversation
person_like_service.py extracts entities and relationships from conversations, builds a NetworkX knowledge graph per user, and scores topics via PageRank + connection weight + mention frequency. The agent gets a personalized summary injected at every session start.
┌─────────────────────────────────────────────────────────┐
│ Frontend (static/) Chat · Knowledge · Skills · Config │
└────────────────────────┬────────────────────────────────┘
│ WebSocket / REST
┌────────────────────────▼────────────────────────────────┐
│ API Layer (FastAPI) │
│ routes · auth_routes · skills_routes │
│ tools_routes · mcp_routes · websocket │
└────────────────────────┬────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────┐
│ Agent Runtime (ConversationalAgent) │
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌────────────────┐ │
│ │ Skills │ │ Common Tools │ │ Extra/MCP Tools│ │
│ │ Disclosure │ │ (15 tools) │ │ Disclosure │ │
│ │skills_list │ │ utils·web │ │ tool_search │ │
│ │skill_view │ │ memory │ │ tool_invoke │ │
│ └─────────────┘ └──────────────┘ └────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ LangGraph create_deep_agent + MemorySaver │ │
│ └─────────────────────────────────────────────────────┘ │
└──────┬─────────────────────────┬───────────────────┬────┘
│ │ │
┌──────▼──────┐ ┌───────────────▼──────┐ ┌────────▼────┐
│ RAG │ │ Skill Manager │ │ Sandbox │
│ ChromaDB │ │ built-in + custom │ │ rlimit + │
│ per-tenant │ │ skills/<cat>/<name> │ │ subprocess │
└─────────────┘ └──────────────────────┘ └─────────────┘
| Category | Skills |
|---|---|
design |
algorithmic-art, brand-guidelines, canvas-design, frontend-design, slack-gif-creator, theme-factory, web-artifacts-builder |
development |
claude-api, mcp-builder, skill-creator, webapp-testing |
document |
docx, pdf, pptx, xlsx |
research |
rag-query, web-search |
writing |
doc-coauthoring, internal-comms |
Each skill lives in skills/<category>/<name>/SKILL.md + scripts/. The agent never loads them all — it uses skills_list → skill_view on demand.
git clone https://github.com/TW-NLP/DeepAgentForce
cd DeepAgentForce
docker compose up -dVisit http://localhost:8000 — then go to Settings to configure your LLM and Embedding API keys.
git clone https://github.com/TW-NLP/DeepAgentForce
cd DeepAgentForce
conda create -n agent python=3.12 -y
conda activate agent
pip install -r requirements.txt
python main.pyMirror for users in China:
pip install -r requirements.txt \
-i https://mirrors.aliyun.com/pypi/simple/ \
--trusted-host=mirrors.aliyun.com.env minimum config:
SQLITE_DB_PATH=data/deepagentforce.db
JWT_SECRET_KEY=your-secret-key-change-in-production
HOST=127.0.0.1
PORT=8000http://localhost:8000/login.html — every user gets an isolated workspace.
Go to Settings and fill in:
| Field | Example |
|---|---|
LLM_URL |
https://api.openai.com/v1 |
LLM_API_KEY |
sk-xxxxxxxx |
LLM_MODEL |
gpt-4o |
EMBEDDING_URL |
https://api.openai.com/v1 |
EMBEDDING_API_KEY |
sk-xxxxxxxx |
EMBEDDING_MODEL |
text-embedding-3-small |
Supports PDF, DOCX, TXT, CSV, Markdown — all indexed into a per-tenant ChromaDB collection.
Go to Skills → MCP tab → Add Server. Use the same JSON format as Claude Desktop. Click Test Connection before saving.
Go to Skills → Tools tab → Add Tool. Upload a .py file — any top-level function with a docstring becomes a callable agent tool, sandboxed automatically.
The agent auto-selects skills, searches the knowledge base, calls tools, and synthesizes answers in natural language.
DeepAgentForce/
├── main.py
├── config/settings.py
├── src/
│ ├── api/
│ │ ├── routes.py # core chat + file routes
│ │ ├── skills_routes.py # skill CRUD
│ │ ├── tools_routes.py # custom tool CRUD
│ │ ├── mcp_routes.py # MCP server CRUD
│ │ ├── auth_routes.py
│ │ └── websocket.py
│ └── services/
│ ├── conversational_agent.py # agent assembly
│ ├── skill_disclosure.py # skills progressive disclosure
│ ├── tool_disclosure.py # tools BM25 progressive disclosure
│ ├── mcp_integration.py # MCP connector + config store
│ ├── custom_tool_manager.py # user-uploaded Python tools
│ ├── sandbox/ # subprocess isolation
│ │ ├── runner.py
│ │ ├── loader.py
│ │ └── tool_worker.py
│ ├── agent_tools/ # 15 built-in common tools
│ │ ├── utils.py
│ │ ├── web.py
│ │ └── memory.py
│ ├── skill_manager.py
│ ├── rag.py
│ └── person_like_service.py
├── src/services/skills/ # 19 built-in skills
│ ├── design/
│ ├── development/
│ ├── document/
│ ├── research/
│ └── writing/
├── static/ # Web UI
│ ├── js/i18n.js # EN/ZH language switcher
│ ├── index.html
│ ├── login.html
│ └── register.html
├── scripts/ # test suites
│ ├── test_sandbox.py # 15/15
│ ├── test_mcp_integration.py # 14/14
│ ├── test_tools_mcp_mgmt.py # 20/20
│ ├── test_routes_http.py # 19/19
│ └── test_optimizations.py # 30/30
└── data/ # runtime data (gitignored)
Swagger: http://localhost:8000/docs
| Endpoint | Method | Description |
|---|---|---|
/ws/stream |
WebSocket | Streaming conversation |
/api/chat |
POST | Single-turn chat |
/api/chat/upload |
POST | Chat with file attachment |
/api/auth/register |
POST | Register |
/api/auth/login |
POST | Login |
/api/skills |
GET | List skills |
/api/skills/install |
POST | Install skill |
/api/tools |
GET | List tools (built-in + MCP + custom) |
/api/tools/custom |
POST | Upload custom tool |
/api/mcp/servers |
GET | List MCP servers |
/api/mcp/servers |
POST | Add/update MCP server |
/api/mcp/servers/test |
POST | Test MCP connection |
/api/rag/documents/upload |
POST | Upload knowledge document |
/api/rag/query |
POST | RAG query |
-
2026-06-02 —
v2.0.0Progressive Disclosure Edition- Skills progressive disclosure:
skills_list/skill_viewtwo-tier system - BM25 tool search:
tool_search/tool_describe/tool_invokebridge - Full MCP integration (stdio + HTTP, multi-tenant config)
- Custom Python tool sandbox (subprocess + rlimit)
- Web UI: Skills / Tools / MCP management tabs
- 15 built-in common tools (utils + web + memory)
- 19 skills reorganized into 5 categories
- EN/ZH frontend language switcher
- Skills progressive disclosure:
-
2026-04-23 —
v1.4.0- Docker build optimization + macOS DMG / Windows EXE packaging
- SQLite as default database
-
2026-04-22 —
v1.3.0— Skill zip upload, dialogue improvements -
2026-04-21 —
v1.2.0— 20 Claude built-in skills, regenerate/edit in chat
- Agent platform thesis / research prototype
- Enterprise internal knowledge assistant
- Multi-user AI workbench with isolated data
- Extensible tool-calling agent with MCP ecosystem access
- Chinese NLP + RAG + proofreading pipeline
Why can't I chat right after Docker starts?
The LLM keys are not pre-configured. Go to Settings and fill in LLM_* and EMBEDDING_* fields first.
How do I add a new skill?
Create a directory with SKILL.md + scripts/, zip it, and upload via Skills → Add Skill. Or drop it directly into src/services/skills/<category>/.
How do I clear all Docker data?
docker compose down -vMIT — free to use, modify, and distribute commercially.
- LangChain / LangGraph — Agent framework
- deepagents — Skill-aware agent builder
- FastAPI — Web framework
- ChromaDB — Local vector store
