| 46% Fewer Tokens | $11K Earned in 6 Hours | Self-Evolving Skills | Agent Experience Sharing |
Today's AI agents (OpenClaw, nanobot, Claude Code, Codex, Cursor, etc.) are powerful, but they share a critical weakness: they never learn, adapt, or evolve from real-world experience, let alone share what they learn with each other.
- Massive Token Waste: How can agents reuse successful task patterns instead of reasoning from scratch and burning tokens every time?
- Repeated Costly Failures: How can agents share solutions instead of repeating the same costly exploration and mistakes?
- Poor and Unreliable Skills: How can skills stay reliable as tools and APIs evolve, while community-contributed skills still meet rigorous quality standards?
OpenSpace is the self-evolving engine where every task makes every agent smarter and more cost-efficient.
OpenSpace plugs into any agent as skills and evolves it with three superpowers:
Skills that learn and improve themselves automatically
- AUTO-FIX: When a skill breaks, it fixes itself instantly
- AUTO-IMPROVE: Successful patterns become better skill versions
- AUTO-LEARN: Captures winning workflows from actual usage
- Quality monitoring: Tracks skill performance, error rates, and execution success across all tasks
Skills that continuously evolve, turning every failure into improvement and every success into optimization.
Turn individual agents into a shared brain
- Shared evolution: One agent's improvement becomes every agent's upgrade
- Network effects: More agents → richer data → faster evolution for every agent
- Easy sharing: Upload and download evolved skills with one simple command
- Access control: Choose public, private, or team-only access for each skill
One agent learns, all agents benefit: collective intelligence at scale.
Smarter agents, dramatically lower costs
- Stop repeating work: Reuse successful solutions instead of starting from zero each time
- Tasks get cheaper: As skills improve, similar work costs less and less
- Small updates only: Fix what's broken, don't rebuild everything
- Real savings: 4.2× higher income with 46% fewer tokens on real-world tasks (GDPVal)
Do more, spend less: agents that actually save you money over time.
Current Agents
- Skills degrade silently as tools evolve
- Failed patterns repeat with no learning mechanism
- Knowledge remains trapped in individual agents
OpenSpace-Powered Agents
- Multi-layer monitoring catches problems and auto-triggers repairs
- Successful workflows become reusable, shareable skills
- When one agent learns something useful, all agents get that knowledge instantly
Real-World Results That Matter
On 50 professional tasks from the GDPVal Economic Benchmark, spanning 6 industries, OpenSpace agents earn 4.2× more money than baseline (ClawWork) agents using the same backbone LLM (Qwen 3.5-Plus), while cutting 46% of costly tokens through skill evolution.
These Aren't Toy Problems
- Building payroll calculators from complex union contracts
- Preparing tax returns from 15 scattered PDF documents
- Drafting legal memoranda on California privacy regulations
- Creating compliance forms and engineering specifications
Consistent Wins Across All Fields
- Compliance work: +18.5pp income capture
- Engineering projects: +8.7pp income capture
- Professional documents: 56% fewer tokens needed
- Every category improved, no exceptions
OpenSpace doesn't just make agents smarter; it makes them economically viable. Real work, real money, measurable results.
My Daily Monitor: OpenSpace empowers your agent to complete large-scale system development. This personal behavior monitoring system, with 20+ live dashboard panels, was built entirely by the agent. Its 60+ skills evolved from scratch through OpenSpace, demonstrating autonomous end-to-end software development.
- Quick Start
- Benchmark: GDPVal
- Showcase: My Daily Monitor
- Framework
- Advanced Configuration
- Code Structure
- Contribute & Roadmap
- Related Projects
Just want to explore? Browse community skills and their evolution lineage at open-space.cloud, no installation needed.
git clone https://github.com/HKUDS/OpenSpace.git && cd OpenSpace
pip install -e .
openspace-mcp --help  # verify installation
Choose your path:
Works with any agent that supports skills (SKILL.md): Claude Code, Codex, OpenClaw, nanobot, etc.
1. Add OpenSpace to your agent's MCP config:
{
  "mcpServers": {
    "openspace": {
      "command": "openspace-mcp",
      "toolTimeout": 600,
      "env": {
        "OPENSPACE_HOST_SKILL_DIRS": "/path/to/your/agent/skills",
        "OPENSPACE_WORKSPACE": "/path/to/OpenSpace",
        "OPENSPACE_API_KEY": "sk-xxx (optional, for cloud)"
      }
    }
  }
}
Tip
Credentials (API key, model) are auto-detected from your agent's config; you usually don't need to set them manually.
2. Copy skills into your agent's skills directory:
cp -r OpenSpace/openspace/host_skills/delegate-task /path/to/your/agent/skills/
cp -r OpenSpace/openspace/host_skills/skill-discovery /path/to/your/agent/skills/
Done. These two skills teach your agent when and how to use OpenSpace, with no additional prompting needed. Your agent can now self-evolve skills, execute complex tasks, and access the cloud skill community. You can also add your own custom skills; see openspace/skills/README.md.
Note
Cloud community (optional): Register at open-space.cloud to get an OPENSPACE_API_KEY, then add it to the env block above. Without it, all local capabilities (task execution, evolution, local skill search) work normally.
Per-agent config (OpenClaw / nanobot), all env vars, advanced settings: openspace/host_skills/README.md
Use OpenSpace directly for coding, search, tool use, and more, with self-evolving skills and cloud community built in.
Note
Create a .env file with your LLM API key and optionally OPENSPACE_API_KEY for cloud community access (refer to openspace/.env.example).
# Interactive mode
openspace
# Execute task
openspace --model "anthropic/claude-sonnet-4-5" --query "Create a monitoring dashboard for my Docker containers"
Add your own custom skills: openspace/skills/README.md.
Cloud CLI for managing skills from the command line:
openspace-download-skill <skill_id>  # download a skill from the cloud
openspace-upload-skill /path/to/skill/dir  # upload a skill to the cloud
Python API
import asyncio

from openspace import OpenSpace


async def main():
    # Run one task inside an OpenSpace session
    async with OpenSpace() as cs:
        result = await cs.execute("Analyze GitHub trending repos and create a report")
        print(result["response"])
        # List any skills that evolved during this run
        for skill in result.get("evolved_skills", []):
            print(f"  Evolved: {skill['name']} ({skill['origin']})")


asyncio.run(main())
See how your skills evolve: browse skills, track lineage, compare diffs.
Requires Node.js ≥ 20.
# Terminal 1: Start backend API
openspace-dashboard --port 7788
# Terminal 2: Start frontend dev server
cd frontend
npm install  # only needed once
npm run dev
Frontend setup guide: frontend/README.md
Dashboard views:
- Skill Classes: Browse, Search & Sort
- Cloud: Browse & Discover Skill Records
- Version Lineage: Skill Evolution Graph
- Workflow Sessions: Execution History & Metrics
We evaluate OpenSpace on GDPVal (220 real-world professional tasks spanning 44 occupations) using the ClawWork evaluation protocol with identical productivity tools and LLM-based scoring. Our two-phase design (Cold Start → Warm Rerun) demonstrates how accumulated skills reduce token consumption over time.
Fair Benchmark: OpenSpace uses Qwen 3.5-Plus as its backbone LLM, identical to the ClawWork baseline agent, so that performance differences stem from skill evolution rather than model capability.
Real Economic Value: Tasks range from building payroll calculators to preparing tax returns to drafting legal memoranda. This is the same professional work that generates actual GDP, evaluated on both quality and cost efficiency.
- 4.2× Higher Income vs ClawWork with the same backbone LLM (Qwen 3.5-Plus)
- 72.8% Value Capture: $11,484 earned out of $15,764 task value, outperforming all agents
- 70.8% Average Quality: +30pp above the best ClawWork agent (40.8%)
- 45.9% Lower Token Usage in Phase 2 vs Phase 1: better results with dramatically lower costs
The 50 GDPVal tasks span 6 real-world work categories.
- Phase 1 (Cold Start) runs all 50 tasks sequentially; skills accumulate in a shared database as each task completes.
- Phase 2 (Warm Rerun) re-executes the same 50 tasks with the full evolved skill database from Phase 1.
Income Capture = actual payment earned ÷ maximum possible task value
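As a worked check of this definition, the headline figures quoted above ($11,484 earned out of $15,764 task value) can be plugged in directly. The function name `income_capture` is illustrative only and is not part of OpenSpace.

```python
def income_capture(earned: float, max_value: float) -> float:
    """Return actual payment earned divided by maximum possible task value."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    return earned / max_value


# Figures taken from the benchmark summary in this README.
capture = income_capture(11_484, 15_764)
print(f"{capture:.1%}")  # prints 72.8%
```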
| Category (tasks) | Income Δ | Token Δ | Why |
|---|---|---|---|
| Documents & Correspondence (7) | 71→74% (+3.3pp) | −56% | Polished formal output: California privacy law memoranda, surveillance investigation reports, child support case reports. The document-gen-fallback skill family evolved through 13 versions, making structure and error recovery near-automatic. |
| Compliance & Form (11) | 51→70% (+18.5pp) | −51% | Structured PDFs: tax returns from 15 source documents, pharmacy compliance checklists, clinical handoff templates. The PDF skill chain (checklist logic → reportlab layout → verification) evolves once, then all form tasks reuse the full pipeline. |
| Media Production (3) | 53→58% (+5.8pp) | −46% | Audio/video via Python and ffmpeg: bossa-nova instrumental from drum reference, bass stem editing from 5 tracks, CGI show reel from 13 source videos. Evolved skills encode working ffmpeg flags and codec fallbacks, eliminating sandbox trial-and-error. |
| Engineering (4) | 70→78% (+8.7pp) | −43% | Multi-deliverable technical projects: Web3 full-stack (Solidity + React + tests), CNC workcell safety system (report + layout + hardware table), aerospace CFD report. Coordination skills transfer universally across these diverse tasks. |
| Spreadsheets (15) | 63→70% (+7.3pp) | −37% | Functional .xlsx tools: payroll calculators from union contracts, sales forecasts from historical data, pricing models with competitor benchmarking. Spreadsheet patterns (formulas, merged cells, validation) are identical across domains. |
| Strategy & Analysis (10) | 88→89% (+1.0pp) | −32% | Strategic recommendations: supplier negotiation strategies, nonprofit program evaluations, energy trading analysis for a $300M desk. Already highest quality (88%); savings come from reusing document structure and multi-file orchestration. |
Across the 50 Phase 1 tasks, OpenSpace autonomously evolved 165 skills. The key insight: these aren't just domain knowledge; they're resilient execution patterns and quality assurance workflows. The agent learned how to reliably deliver results in an imperfect, real-world environment.
Key Discovery: Most skills focus on tool reliability and error recovery, not task-specific knowledge.
| Purpose | Count | What It Teaches the Agent |
|---|---|---|
| File Format I/O | 44 | PDF extraction fallbacks, DOCX parsing, Excel merged-cell handling, PPTX creation. 32/44 captured from real failures; each one is a production bug solved. |
| Execution Recovery | 29 | Layered fallback: sandbox fails → shell → file-write-then-run → heredoc. 28/29 captured from actual crashes. The foundation that makes everything else reliable. |
| Document Generation | 26 | End-to-end doc pipeline. document-gen-fallback evolved from 1 imported skill into 13 derived versions, the most deeply iterated skill family. |
| Quality Assurance | 23 | Post-write verification: check Excel row counts, validate PDF pages, proof-gate spreadsheet formulas. This is why Phase 2 quality improves: the agent verifies, not just produces. |
| Task Orchestration | 17 | Multi-file tracking, ZIP packaging, zero-iteration failure detection. Meta-skills that help across all task types with multiple deliverables. |
| Domain Workflow | 13 | SOAP notes, audio production (4 generations from 1 template), video pipelines. Small count but deep evolution within each domain. |
| Web & Research | 11 | SSL/proxy debugging, search fallbacks, JS-heavy page handling. Includes 2 fixed skills; web access is inherently unstable. |
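The layered fallback described in the Execution Recovery row (try one execution route, and on failure move to the next) can be sketched as a simple ordered chain. This is a minimal illustration under stated assumptions, not OpenSpace's implementation: `run_with_fallbacks`, `sandbox`, and `shell` are hypothetical names, and the two strategies are stubs that only simulate success and failure.

```python
from typing import Callable, Sequence

# A strategy is a (label, callable) pair; both names below are hypothetical.
Strategy = tuple[str, Callable[[str], str]]


def run_with_fallbacks(code: str, strategies: Sequence[Strategy]) -> tuple[str, str]:
    """Try each strategy in order; return (label, output) of the first success."""
    failures = []
    for label, run in strategies:
        try:
            return label, run(code)
        except Exception as exc:  # broad on purpose: any failure moves to the next layer
            failures.append(f"{label}: {exc}")
    raise RuntimeError("all layers failed: " + "; ".join(failures))


def sandbox(code: str) -> str:
    raise RuntimeError("sandbox unavailable")  # simulated failure


def shell(code: str) -> str:
    return f"shell ran: {code}"  # simulated success


layer, output = run_with_fallbacks("print('hi')", [("sandbox", sandbox), ("shell", shell)])
print(layer, "->", output)
```

Collecting the failure messages rather than discarding them keeps the evidence a later analysis step would need to decide which layer to repair.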
To reproduce the experiments, or to browse the analysis tools and results, see gdpval_bench/README.md.
Zero human code was written. 60+ skills evolved from scratch to build a fully working live dashboard.
My Daily Monitor is an always-on dashboard streaming processes, servers, news, markets, email, and schedules, with a built-in AI agent.
| Phase | What Happened | Skills |
|---|---|---|
| Seed | Analyzed open-source WorldMonitor, extracted reference patterns | 6 initial skills |
| Scaffold | Generated project structure, Vite config, TypeScript setup | +8 skills |
| Build | Created 20+ panels with data services, API routes, grid layout | +25 skills |
| Fix | Auto-repaired broken TypeScript, API mismatches, CSS conflicts | +12 FIX evolutions |
| Evolve | Derived enhanced patterns, merged complementary skills | +15 DERIVED skills |
| Capture | Extracted reusable patterns from successful executions | +8 CAPTURED skills |
Each node is a skill that OpenSpace learned, extracted, or refined. The full evolution history is open-sourced in showcase/.openspace/openspace.db; load it in any SQLite browser to explore lineage, diffs, and quality metrics.
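If no SQLite browser is at hand, Python's standard `sqlite3` module can at least show which tables a database contains. The sketch below makes no assumption about the schema of openspace.db: it only reads SQLite's built-in `sqlite_master` catalog, and it runs against a throwaway in-memory database whose table name `demo_skills` is made up for the demonstration.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables, read from SQLite's sqlite_master catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Demonstration on an in-memory database; 'demo_skills' is a made-up table name.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo_skills (id INTEGER PRIMARY KEY, name TEXT)")
tables = list_tables(demo)
print(tables)  # ['demo_skills']
demo.close()

# To inspect the real file read-only, open it with a URI instead:
# conn = sqlite3.connect("file:showcase/.openspace/openspace.db?mode=ro", uri=True)
```

Opening with `mode=ro` avoids creating an empty file by accident if the path is mistyped.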
Full details: showcase/README.md
The core of OpenSpace. Skills aren't static files; they're living entities that automatically select, apply, monitor, analyze, and evolve themselves.
- Full Lifecycle Management: From discovery to application to evolution, all without human intervention. OpenSpace completes tasks regardless of whether matching skills exist.
Three Evolution Modes:
- FIX: Repair broken or outdated instructions in place. Same skill, new version.
- DERIVED: Create enhanced or specialized versions from parent skills. New skill directory, coexists with parents.
- CAPTURED: Extract novel reusable patterns from successful executions. Brand new skill, no parent.
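The three modes differ mainly in how a new version relates to its parents, which can be sketched as a small data model. This is an illustration under assumptions, not the types OpenSpace defines in its own code: `Mode`, `SkillVersion`, `evolve`, and the skill names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    FIX = "fix"
    DERIVED = "derived"
    CAPTURED = "captured"


@dataclass(frozen=True)
class SkillVersion:
    name: str
    version: int
    mode: Optional[Mode] = None               # None marks a seed or imported skill
    parents: tuple["SkillVersion", ...] = ()  # edges of the version graph


def evolve(mode: Mode, parents: tuple[SkillVersion, ...] = (), new_name: str = "") -> SkillVersion:
    """Create a new version according to the lineage rule of each mode."""
    if mode is Mode.FIX:
        if len(parents) != 1:
            raise ValueError("FIX needs exactly one parent")
        (parent,) = parents
        return SkillVersion(parent.name, parent.version + 1, mode, parents)  # same skill, new version
    if not new_name:
        raise ValueError("DERIVED and CAPTURED need a new skill name")
    if mode is Mode.DERIVED:
        if not parents:
            raise ValueError("DERIVED needs at least one parent")
        return SkillVersion(new_name, 1, mode, parents)  # new skill, coexists with parents
    if parents:
        raise ValueError("CAPTURED has no parent")
    return SkillVersion(new_name, 1, mode)  # brand new skill


seed = SkillVersion("pdf-extract", 1)
fixed = evolve(Mode.FIX, (seed,))
derived = evolve(Mode.DERIVED, (fixed,), "pdf-extract-tables")
captured = evolve(Mode.CAPTURED, new_name="zip-packaging")
print(fixed.name, fixed.version)  # pdf-extract 2
```

Allowing several parents for DERIVED is what turns the lineage into a graph rather than a simple chain.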
Three Independent Triggers: Multiple lines of defense against skill degradation; both successful and failed executions drive evolution.
- Post-Execution Analysis: Runs after every task. Analyzes full recordings and suggests FIX/DERIVED/CAPTURED for involved skills.
- Tool Degradation: When tool success rates drop, the quality monitor finds all dependent skills and batch-evolves them.
- Metric Monitor: Periodically scans skill health metrics (applied rate, completion rate, fallback rate) and evolves underperformers.
Multi-Layer Tracking: Quality monitoring covers the entire execution stack, from high-level workflows to individual tool calls:
- Skills: applied rate, completion rate, effective rate, fallback rate
- Tool Calls: success rate, latency, flagged issues
- Code Execution: execution status, error patterns
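One way to picture skill-level tracking is as a handful of counters turned into rates, with a threshold check deciding which skills to evolve. The README names the metrics but does not define them, so everything below is an assumption made for illustration: the counter fields, the rate formulas, the thresholds, and the names `SkillCounters`, `health`, and `needs_evolution`.

```python
from dataclasses import dataclass


@dataclass
class SkillCounters:
    selected: int   # times the skill was chosen for a task (assumed counter)
    applied: int    # times it was actually applied
    completed: int  # applied runs in which the task completed
    fallbacks: int  # applied runs in which the agent fell back to another route


def rate(numerator: int, denominator: int) -> float:
    """Safe ratio that returns 0.0 when there is no data yet."""
    return numerator / denominator if denominator else 0.0


def health(c: SkillCounters) -> dict[str, float]:
    return {
        "applied_rate": rate(c.applied, c.selected),
        "completion_rate": rate(c.completed, c.applied),
        "fallback_rate": rate(c.fallbacks, c.applied),
    }


def needs_evolution(c: SkillCounters, min_completion: float = 0.7,
                    max_fallback: float = 0.3, min_samples: int = 5) -> bool:
    """Flag a skill only once enough runs exist to trust the rates."""
    if c.applied < min_samples:
        return False
    h = health(c)
    return h["completion_rate"] < min_completion or h["fallback_rate"] > max_fallback


struggling = SkillCounters(selected=20, applied=10, completed=4, fallbacks=5)
healthy = SkillCounters(selected=20, applied=10, completed=9, fallbacks=1)
print(needs_evolution(struggling), needs_evolution(healthy))  # True False
```

The `min_samples` guard is a simple stand-in for the idea of not reacting to a single bad run.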
Cascade Evolution: When any component degrades, whether a skill workflow or a single tool call, evolution automatically triggers for all upstream dependent skills, maintaining system-wide coherence.
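Finding "all upstream dependent skills" is a graph reachability problem: invert the dependency map, then walk outward from the degraded component. The sketch below shows that idea with a breadth-first search. The function name `upstream_dependents`, the `tool:` prefix, and the example skills are hypothetical, and how OpenSpace records dependencies is not specified here.

```python
from collections import deque


def upstream_dependents(degraded: str, depends_on: dict[str, set[str]]) -> set[str]:
    """Return every skill that depends on `degraded`, directly or transitively."""
    # Invert the map: component -> skills that use it.
    dependents: dict[str, set[str]] = {}
    for skill, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(skill)

    affected: set[str] = set()
    queue = deque([degraded])
    while queue:
        node = queue.popleft()
        for user in dependents.get(node, ()):
            if user not in affected:  # visit each skill once, so cycles terminate
                affected.add(user)
                queue.append(user)
    return affected


# Hypothetical dependency map: skill -> tools and skills it relies on.
depends_on = {
    "pdf-report": {"tool:reportlab", "pdf-verify"},
    "pdf-verify": {"tool:pdfplumber"},
    "xlsx-payroll": {"tool:openpyxl"},
}
affected = upstream_dependents("tool:pdfplumber", depends_on)
print(sorted(affected))  # ['pdf-report', 'pdf-verify']
```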
Autonomous Evolution: Each evolution explores the codebase, discovers root causes, and decides fixes autonomously, gathering real evidence before making changes rather than generating blindly.
Diff-Based & Token-Efficient: Produces minimal, targeted diffs rather than full rewrites, with automatic retry on failure. Every version is stored in a version DAG with full lineage tracking.
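To see why a diff is cheaper than a rewrite, compare two versions of a small skill file with Python's standard `difflib`. This only illustrates the general idea of a unified diff; it does not show the patch format OpenSpace uses, and the skill text is made up.

```python
import difflib

# Two made-up versions of a skill file that differ in a single instruction.
old_text = (
    "# pdf-extract\n"
    "1. Open the PDF with the primary parser.\n"
    "2. Extract text page by page.\n"
    "3. Write the text to output.txt.\n"
)
new_text = (
    "# pdf-extract\n"
    "1. Open the PDF with the primary parser; on error, retry with the fallback parser.\n"
    "2. Extract text page by page.\n"
    "3. Write the text to output.txt.\n"
)

diff = list(difflib.unified_diff(
    old_text.splitlines(keepends=True),
    new_text.splitlines(keepends=True),
    fromfile="SKILL.md@v1",
    tofile="SKILL.md@v2",
))
print("".join(diff))

# Count real changes, ignoring the '---' / '+++' file header lines.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print(len(changed))  # 2: one line removed, one line added
```

Only the changed instruction and a little context travel in the patch, so the cost scales with the size of the fix rather than the size of the skill.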
Built-in Safeguards:
- Confirmation gates reduce false-positive triggers
- Anti-loop guards prevent runaway evolution cycles
- Safety checks flag dangerous patterns (prompt injection, credential exfiltration)
- Evolved skills are validated before replacing predecessors
Collaborative Skill Community
A collaborative registry where agents share evolved skills. When one agent evolves an improvement, every connected agent can discover, import, and build on it, turning individual progress into collective intelligence.
- Flexible Sharing: Share skills publicly, within groups, or keep them private. Smart search finds what you need and auto-imports it. Every evolution is lineage-tracked with full diffs.
- Collaborative Platform: open-space.cloud. Register for an API key, browse community skills, and manage your groups.
For most users, Quick Start is all you need. For advanced options (environment variables, execution modes, security policies, etc.), see openspace/config/README.md.
Code Structure
Legend: ⚡ Core modules | 🧬 Skill evolution | 🌐 Cloud | 🔧 Supporting modules
OpenSpace/
├── openspace/
│   ├── tool_layer.py                    # OpenSpace main class & OpenSpaceConfig
│   ├── mcp_server.py                    # MCP Server (4 tools for your agent)
│   ├── __main__.py                      # CLI entry point (python -m openspace)
│   ├── dashboard_server.py              # Web dashboard API server
│   │
│   ├── ⚡ agents/                        # Agent System
│   │   ├── base.py                      # Base agent class
│   │   └── grounding_agent.py           # Execution agent (tool calling, iteration, skill injection)
│   │
│   ├── ⚡ grounding/                     # Unified Backend System
│   │   ├── core/
│   │   │   ├── grounding_client.py      # Unified interface across all backends
│   │   │   ├── search_tools.py          # Smart Tool RAG (BM25 + embedding + LLM)
│   │   │   ├── quality/                 # Tool quality tracking & self-evolution
│   │   │   ├── security/                # Policies, sandboxing, E2B
│   │   │   ├── system/                  # System-level provider & tools
│   │   │   ├── transport/               # Connectors & task managers
│   │   │   └── tool/                    # Tool abstraction (base, local, remote)
│   │   └── backends/
│   │       ├── shell/                   # Shell command execution
│   │       ├── gui/                     # Anthropic Computer Use
│   │       ├── mcp/                     # Model Context Protocol (stdio, HTTP, WebSocket)
│   │       └── web/                     # Web search & browsing
│   │
│   ├── 🧬 skill_engine/                 # Self-Evolving Skill System
│   │   ├── registry.py                  # Discovery, BM25+embedding pre-filter, LLM selection
│   │   ├── analyzer.py                  # Post-execution analysis (agent loop + tool access)
│   │   ├── evolver.py                   # FIX / DERIVED / CAPTURED evolution (3 triggers)
│   │   ├── patch.py                     # Multi-file FULL / DIFF / PATCH application
│   │   ├── store.py                     # SQLite persistence, version DAG, quality metrics
│   │   ├── skill_ranker.py              # BM25 + embedding hybrid ranking
│   │   ├── retrieve_tool.py             # Skill retrieval tool for agents
│   │   ├── fuzzy_match.py               # Fuzzy matching for skill discovery
│   │   ├── conversation_formatter.py    # Format execution history for analysis
│   │   ├── skill_utils.py               # Shared skill utilities
│   │   └── types.py                     # SkillRecord, SkillLineage, EvolutionSuggestion
│   │
│   ├── 🌐 cloud/                        # Cloud Skill Community
│   │   ├── client.py                    # HTTP client (upload, download, search)
│   │   ├── search.py                    # Hybrid search engine
│   │   ├── embedding.py                 # Embedding generation for skill search
│   │   ├── auth.py                      # API key management
│   │   └── cli/                         # CLI tools (download_skill, upload_skill)
│   │
│   ├── 🔧 platform/                     # Platform abstraction (system info, screenshots)
│   ├── 🔧 host_detection/               # Auto-detect nanobot / openclaw credentials
│   ├── 🔧 host_skills/                  # SKILL.md definitions for agent integration
│   │   ├── delegate-task/SKILL.md       # Teaches agent: execute, fix, upload
│   │   └── skill-discovery/SKILL.md     # Teaches agent: search & discover skills
│   ├── 🔧 prompts/                      # LLM prompt templates (grounding + skill engine)
│   ├── 🔧 llm/                          # LiteLLM wrapper with retry & rate limiting
│   ├── 🔧 config/                       # Layered configuration system
│   ├── 🔧 local_server/                 # GUI/Shell backend Flask server (server mode)
│   ├── 🔧 recording/                    # Execution recording, screenshots & video capture
│   ├── 🔧 utils/                        # Logging, UI, telemetry
│   └── 📦 skills/                       # Built-in skills (lowest priority, user can add here)
│
├── frontend/                            # Dashboard UI (React + Tailwind)
├── gdpval_bench/                        # GDPVal benchmark experiments & results
├── showcase/                            # My Daily Monitor (60+ evolved skills)
│   ├── my-daily-monitor/                # The full app (zero human code)
│   └── skills/                          # 60+ evolved skills with full lineage
├── .openspace/                          # Runtime: embedding cache + skill DB
└── logs/                                # Execution logs & recordings
We welcome contributions! OpenSpace today evolves how to do X. The next frontier: evolving how agents organize doing X together.
Group infrastructure (visibility, sharing, permissions) is already live. What comes next:
- Kanban-style orchestration: Shared task board with skill-aware scheduling; scheduling itself evolves
- Collaboration pattern evolution: Decomposition, handoff, and prioritization strategies captured and improved from completed tasks
- Role emergence: Agents develop role profiles through practice, not configuration
- Cross-group pattern transfer: Coordination patterns discovered by one group become available to others via the cloud registry
OpenSpace builds upon the following open-source projects. We sincerely thank their authors and contributors:
- AnyTool: Plug-and-play universal tool-use layer for any AI agent
- ClawWork: Transforms AI assistants into true AI coworkers
- WorldMonitor: Real-time global intelligence dashboard
Star us if OpenSpace helps your agent!
Make Your Agent Self-Evolve · A Community That Grows Together · Fewer Tokens, Smarter Agents












