MCP Code Execution - Enhanced Edition

99.6% Token Reduction through Skills-based CLI execution and progressive tool discovery for Model Context Protocol (MCP) servers.

Note: This project is optimized for Claude Code with native Skills framework support. While the core runtime works with any AI agent, the Skills framework (99.6% token reduction) is designed for Claude Code's operational intelligence.

🎯 What This Is

An enhanced implementation of Anthropic's Code Execution with MCP pattern, optimized for Claude Code, combining the best ideas from the MCP community and adding significant improvements:

Skills Framework: Pattern for creating reusable CLI-based workflows (99.6% token reduction) - Claude Code optimized
Multi-Transport: Full support for stdio, SSE, and HTTP MCP servers
Container Sandboxing: Optional rootless isolation with security controls
Type Safety: Pydantic models throughout with full validation
Production-Ready: 129 passing tests, comprehensive error handling

🤖 Claude Code Integration

The Skills framework is designed to work with Claude Code's operational intelligence:

Agents discover skills via filesystem (ls ./skills/)
Skills use CLI arguments (immutable templates)
Compatible with Claude Code's agent workflow
Supports Claude Code's progressive disclosure pattern

Note: Core runtime (script writing, 98.7% reduction) works with any AI agent. Skills framework (99.6% reduction) is Claude Code optimized.

🙏 Acknowledgments

This project builds upon and merges ideas from:

ipdelete/mcp-code-execution - Original implementation of Anthropic's PRIMARY pattern
- Filesystem-based progressive disclosure
- Type-safe Pydantic wrappers
- Schema discovery system
- Lazy server connections
elusznik/mcp-server-code-execution-mode - Production security patterns
- Container sandboxing architecture
- Comprehensive security controls
- Production deployment patterns

Our contribution: Merged the best of both, added Skills system with CLI-based execution, implemented multi-transport support, and refined the architecture for maximum efficiency.

✨ Key Enhancements

1. Skills Framework (NEW - 99.6% Token Reduction)

A pattern for creating reusable CLI-based workflow templates that agents execute with arguments:

# Simple example (generic)
uv run python -m runtime.harness skills/simple_fetch.py \
    --url "https://example.com"

# Pipeline example (generic)
uv run python -m runtime.harness skills/multi_tool_pipeline.py \
    --repo-path "." \
    --max-commits 5

Benefits over script writing:

18x fewer tokens: ~110 vs ~2,000 per workflow
24x faster: 5 seconds vs 2 minutes
Immutable templates: No file editing
Reusable workflows: Same logic, different data

What's included:

Framework pattern and template (CLI-based, immutable)
2 generic examples (simple_fetch.py, multi_tool_pipeline.py)

2. Multi-Transport Support (NEW)

Full support for all MCP transport types:

{
  "mcpServers": {
    "local-tool": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-git"]
    },
    "jina": {
      "type": "sse",
      "url": "https://mcp.jina.ai/sse",
      "headers": {"Authorization": "Bearer YOUR_KEY"}
    },
    "exa": {
      "type": "http",
      "url": "https://mcp.exa.ai/mcp",
      "headers": {"x-api-key": "YOUR_KEY"}
    }
  }
}

3. Container Sandboxing (Enhanced)

Optional rootless container execution with comprehensive security:

# Sandbox mode with security controls
uv run python -m runtime.harness workspace/script.py --sandbox

Security features:

Rootless execution (UID 65534:65534)
Network isolation (--network none)
Read-only root filesystem
Memory/CPU/PID limits
Capability dropping (--cap-drop ALL)
Timeout enforcement

🚀 Quick Start

Prerequisites

Claude Code (recommended for Skills framework support)
Python 3.11+ (3.14 not recommended due to anyio compatibility)
uv package manager
(Optional) Docker or Podman for sandboxing

Note: Skills framework (99.6% reduction) requires Claude Code. Core runtime (98.7% reduction) works with any AI agent.

Installation

# Clone repository
git clone https://github.com/yourusername/mcp-code-execution-enhanced.git
cd mcp-code-execution-enhanced

# Install dependencies
uv sync

# Verify installation
uv run python -c "from runtime.mcp_client import get_mcp_client_manager; print('✓ Ready')"

Configuration

Important for Claude Code Users: This project uses its own mcp_config.json for MCP server configuration, separate from Claude Code's global configuration (~/.claude.json). To avoid conflicts, you may want to disable MCP servers in Claude Code's configuration while using this project, or ensure they don't overlap.

Create mcp_config.json in the project root with your MCP servers:

{
  "mcpServers": {
    "git": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-git", "--repository", "."]
    },
    "fetch": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    }
  },
  "sandbox": {
    "enabled": false,
    "runtime": "auto",
    "image": "python:3.11-slim"
  }
}

Generate Tool Wrappers

# Auto-generate Python wrappers from your MCP servers
uv run mcp-generate

# This creates typed wrappers in ./servers/

📖 How It Works

PREFERRED: Skills-Based Execution (99.6% reduction)

For multi-step workflows (research, data processing, synthesis):

Discover skills: ls ./skills/ → see available skill templates and examples
Read documentation: cat ./skills/simple_fetch.py → see CLI args and pattern

Execute with parameters:

uv run python -m runtime.harness skills/simple_fetch.py \
    --url "https://example.com"
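The discover-then-read steps above can also be done programmatically. The following is a minimal sketch, not part of the project: `discover_skills` is a hypothetical helper that assumes each skill keeps its documentation in a module docstring, as the skill template in this README does. It parses files with `ast` rather than importing them, so discovery never executes skill code.

```python
import ast
from pathlib import Path


def discover_skills(skills_dir: str) -> dict[str, str]:
    """Map each skill file name to the first line of its module docstring."""
    summaries: dict[str, str] = {}
    for path in sorted(Path(skills_dir).glob("*.py")):
        # Parse the source instead of importing it, so nothing is executed.
        module = ast.parse(path.read_text(encoding="utf-8"))
        docstring = (ast.get_docstring(module) or "").strip()
        summaries[path.name] = docstring.splitlines()[0] if docstring else "(no docstring)"
    return summaries


# Example usage (hypothetical output depends on your skills/ directory):
# for name, summary in discover_skills("skills").items():
#     print(f"{name}: {summary}")
```

Returning only the first docstring line keeps the listing short; the full docstring can be read on demand once a skill is chosen.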

Example Skills (Framework Demonstrations):

Generic examples (skills/):

simple_fetch.py - Basic single-tool execution pattern
multi_tool_pipeline.py - Multi-tool chaining pattern

Note: Skills is a framework - use these examples as templates to create workflows for your specific MCP servers and use cases.

ALTERNATIVE: Direct Script Writing (98.7% reduction)

For simple tasks or novel workflows:

Explore tools: ls ./servers/ → discover available MCP tools
Write script: Create Python script using tool imports
Execute: uv run python -m runtime.harness workspace/script.py

Example script:

import asyncio
from runtime.mcp_client import call_mcp_tool

async def main():
    result = await call_mcp_tool(
        "git__git_log",
        {"repo_path": ".", "max_count": 10}
    )
    print(f"Fetched {len(result)} commits")
    return result

if __name__ == "__main__":
    asyncio.run(main())

🏗️ Architecture

Progressive Disclosure Pattern

Traditional Approach (High Token Usage):

Agent → MCP Server → [Full Tool Schemas 27,300 tokens] → Agent

Skills-Based (99.6% Reduction - PREFERRED):

Agent → Discovers skills → Reads skill docs → Executes with CLI args
Skill → Multi-server orchestration → Returns results
Tokens: ~110 (skill discovery + documentation)
Time: ~5 seconds

Script Writing (98.7% Reduction - ALTERNATIVE):

Agent → Discovers tools → Writes script
Script → MCP Server → Returns data
Agent → Processes/summarizes
Tokens: ~2,000 (tool discovery + script writing)
Time: ~2 minutes

Key Components

runtime/mcp_client.py: Lazy-loading MCP client manager with multi-transport support
runtime/harness.py: Dual-mode script execution (direct/sandbox)
runtime/generate_wrappers.py: Auto-generate typed wrappers from MCP schemas
runtime/sandbox/: Container sandboxing with security controls
skills/: CLI-based immutable workflow templates (2 generic examples included)
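To illustrate what "lazy-loading" means for the client manager, here is a minimal sketch of the general pattern. It is not the code in runtime/mcp_client.py: `LazyConnections` and the `connect` callable are hypothetical names, and the real manager may differ in how it caches and locks sessions.

```python
import asyncio
from typing import Awaitable, Callable


class LazyConnections:
    """Open each server connection on first use and reuse it afterwards."""

    def __init__(self, connect: Callable[[str], Awaitable[object]]) -> None:
        self._connect = connect  # hypothetical coroutine that opens a session
        self._sessions: dict[str, object] = {}
        self._lock = asyncio.Lock()

    async def get(self, server: str) -> object:
        # The lock prevents two concurrent callers from connecting twice.
        async with self._lock:
            if server not in self._sessions:
                self._sessions[server] = await self._connect(server)
            return self._sessions[server]
```

With this shape, servers that are configured but never called cost nothing at startup.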

🎓 Skills System

Philosophy

DON'T: Write scripts from scratch each time
DO: Use pre-written skills with CLI arguments

Creating Custom Skills

"""
SKILL: Your Skill Name

DESCRIPTION: What it does

CLI ARGUMENTS:
    --query    Research query (required)
    --limit    Max results (default: 10)

USAGE:
    uv run python -m runtime.harness skills/your_skill.py \
        --query "your question" \
        --limit 5
"""

import argparse
import asyncio
import sys

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--query", required=True)
    parser.add_argument("--limit", type=int, default=10)

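    # Drop the script path from the args before parsing.
    # Note: this also drops any other argument that ends in ".py".
    args_to_parse = [arg for arg in sys.argv[1:] if not arg.endswith(".py")]
    return parser.parse_args(args_to_parse)

async def main():
    args = parse_args()
    # Your workflow logic here; this placeholder result keeps the template runnable
    result = {"query": args.query, "limit": args.limit}
    return result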

if __name__ == "__main__":
    asyncio.run(main())

See skills/README.md for complete documentation.

🔌 Multi-Transport Support

stdio (Subprocess-based)

{
  "type": "stdio",
  "command": "uvx",
  "args": ["mcp-server-name"],
  "env": {"API_KEY": "your-key"}
}

SSE (Server-Sent Events)

{
  "type": "sse",
  "url": "https://mcp.example.com/sse",
  "headers": {"Authorization": "Bearer YOUR_KEY"}
}

HTTP (Streamable HTTP)

{
  "type": "http",
  "url": "https://mcp.example.com/mcp",
  "headers": {"x-api-key": "YOUR_KEY"}
}

See docs/TRANSPORTS.md for detailed information.

🔐 Sandbox Mode

Configuration

{
  "sandbox": {
    "enabled": true,
    "runtime": "auto",
    "image": "python:3.11-slim",
    "memory_limit": "512m",
    "timeout": 30
  }
}

Security Controls

Rootless execution: UID 65534:65534 (nobody)
Network isolation: --network none
Filesystem: Read-only root, writable tmpfs
Resource limits: Memory, CPU, PID constraints
Capabilities: All dropped (--cap-drop ALL)
Security: no-new-privileges, SELinux labels

See SECURITY.md for complete security documentation.
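For orientation, the controls listed in this section map onto standard Docker/Podman `run` flags roughly as follows. This is a sketch under assumptions, not the project's sandbox code: `build_sandbox_command`, the `/workspace/script.py` mount point, and the PID limit of 128 are illustrative choices, and SELinux labels are left out.

```python
from pathlib import Path


def build_sandbox_command(
    script: str,
    runtime: str = "docker",
    image: str = "python:3.11-slim",
    memory_limit: str = "512m",
    cpu_limit: str = "1.0",
    pids_limit: int = 128,  # assumed value, not taken from the project
) -> list[str]:
    """Build a container command that applies the listed security controls."""
    host_path = Path(script).resolve()  # bind mounts need an absolute path
    return [
        runtime, "run", "--rm",
        "--user", "65534:65534",               # rootless: nobody:nogroup
        "--network", "none",                   # no network access
        "--read-only",                         # read-only root filesystem
        "--tmpfs", "/tmp",                     # writable scratch space
        "--memory", memory_limit,
        "--cpus", cpu_limit,
        "--pids-limit", str(pids_limit),
        "--cap-drop", "ALL",
        "--security-opt", "no-new-privileges",
        "--volume", f"{host_path}:/workspace/script.py:ro",
        image, "python", "/workspace/script.py",
    ]
```

A caller could enforce the configured timeout by passing the list to `subprocess.run(cmd, timeout=30)`.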

🧪 Testing

# Run all tests (129 total)
uv run pytest

# Unit tests only
uv run pytest tests/unit/

# Integration tests (requires Docker/Podman for sandbox tests)
uv run pytest tests/integration/

# With coverage
uv run pytest --cov=src/runtime

📚 Documentation

README.md (this file) - Overview and quick start
CLAUDE.md - Quick reference for Claude Code
AGENTS.md.template - Template for adapting to other AI frameworks
skills/README.md - Skills system guide
skills/SKILLS.md - Complete skills documentation
docs/USAGE.md - Comprehensive user guide
docs/ARCHITECTURE.md - Technical architecture
docs/CONFIGURATION.md - MCP server configuration management (Claude Code vs project)
docs/TRANSPORTS.md - Transport-specific details
SECURITY.md - Security architecture and best practices

🛠️ Development

Code Quality

# Type checking
uv run mypy src/

# Formatting
uv run black src/ tests/

# Linting
uv run ruff check src/ tests/

Project Scripts

# Generate wrappers from tool definitions
uv run mcp-generate

# (Optional) Generate discovery config with LLM parameter generation
uv run mcp-generate-discovery

# (Optional) Execute safe tools and infer schemas
uv run mcp-discover

# Execute a script with MCP tools available
uv run mcp-exec workspace/script.py

# Execute in sandbox mode
uv run mcp-exec workspace/script.py --sandbox

📊 Efficiency Comparison

Approach	Tokens	Time	Use Case
Traditional	27,300	N/A	All tool schemas loaded upfront
Skills (NEW)	110	5 sec	Multi-step workflows (PREFERRED)
Script Writing	2,000	2 min	Novel workflows (ALTERNATIVE)

Skills achieve a 99.6% reduction relative to the 27,300-token traditional baseline, exceeding the 98.7% figure cited for Anthropic's script-writing pattern.
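The percentages can be checked with a few lines of arithmetic. In the sketch below, the 27,300, 110, and 2,000 token counts and the 5 second / 2 minute timings come from the table above. The 150,000-token baseline is an assumption and does not appear in this README; it is included only because 2,000 tokens against the table's own 27,300 baseline works out to roughly 92.7%, so the 98.7% figure presumably refers to a larger baseline.

```python
def reduction_pct(baseline: int, used: int) -> float:
    """Percentage of tokens saved relative to a baseline, to one decimal."""
    return round(100 * (1 - used / baseline), 1)


TRADITIONAL = 27_300  # tokens, from the comparison table

print(reduction_pct(TRADITIONAL, 110))    # skills: 99.6
print(reduction_pct(TRADITIONAL, 2_000))  # script writing vs this baseline: 92.7
print(reduction_pct(150_000, 2_000))      # assumed larger baseline: 98.7
print(round(2_000 / 110, 1))              # token ratio, scripts vs skills: 18.2
print((2 * 60) / 5)                       # speed ratio, 2 min vs 5 sec: 24.0
```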

🎨 What Makes This Enhanced

Beyond Original Projects

From ipdelete/mcp-code-execution:

✅ Filesystem-based progressive disclosure
✅ Type-safe Pydantic wrappers
✅ Lazy server connections
✅ Schema discovery system

From elusznik/mcp-server-code-execution-mode:

✅ Container sandboxing architecture
✅ Security controls and policies
✅ Production deployment patterns

Enhanced in this project:

⭐ Skills system: CLI-based immutable templates (99.6% reduction)
⭐ Multi-transport: stdio + SSE + HTTP support (100% server coverage)
⭐ Dual-mode execution: Direct (fast) + Sandbox (secure)
⭐ Python 3.11 stable: Avoiding 3.14 anyio compatibility issues
⭐ Comprehensive testing: 129 tests covering all features
⭐ Enhanced documentation: Complete guides for all features

Architecture Innovations

Skills vs Scripts:

Skills are immutable templates executed with CLI arguments
No file editing required (parameters via --query, --num-urls, etc.)
Reusable across different queries and contexts
Pre-tested and documented workflows

Multi-Transport:

Single codebase supports all transport types
Automatic transport detection
Unified configuration format
Seamless server connections
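One plausible way to implement automatic transport detection is sketched below. It is an illustration, not the project's logic: `detect_transport` is a hypothetical function, and the rule that a URL ending in `/sse` means SSE is an assumption drawn only from the example URLs in this README. Treating an entry with a `command` and no `type` as stdio matches the minimal configuration example.

```python
SUPPORTED = {"stdio", "sse", "http"}


def detect_transport(server: dict) -> str:
    """Return the transport for one mcpServers entry."""
    explicit = server.get("type")
    if explicit is not None:
        if explicit not in SUPPORTED:
            raise ValueError(f"unsupported transport type: {explicit!r}")
        return explicit
    if "command" in server:
        return "stdio"  # a subprocess command implies stdio
    url = server.get("url", "")
    if url:
        # Assumption: URLs ending in /sse are SSE endpoints, others streamable HTTP.
        return "sse" if url.rstrip("/").endswith("/sse") else "http"
    raise ValueError("server entry needs 'type', 'command', or 'url'")
```

An explicit `type` always wins in this sketch, so the URL heuristic only applies to entries that omit it.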

Dual-Mode Execution:

Direct mode: Fast, full access (development)
Sandbox mode: Secure, isolated (production)
Same code, different security postures
Runtime selection via flag or config

🔧 Configuration Reference

Minimal Configuration

{
  "mcpServers": {
    "git": {
      "command": "uvx",
      "args": ["mcp-server-git", "--repository", "."]
    }
  }
}

Complete Configuration

{
  "mcpServers": {
    "local-stdio": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-name"],
      "env": {"API_KEY": "key"},
      "disabled": false
    },
    "remote-sse": {
      "type": "sse",
      "url": "https://mcp.example.com/sse",
      "headers": {"Authorization": "Bearer KEY"},
      "disabled": false
    },
    "remote-http": {
      "type": "http",
      "url": "https://mcp.example.com/mcp",
      "headers": {"x-api-key": "KEY"},
      "disabled": false
    }
  },
  "sandbox": {
    "enabled": false,
    "runtime": "auto",
    "image": "python:3.11-slim",
    "memory_limit": "512m",
    "cpu_limit": "1.0",
    "timeout": 30,
    "max_timeout": 120
  }
}
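A configuration in this shape could be loaded and validated along these lines. This is a stdlib-only sketch under assumptions: the project itself uses Pydantic models, `SandboxSettings` and `load_config` are hypothetical names, the defaults simply mirror the values shown in the complete example, and rejecting a `timeout` above `max_timeout` is a guess at how the two fields relate.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SandboxSettings:
    enabled: bool = False
    runtime: str = "auto"
    image: str = "python:3.11-slim"
    memory_limit: str = "512m"
    cpu_limit: str = "1.0"
    timeout: int = 30
    max_timeout: int = 120


def load_config(text: str) -> tuple[dict[str, dict], SandboxSettings]:
    """Parse config JSON into enabled servers and sandbox settings."""
    raw = json.loads(text)
    # Skip servers explicitly marked as disabled.
    servers = {
        name: entry
        for name, entry in raw.get("mcpServers", {}).items()
        if not entry.get("disabled", False)
    }
    sandbox = raw.get("sandbox", {})
    unknown = set(sandbox) - {f.name for f in fields(SandboxSettings)}
    if unknown:
        raise ValueError(f"unknown sandbox keys: {sorted(unknown)}")
    settings = SandboxSettings(**sandbox)
    if settings.timeout > settings.max_timeout:  # assumed relationship
        raise ValueError("timeout exceeds max_timeout")
    return servers, settings
```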

📦 Features

Core Features

🦥 Lazy Loading: Servers connect only when tools are called
🔒 Type Safety: Pydantic models for all tool inputs/outputs
🔄 Defensive Coding: Handles variable MCP response structures
📦 Auto-generated Wrappers: Typed Python functions from MCP schemas
🛠️ Field Normalization: Handles inconsistent API casing
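As an example of what field normalization can look like, the sketch below converts camelCase and kebab-case keys to snake_case throughout a nested response. It is illustrative only: `to_snake_case` and `normalize_keys` are hypothetical helpers, and the project's own normalization rules may differ.

```python
import re

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def to_snake_case(key: str) -> str:
    """Convert camelCase, PascalCase, or kebab-case to snake_case."""
    return _BOUNDARY.sub("_", key).replace("-", "_").lower()


def normalize_keys(value):
    """Recursively normalize dict keys; leave non-container values untouched."""
    if isinstance(value, dict):
        return {to_snake_case(k): normalize_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize_keys(item) for item in value]
    return value
```

Normalizing once at the boundary lets the typed wrappers rely on a single casing convention.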

Enhanced Features

🎯 Skills Framework: Pattern for CLI-based reusable workflows
🔌 Multi-Transport: stdio, SSE, and HTTP support
🔐 Container Sandboxing: Optional rootless isolation
🧪 Comprehensive Testing: 129 tests with full coverage
📖 Complete Documentation: Guides for every feature

🎓 Examples

See the examples/ directory for:

example_progressive_disclosure.py - Classic token reduction pattern
example_tool_chaining.py - LLM orchestration pattern
example_sandbox_usage.py - Container sandboxing demo
example_sandbox_simple.py - Basic sandbox usage

See the skills/ directory for the example skill templates.

🐛 Troubleshooting

Common Issues

"MCP server not configured"

Check mcp_config.json server names match your calls

"Connection closed"

Verify server command: which <command>
Check server logs for startup errors

"Module not found"

Run uv run mcp-generate to regenerate wrappers
Ensure src/ is in PYTHONPATH (harness handles this)

Import errors in skills

Skills must be run via harness (sets PYTHONPATH)
Don't run skills directly: python skills/skill.py ❌
Correct: uv run python -m runtime.harness skills/skill.py ✅
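To show why running through a harness avoids these import errors, here is a rough sketch of the general technique using the standard library's `runpy`. It is not the code in runtime/harness.py: `run_script` is a hypothetical function, and forwarding only the arguments after the script path is a choice made for this sketch, not necessarily how the real harness sets `sys.argv`.

```python
import runpy
import sys
from pathlib import Path


def run_script(script: str, script_args: list[str], src_dir: str = "src") -> dict:
    """Run a script as __main__ with src_dir importable and CLI args forwarded."""
    saved_argv, saved_path = sys.argv[:], sys.path[:]
    try:
        # Make packages under src_dir importable for the duration of the run.
        sys.path.insert(0, str(Path(src_dir).resolve()))
        sys.argv = [script, *script_args]
        # run_path executes the file and returns its global namespace.
        return runpy.run_path(script, run_name="__main__")
    finally:
        # Restore interpreter state so repeated runs do not leak settings.
        sys.argv[:] = saved_argv
        sys.path[:] = saved_path
```

Running `python skills/skill.py` directly skips the path setup, which is why imports of the runtime package fail in that case.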

Python Version Issues

Python 3.14 compatibility:

Not recommended: anyio versions below 4.9.0 have compatibility issues with Python 3.14's asyncio changes
Use Python 3.11 or 3.12 for stability
See issue tracker for updates

🤝 Contributing

We welcome contributions! Areas of interest:

New skills: Add more workflow templates
MCP server support: Test with different servers
Documentation: Improve guides and examples
Testing: Expand test coverage
Performance: Optimize token usage further

Development Setup

# Install with dev dependencies
uv sync --all-extras

# Run quality checks
uv run black src/ tests/
uv run mypy src/
uv run ruff check src/ tests/
uv run pytest

📄 License

MIT License - see LICENSE file for details

🔗 References

Original Projects

ipdelete/mcp-code-execution - Anthropic's PRIMARY pattern
elusznik/mcp-server-code-execution-mode - Production security

MCP Resources

Python Resources

🌟 Features Comparison

Feature	Original (ipdelete)	Bridge (elusznik)	Enhanced (this)
Progressive Disclosure	✅ PRIMARY	⚠️ ALTERNATIVE	✅ PRIMARY
Token Reduction	98.7%	~95%	99.6%
Type Safety	✅ Pydantic	⚠️ Basic	✅ Enhanced
Sandboxing	❌ None	✅ Required	✅ Optional
Multi-Transport	❌ stdio only	❌ stdio only	✅ stdio/SSE/HTTP
Skills Framework	❌ None	❌ None	✅ Yes + examples
CLI Execution	❌ None	❌ None	✅ Immutable
Test Coverage	⚠️ Partial	⚠️ Partial	✅ Comprehensive
Python 3.11	✅ Yes	⚠️ 3.12+	✅ Stable

💡 Use Cases

Perfect For

✅ AI agents needing to orchestrate multiple MCP tools
✅ Research workflows (web search → read → synthesize)
✅ Data processing pipelines (fetch → transform → output)
✅ Code discovery (search → analyze → recommend)
✅ Production deployments requiring security isolation
✅ Teams needing reproducible research workflows

Not Ideal For

❌ Single tool calls (use MCP directly instead)
❌ Real-time interactive tools (better suited for direct integration)
❌ GUI applications (command-line focused)

🚦 Getting Started Checklist

Install Python 3.11+ and uv
Clone repository
Run uv sync
Create mcp_config.json with your MCP servers
Run uv run mcp-generate to create wrappers
Try a skill: uv run python -m runtime.harness skills/simple_fetch.py --url "https://example.com"
Read CLAUDE.md for the operational guide (or adapt AGENTS.md.template for other AI frameworks)
Explore skills/ for available workflows
Review docs/ for detailed documentation

❓ FAQ

Q: Why Skills instead of writing scripts? A: Skills achieve 99.6% token reduction vs 98.7% for scripts, and execute 24x faster (5 sec vs 2 min). They're pre-tested, documented, and immutable.

Q: Can I use this without Claude Code? A: Yes, but with limitations. The core runtime (script writing, 98.7% reduction) works with any AI agent. The Skills framework (99.6% reduction) is optimized for Claude Code's operational intelligence.

Q: Can I still write custom scripts? A: Yes! Skills are PREFERRED for common workflows (with Claude Code), but custom scripts are fully supported for novel use cases and other AI agents.

Q: What's the difference from the original projects? A: We merged the best of both (progressive disclosure + security), added Skills system, multi-transport support, and refined the architecture.

Q: Why Python 3.11 instead of 3.14? A: anyio <4.9.0 has compatibility issues with Python 3.14's asyncio changes. 3.11 is stable and well-tested.

Q: Is sandboxing required? A: No, it's optional. Use direct mode for development (fast), sandbox mode for production (secure).

Q: How do I add my own MCP servers? A: Add them to mcp_config.json, run uv run mcp-generate, and they're ready to use!

🎯 Next Steps

Explore Skills: ls skills/ and cat skills/simple_fetch.py
Try examples: Run the example skills or create your own
Read CLAUDE.md: Quick operational guide (for Claude Code users)
Review docs/: Deep dive into architecture
Create custom skill: Follow the template for your use case

yoloshii/mcp-code-execution-enhanced
