A local code-executing deep agent with Progressive Disclosure Skills. This agent can execute Python scripts, process data files, and dynamically load specialized capabilities without bloating its context window.
- Local Code Execution: Run shell commands and Python scripts on your machine
- Progressive Disclosure: Skills are loaded on-demand, not all at startup
- File-Based Skills: Capabilities defined in SKILL.md files with scripts and docs
- Safety Rails: Human-in-the-loop approval for command execution and file edits
- Specialized Skills:
  - CSV Analytics: Efficiently process large CSV files
  - PDF Processing: Extract form fields and text from PDFs
This project demonstrates:
- Custom DockerExecutionBackend for isolated command execution with consistent filesystem paths
- Custom SkillsMiddleware for progressive capability disclosure
- Skills-based architecture where tools are files, not hardcoded Python functions
- Docker-based execution environment where /data, /scripts, and /results paths work consistently in both file operations and executed scripts
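As a rough illustration of isolated command execution, the sketch below shows one way a backend could forward a shell command to a running container with `docker exec`. It is not the project's DockerExecutionBackend: the function names, the default working directory, and the timeout are assumptions made for this example, and the container name is the one used in the setup steps.

```python
import subprocess

# Container name taken from the setup steps; everything else here is illustrative.
CONTAINER = "code-execution-agent"


def build_exec_argv(command: str, workdir: str = "/workspace") -> list[str]:
    """Build the argv that runs a shell command inside the container."""
    return ["docker", "exec", "-w", workdir, CONTAINER, "sh", "-c", command]


def run_in_container(command: str, timeout: float = 60.0) -> tuple[int, str]:
    """Run the command in the container and return (exit_code, combined output)."""
    proc = subprocess.run(
        build_exec_argv(command),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout + proc.stderr
```

Because the command is passed as a single string to `sh -c` inside the container, any path in it is resolved against the container's filesystem, not the host's.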
- Python 3.11 or higher
- uv - Modern, fast Python package manager
- Docker Desktop installed and running
- Anthropic API key for Claude models
- Install uv (if you haven't already):
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
- Clone the repository:
git clone https://github.com/jfglanc/code-execution-deep-agent.git
cd code-execution-deep-agent
- Install dependencies:
uv sync --dev
This automatically:
  - Creates a virtual environment in .venv/
  - Installs the project in editable mode
  - Installs all dependencies including LangGraph CLI
  - Generates uv.lock for reproducible builds
- Set up your API key:
cp .env.example .env
Then edit .env and add your Anthropic API key:
ANTHROPIC_API_KEY=sk-ant-...
# Optional: Enable LangSmith tracing
LANGSMITH_API_KEY=lsv2...
- Build the Docker image:
docker build -f libs/backends/docker/Dockerfile -t code-execution-agent:latest .
This creates a container image with Python 3.11 and pre-installed data science packages (pandas, numpy, matplotlib, etc.).
- Start the execution container:
docker run -d --name code-execution-agent -v "$(pwd)/workspace:/workspace" code-execution-agent:latest
On Windows (PowerShell), use:
docker run -d --name code-execution-agent -v "${PWD}/workspace:/workspace" code-execution-agent:latest
Note: The container must be running for the agent to execute commands. See docs/docker-setup.md for troubleshooting and management.
- Generate sample data (optional, for demos):
uv run python workspace/data/generate_sample_data.py
This creates:
  - workspace/data/orders.csv: 10,000 sample order records
  - workspace/data/sample_form.pdf: PDF with fillable form fields
- Start the LangGraph server:
uv run langgraph dev
This starts the agent server with:
  - Development UI at http://localhost:8123
  - API server for agent interactions
  - Hot reload on code changes
- Open the UI and start chatting:
Navigate to http://localhost:8123 in your browser to interact with the agent.
> What are the top 5 orders by amount in /workspace/data/orders.csv?
The agent will:
- Read the csv-analytics SKILL.md
- Execute the filter_high_value.py script
- Return a summary of the top 5 orders
> Extract the form fields from /workspace/data/sample_form.pdf
The agent will:
- Read the pdf-processing SKILL.md
- Execute the extract_forms.py script
- Return the extracted field names and values
The agent can also handle general questions without using skills:
> What is the difference between pandas and numpy?
libs/
├── backends/
│   └── docker/
│       ├── backend.py          # DockerExecutionBackend implementation
│       ├── Dockerfile          # Container definition
│       └── README.md           # Docker backend documentation
└── middleware/
    └── skills.py               # SkillsMiddleware implementation
agent/
├── config.py                   # Configuration and setup
├── prompt.py                   # System prompt
└── graph.py                    # Agent graph and main entry point
skills/
├── csv-analytics/
│   ├── SKILL.md                # Skill definition and usage
│   ├── scripts/                # Python scripts for CSV processing
│   └── docs/                   # Supporting documentation
└── pdf-processing/
    ├── SKILL.md                # Skill definition and usage
    ├── scripts/                # Python scripts for PDF extraction
    └── docs/                   # Supporting documentation
workspace/
└── data/                       # Sample and working data files
tests/
├── test_execute_backend.py     # Unit tests for DockerExecutionBackend
├── test_skills_middleware.py   # Unit tests for SkillsMiddleware
├── test_e2e_csv_flow.py        # End-to-end CSV workflow tests
└── test_e2e_pdf_flow.py        # End-to-end PDF workflow tests
Instead of loading all skill documentation into the agent's context at startup, the agent:
- Startup: Sees only skill names and brief descriptions in the system prompt
- Discovery: When a query matches a skill, reads that skill's SKILL.md file
- Execution: Follows the SKILL.md instructions to run scripts via the execute tool
- Efficiency: Only loads what's needed, keeping token usage low
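The startup step above can be sketched roughly as follows. This is a minimal illustration, not the project's SkillsMiddleware: the function names are hypothetical, and the frontmatter parser assumes only simple `key: value` lines between `---` markers. Only each skill's name, description, and path go into the prompt text. The body of SKILL.md is left on disk until the agent chooses to read it.

```python
from pathlib import Path


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse simple 'key: value' pairs between the leading '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def list_skills(skills_dir: Path) -> list[dict[str, str]]:
    """Collect name, description, and virtual path for each skill directory."""
    skills = []
    for skill_md in sorted(skills_dir.glob("*/SKILL.md")):
        meta = parse_frontmatter(skill_md.read_text(encoding="utf-8"))
        if "name" in meta and "description" in meta:
            skills.append({
                "name": meta["name"],
                "description": meta["description"],
                # Assumed virtual path under the /skills/ route.
                "path": f"/skills/{skill_md.parent.name}/SKILL.md",
            })
    return skills


def skills_prompt_section(skills: list[dict[str, str]]) -> str:
    """Render the short listing that would be appended to the system prompt."""
    lines = ["Available skills (read the SKILL.md before using one):"]
    for skill in skills:
        lines.append(f"- {skill['name']}: {skill['description']} ({skill['path']})")
    return "\n".join(lines)
```

With this shape, the prompt cost grows by about one line per skill, regardless of how long each SKILL.md or its supporting docs are.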
The agent uses a CompositeBackend:
- Default: DockerExecutionBackend (workspace/) - read-write + execute in container
- Route /skills/: FilesystemBackend (skills/) - read-only on host
This separation ensures:
- Skills are protected from accidental modification
- Commands execute in isolated Docker environment
- Workspace is fully accessible for data processing
- Clear distinction between capabilities and working data
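The routing idea can be illustrated with a small prefix-based dispatcher. The classes below are hypothetical stand-ins written for this example, not the CompositeBackend, DockerExecutionBackend, or FilesystemBackend used by the project. They only show how a path prefix such as /skills/ might select a read-only backend while everything else falls through to the default.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBackend:
    """Hypothetical stand-in for a real backend; stores files in a dict."""
    name: str
    read_only: bool = False
    files: dict[str, str] = field(default_factory=dict)

    def read(self, path: str) -> str:
        return self.files[path]

    def write(self, path: str, content: str) -> None:
        if self.read_only:
            raise PermissionError(f"{self.name} is read-only: {path}")
        self.files[path] = content


class PrefixRouter:
    """Send each path to the backend with the longest matching prefix."""

    def __init__(self, default: InMemoryBackend, routes: dict[str, InMemoryBackend]):
        self.default = default
        # Longest prefix first so more specific routes win.
        self.routes = sorted(routes.items(), key=lambda kv: len(kv[0]), reverse=True)

    def resolve(self, path: str) -> InMemoryBackend:
        for prefix, backend in self.routes:
            if path.startswith(prefix):
                return backend
        return self.default

    def read(self, path: str) -> str:
        return self.resolve(path).read(path)

    def write(self, path: str, content: str) -> None:
        self.resolve(path).write(path, content)
```

In this sketch a write under /skills/ raises PermissionError, while a write anywhere else lands in the default backend.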
The agent requires user approval for:
- Command execution (execute tool): Review commands before they run
- File edits (edit_file tool): Preview changes before applying
This prevents accidental destructive operations while maintaining flexibility.
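A minimal sketch of such an approval gate is shown below. It is an illustration only: the `run_tool` wrapper and the `approve` callback are hypothetical names, and the project's actual human-in-the-loop mechanism may work differently. The two gated tool names are the ones listed above.

```python
from typing import Any, Callable

# Tools that need approval, as listed above.
GATED_TOOLS = {"execute", "edit_file"}


def run_tool(
    name: str,
    args: dict[str, Any],
    tool_fn: Callable[..., Any],
    approve: Callable[[str, dict[str, Any]], bool],
) -> dict[str, Any]:
    """Call tool_fn, asking the approve callback first when the tool is gated."""
    if name in GATED_TOOLS and not approve(name, args):
        # The tool function is never invoked on rejection.
        return {"status": "rejected", "tool": name}
    return {"status": "ok", "tool": name, "result": tool_fn(**args)}
```

In an interactive setting, `approve` could show the pending command or diff to the user and return their decision. Tools outside `GATED_TOOLS` run without a prompt.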
Run all tests:
uv run pytest
Run specific test suites:
# Unit tests only
uv run pytest tests/test_execute_backend.py tests/test_skills_middleware.py
# Integration tests (requires ANTHROPIC_API_KEY)
uv run pytest tests/test_e2e_csv_flow.py tests/test_e2e_pdf_flow.py -m integration
To add a new skill:
- Create a directory under skills/:
mkdir -p skills/my-skill/scripts skills/my-skill/docs
- Create SKILL.md with frontmatter:
---
name: my-skill
description: Brief description of what this skill does
---
# My Skill
Detailed usage instructions...
## Scripts
- my_script.py: What it does and how to use it

- Add scripts under scripts/:
#!/usr/bin/env python3
# Your script here

- Add supporting docs under docs/ (optional)
The agent will automatically discover and use your skill!
For detailed architecture information, see docs/architecture.md.
MIT License - See LICENSE file for details
By Jan Franco Glanc Gomez