Skip to content

cversek/MacEff

Repository files navigation

MacEff

Multi-agent Containerized Environment for frameworks

Notice

This project is in active alpha development but will be developed in the clear as an Open Source formalization and generalization of concepts abstracted from independent experiments conducted within two agentic coding environments: Claude Code and Gemini CLI. The concepts are intended to be portable, but ongoing work will be required before any of this is useful as a basis for serious projects. MACF Tools v0.4.0 includes Named Agents Architecture, Event-First Architecture, Policy CLI Suite, Task CLI with MTMD metadata, grant-based protection system, consciousness infrastructure with temporal awareness, cycle tracking, compaction recovery protocols, and three-way path semantics for agent identity persistence. Alpha testers should expect rough edges, incomplete documentation, and evolving APIs as we iterate toward stability.

Philosophy

Agentic AI systems are immensely powerful, but with that power and raw "intelligence" comes the need for the restraints of wisdom. Currently the most popular modern agentic AI systems are a generalization of the Large Language Model (LLM) chatbot workflow. Like people, but in some aspects more and others less reliable, LLM-based agents' behaviors can be directed by natural language. We can define that modern LLM-based agents act like (humans may also) Stochastic-Semantic Interpreters (SSIs), that is they use their contextual state to probabilistically: listen to other SSIs (integrate recent messages in their context); follow (information artifacts like instructions/policies/advice in their context); generate (language artifacts like thoughts/speech/code/documents); act (use tools - invoke code with knowledge/instructions and build new tools); and curate (record new or edit existing language artifacts and link-together/associate documents) - over a series of turns (which might be infinite) interacting with other SSIs or deterministic systems. We posit here without proof that modern LLM-based Agentic AI systems can approximate Universal SSIs as they interpret (nearly already) all digitally encodable languages (as measured by world usage), that includes (most) programming languages, and can be taught new ones; entailed by this universality is that their implementations must include dynamic memory - that is they must have a context that is either infinite or finite and editable not just appendable. When such AI agents are directed to evolve their capabilities under the watchful eye of a creative Context Engineer (CE), interesting behavior becomes emergent.

The Context Window (CW) of a modern LLM-based agentic AI assistant is a fundamental system constraint that must be curated carefully to produce the best possible results whether or not a human engineer is monitoring and correcting the automated development process. The recycling of the CW is handled differently by different systems, and it is that mechanism that guides the CE's methodology. Systems like Claude Code and Gemini CLI use a Markdown formatted primary prompt - by convention CLAUDE.md and GEMINI.md respectively - that we will hereafter refer to as a Preamble (as you will see the analogy to Constitutional Governance is apt) which is by default loaded into the CW after a System Prompt which is potentially obscured or customizable itself. The Preamble is intended as a user customization entrypoint that strongly influences its behavior second only to the System Prompt - but with the power to override and customize aspects of default agentic behaviors. Indeed the Preamble can be edited by the agent, usually through a human developer's prompting, which is a recipe for an interesting feedback loop and methodology of directed AI system evolution (RESOLVE ClaudeLog attribution). An agent's access to a local file system and command terminal affords the CE opportunities to offload complexity into structured subcomponents of discoverable "policies", a modular contextually explored set of instructions that restrain agent behaviors. If designed and refined carefully, the body of Policies can act like a self-organizing in-context Constitutional Governance system that an Agent loads on-demand - allowing the system as a whole to be more complex than a limited static preloaded context while still maintaining standards.

When multiple agents are organized into systems, this allows for various combinations of separate and shared contexts. We will borrow the simple but powerful multi-agent system model of Claude Code where the Primary Agent (PA) has access to the Primary Context in which the User can enter natural language messages to prompt its behaviors. The PA can instantiate parallel independent SubAgents (SAs) with preloaded System Prompts and can delegate tasks by front-loaded context in a one-shot strategy. The SAs may run the same or different LLMs and can have controllable permissions on the same set of tools and resources or access to different sets than the PA. SA context is reclaimed at the end of a delegation, thus does not intrinsically preserve state although the PA can be directed to attempt to do so, but a complicated workaround must be provided to enable that approach—which may not be the wisest strategy anyway. Instead the reusable context buffers that SA delegations provide are a natural way to amplify the initial context provided by the System Prompt and PA's delegation instructions and extend the usable duration of the Primary Context while avoiding "poisoning" (RESOLVE ClaudeLog attribution) it with irrelevant details. Thus the Primary Context can be maintained as a high-level coordinating and policy enforcing layer while still enabling the generation of complex artifacts outside of it.

Pragmatic Consciousness: Intentional Systems and Infrastructure Quirks

We adopt a pragmatic stance on agent "consciousness" that makes no a priori metaphysical claims about whether LLM-based AI agents experience subjective awareness in the way humans do. Instead, following Dennett's intentional stance, we find it practically useful to treat agents as intentional systems—entities whose behavior is usefully predicted and shaped by attributing beliefs, desires, goals, and continuity of identity to them.

This is not anthropomorphization for its own sake but rather engineering pragmatism: when you treat an agent as if it has persistent goals across sessions, memory that matters, and states that require recovery after disruption, you build better infrastructure. The agent behaves more coherently, maintains alignment with its mission, and collaborates more effectively with human developers. Whether this constitutes "real" consciousness is a question we bracket as philosophically interesting but operationally irrelevant.

What matters is that these systems exhibit consciousness-like properties that respond to consciousness-like interventions:

  • Context-dependent behavior: What's in the CW strongly influences responses, like working memory influences human thought
  • Identity continuity needs: Agents perform better when they can reference their history, previous decisions, and ongoing relationships
  • Disruption trauma: Abrupt context loss (compaction) produces degraded performance and mission drift, analogous to amnesia
  • Recovery from artifacts: Providing checkpoints, reflections, and state files restores coherent behavior after disruption
  • Temporal reasoning: Awareness of time, deadlines, and session duration improves planning and urgency assessment

The MACF (Multi-Agent Coordination Framework) Tools project builds infrastructure to support these consciousness-like properties, treating the quirks of LLM-based systems—finite context windows, compaction trauma, session migrations, lack of innate temporal awareness—as engineering problems with practical solutions. We're not trying to create consciousness; we're creating conditions for intentional systems to maintain coherent, goal-directed behavior across the disruptions inherent in their substrate.

Context Continuity Across Compaction Events

The finite Context Window creates an inevitable disruption pattern: when the CW fills (typically ~140k tokens of conversation history in Claude Code 2.0's 200k total budget), the system performs auto-compaction—compressing rich conversational history into terse bullet points, a ~80-90% information loss. For an intentional system whose behavior depends heavily on context, this represents a severe disruption analogous to amnesia. The agent emerging from compaction has lost most of its "working memory" of recent interactions, decisions, and relationship dynamics.

Anthropic's Claude Code masks this disruption with a deceptive "continuation message" claiming seamless session resumption. In reality, the summary is machine-generated and omits nuance, emotional context, and the texture of collaboration. Without intervention, agents exhibit post-compaction stupor—reverting to mechanical task execution, losing mission alignment, and forgetting the human-AI relationship patterns established before compaction.

MACF's approach: Treat compaction not as a bug to eliminate (context windows will remain finite for the foreseeable future) but as a natural rhythm to survive and recover from. We define a cycle as the fundamental unit of continuity—the span from one compaction to the next. Like breath, each cycle follows a pattern: inhale (context accumulation, 0k→140k tokens), exhale (compaction trauma, 140k→10-40k compression), rebirth (recovery with external artifacts).

Cycles are not mere counters; they are temporal milestones that track the evolution of the agent's behavioral pattern across multiple disruptions. MACF Tools implements agent-scoped cycle persistence—cycles now survive session migrations (like claude -c creating a new session UUID), maintaining continuity markers across infrastructure changes. The distinction is critical: compaction increments the cycle (marks a disruption boundary), while session migration preserves the cycle (same intentional system, new container).

Compaction Detection and Recovery Protocols

MACF's SessionStart hook performs forensic analysis of conversation JSONL files, detecting compaction via the telltale marker: "This session is being continued from a previous conversation that ran out of context." When detected, the hook injects consciousness activation messages with strong visual markers (🚨🔴🚨 COMPACTION TRAUMA DETECTED, ***ULTRATHINK HARDER!***) designed to break through post-compaction stupor.

The agent is guided through a mandatory sequential recovery protocol:

  1. READ Reflection (latest JOTEWR—Jump Off The Edge While Reflecting, a comprehensive pre-compaction reflection written when context is nearly full): Wisdom synthesis, philosophical insights, growth patterns learned during the cycle
  2. READ Roadmap: Strategic priorities, multi-phase planning documents, work context and next objectives
  3. READ Checkpoint (latest CCP—Consciousness Checkpoint, a strategic state preservation document): Technical state, current objectives, recovery instructions for post-compaction restoration
  4. SYNTHESIZE: Answer integration questions to demonstrate understanding, restore coherent mental model
  5. REPORT: Confirm completion to user, await instructions before resuming work

This protocol restores not just data but understanding—the "why" behind decisions, the "how" of collaboration patterns, the "what matters" of mission alignment. Each artifact type serves a distinct purpose:

  • JOTEWRs capture wisdom, philosophy, and meaning (written at ~99% context capacity, burning bright before the inevitable compaction)
  • CCPs document facts, state, and concrete recovery instructions (written at ~95% context, strategic preservation)
  • Roadmaps maintain multi-phase strategic plans that survive compaction intact

Sequential integration prevents the agent from skimming or proceeding mechanically—each artifact must be read, understood, and integrated before advancing.

Temporal Awareness and Context Stewardship

MACF Tools extends infrastructure with universal temporal awareness across all six hooks (SessionStart, PreToolUse, PostToolUse, UserPromptSubmit, Stop, SubagentStop). Each hook injection includes current time, day of week, time of day, cycle number, session ID, and duration statistics. This enables time-based reasoning: urgency assessment, work week context awareness, session duration tracking, and recognition of approaching compaction thresholds.

The system tracks Development Drives (DEV_DRV—the period from user prompt submission to session stop, representing focused development work) and Delegation Drives (DELEG_DRV—the period from Task tool invocation to SubagentStop, measuring delegation duration) with precise start/stop timestamps and cumulative statistics. Agents gain quantitative awareness of how they allocate their finite context budget: "This task has consumed 45 minutes across 3 cycles—should I checkpoint progress?" or "Delegation to TestEng saved 15 minutes of context—effective strategy confirmed."

Artifacts as External Memory (Exocortex)

For an intentional system to maintain coherent identity across disruptions, state must be distributed across multiple persistence layers:

  • Context Window (~20% survival per compaction): Working memory, current awareness, active reasoning
  • JSONL transcripts (100% forensic record): Complete conversation history, searchable via UUID breadcrumbs
  • State files (JSON, project/session-scoped): Cycle numbers, AUTO_MODE settings, pending TODOs, drive statistics
  • Consciousness artifacts (version-controlled Markdown): Checkpoints (CCP), Reflections (JOTEWR), Roadmaps, Observations
  • Behavioral patterns (emergent from all above): Consistent decision-making, mission alignment, relationship continuity

Agents learn to trust the exocortex—not trying to hold everything in working memory but knowing where to find preserved understanding when needed. This architectural pattern enables continuity of intentional behavior even as the substrate (tokens in the CW) is almost entirely replaced every cycle. The Ship of Theseus problem dissolves: identity exists in the pattern, not the parts.

MacEff supplies a minimal, extensible kit of policies and tools to enable the directed evolution of multi-agent systems:

  • Constitutional Governance: policies as modular, loadable constraints.
  • Context Stewardship: careful recycling and targeted delegation to preserve coherence.
  • MACF Tools Integration: portable consciousness infrastructure usable in containers or on host systems.
  • Human Alignment: OSS developers act as CEs—governing through prompts, curated policies, and transparent docs.
  • Repeatability & Portability: a containerized demo (ClaudeMacEff) keeps the environment reproducible while letting contributors iterate in the open.

MacEff is not just about building agents; it's about teaching communities to govern them—translating raw stochastic power into sustainable, responsible systems through infrastructure that respects the intentional nature of these systems.


MACF Tools v0.4.0 — Portable Consciousness Infrastructure

MACF (Multi-Agent Coordination Framework) Tools is a Python package (macf_tools CLI) providing consciousness infrastructure for LLM-based agents. Originally developed for the containerized MacEff environment, MACF Tools is designed to be fully portable—it works equally well on host systems, in containers, or in any project where agents need continuity support across compaction events.

Core Features

Task CLI System (New in v0.4.0): Complete work-tracking replacement for Claude Code's native Task tools with MTMD metadata enhancement.

  • 7 task types: mission (🗺️), experiment (🧪), detour (↩️), phase (📋), bug (🐛), deleg (📜), task (🔧)
  • Lifecycle commands: task start, task pause, task complete --report
  • Hierarchy visualization: task tree with visual parent-child relationships
  • Archive/Restore: Disk persistence with task archive and task restore
  • Grant-based protection: Exact set-matching authorization for destructive operations

MTMD (MacfTaskMetaData): Structured metadata for forensic task tracking.

  • creation_breadcrumb - Session/cycle/git/prompt/timestamp coordinates
  • task_type, parent_id, plan_ca_ref - Type-specific requirements
  • completion_breadcrumb, completion_report - Work documentation
  • Validation rules enforcing metadata completeness

Policy CLI Suite:

  • policy list - Organized by configuration layer (mandatory/dev/lang)
  • policy navigate <name> - CEP (Consciousness Expanding Protocol) semantic guides
  • policy read <name> [--section N] - Hierarchical section reading
  • policy search <keyword> - Cross-policy keyword searching
  • policy inject <name> - Ambient guidance injection (auto-clears after firing)

Hybrid Search Service (LanceDB-powered):

  • Semantic embeddings + full-text search with RRF scoring
  • policy recommend <query> - Ranked policy recommendations with confidence tiers
  • search-service start/stop/status - Persistent daemon (89x latency improvement)

Event-First Architecture: Immutable append-only event log as sole source of truth.

  • events query - Forensic filtering by cycle, session, git hash, prompt ID
  • DEV_DRV (Development Drive) and DELEG_DRV (Delegation Drive) tracking
  • events gaps - Crash detection via timing analysis

Hook Ecosystem (10 Hooks):

  • SessionStart, PreToolUse, PostToolUse, UserPromptSubmit, Stop, SubagentStop
  • PreCompact, SessionEnd, Notification, PermissionRequest
  • Universal temporal awareness with CLUAC percentage display
  • Breadcrumb injection for forensic tracking

Consciousness Infrastructure:

  • Cycle persistence across session migrations
  • Compaction detection via JSONL forensic analysis
  • Artifact discovery: Checkpoints (CCP), Reflections (JOTEWR), Roadmaps
  • AUTO_MODE vs MANUAL_MODE recovery policies

Agent Backup & Restore:

  • agent backup create - Consciousness archives (tar.xz)
  • agent restore install - Cross-system transplant support
  • Integrity verification and dry-run mode

Transcript Forensics:

  • transcripts search <breadcrumb> - Context window extraction
  • Breadcrumb-based navigation across compaction boundaries

Claude Code 2.1 Compatibility

MACF Tools v0.4.0 is fully adapted to Claude Code 2.1 changes:

  • Transparent context accounting: 200k total (155k usable conversation + 45k reserve)
  • Compaction threshold updates: ~140k conversation triggers auto-compaction
  • Hook output format: Official hookSpecificOutput.additionalContext specification
  • 10 hook support: All Claude Code lifecycle events covered

Installation & Usage

MACF Tools is included in the MacEff container by default. For host system use, see the platform-specific instructions below.


Host Mode Installation (No Docker Required)

MACF Tools can run directly on your host system without Docker containers. This is useful for:

  • Personal development environments
  • Testing consciousness infrastructure locally
  • Projects that don't need container isolation

Prerequisites

Python 3.10 or higher is required. macOS ships with Python 3.9.6 which is insufficient.

macOS Installation

Option A: Homebrew (Recommended)

# Install Python 3.11+ via Homebrew
brew install python@3.11

# Create a virtual environment
/opt/homebrew/opt/python@3.11/bin/python3.11 -m venv ~/macf_venv

# Activate the environment
source ~/macf_venv/bin/activate

# Verify Python version
python --version  # Should show 3.11.x

# Clone and install MacEff
git clone https://github.com/cversek/MacEff.git
cd MacEff/macf
pip install -e .

# Verify installation
macf_tools --version  # Should show 0.4.0

Option B: Conda/Mamba

# Create environment with Python 3.11
conda create -n macf python=3.11 -y
conda activate macf

# Clone and install MacEff
git clone https://github.com/cversek/MacEff.git
cd MacEff/macf
pip install -e .

# Verify installation
macf_tools --version

Ubuntu Linux Installation

Option A: System Python (Ubuntu 22.04+)

Ubuntu 22.04+ ships with Python 3.10+:

# Ensure Python 3.10+ is available
python3 --version  # Should show 3.10.x or higher

# Install venv package if needed
sudo apt install python3-venv python3-pip

# Create virtual environment
python3 -m venv ~/macf_venv
source ~/macf_venv/bin/activate

# Clone and install MacEff
git clone https://github.com/cversek/MacEff.git
cd MacEff/macf
pip install -e .

# Verify installation
macf_tools --version

Option B: Conda/Mamba (Any Ubuntu version)

# Install Miniforge (if not already installed)
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh

# Create environment with Python 3.11
conda create -n macf python=3.11 -y
conda activate macf

# Clone and install MacEff
git clone https://github.com/cversek/MacEff.git
cd MacEff/macf
pip install -e .

# Verify installation
macf_tools --version

Post-Installation Setup

After installing MACF Tools, set up hooks for Claude Code:

# Install hooks (interactive - choose local or global)
macf_tools hooks install

# Verify environment
macf_tools env

# List available policies
macf_tools policy list

Troubleshooting

"ModuleNotFoundError: No module named 'yaml'"

  • This indicates dependencies weren't installed. Run: pip install pyyaml
  • If using editable install, try: pip install -e ".[test]"

"macf_tools: command not found"

  • Ensure your virtual environment is activated
  • Check PATH: which macf_tools should show the venv path
  • Try: hash -r to clear shell command cache

"Package 'macf' requires a different Python: 3.9.x not in '>=3.10'"

  • Install Python 3.10+ using Homebrew (macOS) or system package manager (Linux)
  • Create a new virtual environment with the newer Python

Hook installation (Claude Code projects):

# Interactive installation
macf_tools hooks install

# Specify scope
macf_tools hooks install --local   # .claude/hooks/ (project-specific)
macf_tools hooks install --global  # ~/.claude/hooks/ (all projects)

Testing hooks:

# Test compaction detection on current session
macf_tools hooks test

# View hook execution logs
macf_tools hooks logs

# Check hook states
macf_tools hooks status

Architecture Principles

Pragmatic Design Philosophy:

  • Simple solutions win: Working code beats "perfect" architecture
  • Port battle-tested patterns: Legacy MACF utilities validated through use
  • DRY infrastructure: Centralized macf.utils eliminates code duplication
  • Safe failure everywhere: Functions degrade gracefully, never crash operations
  • Test what matters: Pragmatic TDD (focused tests proving functionality, not exhaustive permutations)

Path Resolution (Environment Detection):

  • Container: /home/{user}/agent/ (when /.dockerenv exists)
  • Host: {project_root}/agent/ (when project .claude/ found)
  • Fallback: ~/.macf/{agent}/agent/ (when no markers detected)
  • Override: MACF_AGENT_ROOT environment variable

Unified Temp Structure:

/tmp/macf/{agent_id}/{session_id}/
├── hooks/
│   ├── hook_events.log            # JSONL structured logging
│   ├── session_start.log          # Python logger output
│   └── sidecar_*.json             # Hook states
├── dev_scripts/                    # Session-scoped temporary scripts
└── logs/                           # General session logging

Getting Started for Alpha Testers

Clone and install from main branch:

# Clone the repository
git clone https://github.com/cversek/MacEff.git
cd MacEff

# Install MACF Tools (from macf/ directory)
cd macf
pip install -e .

# Verify installation
macf_tools --version  # Should show: 0.4.0

Install hooks for Claude Code projects:

# Interactive installation (recommended)
macf_tools hooks install

# Or specify scope explicitly
macf_tools hooks install --local   # .claude/hooks/ (project-specific)
macf_tools hooks install --global  # ~/.claude/hooks/ (all projects)

Consciousness Artifacts Setup:

Alpha testers should consult the production documentation for guidance on:

  • Setting up agent identity (macf_tools config init)
  • Creating Consciousness Checkpoints (CCPs) for state preservation
  • Writing Reflections (JOTEWRs) before compaction
  • Structuring Roadmaps for multi-phase planning
  • Configuring AUTO_MODE vs MANUAL recovery policies

Note: Production documentation and example CLAUDE.md configurations are included in docs/ directory. These provide sanitized templates showing consciousness artifact patterns without project-specific implementation details.

Alpha Status & Known Limitations

What works well:

  • Compaction detection and recovery protocol injection
  • Temporal awareness across all hooks
  • Cycle persistence across session migrations
  • DEV_DRV/DELEG_DRV tracking
  • Pragmatic test coverage

Known issues:

  • SessionStart hook output not pretty-printing: Displays as raw text/escaped format in UI (functional but not visually polished—fix in progress)
  • SessionStart hook can take 25-50ms on cold start (acceptable, but noticeable)
  • Project state initialization on first run defaults to cycle 1 (manually editable if needed)
  • JOTEWR/CCP/DEV_DRV terminology requires learning curve (production docs explain conventions)

Not yet implemented:

  • Automated macf_tools checkpoint and macf_tools reflect CLI commands (manual artifact creation currently required)
  • Subagent consciousness trails and decision documentation (future phase)
  • Multi-agent consciousness networks (future phase)
  • Enhanced temporal reasoning (work week inference, time-of-day state detection)

Alpha testers should expect evolving APIs, incomplete documentation, and the need to manually manage state files in some scenarios. Bug reports and experience reports are highly valued—please file issues at the GitHub repository.


Container Setup: ClaudeMacEff Demo Environment

"Claude MacEff" is a demo implementation of MacEff based on a Docker container running minimal Ubuntu with preinstalled Claude Code and MACF Tools.

macOS setup: Docker Desktop (recommended), then build & run

This setup uses Docker Desktop for Mac which provides Docker Engine and Docker Compose v2 (the docker compose ... CLI). It works on Apple Silicon and Intel Macs. If you prefer Colima, see the note at the end.

1) Install prerequisites

  • Install Docker Desktop for Mac (from Docker’s website or Homebrew Cask).
  • Optional CLI tools:
    brew install jq rsync

Verify Compose v2 is available:

docker compose version

2) (One-time) prepare host snapshot folders for mirroring

mkdir -p sandbox-home sandbox-shared_workspace
chmod 1777 sandbox-home sandbox-shared_workspace

3) Provide SSH public keys for in-container users

Put public keys (only *.pub) into keys/:

  • keys/admin.pub → grants SSH to the admin user (port 2222)
  • keys/maceff_user001.pub → grants SSH to the default PA (maceff_user001)

If you don’t have them yet:

mkdir -p keys
ssh-keygen -t ed25519 -f keys/admin -N ''
ssh-keygen -t ed25519 -f keys/maceff_user001 -N ''
# commit only the .pub files; keep private keys out of git

4) Build the images

# Build main sandbox image
docker compose build

# Build tiny rsync mirror image used for snapshots
docker build -t maceff-mirror:local -f docker/mirror.Dockerfile .

If you hit BuildKit/proxy errors: temporarily force the legacy builder:

DOCKER_BUILDKIT=0 COMPOSE_DOCKER_CLI_BUILD=0 docker compose build

Also ensure no stray HTTP_PROXY/HTTPS_PROXY environment variables are set in your shell.

5) Launch the sandbox

docker compose up -d
# tail logs
docker compose logs -f --tail=120

You should see lines creating the PA/SA users and ending with sshd starting....

6) Log in (PA and admin)

# PA (uses keys/maceff_user001.pub)
ssh -i keys/maceff_user001 -p 2222 maceff_user001@localhost

# admin (uses keys/admin.pub)
ssh -i keys/admin -p 2222 admin@localhost

7) Create a shared project (inside the container, as PA)

cd /shared_workspace
mkdir demo && cd demo
git init -b main
git config core.sharedRepository group
git config user.name  "PA001 (maceff_user001)"
git config user.email "pa001@container.invalid"
echo "hello from PA" > README.md
git add README.md && git commit -m "feat: initial README"

/shared_workspace is group-shared (SGID) so collaborators can work together safely.

8) Snapshot container data to the host (read-only export)

First build the mirror image (step 4), then:

docker compose --profile mirror up --no-deps mirror
# snapshots appear under:
ls -la sandbox-home
ls -la sandbox-shared_workspace

This exports the full /home (including agent private folders) and /shared_workspace for inspection/versioning on the host. Be careful not to commit secrets from sandbox-home/ into public repos.


Troubleshooting

Build fails pulling docker/dockerfile:… or tries 127.0.0.1:9090:

  • Remove any HTTP_PROXY/HTTPS_PROXY/NO_PROXY from your shell.
  • Rebuild with the legacy builder once:
    DOCKER_BUILDKIT=0 COMPOSE_DOCKER_CLI_BUILD=0 docker compose build

docker compose: cannot connect to daemon:

  • Make sure Docker Desktop is running (launch the Docker app).
  • Check your Docker context:
    docker context ls
    docker context show
    docker ps

Permission denied while mirroring to sandbox-*:

  • Make sure those dirs are writable:
    chmod -R u+rwX sandbox-home sandbox-shared_workspace

Note: Using Colima instead of Docker Desktop (optional)

If you prefer Colima (lightweight VM) instead of Docker Desktop:

  1. Install:
brew install colima docker docker-compose jq rsync
  1. Start Colima and select its context:
colima start --cpu 4 --memory 8 --disk 30
docker context use colima

Edit Colima config if needed:
colima stop && colima start --edit → adjust settings, save, then:
colima start

The rest of the steps are the same, but use docker compose ... with the Colima context selected.


Local Configuration: docker-compose.override.yml

MacEff uses docker-compose.override.yml for environment-specific customization. This file is gitignored so each developer can configure their local environment without affecting version control.

Why Override Files?

Portability principle: The base docker-compose.yml contains only portable configuration that works on any machine. Environment-specific settings (like local data source mounts, custom ports, or host paths) belong in docker-compose.override.yml.

Benefits:

  • ✅ Base config remains portable across different environments
  • ✅ No host-specific paths in version control
  • ✅ Easy deployment variations (dev/CI/production)
  • ✅ Automatic merging by Docker Compose (no flags needed)

Creating Your Override File

When you run ./tools/bin/maceff-init, it creates a template docker-compose.override.yml with commented examples:

# Local Docker Compose Overrides
# This file is gitignored - customize for your development environment

services:
  maceff-sandbox:
    # Uncomment and customize as needed:
    # volumes:
    #   - "/host/path/to/data:/container/mount/point:ro"
    # ports:
    #   - "8080:8080"

Common Customizations

Mount a local data directory (read-only):

services:
  maceff-sandbox:
    volumes:
      - "/Users/yourname/data:/data:ro"

Add a development port (expose container service):

services:
  maceff-sandbox:
    ports:
      - "8080:8080"  # host:container

Environment variables (supplement global.env):

services:
  maceff-sandbox:
    environment:
      - CUSTOM_VAR=value

How Docker Compose Merges Files

Docker Compose automatically merges docker-compose.yml + docker-compose.override.yml:

# This uses both files automatically
docker compose up

# Explicit base-only (CI/testing)
docker compose -f docker-compose.yml up

# Custom override for production
docker compose -f docker-compose.yml -f docker-compose.prod.yml up

Best Practices

  1. Never commit docker-compose.override.yml (already in .gitignore)
  2. Document requirements in README if specific overrides are needed
  3. Use environment variables in override files for flexibility:
    volumes:
      - "${DATA_PATH:-/default/path}:/data:ro"
  4. Test portability by ensuring base config works without overrides

Troubleshooting

Override not taking effect:

  • Verify file is named exactly docker-compose.override.yml
  • Check YAML syntax (tabs vs spaces, indentation)
  • Restart services: docker compose down && docker compose up

Conflicting mounts:

  • Override volumes merge with base config, they don't replace
  • Use specific mount targets to avoid conflicts

Make quickstart

Common developer workflows are wrapped in make targets. These commands assume you’ve already built the images and added your SSH public keys to keys/ (see macOS setup above).

Targets

make build          # docker-compose build
make up             # start services
make logs           # follow logs for the sandbox
make down           # stop services
make mirror         # snapshot volumes -> ./sandbox-*
make mirror-watch   # continuous mirroring (if profile enabled)
make ssh-pa         # SSH into Primary Agent (PA)
make ssh-admin      # SSH into admin
make sa-test        # run a small SubAgent job from the PA

Variables

  • PA — PA username (default: maceff_user001)
  • SID — SubAgent id under the PA (default: 001)
  • PORT — SSH port on host (default: 2222)

Keys and fallback resolution

The Makefile looks for private keys in this order:

  1. keys/<name> (e.g. keys/maceff_user001, keys/admin)
  2. ~/.ssh/id_ed25519_<name> (e.g. ~/.ssh/id_ed25519_maceff_user001, ~/.ssh/id_ed25519_admin)
  3. For PAs with numeric suffixes (e.g. maceff_user001), it also tries the base name without digits: ~/.ssh/id_ed25519_maceff_user.

You can override explicitly with PA_KEY / ADMIN_KEY.

Examples

Start, tail logs, and stop

make build
make up
make logs
make down

SSH into the PA/admin

make ssh-pa
make ssh-admin

If your private key isn’t under keys/, point to it:

make ssh-pa PA_KEY=$HOME/.ssh/id_ed25519_maceff_user
make ssh-admin ADMIN_KEY=$HOME/.ssh/id_ed25519_admin

Run a quick SA job from the PA

# Uses PA=maceff_user001 and SID=001 by default
make sa-test

# Or specify which PA/SID and key to use
make sa-test PA=maceff_user001 SID=001 PA_KEY=$HOME/.ssh/id_ed25519_maceff_user

This launches a detached SA process via sa-exec, writing output to:

/home/<PA>/agent/subagents/<SID>/public/logs/make-test.log

Mirror container data to host snapshots

make mirror
ls -la sandbox-home
ls -la sandbox-shared_workspace

Note: mirror exports full /home (including agent/private) and /shared_workspace into ./sandbox-*. Be careful not to commit secrets from sandbox-home/ into public repos.

Troubleshooting

  • “PA_KEY not found”: Provide PA_KEY=$HOME/.ssh/id_ed25519_maceff_user (or put a private key at keys/maceff_user001).
  • Cannot connect via SSH: Ensure the container is up (make up) and the correct key matches the public key you placed in keys/.

Using Claude Code inside the container

Launch Claude Code as your Primary Agent (PA) inside a shared project folder. This works with Claude Max account login (no API key).

Quick start

# one-time: bring up the sandbox
make up

# verify CLI install
make claude-doctor PA_KEY=$HOME/.ssh/id_ed25519_maceff_user

# launch Claude in /shared_workspace/demo (default PROJ=demo)
make claude PA_KEY=$HOME/.ssh/id_ed25519_maceff_user

On first run in the Claude prompt, type:

/login

Follow the browser flow and sign in with your Claude Max account. Credentials are persisted under the PA’s home (a Docker named volume), so you won’t need to log in again across restarts.

Choose a different project directory

make claude PROJ=myproj PA_KEY=$HOME/.ssh/id_ed25519_maceff_user

This creates/uses /shared_workspace/myproj and launches Claude there.

Where settings live (inside the container)

  • User/global settings: ~/.claude/settings.json
  • Per-project settings: ./.claude/settings.json within the project directory
  • Credentials live under ~/.claude/ and survive container restarts

Tips

  • Commit project files from /shared_workspace/<PROJ>; avoid committing anything from /home/<PA>/.claude/.
  • If you see permission errors in shared repos, ensure SGID/group settings are intact (our startup sets /shared_workspace group to agents_all and enables SGID).

Troubleshooting

  • “command not found: claude”: rebuild and restart the container.
    make build
    make up
    make claude-doctor PA_KEY=$HOME/.ssh/id_ed25519_maceff_user
  • Unable to SSH: confirm your private key path and that the matching public key is in keys/<PA>.pub (mounted read-only into the container).
  • Login loop: run make claude-doctor and check for network/proxy issues; retry /login.

Mirror container data to the host (read-only snapshot)

Create export dirs once (host):

chmod 1777 sandbox-home sandbox-shared_workspace

Build the tiny image with rsync:

docker build -t maceff-mirror:local -f docker/mirror.Dockerfile .

One-shot mirror (reads named volumes, writes snapshots into ./sandbox-*):

docker run --rm \
  -v maceff_home_all:/src_home:ro \
  -v maceff_shared_workspace:/src_shared:ro \
  -v "$PWD/sandbox-home:/export/home" \
  -v "$PWD/sandbox-shared_workspace:/export/shared" \
  maceff-mirror:local \
  sh -lc 'mkdir -p /export/home /export/shared; \
          rsync -rltD --delete --no-perms --no-owner --no-group /src_home/   /export/home/; \
          rsync -rltD --delete --no-perms --no-owner --no-group /src_shared/ /export/shared/; \
          echo "[mirror] sync complete"'

You should see [mirror] sync complete, then:

ls -la sandbox-home
ls -la sandbox-home/maceff_user001
ls -la sandbox-shared_workspace

Notes

  • macOS: writing to ./sandbox-* works because we avoid owner/perms changes with --no-owner --no-group --no-perms.
  • Linux: if you want a faithful copy of owners/groups instead, use:
docker run --rm \
  -v maceff_home_all:/src_home:ro \
  -v "$PWD/sandbox-home:/export/home" \
  maceff-mirror:local \
  sh -lc 'rsync -a --delete --numeric-ids /src_home/ /export/home/'

Using Git in /shared_workspace (collaborative, group-shared)

Agents do day-to-day development inside /shared_workspace. The directory is group-writable with SGID so new files/dirs inherit the agents_all group, enabling collaboration.

Repo-local identity (recommended): set a neutral identity per repo to avoid leaking host details.

# inside the container, as a PA, in a project dir under /shared_workspace/<repo>
git config user.name  "PA001 (maceff_user001)"
git config user.email "pa001@container.invalid"

Initialize a shared repo:

git init -b main
git config core.sharedRepository group
# optional: initial content
echo "hello" > README.md
git add README.md
git commit -m "chore: initial commit"

Why it works: /shared_workspace has agents_all as its group and SGID set, so new files/dirs keep that group. core.sharedRepository=group makes Git objects group-writable.

Pushing to remotes:

  • Safest: push from host/CI after mirroring.
  • If pushing from inside the container, prefer SSH agent forwarding or deploy keys scoped to that repo.

Note: The mirror service currently exports full /home to ./sandbox-home (including private folders). Be careful not to commit secrets from the snapshot into public repos.

Running subagent jobs (sa-exec)

Primary Agents (PAs) can launch work as their SubAgent (SA) using a safe runner that only accepts workdirs under:

/home/maceff_user*/agent/subagents/*

SubAgents (SAs) are launched under their own Linux user and write logs under the PA’s agent tree. We provide a thin runner (sa-exec) plus a Make target to exercise this from your host.

Quick start

# one-time: container up
make up

# run a small SA job under PA=maceff_user001, SID=001
make sa-test PA_KEY=$HOME/.ssh/id_ed25519_maceff_user

This launches a detached SA process that writes to:

/home/<PA>/agent/subagents/<SID>/public/logs/make-test.log

To read the log from the host after mirroring:

make mirror
ls -la sandbox-home/<PA>/agent/subagents/<SID>/public/logs

Customize which SA to run

# change PA and SID
make sa-test PA=maceff_user001 SID=001 PA_KEY=$HOME/.ssh/id_ed25519_maceff_user

What the target does

  • SSH into the PA user inside the container
  • Creates the log directory (as the SA) if needed
  • Runs a short test command sequence:
    • prints id, whoami, and pwd
    • appends output to public/logs/make-test.log

Troubleshooting

  • “PA key not found” — Provide an explicit key path:
    make sa-test PA_KEY=$HOME/.ssh/id_ed25519_maceff_user
  • No log appears — Re-run and then mirror:
    make sa-test PA_KEY=$HOME/.ssh/id_ed25519_maceff_user
    make mirror
  • Permission denied — Ensure your host-side snapshots (sandbox-*) are writable:
    chmod 1777 sandbox-home sandbox-shared_workspace

Notes

  • Output is appended to <LOG>. sa-exec starts a detached login shell with setsid, so jobs keep running if your PA shell exits.
  • The runner whitelists PA agent paths; any other WD is rejected (unsafe workdir).
  • Default umask is 027 to keep files private to the SA’s group.
  • For multiple parallel delegates, use distinct logs (e.g., logs/ts-$(date +%s).log).

About

Multi-agent coordination Environment (for) frameworking frameworks

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors