98% reduction in token consumption while giving agents more autonomy and flexibility.
⚠️ IMPORTANT: Keep this repository PRIVATE

This implementation stores OAuth credentials in `./mnt/mcp-creds/`, which are currently committed to the repository. Proper OAuth flow for MCP servers is coming soon to the Agencii platform. Until then, set your repository visibility to private to protect your credentials.
This implementation follows Anthropic's Code Execution with MCP pattern, where agents write code to interact with MCP servers instead of making direct tool calls. The agent discovers tools by exploring the filesystem and loads only what it needs for each task.
Traditional MCP (Direct Tool Calls):
- Loads all 19 tool definitions upfront (~150K tokens)
- Every intermediate result flows through model context
- Example: Copying a transcript consumes 32K tokens
Code Execution Pattern:
- Loads tools on-demand from filesystem (~2K tokens)
- Processes data in execution environment
- Same task consumes 4K tokens with skills, 12K without
```
sales_ops agent
├── IPythonInterpreter (code execution)
├── PersistentShellTool (file discovery)
└── MCP Servers (as code APIs)
    ├── servers/notion/ (15 tools)
    │   ├── search.py
    │   ├── fetch.py
    │   └── ... (other tools)
    └── servers/gdrive/ (4 tools)
        ├── search.py
        ├── read_file.py
        ├── read_sheet.py
        └── update_cell.py
```
```bash
git clone <your-repo>
cd code-exec-agent

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

Add to `.env`:

```
OPENAI_API_KEY=your-openai-key

# Google Drive (required)
GDRIVE_CREDENTIALS_JSON={"installed":{"client_id":"...","client_secret":"...","redirect_uris":["http://localhost"]}}

# Notion (uses OAuth via mcp-remote - auto-configured)
```

Getting Google Drive Credentials:
- Create Google Cloud project
- Enable Google Drive API, Google Sheets API, Google Docs API
- Create OAuth Client ID for "Desktop App"
- Download the JSON and add it to `GDRIVE_CREDENTIALS_JSON`
```bash
npx @isaacphi/mcp-gdrive
# Follow OAuth flow in browser
# Press Ctrl+C after "Setting up automatic token refresh"
```

Run the agent:

```bash
python agency.py
```

Task: Add transcript from Google Doc to Notion page

```
Add this transcript from this Google doc https://docs.google.com/document/d/YOUR_DOC_ID
to this Notion page https://www.notion.so/YOUR_PAGE_ID
```
What Happens:
- Agent checks `./mnt/skills/` for an existing skill
- If not found, reads only the needed tools: `servers/gdrive/read_file.py` and `servers/notion/update_page.py`
- Writes code in IPythonInterpreter:

```python
from servers.gdrive import read_file
from servers.notion import update_page

# Read transcript (stays in execution environment)
transcript = await read_file(fileId="YOUR_DOC_ID")

# Update Notion page
await update_page(data={
    "page_id": "YOUR_PAGE_ID",
    "command": "replace_content",
    "new_str": transcript
})
```

- Suggests saving as a reusable skill
- Next time: uses skill directly (4K tokens vs 12K)
Follow this step-by-step workflow using Cursor's AI commands:
Create a sales_ops agent with 2 built-in tools: IPythonInterpreter and PersistentShellTool
Why these tools:
- `IPythonInterpreter` - Executes code with top-level await
- `PersistentShellTool` - Discovers files and reads tool definitions
/mcp-code-exec
Add the following mcp servers to sales_ops agent:
https://developers.notion.com/docs/get-started-with-mcp
https://github.com/isaacphi/mcp-gdrive
What this does:
- Creates `servers/notion/` with 15 tool files
- Creates `servers/gdrive/` with 4 tool files
- Each tool is a Python file with an async function
- Auto-creates `server.py` for connection management
- Tests server connections
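Each generated tool file follows the same shape: one async function wrapping one MCP tool call. A hypothetical sketch of such a file is below; the `call_tool` helper is a stub standing in for the MCP session that the generated `server.py` manages, so the sketch runs standalone (the real generated code depends on the `/mcp-code-exec` template):

```python
# Hypothetical sketch of a generated tool file such as servers/gdrive/read_file.py.
import asyncio


async def call_tool(name: str, arguments: dict) -> dict:
    # Stub: a real implementation forwards this call to the MCP server
    # session that server.py keeps open.
    return {"tool": name, "arguments": arguments}


async def read_file(fileId: str) -> dict:
    """Read a Google Drive file's contents via the gdrive MCP server."""
    return await call_tool("read_file", {"fileId": fileId})


if __name__ == "__main__":
    print(asyncio.run(read_file(fileId="doc-123")))
```

Because each tool is just an importable async function, the agent can combine tools from different servers in a single IPython cell.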
/write-instructions @sales_ops.py
Main role: Performing operational tasks for the team
Business goal: Improve efficiency
Process:
1. Discover skills in ./mnt/skills folder
2. Use skill if it matches task
3. If no skills found, read ONLY necessary tool files
4. Import and combine tools in IPythonInterpreter
5. Suggest new skills to be added
Keep these instructions short. Don't add MCP usage examples or list all MCPs; the agent should discover them autonomously.
The agent should also minimize token consumption by making as few tool calls as possible and reading only the tool files necessary to complete the task.
Output: Summary + skill suggestions
Key workflow points:
- Skills-first approach - Always check `./mnt/skills/` first
- Progressive disclosure - Only read the tools you need
- Self-improvement - Create reusable skills over time
- Minimize token consumption - The agent shouldn't read too many files
If you see authentication errors:
```
I added secrets, please retest google drive tools make sure each tool is production ready
```
Then authenticate:
```bash
npx @isaacphi/mcp-gdrive
```

Test locally:

```bash
python agency.py
```

Deploy to the Agency Swarm platform:

```bash
git push origin main
# Go to platform.agency-swarm.ai
# Create new agency from repo
# Add environment variables
```

Traditional MCP (direct tool calls):

```
User: Add transcript to Notion
Agent → MCP: gdrive.read_file(docId)
MCP → Agent: [Full 50KB transcript in context]
Agent → MCP: notion.update_page(pageId, transcript)
[Agent rewrites full 50KB transcript again]
Result: 32,000 tokens consumed
```
Code Execution Pattern:

```
User: Add transcript to Notion
Agent → Shell: ls ./mnt/skills/
Agent → Shell: cat servers/gdrive/read_file.py
Agent → IPython:
    from servers.gdrive import read_file
    from servers.notion import update_page
    transcript = await read_file(fileId="...")
    await update_page(data={...})
Result: 12,000 tokens (first time), 4,000 tokens (with skill)
```
Instead of loading all 19 tools upfront:
```
# Traditional: All tools loaded immediately
✗ 150K tokens - Full definitions for all 19 tools in context

# Code Execution: Load on demand
✓ 2K tokens - List directory to see available tools
✓ Read only the 2 files needed for the current task
```
Agent builds its own library of reusable functions:
```
./mnt/skills/
├── copy_gdrive_to_notion.py
├── export_sheet_to_csv.py
└── search_and_email_results.py
```
Skills persist across chat sessions. Each completed task is an opportunity to create a new skill.
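A saved skill is just a Python file that combines tool calls into one reusable function. Here is a hypothetical sketch of what `copy_gdrive_to_notion.py` might contain; in the real skill the tool functions are imported from `servers/gdrive` and `servers/notion`, but they are stubbed below so the sketch runs standalone:

```python
# Hypothetical sketch of ./mnt/skills/copy_gdrive_to_notion.py.
import asyncio


async def read_file(fileId: str) -> str:
    # Stub for servers.gdrive.read_file
    return f"transcript of {fileId}"


async def update_page(data: dict) -> dict:
    # Stub for servers.notion.update_page
    return {"updated": data["page_id"], "chars": len(data["new_str"])}


async def copy_gdrive_to_notion(doc_id: str, page_id: str) -> dict:
    """Copy a Google Doc's text into a Notion page.

    The transcript stays in the execution environment and never
    enters model context.
    """
    transcript = await read_file(fileId=doc_id)
    return await update_page(data={
        "page_id": page_id,
        "command": "replace_content",
        "new_str": transcript,
    })


if __name__ == "__main__":
    print(asyncio.run(copy_gdrive_to_notion("doc-123", "page-456")))
```

On later runs the agent only has to read and call this one file instead of rediscovering both tools.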
Test Task: Copy Google Doc transcript to Notion page
| Approach | First Run | With Skill | Reduction |
|---|---|---|---|
| Direct MCP | 32K tokens | 32K tokens | - |
| Code Execution | 12K tokens | 4K tokens | 88% |
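The reduction column is simple arithmetic on the measured per-task token counts:

```python
# Verify the table's reduction figure from the measured token counts.
direct_mcp = 32_000   # tokens per task with direct tool calls
with_skill = 4_000    # tokens per task once a skill exists

reduction = 1 - with_skill / direct_mcp
print(f"{reduction:.0%}")  # → 88%
```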
✅ Use Code Execution Pattern for:
- Operations agents (data sync, reporting)
- Research agents (gather, analyze, summarize)
- Analytics agents (query, transform, visualize)
- Agents with 10+ tools
- Tasks with large data processing
❌ Use Traditional MCP for:
- Simple customer support (3-5 tools)
- Single-purpose agents
- Tasks requiring immediate consistency
- When infrastructure overhead isn't acceptable
The agent follows this process for every task:
```
1. Check Skills
   └─ ls ./mnt/skills/
   └─ If match found → Execute skill → Done

2. Identify Tools Needed
   └─ Based on task: Notion? Drive? Both?

3. Read ONLY Necessary Tools
   └─ cat servers/notion/fetch.py
   └─ cat servers/gdrive/read_file.py
   └─ DO NOT read server.py or other files

4. Combine Tools in Code
   └─ Write Python code in IPythonInterpreter
   └─ Use await directly (top-level await enabled)

5. Suggest New Skill
   └─ Analyze workflow
   └─ Propose reusable function
   └─ Save to ./mnt/skills/
```
Content Operations:
- `search()` - Semantic search across workspace
- `fetch()` - Get page/database details
- `create_pages()` - Create new pages
- `update_page()` - Update properties/content
- `move_pages()` - Move to new parent
- `duplicate_page()` - Duplicate a page
Database Operations:
- `create_database()` - Create with schema
- `update_database()` - Update schema
Comments:
- `create_comment()` - Add a comment
- `get_comments()` - Get all comments
Workspace:
- `get_teams()` - List teams
- `get_users()` - List users
- `list_agents()` - List custom agents
- `get_self()` - Get bot info
- `get_user()` - Get a specific user
Drive:
- `search()` - Search files
- `read_file()` - Read file contents
Sheets:
- `read_sheet()` - Read a spreadsheet
- `update_cell()` - Update a cell value
```
code-exec-agent/
├── sales_ops/              # Main agent
│   ├── sales_ops.py        # Agent configuration
│   ├── instructions.md     # Agent prompt (key to performance)
│   └── tools/              # Built-in tools (empty - uses framework)
├── servers/                # MCP servers as code
│   ├── notion/
│   │   ├── server.py       # Connection management
│   │   ├── __init__.py     # Exports all tools
│   │   ├── search.py       # Individual tool
│   │   └── ... (15 tools)
│   └── gdrive/
│       ├── server.py
│       ├── __init__.py
│       └── ... (4 tools)
├── mnt/
│   ├── skills/             # Agent-created reusable functions
│   └── mcp-creds/          # OAuth tokens (auto-managed)
├── agency.py               # Entry point
├── .env                    # Credentials
└── requirements.txt
```
Problem: Agent reads server.py, README.md, etc.
Solution: Update instructions.md:
```
**DO NOT** read any other tool, readme, or server files to avoid extra token consumption.
Only read what you need for the specific task.
```

Problem: OAuth/authentication is not working after deployment.
Solution:
- Ensure all OAuth tokens are saved to `./mnt/mcp-creds/`
- Ensure persistent storage is enabled under the "Agency" tab

or

- Trigger the OAuth flow again locally
- Commit and deploy to Agencii.ai
- Test in another chat
- Test in another chat
Problem: Persistent storage is not enabled so skills are not saved.
Solution:
- Open your agency on Agencii.ai
- Enable storage under "Agency" tab
- Wait for build to complete
- Tell your agent to save the skill
- Test in another chat
Test locally:

```bash
python agency.py
```

Then deploy:

- Push to GitHub (private repo)
- Go to https://agencii.ai
- Create new agency from repo
- Add environment variables
- Deploy
Platform Benefits:
- Persistent `./mnt/` storage (skills preserved)
- Automatic scaling
- Built-in tracing & analytics
- No infrastructure management
- Write clear instructions - Prompting is key for this pattern
- Build skills progressively - Start simple, improve over time
- Use specific task descriptions - Help agent identify needed tools
- Review traces - Check platform dashboard for optimization opportunities
- Start with common workflows - Build skill library for repeated tasks
✅ Ready for production IF:
- You have clear, well-tested instructions
- Tasks are operational (not simple Q&A)
- You monitor and optimize prompts
- You use skills for repeated workflows
- Simple customer support (use direct MCP)
- Mission-critical real-time operations
- Tasks requiring <1s response time
- Anthropic: Code Execution with MCP
- Agency Swarm Documentation
- Notion MCP Server
- Google Drive MCP Server
This is a reference implementation of the Code Execution Pattern. Improvements welcome:
- Better prompting strategies
- More efficient skill suggestions
- Additional MCP server integrations
- Performance optimizations
MIT
Built with Agency Swarm implementing Anthropic's Code Execution Pattern