Coder Crew - Agentic AI Code Execution System

An intelligent agentic AI system powered by CrewAI that writes, executes, and validates Python code autonomously. This project showcases advanced AI agent orchestration with real code execution capabilities using containerized environments.

🏗️ Architecture Overview

System Design Philosophy

This project implements a single-agent crew with sequential processing and safe code execution capabilities. Unlike traditional multi-agent systems, this architecture focuses on a specialized Python developer agent that handles the complete coding lifecycle: planning, implementation, execution, and validation.

Core Components

1. Agents (`config/agents.yaml`)

Agents are autonomous AI entities with specific roles, goals, and capabilities. Each agent is configured with:

Role: Defines the agent's expertise (e.g., "Python Developer")
Goal: The agent's objective, dynamically populated with user inputs
Backstory: Provides context that shapes the agent's behavior and decision-making
LLM: The underlying language model (gpt-4o-mini in this implementation)

Current Agent Configuration:

coder:
  role: Python Developer
  goal: Write python code to achieve assignment and validate output
  backstory: Seasoned python developer with a knack for clean, efficient code
  llm: gpt-4o-mini

2. Tasks (`config/tasks.yaml`)

Tasks define specific work items assigned to agents. Key attributes include:

Description: Detailed instructions for what needs to be accomplished
Expected Output: Clear specification of deliverables
Agent Assignment: Which agent executes the task
Output File: Where results are persisted

Current Task Configuration:

coding_task:
  description: Write python code to achieve this: {assignment}
  expected_output: A text file with code and execution output
  agent: coder
  output_file: output/code_and_output.txt
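The `{assignment}` placeholder in the task description is filled from the inputs supplied at kickoff. A rough standard-library illustration follows, assuming the substitution behaves like Python's `str.format` for simple named fields (CrewAI's own interpolation may differ in details). `render_description` is a hypothetical helper, not part of CrewAI:

```python
TASK_DESCRIPTION = "Write python code to achieve this: {assignment}"


def render_description(template: str, inputs: dict) -> str:
    """Fill named placeholders in a task template from user inputs."""
    return template.format(**inputs)


rendered = render_description(
    TASK_DESCRIPTION,
    {"assignment": "Sum the integers from 1 to 100"},
)
print(rendered)
```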

3. Crew Orchestration (`crew.py`)

The Coder class orchestrates the entire system using the @CrewBase decorator pattern:

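from crewai import Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, task

@CrewBase
class Coder:
    """Coder crew with safe code execution capabilities"""

    @agent
    def coder(self) -> Agent:
        # Agent that executes generated code in an isolated container
        return Agent(
            config=self.agents_config["coder"],
            allow_code_execution=True,
            code_execution_mode="safe",
            max_execution_time=30,
            max_retry_limit=3,
        )

    @task
    def coding_task(self) -> Task:
        # Task configuration loaded from config/tasks.yaml
        return Task(config=self.tasks_config["coding_task"])

    @crew
    def crew(self) -> Crew:
        # Crew assembly with sequential processing
        return Crew(agents=self.agents, tasks=self.tasks,
                    process=Process.sequential, verbose=True)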

Key Configuration Parameters:

allow_code_execution=True: Enables the agent to execute code
code_execution_mode="safe": Runs code in isolated containers
max_execution_time=30: Maximum time the agent may spend on a task, including code execution (seconds)
max_retry_limit=3: Number of retry attempts on failure
process=Process.sequential: Tasks execute in order (vs. hierarchical)

Process Types in CrewAI

Sequential Process (Current Implementation)

Flow: Linear, one task after another
Use Case: Simple, ordered workflows where each task builds on the previous
Benefits: Predictable, easy to debug, minimal overhead
Implementation: Process.sequential

User Input → Agent Plans → Agent Codes → Agent Executes → Output File
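The linear flow above can be pictured as a chain in which each step receives the previous step's output. The sketch below is a plain-Python analogy, not CrewAI code: `run_sequential` and the three stand-in stage functions are hypothetical names, and in the real crew a single LLM-backed agent performs all of these steps.

```python
from typing import Callable, List


def run_sequential(stages: List[Callable[[str], str]], user_input: str) -> str:
    """Run each stage in order, feeding each output into the next stage."""
    result = user_input
    for stage in stages:
        result = stage(result)
    return result


# Stand-in stages that only label their input so the ordering is visible.
def plan(assignment: str) -> str:
    return f"plan({assignment})"


def write_code(plan_text: str) -> str:
    return f"code({plan_text})"


def execute(code_text: str) -> str:
    return f"output({code_text})"


final = run_sequential([plan, write_code, execute], "add 2 and 2")
print(final)  # output(code(plan(add 2 and 2)))
```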

Hierarchical Process (Alternative)

Flow: Manager agent delegates tasks to worker agents
Use Case: Complex projects requiring task decomposition and parallel work
Benefits: Better for multi-agent coordination, task delegation, quality control
Implementation: Process.hierarchical with a manager LLM

                    Manager Agent
                         |
        ┌────────────────┼────────────────┐
        ↓                ↓                ↓
   Coder Agent    Tester Agent    Reviewer Agent

Code Execution Architecture

CodeInterpreterTool

The CodeInterpreterTool provides safe, containerized Python code execution:

Isolation: Code runs in Docker/Podman containers
Security: Sandboxed environment prevents system access
Flexibility: Supports custom base images and packages
Output Capture: Returns both stdout and execution results

Configuration:

tools=[
    CodeInterpreterTool(
        user_docker_base_url="npipe:////./pipe/podman-machine-default"
    )
]

🐋 Podman Integration: The Windows Docker Alternative

The Challenge

CrewAI's CodeInterpreterTool requires Docker for containerized code execution. On Windows, Docker Desktop has licensing restrictions and resource overhead. Podman is a lightweight, open-source alternative, but CrewAI wasn't designed to work with Podman out of the box.

The Solution: Making Podman Work with CrewAI on Windows

This project successfully integrates Podman with CrewAI through a three-step workaround:

Step 1: Docker CLI Alias

CrewAI performs subprocess checks for docker.exe. Solution: Create a symbolic link or copy podman.exe to docker.exe in your PATH.

# Navigate to Podman installation directory
cd "C:\Program Files\RedHat\Podman"

# Copy podman.exe as docker.exe
Copy-Item podman.exe docker.exe

Why This Works: CrewAI's internal checks look for docker CLI availability without actually using Docker-specific commands.

Step 2: Set DOCKER_HOST Environment Variable

The Docker SDK for Python (used by CodeInterpreterTool) needs to connect to Podman's socket:

# Set for current session
$env:DOCKER_HOST = "npipe:////./pipe/podman-machine-default"

# Set permanently (User scope)
[System.Environment]::SetEnvironmentVariable(
    "DOCKER_HOST", 
    "npipe:////./pipe/podman-machine-default", 
    "User"
)

Technical Details:

npipe:// - URL scheme for Windows named pipes
//./pipe/ - Windows named pipe path prefix (\\.\pipe\ written with forward slashes)
podman-machine-default - Default Podman machine socket name
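To see how the URL decomposes, the value can be split at the first `://`. This is a small illustrative helper (`split_docker_host` is a hypothetical name), not how the Docker SDK parses the value internally:

```python
def split_docker_host(url: str) -> tuple[str, str]:
    """Split a DOCKER_HOST value into (scheme, remainder) at the first '://'."""
    scheme, separator, remainder = url.partition("://")
    if not separator:
        raise ValueError(f"not a URL with a scheme: {url!r}")
    return scheme, remainder


scheme, pipe_path = split_docker_host("npipe:////./pipe/podman-machine-default")
print(scheme)     # npipe
print(pipe_path)  # //./pipe/podman-machine-default
```

The four slashes in the full URL are therefore two from the scheme separator and two from the pipe path itself.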

Step 3: Clean Docker Desktop Artifacts

Remove Docker Desktop configuration remnants that interfere with Podman:

# Edit or delete ~/.docker/config.json
# Remove: "credsStore": "desktop"

Why This Matters: Docker Desktop's credential store causes authentication conflicts with Podman.

Architecture Impact

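With DOCKER_HOST pointing at the Podman pipe, and user_docker_base_url set on any CodeInterpreterTool you instantiate yourself (see Issue 4 for why the environment variable is what actually takes effect):

CodeInterpreterTool(user_docker_base_url="npipe:////./pipe/podman-machine-default")

The agent routes all containerized code execution through Podman, maintaining: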

✅ Full code execution capabilities
✅ Container isolation and security
✅ No Docker Desktop licensing concerns
✅ Lower system resource footprint

🔍 Deep Dive: Issues and Solutions

Getting CrewAI to work with Podman on Windows required solving five issues. Here is the complete breakdown:

Issue 1: CrewAI Requires Docker CLI

Problem: CrewAI validates Docker availability using:

subprocess.run(["docker", "info"])

With only Podman installed, there's no docker command, causing CrewAI to fail initialization.

Solution: Copy podman.exe as docker.exe to a directory on PATH:

# Create a local bin directory (if it doesn't exist)
New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.local\bin"

# Copy podman.exe as docker.exe
Copy-Item "C:\Program Files\RedHat\Podman\podman.exe" "$env:USERPROFILE\.local\bin\docker.exe"

# Add to PATH for current session
$env:Path += ";$env:USERPROFILE\.local\bin"

# Add to PATH permanently (User scope)
$currentPath = [System.Environment]::GetEnvironmentVariable("Path", "User")
[System.Environment]::SetEnvironmentVariable("Path", "$currentPath;$env:USERPROFILE\.local\bin", "User")

Why This Works: CrewAI only checks for docker CLI existence, not Docker-specific functionality. Podman is Docker-compatible, so this alias works seamlessly.
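To confirm the alias behaves the way a list-form subprocess call sees it, the same kind of check can be run by hand. The sketch below mirrors the `subprocess.run(["docker", "info"])` pattern quoted above; `cli_responds` is a hypothetical helper and is not CrewAI's actual validation code.

```python
import subprocess
import sys


def cli_responds(command: list[str], timeout: float = 10.0) -> bool:
    """Return True if the command can be launched and exits with status 0."""
    try:
        completed = subprocess.run(command, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # OSError covers FileNotFoundError, i.e. no such executable on PATH.
        return False
    return completed.returncode == 0


# True once docker.exe (the Podman copy) is on PATH and the machine is running.
print(cli_responds(["docker", "info"]))
```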

Issue 2: .bat Alias Doesn't Work with subprocess

First Attempt (Failed): Creating a docker.bat batch file wrapper.

Problem: subprocess.run(["docker", "info"]) passes arguments as a list, so no shell is involved. On Windows the underlying CreateProcess call appends only .exe to a bare command name and does not consult PATHEXT, so a docker.bat (or docker.cmd) wrapper is not found unless shell=True is used.

Solution: Use an actual .exe file (copy or symlink) instead of a batch script. The solution from Issue 1 resolves this.

Technical Note: PowerShell aliases (Set-Alias) also don't work because they're shell-specific and not visible to Python's subprocess module.
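When it is unclear which kind of `docker` wrapper is on PATH, a quick scan can help. The following diagnostic is a sketch: `find_cli_files` is a hypothetical helper, and the "launchable" flag encodes the assumption stated above that a bare command name resolves only to an `.exe` when no shell is used.

```python
import os
from pathlib import Path


def find_cli_files(name: str, path_value: str) -> list[tuple[str, bool]]:
    """List files named <name>.exe/.bat/.cmd in the directories of a PATH string.

    Each entry is (full path, launchable), where launchable is True only for
    .exe files, the one form a bare name resolves to without shell=True.
    """
    found = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue
        for extension in (".exe", ".bat", ".cmd"):
            candidate = Path(directory) / f"{name}{extension}"
            if candidate.is_file():
                found.append((str(candidate), extension == ".exe"))
    return found


for path, launchable in find_cli_files("docker", os.environ.get("PATH", "")):
    print(path, "OK" if launchable else "not launchable without shell=True")
```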

Issue 3: Docker Python SDK Connecting to Wrong Host

Problem: The docker-py SDK (used internally by CodeInterpreterTool) defaults on Windows to Docker Engine's own named pipe:

npipe:////./pipe/docker_engine  # Docker's default named pipe on Windows

But Podman on Windows uses a named pipe:

npipe:////./pipe/podman-machine-default

Without proper configuration, the SDK couldn't connect to Podman's socket.

Solution: Set the DOCKER_HOST environment variable:

# Temporary (current session)
$env:DOCKER_HOST = "npipe:////./pipe/podman-machine-default"

# Permanent (User scope) - RECOMMENDED
[System.Environment]::SetEnvironmentVariable(
    "DOCKER_HOST", 
    "npipe:////./pipe/podman-machine-default", 
    "User"
)

Verification:

# Check current value
echo $env:DOCKER_HOST

# Test connection
podman ps  # Should work without errors

Important: After setting permanently, restart your terminal or IDE to pick up the new environment variable.

Issue 4: user_docker_base_url Parameter Ignored

First Attempt (Failed): Passing user_docker_base_url directly to CodeInterpreterTool:

tools=[CodeInterpreterTool(user_docker_base_url="npipe:////./pipe/podman-machine-default")]

Problem: When allow_code_execution=True is set on the Agent (which is required for code execution), CrewAI creates its own internal CodeInterpreterTool instance that ignores any tool you pass in the tools list. This internal instance calls docker.from_env(), which doesn't receive the custom URL.

Solution: Set DOCKER_HOST at the environment level so docker.from_env() picks it up globally:

# In your code (optional, if not set at system level)
import os
os.environ["DOCKER_HOST"] = "npipe:////./pipe/podman-machine-default"

Better Approach: Set DOCKER_HOST permanently at the system level (see Issue 3), so it applies to all processes.

Architectural Insight: This behavior appears to be by design in CrewAI. Creating the tool internally keeps the code execution environment consistent across agents, but it means the connection must be configured at the environment level rather than through tool parameters.
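If the variable is set from code, it helps to avoid overwriting a value that is already configured system-wide. A minimal sketch, assuming the default Podman machine name used throughout this README (`ensure_docker_host` is a hypothetical helper):

```python
import os
from typing import MutableMapping

PODMAN_PIPE = "npipe:////./pipe/podman-machine-default"


def ensure_docker_host(
    env: MutableMapping[str, str], default: str = PODMAN_PIPE
) -> str:
    """Set DOCKER_HOST only if it is missing or empty; return the effective value."""
    if not env.get("DOCKER_HOST"):
        env["DOCKER_HOST"] = default
    return env["DOCKER_HOST"]


# Call this before the crew is constructed so the Docker SDK sees the value.
effective = ensure_docker_host(os.environ)
print(effective)
```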

Issue 5: docker-credential-desktop Not Found

Problem: After uninstalling Docker Desktop, ~/.docker/config.json retained this configuration:

{
  "credsStore": "desktop",
  "auths": {}
}

When docker-py tried to authenticate, it looked for docker-credential-desktop.exe, which no longer exists, causing authentication failures.

Error Message:

Error: docker-credential-desktop not found

Solution: Clean the Docker config file:

# Option 1: Reset to minimal config
Set-Content "$env:USERPROFILE\.docker\config.json" '{"auths": {}}'

# Option 2: Delete the entire config (will be recreated)
Remove-Item "$env:USERPROFILE\.docker\config.json" -Force

# Option 3: Manually edit and remove the "credsStore" line
notepad "$env:USERPROFILE\.docker\config.json"
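Option 3 can also be scripted so that other settings in the file survive. The sketch below only transforms the JSON text; `strip_creds_store` is a hypothetical helper, and reading and writing the real file under `%USERPROFILE%\.docker` is left to the caller.

```python
import json


def strip_creds_store(config_text: str) -> str:
    """Return the config JSON with the top-level "credsStore" key removed."""
    config = json.loads(config_text)
    config.pop("credsStore", None)  # no error if the key is already absent
    return json.dumps(config, indent=2)


original = '{"credsStore": "desktop", "auths": {}}'
print(strip_creds_store(original))
```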

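Prevention: If Docker Desktop is still installed alongside Podman, you may instead scope credential helpers per registry with credHelpers. Each value names a docker-credential-<name> program that must exist on PATH, so treat "podman" below as a placeholder for a helper you actually have installed: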

{
  "auths": {},
  "credHelpers": {
    "registry.example.com": "podman"
  }
}

✅ Complete Setup Checklist

To ensure everything is configured correctly:

Podman Desktop installed and machine running (podman machine list)
docker.exe alias created (Issue 1)
DOCKER_HOST environment variable set permanently (Issue 3)
.docker/config.json cleaned of Desktop artifacts (Issue 5)
Terminal/IDE restarted to pick up environment changes
Verification: docker ps works without errors
Verification: podman ps shows same containers as docker ps
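Several of the checklist items can be expressed as one preflight function. This is a sketch under the assumptions of this README (named-pipe DOCKER_HOST, Docker Desktop's "desktop" credential store as the known offender); `preflight` and the sample values are hypothetical.

```python
import json
from typing import Mapping, Optional


def preflight(
    env: Mapping[str, str],
    docker_path: Optional[str],
    config_text: Optional[str],
) -> list[str]:
    """Return a list of human-readable problems; an empty list means all clear."""
    problems = []
    if not docker_path:
        problems.append("docker CLI not found on PATH (Issue 1)")
    if not env.get("DOCKER_HOST", "").startswith("npipe://"):
        problems.append("DOCKER_HOST is not set to a named pipe (Issue 3)")
    if config_text and json.loads(config_text).get("credsStore") == "desktop":
        problems.append('config.json still has "credsStore": "desktop" (Issue 5)')
    return problems


# Sample values standing in for a real machine's state.
sample_problems = preflight(
    {"DOCKER_HOST": "npipe:////./pipe/podman-machine-default"},
    r"C:\Users\me\.local\bin\docker.exe",
    '{"credsStore": "desktop", "auths": {}}',
)
print(sample_problems)
```

On a real machine the three arguments could come from `os.environ`, `shutil.which("docker")`, and the contents of `.docker/config.json`. Note that `shutil.which` honors PATHEXT, so it may report a `docker.bat` that a list-form subprocess call would not find.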

📦 Installation

Prerequisites

Python >=3.10, <3.13
Podman Desktop (or Docker Desktop)
OpenAI API key

Setup Steps

Clone the repository

git clone <your-repo-url>
cd coder

Install UV package manager

pip install uv

Install dependencies

crewai install

Configure environment variables: create a .env file (see .env.example):

OPENAI_API_KEY=your_api_key_here

If using Podman, configure Docker compatibility (see Podman Integration section)

🚀 Usage

Basic Execution

Run the crew from your terminal:

crewai run

You'll be prompted to enter a coding assignment. The agent will:

Plan the implementation approach
Write the Python code
Execute it in a safe container
Validate and return results

Example Session

Input:

Enter the coding assignment: Calculate the first 10,000 terms of the Leibniz formula for Pi

Output: (output/code_and_output.txt)

# Python program to calculate the first 10,000 terms of the series
total = 0
for i in range(10000):
    term = 1 / (2 * i + 1)
    if i % 2 == 0:
        total += term  # Add for even index
    else:
        total -= term  # Subtract for odd index

# Multiply the total by 4
result = total * 4
print(result)

Execution Output:

3.1414926535900345

The agent successfully:

✅ Understood the mathematical concept (Leibniz formula: π/4 = 1 - 1/3 + 1/5 - 1/7 + ...)
✅ Implemented clean, efficient Python code
✅ Executed it safely in a container
✅ Validated the result (approximates π = 3.14159...)
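The result can be sanity-checked independently. For an alternating series with decreasing terms, the truncation error is smaller than the first omitted term, so after n terms the approximation is within 4/(2n+1) of π. The sketch below (with the hypothetical helper `leibniz_pi`) recomputes the sum and compares the error with that bound:

```python
import math


def leibniz_pi(terms: int) -> float:
    """Approximate pi with the first `terms` terms of the Leibniz series."""
    total = math.fsum((-1) ** i / (2 * i + 1) for i in range(terms))
    return 4 * total


n = 10_000
approx = leibniz_pi(n)
bound = 4 / (2 * n + 1)  # four times the first omitted term
error = abs(math.pi - approx)
print(f"approx={approx:.10f} error={error:.2e} bound={bound:.2e}")
```

The error is roughly 1/n, which is why 10,000 terms yield only about four correct decimal places.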

🔧 Customization

Modifying Agent Behavior

Edit src/coder/config/agents.yaml:

coder:
  role: >
    Senior Python Architect
  goal: >
    Design and implement production-grade Python solutions
  backstory: >
    15 years of experience in scalable system design
  llm: gpt-4  # Upgrade to more powerful model

Adding New Tasks

Edit src/coder/config/tasks.yaml:

code_review_task:
  description: >
    Review the code for best practices, security, and performance
  expected_output: >
    A detailed code review report
  agent: coder
  output_file: output/review.txt

Switching to Hierarchical Process

Modify src/coder/crew.py:

@crew
def crew(self) -> Crew:
    return Crew(
        agents=self.agents,
        tasks=self.tasks,
        process=Process.hierarchical,  # Changed from sequential
        manager_llm="gpt-4",  # Required for hierarchical
        verbose=True,
    )

📂 Project Structure

coder/
├── src/coder/
│   ├── __init__.py
│   ├── main.py              # Entry point, handles user input
│   ├── crew.py              # Crew orchestration and agent config
│   └── config/
│       ├── agents.yaml      # Agent definitions
│       └── tasks.yaml       # Task configurations
├── output/
│   └── code_and_output.txt  # Generated code and results
├── pyproject.toml           # Dependencies and metadata
├── .env.example             # Environment variable template
├── LICENSE                  # MIT License
└── README.md                # This file

🛠️ Troubleshooting

Podman/Docker Connection Issues

Symptom: Error connecting to Docker daemon or docker.errors.DockerException

Quick Checks:

# 1. Verify Podman machine is running
podman machine list
# Should show "Currently running" status

# 2. Check DOCKER_HOST environment variable
echo $env:DOCKER_HOST
# Should output: npipe:////./pipe/podman-machine-default

# 3. Test connection
podman ps
docker ps  # Should show same output

Solutions:

If docker command not found → See Issue 1 in Podman Deep Dive
If DOCKER_HOST is empty → See Issue 3
If docker-credential-desktop error → See Issue 5

Restart Checklist (after configuration changes):

# 1. Stop and restart Podman machine
podman machine stop
podman machine start

# 2. Restart terminal or reload environment
refreshenv  # If using chocolatey
# OR close and reopen terminal

# 3. Verify connection
docker ps

Code Execution Timeouts

Symptom: ERROR: Code execution timed out after 30 seconds

Solution: Increase timeout in crew.py:

@agent
def coder(self) -> Agent:
    return Agent(
        config=self.agents_config["coder"],
        max_execution_time=60,  # Increase from 30 to 60 seconds
        max_retry_limit=5,      # Also increase retries if needed
        # ... other config
    )

For Long-Running Tasks:

max_execution_time=300  # 5 minutes for complex computations
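For intuition about what a wall-clock limit does to a runaway script, here is a generic standard-library sketch. It runs a snippet in a child Python process rather than a container and is not how CrewAI enforces `max_execution_time`; `run_with_timeout` is a hypothetical helper.

```python
import subprocess
import sys
from typing import Optional


def run_with_timeout(code: str, seconds: float) -> Optional[str]:
    """Run a Python snippet in a child process; return stdout, or None on timeout."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=seconds,  # the child is killed when this elapses
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.stdout


print(run_with_timeout("print(2 + 2)", seconds=30))
print(run_with_timeout("import time; time.sleep(5)", seconds=0.5))
```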

Agent Not Executing Code

Symptom: Agent returns code but doesn't run it

Checklist:

Verify allow_code_execution=True in agent definition
Check code_execution_mode="safe" ("unsafe" runs the code directly on the host instead of in a container)
Ensure Podman/Docker is accessible (see connection issues above)
Check agent logs for detailed error messages

API Rate Limits or Slow Response

Symptom: RateLimitError or slow agent responses

Solutions:

Switch to faster/cheaper model in agents.yaml:

coder:
  llm: gpt-3.5-turbo  # Faster, higher rate limits than gpt-4

Add retry with backoff (in your code):

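from tenacity import retry, stop_after_attempt, wait_exponential

from coder.crew import Coder

# Retry the whole crew run up to 3 times, waiting 1-10 s between attempts
@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
def run(inputs: dict):
    result = Coder().crew().kickoff(inputs=inputs)
    return result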
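The same idea can be written without an extra dependency. The sketch below is a simplified stand-in for the decorator approach: `retry_with_backoff` is a hypothetical helper, and its delay schedule (base delay doubled per attempt, capped) is an assumption that need not match tenacity's exact timings.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func until it succeeds, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(min(base_delay * 2 ** attempt, max_delay))
    raise AssertionError("unreachable")


# Demo: fail twice, then succeed; record delays instead of actually sleeping.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("rate limited")
    return "ok"


delays: list[float] = []
result = retry_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```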

Throttle requests with CrewAI's max_rpm setting (on the Agent or the Crew):

# In crew.py
max_rpm=10  # Cap LLM requests per minute

Module Import Errors

Symptom: ModuleNotFoundError or ImportError

Solution: Reinstall dependencies:

# Clean install
uv pip uninstall crewai crewai-tools
crewai install

# OR force reinstall
pip install --force-reinstall crewai==0.108.0 crewai-tools==0.38.1

Output File Not Created

Symptom: output/code_and_output.txt not appearing

Checks:

# 1. Verify output directory exists
Test-Path "output"  # Should return True

# 2. Check file permissions
Get-Acl "output"

# 3. Manually create directory
New-Item -ItemType Directory -Force -Path "output"

Verify task configuration in tasks.yaml:

coding_task:
  output_file: output/code_and_output.txt  # Ensure this path is correct

Windows-Specific Issues

Issue: Path separator problems (backslash vs forward slash)

Solution: CrewAI expects forward slashes (/) even on Windows:

output_file: output/code_and_output.txt  # ✅ Correct
output_file: output\code_and_output.txt  # ❌ May cause issues
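If a path arrives with backslashes (for example from user input or an environment variable), it can be normalized before it is written into a config. A small sketch using `pathlib`; `to_forward_slashes` is a hypothetical helper name:

```python
from pathlib import PureWindowsPath


def to_forward_slashes(path: str) -> str:
    """Convert a Windows-style path to forward slashes without touching the disk."""
    return PureWindowsPath(path).as_posix()


print(to_forward_slashes(r"output\code_and_output.txt"))  # output/code_and_output.txt
```

`PureWindowsPath` parses the string with Windows rules on any operating system, so the helper behaves the same on Linux and macOS.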

Issue: PowerShell execution policy blocks scripts

Solution:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Still Having Issues?

Enable verbose logging: Set verbose=True in crew.py (already enabled in this project)
Check CrewAI logs: Look for detailed error messages in terminal output
Verify OpenAI API key: Confirm OPENAI_API_KEY is set in .env (echo $env:OPENAI_API_KEY shows it only if it is also exported in your shell)
Review Podman Deep Dive: See the complete 5-issue breakdown above
Community support: Visit CrewAI Discord

📝 License

MIT License - see LICENSE file for details

🙏 Acknowledgments

CrewAI - Multi-agent orchestration framework
Podman - Container management alternative to Docker
OpenAI - Language model provider

🔗 Resources

Built with 🤖 Agentic AI and ⚡ CrewAI
