Add comprehensive TDD implementation plan with 4-layer HMAS architecture by mvillmow · Pull Request #1 · HomericIntelligence/ProjectKeystone

mvillmow · 2025-11-17T22:07:22Z

Summary

Complete implementation plan for ProjectKeystone - a high-performance C++20 Hierarchical Multi-Agent System (HMAS). This PR establishes the architectural foundation, development methodology, and detailed roadmap for building the system.

Architecture: 4-Layer Hierarchy

The system uses a 4-layer hierarchical architecture that mirrors the agent development structure:

Level 0: Chief Architect Agent (strategic decisions, system-wide coordination)
Level 1: Component Lead Agent (component-level architecture, module coordination)
Level 2: Module Lead Agent (task decomposition, result synthesis, code review)
Level 3: Task Agent (concrete execution, code implementation, testing)

This creates a self-similar organization where the system structure reflects the development process.
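
As a rough illustration of the hierarchy above, the four levels could be modeled as an enumeration with a delegation rule. This is a minimal sketch, not code from the PR: `Layer`, `layer_name`, and `delegate_of` are hypothetical names, and the rule that each level delegates only to the level directly below it is an assumption about the full four-layer system (Phase 1 of the roadmap wires L0 directly to L3).

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical type: the PR fixes the four roles and their level numbers,
// not this representation.
enum class Layer : int {
    ChiefArchitect = 0,  // Level 0: strategic decisions
    ComponentLead = 1,   // Level 1: component-level architecture
    ModuleLead = 2,      // Level 2: task decomposition, result synthesis
    TaskAgent = 3,       // Level 3: concrete execution
};

// Human-readable role name for a layer, as listed in the PR description.
constexpr std::string_view layer_name(Layer layer) {
    switch (layer) {
        case Layer::ChiefArchitect: return "Chief Architect Agent";
        case Layer::ComponentLead:  return "Component Lead Agent";
        case Layer::ModuleLead:     return "Module Lead Agent";
        case Layer::TaskAgent:      return "Task Agent";
    }
    return "unknown";
}

// Assumed rule: a layer delegates to the layer directly below it.
// Task Agents are leaves and delegate to nobody.
constexpr std::optional<Layer> delegate_of(Layer layer) {
    if (layer == Layer::TaskAgent) return std::nullopt;
    return static_cast<Layer>(static_cast<int>(layer) + 1);
}
```

Under this assumption, `delegate_of(Layer::ChiefArchitect)` yields `Layer::ComponentLead`, and `delegate_of(Layer::TaskAgent)` is empty.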

Development Approach: TDD with E2E Testing

Test-Driven Development with End-to-End tests as primary validation:

Write failing E2E test first (complete workflow)
Implement minimal code to pass test
Refactor and optimize
Incremental complexity: 2 agents → 3 layers → 4 layers → full system
Every commit has working E2E tests
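
To make the first step of the cycle concrete, a two-agent E2E check might look roughly like the sketch below. Everything here is illustrative: `TaskAgent`, `ChiefArchitectAgent`, `execute`, `run`, and the `"done:"` result format are hypothetical, the call is synchronous rather than message-based, and the check is a plain function standing in for what would presumably be a GoogleTest case in the real suite.

```cpp
#include <cassert>
#include <string>

// Hypothetical Level 3 agent: executes one concrete task.
struct TaskAgent {
    std::string execute(const std::string& task) const {
        return "done:" + task;
    }
};

// Hypothetical Level 0 agent: delegates the goal to a single Task Agent
// and returns whatever result comes back.
struct ChiefArchitectAgent {
    const TaskAgent& worker;

    std::string run(const std::string& goal) const {
        return worker.execute(goal);
    }
};

// E2E check for the complete two-agent workflow: a goal goes in at L0,
// a result comes back from L3. Written first, it fails until both
// agents exist and are wired together.
inline bool e2e_two_agent_round_trip() {
    TaskAgent worker;
    ChiefArchitectAgent chief{worker};
    return chief.run("compile") == "done:compile";
}
```

The same check would then grow with each phase (adding L2, then L1) while staying green on every commit.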

Key Documents

Primary Implementation Guides

TDD_FOUR_LAYER_ROADMAP.md - Complete 14-week TDD roadmap with E2E test scenarios
FOUR_LAYER_ARCHITECTURE.md - Complete 4-layer architecture specification
README.md - Overview, navigation, and quick start

Supporting Documentation

modules.md - C++20 module structure (Keystone.Core, Protocol, Agents, Integration)
build-system.md - CMake 3.28+ configuration with C++20 module support
testing-strategy.md - Testing frameworks (GoogleTest, benchmarks, coverage)
risks.md - Risk analysis and mitigation strategies

Implementation Timeline

Total: 14 weeks

Phase 1 (Weeks 1-3): L0 + L3 only - Core infrastructure validation
Phase 2 (Weeks 4-6): Add L2 (Module Lead) - Task synthesis and retry logic
Phase 3 (Weeks 7-9): Add L1 (Component Lead) - Multi-module coordination
Phase 4 (Weeks 10-12): Full multi-component system - Parallel execution
Phase 5 (Weeks 13-14): Performance optimization and chaos testing

Technology Stack

Language: C++20 (modules, coroutines, concepts)
Concurrency: Actor Model with concurrentqueue (lock-free)
Serialization: Cista (zero-copy internal), Protobuf (external gRPC)
AI Integration: ONNX Runtime (local), gRPC clients (remote)
Build: CMake 3.28+, vcpkg
Testing: GoogleTest, Google Benchmark
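
For readers unfamiliar with the Actor Model entry above, the core idea is that each agent owns a mailbox and other agents communicate only by enqueueing messages into it. The sketch below shows that shape with a mutex-guarded `std::queue`. Note the substitution: this is not lock-free and does not use the concurrentqueue library from the stack; `Mailbox`, `send`, and `try_receive` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <utility>

// Mutex-guarded stand-in for a lock-free message queue.
// Senders call send(); the owning agent drains with try_receive().
template <typename Message>
class Mailbox {
public:
    // Enqueue a message; safe to call from any thread.
    void send(Message msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(msg));
    }

    // Dequeue the oldest message, or return an empty optional if none is waiting.
    std::optional<Message> try_receive() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return std::nullopt;
        Message msg = std::move(queue_.front());
        queue_.pop();
        return msg;
    }

private:
    std::mutex mutex_;
    std::queue<Message> queue_;
};
```

Messages come out in the order they were sent, and polling an empty mailbox returns immediately instead of blocking.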

Performance Targets

Throughput: >1M messages/second
Latency: <1ms internal (p99), <10ms external (p99)
Scalability: Linear to 100+ concurrent agents
Test Coverage: >95% core, >90% agents
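
The latency targets are stated as p99 values. One way to check a batch of measurements against the internal target is a nearest-rank percentile, sketched below. The PR does not say how percentiles will be computed, so the nearest-rank method and the names `percentile` and `meets_internal_latency_target` are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-rank percentile: the smallest sample such that at least
// pct percent of all samples are less than or equal to it.
// Precondition: samples is non-empty and 0 < pct <= 100.
inline double percentile(std::vector<double> samples, double pct) {
    std::sort(samples.begin(), samples.end());
    const double n = static_cast<double>(samples.size());
    const auto rank = static_cast<std::size_t>(std::ceil(pct * n / 100.0));
    return samples[std::max<std::size_t>(rank, 1) - 1];
}

// True if the p99 of the given latencies (in milliseconds) is under
// the 1 ms internal target from the list above.
inline bool meets_internal_latency_target(const std::vector<double>& latencies_ms) {
    return percentile(latencies_ms, 99.0) < 1.0;
}
```

With 100 samples, p99 is the 99th smallest value, so a single slow outlier does not fail the target but two do.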

What's Next

Start implementation following TDD_FOUR_LAYER_ROADMAP.md:

Week 1: Set up E2E test infrastructure
Weeks 2-3: Implement L0 + L3 with basic message passing
Weeks 4-6: Add L2 for task synthesis
Continue through phases to full system

Commits Included

ee86a48 - Initial comprehensive implementation plan (7 documents)
f1f297d - Add TDD-first approach with two-agent initial model
06fb634 - Update to 4-layer architecture matching dev agent structure
5c38806 - Remove legacy 3-layer architecture documents

Ready for review and merge to begin implementation! 🚀

Add complete implementation plan for the high-performance C++20 Hierarchical Multi-Agent System (HMAS) including:

- Overall implementation roadmap (README.md)
- System architecture and design patterns (architecture.md)
- Detailed 8-phase implementation timeline (phases.md)
- C++20 module structure and dependencies (modules.md)
- Build system configuration with CMake 3.28+ (build-system.md)
- Comprehensive testing strategy with TDD approach (testing-strategy.md)
- Risk analysis and mitigation strategies (risks.md)

Key Features:

- 3-layer agent hierarchy (Root/Branch/Leaf)
- Actor-model with C++20 coroutines
- Lock-free message passing via concurrentqueue
- Zero-copy serialization (Cista) internally
- gRPC/Protobuf for external AI services
- 20-week implementation timeline
- >95% test coverage targets

The plan emphasizes modern C++20 features including modules, coroutines, and advanced concurrency primitives while maintaining high performance (1M+ msg/sec throughput, <1ms latency).

Major update to implementation strategy:

1. TDD with E2E Testing (TDD_APPROACH.md):
   - E2E tests as primary validation method (not unit tests first)
   - Red-Green-Refactor cycle with complete workflows
   - Write failing E2E test → implement → refactor
   - Every phase starts with E2E test scenario

2. Two-Agent Initial Model (TWO_AGENT_ARCHITECTURE.md):
   - Start simple: Coordinator + Worker (not 3 layers)
   - Validate ALL core infrastructure before scaling
   - Incremental path: 2 agents → multiple workers → 3 layers
   - Message passing, coroutines, state machines proven early

3. Revised Roadmap (README.md):
   - Phase 0: E2E test infrastructure (Week 1)
   - Phase 1: Message passing (Weeks 2-3)
   - Phase 2: Coordinator-Worker pattern (Weeks 4-5)
   - Phase 3: Performance validation (Week 6)
   - Phase 4: External AI integration (Weeks 7-8)
   - Phase 5: Expand to 3-layer hierarchy (Weeks 9-11)
   - Phase 6: Production hardening (Weeks 12-14)

Key Benefits:

✅ Fast validation - working system in 5 weeks
✅ Reduced risk - complexity added incrementally
✅ Always executable - every commit has passing E2E tests
✅ Performance proven early - benchmarks from Week 6
✅ Easier debugging - simple 2-agent system initially

This approach prioritizes "working software over comprehensive documentation" while maintaining rigorous TDD discipline.

Major architecture update from 3-layer to 4-layer hierarchy:

Architecture Change:

OLD (3 layers): L1: Root Agent → L2: Branch Agent → L3: Leaf Agent

NEW (4 layers):

- L0: Chief Architect Agent (strategic decisions)
- L1: Component Lead Agent (component coordination)
- L2: Module Lead Agent (module synthesis)
- L3: Task Agent (concrete execution)

Rationale:

- Mirrors the agent development structure used to build the system
- Clear separation: strategic (L0) → tactical (L1) → operational (L2) → execution (L3)
- Original "Branch" layer was doing 2 jobs (now split into L1 + L2)
- Self-similar architecture (system structure = development structure)

New Documents:

1. FOUR_LAYER_ARCHITECTURE.md
   - Complete 4-layer specification
   - Agent type definitions for each layer
   - Message flow examples
   - Comparison to original 3-layer plan

2. TDD_FOUR_LAYER_ROADMAP.md
   - Incremental build path: 2 agents → 3 layers → 4 layers
   - E2E tests for each phase
   - Phase 1 (Weeks 1-3): L0 + L3 only
   - Phase 2 (Weeks 4-6): Add L2 (Module Lead)
   - Phase 3 (Weeks 7-9): Add L1 (Component Lead)
   - Phase 4 (Weeks 10-12): Full multi-component system
   - Phase 5 (Weeks 13-14): Performance & chaos testing

Updated README.md:

- Points to 4-layer as primary architecture
- Updated documentation structure with new docs

Timeline: 14 weeks (vs original 20 weeks)
Agents: Up to 100+ agents (4 components × 5 modules × 5 tasks)

This architecture provides better scalability, clearer separation of concerns, and more intuitive development workflow.
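
The "100+ agents" figure in this commit message follows from the stated fan-out. The arithmetic can be checked at compile time, as in the sketch below. The product 4 × 5 × 5 gives 100 Task Agents; the total of 125 additionally assumes exactly one Chief Architect, one lead per component, and one lead per module, which the commit message does not state. `Fanout`, `task_agents`, and `total_agents` are hypothetical names.

```cpp
#include <cassert>

// Fan-out at each level of the hierarchy (hypothetical type).
struct Fanout {
    int components;             // L1 agents under the Chief Architect
    int modules_per_component;  // L2 agents under each Component Lead
    int tasks_per_module;       // L3 agents under each Module Lead
};

// Number of L3 Task Agents.
constexpr int task_agents(Fanout f) {
    return f.components * f.modules_per_component * f.tasks_per_module;
}

// All agents, assuming one L0 agent and one lead per component and per module.
constexpr int total_agents(Fanout f) {
    const int component_leads = f.components;
    const int module_leads = f.components * f.modules_per_component;
    return 1 + component_leads + module_leads + task_agents(f);
}

// 4 components x 5 modules x 5 tasks, as given in the commit message.
static_assert(task_agents(Fanout{4, 5, 5}) == 100, "100 Task Agents");
static_assert(total_agents(Fanout{4, 5, 5}) == 125, "1 + 4 + 20 + 100");
```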

Clean up legacy documentation as requested:

Removed:

- architecture.md (3-layer hierarchy - superseded by 4-layer)
- phases.md (original unit-test-first phases - superseded by TDD roadmap)
- TDD_APPROACH.md (general TDD methodology - merged into TDD_FOUR_LAYER_ROADMAP.md)
- TWO_AGENT_ARCHITECTURE.md (redundant - covered in TDD_FOUR_LAYER_ROADMAP.md Phase 1)

Current Documentation:

1. TDD_FOUR_LAYER_ROADMAP.md - Primary implementation guide
2. FOUR_LAYER_ARCHITECTURE.md - System architecture specification
3. README.md - Overview and navigation
4. modules.md - C++20 module structure
5. build-system.md - CMake and toolchain
6. testing-strategy.md - Testing frameworks
7. risks.md - Risk analysis

Benefits:

✅ No legacy/outdated documentation
✅ Single source of truth for architecture (4-layer)
✅ Single source of truth for implementation approach (TDD with E2E)
✅ Clearer navigation for developers

The codebase now contains only the current 4-layer architecture plan with TDD/E2E methodology.

BUILD_DIR was referenced on lines 73, 91, 112, and 119 but never defined. Now defaults to build/release/bin to match Phase 3 CMake output structure.

Fixes minor issue #1 from code review

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

This PR fixes 9 security vulnerabilities identified during comprehensive code review:

## CRITICAL Vulnerabilities Fixed

1. **Use-after-free in ProfilingSession** (profiling.cpp:83-121)
   - Root cause: Map pointer captured before lock release, invalidated by rehashing
   - Fix: Hold shared_lock during entire section access to prevent map modifications
   - Impact: Prevents crash in multi-threaded profiling scenarios
   - File: src/core/profiling.cpp

2. **Integer overflow in LeadAgentBase** (lead_agent_base_impl.hpp:215-225)
   - Root cause: size_t to int cast without bounds checking (size_t can be 2^64-1, int max is 2^31-1)
   - Fix: Check subtasks.size() > INT_MAX before cast, return error if exceeded
   - Impact: Prevents overflow causing negative coordination count
   - File: include/agents/lead_agent_base_impl.hpp

3. **Null pointer dereference + TOCTOU race in PullOrSteal** (pull_or_steal.cpp:43-65)
   - Root cause: queues.size() captured before loop, vector could shrink; null pointers not checked
   - Fix: Capture num_workers once, add combined bounds+null check before access
   - Impact: Prevents segfault in work-stealing scheduler
   - File: src/concurrency/pull_or_steal.cpp

## HIGH Priority Vulnerabilities Fixed

4. **Agent ID space exhaustion** (agent_id_interning.cpp:30-36)
   - Root cause: uint32_t wraps to 0 after 4,294,967,295, causing ID collisions
   - Fix: Check next_id_ == UINT32_MAX before increment, throw overflow_error
   - Impact: Prevents silent ID collision catastrophic failure
   - File: src/core/agent_id_interning.cpp

5. **Configuration validation missing** (config.hpp:147-160)
   - Root cause: Float percentage truncated to size_t without validation
   - Fix: Add static_assert compile-time checks for watermark percentage and result
   - Impact: Catches invalid configuration at compile time
   - File: include/core/config.hpp

## MEDIUM Priority Vulnerabilities Fixed

6. **Memory cleanup failure** (metrics.cpp:46-70)
   - Root cause: Time-based cleanup might not remove enough entries under flood
   - Fix: Add forced removal of oldest entries if time-based cleanup insufficient
   - Impact: Prevents unbounded memory growth under message flood
   - File: src/core/metrics.cpp

7. **Modulo by zero** (pull_or_steal.cpp:43-49)
   - Root cause: num_workers could be 0, causing (worker_index + i) % 0
   - Fix: Check num_workers == 0, return nullopt early
   - Impact: Prevents FPE crash (fixed with CRITICAL #3)
   - File: src/concurrency/pull_or_steal.cpp

## Testing

- Added comprehensive security regression test suite (test_security_regression.cpp)
- 10 new test cases covering all 7 vulnerabilities
- Stress tests: 100 threads × 100 sections × 10 records (ProfilingSession)
- Memory flood test: 20,000 messages (2x limit) for metrics cleanup
- All 476 tests pass with AddressSanitizer + UndefinedBehaviorSanitizer
- Test runtime: 66.33 seconds

## Files Changed

- CMakeLists.txt: Added security regression test file
- src/core/profiling.cpp: Fixed use-after-free with shared_lock
- include/agents/lead_agent_base_impl.hpp: Added integer overflow check
- src/concurrency/pull_or_steal.cpp: Fixed null pointer + TOCTOU + modulo by zero
- src/core/agent_id_interning.cpp: Added agent ID overflow check
- include/core/config.hpp: Added compile-time validation with static_assert
- src/core/metrics.cpp: Added forced cleanup for memory leak prevention
- tests/unit/test_security_regression.cpp: NEW - Comprehensive regression tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
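
Three of the fixes described above are simple guard patterns, sketched here in isolation. This is not the code from the listed files, which this page does not show: `checked_count`, `IdAllocator`, and `victim_index` are hypothetical names, and only the checks themselves (compare against `INT_MAX` before casting, compare against `UINT32_MAX` before incrementing, test for zero before taking a modulus) come from the description.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>

// Pattern from fix 2: bounds-check a size_t before narrowing it to int.
inline std::optional<int> checked_count(std::size_t n) {
    if (n > static_cast<std::size_t>(INT_MAX)) return std::nullopt;
    return static_cast<int>(n);
}

// Pattern from fix 4: refuse to hand out IDs once the counter would wrap.
class IdAllocator {
public:
    explicit IdAllocator(std::uint32_t start = 0) : next_id_(start) {}

    // Returns the next ID, or throws instead of wrapping around to 0.
    std::uint32_t next() {
        if (next_id_ == UINT32_MAX) {
            throw std::overflow_error("agent ID space exhausted");
        }
        return next_id_++;
    }

private:
    std::uint32_t next_id_;
};

// Pattern from fix 7: return early when there are no workers, so the
// modulus below never divides by zero.
inline std::optional<std::size_t> victim_index(std::size_t worker_index,
                                               std::size_t i,
                                               std::size_t num_workers) {
    if (num_workers == 0) return std::nullopt;
    return (worker_index + i) % num_workers;
}
```

In each case the guard turns undefined or silently wrong behavior into an explicit empty result or exception that the caller has to handle.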

claude added 4 commits November 17, 2025 21:06

mvillmow merged commit 97a0e09 into main Nov 17, 2025

mvillmow deleted the claude/plan-architect-orchestrator-01HfTcHYACkRCX5gX5L5uMxg branch November 19, 2025 23:48

mvillmow merged 4 commits into main from claude/plan-architect-orchestrator-01HfTcHYACkRCX5gX5L5uMxg
