Pipeline Design 203

ezigus edited this page Mar 20, 2026 · 1 revision

ADR written to .claude/pipeline-artifacts/design.md. Here's the summary:

Decision: Source-based extraction (Approach A) — three new lib/ modules sourced by a thin orchestrator.

Key architectural points:

  1. Three modules with clear single-responsibility boundaries:

    • recruit-role-manager.sh (~900 lines) — role lifecycle, matching, evolution
    • recruit-team-composer.sh (~500 lines) — team assembly, routing, decomposition
    • recruit-config-validator.sh (~900 lines) — feedback, audit, profiles, meta-learning
  2. Strict source order (role-manager → team-composer → config-validator) respects the acyclic dependency graph. No circular deps exist.

  3. Zero blast radius on callers — all 9 external scripts continue invoking bash sw-recruit.sh <cmd> unchanged. The CLI router stays in the orchestrator.

  4. Shared state ownership — orchestrator owns all 10 storage paths, 9 policy vars, and 4 utility functions. Modules read these but never declare them.

  5. Source guards (_RECRUIT_*_LOADED) follow the established lib/ convention seen in decide-signals.sh, daemon-state.sh, policy.sh, etc.

  6. Error handling inherited — modules get set -euo pipefail from the orchestrator and must not set their own.

  • Shared mutable state: 10 storage path variables (ROLES_DB, PROFILES_DB, etc.) and 9 policy variables must be visible to all modules

Decision

Source-based extraction into three lib/ modules. sw-recruit.sh becomes a thin orchestrator (~350 lines) that owns shared state, utility functions, and the CLI router. It sources the three modules in dependency order.

Source Order (strict)

sw-recruit.sh (lines 1-150: shared state, utilities)
  → source lib/recruit-role-manager.sh      (no module deps)
  → source lib/recruit-team-composer.sh      (depends on role-manager)
  → source lib/recruit-config-validator.sh   (depends on role-manager)
  → CLI router (case statement, ~40 lines)
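The source order above can be tried in isolation. The following is a minimal sketch, assuming a throwaway directory with stub modules: the stub function bodies and the role name "builder" are hypothetical and exist only to show that a team-composer function can call a role-manager function once all three modules are loaded.

```shell
#!/usr/bin/env bash
# Sketch only: stub modules stand in for the real lib/ files.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/lib"

# Stub role-manager: defines a matching function (hypothetical body).
cat > "$demo_dir/lib/recruit-role-manager.sh" <<'EOF'
_recruit_keyword_match() { echo "builder"; }
EOF

# Stub team-composer: calls into role-manager.
cat > "$demo_dir/lib/recruit-team-composer.sh" <<'EOF'
cmd_team() { echo "team lead: $(_recruit_keyword_match "$1")"; }
EOF

# Stub config-validator: no real behavior is needed for this sketch.
cat > "$demo_dir/lib/recruit-config-validator.sh" <<'EOF'
cmd_stats() { echo "stats"; }
EOF

# Orchestrator side: source the modules in dependency order.
for module in recruit-role-manager recruit-team-composer recruit-config-validator; do
    source "$demo_dir/lib/$module.sh"
done

team_output="$(cmd_team "add login page")"
echo "$team_output"
rm -rf "$demo_dir"
```

Because the orchestrator sources every module before the CLI router runs, any cmd_* function can rely on the functions of the modules loaded before it.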

Component Diagram

+------------------------------------------------------------------+
|                      sw-recruit.sh (orchestrator)                 |
|  Shared state: ROLES_DB, PROFILES_DB, TALENT_DB, etc. (10 vars) |
|  Policy vars: RECRUIT_CONFIDENCE_THRESHOLD, etc. (9 vars)        |
|  Utilities: _recruit_locked_write, _recruit_call_claude, etc.    |
|  Sources: lib/compat.sh, lib/helpers.sh, sw-intelligence.sh      |
|  CLI router: case $cmd in roles|match|team|... esac              |
+--------+-----------------+-----------------+---------------------+
         | source           | source           | source
         v                  v                  v
+------------------+ +------------------+ +----------------------+
| recruit-role-    | | recruit-team-    | | recruit-config-      |
| manager.sh       | | composer.sh      | | validator.sh         |
| (~900 lines)     | | (~500 lines)     | | (~900 lines)         |
|                  | |                  | |                      |
| Role defs/init   | | cmd_team         | | cmd_record_outcome   |
| Matching engine  | | cmd_route        | | cmd_ingest_pipeline  |
| cmd_roles        | | cmd_specialize   | | cmd_evaluate         |
| cmd_match        | | cmd_decompose    | | cmd_profiles/promote |
| cmd_create_role  | |                  | | cmd_onboard/mind     |
| cmd_evolve       | | Depends on:      | | cmd_reflect/audit    |
| cmd_invent       | | role-manager     | | cmd_stats/help       |
| cmd_self_tune    | | (matching fns)   | |                      |
|                  | |                  | | Depends on:          |
| No module deps   | |                  | | role-manager         |
+------------------+ +------------------+ +----------------------+

External callers (UNCHANGED — zero blast radius):
  sw-pm.sh, sw-pipeline.sh, sw-triage.sh, sw-swarm.sh,
  sw-autonomous.sh, sw-loop.sh, lib/pipeline-stages-build.sh
  All invoke: bash sw-recruit.sh <cmd> [args]

Module Boundaries

lib/recruit-role-manager.sh (~900 lines, 12 functions) — Role lifecycle:

  • Role definitions (initialize_builtin_roles)
  • Matching engine (_recruit_keyword_match, _recruit_llm_match, _recruit_record_match, cmd_match)
  • Role CRUD (cmd_roles, cmd_create_role)
  • Evolution (cmd_evolve, _recruit_track_role_usage, _recruit_compute_population_stats)
  • Self-tuning (cmd_self_tune)
  • Invention (cmd_invent)

lib/recruit-team-composer.sh (~500 lines, 4 functions) — Team assembly:

  • Team composition (cmd_team)
  • Task routing (cmd_route)
  • Specialization listing (cmd_specializations)
  • Goal decomposition (cmd_decompose)

lib/recruit-config-validator.sh (~900 lines, 14 functions) — Feedback, audit, profiles:

  • Outcome recording (cmd_record_outcome, cmd_ingest_pipeline)
  • Evaluation (cmd_evaluate)
  • Profiles & promotion (cmd_profiles, cmd_promote, cmd_onboard)
  • Meta-learning (_recruit_meta_learning_check, cmd_reflect, _recruit_reflect, _recruit_meta_validate_self_tune)
  • Theory of mind (cmd_mind)
  • Audit & stats (cmd_audit, cmd_stats, cmd_help)

Source Guards

Each module uses the established lib/ convention:

[[ -n "${_RECRUIT_ROLE_MANAGER_LOADED:-}" ]] && return 0
_RECRUIT_ROLE_MANAGER_LOADED=1
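
The effect of the guard can be checked with a throwaway file. This is a sketch under stated assumptions: the temp file stands in for a real module, and LOAD_COUNT is a hypothetical counter added only to make the second source observable.

```shell
#!/usr/bin/env bash
# Sketch only: a temp file plays the role of lib/recruit-role-manager.sh.
unset _RECRUIT_ROLE_MANAGER_LOADED

guard_demo="$(mktemp)"
cat > "$guard_demo" <<'EOF'
[[ -n "${_RECRUIT_ROLE_MANAGER_LOADED:-}" ]] && return 0
_RECRUIT_ROLE_MANAGER_LOADED=1
LOAD_COUNT=$((LOAD_COUNT + 1))   # hypothetical counter, for illustration only
EOF

LOAD_COUNT=0
source "$guard_demo"
source "$guard_demo"   # the second source returns early at the guard

echo "module body ran $LOAD_COUNT time(s)"
rm -f "$guard_demo"
```

The `:-` default in the guard keeps the test safe under set -u, which the modules inherit from the orchestrator.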

What Stays in the Orchestrator

  • Shebang, set -euo pipefail, ERR trap
  • SCRIPT_DIR, RECRUIT_VERSION
  • Dependency check (jq)
  • source lib/compat.sh, source lib/helpers.sh, output fallbacks
  • _recruit_locked_write() (used by all modules)
  • All 10 storage path variables (RECRUIT_ROOT, ROLES_DB, etc.)
  • _recruit_policy() and all 9 policy variables
  • ensure_recruit_dir()
  • Intelligence engine source + _recruit_has_claude(), _recruit_call_claude()
  • Three source statements for the modules
  • CLI router (case statement)

Interface Contracts

# ── Orchestrator → Modules (shared state contract) ──
# These variables MUST be set before sourcing any module:
#   RECRUIT_ROOT, ROLES_DB, PROFILES_DB, TALENT_DB, ONBOARDING_DB,
#   MATCH_HISTORY, ROLE_USAGE_DB, HEURISTICS_DB, AGENT_MINDS_DB,
#   INVENTED_ROLES_LOG, META_LEARNING_DB
#   RECRUIT_CONFIDENCE_THRESHOLD, RECRUIT_MAX_MATCH_HISTORY,
#   RECRUIT_META_ACCURACY_FLOOR, RECRUIT_LLM_TIMEOUT,
#   RECRUIT_DEFAULT_MODEL, RECRUIT_SELF_TUNE_MIN_MATCHES,
#   RECRUIT_PROMOTE_TASKS, RECRUIT_PROMOTE_SUCCESS,
#   RECRUIT_AUTO_EVOLVE_AFTER

# ── Orchestrator → Modules (shared functions) ──
# ensure_recruit_dir()                              → void
# _recruit_locked_write(target: path, tmp: path)    → void
# _recruit_has_claude()                              → exit 0 (available) or 1
# _recruit_call_claude(prompt: string, model?: str)  → stdout: string

# ── role-manager exports (consumed by team-composer, config-validator) ──
# initialize_builtin_roles()                         → void (idempotent)
# _recruit_keyword_match(task: string)               → stdout: role_name
# _recruit_track_role_usage(role: string, outcome: string) → void
# cmd_roles()                                        → stdout: text
# cmd_match(args...)                                 → stdout: JSON (--json) or text
# cmd_create_role(args...)                           → stdout: confirmation
# cmd_evolve()                                       → stdout: report
# cmd_invent()                                       → stdout: report
# cmd_self_tune()                                    → stdout: report

# ── team-composer exports (no downstream module consumers) ──
# cmd_team(args...)                                  → stdout: JSON (--json) or text
# cmd_route(args...)                                 → stdout: text
# cmd_specializations()                              → stdout: text
# cmd_decompose(args...)                             → stdout: JSON

# ── config-validator exports (no downstream module consumers) ──
# cmd_record_outcome(args...)                        → void + stdout confirmation
# cmd_ingest_pipeline(args...)                       → void + stdout confirmation
# cmd_evaluate(args...)                              → stdout: report
# cmd_profiles()                                     → stdout: report
# cmd_promote(args...)                               → stdout: confirmation
# cmd_onboard(args...)                               → stdout: confirmation
# cmd_mind(args...)                                  → stdout: JSON/text
# cmd_reflect()                                      → stdout: report
# cmd_audit()                                        → stdout: report (exit 0 pass, 1 fail)
# cmd_stats()                                        → stdout: report
# cmd_help()                                         → stdout: usage text
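
The shared state contract could be checked before the modules are sourced. The sketch below is one possible approach, not part of the ADR: recruit_check_contract is a hypothetical helper, and the example values assigned to the variables are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: recruit_check_contract is a hypothetical helper name.
# Reports any unset or empty variables on stderr; returns 1 if any are missing.
recruit_check_contract() {
    local name missing=""
    for name in "$@"; do
        # ${!name:-} is indirect expansion: the value of the variable named by $name.
        if [[ -z "${!name:-}" ]]; then
            missing="$missing $name"
        fi
    done
    if [[ -n "$missing" ]]; then
        echo "unset before sourcing modules:$missing" >&2
        return 1
    fi
    return 0
}

# Placeholder values; the real paths and defaults are set by the orchestrator.
ROLES_DB="/tmp/recruit-demo/roles.json"
RECRUIT_LLM_TIMEOUT="30"
unset PROFILES_DB

recruit_check_contract ROLES_DB RECRUIT_LLM_TIMEOUT && echo "contract ok"
recruit_check_contract ROLES_DB PROFILES_DB || echo "contract violated"
```

Passing variable names rather than values lets the error message say which part of the contract is missing.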

Data Flow

External caller → bash sw-recruit.sh <cmd> [args]
                         │
                   Orchestrator
                   ├─ Load env, storage paths, policy vars
                   ├─ source compat.sh, helpers.sh
                   ├─ Define shared utilities
                   ├─ source recruit-role-manager.sh
                   ├─ source recruit-team-composer.sh
                   ├─ source recruit-config-validator.sh
                   ├─ ensure_recruit_dir
                   └─ Route cmd → cmd_* function
                         │
                   cmd_* executes
                   ├─ reads/writes ~/.shipwright/recruitment/*.json
                   ├─ may call _recruit_call_claude() for LLM matching
                   └─ outputs JSON or text to stdout
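
The final routing step can be sketched as a case statement. Everything here is illustrative: the cmd_* bodies are stubs, recruit_route is a hypothetical wrapper name, and returning 1 for an unknown command is an assumption based on the exit code convention described in this document.

```shell
#!/usr/bin/env bash
# Sketch only: stub cmd_* functions stand in for the module implementations.
cmd_roles() { echo "roles: (stub)"; }
cmd_match() { echo "match: $*"; }
cmd_help()  { echo "usage: sw-recruit.sh <cmd> [args]"; }

# recruit_route is a hypothetical name; the ADR only states that the
# router is a case statement kept in the orchestrator.
recruit_route() {
    local cmd="${1:-help}"
    if [ "$#" -gt 0 ]; then shift; fi
    case "$cmd" in
        roles) cmd_roles "$@" ;;
        match) cmd_match "$@" ;;
        help)  cmd_help ;;
        *)
            echo "unknown command: $cmd" >&2
            cmd_help
            return 1   # assumption: unknown commands count as failures
            ;;
    esac
}

recruit_route match "fix flaky test"
recruit_route bogus 2>/dev/null || echo "exit status: $?"
```

Keeping the router in the orchestrator means callers never need to know which module implements a subcommand.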

Cross-Module Dependencies (acyclic)

orchestrator (shared state + utilities)
  │
  ├──▶ recruit-role-manager.sh
  │      reads: ROLES_DB, HEURISTICS_DB, MATCH_HISTORY, ROLE_USAGE_DB
  │      calls: ensure_recruit_dir, _recruit_locked_write,
  │             _recruit_has_claude, _recruit_call_claude
  │      exports: initialize_builtin_roles, _recruit_keyword_match,
  │               _recruit_track_role_usage
  │
  ├──▶ recruit-team-composer.sh
  │      reads: ROLES_DB
  │      calls from orchestrator: _recruit_has_claude, _recruit_call_claude
  │      calls from role-manager: initialize_builtin_roles, _recruit_keyword_match
  │
  └──▶ recruit-config-validator.sh
         reads: ROLES_DB, PROFILES_DB, TALENT_DB, ONBOARDING_DB,
                META_LEARNING_DB, AGENT_MINDS_DB
         calls from orchestrator: ensure_recruit_dir, _recruit_locked_write
         calls from role-manager: initialize_builtin_roles, _recruit_track_role_usage

No circular dependencies. team-composer and config-validator are independent of each other.
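
One way to sanity-check the dependency order is to feed the edges to tsort. This is a sketch, assuming the POSIX tsort utility is installed; the shortened module names and the position helper are hypothetical, and the order of team-composer relative to config-validator in the output is implementation-dependent since they are independent.

```shell
#!/usr/bin/env bash
# Sketch only: each line is "dependency dependent", using shortened names.
edges="orchestrator role-manager
role-manager team-composer
role-manager config-validator"

# tsort prints one valid topological order, one name per line.
order="$(printf '%s\n' "$edges" | tsort)"
echo "$order"

# position NAME: 1-based line number of NAME in the computed order.
position() { printf '%s\n' "$order" | grep -n -x -- "$1" | cut -d: -f1; }

if [ "$(position role-manager)" -lt "$(position team-composer)" ] &&
   [ "$(position role-manager)" -lt "$(position config-validator)" ]; then
    echo "role-manager loads before both dependents"
fi
```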

Error Boundaries

  • Orchestrator: owns set -euo pipefail and the ERR trap. Modules inherit this — they must not set their own pipefail or traps.
  • Each cmd_* function: validates its own arguments, prints usage to stderr, returns non-zero on bad input.
  • LLM failures: _recruit_call_claude returns empty string on failure. Callers in role-manager and team-composer fall back to keyword/heuristic matching. Unchanged from current behavior.
  • File I/O: _recruit_locked_write handles concurrent access via flock. Individual JSON operations use jq --arg for safe escaping (never string interpolation).
  • Unknown commands: the CLI router's * case prints an error and calls cmd_help.
  • Module load failures: if a module file is missing, source will fail under set -e, producing a clear error. No silent degradation.
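
The LLM fallback boundary can be sketched as follows. Both functions are stubs with hypothetical bodies, match_with_fallback is a hypothetical name for the pattern, and the role names "tester" and "builder" are placeholders; the real _recruit_call_claude and _recruit_keyword_match live in the orchestrator and role-manager respectively.

```shell
#!/usr/bin/env bash
# Sketch only: stubs simulate an LLM failure and a keyword matcher.
_recruit_call_claude() { echo ""; }   # empty string signals failure
_recruit_keyword_match() {
    case "$1" in
        *test*) echo "tester" ;;
        *)      echo "builder" ;;
    esac
}

# match_with_fallback is a hypothetical name for the pattern described above.
match_with_fallback() {
    local task="$1" role
    # '|| true' keeps a non-zero exit from aborting the caller under set -e.
    role="$(_recruit_call_claude "Pick a role for: $task" || true)"
    if [[ -z "$role" ]]; then
        role="$(_recruit_keyword_match "$task")"
    fi
    echo "$role"
}

match_with_fallback "fix flaky test"
```

Treating an empty string as the failure signal keeps the caller's logic to a single test, whatever the reason the LLM call produced nothing.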

API Design Note

These are not HTTP endpoints — they are bash CLI subcommands. The relevant considerations:

  • Endpoint Specification: N/A (CLI). Each subcommand documented in cmd_help.
  • Error Codes: Exit 0 on success, exit 1 on failure, error text on stderr.
  • Rate Limiting: N/A. Concurrent access handled by _recruit_locked_write file locking.
  • Versioning: RECRUIT_VERSION="3.0.0" in orchestrator. No breaking CLI changes in this refactor.

Alternatives Considered

  1. Standalone scripts with subprocess calls — Each module is an independent script invoked via bash. Pros: strong process isolation, independent error handling. Cons: breaks shared state (requires serializing 10 DB paths + 9 policy vars as env vars or arguments), significant performance overhead from subprocess spawning (documented failure pattern in this repo — fork limit issues in pipeline-state.sh), major rewrite of internal data flow. Rejected: blast radius too high for a refactoring task.

  2. Single-file refactor with section markers — Keep one file, add clear # === SECTION === comments. Pros: zero risk. Cons: doesn't achieve the goal — no independent testability, no reduced cognitive load, still 2,644 lines. Rejected: doesn't solve the problem.

Implementation Plan

  • Files to create:

    • scripts/lib/recruit-role-manager.sh — role lifecycle functions (from lines 156-902, 1376-1488, 1755-1999)
    • scripts/lib/recruit-team-composer.sh — team assembly functions (from lines 909-1181, 1654-1749)
    • scripts/lib/recruit-config-validator.sh — feedback/audit functions (from lines 560-740, 1187-1370, 1494-1648, 2002-2605)
    • scripts/sw-role-manager-test.sh — ~10 unit tests for role-manager functions
    • scripts/sw-team-composer-test.sh — ~10 unit tests for team-composer functions
    • scripts/sw-config-validator-test.sh — ~10 unit tests for config-validator functions
  • Files to modify:

    • scripts/sw-recruit.sh — remove extracted functions, add three source statements, keep shared state + CLI router (~350 lines remaining)
    • scripts/sw-recruit-test.sh — Section 18 static greps check for sw-recruit.sh references in caller scripts (not function defs in recruit itself), so most pass unchanged. Any greps that search for function definitions inside sw-recruit.sh must be updated to also search lib/recruit-*.sh.
  • Dependencies: None new. Existing: jq, lib/compat.sh, lib/helpers.sh, sw-intelligence.sh.

  • Risk areas:

    • Source order violation: If the modules are sourced in the wrong order, _recruit_keyword_match or initialize_builtin_roles may be undefined when team-composer or config-validator calls them. Mitigation: strict source order, integration test.
    • Shared variable scope: A storage or policy variable could accidentally be moved into a module. Mitigation: orchestrator owns all variable declarations.
    • Double-sourcing: A test may source both sw-recruit.sh and a module directly. Mitigation: _LOADED guards.
    • Section 18 test greps: Currently grep for sw-recruit.sh references in caller scripts — these pass unchanged. Future static analysis should search lib/recruit-*.sh too.

Validation Criteria

  • scripts/sw-recruit.sh is under 400 lines (down from 2,644)
  • All 80+ existing tests in scripts/sw-recruit-test.sh pass with zero modifications to test logic
  • Each module has a _RECRUIT_*_LOADED source guard matching the lib/ convention
  • No module contains set -euo pipefail or its own ERR trap
  • npm test passes with no regressions
  • All 9 external callers work without modification (verified by Section 18 integration tests)
  • No Bash 3.2 incompatibilities (no associative arrays, no readarray, no ${var,,})
  • Cross-module function calls resolve correctly: cmd_team can call initialize_builtin_roles and _recruit_keyword_match
  • New unit test files exist and pass: sw-role-manager-test.sh, sw-team-composer-test.sh, sw-config-validator-test.sh
  • No circular dependencies between modules (DAG: orchestrator → role-manager → {team-composer, config-validator})
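
Several of these criteria lend themselves to simple static checks. The following is a rough sketch, not the project's actual test code: the helper names are hypothetical, and the patterns are plain greps rather than a full Bash parser, so they can miss unusual spellings.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Succeeds when FILE has fewer than LIMIT lines.
under_line_limit() {
    local file="$1" limit="$2" count
    count=$(( $(wc -l < "$file") ))   # arithmetic strips wc's padding
    [ "$count" -lt "$limit" ]
}

# Succeeds when FILE sets its own pipefail options or ERR trap (a violation).
module_sets_own_options() {
    grep -Eq '^[[:space:]]*(set[[:space:]]+-euo[[:space:]]+pipefail|trap[[:space:]].*ERR)' "$1"
}

# Succeeds when FILE uses constructs that Bash 3.2 lacks.
uses_bash4_features() {
    grep -Eq 'declare[[:space:]]+-A|readarray|\$\{[A-Za-z_][A-Za-z0-9_]*,,\}' "$1"
}

# Demo against a tiny stand-in module.
demo_module="$(mktemp)"
printf '%s\n' '_RECRUIT_DEMO_LOADED=1' 'cmd_demo() { echo ok; }' > "$demo_module"

under_line_limit "$demo_module" 400    && echo "line count: ok"
module_sets_own_options "$demo_module" || echo "no own set/trap: ok"
uses_bash4_features "$demo_module"     || echo "bash 3.2 safe: ok"
rm -f "$demo_module"
```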
