6 changes: 5 additions & 1 deletion SPECS/ARCHIVE/INDEX.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# mcpbridge-wrapper Tasks Archive

**Last Updated:** 2026-02-18 (P13-T4)
**Last Updated:** 2026-02-18 (P13-T5)

## Archived Tasks

@@ -108,6 +108,7 @@
| P13-T2 | [P13-T2_Implement_persistent_broker_daemon/](P13-T2_Implement_persistent_broker_daemon/) | 2026-02-17 | PASS |
| P13-T3 | [P13-T3_Implement_multi-client_transport_and_JSON-RPC_multiplexing/](P13-T3_Implement_multi-client_transport_and_JSON-RPC_multiplexing/) | 2026-02-18 | PASS |
| P13-T4 | [P13-T4_Add_stdio_proxy_mode/](P13-T4_Add_stdio_proxy_mode/) | 2026-02-18 | PASS |
| P13-T5 | [P13-T5_Validate_prompt_reduction_and_multi_client_stability/](P13-T5_Validate_prompt_reduction_and_multi_client_stability/) | 2026-02-18 | PARTIAL |

## Historical Artifacts

@@ -179,6 +180,7 @@
| [REVIEW_P13-T2_broker_daemon.md](_Historical/REVIEW_P13-T2_broker_daemon.md) | Review report for P13-T2 |
| [REVIEW_P13-T3_transport_multiplexing.md](_Historical/REVIEW_P13-T3_transport_multiplexing.md) | Review report for P13-T3 |
| [REVIEW_P13-T4_stdio_proxy_mode.md](_Historical/REVIEW_P13-T4_stdio_proxy_mode.md) | Review report for P13-T4 |
| [REVIEW_P13-T5_prompt_reduction_multi_client_stability.md](_Historical/REVIEW_P13-T5_prompt_reduction_multi_client_stability.md) | Review report for P13-T5 |

## Archive Log

@@ -312,3 +314,5 @@
| 2026-02-18 | P13-T3 | Archived REVIEW_P13-T3_transport_multiplexing report |
| 2026-02-18 | P13-T4 | Archived Add_stdio_proxy_mode (PASS) |
| 2026-02-18 | P13-T4 | Archived REVIEW_P13-T4_stdio_proxy_mode report |
| 2026-02-18 | P13-T5 | Archived Validate_prompt_reduction_and_multi_client_stability (PARTIAL) |
| 2026-02-18 | P13-T5 | Archived REVIEW_P13-T5_prompt_reduction_multi_client_stability report |
@@ -0,0 +1,95 @@
# PRD: P13-T5 — Validate prompt reduction and multi-client stability

**Status:** IN PROGRESS
**Priority:** P1
**Branch:** `feature/P13-T5-prompt-reduction-multi-client-stability`
**Depends on:** P13-T4 ✅

---

## 1. Overview

P13-T4 introduced broker proxy mode so short-lived MCP client processes can forward through one long-lived broker session. P13-T5 validates that behavior under realistic usage patterns and captures evidence that broker mode reduces upstream bridge churn (and therefore reduces repeated Xcode permission prompts).

This task delivers a reproducible integration test suite, a process-churn comparison report, and a manual validation report that confirms observed prompt behavior while the broker remains running.

---

## 2. Scope

### In-scope
- Add integration tests for sequential and concurrent short-lived proxy clients that share one broker-owned upstream bridge.
- Add assertions/evidence capture for upstream bridge lifecycle stability during client churn.
- Add manual validation report documenting Xcode permission prompt behavior in direct mode vs broker mode.
- Add regression validation commands and results to the task validation report.

### Out-of-scope
- Broker mode documentation rollout/migration guides (P13-T6).
- New broker runtime features unrelated to validation.
- Non-Unix transport behavior.

---

## 3. Design

### 3.1 Test strategy

Create `tests/integration/test_broker_multi_client.py` with targeted scenarios:

1. Sequential short-lived clients:
   - Launch N short-lived proxy client sessions against one broker.
   - Verify all sessions complete successfully.
   - Verify upstream bridge process identity/count stays stable.

2. Concurrent client load:
   - Launch M concurrent client sessions.
   - Verify responses are routed correctly and no cross-talk/corruption occurs.
   - Verify no unexpected broker/upstream restarts during load.
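
The two scenarios above can be illustrated with a minimal in-process sketch. It is not the project's test code: `FakeBroker`, `run_client_session`, `run_sequential`, and `run_concurrent` are hypothetical names, and the broker is modeled as a plain object rather than a daemon behind a Unix socket. The sketch only shows the shape of the assertions: one upstream start regardless of client count, and each response carrying the id of the request that produced it.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeBroker:
    """Hypothetical in-process stand-in for the broker (not the project's API)."""

    def __init__(self) -> None:
        self.launch_count = 0  # number of simulated upstream starts
        self._upstream_started = False
        self._lock = threading.Lock()

    def _ensure_upstream(self) -> None:
        # Start the simulated upstream at most once, even if clients race.
        with self._lock:
            if not self._upstream_started:
                self._upstream_started = True
                self.launch_count += 1

    def handle(self, request: dict) -> dict:
        # Echo the request id so callers can detect cross-talk.
        self._ensure_upstream()
        return {"jsonrpc": "2.0", "id": request["id"], "result": "ok"}


def run_client_session(broker: FakeBroker, request_id: int) -> dict:
    # Models one short-lived client: send a single request, then exit.
    return broker.handle({"jsonrpc": "2.0", "id": request_id, "method": "ping"})


def run_sequential(broker: FakeBroker, n: int) -> list[dict]:
    # N sessions, one after another, against the same broker.
    return [run_client_session(broker, i) for i in range(n)]


def run_concurrent(broker: FakeBroker, m: int) -> list[dict]:
    # M sessions in parallel; pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=m) as pool:
        return list(pool.map(lambda i: run_client_session(broker, i), range(m)))
```

In this model a test would call `run_sequential(broker, 12)` and `run_concurrent(broker, 8)` on one `FakeBroker`, then check that every response id matches its request id and that `broker.launch_count` is still 1. The real integration tests exercise actual processes and sockets, which this sketch deliberately leaves out.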

### 3.2 Metrics artifact

Create a metrics artifact in the task archive that compares:
- Direct mode: upstream process starts for N short-lived sessions.
- Broker mode: upstream process starts for N short-lived sessions.

The artifact should include commands/inputs, observed counts, and interpretation.

### 3.3 Manual validation artifact

Create a manual validation report that records:
- Environment (macOS, Python, Xcode version)
- Steps to reproduce for direct mode and broker mode
- Prompt observations for both modes
- Result verdict aligned to acceptance criteria

---

## 4. File changes

| File | Change |
|------|--------|
| `tests/integration/test_broker_multi_client.py` | Add broker multi-client stability integration tests |
| `SPECS/INPROGRESS/P13-T5_Validation_Report.md` | Record quality gate outcomes and acceptance-criteria evidence |
| `SPECS/INPROGRESS/P13-T5_manual_prompt_validation.md` | Manual direct-vs-broker prompt behavior notes |

---

## 5. Acceptance criteria

- [ ] Sequential short-lived clients reuse one broker-owned upstream bridge process
- [ ] Concurrent client tool calls remain stable under load
- [ ] Manual test confirms no extra Xcode prompt while broker stays running
- [ ] Regression suite passes with broker mode enabled

---

## 6. Quality gates

- `pytest` — all tests pass
- `ruff check src/ tests/` — no lint errors
- `mypy src/` — no new type errors
- `pytest --cov` — coverage ≥ 90%

---
**Archived:** 2026-02-18
**Verdict:** PARTIAL
@@ -0,0 +1,44 @@
# Validation Report: P13-T5 — Validate prompt reduction and multi-client stability

**Date:** 2026-02-18
**Branch:** `feature/P13-T5-prompt-reduction-multi-client-stability`
**Verdict:** PARTIAL

---

## Quality Gates

| Gate | Result | Details |
|------|--------|---------|
| `pytest` | ✅ PASS | 577 passed, 5 skipped |
| `ruff check src/ tests/` | ✅ PASS | All checks passed |
| `mypy src/` | ✅ PASS | Success: no issues found in 18 source files |
| `pytest --cov` ≥ 90% | ✅ PASS | 92.31% total coverage |

---

## Acceptance Criteria

| Criterion | Status | Evidence |
|-----------|--------|----------|
| Sequential short-lived clients reuse one broker-owned upstream bridge process | ✅ | `tests/integration/test_broker_multi_client.py::test_sequential_short_lived_clients_reuse_single_upstream_bridge` |
| Concurrent client tool calls remain stable under load | ✅ | `tests/integration/test_broker_multi_client.py::test_concurrent_clients_remain_stable_under_load` |
| Manual test confirms no extra Xcode prompt while broker stays running | ⚠️ PARTIAL | `SPECS/INPROGRESS/P13-T5_manual_prompt_validation.md` (interactive prompt observation pending human-run desktop check) |
| Regression suite passes with broker mode enabled | ✅ | Full `pytest` and `pytest --cov` runs pass |

---

## Artifacts

| File | Description |
|------|-------------|
| `tests/integration/test_broker_multi_client.py` | New integration coverage for sequential reuse, concurrent stability, and single-upstream launch count |
| `SPECS/INPROGRESS/P13-T5_process_churn_metrics.md` | Direct-vs-broker upstream process churn comparison |
| `SPECS/INPROGRESS/P13-T5_manual_prompt_validation.md` | Manual prompt validation procedure and current status |

---

## Notes

- Process churn evidence shows that, in local validation, broker mode reduced upstream process starts from 12 to 1 for the same number of short-lived sessions (12).
- The remaining gap to a full PASS verdict is interactive prompt observation in an operator-driven desktop run.
@@ -0,0 +1,30 @@
# Manual Prompt Validation: P13-T5

**Date:** 2026-02-18
**Task:** Validate reduced Xcode permission prompt behavior in broker mode

## Environment checks

- `xcrun mcpbridge --help` executed successfully.
- Xcode process detected (`pgrep -x Xcode` returned a running PID during validation).

## Manual procedure

1. Start from a clean state (no stale broker processes/sockets).
2. Run repeated short-lived sessions in direct mode and record prompt behavior.
3. Run repeated short-lived sessions in broker mode and record prompt behavior.
4. Confirm whether prompts reappear while the broker-owned upstream session remains running.
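
For step 1, a leftover socket file can be detected with a small standard-library helper. This is a sketch under assumptions: the source does not say where the broker places its socket, so the caller must supply the path, and `socket_file_exists` is a hypothetical name rather than part of the project.

```python
import os
import stat


def socket_file_exists(path: str) -> bool:
    """Return True if `path` exists and is a Unix domain socket file."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # Nothing at this path, so there is no stale socket to clean up.
        return False
    return stat.S_ISSOCK(mode)
```

A `True` result before the broker is started suggests a stale socket from an earlier run. Whether it is safe to remove depends on confirming that no broker process still owns it.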

## Result

**Status:** ⚠️ PARTIAL

- Automated evidence confirms broker mode keeps a single upstream process across many short-lived sessions.
- Interactive macOS prompt observation could not be fully captured from this non-interactive terminal workflow.
- A human-operated verification pass in an interactive desktop session is still required to conclusively mark prompt behavior as PASS.

## Supporting automated evidence

- `tests/integration/test_broker_multi_client.py` covers sequential reuse and concurrent stability.
- `test_broker_mode_launches_upstream_once_for_many_short_lived_clients` verifies a single upstream launch across 12 short-lived sessions.
- `SPECS/INPROGRESS/P13-T5_process_churn_metrics.md` captures direct-vs-broker churn comparison.
@@ -0,0 +1,62 @@
# P13-T5 Process Churn Metrics (Direct vs Broker)

**Date:** 2026-02-18

## Summary

Measured process churn for 12 short-lived sessions:

| Mode | Sessions | Upstream process starts | Notes |
|------|----------|-------------------------|-------|
| Direct (per-session subprocess baseline) | 12 | 12 | Measured via local Python harness that starts one upstream stub per client session |
| Broker mode (P13 transport/proxy architecture) | 12 | 1 | Validated by `test_broker_mode_launches_upstream_once_for_many_short_lived_clients` |

## Evidence

### Direct-mode baseline command

```bash
python - <<'PY'
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory(prefix='p13t5-') as tmp:
    script = Path(tmp) / 'direct_mode_stub.py'
    script.write_text(
        'import sys\n'
        'for _ in sys.stdin:\n'
        '    pass\n'
    )

    pids = set()
    for _ in range(12):
        proc = subprocess.Popen(
            [sys.executable, '-u', str(script)],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
        )
        proc.communicate(input='{"jsonrpc":"2.0","id":1}\n', timeout=2)
        pids.add(proc.pid)

print(f'direct_mode_process_starts={len(pids)}')
PY
```

Output:

```text
direct_mode_process_starts=12
```

### Broker-mode evidence

`tests/integration/test_broker_multi_client.py::test_broker_mode_launches_upstream_once_for_many_short_lived_clients`
asserts `launch_count == 1` after 12 short-lived client sessions.

## Interpretation

For the same short-lived session count (N=12), broker mode reduced upstream process starts from 12 to 1, a 91.7% reduction. This directly addresses the upstream churn that contributes to repeated authorization prompts.
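
The 91.7% figure follows from (12 - 1) / 12. A minimal sketch of that arithmetic, using a hypothetical helper name `percent_reduction`:

```python
def percent_reduction(baseline_starts: int, observed_starts: int) -> float:
    """Return the percentage drop from baseline_starts to observed_starts."""
    if baseline_starts <= 0:
        raise ValueError("baseline_starts must be positive")
    return (baseline_starts - observed_starts) / baseline_starts * 100


# Direct mode: 12 upstream starts; broker mode: 1 upstream start.
print(f"{percent_reduction(12, 1):.1f}%")  # prints 91.7%
```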
@@ -0,0 +1,30 @@
## REVIEW REPORT — P13-T5 prompt reduction and multi-client stability

**Scope:** origin/main..HEAD
**Files:** 8

### Summary Verdict
- [ ] Approve
- [x] Approve with comments
- [ ] Request changes
- [ ] Block

### Critical Issues
- None.

### Secondary Issues
- [Low] Interactive Xcode permission prompt verification remains manual-only and is recorded as PARTIAL in the validation artifact; this is a known limitation of non-interactive automation, not a code defect.

### Architectural Notes
- The new integration tests exercise the real broker daemon + Unix socket transport + subprocess upstream path and verify both sequential reuse and concurrent routing stability.
- Process churn metrics are explicit and reproducible, with direct-vs-broker comparison captured in archive artifacts.

### Tests
- `pytest` — 577 passed, 5 skipped
- `ruff check src/ tests/` — pass
- `mypy src/` — pass
- `pytest --cov` — 92.31% total (>=90%)

### Next Steps
- No new follow-up tasks required from this review.
- Existing P13-T6 remains the next documentation/migration task.
9 changes: 4 additions & 5 deletions SPECS/INPROGRESS/next.md
@@ -1,9 +1,8 @@
# No Active Task

The previously selected task has been archived.

## Recently Archived

- 2026-02-18 — P13-T5: Validate prompt reduction and multi-client stability (PARTIAL)
- 2026-02-18 — P13-T4: Add stdio proxy mode for compatibility with existing MCP clients (PASS)
- 2026-02-18 — P13-T3: Implement multi-client transport and JSON-RPC multiplexing (PASS)
- 2026-02-17 — P13-T2: Implement persistent broker daemon with single upstream Xcode bridge (PASS)
@@ -12,6 +11,6 @@ The previously selected task has been archived.

## Suggested Next Tasks

- P13-T5: Validate prompt reduction and multi-client stability (P1, depends on P13-T4)
- P13-T6: Document broker mode configuration, migration, and rollback (P1, depends on P13-T4 ✅)
- FU-P12-T1-1: Remove or document `MCPInitializeParams` in schemas (P3)
- P13-T6 — Document broker mode configuration, migration, and rollback (P1, depends on P13-T4, partially unblocked by P13-T5 artifacts)
- FU-P13-T4-1 — Replace deprecated `asyncio.get_event_loop()` calls in `broker/proxy.py` (P2)
- FU-P13-T4-2 — Implement or remove `reconnect` parameter in `BrokerProxy` (P2)
11 changes: 6 additions & 5 deletions SPECS/Workplan.md
@@ -1094,7 +1094,7 @@ Keep a single long-lived client/session running to reduce process churn. This is
#### Resolution Path
- [x] Design persistent broker architecture for shared upstream Xcode session (P13-T1)
- [x] Implement long-lived broker daemon with single upstream bridge connection (P13-T2)
- [ ] Add multi-client transport + stdio proxy mode to reuse broker session (P13-T3, P13-T4)
- [x] Add multi-client transport + stdio proxy mode to reuse broker session (P13-T3, P13-T4)
- [ ] Validate reduced prompt behavior and document rollout/migration steps (P13-T5, P13-T6)

---
@@ -2094,7 +2094,8 @@ Phase 9 Follow-up Backlog

---

#### P13-T5: Validate prompt reduction and multi-client stability
#### ✅ P13-T5: Validate prompt reduction and multi-client stability
- **Status:** ⚠️ PARTIAL (2026-02-18, interactive prompt verification pending)
- **Description:** Add integration and manual verification that repeated short-lived client sessions can reuse the broker session without repeated upstream churn, plus load tests for concurrent calls.
- **Priority:** P1
- **Dependencies:** P13-T4
@@ -2104,10 +2105,10 @@ Phase 9 Follow-up Backlog
- Manual validation report for Xcode permission prompt behavior
- Metrics comparison (direct mode vs broker mode process churn)
- **Acceptance Criteria:**
- [ ] Sequential short-lived clients reuse one broker-owned upstream bridge process
- [ ] Concurrent client tool calls remain stable under load
- [x] Sequential short-lived clients reuse one broker-owned upstream bridge process
- [x] Concurrent client tool calls remain stable under load
- [ ] Manual test confirms no extra Xcode prompt while broker stays running
- [ ] Regression suite passes with broker mode enabled
- [x] Regression suite passes with broker mode enabled

---
