
feat: WORKSPACE_ID env propagation for namespaced paths (agent-relay-505, 488, 489)#210

Merged
khaliqgant merged 16 commits into main from tasks-2026-01-16
Jan 18, 2026

@khaliqgant
Member

Summary

  • Implements agent-relay-505: WORKSPACE_ID env propagation for workspace-namespaced paths
  • Also covers agent-relay-488 (socket path namespacing) and agent-relay-489 (outbox path namespacing)

Changes

When WORKSPACE_ID is set (via process.env or config.env), RelayPtyOrchestrator now uses workspace-namespaced paths:

  • Socket: /tmp/relay/{workspaceId}/sockets/{name}.sock
  • Outbox: /tmp/relay/{workspaceId}/outbox/{name}/

Falls back to legacy paths for local development:

  • Socket: /tmp/relay-pty-{name}.sock
  • Outbox: /tmp/relay-outbox/{name}

This enables multi-tenant isolation in cloud deployments where multiple workspaces run on the same host.
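
The path selection described above can be sketched roughly as follows. This is a minimal illustration, not the code from relay-pty-orchestrator.ts: the helper name `resolveRelayPaths`, the `RelayPaths` type, and the precedence of config.env over process.env are assumptions, and the trailing slash on the outbox path is omitted.

```typescript
// Hypothetical helper. The PR places this logic in the RelayPtyOrchestrator
// constructor; the names used here are illustrative only.
interface RelayPaths {
  socketPath: string;
  outboxPath: string;
}

function resolveRelayPaths(
  name: string,
  processEnv: Record<string, string | undefined>,
  configEnv: Record<string, string | undefined> = {},
): RelayPaths {
  // Assumption: config.env wins over process.env. The PR text only says
  // that either source may supply WORKSPACE_ID.
  const workspaceId = configEnv.WORKSPACE_ID || processEnv.WORKSPACE_ID;

  if (workspaceId) {
    // Workspace-namespaced paths for multi-tenant hosts.
    return {
      socketPath: `/tmp/relay/${workspaceId}/sockets/${name}.sock`,
      outboxPath: `/tmp/relay/${workspaceId}/outbox/${name}`,
    };
  }

  // Legacy paths for local development.
  return {
    socketPath: `/tmp/relay-pty-${name}.sock`,
    outboxPath: `/tmp/relay-outbox/${name}`,
  };
}
```

Passing the environment in as an argument, rather than reading `process.env` inside the helper, keeps both branches easy to cover in unit tests.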

Files Modified

  • src/wrapper/relay-pty-orchestrator.ts - Path namespacing logic in constructor
  • src/wrapper/relay-pty-orchestrator.test.ts - Tests for namespacing scenarios

Test plan

  • Existing tests pass (25/25)
  • New tests verify workspace-namespaced paths when WORKSPACE_ID set
  • New tests verify legacy paths when WORKSPACE_ID not set
  • Manual test in cloud environment

🤖 Generated with Claude Code

@my-senior-dev-pr-review

my-senior-dev-pr-review Bot commented Jan 17, 2026

🤖 My Senior Dev — Analysis Complete

👤 For @khaliqgant

📁 Expert in src/dashboard/react-components/ (10 edits) • ⚡ 71st PR this month


📊 38 files reviewed • 1 high risk • 2 need attention

🚨 High Risk:

  • src/bridge/spawner.ts — Introduction of new spawning logic that may cause agent conflicts if implemented incorrectly.

⚠️ Needs Attention:

  • docs/INTEGRATION-GUIDE.md — Discusses integration of multiple systems which could introduce potential vulnerabilities if misconfigured or not properly secured.


@@ -0,0 +1,285 @@
/**

AI Conversation Transcript

👤 khaliqgant

This error pattern array seems brittle and not comprehensive. How is it used, and what are the drawbacks of this type of approach?

🤖 AI Assistant

In the provided code, the error patterns are defined in an array called errorPatterns (located at line 40). This array contains regular expressions that are used to identify specific error messages in the outputs fed to the StuckDetector. Here are the key points about how it is used and the potential drawbacks of this approach:

Usage

  1. Error Detection: The errorPatterns array is utilized within the detectErrorLoop method (lines 191-228). This method checks the recent outputs for any matches against the patterns defined in this array.
  2. Loop Detection: If an output matches any of the patterns, it is considered to be an error. The method counts how many times the same error message appears in succession. If the count exceeds the configured threshold (loopThreshold), the error is deemed to be in a loop, and a stuck event is emitted.
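
To make the two steps above concrete, here is a minimal sketch of consecutive-error counting. It is not the StuckDetector source: the patterns are placeholders (the real errorPatterns array is not shown in this thread), and the sketch assumes that a non-error output resets the streak and that reaching loopThreshold counts as a loop.

```typescript
// Placeholder patterns for illustration only.
const errorPatterns: RegExp[] = [/error:/i, /exception/i, /failed/i];

// Returns the repeated error message once it has appeared `loopThreshold`
// times in a row, or null if no loop is found.
function detectErrorLoop(recentOutputs: string[], loopThreshold: number): string | null {
  let lastError: string | null = null;
  let count = 0;

  for (const output of recentOutputs) {
    const line = output.trim();
    const isError = errorPatterns.some((pattern) => pattern.test(line));

    if (!isError) {
      // Assumption: any non-error output breaks the streak.
      lastError = null;
      count = 0;
      continue;
    }

    if (line === lastError) {
      count += 1;
    } else {
      lastError = line;
      count = 1;
    }

    if (count >= loopThreshold) {
      return lastError;
    }
  }
  return null;
}
```

Because the comparison is an exact string match, two reports of the same failure that differ only in a timestamp or request ID would count as different errors, which illustrates the brittleness raised in the question.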

Drawbacks of This Approach

  1. Brittleness:

    • The reliance on specific string patterns means that any slight change in the error message format (e.g., different casing, additional context) could lead to missed detections. For example, an error message that is slightly modified or contains additional information may not match any of the predefined patterns.
  2. Limited Coverage:

    • The predefined patterns may not cover all possible error messages that could occur in the system. New errors or variations of existing errors would require updates to the array, which could lead to undetected issues if not maintained diligently.
  3. False Positives:

    • The approach could result in false positives, where benign outputs that match the patterns (but are not actual errors) are incorrectly identified as errors. This can emit unnecessary stuck events, which causes confusion and makes genuine alarms harder to distinguish from false ones.
  4. Scalability:

    • As the number of error messages increases, managing and updating the patterns becomes more cumbersome. A more scalable solution might involve a more sophisticated logging and error handling framework that can dynamically categorize and handle errors.
  5. Performance Concerns:

    • Running regex checks in a loop can become a performance bottleneck if the buffer of recent outputs grows large, especially when each output is tested against multiple patterns. The current implementation appears to cap that buffer at 5 times the threshold, which bounds the cost, but adding many patterns may still slow detection.

Conclusion

While the current implementation of the error patterns provides a straightforward mechanism for detecting error loops, it is important to be aware of its limitations. An ideal implementation would require a more flexible and comprehensive approach to error detection, potentially utilizing more advanced error tracking mechanisms, such as structured logging or integrating with a monitoring system that can dynamically adapt to new error messages.


From My Senior Dev

@@ -0,0 +1,285 @@
/**

AI Conversation Update

Continuation of previous conversation...

👤 khaliqgant

Propose alternative approaches

🤖 AI Assistant

Here are several alternative approaches to improve error detection and handling in the StuckDetector, addressing the limitations of the current regex-based pattern matching:

1. Structured Logging with Error Codes

  • Description: Implement a structured logging system where error messages are logged with associated error codes and metadata (e.g., severity, source, context).
  • Benefits:
    • Easier to categorize and filter errors based on severity or type.
    • Allows for more sophisticated querying and analysis of logs.
    • Reduces reliance on specific string patterns.
  • Implementation: Use a logging library that supports structured logs (e.g., Winston, Pino) and define a set of standard error codes.
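
As a rough illustration of approach 1, the sketch below assumes agents emit one JSON object per line with `level`, `code`, and `message` fields. That line format, the `E_CONN` code used in the usage note, and both function names are hypothetical; they do not correspond to the default output of Winston or Pino.

```typescript
interface StructuredError {
  code: string;
  message: string;
}

// Parses one output line. Returns null for anything that is not a JSON
// object with level "error" and a string code (assumed format).
function parseStructuredError(line: string): StructuredError | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return null; // Plain-text output is simply not an error record.
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  const record = parsed as Record<string, unknown>;
  if (record.level !== "error" || typeof record.code !== "string") return null;

  return {
    code: record.code,
    message: typeof record.message === "string" ? record.message : "",
  };
}

// Returns the error code once it repeats `threshold` times in a row.
function repeatedErrorCode(lines: string[], threshold: number): string | null {
  let lastCode: string | null = null;
  let count = 0;
  for (const line of lines) {
    const err = parseStructuredError(line);
    if (!err) {
      lastCode = null;
      count = 0;
      continue;
    }
    count = err.code === lastCode ? count + 1 : 1;
    lastCode = err.code;
    if (count >= threshold) return lastCode;
  }
  return null;
}
```

Matching on the code rather than the message text means three `E_CONN` records with different retry counts in their messages still register as one repeated error.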

2. Machine Learning for Anomaly Detection

  • Description: Train a machine learning model to recognize normal versus abnormal output patterns, identifying potential errors based on learned behaviors rather than predefined patterns.
  • Benefits:
    • Can adapt to new and varied error messages without manual updates.
    • Reduces false positives by learning from historical data.
  • Implementation: Use historical output data to train a model (e.g., supervised learning with labeled data) and integrate it into the stuck detection process.

3. Dynamic Rule Engine

  • Description: Implement a rule-based engine that allows users to define error detection rules dynamically.
  • Benefits:
    • Users can add, modify, or remove rules as needed without changing the codebase.
    • Provides flexibility to adapt to new error types and formats.
  • Implementation: Use a rules engine library (e.g., Nools, json-rules-engine) to define and evaluate rules on the outputs.

4. Error Reporting Service Integration

  • Description: Integrate with an external error monitoring and reporting service (e.g., Sentry, Rollbar) that can capture and analyze errors in real-time.
  • Benefits:
    • Gains access to a more comprehensive analysis of error occurrences, trends, and patterns.
    • Allows for better prioritization of issues based on impact and frequency.
  • Implementation: Send error messages to the service with context, and use its capabilities to monitor for stuck conditions.

5. Threshold-Based Alerts with Contextual Awareness

  • Description: Instead of relying solely on repetitive patterns, use contextual awareness to determine the significance of an error based on its occurrence rate, timing, and related outputs.
  • Benefits:
    • Reduces false positives by considering the context in which errors occur.
    • Allows for more nuanced detection of stuck conditions based on various metrics, not just repetition.
  • Implementation: Implement a scoring system that evaluates outputs based on multiple factors (e.g., frequency, type, duration) to determine if a stuck condition is reached.
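
One possible shape for such a scoring system is sketched below. The two factors (error rate and repetition within a time window), the equal weights, and the 60-second default window are all arbitrary assumptions chosen for illustration; nothing here reflects the actual StuckDetector.

```typescript
interface OutputSample {
  timestampMs: number;
  text: string;
  isError: boolean;
}

// Returns a score in [0, 1]; higher suggests the agent is more likely stuck.
function stuckScore(samples: OutputSample[], nowMs: number, windowMs = 60000): number {
  // Only consider outputs inside the sliding window.
  const recent = samples.filter((s) => nowMs - s.timestampMs <= windowMs);
  if (recent.length === 0) return 0;

  // Factor 1: share of recent outputs flagged as errors.
  const errorRate = recent.filter((s) => s.isError).length / recent.length;

  // Factor 2: how repetitive the outputs are (0 when all are distinct).
  const uniqueTexts = new Set(recent.map((s) => s.text)).size;
  const repetition = 1 - uniqueTexts / recent.length;

  // Hypothetical equal weighting of the two factors.
  return 0.5 * errorRate + 0.5 * repetition;
}
```

A caller would compare the score against a tuned threshold instead of counting exact repeats, so a burst of varied errors and a quiet loop of identical non-error output can both raise the score.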

6. Custom Error Classes

  • Description: Define custom error classes with specific attributes to capture relevant error information and characteristics.
  • Benefits:
    • Provides a clear structure for handling different error types and their associated behaviors.
    • Simplifies the process of identifying and reacting to specific errors.
  • Implementation: Create a hierarchy of error classes and use them in the output handling logic to classify and manage errors.

Summary

These alternative approaches can enhance the robustness and flexibility of the error detection mechanism in the StuckDetector. Depending on the complexity of the system and the variety of expected error messages, a combination of these methods may provide the best results, leading to improved performance and reliability in detecting and handling stuck conditions.


From My Senior Dev

Agent Relay and others added 15 commits January 18, 2026 15:50
…aths

Implements agent-relay-505: When WORKSPACE_ID is set (via process.env or
config.env), RelayPtyOrchestrator uses workspace-namespaced paths:
- Socket: /tmp/relay/{workspaceId}/sockets/{name}.sock
- Outbox: /tmp/relay/{workspaceId}/outbox/{name}/

Falls back to legacy paths for local development:
- Socket: /tmp/relay-pty-{name}.sock
- Outbox: /tmp/relay-outbox/{name}

This enables multi-tenant isolation in cloud deployments where multiple
workspaces run on the same host.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements agent-relay-510: Model selection hookup

- Add `model: sonnet` to all 32 agent profiles in .claude/agents/
- Add cost tracking log: `[spawner] Agent {name}: model={model}, cli={cli}`
- The spawner already used buildClaudeArgs() which passes --model to CLI

All agent profiles now have consistent model defaults. Spawner logs which
model is used for each agent, enabling cost tracking and optimization.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…nuity docs

- Added Advanced Features section to INTEGRATION-GUIDE.md
- Documented StatelessLeadCoordinator (hierarchical task coordination)
- Documented Consensus (multi-agent voting/approval)
- Documented Continuity (session persistence, ledger, handoff)
- Created EXAMPLES.md with 3 end-to-end scenarios:
  - Hierarchical planning (Lead spawns workers)
  - Consensus code review (2+ agents approve)
  - Long-running task with continuity

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Created .claude/prompts/roles/ with:
- planner-strategy.md: How planners decompose tasks, spawn workers
- worker-focus.md: Keep workers on-task, avoid scope creep
- reviewer-criteria.md: Security, logic, testing criteria for reviews

Based on Cursor scaling insights: "a surprising amount of behavior
comes down to how we prompt the agents."

Phase 2 (prompt-composer.ts) to follow.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…e 2)

- Created src/wrapper/prompt-composer.ts
  - composeForAgent(): Combines role prompt + task context + team members
  - getAvailableRoles(): Lists available role prompts
  - parseRoleFromProfile(): Extracts role from agent frontmatter
  - Prompt caching to avoid repeated file reads
  - Graceful fallback if prompt files missing

- Added comprehensive tests in prompt-composer.test.ts

Addresses Backend's suggestions:
- Caching at load time with clearPromptCache() for hot-reload
- Fallback: returns content without role prompt if file missing
- Composition: role + parent context + task + team + custom

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
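
The caching, fallback, and composition behaviour listed in the commit above might look roughly like this. It is a sketch under stated assumptions, not the contents of prompt-composer.ts: the function names, signatures, section headings, and the `<role>.md` file layout are hypothetical.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical cache keyed by absolute file path. A null entry records
// that the file was missing when first requested.
const promptCache = new Map<string, string | null>();

function loadRolePrompt(rolesDir: string, role: string): string | null {
  const file = path.resolve(rolesDir, `${role}.md`);
  if (promptCache.has(file)) return promptCache.get(file) ?? null;

  let content: string | null = null;
  try {
    content = fs.readFileSync(file, "utf8");
  } catch {
    content = null; // Graceful fallback: a missing file is not fatal.
  }
  promptCache.set(file, content);
  return content;
}

// Lets a hot-reload path force fresh reads.
function clearRolePromptCache(): void {
  promptCache.clear();
}

function composePrompt(
  rolesDir: string,
  role: string | undefined,
  task: string,
  team: string[] = [],
): string {
  const sections: string[] = [];
  const rolePrompt = role ? loadRolePrompt(rolesDir, role) : null;
  if (rolePrompt) sections.push(rolePrompt.trim());
  sections.push(`## Task\n${task}`);
  if (team.length > 0) {
    sections.push(`## Team\n${team.map((member) => `- ${member}`).join("\n")}`);
  }
  return sections.join("\n\n");
}
```

Caching the "missing" result as well as file contents avoids repeated failed reads, at the cost of needing `clearRolePromptCache()` after a role file is added or edited.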
…relay-491)

When a workspace is removed, clean up its temp directory at
/tmp/relay/{workspaceId}/ which contains:
- Socket files: /tmp/relay/{workspaceId}/sockets/*.sock
- Outbox directories: /tmp/relay/{workspaceId}/outbox/*/

This complements the workspace namespacing work in agent-relay-488/489
by ensuring temp files are properly cleaned up.

Cleanup is best-effort - logs warning on failure but doesn't throw.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
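
A best-effort cleanup like the one described in the commit above could be sketched as follows. The function name, the boolean return value, and the workspace ID validation are assumptions added for illustration; the commit only states that failures are logged as warnings and not thrown.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: removes <baseDir>/<workspaceId> and everything in it.
// Returns true if the directory is gone afterwards, false on failure.
function cleanupWorkspaceTempDir(workspaceId: string, baseDir = "/tmp/relay"): boolean {
  // Assumption: IDs are simple tokens. Rejecting anything else prevents
  // values such as "../etc" from escaping baseDir.
  if (!/^[A-Za-z0-9_-]+$/.test(workspaceId)) {
    console.warn(`[cleanup] refusing to remove invalid workspace id: ${workspaceId}`);
    return false;
  }

  const dir = path.join(baseDir, workspaceId);
  try {
    // force: true makes a missing directory a no-op instead of an error.
    fs.rmSync(dir, { recursive: true, force: true });
    return true;
  } catch (err) {
    // Best effort: log and continue so workspace removal is not blocked.
    console.warn(`[cleanup] failed to remove ${dir}: ${String(err)}`);
    return false;
  }
}
```

Because `force: true` ignores a missing path, calling the helper twice for the same workspace is harmless.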
…ay-501)

Implements stuck detection heuristics:
1. Extended idle (10+ minutes without output)
2. Error loop (same error message repeated 3+ times)
3. Output loop (same output pattern repeated 3+ times)

Integration with BaseWrapper:
- StuckDetector initialized and wired up to emit 'stuck'/'unstuck' events
- Output fed to both idle and stuck detectors via feedIdleDetectorOutput()
- Public isStuck() and getStuckReason() methods for querying state
- Helper methods startStuckDetection() and stopStuckDetection() for subclasses

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…-512)

Moved role-specific prompt files to correct location:
- .claude/prompts/roles/* → .claude/agents/roles/*
- Updated prompt-composer.ts to reference new path

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Added role field to AgentConfig interface
- Updated findAgentConfig to parse role from frontmatter
- Integrated composeForAgent in spawner.spawn() to inject role-specific prompts
- Role prompts are prepended to relay instructions when spawning agents

When an agent has a role defined in .claude/agents/<name>.md frontmatter,
the spawner now automatically composes and injects role-specific guidance
(planner strategy, worker focus, or reviewer criteria) into the system prompt.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Added CLAUDE_MODEL_OPTIONS constant (Sonnet, Opus, Haiku)
- Added selectedModel state with default 'sonnet'
- Model dropdown UI appears when Claude template is selected
- effectiveCommand now includes --model flag for non-default models
- Model selection resets when modal opens

Users can now select Claude model when spawning agents from the dashboard.
Sonnet is default, Opus for most capable, Haiku for fastest.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
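
The model-selection rule in the commit above (append `--model` only for non-default models) can be illustrated with a small sketch. The shape of `CLAUDE_MODEL_OPTIONS` and the helper name `buildEffectiveCommand` are assumptions; the dashboard component itself is not shown in this PR conversation.

```typescript
// Assumed shape: the commit names the constant but not its structure.
const CLAUDE_MODEL_OPTIONS = ["sonnet", "opus", "haiku"] as const;
type ClaudeModel = (typeof CLAUDE_MODEL_OPTIONS)[number];

const DEFAULT_MODEL: ClaudeModel = "sonnet";

// Hypothetical helper: leaves the command untouched for the default model
// and appends the --model flag otherwise.
function buildEffectiveCommand(baseCommand: string, model: ClaudeModel): string {
  return model === DEFAULT_MODEL ? baseCommand : `${baseCommand} --model ${model}`;
}
```

For example, `buildEffectiveCommand("claude", "opus")` yields `claude --model opus`, while selecting Sonnet leaves the command as `claude`.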
Records completed work on cross-machine user message routing.
- Added cloud infrastructure (PresenceRegistry, CloudMessageBus)
- Created PR #213 with 23 comprehensive tests
- Identified as missing feature (not regression)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Updated beads task status to pending_review
- Completed trajectory with 90% confidence
- Added PR link: #215

Implementation complete, awaiting CI validation and code review.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…nuity docs

- Added Advanced Features section to INTEGRATION-GUIDE.md
- Documented StatelessLeadCoordinator (hierarchical task coordination)
- Documented Consensus (multi-agent voting/approval)
- Documented Continuity (session persistence, ledger, handoff)
- Created EXAMPLES.md with 3 end-to-end scenarios:
  - Hierarchical planning (Lead spawns workers)
  - Consensus code review (2+ agents approve)
  - Long-running task with continuity

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@khaliqgant khaliqgant merged commit 18f887d into main Jan 18, 2026
8 checks passed
@khaliqgant khaliqgant deleted the tasks-2026-01-16 branch January 18, 2026 16:03
khaliqgant added a commit that referenced this pull request Jan 19, 2026
When WORKSPACE_ID is set, agents now write to the legacy outbox path
(/tmp/relay-outbox/{name}) which is symlinked to the workspace-namespaced
path (/tmp/relay/{workspaceId}/outbox/{name}).

This preserves multi-tenancy (PR #210) while keeping agent instructions
simple - agents don't need to know about workspace IDs.

How it works:
1. Orchestrator creates workspace-namespaced outbox directory
2. Creates symlink: /tmp/relay-outbox/{name} -> /tmp/relay/{ws}/outbox/{name}
3. Agents write to legacy path -> symlink redirects -> relay-pty sees file

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
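
The three steps in the commit above can be sketched with Node's `fs` module. This is an illustrative assumption about how the orchestrator might do it: the function name, the configurable root directories, and the handling of a pre-existing legacy path are hypothetical.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: creates the namespaced outbox and points the legacy
// path at it. Returns the legacy path that agents would write to.
function linkLegacyOutbox(
  name: string,
  workspaceId: string,
  legacyRoot = "/tmp/relay-outbox",
  relayRoot = "/tmp/relay",
): string {
  const target = path.join(relayRoot, workspaceId, "outbox", name);
  const legacy = path.join(legacyRoot, name);

  // Step 1: create the workspace-namespaced outbox directory.
  fs.mkdirSync(target, { recursive: true });
  fs.mkdirSync(legacyRoot, { recursive: true });

  // Replace a stale symlink, but never delete a real directory.
  // lstat inspects the link itself instead of following it.
  try {
    if (fs.lstatSync(legacy).isSymbolicLink()) {
      fs.unlinkSync(legacy);
    } else {
      throw new Error(`${legacy} exists and is not a symlink`);
    }
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code !== "ENOENT") throw err;
  }

  // Step 2: legacy path -> namespaced path.
  fs.symlinkSync(target, legacy, "dir");
  return legacy;
}
```

After the call, a file written under the returned legacy path lands in the namespaced directory (step 3), so agent instructions never need to mention the workspace ID.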