fix(wrapper): revert aggressive retry logic causing message latency#287

Merged: khaliqgant merged 3 commits into main from fix/relay-message-latency on Jan 24, 2026

Conversation

@khaliqgant khaliqgant commented Jan 24, 2026

Summary

Revert commit a23bffa's exponential backoff retry logic that caused 2-4 minute message injection delays.

Root Cause

  • Commit a23bffa introduced MAX_INJECTION_RETRIES=5 with exponential backoff (2000ms base × 2^n)
  • Combined with 30-second socket timeouts, this created worst-case 212+ second latencies
  • The retry logic was unnecessary - the system worked without retries before this commit
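
The 212-second figure can be reproduced with a small calculation. This is a sketch under two assumptions not stated in the PR: the backoff exponent n runs from 0 to 4, and each of the five attempts waits out the full 30-second socket timeout before backing off. The two retry constant names come from the PR; SOCKET_TIMEOUT_MS and the function names are illustrative.

```typescript
// Retry constants as named in the PR description.
const MAX_INJECTION_RETRIES = 5;
const INJECTION_RETRY_BASE_MS = 2000;
// Illustrative name for the 30-second socket timeout mentioned in the PR.
const SOCKET_TIMEOUT_MS = 30000;

/** Backoff delay after a failed attempt (0-based): base × 2^attempt. */
function backoffDelayMs(attempt: number): number {
  return INJECTION_RETRY_BASE_MS * 2 ** attempt;
}

/** Worst case (assumed): every attempt times out fully, then backs off. */
function worstCaseLatencyMs(): number {
  let total = 0;
  for (let attempt = 0; attempt < MAX_INJECTION_RETRIES; attempt++) {
    total += SOCKET_TIMEOUT_MS + backoffDelayMs(attempt);
  }
  return total;
}

console.log(worstCaseLatencyMs() / 1000); // 212
```

Under these assumptions the timeouts contribute 5 × 30 s = 150 s and the backoff delays contribute 2 + 4 + 8 + 16 + 32 = 62 s, which matches the 212-second worst case quoted above.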

Changes

  1. Removed MAX_INJECTION_RETRIES and INJECTION_RETRY_BASE_MS constants
  2. Reverted failure handling to original behavior (report failure immediately, no retry loops)
  3. Restored error logging (removed debug gate - errors now always logged)
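
The restored behavior described in the list above (one attempt, immediate failure report, error always logged) might look roughly like the sketch below. All names here (InjectResult, Injector, injectFailFast) are hypothetical and chosen for illustration; they are not taken from relay-pty-orchestrator.ts.

```typescript
// Hypothetical shapes for illustration only; not the orchestrator's real API.
interface InjectResult {
  ok: boolean;
  error?: string;
}
type Injector = (messageId: string) => Promise<InjectResult>;

/** Attempts injection exactly once and reports a failure immediately. */
async function injectFailFast(
  messageId: string,
  inject: Injector,
  logError: (message: string) => void,
): Promise<InjectResult> {
  const result = await inject(messageId);
  if (!result.ok) {
    // No retry loop and no backoff sleep: log the error and return right away.
    logError(`Injection failed for message ${messageId}: ${result.error ?? "unknown error"}`);
  }
  return result;
}
```

With this shape, the caller sees a failed result after a single attempt, so the worst-case delay is bounded by one socket timeout rather than by the sum of timeouts and backoff sleeps.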

Impact

  • Worst-case message latency reduced from ~2-4 minutes back to the ~30-second baseline
  • Errors now visible in logs without requiring debug mode
  • Faster failure reporting for real-time agent messaging

Files Modified

  • packages/wrapper/src/relay-pty-orchestrator.ts (net -27 lines)

Test Plan

  • Verify message injection failures report immediately
  • Verify error messages appear in console logs
  • Verify message latency returns to baseline (30s worst-case)
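
The second item in the test plan could be checked with something like the sketch below. LoggerSketch is a hypothetical stand-in that mirrors the log and logError methods shown in this PR, and captureConsoleError is an illustrative helper; neither exists in the repository as far as this page shows.

```typescript
// Hypothetical stand-in for the orchestrator's logging methods after the fix.
class LoggerSketch {
  constructor(private readonly name: string, private readonly debug: boolean) {}

  /** Debug log: only printed when debug is enabled. */
  log(message: string): void {
    if (this.debug) {
      console.log(`[relay-pty-orchestrator:${this.name}] ${message}`);
    }
  }

  /** Error log: always printed, regardless of the debug flag. */
  logError(message: string): void {
    console.error(`[relay-pty-orchestrator:${this.name}] ERROR: ${message}`);
  }
}

/** Runs fn while capturing everything passed to console.error. */
function captureConsoleError(fn: () => void): string[] {
  const captured: string[] = [];
  const original = console.error;
  console.error = (...args: unknown[]): void => {
    captured.push(args.map(String).join(" "));
  };
  try {
    fn();
  } finally {
    // Always restore the real console.error, even if fn throws.
    console.error = original;
  }
  return captured;
}

const capturedLines = captureConsoleError(() => {
  new LoggerSketch("agent-1", false).logError("injection failed");
});
console.log(capturedLines[0]); // [relay-pty-orchestrator:agent-1] ERROR: injection failed
```

The logger is constructed with debug set to false, so the captured line shows that the error path no longer depends on debug mode.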

Fixes the 2-4 minute message delivery regression reported on Jan 24.


The exponential backoff retry logic introduced in a23bffa caused message
delivery delays of 2-4 minutes when injections failed. This was too
aggressive for real-time agent communication.

Changes:
- Removed MAX_INJECTION_RETRIES (5) and INJECTION_RETRY_BASE_MS (2000)
- Reverted to immediate failure reporting without retry loops
- Fixed logError to always output (was incorrectly gated by debug flag)

Messages now fail immediately when injection fails, allowing the system
to recover faster rather than blocking in exponential backoff loops.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

my-senior-dev-pr-review Bot commented Jan 24, 2026

🤖 My Senior Dev — Analysis Complete

👤 For @khaliqgant

📁 Expert in src (662 edits) • ⚡ 142nd PR this month


📊 6 files reviewed • 1 high risk • 3 need attention

🚨 High Risk:

  • packages/wrapper/src/relay-pty-orchestrator.ts — Modifications affect core message processing logic which could introduce reliability issues under failure scenarios.

⚠️ Needs Attention:

  • .trajectories/completed/2026-01/traj_1b1dj40sl6jl.json — Metadata files can influence documentation clarity, though they are not critical to functionality.
  • .trajectories/completed/2026-01/traj_1b1dj40sl6jl.md — Documentation clarity is important to understanding the changes made, despite non-functional impact.
  • +3 more concerns...

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional flags.

Agent Relay and others added 2 commits January 24, 2026 08:58
- Record complete investigation process for 2-4 minute message latency regression
- Document root cause: commit a23bffa's exponential retry backoff (2000ms × 2^n)
- Record strategy evaluation: 5 retry approaches analyzed, full revert chosen
- Track implementation decisions and credential blocker resolution
- Confidence: 90% - Fix verified, expected to restore 30s baseline latency

Trajectory: traj_i2h6krqx2iun
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Resolve merge conflict in .trajectories/index.json by combining
trajectory entries from both branches:
- Preserved latency fix trajectories (traj_1b1dj40sl6jl, traj_i2h6krqx2iun)
- Incorporated main branch trajectories (socket path fixes, legacy symlink fixes,
  path traversal validation, includeWorkflowConventions feature)

All latency fix changes remain intact:
- Removed aggressive retry logic from relay-pty-orchestrator
- Reduced default latency values for faster message injection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

-    if (this.config.debug) {
-      console.error(`[relay-pty-orchestrator:${this.config.name}] ERROR: ${message}`);
-    }
+    console.error(`[relay-pty-orchestrator:${this.config.name}] ERROR: ${message}`);

Check warning

Code scanning / CodeQL

Log injection Medium

Log entry depends on a user-provided value.
Log entry depends on a user-provided value.
Log entry depends on a user-provided value.

Copilot Autofix

AI 3 months ago

To fix the problem, we should sanitize any user-controlled value before including it in log messages, particularly values that become part of the log “prefix” such as this.config.name. For plain-text logs, the primary concern is newline (\n, \r) and possibly other non-printable/control characters that can break log structure. A simple and robust approach is to derive a sanitized version of the agent name once in the RelayPtyOrchestrator constructor (e.g., this.safeName), where we strip or replace newline and carriage-return characters (and optionally other control characters) and then consistently use that safe value in all logging methods. This preserves existing behavior while preventing forged log lines.

The best minimal change here is:

  • In packages/wrapper/src/relay-pty-orchestrator.ts, add a private field (e.g., private readonly safeName: string;).
  • In the constructor, after validating config.name, compute this.safeName = config.name.replace(/[\r\n]/g, ' ') (or similar), which removes or neutralizes any line breaks but keeps the name readable.
  • Update the log and logError methods to use this.safeName instead of this.config.name in the log prefix.
  • Leave all other behavior unchanged; no new external dependencies are required.

This single change addresses all three variants, because they all flow through the same name → RelayPtyOrchestrator → logError path.
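
To illustrate why stripping line breaks is enough to stop a forged entry, here is a small standalone sketch. sanitizeForLog uses the same replace expression as the suggested patch; sanitizeControlChars is an assumed stricter variant covering the "other control characters" option mentioned above, and both function names are illustrative rather than part of the patch.

```typescript
/** Replaces CR/LF with spaces so a name cannot start a new log line. */
function sanitizeForLog(value: string): string {
  return value.replace(/[\r\n]/g, " ");
}

/** Stricter variant (assumption): also neutralizes other ASCII control characters. */
function sanitizeControlChars(value: string): string {
  return value.replace(/[\u0000-\u001f\u007f]/g, " ");
}

// A hostile agent name that tries to forge a second log entry.
const hostileName = "agent\n[relay-pty-orchestrator:admin] ERROR: forged entry";

// With sanitization the whole prefix stays on one physical line.
console.log(`[relay-pty-orchestrator:${sanitizeForLog(hostileName)}] ERROR: real entry`);
```

Without sanitization the same template string would print two lines, the second of which looks like a genuine error from a different agent. The forged text still appears after sanitization, but only inside the bracketed name of a single entry.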

Suggested changeset 1
packages/wrapper/src/relay-pty-orchestrator.ts

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/packages/wrapper/src/relay-pty-orchestrator.ts b/packages/wrapper/src/relay-pty-orchestrator.ts
--- a/packages/wrapper/src/relay-pty-orchestrator.ts
+++ b/packages/wrapper/src/relay-pty-orchestrator.ts
@@ -227,6 +227,9 @@
   private memoryMonitor: AgentMemoryMonitor;
   private memoryAlertHandler: ((alert: MemoryAlert) => void) | null = null;
 
+  // Sanitized agent name used for logging to prevent log injection
+  private readonly safeName: string;
+
   // Note: sessionEndProcessed and lastSummaryRawContent are inherited from BaseWrapper
 
   constructor(config: RelayPtyOrchestratorConfig) {
@@ -238,6 +241,9 @@
       throw new Error(`Invalid agent name: "${config.name}" contains path traversal characters`);
     }
 
+    // Sanitize agent name for safe logging (remove line breaks)
+    this.safeName = config.name.replace(/[\r\n]/g, ' ');
+
     // Get project paths (used for logs and local mode)
     const projectPaths = getProjectPaths(config.cwd);
 
@@ -321,7 +327,7 @@
    */
   private log(message: string): void {
     if (this.config.debug) {
-      console.log(`[relay-pty-orchestrator:${this.config.name}] ${message}`);
+      console.log(`[relay-pty-orchestrator:${this.safeName}] ${message}`);
     }
   }
 
@@ -329,7 +335,7 @@
    * Error log - always outputs (errors are important)
    */
   private logError(message: string): void {
-    console.error(`[relay-pty-orchestrator:${this.config.name}] ERROR: ${message}`);
+    console.error(`[relay-pty-orchestrator:${this.safeName}] ERROR: ${message}`);
   }
 
   /**
EOF
@khaliqgant khaliqgant merged commit 760975a into main Jan 24, 2026
25 of 26 checks passed
@khaliqgant khaliqgant deleted the fix/relay-message-latency branch January 24, 2026 11:14