Website#2
Merged
Merged
Conversation
lukemarsden
added a commit
that referenced
this pull request
Oct 10, 2025
FINAL WORKING SOLUTION after 8 attempts: Fix #1: Duplicate pause guard (Wolf) - Prevents multiple EOS events - Session count stays correct - CONFIRMED working in logs Fix #2: Prevent auto-leave on pause (Wolf + Helix) - Lobbies don't auto-leave when Wolf-UI pauses - Wolf-UI session stays connected to lobby even when disconnected - Lobby never becomes empty - No stale buffer accumulation - Agents keep running Test pattern: 1→2→3→1 should now work without rejoin hang! 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
lukemarsden
added a commit
that referenced
this pull request
Oct 16, 2025
FINAL WORKING SOLUTION after 8 attempts: Fix #1: Duplicate pause guard (Wolf) - Prevents multiple EOS events - Session count stays correct - CONFIRMED working in logs Fix #2: Prevent auto-leave on pause (Wolf + Helix) - Lobbies don't auto-leave when Wolf-UI pauses - Wolf-UI session stays connected to lobby even when disconnected - Lobby never becomes empty - No stale buffer accumulation - Agents keep running Test pattern: 1→2→3→1 should now work without rejoin hang! 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
lukemarsden
added a commit
that referenced
this pull request
Nov 14, 2025
THREE CRITICAL BUGS found causing HTTPS deadlock within 16 hours: BUG #1: GStreamer Thread-Safety Violation (PRIMARY ROOT CAUSE) - gst_element_send_event() called from HTTPS thread (wrong context!) - Must be called from pipeline's g_main_loop_run() thread - HTTPS thread blocks on GStreamer internal mutex (0x70537c0062b0) - Located in streaming.cpp:124, 132, 176, 184, 401, 524 - FIX: Use g_main_loop_quit() instead (thread-safe) BUG #2: NVIDIA Driver Mutex Deadlock (SECONDARY) - Multiple GStreamer pipelines compete for NVIDIA mutex (0x705580003b80) - Circular deadlock: HTTPS→GStreamer→NVIDIA→? - Core dump shows 2 threads stuck on same NVIDIA mutex - Inside proprietary libEGL_nvidia.so.0 (no symbols) - FIX: Separate CUDA contexts per pipeline OR remove NVIDIA from SSL BUG #3: HTTPS Connection Leak (CONTRIBUTING FACTOR) - custom-https.cpp error handler doesn't close sockets - 17 leaked connections in 16 hours (~1/hour leak rate!) - Connections stuck in CLOSE_WAIT forever - From: external browsers, moonlight-web, localhost - FIX: Add socket->close() in error handler COMPLETE DEADLOCK CHAIN: 1. HTTPS request fires StopStreamEvent (endpoints.hpp:484) 2. Event handler runs in HTTPS thread (synchronous dispatch) 3. Calls gst_element_send_event() - WRONG THREAD (Bug #1) 4. Blocks on GStreamer mutex 5. GStreamer holds mutex, waiting on NVIDIA 6. NVIDIA mutex held by another operation 7. ALL new HTTPS requests block on continue_lock() 8. System appears completely hung for HTTPS EVIDENCE: - HTTP (port 47989) still works perfectly - HTTPS (port 47984) completely hung - Core dump shows exact mutex addresses and call stacks - 17 leaked CLOSE_WAIT connections - Thread 99 stuck in gst_element_send_event from wrong context CRITICAL FIX: Replace all gst_element_send_event(eos) with g_main_loop_quit() in event handlers at streaming.cpp:124,132,176,184,401,524
lukemarsden
added a commit
that referenced
this pull request
Nov 19, 2025
BUG #1: Inconsistent NVIDIA runtime detection pattern - Line 811 used 'grep -i nvidia' (too broad, matches image names) - Line 779 used 'grep -i "runtimes.*nvidia"' (correct, matches runtime only) - Fixed line 811 to use consistent pattern - Prevents false positives when nvidia/cuda images are present BUG #2: Race condition after Docker installation - Docker daemon takes 1-3 seconds to initialize after systemctl start - check_docker_sudo() was called immediately, could fail if daemon not ready - Added 30-second wait loop checking 'docker ps' readiness - Applied to both Ubuntu/Debian and Fedora installation paths - Prevents intermittent failures: "Docker is not running" after fresh install Both fixes are defensive and prevent edge cases without changing behavior for correctly configured systems.
3 tasks
chocobar
added a commit
that referenced
this pull request
Feb 11, 2026
…ation - Fix silently swallowed Exec() error in migration (bug #1) - Fix WHERE condition: LENGTH(name) > 255 instead of OCTET_LENGTH > 2704 (bug #2) - Add Go-level name truncation in CreateSession, UpdateSession, UpdateSessionMeta, and UpdateSessionName to prevent cryptic GORM errors - Add 6 unit tests covering truncation for ASCII, multibyte (CJK), and boundary cases across all session name write paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
lukemarsden
added a commit
that referenced
this pull request
Mar 18, 2026
Both sessions have spec_task_id set. First has no agent_type (incomplete creation). This race likely causes duplicate message sends (issue #2) and the agent jumping from spec to implementation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
lukemarsden
added a commit
that referenced
this pull request
Mar 18, 2026
Spec-Ref: helix-specs@620ace9f4:001588_read-helixs-design2026
lukemarsden
added a commit
that referenced
this pull request
Mar 18, 2026
lukemarsden
added a commit
that referenced
this pull request
Mar 18, 2026
Issue #1 (stuck "Starting Desktop"): - Add defer in StartDesktop to clear external_agent_status on any error - Give waitForDesktopBridge its own 90s context decoupled from dockerCtx Issue #4 (status not cleared on stop): - StopDesktop unconditionally clears external_agent_status and status_message Issue #5 (no restart button in Starting state): - Frontend: show Stop button in "starting" state in both screenshot and stream modes - Show "may have failed to start" message after 2-minute timeout Issue #10a (duplicate sessions per spectask): - Re-read task from DB before CreateSession; skip if PlanningSessionID already set Issue #10b (scanner targets wrong sessions): - processPendingPromptsForIdleSessions now filters to canonical planning_session_id only Issue #2 (duplicate message sends): - Add ClaimPromptForSending() atomic store method (UPDATE WHERE status IN pending/failed) - Both interrupt and any-pending delivery paths use claim before send Issue #7 (promotion race gives empty zvol): - resolveDockerDataDir: acquire read lock before fresh zvol creation; re-check after Issue #3: Already handled by existing open_thread on agent_ready reconnect Issue #6: Fixed in merged PR #1947 (RecoverStaleBuilds 60s retry) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Spec-Ref: helix-specs@04b515c3c:001588_read-helixs-design2026
lukemarsden
added a commit
that referenced
this pull request
Mar 19, 2026
Tests were missing expectations for two new store calls introduced by the ZFS deployment issue fixes: - ClaimPromptForSending (fix #2): added before MarkPromptAsSent / MarkPromptAsFailed in processAnyPendingPrompt and handleAgentReady tests - GetSpecTask (fix #10b): added to all processPendingPromptsForIdleSessions tests; returns a SpecTask with PlanningSessionID matching the test session so canonical-session filtering passes through correctly Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Spec-Ref: helix-specs@04b515c3c:001588_read-helixs-design2026
chocobar
pushed a commit
that referenced
this pull request
Apr 22, 2026
The Design Review UI made two sequential API calls on "Approve Design": 1. submitReviewMutation (marks review record approved) 2. v1SpecTasksApproveSpecsCreate (approves the spec task) If #1 succeeded but #2 failed, the review showed "approved" but the spec task stayed in spec_review with SpecApproval == nil — creating the inconsistent state that led to the infinite loop. Fix: move spec task approval into submitDesignReview's "approve" case (matching the existing pattern where "request_changes" already updates the spec task). Remove the redundant second API call from the frontend. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Spec-Ref: helix-specs@560941003:001869_bug-report-spec-tasks
philwinder
added a commit
that referenced
this pull request
Apr 28, 2026
Previously the activation prompt only carried Body. The Worker had to call read_events to learn Subject, From, ThreadID, Extra — exactly the round-trip that caused the docs-engineer to misroute issue #3 to PR #2 during the github demo's E2E run. renderTrigger now formats every populated envelope field into the prompt, omitting empties for cleanliness. The Trigger.Body field is dropped; callers pass the full Message instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
philwinder
added a commit
that referenced
this pull request
May 4, 2026
Previously the activation prompt only carried Body. The Worker had to call read_events to learn Subject, From, ThreadID, Extra — exactly the round-trip that caused the docs-engineer to misroute issue #3 to PR #2 during the github demo's E2E run. renderTrigger now formats every populated envelope field into the prompt, omitting empties for cleanliness. The Trigger.Body field is dropped; callers pass the full Message instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
NOTE: must delete keycloak db otherwise everything will be fucked