Skip to content

fix(platform): streaming UX, workflow recovery, and dev health#563

Merged
larryro merged 3 commits into
mainfrom
fix/streaming-ux-workflow-recovery-dev-health
Feb 26, 2026
Merged

fix(platform): streaming UX, workflow recovery, and dev health#563
larryro merged 3 commits into
mainfrom
fix/streaming-ux-workflow-recovery-dev-health

Conversation

@larryro
Copy link
Copy Markdown
Collaborator

@larryro larryro commented Feb 25, 2026

Summary

  • Auto-scroll stabilization: Replaced wheel/touch event sniffing with a position-based approach ("was user at bottom before content grew?"), eliminating race conditions. Added a programmatic scroll guard to prevent our own scrollTo calls from falsely disabling auto-scroll.
  • Code block streaming improvements: Debounced Shiki highlighting during streaming (150ms) so rapidly changing code blocks show plain text instead of triggering expensive highlight calls every frame. Removed cursor injection from <pre> elements to prevent misplaced cursors inside code blocks.
  • Workflow recovery hardening: Increased stuck workflow recovery timeout from 30 min → 6 hours, added per-workflow timeout via workflowConfig, and cancel underlying component workflow executions during recovery.
  • Dev script reliability: Added TCP health check with auto-restart for Convex backend in the dev script, recovering from unresponsive states with configurable retry limits.
  • Route loader unblocking: Switched /dashboard/$id loader from blocking ensureQueryData to non-blocking prefetchQuery so navigation isn't blocked by slow queries.

Test plan

  • Added tests for useAutoScroll hook (474 new lines)
  • Added tests for CodeBlock debounced highlighting (134 new lines)
  • Extended useStreamBuffer tests (169 new lines)
  • Extended recoverStuckExecutions tests (190 new lines)
  • Updated IncrementalMarkdown cursor injection tests

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added health-check and auto-restart capabilities for development environment backend.
    • Improved loading state indicators for dashboard during data fetch.
  • Bug Fixes

    • Enhanced code syntax highlighting with debouncing for smoother streaming performance.
    • Refined chat auto-scroll behavior to respect user scroll position and content changes.
    • Removed cursor injection from code blocks for cleaner display.
  • Tests

    • Added comprehensive test coverage for code highlighting and auto-scroll functionality.

…eliability

Remove cursor injection from <pre> elements to prevent misplaced cursors
inside code blocks. Always show thinking animation for incomplete assistant
messages regardless of streaming text presence.

Increase stuck workflow recovery timeout from 30 minutes to 6 hours and
support per-workflow timeout via workflowConfig. Cancel underlying component
workflow executions during recovery and schedule cleanup.

Add TCP health check with auto-restart for Convex backend in the dev script,
recovering from unresponsive states with configurable retry limits.
… unblock route loader

Debounce Shiki highlighting during streaming (150ms) so rapidly changing
code blocks show plain text instead of triggering expensive highlight calls
on every frame. Track which code version was highlighted to avoid showing
stale syntax-highlighted HTML.

Simplify useAutoScroll from wheel/touch event sniffing to a position-based
approach — "was the user at the bottom before content grew?" This eliminates
race conditions between input events and works with all scroll methods
(scrollbar drag, keyboard, programmatic). Fix findSafeAnchor to locate the
paragraph break before an open code fence instead of resetting to 0, and add
a monotonic guard to prevent anchor oscillation in long sections.

Switch the /dashboard/$id route loader from blocking ensureQueryData to
non-blocking prefetchQuery so navigation isn't blocked by slow queries,
with a spinner fallback during cold page refreshes.
Add a programmatic scroll guard ref to distinguish our own scrollTo
calls from user scrolls. Without this, the async scroll event from a
programmatic scrollTo can arrive after content has grown, causing the
geometry check to return false and permanently disabling auto-scroll.
Also switch scrollToBottom to instant behavior and re-arm auto-scroll
when the user clicks the scroll-to-bottom button.
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Feb 25, 2026

Greptile Summary

This PR delivers substantial improvements to streaming UX, workflow reliability, and development tooling. The changes address race conditions in auto-scroll, optimize code highlighting performance, harden workflow recovery with configurable timeouts, add health monitoring to the dev script, and unblock route navigation.

Key improvements:

  • Position-based auto-scroll eliminates race conditions from event-based detection, with comprehensive test coverage (474 new lines)
  • Debounced Shiki highlighting (150ms) prevents expensive re-renders during streaming, showing plain text until stable
  • Workflow recovery timeout increased from 30 min → 6 hours with per-workflow config support and component workflow cancellation
  • Dev script now includes TCP health checks with auto-restart (max 5 attempts) to recover from unresponsive Convex backend
  • Route loader switched to non-blocking prefetch, preventing navigation hangs on slow queries
  • Removed cursor injection from pre tags to fix misplaced cursors in code blocks

Test coverage: All core changes include comprehensive tests (967+ new test lines across 4 test files), demonstrating thorough validation of edge cases and error scenarios.

Confidence Score: 5/5

  • Safe to merge with high confidence - well-tested changes that improve reliability and UX
  • All major changes include comprehensive test coverage (967+ new test lines), the refactorings simplify logic while fixing bugs, no breaking changes to APIs, and the improvements address real production issues with defensive coding patterns
  • No files require special attention - all changes are well-tested and follow existing patterns

Important Files Changed

Filename Overview
services/platform/app/hooks/use-auto-scroll.ts Refactored from event-based to position-based auto-scroll detection, removing wheel/touch event handling and replacing with programmatic scroll guard. Well-tested with comprehensive coverage.
services/platform/app/features/chat/components/message-bubble/code-block.tsx Added 150ms debounce for Shiki highlighting during streaming and added ref to track which code version is highlighted to prevent stale HTML rendering.
services/platform/convex/workflow_engine/helpers/recovery/recover_stuck_executions.ts Increased timeout from 30 min to 6 hours, added per-workflow timeout via workflowConfig, and cancels component workflow executions during recovery. Thoroughly tested.
services/platform/scripts/dev.mjs Added TCP health check with auto-restart for Convex backend, including configurable retry limits and stability tracking. Improves dev reliability significantly.
services/platform/app/routes/dashboard/$id.tsx Changed from blocking ensureQueryData to non-blocking prefetchQuery in route loader, added loading spinner for better UX during slow queries.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User Scrolls or Content Grows] --> B{Scroll Event Fired}
    B --> C{programmaticScrollRef = true?}
    C -->|Yes| D[Reset flag, set wasAtBottom = true]
    C -->|No| E[Update wasAtBottom via isAtBottom check]
    
    F[ResizeObserver Detects Content Height Change] --> G{wasAtBottom = true?}
    G -->|Yes| H[Set programmaticScrollRef = true]
    G -->|No| I[Do nothing]
    H --> J[scrollTo bottom with instant behavior]
    J --> K[Scroll event will fire asynchronously]
    K --> B
    
    L[Code Block Streaming] --> M{Code Changes Every ~50ms}
    M --> N[Debounce Timer Resets]
    N --> O{Stream Ends?}
    O -->|Yes| P[Wait 150ms]
    P --> Q[Trigger Single Shiki Highlight]
    O -->|No| R[Show Plain Text]
    
    S[Workflow Execution Running] --> T{updatedAt > timeout?}
    T -->|Yes| U[Get timeout from workflowConfig or default 6h]
    U --> V[Cancel Component Workflow]
    V --> W[Mark as failed]
    W --> X[Schedule cleanup after 10s]
    T -->|No| Y[Continue monitoring]
    
    Z[Dev Script Starts Convex] --> AA[TCP Health Check Every 30s]
    AA --> AB{Port 3210 Responds?}
    AB -->|No| AC{consecutiveFailures >= 3?}
    AB -->|Yes| AD[Reset failure counter]
    AC -->|Yes| AE[Kill and Restart Convex]
    AC -->|No| AF[Increment failure counter]
    AE --> AG{restartCount < 5?}
    AG -->|Yes| Z
    AG -->|No| AH[Shutdown]
Loading

Last reviewed commit: ec3028c

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 25, 2026

📝 Walkthrough

Walkthrough

This PR implements multiple interconnected improvements to the chat streaming UI, message processing pipeline, and Convex backend infrastructure. Changes include removing cursor injection from code blocks while adding debounced Shiki highlighting, refactoring anchor tracking to export findSafeAnchor, replacing event-driven auto-scroll with position-based ResizeObserver logic, removing isProcessingToolResult state tracking, updating the dashboard loader to use prefetch with loading-state UI, enhancing recover_stuck_executions with per-workflow timeouts and component workflow cancellation, and adding health-check and auto-restart capabilities to the Convex backend dev script.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title concisely describes the main changes across three key areas: streaming UX improvements, workflow recovery enhancements, and dev health monitoring—accurately reflecting the substantial modifications throughout the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/streaming-ux-workflow-recovery-dev-health

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
services/platform/app/hooks/use-auto-scroll.ts (1)

96-158: ⚠️ Potential issue | 🟡 Minor

Docstring at lines 104-105 contradicts the actual implementation.

The JSDoc says "Only scrolls on content growth, not shrinkage" but the code on line 139 triggers on any height change (changed), and the inline comment at line 133 explicitly explains why shrinkage is also handled. The docstring should be updated to match.

📝 Proposed fix
-   * Only scrolls on content growth, not shrinkage, to avoid unnecessary
-   * adjustments when elements are removed (e.g. streaming cursor).
+   * Scrolls on any content height change (growth or shrinkage) while at
+   * bottom, keeping the user anchored during streaming DOM transitions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/app/hooks/use-auto-scroll.ts` around lines 96 - 158, The
JSDoc above the useEffect in use-auto-scroll.ts is incorrect: update the
docstring that currently claims "Only scrolls on content growth, not shrinkage"
to accurately state that the effect will scroll on any content height change
(growth or shrinkage) while the user is at the bottom; reference the useEffect,
ResizeObserver callback, and variables lastContentHeightRef, wasAtBottomRef, and
programmaticScrollRef in the description so it matches the implementation which
checks changed and wasAtBottomRef before calling container.scrollTo and setting
programmaticScrollRef.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@services/platform/app/hooks/__tests__/use-auto-scroll.test.ts`:
- Around line 97-113: The tests override global.ResizeObserver in the beforeEach
block but never restore it; update the setup to capture the original
ResizeObserver (e.g., const originalResizeObserver = global.ResizeObserver)
before assigning the mock in beforeEach and then restore it in afterEach by
setting global.ResizeObserver = originalResizeObserver (in addition to the
existing vi.restoreAllMocks()); reference the existing beforeEach/afterEach and
the global.ResizeObserver assignment to locate where to add the save/restore
logic.
- Around line 52-56: The test suite currently fires scroll events synchronously
which prevents reproducing the delayed-event race that can consume the
programmaticScrollRef guard; update the mock container used in
use-auto-scroll.test.ts (the createMockContainer and its scrollToSpy) to support
a delayed/async dispatch mode (add setScrollDispatchMode('async') and make
fireScrollEvent dispatch the event after a microtask when in async mode), then
add the regression test that: sets the container geometry, sets scrollTop and
setScrollDispatchMode('async'), mounts the hook via setupStreamingHook with
enabled: true, simulates a user-away programmatic scroll
(container.scrollTop=100; container.fireScrollEvent();
result.current.scrollToBottom()), clears scrollToSpy, grows content via
setScrollGeometry/simulateContentGrowth, awaits a microtask Promise.resolve() to
flush the delayed scroll event, and asserts scrollToSpy was called with { top:
1200, behavior: 'instant' } to ensure programmaticScrollRef is still armed.

In `@services/platform/app/routes/dashboard/`$id.tsx:
- Around line 65-71: The parent layout currently renders nothing when !hasRole
&& !isLoading (shows Outlet when hasRole, Spinner when isLoading, else null);
update this branch to surface minimal UI feedback instead of null: render a
small informed placeholder (e.g., a centered message/card with a short
explanation and optional action like "Request access" or a link back) or trigger
a redirect for non-member state. Locate the conditional using hasRole,
isLoading, Outlet, Spinner and replace the null fallback with the chosen
UI/redirect so users aren’t presented with a blank pane when
getCurrentMemberContext returns null.

In `@services/platform/scripts/dev.mjs`:
- Around line 321-335: In waitForConvex(), the startup log uses a hardcoded port
"3210"; change the console.log to reference the CONVEX_PORT constant (alongside
CONVEX_HOST) so the message matches the actual port used by runCommand; update
the log line that currently prints '[dev] ⏳ Waiting for Convex backend on port
3210...' to interpolate CONVEX_PORT (and CONVEX_HOST if desired) so the message
remains accurate when CONVEX_PORT changes.
- Around line 241-256: The killProcessTree Promise can hang if the process never
emits 'exit'; update killProcessTree to add a safety timeout (e.g., configurable
ms) that will resolve the Promise if neither the 'exit' handler nor kill
callback settles it. Specifically, in killProcessTree (proc, signal = 'SIGKILL')
set a timer after calling kill that will remove the 'exit' listener and resolve
the Promise when it fires, and ensure you clear the timer and remove the
fallback listener when the process actually exits or when kill reports an error;
keep existing behavior for successful exit/error but guarantee the Promise
always settles to avoid blocking shutdown's Promise.all.
- Around line 308-319: The Convex exit handler inside spawnConvex references
shutdown which is declared later and can trigger a TDZ ReferenceError if the
exit fires before shutdown is initialized; fix by hoisting shutdown above
spawnConvex — either move the current const shutdown = async () => {...} so it
appears before function spawnConvex(), or convert it into a hoisted named async
function shutdown() { ... }; ensure any references to convexProcess,
shuttingDown, and restarting used in shutdown remain valid and update any
closures accordingly (spawnConvex, the convexProcess.on('exit', ...) handler,
and waitForConvex interactions).

---

Outside diff comments:
In `@services/platform/app/hooks/use-auto-scroll.ts`:
- Around line 96-158: The JSDoc above the useEffect in use-auto-scroll.ts is
incorrect: update the docstring that currently claims "Only scrolls on content
growth, not shrinkage" to accurately state that the effect will scroll on any
content height change (growth or shrinkage) while the user is at the bottom;
reference the useEffect, ResizeObserver callback, and variables
lastContentHeightRef, wasAtBottomRef, and programmaticScrollRef in the
description so it matches the implementation which checks changed and
wasAtBottomRef before calling container.scrollTo and setting
programmaticScrollRef.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c7ca4e and ec3028c.

📒 Files selected for processing (16)
  • services/platform/app/features/chat/components/__tests__/incremental-markdown.test.tsx
  • services/platform/app/features/chat/components/chat-messages.tsx
  • services/platform/app/features/chat/components/incremental-markdown.tsx
  • services/platform/app/features/chat/components/message-bubble/__tests__/code-block.test.tsx
  • services/platform/app/features/chat/components/message-bubble/code-block.tsx
  • services/platform/app/features/chat/hooks/__tests__/use-stream-buffer.test.ts
  • services/platform/app/features/chat/hooks/use-message-processing.ts
  • services/platform/app/features/chat/hooks/use-stream-buffer.ts
  • services/platform/app/hooks/__tests__/use-auto-scroll.test.ts
  • services/platform/app/hooks/use-auto-scroll.ts
  • services/platform/app/hooks/use-current-member-context.ts
  • services/platform/app/routes/dashboard/$id.tsx
  • services/platform/convex/workflow_engine/helpers/recovery/recover_stuck_executions.test.ts
  • services/platform/convex/workflow_engine/helpers/recovery/recover_stuck_executions.ts
  • services/platform/convex/workflow_engine/internal_mutations.ts
  • services/platform/scripts/dev.mjs
💤 Files with no reviewable changes (1)
  • services/platform/app/features/chat/hooks/use-message-processing.ts

Comment on lines +52 to +56
const scrollToSpy = vi.fn((opts?: ScrollToOptions) => {
if (opts)
_scrollTop = Math.min(opts.top ?? 0, el.scrollHeight - el.clientHeight);
fireScrollEvent();
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify mock dispatch ordering and absence/presence of async-dispatch coverage.
rg -n -C2 "scrollToSpy = vi\\.fn|fireScrollEvent\\(\\)|queueMicrotask|setTimeout\\(|requestAnimationFrame\\(" services/platform/app/hooks/__tests__/use-auto-scroll.test.ts

Repository: tale-project/tale

Length of output: 1233


🏁 Script executed:

#!/bin/bash
# Find and examine the hook implementation
fd "use-auto-scroll.ts" --type f | grep -v test | head -5

Repository: tale-project/tale

Length of output: 107


🏁 Script executed:

#!/bin/bash
# Read the hook implementation to understand programmaticScrollRef
cat services/platform/app/hooks/use-auto-scroll.ts

Repository: tale-project/tale

Length of output: 6741


Add regression test for delayed scroll-event race condition.

The programmaticScrollRef guard (line 80 and ResizeObserver line 141) prevents async scroll events from falsely updating position state. However, the current mock fires scroll events synchronously (line 55), so the test suite never reproduces the delayed-event race where back-to-back scrollTo() calls cause a guard flag to be consumed by the wrong scroll event.

Add a delayed-dispatch mode to the mock and a dedicated regression test to cover this scenario:

Suggested test-harness change + regression test
 function createMockContainer() {
   const el = document.createElement('div');
   let _scrollTop = 0;
   const scrollListeners: EventListener[] = [];
+  let scrollDispatchMode: 'sync' | 'async' = 'sync';
@@
   const scrollToSpy = vi.fn((opts?: ScrollToOptions) => {
     if (opts)
       _scrollTop = Math.min(opts.top ?? 0, el.scrollHeight - el.clientHeight);
-    fireScrollEvent();
+    if (scrollDispatchMode === 'sync') {
+      fireScrollEvent();
+    } else {
+      queueMicrotask(() => fireScrollEvent());
+    }
   });
@@
   return {
     el,
     scrollToSpy,
@@
+    setScrollDispatchMode(mode: 'sync' | 'async') {
+      scrollDispatchMode = mode;
+    },
it('keeps auto-scroll armed when programmatic scroll event is delayed', async () => {
  const container = createMockContainer();
  container.setScrollGeometry(1000, 400);
  container.scrollTop = 600;
  container.setScrollDispatchMode('async');

  const { result, rerender } = setupStreamingHook(container);
  rerender({ enabled: true });

  // User was away, then uses programmatic scroll-to-bottom
  act(() => {
    container.scrollTop = 100;
    container.fireScrollEvent();
    result.current.scrollToBottom();
  });

  container.scrollToSpy.mockClear();

  // Next growth should still auto-scroll
  container.setScrollGeometry(1200, 400);
  simulateContentGrowth(800);

  await Promise.resolve(); // flush microtask-delayed scroll event
  expect(container.scrollToSpy).toHaveBeenCalledWith({
    top: 1200,
    behavior: 'instant',
  });
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/app/hooks/__tests__/use-auto-scroll.test.ts` around lines
52 - 56, The test suite currently fires scroll events synchronously which
prevents reproducing the delayed-event race that can consume the
programmaticScrollRef guard; update the mock container used in
use-auto-scroll.test.ts (the createMockContainer and its scrollToSpy) to support
a delayed/async dispatch mode (add setScrollDispatchMode('async') and make
fireScrollEvent dispatch the event after a microtask when in async mode), then
add the regression test that: sets the container geometry, sets scrollTop and
setScrollDispatchMode('async'), mounts the hook via setupStreamingHook with
enabled: true, simulates a user-away programmatic scroll
(container.scrollTop=100; container.fireScrollEvent();
result.current.scrollToBottom()), clears scrollToSpy, grows content via
setScrollGeometry/simulateContentGrowth, awaits a microtask Promise.resolve() to
flush the delayed scroll event, and asserts scrollToSpy was called with { top:
1200, behavior: 'instant' } to ensure programmaticScrollRef is still armed.

Comment on lines +97 to +113
beforeEach(() => {
resizeObserverCallback = undefined;
resizeObserverDisconnect = vi.fn();

global.ResizeObserver = class MockResizeObserver {
constructor(callback: ResizeObserverCallback) {
resizeObserverCallback = callback;
}
observe = vi.fn();
disconnect = resizeObserverDisconnect;
unobserve = vi.fn();
} as unknown as typeof ResizeObserver;
});

afterEach(() => {
vi.restoreAllMocks();
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Restore the original global ResizeObserver after each test.

The suite overrides global.ResizeObserver in beforeEach but never restores the original value. This can leak into unrelated tests when environment isolation settings change.

Suggested cleanup hardening
 let resizeObserverCallback: ResizeObserverCallback | undefined;
 let resizeObserverDisconnect: ReturnType<typeof vi.fn>;
+const originalResizeObserver = global.ResizeObserver;
@@
 afterEach(() => {
   vi.restoreAllMocks();
+  if (originalResizeObserver) {
+    global.ResizeObserver = originalResizeObserver;
+  } else {
+    delete (global as { ResizeObserver?: typeof ResizeObserver }).ResizeObserver;
+  }
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/app/hooks/__tests__/use-auto-scroll.test.ts` around lines
97 - 113, The tests override global.ResizeObserver in the beforeEach block but
never restore it; update the setup to capture the original ResizeObserver (e.g.,
const originalResizeObserver = global.ResizeObserver) before assigning the mock
in beforeEach and then restore it in afterEach by setting global.ResizeObserver
= originalResizeObserver (in addition to the existing vi.restoreAllMocks());
reference the existing beforeEach/afterEach and the global.ResizeObserver
assignment to locate where to add the save/restore logic.

Comment on lines +65 to +71
{hasRole ? (
<Outlet />
) : isLoading ? (
<div className="flex flex-1 items-center justify-center">
<Spinner size="md" />
</div>
) : null}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, confirm the file exists and check its structure
find . -name "\$id.tsx" -path "*/routes/dashboard/*" 2>/dev/null | head -20

Repository: tale-project/tale

Length of output: 109


🏁 Script executed:

#!/bin/bash
# List the dashboard route directory structure
find services/platform/app/routes/dashboard -type f -name "*.tsx" -o -name "*.ts" 2>/dev/null | sort

Repository: tale-project/tale

Length of output: 2823


🏁 Script executed:

#!/bin/bash
# Look for the specific file and read it
if [ -f "services/platform/app/routes/dashboard/\$id.tsx" ]; then
  cat -n "services/platform/app/routes/dashboard/\$id.tsx"
elif [ -f "services/platform/app/routes/dashboard/[\$]id.tsx" ]; then
  cat -n "services/platform/app/routes/dashboard/[\$]id.tsx"
else
  # Try finding it with a wildcard pattern
  find services/platform -name "*id*" -path "*/dashboard/*" -type f 2>/dev/null | head -10
fi

Repository: tale-project/tale

Length of output: 3630


🏁 Script executed:

#!/bin/bash
# Find the useCurrentMemberContext hook implementation
find services/platform -name "*.tsx" -o -name "*.ts" | xargs grep -l "useCurrentMemberContext" | head -5

Repository: tale-project/tale

Length of output: 351


🏁 Script executed:

#!/bin/bash
# Search for the hook definition
rg -n "export.*useCurrentMemberContext|function useCurrentMemberContext" services/platform/

Repository: tale-project/tale

Length of output: 162


🏁 Script executed:

#!/bin/bash
# Check parent route: /dashboard
cat -n "services/platform/app/routes/dashboard.tsx" 2>/dev/null || cat -n "services/platform/app/routes/dashboard/index.tsx" 2>/dev/null || echo "Not found, searching..."

Repository: tale-project/tale

Length of output: 953


🏁 Script executed:

#!/bin/bash
# Search for error boundaries or redirects in route tree
rg -n "ErrorBoundary|navigate|redirect" services/platform/app/routes/ -t tsx --max-count=50

Repository: tale-project/tale

Length of output: 88


🏁 Script executed:

#!/bin/bash
# Read the useCurrentMemberContext hook
cat -n "services/platform/app/hooks/use-current-member-context.ts"

Repository: tale-project/tale

Length of output: 1298


🏁 Script executed:

#!/bin/bash
# Search for error boundaries or redirects with correct type flags
rg -n "ErrorBoundary|navigate|redirect" services/platform/app/routes/ --type ts --type tsx --max-count=50 2>/dev/null || \
rg -n "ErrorBoundary|navigate|redirect" services/platform/app/routes/

Repository: tale-project/tale

Length of output: 8234


🏁 Script executed:

#!/bin/bash
# Check if there's an errorComponent or errorElement in the $id route definition
rg -A10 "createFileRoute.*dashboard.*id" services/platform/app/routes/dashboard/\$id.tsx

Repository: tale-project/tale

Length of output: 355


🏁 Script executed:

#!/bin/bash
# Find and read the Convex query to see if it silently swallows errors
find services/platform -path "*/convex/*" -name "*getCurrentMemberContext*" -o -path "*/members/queries*"

Repository: tale-project/tale

Length of output: 104


🏁 Script executed:

#!/bin/bash
# Search for getCurrentMemberContext implementation in Convex
rg -n "getCurrentMemberContext" services/platform/convex/ -A5 -B2

Repository: tale-project/tale

Length of output: 623


🏁 Script executed:

#!/bin/bash
# Read the full getCurrentMemberContext implementation
cat -n "services/platform/convex/members/queries.ts" | sed -n '36,80p'

Repository: tale-project/tale

Length of output: 1424


🏁 Script executed:

#!/bin/bash
# Check if there's a try-catch or error handling in the query
rg -A30 "export const getCurrentMemberContext" services/platform/convex/members/queries.ts

Repository: tale-project/tale

Length of output: 745


🏁 Script executed:

#!/bin/bash
# Check if any child routes have their own guards for org membership
rg -n "beforeLoad|errorComponent" services/platform/app/routes/dashboard/\$id/*.tsx | head -20

Repository: tale-project/tale

Length of output: 355


🏁 Script executed:

#!/bin/bash
# Check if the root app has an error boundary or catch-all error handling
find services/platform/app/routes -name "__root.tsx" -o -name "root.tsx" | xargs cat -n 2>/dev/null | head -50

Repository: tale-project/tale

Length of output: 1871


The null fallback reflects intentional Convex design—consider adding UI feedback for clarity.

The parent /dashboard route guards authentication at the session level, redirecting unauthenticated users to /log-in. From there, getCurrentMemberContext returns null intentionally when a user lacks membership in the requested organization (per Convex query convention: failures are swallowed and treated as an expected authorization state). This lands on the null branch when !hasRole && !isLoading.

Unlike the blocking ensureQueryData pattern that would fail navigation itself, prefetchQuery allows the render to proceed. The silent null return is by design, but it does produce a blank pane with no user indication that they are not members of this organization. Some child routes (e.g., /$id/index.tsx) redirect when not in the org, so progression works downstream, but the parent layout offers no feedback.

Consider adding a minimal UI indicator or ensuring all child routes that require membership redirect appropriately, rather than silently rendering empty content at this level.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/app/routes/dashboard/`$id.tsx around lines 65 - 71, The
parent layout currently renders nothing when !hasRole && !isLoading (shows
Outlet when hasRole, Spinner when isLoading, else null); update this branch to
surface minimal UI feedback instead of null: render a small informed placeholder
(e.g., a centered message/card with a short explanation and optional action like
"Request access" or a link back) or trigger a redirect for non-member state.
Locate the conditional using hasRole, isLoading, Outlet, Spinner and replace the
null fallback with the chosen UI/redirect so users aren’t presented with a blank
pane when getCurrentMemberContext returns null.

Comment on lines +241 to +256
function killProcessTree(proc, signal = 'SIGKILL') {
return new Promise((resolve) => {
if (!proc || proc.killed || !proc.pid) {
resolve();
return;
}
const onExit = () => resolve();
proc.once('exit', onExit);
kill(proc.pid, signal, (err) => {
if (err) {
proc.removeListener('exit', onExit);
resolve();
}
});
});
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

killProcessTree can hang indefinitely if the process never emits exit.

The kill callback only calls resolve() on error; on success it relies solely on the exit event. If the process never emits exit (e.g., a zombie or a pathological state), the Promise never settles, blocking shutdown's Promise.all forever.

🛡️ Proposed fix — add a safety timeout
 function killProcessTree(proc, signal = 'SIGKILL') {
   return new Promise((resolve) => {
     if (!proc || proc.killed || !proc.pid) {
       resolve();
       return;
     }
-    const onExit = () => resolve();
+    let settled = false;
+    const settle = () => {
+      if (settled) return;
+      settled = true;
+      clearTimeout(safetyTimer);
+      proc.removeListener('exit', settle);
+      resolve();
+    };
+    const safetyTimer = setTimeout(settle, 5_000);
-    proc.once('exit', onExit);
+    proc.once('exit', settle);
     kill(proc.pid, signal, (err) => {
       if (err) {
-        proc.removeListener('exit', onExit);
-        resolve();
+        settle();
       }
     });
   });
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/scripts/dev.mjs` around lines 241 - 256, The
killProcessTree Promise can hang if the process never emits 'exit'; update
killProcessTree to add a safety timeout (e.g., configurable ms) that will
resolve the Promise if neither the 'exit' handler nor kill callback settles it.
Specifically, in killProcessTree (proc, signal = 'SIGKILL') set a timer after
calling kill that will remove the 'exit' listener and resolve the Promise when
it fires, and ensure you clear the timer and remove the fallback listener when
the process actually exits or when kill reports an error; keep existing behavior
for successful exit/error but guarantee the Promise always settles to avoid
blocking shutdown's Promise.all.

Comment on lines +308 to +319
function spawnConvex() {
convexProcess = spawn('npx', ['--yes', 'convex', 'dev'], {
stdio: 'inherit',
cwd: platformRoot,
env: convexEnv,
});
convexProcess.on('exit', (code) => {
if (shuttingDown || restarting) return;
console.log(`[dev] Convex exited with code ${code}`);
void shutdown();
});
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

TDZ bug: shutdown is referenced before it is defined.

shutdown is declared as const shutdown = async () => {...} at Line 459, after this spawnConvex closure is created. The exit handler at Line 317 calls void shutdown(). During await waitForConvex() (Line 413), the event loop is free to process the Convex exit event. If Convex exits before execution reaches Line 459, the closure evaluates shutdown while it is still in the temporal dead zone, throwing a synchronous ReferenceError — unhandled because the event callback is outside the try/catch.

Move shutdown's declaration above spawnConvex (or convert it to a named async function shutdown() declaration, which is hoisted):

🐛 Proposed fix — hoist `shutdown` as a named async function
+    // Hoist shutdown so spawnConvex's exit handler can safely call it
+    async function shutdown() {
+      if (shuttingDown) return;
+      shuttingDown = true;
+
+      if (healthCheckTimer) clearInterval(healthCheckTimer);
+
+      console.log('\n[dev] 👋 Shutting down...');
+
+      await Promise.all([
+        killProcessTree(convexProcess, 'SIGTERM'),
+        killProcessTree(viteProcess, 'SIGTERM'),
+      ]);
+
+      await new Promise((resolve) => setTimeout(resolve, 1000));
+
+      console.log('[dev] ✅ All processes stopped');
+      process.exit(0);
+    }
+
     function spawnConvex() {
       ...
     }
     ...
-    // Handle shutdown
-    const shutdown = async () => {
-      if (shuttingDown) return;
-      shuttingDown = true;
-      ...
-    };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/scripts/dev.mjs` around lines 308 - 319, The Convex exit
handler inside spawnConvex references shutdown which is declared later and can
trigger a TDZ ReferenceError if the exit fires before shutdown is initialized;
fix by hoisting shutdown above spawnConvex — either move the current const
shutdown = async () => {...} so it appears before function spawnConvex(), or
convert it into a hoisted named async function shutdown() { ... }; ensure any
references to convexProcess, shuttingDown, and restarting used in shutdown
remain valid and update any closures accordingly (spawnConvex, the
convexProcess.on('exit', ...) handler, and waitForConvex interactions).

Comment on lines +321 to +335
async function waitForConvex() {
console.log('[dev] ⏳ Waiting for Convex backend on port 3210...');
await runCommand('npx', [
'--yes',
'wait-on',
`tcp:${CONVEX_HOST}:${CONVEX_PORT}`,
'--timeout',
'180000',
'--interval',
'250',
]);
convexReadyAt = Date.now();
consecutiveFailures = 0;
console.log('[dev] ✅ Convex backend is ready!');
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Hardcoded port 3210 in log message; use the CONVEX_PORT constant.

♻️ Proposed fix
-      console.log('[dev] ⏳ Waiting for Convex backend on port 3210...');
+      console.log(`[dev] ⏳ Waiting for Convex backend on port ${CONVEX_PORT}...`);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async function waitForConvex() {
console.log('[dev] ⏳ Waiting for Convex backend on port 3210...');
await runCommand('npx', [
'--yes',
'wait-on',
`tcp:${CONVEX_HOST}:${CONVEX_PORT}`,
'--timeout',
'180000',
'--interval',
'250',
]);
convexReadyAt = Date.now();
consecutiveFailures = 0;
console.log('[dev] ✅ Convex backend is ready!');
}
async function waitForConvex() {
console.log(`[dev] ⏳ Waiting for Convex backend on port ${CONVEX_PORT}...`);
await runCommand('npx', [
'--yes',
'wait-on',
`tcp:${CONVEX_HOST}:${CONVEX_PORT}`,
'--timeout',
'180000',
'--interval',
'250',
]);
convexReadyAt = Date.now();
consecutiveFailures = 0;
console.log('[dev] ✅ Convex backend is ready!');
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/scripts/dev.mjs` around lines 321 - 335, In
waitForConvex(), the startup log uses a hardcoded port "3210"; change the
console.log to reference the CONVEX_PORT constant (alongside CONVEX_HOST) so the
message matches the actual port used by runCommand; update the log line that
currently prints '[dev] ⏳ Waiting for Convex backend on port 3210...' to
interpolate CONVEX_PORT (and CONVEX_HOST if desired) so the message remains
accurate when CONVEX_PORT changes.

@larryro larryro merged commit 3238a79 into main Feb 26, 2026
18 checks passed
@larryro larryro deleted the fix/streaming-ux-workflow-recovery-dev-health branch February 26, 2026 03:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants