
fix(slack): split Text/TextStreaming output filters, fix dropped and duplicated messages#407

Merged
Aaronontheweb merged 29 commits into dev from claude-wt-skills-loading on Mar 24, 2026

Conversation

@Aaronontheweb

Collaborator

Summary

  • Add OutputFilter.TextStreaming flag to separate streaming deltas from final assembled text
  • Slack subscribes to Text but not TextStreaming, so it receives only final responses; the TUI also gets TextStreaming for live rendering
  • Eliminates root cause of both dropped final responses AND duplicate message posting
  • Remove dead streaming code from SlackThreadBindingActor (buffer, delta tracking, flush handler)
  • Fix silent fallback on slash-command IO failure (emit an error instead of falling through to the LLM)
  • Promote Slack event drop/filter logging from Debug to Info level
  • Reclassify empty_text events from Dropped to Filtered counter
  • Batch skill enrichment into single LLM call, run in background to avoid blocking startup

Root Cause

The session actor emitted both TextDeltaOutput (streaming) and TextOutput (final) under the same OutputFilter.Text flag. Slack received both, leading to:

  1. Duplicate posts when no tool calls occurred
  2. Dropped final responses during tool loops (guards that prevented duplicates also blocked new content)

Fix

TextDeltaOutput and BufferFlush now emit under OutputFilter.TextStreaming. Each adapter subscribes only to the flags it needs:

  • Slack: Text | Files — gets final assembled text, posts once
  • TUI: Full — gets both, handles dedup via segment tracking (existing behavior)
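
The flag split can be illustrated with a small Python sketch. This is not the project's .NET code: the `OutputFilter` member names mirror the PR, but the `EMIT_FLAG` table, the `delivered` helper, and the assumption that `Full` is simply the union of the three flags shown are all hypothetical.

```python
from enum import Flag, auto


class OutputFilter(Flag):
    TEXT = auto()            # final assembled text (TextOutput)
    TEXT_STREAMING = auto()  # streaming deltas (TextDeltaOutput, BufferFlush)
    FILES = auto()
    FULL = TEXT | TEXT_STREAMING | FILES  # assumption: Full is the union of these flags


# Hypothetical table: which flag each output type is emitted under after this change.
EMIT_FLAG = {
    "TextOutput": OutputFilter.TEXT,
    "TextDeltaOutput": OutputFilter.TEXT_STREAMING,
    "BufferFlush": OutputFilter.TEXT_STREAMING,
}


def delivered(subscription, outputs):
    """Return the outputs a subscriber with the given filter would receive."""
    return [o for o in outputs if EMIT_FLAG[o] & subscription]


SLACK = OutputFilter.TEXT | OutputFilter.FILES
TUI = OutputFilter.FULL

# One LLM response: streamed deltas, a flush, then the final assembled text.
turn = ["TextDeltaOutput", "TextDeltaOutput", "BufferFlush", "TextOutput"]
slack_sees = delivered(SLACK, turn)
tui_sees = delivered(TUI, turn)
```

Under these assumptions Slack receives exactly one event per response, so it needs no duplicate guard, while the TUI still sees every delta.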

Test plan

  • Verify Slack receives final response after multi-step tool loops
  • Verify no duplicate messages in Slack
  • Verify TUI netclaw chat still renders streaming tokens
  • Verify slash-command IO failure shows error instead of passing to LLM

…nges

Three OpenSpec changes to harden skills as a first-class platform capability:

1. compressed-skill-index: Replace verbose GenerateDescriptionMenu() with
   pipe-delimited compressed format, add LLM sidecar for trigger phrase
   generation, and filter skills by session audience and available tools.

2. skill-tools-and-slash-commands: Add skill_load, skill_read_resource, and
   skill_manage tools. Implement slash-command dispatch adopting Claude Code
   invocation model (name = command, disable-model-invocation, user-invocable).

3. trust-tiers-security-stub: Add SkillTrustTier enum with directory-based
   inference, ISkillContentScanner stub, restore skill-authoring system skill,
   and update existing system skill frontmatter with invocation control fields.

…ring skill

Implement the trust-tiers-security-stub OpenSpec change:

- Add SkillTrustTier enum (System/Operator/Community/External/Agent)
- Infer trust tier from directory location in SkillScanner
- Expand hidden-directory scanning to .community, .external, .agent
- Exclude .quarantine from scanning
- Add ISkillContentScanner interface with NoOpSkillContentScanner stub
- Register scanner in DI via SecurityServiceExtensions
- Restore skill-authoring system skill with complete frontmatter spec
  including invocation control fields and trust tier documentation
- Add disable-model-invocation to netclaw-operations skill
- Add 17 new tests covering trust tier inference and directory scanning
- Merge Operator into User tier (self-hosted model doesn't distinguish)
- Add DefaultMinimumAudience() extension mapping tiers to TrustAudience
- Default all tiers to Team minimum; Public requires explicit opt-in
- Consolidate AllowedHiddenDirectories + InferTrustTier into single
  HiddenDirectoryTiers dictionary (single source of truth)
- Update skill-authoring skill docs with new tier table
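
A rough Python sketch of directory-based tier inference driven by a single dictionary, as the bullets above describe. The dictionary contents follow the hidden directories named in the commit; the `infer_trust_tier` signature, the `User` default for non-hidden directories, and returning `None` for unscanned directories are assumptions made for illustration.

```python
from enum import Enum
from pathlib import PurePosixPath


class SkillTrustTier(Enum):
    SYSTEM = "system"
    USER = "user"
    COMMUNITY = "community"
    EXTERNAL = "external"
    AGENT = "agent"


# Single source of truth: a hidden directory is scanned only if it appears here.
HIDDEN_DIRECTORY_TIERS = {
    ".community": SkillTrustTier.COMMUNITY,
    ".external": SkillTrustTier.EXTERNAL,
    ".agent": SkillTrustTier.AGENT,
}


def infer_trust_tier(relative_path, default=SkillTrustTier.USER):
    """Map a path relative to the skills root to a tier, or None if it is not scanned.

    The default tier for non-hidden directories is an assumption.
    """
    parts = PurePosixPath(relative_path).parts
    top = parts[0] if parts else ""
    if top.startswith("."):
        # Unknown hidden directories (for example .quarantine) are excluded.
        return HIDDEN_DIRECTORY_TIERS.get(top)
    return default
```

Because the allow-list and the tier mapping live in one dictionary, adding a new hidden directory cannot enable scanning without also assigning a tier.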

Replace verbose skill index (~2000 tokens) with pipe-delimited compressed
format (~200-400 tokens). Research from dotnet-skills evals shows compressed
indexes achieve 56.5% TPR vs 21.7% for verbose formats.

- Rewrite SkillRegistry.GenerateDescriptionMenu() to pipe-delimited format
  grouped by category, referencing skill_load instead of file_read
- Add per-audience pre-built menus (Public/Team/Personal) with trust-tier
  visibility filtering via DefaultMinimumAudience()
- Add DisableModelInvocation/UserInvocable/ArgumentHint frontmatter fields
  to SkillEntry and SkillFrontmatter (Claude Code invocation model)
- Create SkillIndexEnrichmentService (IHostedService) that generates
  trigger phrases via LLM sidecar, cached to disk by name+version
- Fallback to truncated description when sidecar unavailable
- Update SystemSkillSyncService to rebuild audience menus after re-scan
- 910 tests passing, zero slopwatch violations
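
The commit does not show the actual menu layout, so the following Python sketch only illustrates the general idea: one pipe-delimited line per category, with a fallback to a truncated description when no trigger phrases are available. The line layout, the `build_menu` name, the dictionary keys, and the 40-character cutoff are all hypothetical.

```python
from collections import defaultdict


def build_menu(skills, triggers, max_desc=40):
    """Build a compact menu: one pipe-delimited line per category.

    skills:   list of dicts with "name", "category", "description" (assumed shape)
    triggers: dict mapping skill name to LLM-generated trigger phrases (may be empty)
    """
    by_category = defaultdict(list)
    for skill in skills:
        # Fall back to a truncated description when the sidecar produced nothing.
        hint = triggers.get(skill["name"]) or skill["description"][:max_desc]
        by_category[skill["category"]].append(f'{skill["name"]}:{hint}')

    lines = ["Load a skill with skill_load(name)."]
    for category in sorted(by_category):
        lines.append("|".join([category, *by_category[category]]))
    return "\n".join(lines)


example = build_menu(
    [
        {"name": "deploy", "category": "ops", "description": "Deploy things"},
        {"name": "audit", "category": "ops", "description": "x" * 50},
    ],
    {"deploy": "ship,release"},
)
```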

…spatch

Implement the skill-tools-and-slash-commands OpenSpec change:

- Add skill_load tool (Grant=builtin): loads skill by name, returns body
  with frontmatter stripped + resource manifest
- Add skill_read_resource tool (Grant=builtin): reads files from skill's
  references/scripts/assets dirs with path traversal prevention
- Add skill_manage tool (Grant=builtin): 6-action CRUD
  (create/edit/patch/delete/write_file/remove_file) with frontmatter
  validation, atomic writes, content scanner integration, and registry
  re-scan after mutations
- Add slash-command dispatch to SkillRegistry: /name syntax resolves to
  skill, extracts remainder as user content
- Add slash-command interception in LlmSessionActor: matched commands
  inject skill body as transient system message before LLM call;
  unmatched commands return deterministic error listing available commands
- Register skill tools via WithSkillTools() extension method
- Add 7 new slash-command dispatch tests
- 917 tests passing, zero slopwatch violations
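
The slash-command behavior described above (resolve `/name`, pass the remainder on as user content, return a deterministic error for unknown commands) might look roughly like this in Python. The function name, the tuple-shaped result, and the error wording are assumptions, not the SkillRegistry API.

```python
def dispatch_slash_command(message, skills):
    """Resolve a /name message against a dict of skill name -> skill body.

    Returns one of:
      ("passthrough", message)          not a slash command
      ("skill", body, user_content)     matched; body would be injected before the LLM call
      ("error", text)                   unmatched; no LLM call is made
    """
    if not message.startswith("/"):
        return ("passthrough", message)

    name, _, remainder = message[1:].partition(" ")
    if name in skills:
        return ("skill", skills[name], remainder.strip())

    available = ", ".join("/" + n for n in sorted(skills))
    return ("error", f"Unknown command /{name}. Available commands: {available}")
```

Sorting the available commands keeps the error text deterministic regardless of registry iteration order.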

Add the critical IncludeNativeLibrariesForSelfExtract and
EnableCompressionInSingleFile flags that match the production CI
pipeline. Omitting them causes SQLite native library failures at
runtime. Also adds a skill copy step, a platform RID table, and a
common mistakes reference.

ConfigSchemaDoctorCheck now reports the actual validation errors and
which schema file was loaded when validation fails. Previously it only
said "Config does not match schema" with no diagnostic detail, making
root cause analysis impossible.

Also updates local-binary-swap skill to include Schemas directory copy
step — stale schema at ~/.netclaw/bin/Schemas/ from previous installs
was the root cause of false schema failures after binary swap.

OpenSpec change for security-tui-onboarding:

- Reorder wizard: security posture moves to step 3 (after ChatServices)
- New SecurityPosture step: explicit Personal/Team/Public selection with
  explanatory text, derives shell mode and audience defaults
- New Channels step: per-channel audience assignment with left/right arrow
  cycling, dynamic add/remove via conversations.list API
- Rework ACL step: type-to-filter Slack user search via users.list API
  instead of raw user ID copy-paste
- Remove Exposure step (posture selection replaces it)
- Includes wireframes for all new TUI screens

…narios

- Channels step: skip when Slack disabled, fall back to manual channel
  name entry when conversations.list fails
- ACL step: skip when Slack disabled, fall back to manual ID entry on
  both missing users:read scope and API failure
- Onboarding: explicit scenarios for Slack-disabled skip logic

Update specs and tasks to reuse existing LookupSlackUserTool pagination
and caching logic via extraction into a shared service, rather than
reimplementing users.list in ISlackProbe with raw HTTP. Both the tool
and init wizard consume the same code path.

…inal

Use existing VirtualTerminal + VirtualInputSource pattern from
InitWizardPageTests for headless integration tests of new security
onboarding steps. Covers posture selection, audience cycling, channel
add/remove, navigation, and Slack-disabled fallback paths.

…ble errors

No silent degradation to manual entry when Slack APIs fail. If
conversations.list or users.list fails, show the error reason and
block until the user fixes it (retry with Enter, go back with Esc).
Aligns with the no-silent-fallbacks rule in CLAUDE.md.

Reorder init wizard: Provider → ChatServices → SecurityPosture → ACL →
Channels → Search → BrowserAutomation → Identity → HealthCheck.

- Remove Exposure step (posture selection replaces it)
- Add SecurityPosture step with Personal/Team/Public selection list
- Add Channels step with per-channel audience cycling via left/right
  arrow keys, cursor navigation, d to remove channels
- DeriveSecurityDefaults() pre-populates channel entries from posture
- ChannelEntry class tracks display name, ID, audience, and DM flag
- SyncChannelAudiencesFromEntries() replaces PopulateChannelAudiences()
- Update all navigation tests for new step order
- 239/242 CLI tests passing (3 unrelated health check timeouts)
- Zero slopwatch violations

…ests

- Move webhook URL collection to Identity step (sub-step 4, after timezone)
- Add ViewModel tests: posture derivation, audience cycling, DM row guard
- Add headless TUI tests: SecurityPosture renders options, Channels renders entries
- Descope Slack API search (channel names and user IDs stay as text inputs)
- Update OpenSpec tasks — all complete
- Zero slopwatch violations, all tests passing

… restriction, tool tests

1. Slash-command IO failure now emits deterministic error to user instead
   of silently falling through to LLM (no-silent-fallbacks rule)
2. skill_manage restricted to Personal audience via IsProfileManagedTool —
   Public and Team sessions cannot create/edit/delete skills
3. Add 13 unit tests for SkillLoadTool, SkillReadResourceTool, and
   SkillManageTool covering path traversal, system skill protection,
   name validation, frontmatter validation, patch, delete, and write_file
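
The path traversal cases these tests cover can be pictured with a Python sketch of the kind of check skill_read_resource needs. The three allowed subdirectories come from the PR; `resolve_resource` and its error type are hypothetical, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path

ALLOWED_SUBDIRS = ("references", "scripts", "assets")


def resolve_resource(skill_root, relative):
    """Resolve a resource path and reject anything outside the allowed subdirectories."""
    root = Path(skill_root).resolve()
    # Resolving first collapses ".." segments; an absolute `relative` replaces root entirely.
    target = (root / relative).resolve()
    if not any(target.is_relative_to(root / sub) for sub in ALLOWED_SUBDIRS):
        raise PermissionError(f"{relative!r} is outside the skill's resource directories")
    return target
```

Checking the resolved path, rather than scanning the input string for `..`, also rejects absolute paths and any symlinks that resolve outside the skill directory.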

… ChatServices

Reorder: Provider → SecurityPosture → ChatServices → Channels → Search →
BrowserAutomation → Identity → HealthCheck.

Security posture is now step 2 (right after LLM provider) so it informs
all downstream decisions. ACL (owner identity) folded into ChatServices
as sub-step 6 since it's Slack-specific. Channels step populated via
DeriveSecurityDefaults() when entering the step (after channel names
have been collected in ChatServices).

Removes standalone Acl wizard step entirely. Total steps reduced from
9 to 8.

…g, smart DM audience

- Channels step now populated from raw channel names entered in ChatServices
  (no longer depends on conversations.list resolution happening first)
- 'a' key opens text input to add a channel by name with deduplication
- 'd' key removes focused channel (DM row protected)
- DM audience defaults to Personal when only one allowed user (owner-only),
  otherwise follows posture default
- Removed raw ID column from channel display (unnecessary before resolution)
- Fixed GoBack_ReturnsToPreviousStep test for new step order
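
The DM default rule above, together with the left/right audience cycling used in the Channels step, reduces to two small functions. This Python sketch assumes the cycle order Personal → Team → Public and wrap-around at both ends; neither is stated in the commits, and the function names are hypothetical.

```python
AUDIENCES = ["Personal", "Team", "Public"]  # assumed cycle order


def cycle_audience(current, step):
    """step is +1 for the right arrow and -1 for the left arrow; wraps around (assumption)."""
    index = AUDIENCES.index(current)
    return AUDIENCES[(index + step) % len(AUDIENCES)]


def default_dm_audience(allowed_user_count, posture_default):
    """Owner-only installs get a Personal DM; otherwise follow the posture default."""
    return "Personal" if allowed_user_count == 1 else posture_default
```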

…rtup

SkillIndexEnrichmentService.StartAsync awaited each LLM sidecar call inline,
so a slow or unreachable LLM endpoint cost a 10-second timeout per skill.
This blocked the hosted service pipeline and caused the init wizard
HealthCheck to time out waiting for daemon readiness.

Now fires enrichment as a background task — daemon starts immediately with
fallback descriptions, enrichment updates the index asynchronously.

Catch OperationCanceledException for clean shutdown and all other
exceptions to prevent unobserved task exceptions from the detached
background enrichment task.
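
The same fire-and-forget pattern, sketched with Python's asyncio rather than the .NET hosted-service API. `run_enrichment`, `start`, and the dict-shaped index are hypothetical stand-ins; the point is that startup returns immediately and the detached task handles both cancellation and unexpected failures itself.

```python
import asyncio
import logging

log = logging.getLogger("enrichment")


async def run_enrichment(enrich, index):
    """Body of the detached task: never lets an exception escape unobserved."""
    try:
        index.update(await enrich())
    except asyncio.CancelledError:
        # Shutdown path: swallowed here because this is the top-level task.
        log.info("enrichment cancelled during shutdown")
    except Exception:
        log.exception("enrichment failed; keeping fallback descriptions")


async def start(enrich, index):
    """Return right away; the caller keeps the handle so it can cancel on shutdown."""
    return asyncio.create_task(run_enrichment(enrich, index))
```

The index holds fallback descriptions until the task finishes, so readiness checks do not wait on the LLM endpoint.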

Replace N sequential LLM sidecar calls (10s timeout each) with a single
batch call (30s timeout total). Sends all uncached skill names and
descriptions in one prompt, gets back a JSON object mapping names to
trigger phrases. Parses with markdown fence stripping and case-insensitive
key matching for robustness.

With 4 skills, this reduces worst-case enrichment time from 40s to 30s
and best-case from 4 round-trips to 1.
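
A Python sketch of the tolerant parsing step described above: strip an optional markdown fence, parse the JSON object, and match keys to skill names case-insensitively. The function names and the choice to silently skip skills missing from the response are assumptions.

```python
import json

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable


def strip_fence(raw):
    """Remove a surrounding markdown fence and optional language tag, if present."""
    text = raw.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        text = text[len(FENCE):-len(FENCE)]
        first, _, rest = text.partition("\n")
        # Drop the opening line when it is empty or a language tag such as "json".
        text = rest if first.strip().isalpha() or not first.strip() else text
    return text.strip()


def parse_trigger_response(raw, skill_names):
    """Map each requested skill name to its trigger phrases, ignoring key case."""
    data = json.loads(strip_fence(raw))
    by_lower = {key.lower(): value for key, value in data.items()}
    return {
        name: by_lower[name.lower()]
        for name in skill_names
        if name.lower() in by_lower
    }
```

Skills absent from the response are simply omitted here, which would leave them on their fallback descriptions.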

…ty_text as filtered

- All slack_event_dropped and slack_event_filtered now logged at Info
  level instead of Debug so they appear in default daemon logs
- empty_text events reclassified from Dropped to Filtered — these are
  expected noise (bot echoes, link unfurls), not lost user messages
- Dropped counter should now only reflect genuinely unexpected drops
  (acl_denied, thread_not_initialized)

…dy posted

The _postedThisTurn guard on TextOutput silently discarded all text after
the first post in a turn. This caused the final LLM response to be lost
when the model emitted intermediate "thinking aloud" text during tool
loops. The user saw "Let me check..." but never the actual result.

Remove the guard. Every TextOutput gets posted. If duplicates become a
problem, deduplicate explicitly rather than silently dropping content.

Archived completed changes:
- trust-tiers-security-stub (36/36 tasks)
- compressed-skill-index (46/47 tasks, 1 deferred follow-up)
- skill-tools-and-slash-commands (53/53 tasks)
- security-tui-onboarding (49/49 tasks)

Synced 6 new capability specs: skill-trust-tiers, skill-index-compression,
skill-tools, slash-command-dispatch, security-posture-tui, channel-audience-tui

Merged delta specs into 4 existing specs: netclaw-session (2 deltas),
netclaw-tools, netclaw-onboarding, netclaw-cli

…delivery

The _sawTextDelta flag prevented TextOutput from posting when streaming
deltas had been seen in the same turn. During multi-step tool loops,
intermediate text responses set _sawTextDelta via the streaming path,
causing the final LLM response (sent as TextOutput) to be silently
discarded. This is the second guard (after _postedThisTurn) that was
dropping completed responses.

Remove the guard entirely. TextOutput always posts regardless of
streaming state. The streaming path (TextDeltaOutput → buffer → flush)
handles its own posting independently.

…ping

Restore the _sawTextDelta check on TextOutput to prevent duplicate
posting when content was already delivered via streaming deltas. The
guard now works correctly because BufferFlush resets _sawTextDelta
after each LLM response — so the guard only suppresses the duplicate
TextOutput for the current response, not future responses after
subsequent tool calls in the same turn.

This matches the TUI (ChatPage.cs) behavior which checks per-segment
whether deltas were already rendered before displaying TextOutput.

…plicate delivery

Add OutputFilter.TextStreaming flag for streaming deltas (TextDeltaOutput,
BufferFlush). TextOutput uses the existing Text flag. Adapters subscribe
to one or the other:

- Slack subscribes to Text (final assembled text) — posts once per response
- TUI subscribes to Full (gets both, handles dedup via segment tracking)

This eliminates the root cause of both duplicate posting AND dropped final
responses in Slack. The _sawTextDelta and _postedThisTurn guards are removed
from SlackThreadBindingActor since it no longer receives streaming events.

The streaming buffer, delta tracking, and BufferFlush handler are removed
from the Slack adapter as dead code.

Aaronontheweb enabled auto-merge (squash) March 24, 2026 17:51
Aaronontheweb merged commit 83615a5 into dev Mar 24, 2026
3 checks passed
Aaronontheweb deleted the claude-wt-skills-loading branch March 24, 2026 17:59
Aaronontheweb mentioned this pull request Mar 25, 2026
4 tasks