[pull] main from danny-avila:main#66
Merged
pull[bot] merged 26 commits intoinnFactory:mainfrom Apr 23, 2026
Merged
Conversation
* feat: add inert hook registry and executor scaffolding (phase 1/1) Introduces `src/hooks/` with types, registry, and executor for a 12-event hook lifecycle. Purely additive — not exported from `src/index.ts` and not yet wired into `Run`, `Graph`, or `ToolNode`. Integration lands in the next PR so this change is shippable independently with zero behavioral impact on hosts that don't opt in. - Discriminated-union input/output per event; generic `HookCallback<E>` and `HookMatcher<E>` for type-safe registration. - `HookRegistry` uses `Map<sessionId, bucket>` (not `Record`) to avoid O(n^2) churn under parallel registration in multi-tenant hosts. - `executeHooks` fans out in parallel, races hooks against a combined parent + timeout `AbortSignal` so non-listening hooks still time out, folds decisions with `deny > ask > allow` precedence, accumulates `additionalContext`, and self-removes `once: true` matchers after any successful hook. Errors are non-fatal: swallowed into the aggregated result and routed through an optional winston logger (falling back to `console.warn`), with internal matcher errors suppressed. 47 tests under `src/hooks/__tests__/` cover registry scoping, regex matching, precedence folding, once-self-removal (success/failure/mixed), timeout enforcement (including non-listening hooks), error non-fatality, and parent signal combination. * fix(hooks): address review findings — rename field, wire updatedOutput, cache patterns Resolves the comprehensive review on #87: - Rename `HookMatcher.matcher` -> `HookMatcher.pattern` to fix the self-referential naming (`m.matcher` reads as "the matcher's matcher"). - Wire `PostToolUseHookOutput.updatedOutput` through `AggregatedHookResult` and `fold()`. The type was previously a promise the executor didn't fulfill — a hook returning `{ updatedOutput: ... }` was silently dropped. - Correct the JSDoc on `updatedInput`/`updatedOutput`. The previous copy claimed non-determinism "in Promise.all resolution order" — but `Promise.all` preserves input order, so the fold is deterministic in registration order (outer loop over matchers, inner loop over each matcher's hooks). Updated the `executeHooks` function docstring to match. - Add regex compilation cache and length bound to `matchers.ts`. Patterns compile at most once per unique string and are reused across calls; patterns over MAX_PATTERN_LENGTH (512) are rejected outright as a cheap ReDoS mitigation. Documented the trust model (patterns are trusted input; hosts must validate user-supplied patterns upstream). - Document the wildcard-only semantics on query-less events: a non-empty pattern on an event that doesn't supply a matchQuery (RunStart, Stop, etc.) never matches. - Document the `once: true` concurrent-dispatch limitation: two parallel `executeHooks` calls each snapshot the matcher list, so `once` means "once per call", not "once globally". Matches Claude Code's semantics. Added a test that pins this behavior. - Merge `matchers.filter(...)` + task-build loop into a single pass per AGENTS.md "consolidate sequential O(n) operations." - Scope `eslint-disable no-console` to the single `console.warn` call in `reportErrors` instead of disabling it file-wide. - Fix the timeout-ignoring-hook test: clear the dangling `setTimeout` and assert the error surfaces an abort-shaped label. - Add tests: multi-hook `preventContinuation` (first-writer-wins on stopReason, flag without reason), `updatedOutput` flow + registration order, registration-order determinism for `updatedInput`, pattern length bound, compilation cache hit/miss/clear, concurrent `once` dispatch. Total: 47 -> 59 tests. Findings audited and resolved: - #1 ReDoS: cache + length bound + trust model doc (full fix blocked on host-side validation) - #2 Wrong updatedInput ordering JSDoc: fixed - #3 Dead `updatedOutput` type: wired through - #4 Concurrent `once` race: documented + test - #5 `HookMatcher.matcher` naming: renamed to `pattern` - #6 `WideCallback` dead code: rejected as inaccurate (used in `runHook`) - #7 Eslint-disable scope: line-scoped - #8 File-path comments: rejected; matches existing project convention - #9 Two-pass filter: single pass - #10 Regex recompilation: cached - CL-4 Undocumented wildcard semantics: documented - RT-3 Timeout test dangling timer: fixed + error content verified - RT-4 No multi-hook preventContinuation test: added * perf(hooks): atomic once, bounded LRU, ReDoS heuristic, shared signal Rework the phase-1 hooks scaffolding for the multi-tenant scale we actually operate at. These are behavior changes (semantics of `once` tightened, new pattern-rejection path) but none of them are wired into the runtime yet — this PR is still inert — so the blast radius is the hooks directory only. ## Atomic `once: true` Removal now happens synchronously inside `executeHooks`, between `registry.getMatchers` and the first `await`. Node's event loop serialises sync work, so two concurrent `executeHooks` calls can never both observe the same `once` matcher — whichever call runs its sync prefix first consumes it, and the loser sees an empty bucket. Semantic change: `once` is now at-most-one-dispatch, not at-most-one-successful-execution. If every hook in a `once` matcher throws, the matcher is still gone. The old retry-on-failure behavior was Claude Code's model — fine for a single-user CLI, but in a multi-tenant server it opens a window where concurrent bursts can re-dispatch a "once" hook non-idempotently. "At most once, ever" is the correct semantic for a shared registry. Hosts that need retry should register a normal matcher and self-unregister via the callback returned from `registry.register`. Dropped `collectOnceMatchersForRemoval` + its Map/array allocations — the post-hoc collection step is no longer needed. Tests: replaced "keeps the matcher when every hook throws" with "removes the matcher even when every hook in it throws". Added "fires exactly once across 3 concurrent calls" and "fires exactly once across 8-way concurrent dispatch when hooks are slow" (the slow test deliberately uses a 10ms hook to force overlap). ## Bounded LRU regex cache `matchers.ts` was unbounded. Under multi-tenant use where different tenants register different patterns, the cache would leak. Capped at `MAX_CACHE_SIZE = 256` entries with LRU eviction: hits refresh position (delete-then-set), misses at capacity evict the oldest key. Failed compiles are cached the same way so a tenant spamming bad patterns doesn't burn CPU on every call. Exposed `getMatcherCacheSize` and `MAX_CACHE_SIZE` for test observability. Tests verify eviction order (cold patterns recompiled after overflow) and that hot patterns refreshed mid-stream survive eviction pressure. ## ReDoS heuristic: nested-quantifier detector `hasNestedQuantifier` walks a pattern linearly, tracks paren depth, and rejects any pattern where a quantified group contains another quantifier — the `(a+)+`, `(.*)*`, `(\w+)+` shape that is the #1 catastrophic-backtracking source. Character classes (`[*+?]`) and escaped quantifiers (`\+`) are correctly ignored. Rejection happens before `new RegExp`, so adversarial input never reaches the compiler. This is a floor, not a ceiling — it doesn't catch ambiguous-alternation ReDoS like `(a|a)+`. Hosts that accept user-supplied patterns must still validate upstream. Documented in the `HookMatcher.pattern` and `matchesQuery` JSDoc. Test verifies that `(a+)+$` on a 35-char adversarial input resolves in under 50ms (would be seconds without the heuristic). ## Shared `AbortSignal` per matcher Was: one `AbortSignal.timeout(ms)` + one `AbortSignal.any([parent, timeout])` per hook. For a matcher with N hooks and the same timeout, that's N timers + N composite signals. Now: one signal per matcher, shared across all its hooks. Since hooks in a matcher already share `matcher.timeout`, they may as well share the timer. Behavior is identical (all hooks fire and race the same deadline) but allocation drops from N to 1 per matcher. ## Tests 72 passing (was 59, +13): - Atomic once: 2 concurrent tests + 1 failure-removal test updated - LRU: cache size cap, eviction order, hit refresh - Nested quantifier: classic shapes, deep nesting, escaped chars, character classes, false positives on legitimate patterns - ReDoS mitigation: rejection path, <50ms response on adversarial input * fix(hooks): fix ReDoS heuristic false-positives on (?:...) groups Resolves the follow-up review on #87. Four findings, all valid. ## #1 MAJOR — hasNestedQuantifier false-positives on group syntax The previous scanner treated `?` as a quantifier at any depth > 0, but `?` immediately after `(` is group syntax, not a quantifier: `(?:...)`, `(?=...)`, `(?!...)`, `(?<name>...)`, `(?<=...)`, `(?<!...)`. Patterns like `(?:pre_)?tool_name` were silently rejected and the registering hook never fired — with no error surfaced to the caller. Fix: explicit group-syntax prefix skip inside the `(` handler. When the scanner enters a group it peeks for `?` and advances past the modifier characters (`:`, `=`, `!`, `<=`, `<!`, or a named-group label `<name>`) before processing the group body. The `?` is therefore never considered a quantifier at the start of a group. While in there I also fixed a second correctness bug the reviewer didn't catch: the old depth-indexed array didn't propagate a child group's "contains quantifier" signal up to its parent, so `(?:(a+))+` (outer quantifier over a wrapped quantified group, equivalent to `(a+)+`) escaped detection. The scanner now uses an explicit stack of `QuantifierFrame` records, and when a child group closes it propagates `hasBacktrackRisk` to the parent frame — whether the child itself was quantified or not. This catches `(?:(a+))+` correctly while still allowing non-risky wrappers like `(?:(ab))+`. Added 9 tests covering non-capturing groups, lookaheads, lookbehinds, named captures, `(?:(a+))+` risk propagation, and `((ab)+)+` deep wrapping. Verified existing ReDoS-detection tests still pass. ## #2 MINOR — executeHooks.test.ts did not clear the pattern cache Matcher cache is module-level. `matchers.test.ts` clears it in `beforeEach`; `executeHooks.test.ts` did not, so patterns compiled during one test persisted across subsequent tests in the same file. Added `clearMatcherCache()` to `executeHooks`'s `beforeEach` — no test failures today, but restores test independence. ## #3 MINOR — Test utilities leaked into production barrel `clearMatcherCache` and `getMatcherCacheSize` are test-only helpers (their JSDoc says so) but were exported from `src/hooks/index.ts`. When the integration PR eventually re-exports `src/hooks` from `src/index.ts` they would become public API. Removed both from the barrel. Tests already import them directly from `../matchers`, so no test changes needed. `hasNestedQuantifier`, `MAX_PATTERN_LENGTH`, and `MAX_CACHE_SIZE` remain exported — hosts can legitimately use them for upstream validation. ## #4 NIT — LRU refresh overhead at low cache pressure Every `compile()` cache hit was doing `delete + set` to refresh LRU position, even when the cache was nowhere near capacity. With `MAX_CACHE_SIZE = 256` and typical working sets of tens of patterns, eviction pressure is near-zero and the refresh is pure overhead. Added a `LRU_REFRESH_THRESHOLD = 75%` gate: refresh only runs when the cache is above 192 entries. Below that the code fast-paths straight back from `compile()`. Above it the LRU semantics kick in so hot patterns still survive evictions — the existing "refreshes LRU position on hit" test still passes because by the time it runs the check, the cache is at capacity. Tests: 72 -> 81 (+9 for group-syntax and risk-propagation coverage).
* feat: wire run-level hooks into processStream (phase 1/2) Fires four hook lifecycle events from `Run.processStream`, making the inert hooks module from PR #87 live. Hosts pass a pre-constructed `HookRegistry` via `RunConfig.hooks`; when absent, all hook dispatch is skipped (zero overhead for non-adopters). ## Integration points - `RunConfig.hooks?: HookRegistry` — new optional field, mirrors the existing `customHandlers` pattern - `Graph.hookRegistry` — field alongside `handlerRegistry`, cleared in `clearHeavyState()` - `Run` constructor wires the registry onto both `this.hookRegistry` and `this.Graph.hookRegistry` ## Hook fire points in `processStream` 1. **RunStart** — after config setup, before stream creation. Receives `inputs.messages`, `runId`, `threadId`, `agentId`. 2. **UserPromptSubmit** — immediately after RunStart. Extracts prompt text from the last human message. If the hook returns `{decision: 'deny'}` or `{decision: 'ask'}`, the run aborts early and returns `undefined`. The `ask` case is an abort in v1 — the SDK returns the decision and the host decides how to implement interactive approval (see Phase 2 plan in the skills design report). 3. **Stop** — at the end of the try block, after the for-await loop. Receives the accumulated messages from `Graph.getRunMessages()`. Only fires on successful completion (not on error). 4. **StopFailure** — in a new catch block between try and finally. Receives the error message and last assistant message. Hook errors are swallowed (`.catch(() => {})`) so the original error propagates. ## Session teardown `hookRegistry.clearSession(runId)` is called in the finally block, guaranteeing cleanup even on error. Fulfills §3.9 rule #5 from the design report. ## Public API `src/hooks/` is now re-exported from `src/index.ts`. Hosts can import `HookRegistry`, `executeHooks`, and all hook types directly from `@librechat/agents`. ## Helper functions Three module-level functions added to `run.ts`: - `findLastHumanMessage` — backward scan for `getType() === 'human'` - `findLastAssistantMessage` — backward scan for `getType() === 'ai'` - `extractPromptText` — handles string and complex content arrays ## Tests 10 new integration tests using `Run.create` + `overrideTestModel` + `processStream` (same pattern as `custom-event-await.test.ts`): - RunStart fires with correct fields - UserPromptSubmit extracts prompt, deny aborts, ask aborts - Stop fires on success, does not fire on error - StopFailure fires on error, preserves original exception - Session cleared in finally (both success and error paths) - No-hooks baseline works identically Total: 81 existing + 10 integration = 91 passing. * fix(hooks): address PR 2 review — DRY helpers, test coverage, guards Resolves all findings from the comprehensive review on #88. - #1 MAJOR: Add 3 tests for `extractPromptText` — multi-part content (text + image blocks yields 'hello\nworld'), image-only content (yields ''), and empty content array (yields ''). - #2 MINOR: Merge `findLastHumanMessage`/`findLastAssistantMessage` into parameterized `findLastMessageOfType(messages, type)`. - #3 MINOR: Rewrite `extractPromptText` array path from `.filter().map().join()` triple-pass to single `for...of` loop accumulating into `parts[]` then `.join('\n')`. - #4 MINOR: Add `config.callbacks = undefined` on deny/ask early return path so Langfuse handler reference is cleaned up even though no stream ran. - #5 MINOR: Add comment on `UserPromptSubmit` fire point noting `attachments` is not yet wired (Phase 2). - #6 MINOR: Strengthen `StopFailure` test assertion from `toBeTruthy()` to `typeof === 'string'` + `length > 0`. - #7 MINOR: Add test for empty-content human message — verifies `UserPromptSubmit` fires with `prompt: ''`. - #8 NIT: Add inline comment on `stopHookActive: false` explaining it will be `true` in Phase 2 when stop is triggered by a hook. - #9 NIT: Use `hasHookFor('Stop', sessionId)` and `hasHookFor('StopFailure', sessionId)` guards before `getRunMessages()` to avoid the array copy when no hooks are registered for the event. - CL-2: Wrap Stop hook dispatch in its own `.catch()` so an infrastructure error in `executeHooks` (theoretical — it catches internally) cannot masquerade as a stream failure in the catch block. Tests: 91 -> 94 (+3 for extractPromptText coverage).
* feat: wire tool hooks into event-driven ToolNode (phase 1/3) Fires PreToolUse, PostToolUse, PostToolUseFailure, and PermissionDenied hooks inside `ToolNode.dispatchToolEvents` — the single chokepoint for all event-driven tool execution in LibreChat. ## Hook lifecycle in dispatchToolEvents 1. **PreToolUse** fires per call in parallel before ON_TOOL_EXECUTE dispatch. Denied calls (deny or ask) produce error ToolMessages and fire PermissionDenied; surviving calls proceed with optional updatedInput applied to tool args. 2. Only approved calls are batched into ON_TOOL_EXECUTE and sent to the host. If all calls are denied, the batch dispatch is skipped entirely. 3. **PostToolUse** fires per successful result. If a hook returns updatedOutput, the ToolMessage content is replaced before the message is appended to the graph. 4. **PostToolUseFailure** fires per error result. Observational only — cannot modify the error message. Post hooks are awaited (not fire-and-forget) so updatedOutput takes effect before the ToolMessage is consumed. Hook errors are swallowed via .catch() to avoid disrupting the tool result flow. PermissionDenied is fire-and-forget since it's purely observational. ## Plumbing - `ToolNodeOptions.hookRegistry?: HookRegistry` — new optional field - `ToolNode` constructor stores `this.hookRegistry` - `Graph.createToolNode()` passes `this.hookRegistry` at both construction sites (event-driven and traditional) - `hookRegistry` is set on Graph BEFORE `createWorkflow()` is called (moved from the post-create assignment in Run constructor into `createLegacyGraph` and `createMultiAgentGraph`) so ToolNode receives a non-undefined registry at construction time ## hasHookFor guards All four hook fire points use `hasHookFor(event, runId)` before calling `executeHooks`, so when no hooks are registered for the event the overhead is a single Map lookup + array length check — no Promise allocations, no function calls. ## Tests 9 new integration tests in `src/hooks/__tests__/toolHooks.test.ts` using event-driven mode (toolDefinitions + ON_TOOL_EXECUTE handler): - PreToolUse fires with correct fields - PreToolUse deny blocks execution (tool handler never called) - PreToolUse ask blocks execution (v1) - PreToolUse updatedInput rewrites args before dispatch to host - PreToolUse hook errors are non-fatal (tool still executes) - PermissionDenied fires after deny with reason - PostToolUse fires with tool output - PostToolUseFailure fires on error result - No-hooks baseline works identically Total: 94 existing + 9 new = 103 passing. * fix(hooks): address PR 3 review — step completion, ordering, catch, tests Resolves all 10 findings from the review on #90. ## Structural rewrite of dispatchToolEvents The post-processing loop was rewritten to fix #1, #2, #6, #8, #10: - #1 MAJOR: Denied calls now dispatch ON_RUN_STEP_COMPLETED via a shared `dispatchStepCompleted` helper (extracted from the duplicate logic in the executed-results loop). Without this, the frontend would show stuck spinners for denied tool calls. - #2 MAJOR: Output messages are now collected in a Map keyed by tool_call_id, then returned in the original `toolCalls` input order via `toolCalls.map(call => messageByCallId.get(call.id))`. Before, denied messages were prepended to executed messages, breaking the implicit ordering contract. - #6 MINOR: `toolUsageCount` is now incremented only for approved calls (moved from the initial `preToolCalls.map` into the `approvedEntries.map` that builds ToolCallRequests). Denied calls no longer inflate the turn counter. - #8 MINOR: Eliminated the double `result.status === 'error'` check. ToolMessage is now constructed inside each status branch directly after hook dispatch. - #10 NIT: `requests.find()` replaced with a pre-built `Map<id, ToolCallRequest>` for O(1) lookup per result, per AGENTS.md guidance on Map/Set over Array.find. ## Error protection - #3 MAJOR: PreToolUse `Promise.all` now wraps each `executeHooks` call in `.catch(() => emptyResult)` so an infrastructure error in the hooks subsystem cannot crash the entire tool batch. Matches the stated "hook errors are non-fatal" contract and is consistent with PostToolUse/PostToolUseFailure which already had `.catch()`. ## New tests - #4 MAJOR: Added `updatedOutput replaces the ToolMessage content` — registers a PostToolUse hook returning `{updatedOutput: 'REDACTED'}`, verifies the ToolMessage content in the graph's run messages is `'REDACTED'` (not the original output). - #7 MINOR: PermissionDenied test replaced 50ms `setTimeout` with a `Promise` resolved by the hook callback. No more flaky sleep. ## Cleanup - #9 NIT: Updated stale "once the tool-hook PR lands" comment in `src/hooks/index.ts`. Tests: 103 -> 104 (+1 for updatedOutput). * fix(hooks): off-by-one in dispatchStepCompleted index, test guard, turn fallback - #1 MAJOR: dispatchStepCompleted read post-increment toolUsageCount for the `index` field. Added `turn?: number` parameter; approved-call site passes `request?.turn` (the pre-increment value), denied-call site omits it (fallback is correct since count was never bumped). - #2 NIT: updatedOutput test now asserts `expect(toolMsg).toBeDefined()` before content extraction so a missing tool message produces a clear failure rather than "Expected undefined to be 'REDACTED'". - #3 NIT: PreToolUse input `turn` field restored `?? 0` fallback so first-use tools see `turn: 0` instead of `turn: undefined`. * fix(hooks): batch tests, turn JSDoc, freeze sentinel, remove unused hookRegistry Addresses follow-up review findings on #90. - #1 MAJOR: Add multi-call batch tests — partial deny (echo denied, math approved, order preserved) and all-denied (ON_TOOL_EXECUTE handler never called). Uses counter-based IDs (#7 fix) to avoid Date.now() collisions across calls in the same test. - #2 MINOR: Document PreToolUseHookInput.turn semantics — within a single batch, all calls to the same tool share the same turn value. Per-call discrimination within a batch is not supported in v1. - #3 MINOR: Remove hookRegistry from the traditional (non-event-driven) ToolNode construction site in Graph.ts. Hooks only fire in dispatchToolEvents; passing the registry to a ToolNode that never uses it created a false impression of hook support. - #4 MINOR: Add PostToolUse error non-fatality test — throwing hook does not crash the batch, original content is preserved. - #5+#8 MINOR: Freeze the hook-error fallback sentinel via Object.freeze and rename from emptyResult to HOOK_FALLBACK. Prevents future mutation bugs if someone pushes to the shared arrays. - #6 MINOR: Document in ToolNodeOptions.hookRegistry JSDoc that hooks only fire for event-driven calls — directToolNames bypass dispatch. Tests: 104 -> 107 (+3 for batch partial-deny, all-denied, and PostToolUse error resilience). * nit: simplify test assertions, keep shallow freeze (deep conflicts with mutable type)
Phases 1+2 of the Skills system, rebased on hooks PRs (#88, #90): Types: - InjectedMessage type in tools.ts (generic, any tool can use it) - SkillCatalogEntry type in skill.ts - ToolExecuteResult.injectedMessages field - Constants.SKILL_TOOL = 'skill' SkillTool (src/tools/SkillTool.ts): - JSON Schema (single source of truth), LCTool definition, createSkillTool() - Runtime-agnostic description (files "may" become accessible) - Requires event-driven execution mode ToolNode integration (src/tools/ToolNode.ts): - convertInjectedMessages() converts to HumanMessage (both roles, avoids Anthropic/Google SystemMessage rejection). Original role in additional_kwargs. - dispatchToolEvents returns { toolMessages, injected } tuple - Injected messages placed AFTER ToolMessages (provider ordering) - try/catch isolation around injection conversion - Works with hooks-based PreToolUse/PostToolUse lifecycle Catalog formatter (src/tools/skillCatalog.ts): - formatSkillCatalog() with truncation ladder and budget enforcement - fitNamesOnly drops trailing entries when names-only exceeds budget Tests: 30 total (18 SkillTool + 12 skillCatalog)
* feat: add local bash execution and bash programmatic tool calling tools Implement BashExecutor for local bash command execution via child_process and BashProgrammaticToolCalling for multi-tool orchestration using bash scripts with a local HTTP server for tool IPC via curl. * fix: use remote Code API for bash tools instead of local child_process BashExecutor now sends `lang: 'bash'` to the same /exec endpoint as CodeExecutor. BashProgrammaticToolCalling sends bash code to /exec/programmatic with the same round-trip loop as PTC. Both tools share session/file handling with their counterparts. ToolNode session injection and storage updated for all four code execution tool names.
Updated the constant name from EXECUTE_BASH to BASH_TOOL in the Constants enum and adjusted references in BashExecutor and ToolNode to reflect this change. This improves clarity and aligns with naming conventions.
When the skill handler uploads files to the code env and returns
artifact: { session_id, files }, ToolNode now stores that session
context (via storeCodeSessionFromResults). This makes skill-primed
files available to subsequent bash/code tool calls in the same run.
Also adds SKILL_TOOL to the codeSessionContext request building in
dispatchToolEvents, so the handler receives the current session
context and can check if files are already uploaded.
…95) * feat: add initialSessions to RunConfig for seeding Graph session state Allows hosts to seed the Graph's ToolSessionMap at run creation time. Used by skill file re-priming: after uploading skill files to the code env at run start, the host passes the session info so ToolNode can inject session_id + files into subsequent bash/code tool calls. * refactor: add CODE_EXECUTION_TOOLS set, DRY ToolNode checks Add CODE_EXECUTION_TOOLS ReadonlySet to common/enum.ts containing all four code execution tool constants (EXECUTE_CODE, BASH_TOOL, PROGRAMMATIC_TOOL_CALLING, BASH_PROGRAMMATIC_TOOL_CALLING). Replace 5 repeated 4-constant inline checks in ToolNode with CODE_EXECUTION_TOOLS.has(). SKILL_TOOL remains a separate check where needed since it's not a code execution tool itself.
When a skill is invoked, the body is injected as a HumanMessage into LangGraph state but NOT persisted to conversation history. On follow-up runs the skill body is lost. formatAgentMessages now accepts an optional invokedSkillBodies Map (skillName → body) and reconstructs HumanMessages at the correct position in the message sequence — after the skill's ToolMessage, matching where ToolNode originally injected them. Detection happens inside the existing tool_call processing loop (both with and without tools filtering), so there are zero extra message iteration passes.
Critical fix: - Capture endMessageIndex BEFORE skill body HumanMessage injection so injected messages are excluded from the assistant message's token distribution range. Prevents indexTokenCountMap corruption. Review fixes: - Remove dead else-if(!discoveredTools) branch inside if(discoveredTools) block and revert redundant optional chaining - Extract extractSkillName() helper to eliminate DRY violation (identical args parsing IIFE was duplicated in two locations) - Use Set<string> instead of string[] for pendingSkillNames (dedup) - Lazy allocation via ??= (only allocated when skills are present)
* fix: resolve review findings for hooks & skills PR - Fix stale JSDoc referencing non-existent future PR (run.ts) - Use requestMap (Map.get) instead of Array.find in storeCodeSessionFromResults - Extract shared updateCodeSession helper to eliminate duplicated session storage logic - Add sync critical section markers around once-matcher removal in executeHooks - Optimize fitNamesOnly from O(n²) to O(n) with running character sum - Widen ReDoS timing assertion from 50ms to 200ms for CI stability - Add test verifying ON_RUN_STEP_COMPLETED dispatch for denied tool calls * test: assert event shape in deny step completion test
) New READ_FILE constant and ReadFileToolDefinition for a general-purpose file reader. Schema: { file_path: string }. Skill files use the path format {skillName}/{path} (e.g. "pdf-analyzer/src/utils.py"). The tool is event-driven only (host handles via ON_TOOL_EXECUTE). Works without code execution for skill files (DB/storage reads). With code env, also supports code execution output files.
* feat: wire PreCompact/PostCompact hooks around summarization node Fires PreCompact and PostCompact hooks inside `createSummarizeNode`, the last two SDK-side hook events. These are auditing hooks — hosts can archive the full transcript before compaction, observe the summary produced, or log compaction events for telemetry. ## Fire points - **PreCompact** fires after ON_SUMMARIZE_START, before the LLM summarization call. Receives `messagesBeforeCount` and the trigger type from the agent's summarization config. - **PostCompact** fires after ON_SUMMARIZE_COMPLETE, after the summary is generated and enriched. Receives the summary text and `messagesAfterCount: 0` (the node removes all messages; the summary is injected into the system prompt, not as a message). Both hooks are observational and non-blocking — errors are swallowed via `.catch()`. Results are not consumed. ## Type fix `PreCompactHookInput.trigger` was defined as `'threshold' | 'manual' | 'error'` (from the initial PR #87) but those don't match the real summarization trigger types. Updated to: `'token_ratio' | 'remaining_tokens' | 'messages_to_refine' | 'default'` with `(string & {})` for forward compatibility. `'default'` covers the case where no trigger is configured and compaction fires on any pruning. ## Plumbing `hookRegistry` added to `CreateSummarizeNodeParams.graph` interface. `Graph.ts` passes `this.hookRegistry` at the `createSummarizeNode` call site, same pattern as ToolNode. ## Tests 4 new integration tests using `FakeListChatModel` for the summarization provider (no API keys required): - PreCompact fires with correct messagesBeforeCount and trigger - PostCompact fires with summary text - Hook errors don't crash compaction - No-hooks baseline works identically Total: 108 existing + 4 new = 112 passing. * fix(hooks): add threadId to compaction hooks, strengthen tests, fix imports Resolves review findings on #103. - #1 MAJOR: Add threadId to both PreCompact and PostCompact hook inputs. PreCompact extracts from the node's config parameter, PostCompact from runnableConfig. Restores consistency with all other hook sites in run.ts. - #2 MINOR: Assert trigger field in PreCompact test — verifies the 'default' fallback when no trigger is configured. - #3 MINOR: Strengthen PostCompact summary assertion from length > 0 to toContain('Summary') against the known FakeListChatModel response. - #4 MINOR: Add PostCompact error resilience test — throwing hook in dispatchCompletionEvents does not crash compaction. - #5 MINOR: Fix import order in node.ts per AGENTS.md (type imports longest→shortest, value imports longest→shortest). - #6 NIT: Update hooks/index.ts barrel comment to list createSummarizeNode as third consumer. - #8 NIT: Assert agentId in both PreCompact and PostCompact tests. Tests: 112 -> 113 (+1 PostCompact error resilience). * test: assert threadId propagation and exact summary match in compaction hooks * fix: use runnableConfig for PreCompact threadId extraction (matches PostCompact)
#104) * feat: add subagent-as-tool primitive for hierarchical agent delegation Adds a SubagentTool that lets a parent agent delegate tasks to child agents with isolated context windows. Verbose tool outputs stay in the child's context; only a filtered text summary returns to the parent. Key components: - SubagentConfig type on AgentInputs for declaring child agent types - SubagentExecutor: creates child StandardGraph, invokes with isolated state, filters result (strips tool_use/thinking blocks), fires SubagentStart/SubagentStop hooks - SubagentTool: LCTool definition with dynamic enum from configs - Graph integration: tool created in createAgentNode() via graphTools, automatically becomes a direct tool (bypasses event-driven dispatch) - Self-spawn support: reuse parent's AgentInputs for context isolation without separate agent config Wires the existing SubagentStart (with deny support) and SubagentStop hook types that were defined but had zero fire points. * fix: resolve review findings for subagent primitive Address all 6 issues from PR review: 1. Add subagentHooks.test.ts — end-to-end hook integration test through real Run pipeline (SubagentStart payload, SubagentStop messages, deny blocks execution). 3 new tests. 2. Remove `as unknown as` cast — follow MultiAgentGraph pattern (init graphTools as empty array, push directly). 3. Deduplicate schema — extract buildSubagentToolParams() in SubagentTool.ts, use it in Graph.ts. createSubagentToolDefinition() now delegates to it. Single source of truth for schema/description. 4. Propagate threadId — tool callback extracts thread_id from LangGraph config.configurable, passes to executor.execute(). Hooks now receive correct threadId. 5. Pass tokenCounter to child graph — added to SubagentExecutorOptions, threaded from agentContext.tokenCounter through to child StandardGraph constructor. 6. Pass signal in invoke() config — workflow.invoke() now receives signal for more responsive abort between graph steps. * fix: resolve second-pass audit findings for subagent primitive Address 12 valid findings from the audit — 3 MAJOR, 6 MINOR, 3 NIT: MAJOR fixes: 1. Depth tracking now works across graph boundaries. Previously the `depth`/`maxDepth` check only worked within a single executor; when `allowNested: true`, the child graph spawned its own executor at depth=0, making `maxSubagentDepth` non-functional. Fix: decrement `maxSubagentDepth` in `buildChildInputs`, so the child's own executor inherits a smaller budget and naturally blocks further nesting when it reaches 0. Removed the now-vestigial `depth` field. 2. Detect duplicate `SubagentConfig.type` values. `Map` construction silently dropped duplicate keys. Now `resolveSubagentConfigs` throws with a clear error. 3. Add test coverage for `allowNested: true`: verifies depth is decremented across boundaries, child inherits subagentConfigs, and budget is clamped to >= 0. MINOR fixes: 4. Runtime guard for empty `description` in the tool callback — the LLM can send undefined/empty values; now we default to a safe fallback string. 5. Updated `src/hooks/index.ts` barrel comment to list SubagentExecutor as the fourth hook consumer (SubagentStart/SubagentStop). 6. Extracted property description strings as constants in SubagentTool.ts; `SubagentToolSchema` and `buildSubagentToolParams` now share a single source of truth. 7. Changed SubagentStop from fire-and-forget to awaited, matching the PostCompact pattern. Removes fragile `setTimeout` synchronization in tests. Errors still swallowed (observational). 8. Truncate error messages from child graph to 200 chars to prevent leaking internal details (stack traces, API errors) to the parent LLM. 9. Document that `ask` decision is treated identically to `deny` in the non-interactive subagent context. NIT fixes: 11. Import order in SubagentExecutor.ts — `@/types` (multi-line, longest) now comes first in the local types sub-group. 12. Removed unused `contentParts` destructuring in verification script. 13. Fixed unsafe `msg.content as string` cast in verification script; now handles array content correctly. Skipped #10 (NIT): `_sourceInputs` underscore prefix is an accepted TS convention for cross-file-accessible-but-semantically-internal fields. Test count: 1074 passed (up from 1065, +9 new tests), 0 regressions. * fix: address third-pass audit findings for subagent primitive Resolve 3 NITs from the second review pass plus 2 new codex comments: NIT #1 — Tighten truncation test: previously asserted only `result.content.length < 500`. Now asserts exact envelope of 219 chars ("Subagent error: " prefix + 200 body + "..." suffix) so that regressions in ERROR_MESSAGE_MAX_CHARS are caught. Added companion test verifying short messages are not truncated. NIT #2 — Skip tool creation when maxSubagentDepth: 0. When a host set `maxSubagentDepth: 0` alongside `subagentConfigs`, the tool was still bound to the LLM, and every invocation returned the depth- exceeded error. The LLM wasted a turn. Now the `createAgentNode` guard includes `effectiveSubagentDepth > 0`, so the tool is not created at the leaf level. Added integration test. NIT #3 — Document buildChildInputs audience. Added @remarks block explaining this is an advanced utility exported primarily for testing and internal use; hosts configuring subagents should not call it directly. Codex P1 (false positive, clarified): maxSubagentDepth propagation across graph boundaries was already correct via the countdown in buildChildInputs, but it wasn't obvious from the call site. Added a multi-line JSDoc comment above the executor construction in createAgentNode explaining how depth consumes across boundaries through the decremented AgentInputs.maxSubagentDepth. Codex P2 (real bug, fixed): toolSchemaTokens did not include the subagent tool's schema. `calculateInstructionTokens()` was called in `fromConfig()` before graphTools was populated, so the later- injected subagent tool's schema was missing from the budget. Fixes: 1. Include `graphTools` in the iteration in `calculateInstructionTokens` (also helps handoff tools, a pre-existing issue with the same root cause). 2. Re-trigger the calculation at the end of `createAgentNode` after the subagent tool is pushed, so the token count reflects the final graphTools state before `createCallModel` awaits `tokenCalculationPromise`. Added test asserting `toolSchemaTokens` is larger for agents with subagents vs without, confirming the recalculation runs. Test count: 1077 passed (up from 1074, +3 new tests), 0 regressions. * fix: break circular dep between SubagentExecutor and StandardGraph Rollup warned on build: Graph.ts → subagent/index.ts → SubagentExecutor.ts → Graph.ts SubagentExecutor needed `StandardGraph` at runtime to construct the child graph; Graph.ts needs SubagentExecutor to attach the tool. With preserveModules, these land in separate chunks producing circular chunk imports with undefined execution order. Fix: dependency injection. SubagentExecutor now takes a required `createChildGraph: ChildGraphFactory` option and only imports `StandardGraph` as `import type` (erased at build). Graph.ts passes `(input) => new StandardGraph(input)` when constructing the executor. Runtime module graph is now acyclic; `import type` does not appear in emitted JS. Test updates: replaced `jest.spyOn(GraphModule, 'StandardGraph')` (which relied on the runtime import we just removed) with direct mock factories passed through the new option. Cleaner — tests no longer need to reach into module internals. Same coverage. Build: `npx rollup -c` now emits zero warnings. Test count: 1077 passed, 0 regressions. * chore: add multi-agent-subagent script to package.json Introduced a new script entry for multi-agent-subagent, enabling execution of the corresponding TypeScript file with necessary configurations. This addition enhances the multi-agent functionality by allowing for more granular control over subagent operations. * fix: isolate child subagent from parent's callback chain The manual verification script (`npm run multi-agent-subagent`) crashed with: Error: No agent context found for agent ID researcher at StandardGraph.getAgentContext at ChatModelStreamHandler.handle Root cause: `Run.processStream` drives the parent via `streamEvents`, which captures events from every nested runnable in the call tree — including the child graph's LLM calls. When the child's "researcher" agent streamed `on_chat_model_stream` events, the parent's `ChatModelStreamHandler` received them and tried to resolve the child's agent ID in the parent's `agentContexts` map, which only contains the supervisor. Error thrown. The Phase 1 plan was for child events to stay fully isolated. Unit tests missed this because the child graph was always mocked (no real `streamEvents` plumbing); only running the full pipeline end-to-end surfaced it. Fix: pass `callbacks: []` in the child `workflow.invoke()` config. This overrides the inherited callbacks for that invocation, breaking the event propagation chain. Also set `runName: subagent:<type>` and a child-scoped `configurable.thread_id` so the child has a distinct trace root rather than polluting the parent's trace. Manual verification now works: supervisor delegates to researcher, receives filtered text, synthesizes the final answer. Test suite: 1079 passed, 0 subagent-suite regressions. The two newly failing suites (ProgrammaticToolCalling / ToolSearch Live API tests) are pre-existing — they require `LIBRECHAT_CODE_BASEURL` to be configured and fail identically on the prior commit. They were skipped before because OPENAI_API_KEY was absent. * feat: align filterSubagentResult fallback with Claude Code When a subagent's last AIMessage is pure tool_use (e.g. the model hit maxTurns mid-tool-call), the previous implementation returned the "Task completed" sentinel, discarding any textual progress from earlier turns. Claude Code's agentToolUtils.finalizeAgentTool walks backwards from the last message, and on messages without text content continues scanning earlier AIMessages, surfacing the most recent textual thought instead of the sentinel. This change matches that behavior. The three `return` sites inside the loop that previously exited with the sentinel now `continue` when the AIMessage yields no text; the loop exits with "Task completed" only when no AIMessage in the history contains any text. Concrete example: [0] Human: "What's the capital of France?" [1] AI: "Let me search." + tool_use(search, ...) [2] Tool: "Paris." [3] AI: tool_use(search, ...) <-- maxTurns, no text Before: "Task completed" After: "Let me search." Added two tests — the turn-limit scenario above, and the analogous empty-string-content edge case. All nine existing filterSubagentResult tests still pass unchanged. * fix: update model name for non-Anthropic API to gpt-5.4 Changed the model name used in the multi-agent subagent script from 'gpt-4.1' to 'gpt-5.4' for improved performance and capabilities. This update ensures compatibility with the latest API offerings.
* feat: forward subagent child events and enable event-driven tools
Adds an opt-in event forwarding path for `SubagentExecutor` so host apps
that rely on event-driven tool dispatch (`toolDefinitions` +
`ON_TOOL_EXECUTE` handler) can use subagents — including self-spawn.
Before: a subagent in event-driven mode got `toolDefinitions` stripped in
`buildChildInputs` and ran with `callbacks: []`, so the LLM had no tool
schemas and ON_TOOL_EXECUTE never reached the host. The subagent replied
in prose without invoking any tools.
Now:
- `SubagentExecutor` accepts `parentHandlerRegistry` (a direct registry
or a lazy getter — `Run` wires the registry onto the graph AFTER
`createWorkflow`, so `createAgentNode` must capture lazily).
- When provided, `buildChildInputs` keeps `toolDefinitions`, and the
child's invoke attaches a `BaseCallbackHandler` forwarder that:
- routes ON_TOOL_EXECUTE to the parent's handler (event-driven tools
work), and
- wraps other custom events (run_step, run_step_delta,
run_step_completed, message_delta, reasoning_delta) in a new
`ON_SUBAGENT_UPDATE` envelope so hosts can render child progress
in a separate UI surface.
- New `SubagentUpdateEvent` type and `ON_SUBAGENT_UPDATE` GraphEvents
entry; start/stop/error envelopes fire around the child invoke.
- Legacy isolation is preserved when no registry is passed.
Test plan:
- `npx jest SubagentExecutor SubagentTool` — 36 pass (4 new covering
forwarding, toolDefinition retention, legacy strip, lazy getter).
- Integration test against OpenAI (`subagent-event-driven-debug.ts`)
confirms the child uses `calculator` via the parent's ON_TOOL_EXECUTE
path and the host sees start/run_step/run_step_completed/stop
ON_SUBAGENT_UPDATE envelopes.
* fix: gate child toolDefinitions on ON_TOOL_EXECUTE handler + propagate tool_call_id
Two follow-ups from review on #107:
1. Codex P1: `SubagentExecutor` was treating any non-null parent registry
as "forwarding enabled" and always kept `toolDefinitions` on the
child. Because `Run.create` always constructs a `HandlerRegistry`,
this meant configurations with tools declared but no `ON_TOOL_EXECUTE`
handler would hang forever — the child's `ToolNode` would dispatch a
batch request and the forwarder would find no handler to resolve or
reject it, leaving the promise pending.
Now `keepToolDefinitions` is gated on
`parentRegistry.getHandler(ON_TOOL_EXECUTE) != null`. A registry
without the handler falls back to the legacy "strip toolDefinitions"
path, which is a recoverable no-tools state instead of a hang.
2. `SubagentUpdateEvent` now carries `parentToolCallId`. `Graph.ts`
pulls it from `config.toolCall.id` (LangChain threads the originating
`ToolCall` onto `RunnableConfig`) when the subagent tool is invoked.
This lets hosts correlate subagent updates to the parent tool call
deterministically instead of inferring by event ordering — a concern
raised on the LibreChat PR using this event stream.
Also bumps to `3.1.67-dev.2`.
Tests: `npx jest SubagentExecutor` — 38 pass (2 new covering the handler
gate and `parentToolCallId` propagation). Integration script verified
against real OpenAI: every ON_SUBAGENT_UPDATE envelope now carries the
parent's `tool_call_id`.
* chore: address audit findings — drop unused param, tighten cast, add tests
Addresses the comprehensive audit of #107:
### MAJOR #1 — ON_TOOL_EXECUTE forwarding coverage
Two new tests in the `event forwarding` suite drive the forwarder
callback directly with a synthesized `ToolExecuteBatchRequest`:
- `routes child ON_TOOL_EXECUTE dispatches through the parent registry`
asserts the parent's `ON_TOOL_EXECUTE` handler is invoked, the batch's
`resolve` fires with canned results, and the child receives them.
- `does NOT forward ON_TOOL_EXECUTE when the parent registry has no
handler (safe fallback)` documents the companion case: without a
handler the `resolve` never fires, which is why the executor gates
`keepToolDefinitions` on handler presence in the first place.
### MINOR #2 — unused `childGraph` parameter
Removed from `createForwarderCallback`'s arg shape, destructuring, and
call site. It was accepted as `_childGraph` and never read.
### MINOR #3 — `data as never`
`EventHandler.handle` didn't include `ToolExecuteBatchRequest` in its
accepted data union, which is why the forwarder used `as never` to
shut the compiler up. Extended the union and switched the cast to the
concrete `ToolExecuteBatchRequest` type — the forwarder is now fully
type-checked on the hot path.
### MINOR #4 — error-phase envelope test
`emits an 'error' phase envelope when the child graph throws` uses
the throwing factory helper already in the file, registers a real
`ON_SUBAGENT_UPDATE` handler, and asserts start/error ordering plus
the `data.message` payload and `parentToolCallId` passthrough.
### MINOR #6 — dynamic imports in tests
Replaced `await import('@/events')` / `await import('@/common')` with
static top-level imports. Consistent with the rest of the file.
### MINOR #7 — `summarizeEvent` unit tests
Exported the pure function from `@/tools/subagent` and added 10 focused
tests covering every branch: tool_calls with one/many names, empty
tool_calls, message_creation, ON_TOOL_EXECUTE with/without names,
completed steps with/without a tool name, message_delta, and the
unknown-event fallback. Every user-visible label is now regression
protected.
### MINOR #8 — `config.toolCall` comment
Expanded the JSDoc to note the specific LangChain source
(`ToolRunnableConfig` in `@langchain/core/tools`, stable since 0.3.x)
and that the defensive read degrades to `undefined` on upstream
changes.
### NIT #9 — unused import
Removed `createContentAggregator` from `subagent-tools-debug.ts`.
### NIT #10 — `awaitHandlers = true` doc
Added a JSDoc block explaining the trade-off: required for
`ON_TOOL_EXECUTE` to actually block on the host's resolve, but it
also serializes observational events through any slow host handlers.
Refinement path noted for later if host-side latency becomes a problem.
Version bumped to `3.1.67-dev.3`.
Tests: 51 pass (40 SubagentExecutor + 10 new summarizeEvent + 1 new
error-phase + existing coverage retained).
* fix: clear parent run summary and discoveredTools from child inputs
Paired with Codex P1 on the LibreChat PR (danny-avila/LibreChat#12725)
that uses this event stream. `buildChildInputs` shallow-spreads the
parent's `AgentInputs`, which carries two run-scoped fields that leak
into the isolated child by default:
- `initialSummary` — cross-run conversation summary. On a conversation
that has already been summarized, every child would inherit it and
reason against unrelated prior chat.
- `discoveredTools` — tool names the parent's LLM searched for in
earlier turns via `tool_search`. The child gets primed to use
whatever the parent happened to find.
Both are cleared on the returned child inputs. Configuration fields
(summarization policy, provider options, context pruning config) are
preserved — those are policy, not context.
Affects both paths:
- Self-spawn: `resolveSubagentConfigs` clones parent `_sourceInputs`,
then `buildChildInputs` strips.
- Explicit children: host-built `AgentInputs` goes through the same
`buildChildInputs` before the child graph runs.
Version bumped to `3.1.67-dev.4`.
Regression test: `strips parent-run-scoped initialSummary and
discoveredTools from child inputs` — seeds parent inputs with
`initialSummary: { text, tokenCount }` and `discoveredTools: [...]`,
asserts both are `undefined` on the returned child. 52 pass.
* test: cover summarizeEvent step.type fallback branches
Addresses reviewer F2.1 (NIT) on #107. `summarizeEvent` resolves
run-step detail type via `step.stepDetails?.type ?? step.type ?? 'step'`.
The existing tests all passed `{ stepDetails: { type } }`, leaving the
second and third clauses of the chain uncovered.
Two new tests:
- top-level `step.type` (no `stepDetails` wrapper) for both
`tool_calls` and `message_creation` — exercises the second clause
- empty `{}` payload — exercises the `'step'` default and the generic
`Step: <detailType>` branch
Carried MINOR #5 (`SubagentUpdateEvent.data: unknown`) left as a
documented deferral; the JSDoc already notes "shape depends on phase"
and a discriminated union is a larger refactor.
54/54 pass.
* chore: add npm run scripts for subagent debug scripts
- `npm run subagent` — supervisor + researcher/coder subagent demo
(no tool use; from #104)
- `npm run subagent:events` — event-driven flow matching LibreChat's
architecture (toolDefinitions + ON_TOOL_EXECUTE handler + self-spawn
with calculator). This is the script that demonstrates the #107 fix.
- `npm run subagent:tools` — traditional LangChain tool instances with
self-spawn; pre-fix parity case.
All three accept `OPENAI_API_KEY` (and `--provider anthropic` for the
first).
…t timeouts for LLM tests
* 🗑️ refactor: Remove CODE_API_KEY completely from the sandbox tool layer Phase 8 (LibreChat danny-avila/LibreChat#12767) deprecates `LIBRECHAT_CODE_API_KEY` across every LibreChat-side reference. The sandbox service now authenticates consumers via other channels (network boundary, service token, etc.), so the library layer can drop the concept entirely instead of treating it as an optional forwarding parameter. **What's removed:** - `EnvVar.CODE_API_KEY` enum value (the `CODE_BASEURL` entry stays). - `apiKey` field and `[EnvVar.CODE_API_KEY]` indexer on `CodeExecutionToolParams`, `ProgrammaticToolCallingParams`, and `ToolSearchParams`. `BashExecutionToolParams` / `BashProgrammaticToolCallingParams` inherit the cleaner shapes. - The precedence chain (`params[EnvVar.CODE_API_KEY] ?? params.apiKey ?? getEnvironmentVariable(EnvVar.CODE_API_KEY) ?? ''`) gone from all five tool factories — no key resolution happens at all. - Every `X-API-Key` outbound header in `BashExecutor.ts`, `CodeExecutor.ts`, `ProgrammaticToolCalling.ts` (fetchSessionFiles + makeRequest + factory), and `ToolSearch.ts`. - `apiKey` parameter on `fetchSessionFiles(baseUrl, sessionId, proxy)` and `makeRequest(endpoint, body, proxy)` — both simplified to the remaining args. - Throws in five places: "No API key provided for bash execution tool.", "...for code execution tool.", "...for programmatic tool calling.", "...for bash programmatic tool calling.", "...for tool search in code_interpreter mode." - Dangling `getEnvironmentVariable` imports in BashExecutor/BashProgrammaticToolCalling/ProgrammaticToolCalling/ ToolSearch (still used by CodeExecutor for `CODE_BASEURL`). **Scripts:** - `programmatic_exec.ts`, `tool_search.ts`: remove `LIBRECHAT_CODE_API_KEY` env reads + pre-flight `process.exit(1)` guards + `{ apiKey }` passed to the factories. - `test_code_api.ts`: drop `API_KEY` constant + `X-API-Key` header. - Integration tests (`ProgrammaticToolCalling.integration.test.ts`, `ToolSearch.integration.test.ts`): the skip gate flips from "key present" to an explicit `RUN_CODE_INTEGRATION_TESTS=1` opt-in. Live-sandbox tests don't run by default in CI regardless of auth. **Backward compat:** Consumers that previously passed `apiKey` in params now get a TS error on that field — the correct remediation is to drop the arg. No runtime behavior remains for stragglers because the header is gone. **Verification:** - `grep -rn "CODE_API_KEY\|LIBRECHAT_CODE_API_KEY" src/` returns 0 hits. - `grep -rn "X-API-Key" src/tools/` returns only the SEARX search tool header (unrelated feature with its own optional apiKey). - `npx tsc --noEmit`: clean - `npx jest src/tools`: 379 passing, 23 skipped (integration gated on the new opt-in var). No regressions. - `npx eslint` clean on every touched file. * 📦 chore: bump version to 3.1.68-dev.1 * 🧹 chore: Remove stale LIBRECHAT_CODE_API_KEY from CI + publish workflows Follow-up to the hard removal (bf64da3). Both workflow files still passed `LIBRECHAT_CODE_API_KEY: ${{ secrets.LIBRECHAT_CODE_API_KEY }}` to the "Run PTC unit tests" and "Run Tool Search unit tests" steps — 4 references total. No code in the repo reads the env var anymore, so this was dead config that would mislead future maintainers. The LIBRECHAT_CODE_BASEURL var stays (still read by getCodeBaseURL). `grep -rn "CODE_API_KEY\|LIBRECHAT_CODE_API_KEY" .` (excluding node_modules) now returns zero hits across the whole repo.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )