Infrastructure
The under-the-hood systems that make DeepLore fast, reliable, and provider-agnostic. Most of this is automatic. The page exists so you can reason about failure modes, costs, and where your data goes.
DeepLore has six independent AI connection channels. Each feature picks its own provider, profile, proxy, model, and timeout. The channels are kept separate on purpose. Route a tool-calling Claude or GPT-4o to Emma while AI search runs on a cheap model. Split Scribe to a long-context model and keep retrieval on Haiku.
| Channel | Settings prefix | Default mode | What it does |
|---|---|---|---|
| AI Search | aiSearch* | `profile` | Stage-2 retrieval. Selects entries from the keyword-matched manifest. The "source" channel that others can inherit. |
| Session Scribe | scribe* | `inherit` | Auto-summaries written back to the vault every N messages. |
| Auto Lorebook | autoSuggest* | `inherit` | AI-suggested new entries from chat content. |
| AI Notepad | aiNotepad* | `inherit` | Extract-mode session notes (the post-generation extraction call). Tag mode uses no AI channel of its own. |
| Librarian | librarian* | `inherit` (since v3) | Emma's chat sessions and the writing-AI tool calls (search_lore, flag_lore). Auto-enables function calling on the active connection. |
| Optimize Keys | optimizeKeys* | `inherit` | The /dle-optimize-keys AI keyword refiner. |
inherit mode reuses the AI Search connection's mode, profile, proxy URL, and model. The feature still keeps its own maxTokens and timeout. Set the channel to profile or proxy to override.
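The inheritance rule can be sketched as a small resolver. This is an illustration only, not DLE's actual code: the function name `resolveChannel` and the setting suffixes `ConnectionMode`, `ProfileId`, `ProxyUrl`, `Model`, `MaxTokens`, and `Timeout` are assumptions extrapolated from the prefixes in the table and the `librarianConnectionMode` / `librarianProfileId` / `librarianModel` names mentioned later on this page.

```javascript
// Hypothetical resolver. Setting key names are assumed, not confirmed by the source.
function resolveChannel(settings, prefix) {
  const own = {
    mode: settings[`${prefix}ConnectionMode`],
    profileId: settings[`${prefix}ProfileId`],
    proxyUrl: settings[`${prefix}ProxyUrl`],
    model: settings[`${prefix}Model`],
  };
  // AI Search is the source channel, so it never inherits from itself.
  const source = own.mode === 'inherit' && prefix !== 'aiSearch'
    ? resolveChannel(settings, 'aiSearch')
    : own;
  return {
    mode: source.mode,
    profileId: source.profileId,
    proxyUrl: source.proxyUrl,
    model: source.model,
    // maxTokens and timeout always come from the feature itself.
    maxTokens: settings[`${prefix}MaxTokens`],
    timeout: settings[`${prefix}Timeout`],
  };
}
```

With Scribe set to `inherit`, the resolver returns AI Search's mode, profile, proxy URL, and model alongside Scribe's own `maxTokens` and `timeout`.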
The Librarian channel is the most commonly broken out separately. The reason: function calling is required for Emma and the writing-AI tools, and not every AI Search profile points at a tool-calling model. Per-feature override means you can leave AI Search on a non-tool model and route Librarian to Claude / GPT-4o / OpenRouter Haiku.
Important
The Librarian channel is intentionally separate from retrieval. Don't collapse them. If you want Emma cheap, point her at Haiku via her own profile and leave AI Search on whatever you like.
Each channel runs in one of three modes:
- `profile`: routes through SillyTavern's Connection Manager (CMRS, the `ConnectionManagerRequestService`). Picks up presets, instruct templates, system prompts, and provider quirks already configured in ST. Recommended for most setups.
- `proxy` (Custom Proxy): routes through ST's built-in CORS proxy (/proxy/<encoded URL>) to a separate Anthropic-compatible endpoint (e.g., claude-code-proxy). Used to talk to a tool-calling provider that ST's chat-completions route doesn't expose well. Requires `enableCorsProxy: true` in ST's `config.yaml`.
- `inherit`: non-AI-Search features only. Mirrors AI Search's mode/profile/proxy/model.
If you set a feature to proxy mode without enabling ST's CORS proxy, the call throws:
SillyTavern CORS proxy is not enabled. Set enableCorsProxy: true in
config.yaml, or use a Connection Profile instead of Custom Proxy mode.
DeepLore uses SillyTavern's built-in CORS proxy. There is no DLE server plugin. Two paths use it:
- Proxy-mode AI calls (any channel set to `proxy`) post to `/proxy/<encoded target URL>` with the Anthropic Messages API payload. The CORS proxy forwards the body verbatim.
- Librarian agentic loop in proxy mode also calls the Anthropic Messages API through the same `/proxy/<URL>` route, with native tool calling (no CMRS translation).
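A minimal sketch of how a proxy-mode request might be assembled. The helper name `buildProxyRequest`, the `corsProxyEnabled` argument, and the use of `encodeURIComponent` for the encoded target URL are assumptions for illustration; the page does not say how DLE detects the setting or encodes the URL.

```javascript
const CORS_DISABLED_MESSAGE =
  'SillyTavern CORS proxy is not enabled. Set enableCorsProxy: true in ' +
  'config.yaml, or use a Connection Profile instead of Custom Proxy mode.';

// Hypothetical helper: returns the URL and fetch() options for a proxy-mode call.
// Assumption: the target URL is percent-encoded with encodeURIComponent.
function buildProxyRequest(targetUrl, payload, corsProxyEnabled) {
  if (!corsProxyEnabled) throw new Error(CORS_DISABLED_MESSAGE);
  return {
    url: `/proxy/${encodeURIComponent(targetUrl)}`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // The CORS proxy forwards this body verbatim to the target.
      body: JSON.stringify(payload),
    },
  };
}
```

The result would be passed to `fetch(request.url, request.init)` from the browser.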
In profile mode no CORS proxy is used. CMRS makes its own request directly through ST's normal chat-completions route.
Obsidian fetches go directly to your local Obsidian REST endpoint (no CORS bridge). Browser CORS is allowed by the Local REST API plugin's response headers when the call originates from http://localhost.
In profile mode, AI Search sends a json_schema field on the override payload. ST's chat-completions route translates this per-provider, so you don't have to think about it:
| Provider class | What ST does with json_schema |
|---|---|
| OpenAI, OpenRouter, Groq, xAI, Fireworks, Custom, Azure | Strict json_schema on the request |
| Claude (Anthropic) | Forced tool_choice (translated) |
| Gemini | responseSchema |
| Mistral, DeepSeek, Moonshot, Z.AI | Soft json_object mode |
| Anything else | Field silently dropped |
DLE sends the schema unconditionally. Worst case is no-op; best case is strict parseable JSON without any prompt-engineering tricks.
The Claude exception: ST translates json_schema to forced tool_choice, which the Claude API rejects when extended thinking is enabled (Thinking may not be enabled when tool_choice forces tool use.). Thinking is on by default for Claude 4.x via profile presets. To avoid breaking other tooling that uses those presets, AI Search detects Claude profiles by model prefix and skips the schema for Claude. The JSON extractor in ai.js is permissive enough to handle Claude responses without the schema.
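The two ideas in this paragraph, skipping the schema by model prefix and parsing the reply permissively, can be sketched as follows. Both functions are hypothetical: the real check and the real extractor in ai.js may differ, and the `claude` prefix test is an assumption about how the model id is matched.

```javascript
// Assumption: Claude models are recognised by a "claude" prefix on the model id.
function shouldSendJsonSchema(model) {
  return !String(model ?? '').toLowerCase().startsWith('claude');
}

// Permissive extraction: try a strict parse first, then fall back to the
// outermost {...} or [...] span found in the text.
function extractJson(text) {
  try { return JSON.parse(text); } catch { /* fall through to span search */ }
  const starts = [text.indexOf('{'), text.indexOf('[')].filter((i) => i >= 0);
  if (starts.length === 0) return null;
  const start = Math.min(...starts);
  const closer = text[start] === '{' ? '}' : ']';
  const end = text.lastIndexOf(closer);
  if (end <= start) return null;
  try { return JSON.parse(text.slice(start, end + 1)); } catch { return null; }
}
```

A reply wrapped in chatty prose still parses this way, which is why dropping the schema for Claude costs little.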
In proxy mode (Anthropic Messages API direct), DLE has full control of the payload. The proxy path uses cache_control breakpoints on the manifest for Anthropic prompt caching.
Connect multiple Obsidian vaults at once. Each vault has its own host, port, API key, HTTPS toggle, and enable flag. Entries from all enabled vaults merge into a single index.
Setup:
- In Settings → Connection → Obsidian, click Add Vault to add a new connection.
- Each vault has Name, Host, Port, HTTPS, API Key, and Enabled.
- Click Test All to verify all enabled connections.
- Use Scan Vaults to sweep a port range looking for responding Local REST API instances.
Notes:
- Entries from all enabled vaults merge and are treated identically by the pipeline.
- Each entry tracks its
vaultSourcefield for diagnostics,trackerKey(vaultSource:title) collisions, and dedup. - The Multi-Vault Conflict Resolution setting controls how entries with the same title across vaults are handled (
allkeeps both disambiguated;first/lastkeep one;mergecombines content). - The health check audits multi-vault configuration (overlapping titles, unreachable vaults, mismatched tag conventions).
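The four conflict modes can be illustrated with a short sketch. The entry shape `{ title, vaultSource, content }`, the function name `resolveConflicts`, and the way `merge` joins content with a blank line are assumptions; only the mode names and the `vaultSource:title` key format come from this page.

```javascript
// trackerKey format as described above: vaultSource:title.
const trackerKey = (entry) => `${entry.vaultSource}:${entry.title}`;

// Hypothetical resolver for same-title entries across vaults.
function resolveConflicts(entries, mode) {
  const byTitle = new Map();
  for (const entry of entries) {
    if (!byTitle.has(entry.title)) byTitle.set(entry.title, []);
    byTitle.get(entry.title).push(entry);
  }
  const out = [];
  for (const group of byTitle.values()) {
    if (group.length === 1 || mode === 'all') {
      out.push(...group); // 'all': both survive, told apart by trackerKey
    } else if (mode === 'first') {
      out.push(group[0]);
    } else if (mode === 'last') {
      out.push(group[group.length - 1]);
    } else if (mode === 'merge') {
      // Assumed merge strategy: keep the first entry's metadata, join the bodies.
      out.push({ ...group[0], content: group.map((g) => g.content).join('\n\n') });
    } else {
      throw new Error(`Unknown conflict mode: ${mode}`);
    }
  }
  return out;
}
```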
The parsed vault index gets saved to IndexedDB (database DeepLoreEnhanced, store vaultCache) after every successful build.
On page load:
- DLE hydrates from IndexedDB instantly (no Obsidian call needed).
- A background validator hits Obsidian and reconciles changes.
- UI surfaces show the cached state immediately so the first generation works without waiting.
This lets DLE survive Obsidian being briefly unreachable on page load. No settings to configure.
When auto-sync triggers, DLE fetches all vault file contents but avoids redundant work:
- Fetches all file contents from Obsidian (local fetch is fast).
- Computes content hashes and compares against the existing index.
- Reuses already-parsed entries for unchanged files (skips parse and tokenize).
- Re-parses only new or modified files.
- Removes entries for deleted files.
- Falls back to a full rebuild if the reuse approach fails.
The savings come from skipping the expensive parse/tokenize step for unchanged entries, not from reducing network calls.
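The reuse step can be sketched as below. This is a simplified illustration, not DLE's implementation: `hashContent` uses FNV-1a purely as a stand-in content hash, and `incrementalRebuild`, the `Map<path, { hash, entry }>` index shape, and the `parseEntry` callback are hypothetical. The full-rebuild fallback is left out for brevity.

```javascript
// FNV-1a, used here only as a stand-in for whatever content hash DLE uses.
function hashContent(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// prevIndex: Map<path, { hash, entry }>; files: Array<{ path, content }>.
function incrementalRebuild(prevIndex, files, parseEntry) {
  const index = new Map();
  let reused = 0;
  let parsed = 0;
  for (const { path, content } of files) {
    const hash = hashContent(content);
    const previous = prevIndex.get(path);
    if (previous && previous.hash === hash) {
      index.set(path, previous); // unchanged file: skip parse and tokenize
      reused += 1;
    } else {
      index.set(path, { hash, entry: parseEntry(path, content) });
      parsed += 1;
    }
  }
  // Anything in the old index that was not fetched this time was deleted.
  const removed = [...prevIndex.keys()].filter((path) => !index.has(path));
  return { index, reused, parsed, removed };
}
```

Every file is still fetched and hashed on each sync; only the `parseEntry` call is skipped for unchanged content.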
When the index rebuilds, DLE compares the new index against the previous one and reports the diff.
Detected changes:
- New entries added
- Entries removed
- Modified content
- Changed keywords
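A sketch of how the four change types could be computed from two index snapshots. The function name `diffIndex` and the entry shape `{ content, keywords }` keyed by title are assumptions for illustration.

```javascript
// prev / next: Map<title, { content: string, keywords: string[] }>.
function diffIndex(prev, next) {
  const diff = { added: [], removed: [], modified: [], keywordsChanged: [] };
  for (const [title, entry] of next) {
    const old = prev.get(title);
    if (!old) {
      diff.added.push(title);
      continue;
    }
    if (old.content !== entry.content) diff.modified.push(title);
    // Compare keyword sets order-insensitively.
    const before = [...old.keywords].sort().join('\u0000');
    const after = [...entry.keywords].sort().join('\u0000');
    if (before !== after) diff.keywordsChanged.push(title);
  }
  for (const title of prev.keys()) {
    if (!next.has(title)) diff.removed.push(title);
  }
  return diff;
}
```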
Auto-sync polling: set Auto-Sync Interval to re-check the vault every N seconds. When changes get detected, toast notifications summarize what changed (controlled by Show Sync Change Toasts).
Manual refresh: click Refresh Index in Settings → System, or run /dle-refresh.
DeepLore runs two independent circuit breakers.
Obsidian (per-vault):
- States: closed (normal), open (failing; skip calls during backoff), half-open (let one probe through).
- Exponential backoff from 2s to 15s.
- Keyed by `host:port` so each vault has independent failure tracking.
- Resets when a call succeeds.
- Stale circuit breakers (vaults removed from config) get pruned.
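A rough sketch of the per-vault breaker bookkeeping. The 2s floor, 15s ceiling, `host:port` key, reset on success, and pruning come from the list above; the doubling factor, the function names, and the record shape are assumptions. The single-probe half-open gate is not modelled here.

```javascript
const BACKOFF_MIN_MS = 2000;
const BACKOFF_MAX_MS = 15000;

// Assumed growth: doubles per consecutive failure, capped at the maximum.
function backoffMs(consecutiveFailures) {
  if (consecutiveFailures <= 0) return 0;
  return Math.min(BACKOFF_MIN_MS * 2 ** (consecutiveFailures - 1), BACKOFF_MAX_MS);
}

const breakers = new Map(); // keyed by "host:port"

function recordResult(host, port, ok, now = Date.now()) {
  const key = `${host}:${port}`;
  if (ok) {
    breakers.delete(key); // a successful call resets the breaker
    return;
  }
  const state = breakers.get(key) ?? { failures: 0, openUntil: 0 };
  state.failures += 1;
  state.openUntil = now + backoffMs(state.failures);
  breakers.set(key, state);
}

function canCall(host, port, now = Date.now()) {
  const state = breakers.get(`${host}:${port}`);
  return !state || now >= state.openUntil;
}

// Drop breakers for vaults that are no longer in the configuration.
function pruneBreakers(activeVaults) {
  const live = new Set(activeVaults.map((v) => `${v.host}:${v.port}`));
  for (const key of [...breakers.keys()]) {
    if (!live.has(key)) breakers.delete(key);
  }
}
```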
AI search:
- Threshold: 2 consecutive failures to trip.
- Cooldown: 30s before a half-open probe is allowed.
- Half-open probe gate ensures exactly one caller goes through after the cooldown.
- Throttled calls do not trip the breaker (the throttle is 500ms minimum between AI calls).
- User aborts, timeouts, rate-limit responses (HTTP 429), and auth errors (HTTP 401/403) also do not trip the breaker.
- 5xx responses, network errors, and persistent JSON-parse failures do trip it.
When the AI breaker is open, AI search falls back to keyword results for the cooldown window.
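The AI breaker rules can be sketched as a small class. The threshold of 2, the 30s cooldown, and the single-probe gate follow the list above; the class name, method names, and failure-kind labels (`server_error`, `network`, `json_parse`) are hypothetical.

```javascript
class AiCircuitBreaker {
  constructor({ threshold = 2, cooldownMs = 30000 } = {}) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = null; // null means closed
    this.probeInFlight = false;
  }

  // true: the caller may hit the AI. false: fall back to keyword results.
  tryAcquire(now = Date.now()) {
    if (this.openedAt === null) return true; // closed
    if (now - this.openedAt < this.cooldownMs) return false; // open, cooling down
    if (this.probeInFlight) return false; // half-open, probe already taken
    this.probeInFlight = true; // exactly one caller gets through
    return true;
  }

  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
    this.probeInFlight = false;
  }

  recordFailure(kind, now = Date.now()) {
    const wasProbe = this.probeInFlight;
    this.probeInFlight = false;
    // Aborts, timeouts, 429s, and 401/403s are ignored by the breaker.
    if (!AiCircuitBreaker.TRIPPING.has(kind)) return;
    this.failures += 1;
    if (wasProbe || this.failures >= this.threshold) this.openedAt = now;
  }
}

// Hypothetical labels for the failure classes that do trip the breaker.
AiCircuitBreaker.TRIPPING = new Set(['server_error', 'network', 'json_parse']);
```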
Two race-condition guards run during generation:
- `generationLock` with `generationLockTimestamp` and `generationLockEpoch`. Prevents concurrent generations from clobbering each other. Stale-lock detector force-releases after 30s timeout. The Librarian agentic loop refreshes the timestamp before every API call and tool processing to prevent the stale-lock detector from firing mid-loop.
- `chatEpoch` increments on `CHAT_CHANGED`. Epoch-sensitive operations re-check the value after every `await` to bail out if the user switched chats mid-flight. `lastInjectionEpoch` is the corresponding guard for `lastInjectionSources` to prevent stale cross-chat writes.
buildEpoch increments on force-release of a stuck indexing flag. In-progress index builds capture the epoch at start and bail out if the value changes mid-build (zombie guard).
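The epoch pattern shared by `chatEpoch` and `buildEpoch` can be sketched as capture-then-recheck. Everything here except the `chatEpoch` name is hypothetical, including the `state` object and the helper names.

```javascript
const state = { chatEpoch: 0 };

// Would be wired to the CHAT_CHANGED event in a real extension.
function onChatChanged() {
  state.chatEpoch += 1;
}

// Capture the epoch at the start of an operation; check it after each await.
function captureEpoch() {
  const captured = state.chatEpoch;
  return { isStale: () => state.chatEpoch !== captured };
}

// Example of an epoch-sensitive operation: discard the result if the user
// switched chats while the async work was in flight.
async function writeIfStillCurrent(loadValue, commit) {
  const guard = captureEpoch();
  const value = await loadValue();
  if (guard.isStale()) return false; // bail out, do not write cross-chat
  commit(value);
  return true;
}
```

A zombie index build would use the same shape with `buildEpoch` in place of `chatEpoch`.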
AI search caches results with a sliding window strategy. The manifest and chat context are hashed separately. When only new chat messages get appended (vault unchanged):
- If the new messages don't reference any vault entity names or keys, cached results get reused.
- If new messages mention vault entities, the cache invalidates and a fresh AI call runs.
- A prefix-content-hash check catches mid-context edits (sliding window only checks lines at the end; an edit to existing lines invalidates the cache).
- An entity-regex version stamp catches the case where `entityShortNameRegexes` got rebuilt since the cache was written.
Most regenerations, swipes, and non-lore-relevant messages reuse cached results automatically.
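The reuse decision can be sketched as a sequence of cheap checks. This is a simplified model, not the real cache: `snapshotCache`, `canReuseCache`, the cache record shape, and the FNV-1a stand-in hash are all assumptions, and the regexes are assumed to be non-global so that `test()` carries no state between calls.

```javascript
// FNV-1a as a stand-in hash for the prefix-content check.
function hashText(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Record what the cached AI result was computed from.
function snapshotCache({ manifestHash, messages, regexVersion }) {
  return {
    manifestHash,
    regexVersion,
    messageCount: messages.length,
    prefixHash: hashText(messages.join('\n')),
  };
}

function canReuseCache(cache, { manifestHash, messages, entityRegexes, regexVersion }) {
  if (!cache) return false;
  if (cache.manifestHash !== manifestHash) return false; // vault changed
  if (cache.regexVersion !== regexVersion) return false; // entity regexes rebuilt
  if (messages.length < cache.messageCount) return false; // history shrank
  const prefix = messages.slice(0, cache.messageCount).join('\n');
  if (hashText(prefix) !== cache.prefixHash) return false; // mid-context edit
  const appended = messages.slice(cache.messageCount);
  // Reuse only if no appended message mentions a vault entity.
  return !appended.some((msg) => entityRegexes.some((re) => re.test(msg)));
}
```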
Off by default. Toggle via AI Search → Show Filtering → Category Pre-filter.
For large vaults (40+ selectable entries with 4+ distinct categories), AI search uses a two-call approach:
- Group entries by category (extracted from tags/type fields).
- First AI call: select relevant categories from the full list.
- Second AI call: select specific entries from within those categories.
Safety valve: if the category filter would remove more than the configured aggressiveness fraction of entries (default 0.8 → up to 80%), it falls back to the full manifest. Requires at least 4 distinct categories to activate at all.
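The activation thresholds and the safety valve can be expressed in a few lines. The function name `applyCategoryPrefilter`, the option names, and the `category` field on entries are assumptions; the 40-entry, 4-category, and 0.8 defaults are the ones stated above.

```javascript
// entries: Array<{ title, category }>; selectedCategories: output of the first AI call.
function applyCategoryPrefilter(
  entries,
  selectedCategories,
  { aggressiveness = 0.8, minEntries = 40, minCategories = 4 } = {},
) {
  const categories = new Set(entries.map((e) => e.category));
  // Below the size thresholds the pre-filter does not activate at all.
  if (entries.length < minEntries || categories.size < minCategories) return entries;

  const keep = new Set(selectedCategories);
  const filtered = entries.filter((e) => keep.has(e.category));
  const removedFraction = 1 - filtered.length / entries.length;
  // Safety valve: removing too much falls back to the full manifest.
  return removedFraction > aggressiveness ? entries : filtered;
}
```

For example, with 40 entries spread evenly over 4 categories, selecting one category removes 75% of entries, which is under the 0.8 limit, so the filtered list of 10 is used.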
In proxy mode, the AI search manifest is placed first in the message payload with cache_control breakpoints. This uses Anthropic prompt caching so the manifest (which rarely changes between calls) is cached server-side, reducing token costs on subsequent calls.
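A sketch of what a manifest-first payload with a cache breakpoint might look like. The `cache_control: { type: 'ephemeral' }` marker on a text content block is the Anthropic Messages API's prompt-caching mechanism; the function name `buildCachedSearchPayload` and the exact block layout DLE uses are assumptions.

```javascript
// Hypothetical payload builder for a proxy-mode AI search call.
function buildCachedSearchPayload({ model, maxTokens, systemPrompt, manifest, chatContext }) {
  return {
    model,
    max_tokens: maxTokens,
    system: systemPrompt,
    messages: [
      {
        role: 'user',
        content: [
          // Manifest first, with a breakpoint: everything up to and including
          // this block is eligible for server-side caching.
          { type: 'text', text: manifest, cache_control: { type: 'ephemeral' } },
          // The chat context changes every call, so it sits after the breakpoint.
          { type: 'text', text: chatContext },
        ],
      },
    ],
  };
}
```

Putting the stable manifest ahead of the volatile chat context is what lets later calls hit the cached prefix.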
profile mode currently does not support cache_control breakpoints. Most providers other than Anthropic don't have an equivalent.
DeepLore stores state in three places:
ST extension settings (extension_settings.deeplore_enhanced, persisted to disk by saveSettingsDebounced):
- All UI settings from the Settings popup
- Vault connection list (`vaults[]`)
- API keys (plaintext, platform limitation; use a dedicated lorebook vault, not your personal one)
- Saved prompt presets (`promptPresets`)
- All-time analytics counters
- Saved graph node positions
- The wizard-completed flag
chat_metadata (per-chat, saved by ST's normal chat persistence):
- `deeplore_notebook`: Author's Notebook content
- `deeplore_ai_notepad`: AI Notepad accumulated session notes
- `deeplore_lastScribeSummary`: prior Scribe note context
- `deeplore_injection_log`: injection dedup history
- `deeplore_pins` / `deeplore_blocks`: per-chat `{title, vaultSource}` arrays
- `deeplore_context`: contextual gating state (era, location, scene type, character present, custom fields)
- `deeplore_chat_counts`: per-chat injection counts keyed by `trackerKey`
- `deeplore_lore_gaps`: Librarian gap records
- `deeplore_lore_gaps_hidden`: first-tier soft-removed gap IDs (re-flag resurfaces)
- `deeplore_lore_gaps_dismissed`: second-tier permanently dismissed gap IDs
- `deeplore_librarian_session`: persisted Librarian session draft
- `deeplore_folder_filter`: folder-path filter array
- `deeplore_swipe_injected_keys`: per-swipe injected `trackerKey`s for accurate rollback across reloads
IndexedDB (DeepLoreEnhanced database, vaultCache store):
- Parsed vault index (entries plus BM25 inverted index)
- Used for instant hydration on page load before background validation against Obsidian
Per-message tool call records are stored on message.extra.deeplore_tool_calls (not chat_metadata).
Profile mode works with any provider SillyTavern's Connection Manager supports:
- Cloud APIs: Anthropic (Claude Haiku / Sonnet / Opus), OpenAI (GPT-4o, GPT-4o-mini, GPT-5), Gemini, Cohere, Mistral, DeepSeek, OpenRouter, Groq, xAI, Fireworks, Z.AI, Moonshot, Azure OpenAI.
- Local backends: Oobabooga, KoboldCpp, llama.cpp, Custom (any OpenAI-compatible local endpoint).
Forced JSON output works on every provider listed above (see the matrix earlier on this page). For providers without strict schema support, DeepLore's JSON extractor handles typical responses without help.
Function calling (Librarian): requires a tool-calling provider. Claude 3+/4.x, GPT-4o, GPT-5, Gemini Pro, OpenRouter for any of those models. Local models that route through llama.cpp's tool-calling spec also work. The Librarian feature auto-enables function calling on the active connection when you turn it on.
Local-model latency: local backends typically need 60-120s for AI search on long chats. Cloud APIs respond in 5-15s. Set the per-channel timeout accordingly. The default AI Search timeout is 20s (20000ms); increase it to 60000-120000ms (60-120s) for local backends.
Settings versions are tracked in settingsVersion (current: 3). Migrations run on load when the stored version is behind:
- v0 → v1: initial versioned settings (no behavior change).
- v1 → v2: Librarian connection consolidation. `librarianSessionModel` got renamed to `librarianModel`. Existing per-tool connection modes are preserved; only Librarian's model field migrates.
- v2 → v3: Librarian default connection mode changed from `profile` to `inherit` for unconfigured users (those with `librarianConnectionMode: 'profile'` and an empty `librarianProfileId`). Users who explicitly chose a profile and set a profileId are left alone.
Migrations run idempotently and persist immediately. The settingsVersion value is what gates re-runs.
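The migration chain can be sketched as version-gated steps. The setting names and the v1 to v3 rules are taken from the list above; the function name `migrateSettings` and the copy-then-return style are assumptions, and persisting the result is left to the caller.

```javascript
const CURRENT_SETTINGS_VERSION = 3;

// Hypothetical migration runner: returns a migrated copy of the settings.
function migrateSettings(settings) {
  const s = { ...settings };
  let version = s.settingsVersion ?? 0;

  if (version < 1) version = 1; // v0 -> v1: versioning only, no behavior change

  if (version < 2) {
    // v1 -> v2: rename librarianSessionModel to librarianModel.
    if ('librarianSessionModel' in s) {
      s.librarianModel = s.librarianSessionModel;
      delete s.librarianSessionModel;
    }
    version = 2;
  }

  if (version < 3) {
    // v2 -> v3: only users who never picked a profile move to inherit.
    if (s.librarianConnectionMode === 'profile' && !s.librarianProfileId) {
      s.librarianConnectionMode = 'inherit';
    }
    version = 3;
  }

  s.settingsVersion = version;
  return s;
}
```

Because each step is gated on the stored version, running the function on already-migrated settings changes nothing.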