Conditional Query Detection
Conditional detection recognizes a leading "if X, Y" structure — "if the back door is unlocked, let me know" — and answers it honestly: it searches the condition, and either gives a real yes/no verdict (when the underlying source genuinely supports one) or presents the raw result and says so plainly when it doesn't. Mnemolis has no reminder or trigger capability, so it can never actually act on the consequence — what it can do is make sure the response is framed around the condition's real, current answer instead of just restating the question back.
The detector only matches a leading "if " / "should " / "in case " followed eventually by a comma:
^(if|should|in case)\s+(.+?),\s*(.+)$

This is deliberately restrictive, and the restriction is the actual design decision worth understanding. "If" is genuinely ambiguous in English: it has a conditional sense ("if it's raining, bring an umbrella") and a "whether" sense ("check if the lights are on" means "check whether the lights are on," not a real condition at all). The whether sense never shows up at the very start of a sentence followed by a comma; it's always embedded after a verb like "check," "see," or "tell me." Restricting to the leading-comma form sidesteps the ambiguity entirely, rather than trying to guess at it from surrounding verbs.
A few phrasings are deliberately, permanently out of scope as a result:
- Mid-sentence or trailing "if" — "remind me to bring an umbrella if it's raining" doesn't match. Out of scope for the same reason as above.
- "Let me know if X" — genuinely ambiguous even to a human reader. Could mean "tell me the current status" or "notify me if it changes." Not safely interpretable either way, so it's left alone.
- No comma at all — "if the front door is unlocked tell me" (missing the comma) doesn't match. This is a real, accepted limitation, not an oversight — distinguishing this reliably from "whether" usage would require actual grammatical parsing, not pattern matching, and that's a different kind of project.
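As a rough illustration of the pattern and the out-of-scope phrasings above, here is a minimal Python sketch. The regex is the one quoted in the text; the function name `detect_conditional_sketch`, the case-insensitive flag, and the tuple return shape are assumptions made for illustration, not Mnemolis's actual implementation.

```python
import re

# Pattern quoted in the text. re.IGNORECASE is an assumption of this sketch.
CONDITIONAL_RE = re.compile(r"^(if|should|in case)\s+(.+?),\s*(.+)$", re.IGNORECASE)


def detect_conditional_sketch(query):
    """Return (condition, consequence), or None when the query is out of scope."""
    match = CONDITIONAL_RE.match(query.strip())
    if match is None:
        return None
    # Group 1 is the leading keyword; groups 2 and 3 are the two halves.
    return match.group(2).strip(), match.group(3).strip()


print(detect_conditional_sketch("if the back door is unlocked, let me know"))
print(detect_conditional_sketch("remind me to bring an umbrella if it's raining"))
print(detect_conditional_sketch("if the front door is unlocked tell me"))
```

Only the first call returns a pair; the trailing "if" and the comma-less form both return `None`, which is the accepted limitation described in the list.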
Once a condition is extracted, the real question is whether Mnemolis can say anything meaningful about whether it's true. This is restricted to exactly three sources with a genuinely structured, binary signal:
_YES_NO_INTERPRETABLE_SOURCES = {"ha", "uptime", "forecast"}
- `ha`: checks for "locked"/"unlocked" keywords in the condition, then in the result
- `uptime`: checks "down"/"not up" (the condition implying something's broken) or "up"/"running"/"working" (implying it should be fine) in the condition, against "down" or "all" + "up" together in the result
- `forecast`: checks for "rain"/"raining" in the condition specifically (deliberately narrow, never a broader "bad weather" guess); if found, checks the result for "rain", "storm", or "shower" language to confirm, or "clear" to deny. The condition-side and result-side keyword sets are deliberately asymmetric: "storm"/"shower" are real, valid ways the result might describe rain happening, but a condition phrased as "if there's a storm" (no mention of rain) isn't matched at all, since there's no positive_condition_keywords or storm-specific condition check for this source. Not a bug; see the next paragraph for what's actually interpretable here and what isn't.
Every other source — Kiwix, web, news — is never interpreted, on purpose. There's no structured signal to check against in free text, and guessing wrong would actively mislead rather than just be unhelpful. Even within the three interpretable sources, a genuinely subjective condition like "if it's hot enough this week" correctly returns no verdict, because there's no universal threshold for "hot enough" to check against — _interpret_yes_no returns None here, not a guess.
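The per-source keyword checks described above could look roughly like the following. This is a sketch under stated assumptions, not the real `_interpret_yes_no`: the function name `interpret_yes_no_sketch`, the `_words` helper, whole-word matching on the condition side, and substring matching on forecast results are all choices made for the example.

```python
import re

_YES_NO_INTERPRETABLE_SOURCES = {"ha", "uptime", "forecast"}


def _words(text):
    # Whole-word tokens, so "unlocked" is never mistaken for "locked".
    return set(re.findall(r"[a-z]+", text.lower()))


def interpret_yes_no_sketch(source, condition, result):
    """Return True/False when a verdict is supported, otherwise None."""
    if source not in _YES_NO_INTERPRETABLE_SOURCES:
        return None  # Kiwix, web, news: never interpreted
    cond, res = _words(condition), _words(result)

    if source == "ha":
        asked = {"locked", "unlocked"} & cond
        seen = {"locked", "unlocked"} & res
        if len(asked) == 1 and len(seen) == 1:
            return asked == seen
        return None

    if source == "uptime":
        expects_down = "down" in cond or "not up" in condition.lower()
        expects_up = bool({"up", "running", "working"} & cond)
        if "down" in res:
            is_down = True
        elif {"all", "up"} <= res:
            is_down = False
        else:
            return None
        if expects_down:
            return is_down
        if expects_up:
            return not is_down
        return None

    # forecast: only rain conditions are matched on the condition side.
    if not ({"rain", "raining"} & cond):
        return None
    text = result.lower()
    if any(word in text for word in ("rain", "storm", "shower")):
        return True
    if "clear" in text:
        return False
    return None
```

Every path that cannot support a verdict falls through to `None`, so a subjective condition such as "if it's hot enough this week" or a storm-only condition never produces a guess.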
        Condition extracted, searched
                     │
                     ▼
        Was the answer source one of
          ha / uptime / forecast?
                     │
         ┌───────────┴───────────┐
         ▼ no                    ▼ yes
    Present the real result   Does the condition's language
    honestly, note it's       map to a recognizable keyword
    conditional, let the      pattern for THIS source?
    person judge for                     │
    themselves                  ┌────────┴────────┐
                                ▼ no              ▼ yes
                          Same honest       State an explicit verdict:
                          presentation      "It IS / IS NOT the case
                          as the "no"       that {condition} — so the
                          branch            suggested action may or
                                            may not apply"
Wrong is worse than uncertain. That's the entire design principle behind this feature, and it's the reason the interpretable-sources set is a short, explicit allowlist rather than something that tries to generalize.
A query can be conditional without starting with "if" — "what is the weather and if the back door is unlocked, let me know" doesn't match the leading pattern at all, but Query Decomposition will still split it into two sub-queries, and the second one ("if the back door is unlocked, let me know") absolutely is conditional. Conditional detection is re-applied to every decomposed sub-query for exactly this reason.
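A small sketch of that re-application, using the example query above. The splitter `naive_decompose` is a hypothetical stand-in for Query Decomposition (whose real behaviour is not described here), and `classify_sub_queries` is likewise an illustrative name.

```python
import re

CONDITIONAL_RE = re.compile(r"^(if|should|in case)\s+(.+?),\s*(.+)$", re.IGNORECASE)


def naive_decompose(query):
    # Hypothetical stand-in for Query Decomposition: split once on " and ".
    return [part.strip() for part in query.split(" and ", 1)]


def classify_sub_queries(query):
    """Re-apply conditional detection to each decomposed sub-query."""
    labelled = []
    for sub in naive_decompose(query):
        match = CONDITIONAL_RE.match(sub)
        labelled.append((sub, match is not None))
    return labelled


query = "what is the weather and if the back door is unlocked, let me know"
print(CONDITIONAL_RE.match(query) is None)  # the whole query is not a leading "if"
print(classify_sub_queries(query))
```

The whole query fails the leading pattern, but the second sub-query matches once it is checked on its own.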
The first version of this re-check recursed on the original "if X, Y" string, with a manual depth counter meant to prevent runaway recursion. That counter introduced a real bug: it incremented before the conditional was actually consumed, so the recursive call's own necessary re-detection of that same conditional was blocked. The counter was guarding against infinite recursion that was never actually possible in the first place. The fix, and the much simpler design that replaced the depth counter entirely, is told in full in The Recursion Design Bug.
"if any services are down, let me know, and also what's the weather" has a genuine second intent hiding after the conditional's consequence. An early version of the consequence-extraction regex was greedy and captured everything to the end of the string — "let me know, and also what's the weather" — silently swallowing the weather question into plain descriptive text that never got searched at all.
This is fixed by checking the extracted consequence for a trailing conjunction and, if found, splitting off the remainder and searching it independently, merging it back into the final response with its own source attribution. The same fix surfaced a second, smaller bug — the exact [FUSION — FUSION] double-header issue described in Fusion — at a new call site that hadn't existed when that bug was first found and fixed elsewhere.
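The remainder split described above might be sketched as follows. The conjunction list, the requirement of a comma before the conjunction, and the name `split_remainder` are all assumptions for illustration; the text does not specify which conjunctions the real check recognizes.

```python
import re

# Hypothetical conjunction list; the text does not enumerate the real one.
_TRAILING_CONJUNCTION_RE = re.compile(
    r"^(.+?),\s*(?:and also|and then|and|also|plus)\s+(.+)$", re.IGNORECASE
)


def split_remainder(consequence):
    """Return (consequence, remainder); remainder is "" when nothing trails."""
    match = _TRAILING_CONJUNCTION_RE.match(consequence.strip())
    if match is None:
        return consequence.strip(), ""
    return match.group(1).strip(), match.group(2).strip()


print(split_remainder("let me know, and also what's the weather"))
print(split_remainder("let me know"))
```

Requiring the comma keeps a consequence such as "lock it and tell me" in one piece, at the cost of missing comma-less remainders. That trade-off belongs to this sketch, not necessarily to the real code.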
The condition and the remainder used to each get routed with their own full, independent call — search the condition, get a verdict, then search the remainder, merge the two. This was sequential, not concurrent: if either half hit a slow LLM call or a slow fusion fan-out, the total wait was additive, not the longer of the two. A condition that took 2 seconds to resolve and a remainder that took 6 added up to roughly 8 seconds total, not 6.
This was found via real, live latency data, not a synthetic benchmark — several real conditional_with_remainder-shaped queries logged by Adversarial Self-Testing showed latency meaningfully above what either half would cost alone.
An earlier version of this page recorded this as deliberately not being fixed, reasoning that the same conditional-handling code's real bug history (the recursion depth-counter bug above, and the greedy-consequence-regex bug just above this section) made any change here too risky. Re-examined directly, that reasoning conflated two different things. Both real bugs live in detect_conditional()'s parsing logic and _interpret_binary_state()'s keyword matching; neither has anything to do with the order in which the two route_with_source() calls inside _resolve_conditional() execute. Re-deriving the actual data dependencies confirmed that those two calls don't depend on each other any more than Query Expansion's two SearXNG fetches did, a structurally similar case that had already been found feasible and fixed by the time this one was re-examined.
The real, original blocker turned out to be concrete and fixable, not a vague "this area is fragile" caution: a genuine, pre-existing file-write race in how both caches persisted to disk, found and fixed as part of researching the web case (see Caching for the full mechanism). That fix removed the actual reason this page's earlier caution had any teeth.
This was fixed the same way as the web case, with the same verification discipline, and it is worth being explicit that the discipline mattered, not just the idea. The web case's own fix first shipped with a real regression: ThreadPoolExecutor does not propagate contextvars.ContextVar state into worker threads by default, which silently broke suppress_cache_writes(). A real test caught it before anyone hit it in practice. Building this fix started from that lesson rather than relearning it. Each task gets its own contextvars.copy_context() call before submission (a single shared context can't be entered by two threads at once, which was confirmed the hard way while building the web fix). Three real checks ran before trusting any of it: a genuine timing proof of concurrency, direct confirmation that suppress_cache_writes() correctly reaches both worker threads, and confirmation that normal caching is unaffected when suppression isn't active.
The concurrent path is only spun up when a remainder genuinely exists. A plain "if X, Y" query with no trailing conjunction (the more common real-world shape) has an empty remainder and never needed a second call in the first place; that path is completely unchanged. The change was verified against realistic timings matching the real flagged query's shape (a 2-second fusion fan-out condition and a 1.5-second single-source remainder): 2.0s concurrent versus what would have been 3.5s sequential.
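The per-task context copy can be sketched with the standard library alone. Here `fake_route` stands in for route_with_source(), `_suppress_writes` stands in for whatever state suppress_cache_writes() sets, and the delays are scaled-down placeholders; all three are assumptions of this example, not the project's code.

```python
import contextvars
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for whatever flag suppress_cache_writes() sets.
_suppress_writes = contextvars.ContextVar("suppress_writes", default=False)


def fake_route(query, delay):
    # Stand-in for route_with_source(); sleeps instead of searching.
    time.sleep(delay)
    return query, _suppress_writes.get()


def resolve_both(condition, remainder, delays=(0.3, 0.2)):
    """Run both halves concurrently, each inside its own context copy."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # One copy_context() per task: a Context cannot be entered by two
        # threads at the same time, so the copies must not be shared.
        first = pool.submit(
            contextvars.copy_context().run, fake_route, condition, delays[0]
        )
        second = pool.submit(
            contextvars.copy_context().run, fake_route, remainder, delays[1]
        )
        return first.result(), second.result()


_suppress_writes.set(True)
start = time.perf_counter()
results = resolve_both("any services are down", "what's the weather")
elapsed = time.perf_counter() - start
print(results, round(elapsed, 2))
```

Both workers should report the flag as set, and the elapsed time should track the slower task rather than the sum of the two. Those are the same two properties the checks described above were confirming.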
Query: "if the back door is unlocked, let me know"
│
▼
condition = "the back door is unlocked"
consequence = "let me know"
│
▼
Search the condition → routes to `ha`
Real result: "Back Door: locked"
│
▼
ha is interpretable. Condition mentions
"unlocked". Result says "locked" — the
opposite. Verdict: NOT the case.
│
▼
"This was a conditional question: 'if the back door
is unlocked, let me know.'
It is NOT the case that the back door is unlocked —
so the suggested action (let me know) may not apply.
Back Door: locked"
The real underlying result is always preserved in the response, regardless of the verdict — framing adds context, it never replaces or hides the actual data.
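A minimal sketch of that framing rule follows. The NOT-the-case wording mirrors the worked example above; the wording of the other two branches, the function name `frame_response`, and its parameters are assumptions for illustration.

```python
def frame_response(query, condition, consequence, verdict, raw_result):
    """Frame a conditional answer; the raw result is always appended."""
    lines = [f"This was a conditional question: '{query}.'"]
    if verdict is True:
        lines.append(
            f"It IS the case that {condition} \u2014 "
            f"so the suggested action ({consequence}) may apply."
        )
    elif verdict is False:
        lines.append(
            f"It is NOT the case that {condition} \u2014 "
            f"so the suggested action ({consequence}) may not apply."
        )
    else:
        # No supported verdict: say so plainly instead of guessing.
        lines.append("No yes/no verdict could be drawn here; the result is shown as-is.")
    lines.append(raw_result)
    return "\n".join(lines)


print(frame_response(
    "if the back door is unlocked, let me know",
    "the back door is unlocked",
    "let me know",
    False,
    "Back Door: locked",
))
```

Whatever the verdict, the last line of the response is the untouched raw result.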