Parameterize BRAINDB_API_URL with API_PORT #1
Merged
To match the use of API_PORT in the api section
Owner
Thanks Mark 🙏 Merging; I'll open a new PR to also parameterise the uvicorn --port and the right side of ports: so all three references move together when API_PORT≠8000.
WarGloom pushed a commit to WarGloom/braindb that referenced this pull request on May 24, 2026
…rrow-query strategy

This is the second-leg of the recall overhaul (the first leg, d4b9288, fixed the silent embedding-zero bug and widened the scoring pool). Two new things land here, plus one prompt nudge.

## A.6: fuzzy now goes through keywords too (symmetric retrieval)

Before: the embedding pathway in assemble_context was keyword-mediated (after d4b9288), but the fuzzy pathway still ran pg_trgm + fulltext directly against entity content / title via fuzzy_search. The result was structurally unfair: a fact saved with keywords ["Petros", ...] got text_score ~0.06 against a multi-word query like "Petros person identity profile" because pg_trgm dilutes when a short query is compared against a long entity body. The keyword indexing was being bypassed by half the recall pipeline.

After: a new helper find_fuzzy_keywords runs pg_trgm similarity(content, query) over entity_type='keyword' rows (short keyword content → no dilution), and assemble_context's text pathway fans out via the existing find_entities_for_keywords. Both pathways now produce a per-entity score equal to the best matched-keyword similarity over that entity's tagged_with neighbours.

The geometric-mean merge and missing_signal_penalty are unchanged but become meaningful: they combine two signals about the SAME thing (how well the query matches this entity's keywords), one via trigrams and one via embeddings.

fuzzy_search itself is intentionally left alone; it still serves the "arbitrary content matching" use-cases (quick_search agent tool, /memory/search). A discoverability backup in assemble_context still calls fuzzy_search and applies a heavy 0.2 discount as a pure fallback (only adds entities the keyword path didn't already cover; never overrides a keyword-path score).

Design principle being restored (user-stated): keywords are the indexing hub. tagged_with relations are created automatically when an entity is saved, so the keyword graph alone is enough for retrieval connectivity.
Explicit elaborates / refers_to edges are editorial nuance, not required for findability.

## A.7: two-level diversity quota (per-search-term + per-keyword)

When A.6 went live the top recall results for narrow-subject queries were dominated by a few popular hub keywords (CityFalcon ~42 entities, user-profile ~30, BrainDB ~12, ...). Each of those keywords was strongly matched by the broad multi-word queries the LLM was issuing, so their entities crowded top-N at near-identical scores; the narrow-subject fact (e.g. Petros, only 1 entity tagged) fell below the cut.

Two complementary mechanisms, sharing ONE counter, fix this:

- L1 (per-search-term reservation): each query in queries[] gets ceil(max_results × per_query_share / num_queries) reserved slots filled from that query's OWN top-ranked entities. So a focused narrow query ALWAYS surfaces something in the result, no matter how broad the other queries are.
- L2 (per-keyword quota, geometric decay): walking the remaining (open) slots in final_rank-desc order, each new dominant matched keyword gets a halving allowance (50% / 25% / 12.5% ... of max_results, floor 1). Stops a popular keyword from monopolising the open portion.

They share one bookkeeping dict (seen: kw_id -> remaining), so a keyword's allowance is decremented by BOTH L1 reservations and L2 walks: no double-spending, no conflict.

The full coexistence rules are documented in the docstring of _apply_two_level_quota in braindb/services/context.py. Please read that block before touching the function; the no-conflict property depends on the shared counter.

assemble_context now also tracks per-query scores (text_scores_by_q, embedding_scores_by_q) alongside the existing max-aggregated dicts, so L1 can rank entities by THAT query's own combined score (using the same geometric-mean / missing_signal_penalty merge per query).
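The interaction between L1 and L2 can be sketched in a few lines. This is a minimal illustration, not the code of _apply_two_level_quota: the function name `two_level_quota`, its argument shapes, the choice that L1 picks always succeed while still charging the shared counter, and the use of floor rounding for the allowance are all assumptions made for the example. The docstring in braindb/services/context.py remains the authority on the real rules.

```python
import math
from typing import Dict, List


def two_level_quota(
    per_query_ranked: Dict[str, List[str]],  # query -> entity ids, best first
    global_ranked: List[str],                # entity ids in final_rank-desc order
    dominant_kw: Dict[str, str],             # entity id -> dominant keyword id
    max_results: int,
    per_query_share: float = 0.5,
    keyword_quota_halving: float = 0.5,
) -> List[str]:
    """Hypothetical sketch of a two-level quota with one shared counter."""
    seen: Dict[str, int] = {}  # kw_id -> remaining allowance, shared by L1 and L2
    picked: List[str] = []

    def charge(entity_id: str, enforce: bool) -> bool:
        kw = dominant_kw[entity_id]
        if kw not in seen:
            # The k-th new keyword gets max_results * halving**(k+1), floor 1
            # (assumed rounding: truncate, then clamp to at least 1).
            share = keyword_quota_halving ** (len(seen) + 1)
            seen[kw] = max(1, int(max_results * share))
        if enforce and seen[kw] <= 0:
            return False
        seen[kw] -= 1
        return True

    # L1: reserve slots per query, filled from that query's own ranking.
    n = len(per_query_ranked)
    reserved = math.ceil(max_results * per_query_share / n) if n else 0
    for ranked in per_query_ranked.values():
        taken = 0
        for eid in ranked:
            if taken >= reserved or len(picked) >= max_results:
                break
            if eid in picked:
                continue
            charge(eid, enforce=False)  # assumed: reservations always land
            picked.append(eid)
            taken += 1

    # L2: fill the open slots, skipping keywords whose allowance is spent.
    for eid in global_ranked:
        if len(picked) >= max_results:
            break
        if eid not in picked and charge(eid, enforce=True):
            picked.append(eid)
    return picked


# Toy data: one narrow entity, one crowded hub keyword.
kw = {"petros_fact": "petros", "cf1": "cityfalcon", "cf2": "cityfalcon",
      "cf3": "cityfalcon", "up1": "user-profile"}
result = two_level_quota(
    per_query_ranked={"Petros": ["petros_fact"], "broad": ["cf1", "cf2", "cf3"]},
    global_ranked=["cf1", "cf2", "cf3", "up1", "petros_fact"],
    dominant_kw=kw,
    max_results=4,
)
print(result)
```

In this toy run the narrow query's entity is reserved a slot even though it ranks last globally, and the hub keyword contributes only one entity because its L1 pick already spent its allowance.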
## Prompt nudge: recall_memory docstring teaches narrow-query strategy

A multi-word query like "Petros person identity profile" matches the short "Petros" keyword at only ~0.4 fuzzy (trigram dilution). The 1-word query "Petros" matches it at ~1.0 and surfaces the Petros fact at the top.

To exploit this, the recall_memory tool's docstring (which the LLM reads as the tool description) now explicitly tells the model:

- prefer 2-4 short focused queries over one long phrase
- include bare subject names as standalone queries
- example: ["Petros", "Selonda Saronikos fish farm", ...]
- the per-search-term quota guarantees each angle gets representation, so adding the bare keyword is free

The narrow strategy + L1 reservation together unlock the narrow-subject case: the LLM issues a single-keyword query for the subject, that query reserves slots in the result, the subject's fact tops those slots.

Also bumped: agent recall_memory default max_results 15 → 30 (via new settings.recall_default_max_results). The /memory/context API schema default was already 30; this brings the agent tool in line.

## Verification (live, deepinfra/Gemma-4-31B)

| Query | Petros position | final_rank |
|-------|-----------------|------------|
| ["Petros"] (narrow) | #1 | 0.838 |
| ["Petros", "Selonda Saronikos fish farm", "Dimitrios manager"] | #1 | 0.839 |
| ["Petros person identity profile", "Petros relation to Dimitris", "Petros CityFalcon"] (broad-only) | #5 | (was: NOT in top-30) |

Dimitrios Koutsoumpos /agent/query regression: 49.9s, 1362-char structured grounded answer. Tool sequence intact.
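The trigram dilution behind the narrow-query advice can be reproduced without a database. The sketch below is a pure-Python approximation of trigram set similarity in the style of pg_trgm (lowercase, split into alphanumeric words, pad each word, compare trigram sets). The helper names are hypothetical, and the absolute values will not match the live figures quoted in this message, since pg_trgm's tokenisation and the project's scoring may differ. Only the ordering is the point: a bare keyword query scores far higher against a short keyword than a long phrase containing it.

```python
import re
from typing import Set


def trigrams(text: str) -> Set[str]:
    """Trigram set per word, padded with two leading and one trailing space."""
    grams: Set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (Jaccard)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


keyword = "Petros"
narrow = trigram_similarity("Petros", keyword)
broad = trigram_similarity("Petros person identity profile", keyword)
print(f"narrow={narrow:.2f} broad={broad:.2f}")
```

Every extra word in the query adds trigrams to the denominator without adding matches, which is why adding the bare subject name as its own query is the cheaper fix.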
## Files

    braindb/agent/tools.py              |  33 ++++-          (docstring + default 30)
    braindb/config.py                   |  28 ++++           (3 new settings)
    braindb/services/context.py         | 288 ++++++++++++   (the bulk: A.6 + A.7)
    braindb/services/keyword_service.py |  32 ++++           (find_fuzzy_keywords)
    4 files changed, 342 insertions(+), 39 deletions(-)

## Knobs (all new settings, defaults are the shipping values)

    scoring_pool_keyword_neighbors: int = 500
        Already shipped in d4b9288; unchanged here.
    scoring_pool_fuzzy: int = 500
        Already shipped in d4b9288; unchanged here. The fuzzy scoring pool
        now applies to fuzzy_keyword matches (A.6).
    per_query_share: float = 0.5
        L1 quota: fraction of max_results reserved across per-query slots.
        Set to 0 to disable L1.
    keyword_quota_halving: float = 0.5
        L2 quota: each new dominant keyword's slot allowance shrinks
        geometrically. Set to 1.0 to disable L2.
    recall_default_max_results: int = 30
        Default max_results the agent's recall_memory tool exposes to the
        LLM (and the /memory/context API).

## What is explicitly NOT touched

- missing_signal_penalty (still 0.5)
- effective_importance / temporal decay
- graph_expand
- the geometric-mean seed_score merge
- fuzzy_search itself (still keyword-blind for quick_search / /memory/search consumers)
- the agent loop, the typed final-answer contract, the wiki pipeline, the scheduler

No IDF was added. The two-level quota plus the prompt nudge are sufficient for narrow-subject surfacing in our data; adding IDF on top would be bloat.
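For a feel of what the shipping defaults listed under Knobs imply, the arithmetic can be worked through directly. This is a sketch only: `RecallKnobs` is a hypothetical container (the real settings live in braindb/config.py), and truncating the L2 allowance before applying the floor of 1 is an assumed rounding rule.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RecallKnobs:
    """Hypothetical holder for the defaults named in the Knobs section."""
    scoring_pool_keyword_neighbors: int = 500
    scoring_pool_fuzzy: int = 500
    per_query_share: float = 0.5
    keyword_quota_halving: float = 0.5
    recall_default_max_results: int = 30


def reserved_slots_per_query(knobs: RecallKnobs, num_queries: int) -> int:
    # L1: ceil(max_results * per_query_share / num_queries)
    if num_queries == 0:
        return 0
    total = knobs.recall_default_max_results * knobs.per_query_share
    return math.ceil(total / num_queries)


def keyword_allowances(knobs: RecallKnobs, count: int) -> List[int]:
    # L2: 50% / 25% / 12.5% ... of max_results, floor 1 (assumed truncation).
    return [
        max(1, int(knobs.recall_default_max_results
                   * knobs.keyword_quota_halving ** (k + 1)))
        for k in range(count)
    ]


knobs = RecallKnobs()
slots_for_three = reserved_slots_per_query(knobs, 3)
first_allowances = keyword_allowances(knobs, 5)
print(slots_for_three, first_allowances)
```

Under these assumptions, three queries each reserve 5 of the 30 slots, and successive dominant keywords are capped at 15, 7, 3, 1 and 1 entities.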