feat(knowledge): memory graph semantic search — query processor and hybrid scoring (#3384) #3609

Closed
mrveiss wants to merge 2 commits into Dev_new_gui from issue-3384

mrveiss commented Apr 6, 2026

Closes #3384

Implements Phase 1 (core infrastructure) and Phase 2 (hybrid scoring) from docs/design/MEMORY_GRAPH_SEMANTIC_SEARCH.md.

Summary

  • query_processor.py (MemoryGraphQueryProcessor): natural language query detection, intent extraction (entity type filters, time range, status), Redis FT.SEARCH query builder, result parsing and ranking
  • hybrid_scorer.py (HybridScorer): cosine similarity (embedding-based), BM25 text scoring, configurable weight blend (semantic_weight + keyword_weight), graceful degradation to keyword-only scoring when embeddings are unavailable
  • __init__.py: package re-exports the public API
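
For reviewers who want a feel for the query processor flow (detect natural language, extract intent, build an FT.SEARCH query string), here is a minimal sketch. It is not the code in query_processor.py: the function names, the word lists, the regex patterns, and the index field names `entity_type` and `created_at` are all assumptions made for illustration. Only the RediSearch query syntax (`(a|b)` for term union, `@field:{tag}` for tag filters, `@field:[min max]` for numeric ranges) is standard.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative word lists and patterns; none are taken from query_processor.py.
QUESTION_WORDS = {"what", "which", "who", "when", "where", "why", "how"}
COMMON_VERBS = {"is", "are", "was", "were", "did", "do", "does", "show", "find", "list", "fix"}
STOPWORDS = QUESTION_WORDS | {
    "the", "a", "an", "we", "in", "of", "for", "me",
    "did", "do", "does", "is", "are", "was", "were",
}
ENTITY_TYPE_PATTERNS = {
    "bug_fix": re.compile(r"\b(bug|bugs|fix|fixes|fixed)\b"),
    "feature": re.compile(r"\bfeatures?\b"),
    "decision": re.compile(r"\b(decision|decisions|decided)\b"),
    "task": re.compile(r"\btasks?\b"),
}


@dataclass
class QueryIntent:
    terms: List[str] = field(default_factory=list)
    entity_types: List[str] = field(default_factory=list)
    created_after: Optional[int] = None  # Unix timestamp in seconds


def _words(query: str) -> List[str]:
    return re.findall(r"[a-z0-9_]+", query.lower())


def looks_like_natural_language(query: str, min_words: int = 4) -> bool:
    """Heuristic only: question word or '?', or a common verb in a longer query."""
    words = _words(query)
    if not words:
        return False
    is_question = words[0] in QUESTION_WORDS or query.strip().endswith("?")
    has_verb = any(w in COMMON_VERBS for w in words)
    return is_question or (has_verb and len(words) >= min_words)


def extract_intent(query: str, created_after: Optional[int] = None) -> QueryIntent:
    """Pull search terms and entity-type filters out of the raw query text."""
    lowered = query.lower()
    entity_types = [name for name, pat in ENTITY_TYPE_PATTERNS.items() if pat.search(lowered)]
    terms = [w for w in _words(query) if w not in STOPWORDS]
    return QueryIntent(terms=terms, entity_types=entity_types, created_after=created_after)


def build_ft_query(intent: QueryIntent) -> str:
    """Assemble a RediSearch query string; field names are hypothetical."""
    parts = []
    if intent.terms:
        parts.append("(" + "|".join(intent.terms) + ")")
    if intent.entity_types:
        parts.append("@entity_type:{" + "|".join(intent.entity_types) + "}")
    if intent.created_after is not None:
        parts.append(f"@created_at:[{intent.created_after} +inf]")
    return " ".join(parts) if parts else "*"
```

The resulting string would be passed as the query argument to FT.SEARCH against whatever index the memory graph uses; the index name and schema are not shown here because this PR description does not specify them.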

Tests

58 unit tests, all passing (query_processor_test.py). Redis and embeddings fully mocked.

Design decisions

  • Hybrid score = semantic_weight × cosine_sim + keyword_weight × bm25 (weights configurable, default 0.7/0.3)
  • Natural language detection via heuristics (verb presence, question words, sentence length) — no LLM call required
  • Graceful degradation: missing embeddings fall back to keyword-only scoring without errors
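
To make the scoring rule above concrete, here is a small sketch of the blend and the fallback. It is an illustration under assumptions, not the HybridScorer implementation: the function names are hypothetical, the BM25 score is assumed to be already normalized to [0, 1] so that it is comparable with cosine similarity, and the keyword-only fallback is assumed to return the BM25 score unweighted.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_score(
    bm25: float,
    query_embedding: Optional[Sequence[float]] = None,
    doc_embedding: Optional[Sequence[float]] = None,
    semantic_weight: float = 0.7,
    keyword_weight: float = 0.3,
) -> float:
    """semantic_weight * cosine_sim + keyword_weight * bm25, with a fallback.

    Assumption: bm25 is pre-normalized to [0, 1]. When either embedding is
    missing, the score degrades to keyword-only instead of raising.
    """
    if query_embedding is None or doc_embedding is None:
        return bm25
    semantic = cosine_similarity(query_embedding, doc_embedding)
    return semantic_weight * semantic + keyword_weight * bm25
```

With the default 0.7/0.3 weights, identical embeddings and a normalized BM25 of 0.5 give 0.7 × 1.0 + 0.3 × 0.5 = 0.85, while the same candidate without an embedding scores 0.5.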

Out of scope (Phases 3–5)

Context integration, caching, query suggestions, and search analytics will be addressed in follow-up issues.

mrveiss commented Apr 6, 2026

Review fixes applied ✓

  • Critical: HybridScorer now accepts an optional redis_client in __init__; the client is cached on self._redis and reused across all candidates in score_and_rank, which eliminates N connections per call
  • Critical: _ENTITY_TYPE_PATTERNS updated to lowercase schema values (bug_fix, feature, decision, task, conversation) matching the ENTITY_TYPES frozenset in schema.py
  • Critical: get_related_entities now resolves entity name → UUID via get_entity_by_name(), then uses the memory:relations:out:{uuid} key and reads via redis.json().get() (matching the graph_store schema)
  • Important: hardcoded model_name="nomic-embed-text" replaced with config.get("knowledge.embedding_model", "nomic-embed-text")
  • Important: all datetime.now() calls replaced with datetime.now(tz=timezone.utc)
  • Critical: __init__.py unified to export both query processor/scorer symbols and graph_store/schema symbols (with try/except ImportError guard)
  • Tests updated: 3 entity-type assertions changed to lowercase; get_related_entities tests rewritten to mock get_entity_by_name + redis.json().get()

All 58 tests passing.
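
For context on the first and fifth fixes, the pattern looks roughly like the sketch below. This is not the actual HybridScorer: the class name, the `client_factory` parameter, and the `keyword_score` JSON field are hypothetical and exist only to make the example runnable. The parts taken from the fixes are the optional injected client cached on `self._redis`, the single client reused for every candidate in `score_and_rank`, the read via `json().get()`, and timezone-aware UTC timestamps.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Optional


class HybridScorerSketch:
    """Illustrative only; shows client injection and reuse, not real scoring."""

    def __init__(
        self,
        redis_client: Optional[Any] = None,
        client_factory: Optional[Callable[[], Any]] = None,  # hypothetical parameter
    ) -> None:
        self._redis = redis_client
        self._client_factory = client_factory

    def _get_redis(self) -> Any:
        # Create the client at most once, then keep it on the instance.
        if self._redis is None:
            if self._client_factory is None:
                raise RuntimeError("no Redis client or factory configured")
            self._redis = self._client_factory()
        return self._redis

    def score_and_rank(self, candidate_keys: List[str]) -> List[Dict[str, Any]]:
        client = self._get_redis()  # resolved once, shared by all candidates
        scored = []
        for key in candidate_keys:
            doc = client.json().get(key) or {}
            scored.append(
                {
                    "key": key,
                    # "keyword_score" is a hypothetical stored field.
                    "score": float(doc.get("keyword_score", 0.0)),
                    "scored_at": datetime.now(tz=timezone.utc),
                }
            )
        scored.sort(key=lambda item: item["score"], reverse=True)
        return scored
```

Passing the client in also keeps the unit tests simple, since a mock with a `json().get()` method can stand in for Redis without patching connection code.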

mrveiss added a commit that referenced this pull request Apr 6, 2026
…3609)

- autobot-backend: redis.service → redis-stack-server.service (was silently
  ignored by systemd); add Wants= + network-online.target
- autobot-celery: same redis fix; keep After=autobot-backend.service
- autobot-slm-backend: add redis-stack-server + postgresql dependencies
  (had none — could race both on every boot)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@mrveiss mrveiss closed this Apr 6, 2026

mrveiss commented Apr 6, 2026

Superseded by #3616 which consolidates all memory graph code into autobot_memory_graph.
