Feature: Add semantic search via Qdrant + Embedding #33

@shukebeta

Description

Implementation Plan

Phase 1 – Infrastructure

  • Set up Qdrant via Docker on local machine
  • Set up Ollama with the bge-m3 model (better CJK support; see Notes on validating with nomic-embed-text first)
  • Add QdrantUrl and OllamaUrl to appsettings.json
  • Add NuGet packages: Qdrant.Client, OllamaSharp

Phase 2 – Indexing

  • Create IEmbeddingService wrapping Ollama
  • Create IVectorSearchService wrapping Qdrant
  • Chunking strategy: index notes shorter than 500 characters as a single chunk; split longer notes by paragraph
  • One-off backfill script for existing ~30k notes
  • Hook into note create/update/delete to keep index in sync
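
A minimal sketch of the chunking rule above, written in Python for brevity rather than as the project's .NET code. The function name `chunk_note` and the definition of a paragraph (text separated by one or more blank lines) are assumptions; the issue only fixes the 500-character threshold and the "split by paragraph" rule.

```python
import re


def chunk_note(text: str, single_chunk_limit: int = 500) -> list[str]:
    """Return the chunks that would be embedded for one note.

    Notes shorter than `single_chunk_limit` characters stay whole;
    longer notes are split into paragraphs.
    """
    text = text.strip()
    if not text:
        return []  # nothing to index for an empty note
    if len(text) < single_chunk_limit:
        return [text]
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text)]
    return [p for p in paragraphs if p]
```

One thing this sketch leaves open: a long note with no blank lines still comes back as a single oversized chunk, so a real implementation may want a secondary split (for example by sentence or fixed length).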

Phase 3 – Search API

  • Add semantic search path to existing search endpoint
  • Return top N results by similarity score, filtered by userId
  • Merge with Manticore results: keyword hits ranked first, semantic hits appended
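
The merge step could look roughly like the sketch below (Python for brevity, not the project's .NET code). `merge_results` and its parameter shapes are hypothetical. De-duplicating by note id is also an assumption: because long notes are indexed as several chunks, the same note can appear more than once in the semantic hits and may already be present in the keyword hits.

```python
def merge_results(
    keyword_ids: list[int],
    semantic_hits: list[tuple[int, float]],
    limit: int,
) -> list[int]:
    """Merge note ids: keyword hits first, then semantic hits by score.

    `semantic_hits` is a list of (note_id, similarity_score) pairs,
    one per matching chunk.
    """
    merged: list[int] = []
    seen: set[int] = set()

    # Keyword (Manticore) hits keep their original order and rank first.
    for note_id in keyword_ids:
        if note_id not in seen:
            seen.add(note_id)
            merged.append(note_id)

    # Semantic hits are appended best-first, skipping notes already listed.
    for note_id, _score in sorted(semantic_hits, key=lambda h: h[1], reverse=True):
        if note_id not in seen:
            seen.add(note_id)
            merged.append(note_id)

    return merged[:limit]
```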

Phase 4 – Resilience

  • Graceful fallback to Manticore-only if Qdrant/Ollama is unreachable
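
A rough sketch of the fallback, again in Python for illustration only. `search_with_fallback` and the two callables are hypothetical stand-ins for the Manticore and Qdrant/Ollama paths, and treating `ConnectionError` and `TimeoutError` as "unreachable" is an assumption; the real .NET services would raise their own exception types.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)

SearchFn = Callable[[str], list[int]]


def search_with_fallback(
    query: str,
    keyword_search: SearchFn,
    semantic_search: SearchFn,
) -> list[int]:
    """Return note ids, degrading to keyword-only results on failure."""
    keyword_ids = keyword_search(query)
    try:
        semantic_ids = semantic_search(query)
    except (ConnectionError, TimeoutError) as exc:
        # Assumption: these exceptions mean Qdrant or Ollama is unreachable.
        logger.warning("Semantic search unavailable, keyword-only: %s", exc)
        semantic_ids = []
    # Keyword hits first, semantic hits appended, duplicates dropped in order.
    return list(dict.fromkeys(keyword_ids + semantic_ids))
```

Only the semantic call is guarded, so a Manticore failure still surfaces as an error instead of silently returning an empty result.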

Notes

  • Qdrant payload must include userId; filter on it in every query so that
    other users' notes are never exposed
  • isPrivate notes should also be filtered appropriately
  • Start with nomic-embed-text for quick validation; switch to bge-m3 if
    CJK quality needs improvement
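
To illustrate the userId rule, here is a small sketch that builds a request body for Qdrant's REST search endpoint with a mandatory `must` filter on the `userId` payload field. The helper name `build_search_body` and the integer id type are assumptions, and the code is Python for brevity rather than the project's Qdrant.Client usage. isPrivate handling is left out because the issue does not yet define the intended rule.

```python
from typing import Sequence


def build_search_body(
    query_vector: Sequence[float],
    user_id: int,
    limit: int = 10,
) -> dict:
    """Build a Qdrant search body that is always scoped to one user."""
    if user_id is None:
        # Refuse to build an unfiltered query rather than risk leaking notes.
        raise ValueError("user_id is required for every semantic query")
    return {
        "vector": list(query_vector),
        "limit": limit,
        "with_payload": True,
        "filter": {
            # Only points whose payload userId equals the caller's id match.
            "must": [{"key": "userId", "match": {"value": user_id}}]
        },
    }
```

Routing every query through one helper like this makes the "filter on every query" rule hard to forget, since no other code path constructs a search request.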

Metadata

Labels: enhancement (New feature or request)
