connect() hard-requires an embedder; implement the embedder=None bring-your-own-vectors mode #8

@thorwhalen

vd.connect() always builds an embedding function: when embedding_model is None, util._get_embedding_function constructs an imbed.Embed(). There is no supported "I will always supply vectors myself" mode.

Consequence: a pure vector-store consumer must inject a fake embedder. ef, vd's main consumer, does exactly this, passing a poison-pill _unused_embedding_model that raises RuntimeError if ever called (ef/source_manager.py). That is a workaround for a facade gap.

vd's own design constitution (.claude/CLAUDE.md §6) already specifies the intended behaviour:

No embedding model bound into Collection. … vd accepts an injectable embedder (or pre-computed vectors / embedder=None "bring-your-own-vectors" mode).

The embedder=None mode is specified but not implemented — None currently means "default to imbed", not "no embedder".

Proposal:

  • connect(backend, *, embedding_model=None, ...): None means no embedder, not "default to imbed".
  • With no embedder, __setitem__ / search given a str raises a clear error ("no embedder configured — pass a vector or set embedding_model"); passing vectors works normally.
  • Keep auto-embedding as an opt-in convenience when an embedder is explicitly given.
  • imbed stays an optional dependency, off the default path.
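
The proposed semantics can be sketched with a toy in-memory collection. This is a minimal sketch, not vd's implementation: ToyCollection, NoEmbedderError, the squared-distance ranking, and the limit parameter are all hypothetical, and connect() here only mirrors the proposed signature.

```python
from typing import Callable, Dict, List, Optional, Sequence, Union

Vector = Sequence[float]
Embedder = Callable[[str], Vector]


class NoEmbedderError(TypeError):
    """Hypothetical error: a str was given but no embedder is configured."""


class ToyCollection:
    """Toy stand-in for a vd collection (assumption, not the real class)."""

    def __init__(self, embedding_model: Optional[Embedder] = None):
        self._embed = embedding_model  # None means "no embedder"
        self._vectors: Dict[str, List[float]] = {}

    def _to_vector(self, value: Union[str, Vector]) -> List[float]:
        if isinstance(value, str):
            if self._embed is None:
                raise NoEmbedderError(
                    "no embedder configured: pass a vector or set embedding_model"
                )
            return list(self._embed(value))  # opt-in auto-embedding
        return list(value)  # bring-your-own-vectors path

    def __setitem__(self, key: str, value: Union[str, Vector]) -> None:
        self._vectors[key] = self._to_vector(value)

    def search(self, query: Union[str, Vector], limit: int = 3) -> List[str]:
        q = self._to_vector(query)

        def sq_dist(v: List[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(q, v))

        # Rank stored keys by squared Euclidean distance to the query.
        return sorted(self._vectors, key=lambda k: sq_dist(self._vectors[k]))[:limit]


def connect(backend: str = "memory", *, embedding_model: Optional[Embedder] = None):
    # Proposed behaviour: None is passed through as "no embedder",
    # so no default embedder is constructed and imbed is never imported.
    return ToyCollection(embedding_model)
```

With this shape, connect() followed by c["a"] = [0.0, 0.0] works without any embedder, c["x"] = "text" raises the clear error, and passing embedding_model explicitly restores auto-embedding for str inputs.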

This also lets ef drop its poison-pill workaround (see thorwhalen/ef companion issue).

This bears directly on the question of whether ef over-influenced vd. Here the influence runs the other way: vd's document-first embedding default forces ef to work around it. Fixing it also makes vd a cleaner independent facade for the vector-first paradigm.
