Problem
Applying a fresh `schema/schema.sql` fails to create two HNSW indexes on halfvec columns:
- line 96: `CREATE INDEX IF NOT EXISTS warm_tier_embedding_idx ON warm_tier USING hnsw (embedding halfvec_cosine_ops);`
- line 435: `CREATE INDEX IF NOT EXISTS shared_memories_embedding_idx ON shared_memories USING hnsw (embedding halfvec_cosine_ops);`
Postgres errors with `ERROR: column does not have dimensions` because the halfvec columns are declared without a dimension spec (e.g., `embedding halfvec,` rather than `embedding halfvec(384),`).
Since schema.sql is applied via `psql -f` without `ON_ERROR_STOP`, psql continues after these errors. The tables exist but the HNSW indexes do not. Semantic search falls back to sequential scan — slow but functional.
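One way to make this failure visible is to have psql stop at the first error. The following is a minimal sketch, assuming a POSIX shell and connection settings supplied through the standard `PG*` environment variables. The `apply_schema` helper and the `PSQL` override are hypothetical names used for illustration and are not part of the project.

```shell
# Hypothetical helper: apply a schema file and abort on the first SQL error.
# PSQL can be overridden (for example in tests); it defaults to psql on PATH.
apply_schema() {
  schema_file="${1:-schema/schema.sql}"
  # ON_ERROR_STOP=1 makes psql exit with a non-zero status at the first
  # failing statement instead of continuing through the rest of the file.
  "${PSQL:-psql}" -v ON_ERROR_STOP=1 -f "$schema_file"
}
```

Calling `apply_schema` in CI would turn the two `CREATE INDEX` failures into a failed job rather than a silently incomplete schema.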
Evidence
First surfaced in the v3.0.0-beta.3 release workflow (run 24591387057) because `test-load` is PR-gated and had never been exercised until release. Integration tests have been passing throughout because they run with `EMBEDDING_PROVIDER=none`, so the missing indexes are never hit.
Latent since v2.7.0 when halfvec was introduced.
Constraints
The embedding dimension is provider-specific (`Xenova/bge-small-en-v1.5` = 384, OpenAI `text-embedding-3-small` = 1536, etc.) — not known at schema time. Hardcoding a default would be wrong for operators using different providers.
Options to consider
1. Move HNSW index creation out of schema.sql. Provide a post-install script or a management endpoint that ALTERs the halfvec column to the operator's configured dimension and creates the index. Document in DEPLOYMENT-SECURITY.md.
2. Provide a `schema/hnsw-indexes.sql` companion script that operators run after choosing a dimension. `schema.sql` stays dimensionless; index creation is opt-in.
3. Require `EMBEDDING_DIMENSIONS` at install time and template it into schema.sql via envsubst or similar. Adds a build step to installation.
Option 2 (the `schema/hnsw-indexes.sql` companion script) is the lowest-friction choice for users who don't use embeddings (the default) while still giving a clean path for those who do.
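As a rough illustration of what the companion step could look like, the sketch below generates the SQL for a chosen dimension. It assumes the column is named `embedding` in both tables (as in the failing statements above) and that altering the column type in place is acceptable; the ALTER is expected to fail if existing rows hold vectors of a different length. The `emit_hnsw_sql` function name is hypothetical.

```shell
# Hypothetical generator for the companion script described in option 2.
# Prints SQL that pins both halfvec columns to one dimension and then
# creates the two HNSW indexes that schema.sql currently fails to build.
emit_hnsw_sql() {
  dim="$1"
  # Accept only a positive integer so nothing else can be spliced into the SQL.
  case "$dim" in
    ''|*[!0-9]*|0*) echo "usage: emit_hnsw_sql <dimension>" >&2; return 1 ;;
  esac
  cat <<SQL
ALTER TABLE warm_tier ALTER COLUMN embedding TYPE halfvec($dim);
ALTER TABLE shared_memories ALTER COLUMN embedding TYPE halfvec($dim);
CREATE INDEX IF NOT EXISTS warm_tier_embedding_idx ON warm_tier USING hnsw (embedding halfvec_cosine_ops);
CREATE INDEX IF NOT EXISTS shared_memories_embedding_idx ON shared_memories USING hnsw (embedding halfvec_cosine_ops);
SQL
}
```

An operator using a 384-dimension model could then run `emit_hnsw_sql 384 | psql -v ON_ERROR_STOP=1`, while operators who don't use embeddings would skip the step entirely.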
Acceptance
- HNSW indexes are created for operators who use embeddings.
- No errors in CI when applying fresh `schema.sql`.
- Documentation clearly explains the dimension requirement and how to add indexes.
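The first criterion could be checked mechanically. The sketch below is one possible approach, assuming the index names stay as they appear in `schema.sql` and that connection settings come from the `PG*` environment variables; `check_hnsw_indexes` and the `PSQL` override are hypothetical names.

```shell
# Hypothetical check: succeed only if both HNSW indexes named in schema.sql exist.
check_hnsw_indexes() {
  # -A (unaligned) and -t (tuples only) make psql print just the bare count.
  found=$("${PSQL:-psql}" -At -c "SELECT count(*) FROM pg_indexes WHERE indexname IN ('warm_tier_embedding_idx', 'shared_memories_embedding_idx');") || return 1
  [ "$found" -eq 2 ]
}
```

A job that enables embeddings could call `check_hnsw_indexes` after the index step and fail if either index is missing.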