
ProsperCoded/RecVector


RecVector

AI-powered content-based recommendations for any existing database — without migrations, without infrastructure rewrites, without an LLM at query time.

RecVector is an open-source TypeScript SDK that adds personalised recommendation capabilities to applications by mapping your existing database schema to a vector similarity search layer. It reads your tables as-is, learns from user interactions, and serves sub-100ms recommendations via a pre-aggregated profile architecture.


Why RecVector

Most recommendation systems require either a fully managed service (with its costs and data privacy trade-offs), or a complete rethink of your database schema. RecVector takes a different approach: you describe your existing tables in a JSON schema file, and RecVector does the rest.

  • Zero migrations — your users, products, articles tables stay exactly as they are. RecVector adds only two small aggregate tables (rec_user_profiles, rec_entity_stats).
  • No LLM at query time — profiles are pre-computed at write time. recommend() reads one row then runs an HNSW nearest-neighbour search. Query cost is O(log N) regardless of how many users exist.
  • Pluggable everything — swap embedding models (OpenAI, Gemini, HuggingFace) and vector databases (Chroma today, Pinecone/Milvus tomorrow) without touching application code.
  • Visual schema editor — npx recvector studio opens a React Flow drag-and-drop UI that introspects your database and generates the schema JSON for you.

Repository Structure

packages/
├── sdk/          # Core TypeScript SDK + CLI binary (@recvector/sdk)
├── adapters/     # Database and vector DB adapters (@recvector/adapters)
└── gui/          # React + Vite + React Flow studio (pre-built into sdk package)

The gui package is never installed on a developer's machine. It is pre-built by Vite and bundled as static files inside @recvector/sdk, served by the CLI's local Express server.


Packages

@recvector/sdk

The core SDK. Provides createRecEngine(), the full RecEngineClient interface, CLI commands, schema validation, and the recommendation engine itself.

Full documentation →

@recvector/adapters

Concrete implementations of the storage, vector DB, and embedding model interfaces defined in the SDK.

  • KnexStorageAdapter — reads your existing DB via Knex.js (PostgreSQL, MySQL, SQLite)
  • ChromaVectorClient — connects to a Chroma vector database
  • OpenAIEmbeddingModel, GeminiEmbeddingModel, HuggingFaceEmbeddingModel — embedding providers

Full documentation →


Architecture

Four Logical Entities

| Entity | Description | Example |
| --- | --- | --- |
| User | Who receives recommendations | users table |
| Entity | Items being recommended | products, videos, articles |
| Feature | Entity attributes that drive similarity | tags, category, description |
| Interaction | User–entity signals with weights | views (0.3), likes (1.0), purchases (1.0) |

Profile Update Strategies

Two strategies are available, set via aggregation.profile_update_strategy in your schema JSON.
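For illustration, the relevant fragment of rec_schema.json looks like this (profile_update_strategy is the only field shown; the full schema shape is what npx recvector studio generates for your tables):

```json
{
  "aggregation": {
    "profile_update_strategy": "full_recompute"
  }
}
```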

full_recompute (default)

Best for accuracy. Re-derives the profile from the full interaction history on each update.

  1. logInteraction() increments a counter in rec_user_profiles
  2. When the counter hits the batch threshold (or time fallback expires) → updateUserProfile() fires asynchronously
  3. Fetches the last 100 interactions, retrieves their stored entity vectors from the vector DB, applies a weighted average with time decay, and upserts the profile
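The recompute step above can be sketched as follows. This is a hypothetical helper, not the SDK's actual implementation — it shows the shape of a weighted average with exponential time decay over stored entity vectors (the half-life constant is an assumption):

```typescript
// One row per recent interaction, joined with its stored entity vector.
interface InteractionRow {
  vector: number[];  // entity vector fetched from the vector DB (fetchByIds)
  weight: number;    // interaction weight, e.g. view = 0.3, purchase = 1.0
  ageDays: number;   // age of the interaction, drives time decay
}

function recomputeProfile(rows: InteractionRow[], halfLifeDays = 30): number[] {
  const dims = rows[0].vector.length;
  const profile = new Array<number>(dims).fill(0);
  let totalWeight = 0;

  for (const row of rows) {
    // Exponential decay: an interaction halfLifeDays old counts half as much.
    const decayed = row.weight * Math.pow(0.5, row.ageDays / halfLifeDays);
    totalWeight += decayed;
    for (let d = 0; d < dims; d++) profile[d] += decayed * row.vector[d];
  }

  for (let d = 0; d < dims; d++) profile[d] /= totalWeight;

  // L2-normalise so cosine similarity against entity vectors stays stable.
  const norm = Math.hypot(...profile);
  return profile.map((x) => x / norm);
}
```

The result is upserted into rec_user_profiles as the new profile vector.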

incremental

Best for high-frequency interaction pipelines. O(D) hot-path — no history reads.

  1. logInteraction() immediately nudges the existing profile vector toward the interacted entity's vector using a weighted moving average
  2. No embedding API calls at interaction time
  3. Schedule periodic updateUserProfile() calls (e.g. nightly cron) to re-apply time decay and correct drift
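The nudge in step 1 can be sketched as a weighted moving average — again a hypothetical helper, not the SDK's code. It is O(D) in the vector dimension and uses the accumulated_weight column from rec_user_profiles as the running denominator:

```typescript
// Blend the existing profile toward the interacted entity's vector.
// No history reads and no embedding API calls — just arithmetic.
function nudgeProfile(
  profile: number[],
  entityVector: number[],
  interactionWeight: number, // e.g. view = 0.3, purchase = 1.0
  accumulatedWeight: number, // running total stored in rec_user_profiles
): { profile: number[]; accumulatedWeight: number } {
  const newTotal = accumulatedWeight + interactionWeight;
  const alpha = interactionWeight / newTotal; // share of the new signal
  const updated = profile.map(
    (p, d) => (1 - alpha) * p + alpha * entityVector[d],
  );
  return { profile: updated, accumulatedWeight: newTotal };
}
```

The periodic updateUserProfile() pass then re-applies time decay, which this hot-path update deliberately skips.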

Recommendation Path (shared)

recommend(userId) →
  load profile row (1 DB read) →
  HNSW query in vector DB (O(log N)) →
  score: α × similarity + β × log(1 + popularity) →
  optional MMR re-ranking for diversity →
  return top K
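The scoring step can be sketched like this (the default α and β values here are illustrative assumptions, not the SDK's actual constants):

```typescript
// A candidate returned by the HNSW query, joined with rec_entity_stats.
interface Candidate {
  entityId: string;
  similarity: number; // cosine similarity to the user's profile vector
  popularity: number; // global interaction count from rec_entity_stats
}

function scoreCandidates(
  candidates: Candidate[],
  alpha = 0.8, // weight on personal relevance
  beta = 0.2,  // weight on global popularity
  topK = 10,
): { entityId: string; score: number }[] {
  return candidates
    .map((c) => ({
      entityId: c.entityId,
      // α × similarity + β × log(1 + popularity)
      score: alpha * c.similarity + beta * Math.log1p(c.popularity),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

The log dampening keeps very popular items from drowning out personal relevance.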

Performance target: < 100ms p95 at 1M entities with pre-computed profiles.

Why fetchByIds instead of re-embedding

Entity vectors are embedded once at upsertEntity() time and stored in the vector DB. Profile computation reuses those exact stored vectors. Re-embedding from SQL features at profile-update time would produce subtly different values due to embedding API non-determinism, placing the profile in a shifted coordinate space relative to the entity vectors it is compared against.


Quick Start

pnpm add @recvector/sdk @recvector/adapters
# also install your DB driver: pg | mysql2 | better-sqlite3

Configure (recvector.config.ts at your project root):

import { defineConfig } from '@recvector/sdk'

export default defineConfig({
  db: {
    client: 'postgresql',
    connection: process.env.DATABASE_URL,
  },
  schemaPath: './rec_schema.json',
  vectorDb: {
    type: 'chroma',
    url: 'http://localhost:8000',
    collection: 'my-app',
  },
  embeddingModel: {
    provider: 'openai',
    model: 'text-embedding-3-small',
    dimensions: 1536,
    apiKey: process.env.OPENAI_API_KEY,
  },
})

Map your schema (visual editor or hand-write rec_schema.json):

npx recvector studio

Bootstrap (embed all existing entities + build profiles from history):

npx recvector bootstrap --concurrency 10

Use in your application:

import { createRecEngine } from '@recvector/sdk'

const rec = await createRecEngine()

// Log an interaction when a user engages
await rec.logInteraction({
  userId: 'user_123',
  entityId: 'product_456',
  type: 'purchase',
})

// Get personalised recommendations
const recs = await rec.recommend({
  userId: 'user_123',
  topK: 10,
  lambda: 0.7,       // MMR diversity (1.0 = pure relevance)
  exploration: 0.05, // slight randomness for feed variation
})
// → [{ entityId: 'product_789', score: 0.94 }, ...]

Performance Targets

| Operation | Target |
| --- | --- |
| Recommendation (p95) | < 100ms at 1M entities with pre-computed profile |
| Entity embedding | < 500ms per entity |
| Batch sync throughput | > 1,000 entities/min |
| Profile update | < 500ms |
| Storage per profile | ~6KB (6GB at 1M users) |
| Tested scale | 1–5M entities on a single Chroma node |

SDK-Managed Tables

RecVector creates two aggregate tables in your existing database. No other tables are touched.

| Table | Columns | Purpose |
| --- | --- | --- |
| rec_user_profiles | user_id (PK), embedding (JSON), last_updated, version, interaction_count_since_update, accumulated_weight | Pre-computed user profile vectors |
| rec_entity_stats | entity_id (PK), feedback_counts (JSON), version | Global popularity aggregates |

Development

This is a pnpm workspace. Always use pnpm.

# Install dependencies
pnpm install

# Build all packages (GUI → SDK → CJS)
cd packages/sdk && pnpm build

# Run SDK tests
cd packages/sdk && pnpm test

# Start Studio for local development
npx recvector studio

License

MIT
