Commit 17427fc

docs: per-backend guides for Postgres, Qdrant, and Pinecone vector stores
Three new guides covering setup, configuration, and troubleshooting:

- POSTGRES_BACKEND.md: pgvector setup, hybrid search (HNSW + tsvector RRF), multi-tenancy, cloud providers
- QDRANT_BACKEND.md: Docker/Cloud setup, BM25 hybrid search, scaling (sharding, quantization), sidecar SQLite
- PINECONE_BACKEND.md: namespace isolation, metadata filtering, limitations, migration to self-hosted, cost comparison
1 parent a7191ae commit 17427fc

3 files changed

Lines changed: 512 additions & 0 deletions

docs/PINECONE_BACKEND.md

Lines changed: 153 additions & 0 deletions
# Pinecone Backend

The Pinecone backend stores embeddings in [Pinecone](https://www.pinecone.io/), a fully managed vector database. This is the simplest backend to set up — no infrastructure to manage — but it has limitations compared to self-hosted options.

## Prerequisites

| Requirement | Notes |
|---|---|
| Pinecone account | Free tier available at [pinecone.io](https://www.pinecone.io/) |
| API key | Found in the Pinecone console under "API Keys" |
| Index created | Create an index in the console or via the Pinecone API |
| Node.js 18+ | Uses native `fetch` (no SDK dependency) |
## Configuration

```typescript
import { PineconeVectorStore } from '@framers/agentos/rag/implementations/vector_stores/PineconeVectorStore';

const store = new PineconeVectorStore({
  id: 'my-pinecone',
  type: 'pinecone',
  apiKey: process.env.PINECONE_API_KEY!,
  indexHost: 'https://my-index-abc123.svc.aped-1234.pinecone.io',
  namespace: 'agent-default',
  defaultDimension: 1536,
});

await store.initialize();
```

### Configuration options

| Option | Type | Default | Description |
|---|---|---|---|
| `apiKey` | `string` | **required** | Pinecone API key |
| `indexHost` | `string` | **required** | Data plane URL for your index (from the Pinecone console) |
| `namespace` | `string` | `''` | Default namespace; collections map to namespaces |
| `defaultDimension` | `number` | `1536` | Embedding dimensions (must match the index) |

The `indexHost` is the **data plane** endpoint for a specific index — not the control plane URL. Find it in the Pinecone console under your index details. It looks like `https://my-index-abc123.svc.aped-1234.pinecone.io`.
## Namespace-based collection isolation

Pinecone namespaces are used as "collections". Each namespace is fully isolated within the same index:

```typescript
// Agent A's memories
await store.upsert('agent-alice', documents);

// Agent B's memories — completely separate namespace
await store.upsert('agent-bob', documents);

// Query only Agent A's namespace
await store.query('agent-alice', embedding, { topK: 10 });
```

Namespaces are created implicitly on first upsert. `createCollection()` is a no-op.
## Metadata filtering

Pinecone supports MongoDB-style metadata filter operators. AgentOS translates its unified `MetadataFilter` format to Pinecone's native syntax:

```typescript
const results = await store.query('my-namespace', embedding, {
  topK: 10,
  filter: {
    type: { $eq: 'semantic' },              // Equality
    importance: { $gte: 0.5 },              // Range
    tags: { $in: ['project', 'decision'] }, // Set membership
  },
});
```

### Supported operators

| Operator | Description | Example |
|---|---|---|
| `$eq` | Equal to | `{ status: { $eq: 'active' } }` |
| `$ne` | Not equal to | `{ status: { $ne: 'deleted' } }` |
| `$gt`, `$gte` | Greater than (or equal) | `{ score: { $gt: 0.8 } }` |
| `$lt`, `$lte` | Less than (or equal) | `{ age: { $lt: 30 } }` |
| `$in` | In set | `{ type: { $in: ['a', 'b'] } }` |
| `$nin` | Not in set | `{ type: { $nin: ['x'] } }` |
| `$exists` | Field exists | `{ tags: { $exists: true } }` |

Metadata values must be strings, numbers, booleans, or string arrays. Complex objects are JSON-stringified before storage.
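The value coercion described above can be sketched as a small helper. This is a hypothetical illustration — `sanitizeMetadata` is not part of the AgentOS API, and the real internals may differ:

```typescript
// Sketch of the metadata coercion rule: scalars and string arrays pass
// through unchanged; anything else is JSON-stringified so Pinecone
// will accept it. Hypothetical helper, not the actual AgentOS code.
type PineconeMeta = string | number | boolean | string[];

function sanitizeMetadata(
  metadata: Record<string, unknown>,
): Record<string, PineconeMeta> {
  const out: Record<string, PineconeMeta> = {};
  for (const [key, value] of Object.entries(metadata)) {
    if (
      typeof value === 'string' ||
      typeof value === 'number' ||
      typeof value === 'boolean'
    ) {
      out[key] = value;
    } else if (Array.isArray(value) && value.every((v) => typeof v === 'string')) {
      out[key] = value as string[];
    } else {
      // Complex objects (nested records, mixed arrays) become JSON strings.
      out[key] = JSON.stringify(value);
    }
  }
  return out;
}
```

Note that stringified objects can only be matched with `$eq` on the exact JSON string, so flatten any field you intend to filter on.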
## Limitations

### No hybrid search

Pinecone requires a separate sparse encoder (e.g., SPLADE) for hybrid search. The AgentOS `hybridSearch()` method falls back to dense-only search on Pinecone. For true hybrid search, use the Postgres or Qdrant backends.

### No knowledge graph

There is no sidecar storage for knowledge graph data. If you enable `graph: true` in `MemoryConfig` with the Pinecone backend, graph data is not persisted.

### Not self-hostable

Pinecone is a managed service only. You cannot run it on your own infrastructure. If self-hosting is a requirement, use Qdrant or Postgres.

### Batch size limit

Pinecone limits upserts to 100 vectors per request. AgentOS handles this automatically by splitting batches, but large ingestion jobs will make many sequential API calls.
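The splitting behavior can be approximated with a small helper — a sketch of the idea, not the actual AgentOS implementation:

```typescript
// Illustrative sketch of the batch splitting described above. Splits an
// array into chunks of at most `size` items so each upsert stays under
// the per-request vector limit. Hypothetical helper for illustration.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 250 documents would become 3 sequential upserts of 100, 100, and 50:
// for (const batch of chunk(documents, 100)) {
//   await store.upsert('my-namespace', batch);
// }
```

Because the calls are sequential, ingestion time grows linearly with corpus size; budget accordingly for large jobs.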
### No `deleteAll` count

`delete({ deleteAll: true })` returns `deletedCount: -1` because Pinecone's API does not report how many vectors were deleted in a bulk operation.

## Migrating from Pinecone to self-hosted backends

Use the AgentOS migration engine to move data from Pinecone to Postgres or Qdrant:

```typescript
import { MigrationEngine } from '@framers/agentos/rag/migration/MigrationEngine';

await MigrationEngine.migrate({
  from: {
    type: 'pinecone',
    // PineconeSourceAdapter uses indexHost + apiKey + namespace
    url: 'https://my-index-abc123.svc.aped-1234.pinecone.io',
    apiKey: process.env.PINECONE_API_KEY!,
  },
  to: {
    type: 'postgres',
    connectionString: 'postgresql://postgres:wunderland@localhost:5432/agent_memory',
  },
  batchSize: 100,
  onProgress: (done, total, table) => {
    console.log(`[${table}] ${done}/${total}`);
  },
});
```

The migration reads vectors via Pinecone's `list` + `fetch` APIs and writes them to the target backend. Non-vector data (knowledge graph, conversations) is not stored in Pinecone and will not be migrated.
## Cost comparison

| Tier | Vectors | Monthly cost | Notes |
|---|---|---|---|
| **Starter (free)** | 100K | $0 | 1 index, 1 project, community support |
| **Standard** | 1M+ | ~$70+ | Multiple indexes, backup, 99.95% SLA |
| **Enterprise** | 10M+ | Custom | Dedicated infra, HIPAA, SOC2 |

Self-hosted alternatives for comparison:

| Backend | Vectors | Monthly cost | Notes |
|---|---|---|---|
| **Postgres + pgvector** | 10M+ | ~$15 (Neon free) to ~$50 (RDS) | Full SQL, hybrid search included |
| **Qdrant (Docker)** | 10M+ | Cost of your VM (~$5-20) | Built-in BM25, quantization |
| **Qdrant Cloud** | 1M+ | ~$25+ | Managed Qdrant, auto-scaling |

Pinecone is the easiest to start with but becomes expensive at scale. For production agents processing large knowledge bases, self-hosted Postgres or Qdrant offer better cost efficiency and more features (hybrid search, knowledge graph).

docs/POSTGRES_BACKEND.md

Lines changed: 173 additions & 0 deletions
# Postgres + pgvector Backend

The Postgres backend stores embeddings, metadata, and full-text content in a single relational database using the [pgvector](https://github.com/pgvector/pgvector) extension. This gives you ACID transactions, hybrid search (dense vectors + full-text ranking in one query), and JSONB metadata filtering — all without a separate vector service.

## Prerequisites

| Requirement | Minimum version |
|---|---|
| PostgreSQL | 14+ (15+ recommended for the `HNSW` index type) |
| pgvector extension | 0.5.0+ (`CREATE EXTENSION vector`) |
| Node.js | 18+ (uses the `pg` npm package) |
## Quick start — Docker

```bash
docker run -d \
  --name agentos-pgvector \
  -e POSTGRES_PASSWORD=wunderland \
  -p 5432:5432 \
  pgvector/pgvector:pg16

# Verify
psql postgresql://postgres:wunderland@localhost:5432/postgres \
  -c "CREATE EXTENSION IF NOT EXISTS vector; SELECT extversion FROM pg_extension WHERE extname='vector';"
```

The `pgvector/pgvector` image ships with the extension pre-installed. No manual compilation needed.
## Manual setup

If you are using an existing Postgres instance (self-hosted or managed), install pgvector manually:

```sql
-- Run as a superuser or a user with CREATE EXTENSION privilege.
CREATE EXTENSION IF NOT EXISTS vector;
```

AgentOS creates its own tables on first use. The schema looks like:

```sql
CREATE TABLE IF NOT EXISTS "<prefix>my_collection" (
  id TEXT PRIMARY KEY,
  embedding vector(1536),  -- pgvector column
  metadata_json JSONB,     -- GIN-indexed for filtering
  text_content TEXT,       -- raw text for hybrid search
  tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', COALESCE(text_content, ''))) STORED,
  created_at BIGINT NOT NULL,
  updated_at BIGINT
);

-- Indexes created automatically:
-- 1. HNSW index for approximate nearest neighbor search
-- 2. GIN index on metadata_json for JSONB filtering
-- 3. GIN index on tsv for full-text search
```
## Configuration

```typescript
import { PostgresVectorStore } from '@framers/agentos/rag/implementations/vector_stores/PostgresVectorStore';

const store = new PostgresVectorStore({
  id: 'my-pg-store',
  type: 'postgres',
  connectionString: 'postgresql://postgres:wunderland@localhost:5432/agent_memory',
  poolSize: 10,               // Connection pool size (default: 10)
  defaultDimension: 1536,     // Default embedding dimensions (default: 1536)
  similarityMetric: 'cosine', // 'cosine' | 'euclidean' | 'dotproduct'
  tablePrefix: 'agent1_',     // Optional prefix for multi-tenancy
});

await store.initialize();
```

### Configuration options

| Option | Type | Default | Description |
|---|---|---|---|
| `connectionString` | `string` | **required** | Standard Postgres connection URI |
| `poolSize` | `number` | `10` | Max concurrent connections in the pool |
| `defaultDimension` | `number` | `1536` | Embedding vector dimensions for new collections |
| `similarityMetric` | `string` | `'cosine'` | Distance function: `cosine`, `euclidean`, or `dotproduct` |
| `tablePrefix` | `string` | `''` | Table name prefix for multi-tenant deployments |
## Hybrid search

The Postgres backend is the only backend that supports true **single-query hybrid search**: pgvector HNSW for dense vectors and PostgreSQL tsvector for lexical matching, fused with Reciprocal Rank Fusion (RRF) in a single SQL statement.

```typescript
const results = await store.hybridSearch(
  'my_collection',
  queryEmbedding,
  'natural language query text',
  {
    topK: 10,
    rrfK: 60, // RRF constant (default: 60)
  },
);
```

How it works internally:

1. **Dense CTE**: finds top candidates by pgvector HNSW distance (`<=>` for cosine).
2. **Lexical CTE**: finds top candidates by `ts_rank()` against the `tsvector` column.
3. **Fusion CTE**: merges both result sets with `1/(k + rank_dense) + 1/(k + rank_lexical)`.
4. **Final join**: fetches full documents for the top fused results.

This avoids two separate queries and application-level fusion.
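The RRF formula in step 3 can be illustrated in isolation — a sketch of the scoring math only, not the SQL AgentOS actually emits:

```typescript
// Sketch of Reciprocal Rank Fusion as described in step 3 above. Given a
// document's 1-based rank in each result list (dense, lexical), the fused
// score is the sum of 1/(k + rank) terms. A document absent from a list
// contributes no term for it. Hypothetical helper for illustration.
function rrfScore(
  ranks: Array<number | undefined>, // rank in each result list, if present
  k = 60,                           // the rrfK constant (default 60)
): number {
  return ranks.reduce<number>(
    (sum, rank) => (rank === undefined ? sum : sum + 1 / (k + rank)),
    0,
  );
}

// A doc ranked #1 in both lists outscores a doc ranked #1 in only one:
// rrfScore([1, 1]) = 2/61, rrfScore([1, undefined]) = 1/61.
```

The large constant `k` dampens the influence of top ranks, so a document that appears in both lists at moderate rank can beat one that tops a single list.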
## Multi-tenancy via table prefixes

For SaaS deployments where each tenant needs isolated data:

```typescript
// Tenant A
const storeA = new PostgresVectorStore({
  // ...
  tablePrefix: 'tenant_a_',
});

// Tenant B
const storeB = new PostgresVectorStore({
  // ...
  tablePrefix: 'tenant_b_',
});
```

Each prefix creates a separate set of tables: `"tenant_a_my_collection"`, `"tenant_a__collections"`, etc. Alternatively, use Postgres schemas (`SET search_path`) for stronger isolation.
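Deriving prefixes from tenant identifiers can be sketched as follows — `tenantPrefix` and `tableNameFor` are hypothetical helpers, not part of AgentOS:

```typescript
// Hypothetical helpers following the prefix convention above. Tenant IDs
// are normalized to safe lowercase identifiers, and table names follow
// the "<prefix><collection>" pattern the schema section describes.
function tenantPrefix(tenantId: string): string {
  // e.g. "Tenant-A" -> "tenant_a_"
  return `${tenantId.toLowerCase().replace(/[^a-z0-9]+/g, '_')}_`;
}

function tableNameFor(tenantId: string, collection: string): string {
  return `${tenantPrefix(tenantId)}${collection}`;
}

// tableNameFor('Tenant-A', 'my_collection') === 'tenant_a_my_collection'
```

Normalizing the prefix up front avoids quoting surprises, since the prefix becomes part of a SQL identifier.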
## Cloud providers

Any managed Postgres with pgvector works. Just set the connection string:

| Provider | Connection string example |
|---|---|
| **Neon** | `postgresql://user:pass@ep-cool-grass-123456.us-east-2.aws.neon.tech/neondb?sslmode=require` |
| **Supabase** | `postgresql://postgres:pass@db.xyzabc.supabase.co:5432/postgres` |
| **AWS RDS** | `postgresql://postgres:pass@mydb.cluster-xyz.us-east-1.rds.amazonaws.com:5432/mydb` |
| **Google Cloud SQL** | `postgresql://postgres:pass@/mydb?host=/cloudsql/project:region:instance` |
| **Azure Flexible Server** | `postgresql://postgres:pass@myserver.postgres.database.azure.com:5432/mydb?sslmode=require` |

All of these support pgvector. Neon and Supabase have it pre-installed. For RDS, enable the `pgvector` extension in the parameter group.
## Troubleshooting

### `ERROR: could not open extension control file "vector"`

pgvector is not installed. On managed services, check that the extension is enabled in your database configuration. For self-hosted:

```bash
# Ubuntu/Debian
sudo apt install postgresql-16-pgvector

# macOS (Homebrew)
brew install pgvector
```

Then run `CREATE EXTENSION vector;` as a superuser.

### `ERROR: different vector dimensions`

You changed `defaultDimension` after creating a collection. pgvector enforces dimension constraints at the column level. Drop and recreate the collection, or create a new collection with the correct dimension.

### Connection refused / timeout

- Verify the connection string host, port, and credentials.
- Check that `pg_hba.conf` allows connections from your IP.
- For Docker: ensure `-p 5432:5432` is set and the container is running.
- For cloud: check firewall / security group rules.

### Pool exhaustion (`too many clients already`)

Increase `poolSize` in the config, or reduce concurrent usage. The default of 10 is usually sufficient for single-agent deployments. Multi-agent setups may need 20-50.
