Skip to content

promarsal/agentic-postgres-demo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

10 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Agentic Postgres

The first PostgreSQL database purpose-built for AI agents.

Agentic Postgres - AI-Native Database

One database with vector search, hybrid search, time-series, and AI capabilities built in.

Replace Pinecone + Elasticsearch + InfluxDB with intelligent Postgres.

πŸ“– Why Agentic Postgres? (Click to expand)

When a user asks "Why did sales drop last month?" that simple question triggers:

  • β†’ SQL queries for structured data
  • β†’ Time-series analysis to identify trends
  • β†’ Keyword search for exact mentions
  • β†’ Semantic search to understand meaning
  • β†’ Hybrid search combining multiple techniques
  • β†’ Memory updates to store insights

Traditional approach: Stitch together PostgreSQL + InfluxDB + Elasticsearch + Pinecone + OpenAI API

Agentic Postgres: All of this in ONE database. No external services. No data movement.

What gets replaced:

  • ❌ Pinecone β†’ βœ… pgvectorscale (DiskANN)
  • ❌ Elasticsearch β†’ βœ… pg_textsearch (BM25)
  • ❌ InfluxDB β†’ βœ… TimescaleDB
  • ❌ Embedding service β†’ βœ… pgai
  • ❌ App logic β†’ βœ… plpgsql

Quick Start

Get Started in 3 Commands

# Install Tiger CLI
curl -fsSL https://cli.tigerdata.com | sh

# Authenticate
tiger auth login

# Install MCP
tiger mcp install

Run the Demo

# Clone and install
git clone https://github.com/promarsal/agentic-postgres-demo.git
cd agentic-postgres-demo
npm install

# Configure
echo "OPENAI_API_KEY=sk-..." > .env
echo "DATABASE_URL=postgresql://..." >> .env  # Get from: tiger mcp install output

# Setup database and populate data
npm run build && npm run setup
npm run populate-embeddings

πŸ‘‰ See QUICKSTART.md for detailed 5-minute setup guide.


Try the Investigation

Run these 6 questions to see all capabilities in action:

# Q1: SQL Analytics + TimescaleDB
npm run dev "Sales dropped yesterday compared to last week - why?"

# Q2: Hybrid Search (BM25 + Vector + RRF) 🌟
npm run dev "What are customers saying about Premium Wireless Headphones?"

# Q3: Semantic Search (pgvectorscale)
npm run dev "Are other products showing similar quality issues?"

# Q4: SQL Joins + Customer Analysis
npm run dev "Which customers bought Premium Wireless Headphones and left negative feedback?"

# Q5: Agent Memory (RAG)
npm run dev "Based on what we've learned, what should I do immediately?"

# Q6: Self-Observability
npm run dev "Show me how you figured this out - what was your investigation process?"

Each question takes 10-30 seconds and demonstrates different Agentic Postgres capabilities.

πŸ“Š See Detailed Examples (Click to expand)

1. "Why did sales drop?"

Capability: TimescaleDB + Multi-Query Analysis

SELECT product_name, 
       SUM(CASE WHEN order_date = CURRENT_DATE - 1 THEN amount ELSE 0 END) AS yesterday_sales,
       SUM(CASE WHEN order_date = CURRENT_DATE - 8 THEN amount ELSE 0 END) AS last_week_sales
FROM orders
GROUP BY product_name
ORDER BY change_amount;

Agent discovers: Premium Wireless Headphones dropped from $3,299.89 β†’ $0


2. "What are customers saying?"

Capability: Hybrid Search (BM25 + Vector + RRF)

WITH semantic_search AS (
  -- Vector similarity using pgvectorscale
  SELECT id, feedback_text, sentiment,
         1 - (embedding <=> query_vector) as similarity,
         ROW_NUMBER() OVER (ORDER BY embedding <=> query_vector) as rank
  FROM user_feedback WHERE embedding IS NOT NULL
),
fulltext_search AS (
  -- BM25 keyword search using PostgreSQL FTS
  SELECT id, ts_rank(...) as fts_rank,
         ROW_NUMBER() OVER (ORDER BY ts_rank(...) DESC) as rank
  FROM user_feedback
  WHERE to_tsvector(feedback_text) @@ websearch_to_tsquery(keywords)
),
combined AS (
  -- RRF: Combine rankings with 1/(rank+60) formula
  SELECT *, (1.0/(s.rank+60) + 1.0/(f.rank+60)) as rrf_score
  FROM semantic_search s FULL OUTER JOIN fulltext_search f USING (id)
)
SELECT * FROM combined ORDER BY rrf_score DESC LIMIT 15;

Agent finds: 15 customer complaints about defects (cushions falling apart, static noise, poor materials)


3. "Are other products affected?"

Capability: Semantic Search (pgvectorscale)

Agent analyzes: Semantically similar feedback across all products to identify patterns


4. "Which customers are at risk?"

Capability: SQL Joins + Customer Analysis

Agent identifies: 27 customers who bought the product and left negative feedback


5. "What should I do immediately?"

Capability: Agent Memory (RAG) + Cross-Query Synthesis

Agent recalls: All findings from previous investigations and synthesizes actionable recommendations


6. "How did you figure this out?"

Capability: Self-Observability

Agent reveals: Step-by-step investigation process, tools used, duration - querying its own history!


What You'll See

Beautiful, readable output showing the agent's investigation process:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
πŸ“Š STEP 1: SQL Query
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

πŸ’‘ Reason: To identify which products experienced a sales drop...

πŸ“ SQL:
──────────────────────────────────────────────────────────────
  SELECT product_name, 
         SUM(CASE WHEN order_date = CURRENT_DATE - 1 THEN amount ELSE 0 END)...
──────────────────────────────────────────────────────────────

βœ… Result: 5 rows returned

πŸ“Š Results:
   1. Premium Wireless Headphones | 0 | 3299.89 | -3299.89
   2. Smart Fitness Watch | 0 | 749.97 | -749.97
   ...

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
πŸ” STEP 2: Hybrid Search (BM25 + Vector)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

πŸ’‘ Reason: To find customer feedback related to quality issues...

πŸ” Semantic Query: complaints about Premium Wireless Headphones
πŸ”‘ Keywords: broken|defective|damaged|poor quality|static noise

πŸ“ SQL: Hybrid Search with Reciprocal Rank Fusion (RRF)
──────────────────────────────────────────────────────────────
  WITH semantic_search AS (
    -- Vector similarity using pgvectorscale
    ...
  ),
  fulltext_search AS (
    -- BM25 keyword search using PostgreSQL FTS
    ...
  )
  SELECT * FROM combined ORDER BY rrf_score DESC LIMIT 15;
──────────────────────────────────────────────────────────────

βœ… Found: 15 results

πŸ“Š Match Breakdown:
   β€’ Both (BM25 + Vector): 15
   β€’ Semantic only: 0
   β€’ Keyword only: 0

πŸ’¬ Sample Feedback:
   1. "Defective Premium Wireless Headphones. Cushions fell apart..." [negative]
   ...

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🧠 Agent Thinking
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Premium Wireless Headphones: Sales dropped from $3,299.89 to $0.
Root cause: Quality issues reported by customers...

══════════════════════════════════════════════════════════════════════
βœ…  INVESTIGATION COMPLETE
══════════════════════════════════════════════════════════════════════

Premium Wireless Headphones: Sales dropped from $3,299.89 to $0. 
Root cause: Quality issues reported by customers...

Notice: The agent shows every SQL query, search technique, and reasoning step!


The Magic: It's Just Postgres

🎯 Everything you just saw happened in ONE Postgres database:

  • βœ… Time-series queries (TimescaleDB)
  • βœ… Semantic search (pgvectorscale + pgai)
  • βœ… Full-text search (pg_textsearch BM25)
  • βœ… Hybrid search (combining both)
  • βœ… Memory & synthesis (stored with embeddings)
  • βœ… Self-analysis (querying own history)

No external services. No data movement. Just intelligent Postgres.


How It Works

CLI β†’ AI SDK Agent β†’ Tools β†’ Agentic Postgres

Stack:

  • Vercel AI SDK - Agent reasoning & tool calling (gpt-4o)
  • Agentic Postgres - PostgreSQL with extensions:
    • TimescaleDB - Time-series for agent_events hypertable
    • pgvector - Vector storage for embeddings
    • pgvectorscale - DiskANN index for fast similarity search
    • pgai - Generate embeddings in-database
    • pg_textsearch - Built-in BM25 full-text search
    • plpgsql - Procedural logic in the database

Agent Tools:

  • query_database - Execute SQL queries
  • hybrid_search - Combine FTS + vector search (RRF)
  • semantic_search_feedback - Pure vector similarity
  • fulltext_search - PostgreSQL FTS
  • store_insight - Save learnings to agent_memory
  • search_insights - Recall past learnings (RAG)
  • analyze_agent_performance - Meta-analysis

What Gets Set Up

npm run setup creates everything you need:

  • βœ… Extensions: timescaledb, vector, vectorscale, ai
  • βœ… Tables: 6 tables (products, orders, feedback, agent tracking)
  • βœ… Indexes: DiskANN (vector), GIN/FTS (text), TimescaleDB (time-series)
  • βœ… Demo Data: 5 products, 71 orders, 88 feedback entries
πŸ“‹ Click for detailed breakdown

Extensions

  • timescaledb - Time-series optimization
  • vector - Vector storage (pgvector)
  • vectorscale - DiskANN indexes (pgvectorscale)
  • ai - Generate embeddings in-database (pgai)

Tables

  • agent_questions - Investigation tracking
  • agent_events - Time-series log (TimescaleDB hypertable)
  • agent_memory - Stored insights with embeddings
  • products, orders, user_feedback - Demo business data

Indexes

  • DiskANN: Fast vector similarity search (pgvectorscale)
  • GIN/FTS: PostgreSQL full-text search (BM25)
  • TimescaleDB: Automatic time-series partitioning
  • B-tree: Standard indexes on dates/IDs

Demo Data

  • 5 products with realistic pricing
  • 71 orders over 14-day period (with intentional sales drop)
  • 88 customer feedback entries (27 complaints about quality)

πŸ‘‰ See INDEXES_AND_DATA.md for complete technical details.


Key Concepts

🎯 Hybrid Search (RRF) - The hero capability

Combines PostgreSQL FTS (keyword) with pgvectorscale (semantic):

  • Full-Text Search: GIN index, BM25 ranking
  • Vector Search: DiskANN index, fast ANN search
  • RRF Algorithm: Reciprocal Rank Fusion merges both for comprehensive results
  • Why powerful: Catches BOTH exact keywords AND semantically similar concepts
  • Replaces Elasticsearch + Pinecone with one query!

Example: Query "quality issues" finds both:

  • Exact matches: "quality control problems"
  • Semantic matches: "disappointed with purchase", "broke after two days"
🧠 Agent Memory (RAG) - Learning across sessions

Agent stores insights in agent_memory with embeddings:

  • Semantic search recalls past learnings
  • DiskANN enables fast similarity search
  • No separate vector database needed
  • Builds knowledge over time
πŸ“Š Observability - Full audit trail

Every action tracked in agent_events (TimescaleDB hypertable):

  • Step-by-step investigation tracking
  • Performance metrics (duration, tool usage)
  • Question tracking with final answers
  • Self-analysis capability
⏱️ TimescaleDB - Time-series at scale

Automatic partitioning for time-series data:

  • agent_events is a hypertable (partitioned by time)
  • Efficient queries on time ranges
  • Built-in time-series functions
  • Compression and retention policies

πŸ‘‰ Technical details: See INDEXES_AND_DATA.md for SQL code, index specs, and performance metrics.


Documentation

Guide Purpose Time
QUICKSTART.md Get running in 5 minutes ⚑ 5 min
TEST_COMMANDS.md Copy-paste test commands πŸ§ͺ Copy-paste
GETTING_STARTED.md Complete beginner guide πŸ“– 10 min
DEMO_QUERIES.md Demo script with talking points 🎀 Reference
INDEXES_AND_DATA.md Technical deep-dive (SQL, indexes, data) πŸ”§ Technical

Learn More


License

MIT


Questions? Check QUICKSTART.md or GETTING_STARTED.md
Presenting? See DEMO_QUERIES.md for talking points
Technical? See INDEXES_AND_DATA.md for deep dive

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published