pg_llm - PostgreSQL LLM Integration Extension

This PostgreSQL extension enables direct integration with various Large Language Models (LLMs). It supports multiple LLM providers and features like session management, parallel inference, and more.

Features

  • Support for remote LLM providers
  • Support for locally hosted models
  • Dynamic model management (add/remove models at runtime)
  • Persistent model metadata
  • Integrated logging
  • Streaming responses
  • Session-based multi-turn conversations
  • Parallel inference across multiple models
  • Automatic selection of the highest-scoring (most confident) response, with fallback to a local model when confidence is low
  • Encryption of sensitive information
  • Audit logging
  • Session state management
  • SQL execution through extension functions, with LLM-generated optimization suggestions
  • Natural language queries: ask questions in plain language to retrieve database data
  • Report generation and intelligent analysis: built-in statistical analysis and visualization suggestions
  • Full observability: complete recording of the decision process and intermediate results
  • Accuracy improvements through a RAG knowledge base and feedback learning
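The confidence-scored selection with local fallback can be pictured with a small client-side sketch. Everything below (the response shape, the field names, and the 0.65 threshold) is illustrative only, not part of the extension's API:

```python
# Illustrative sketch of confidence-based model selection with local fallback.
# The response dict shape and the default threshold are assumptions.

def select_response(responses, local_response, threshold=0.65):
    """Pick the highest-scoring remote response, or fall back to the local model."""
    best = max(responses, key=lambda r: r["confidence"], default=None)
    if best is None or best["confidence"] < threshold:
        return local_response  # low confidence (or no answers): use the local model
    return best

responses = [
    {"model": "qianwen-chat", "confidence": 0.82, "text": "..."},
    {"model": "deepseek-chat", "confidence": 0.41, "text": "..."},
]
local = {"model": "deepseek-r1-local", "confidence": 0.50, "text": "..."}

print(select_response(responses, local)["model"])  # highest-confidence remote model
print(select_response([], local)["model"])         # no remote answers: local fallback
```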

Prerequisites

  • PostgreSQL 14.0 or later
  • C++20 compatible compiler
  • libcurl
  • OpenSSL
  • jsoncpp
  • CMake 3.15 or later (for CMake build)

Dependencies

The pg_llm extension depends on the following libraries:

  • OpenSSL: For secure connections to LLM APIs
  • cURL: For making HTTP requests to LLM APIs
  • JsonCpp: For parsing JSON responses from LLM APIs
  • PostgreSQL logging subsystem: For extension runtime logging

The build system will automatically check for these dependencies and provide instructions if they are missing.

Installation

  1. Configure PostgreSQL environment:
# Add PostgreSQL binaries to your PATH
# First, locate your PostgreSQL installation's bin directory:
# You can use the command:
which psql
# or
pg_config --bindir

# Then add the path to your shell configuration file
# For macOS (add to ~/.zshrc or ~/.bash_profile):
export PATH=/path/to/postgresql/bin:$PATH

# For Linux (add to ~/.bashrc):
export PATH=/path/to/postgresql/bin:$PATH

# Apply changes
source ~/.zshrc  # or ~/.bash_profile or ~/.bashrc
  2. Install dependencies:
# Ubuntu/Debian
sudo apt-get install postgresql-server-dev-all libcurl4-openssl-dev libjsoncpp-dev libssl-dev cmake pkg-config

# macOS
brew install postgresql curl jsoncpp openssl cmake pkg-config
  3. Build and install pg_llm:
cd pg_llm
mkdir build && cd build

# Configure (choose one of the following build types)
# Debug build
cmake -DCMAKE_BUILD_TYPE=Debug ..

# Release build
cmake -DCMAKE_BUILD_TYPE=Release ..

# Address Sanitizer build
cmake -DCMAKE_BUILD_TYPE=ASan ..

# Build
make

# Install
sudo make install
  4. Configure PostgreSQL to load the extension:

Since pg_llm implements the _PG_init function for initialization, it must be loaded via shared_preload_libraries. Add the following to your postgresql.conf file:

# Add pg_llm to shared_preload_libraries
shared_preload_libraries = 'pg_llm'
  5. Restart PostgreSQL to load the extension:
# For systemd-based systems
sudo systemctl restart postgresql

# For macOS
brew services restart postgresql

# For other systems
pg_ctl restart -D /path/to/data/directory
  6. Create the extension in your database:
-- Enable the extensions (pgvector first, then pg_llm)
CREATE EXTENSION vector;
CREATE EXTENSION pg_llm;

-- Verify installation
SELECT * FROM pg_available_extensions WHERE name = 'pg_llm';

Usage

Adding Models

  1. Alibaba Tongyi Qianwen:
SELECT pg_llm_add_model(
    'qianwen',
    'qianwen-chat',
    'your-api-key',
    '{
        "model_name": "qwen-turbo",
        "api_endpoint": "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation",
        "access_key_id": "your-access-key-id",
        "access_key_secret": "your-access-key-secret"
    }'
);

Single-turn Chat

SELECT pg_llm_chat('gpt4-chat', 'What is PostgreSQL?');

Multi-turn Chat

-- pg_llm_create_session() returns a session id; pass it to the follow-up calls
SELECT pg_llm_create_session();
SELECT pg_llm_multi_turn_chat('qianwen-chat', '5PN2qmWqBlQ9wQj99nsQzldVI5ZuGXbE', 'Who are you?');
SELECT pg_llm_multi_turn_chat('qianwen-chat', '5PN2qmWqBlQ9wQj99nsQzldVI5ZuGXbE', 'What was the previous question?');

Parallel Multi-model Chat

SELECT pg_llm_parallel_chat(
    'What are the advantages of PostgreSQL?',
    ARRAY['gpt4', 'deepseek-chat', 'hunyuan-chat', 'qianwen-chat']
);

Streaming Chat

SELECT *
FROM pg_llm_chat_stream(
  'qianwen-chat',
  'Explain MVCC in PostgreSQL',
  '{}'::jsonb
);
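Since pg_llm_chat_stream is a set-returning function, client code typically concatenates the streamed rows into the final answer. A minimal Python sketch of that reassembly (the per-row shape and the empty-string end sentinel are assumptions, not the extension's documented protocol):

```python
# Reassemble a streamed response from chunk rows.
# Each row is assumed to carry a text fragment; the real row shape may differ.

def assemble_stream(rows):
    """Join streamed text chunks, stopping at an empty sentinel chunk if present."""
    parts = []
    for chunk in rows:
        if chunk == "":            # assumed end-of-stream sentinel
            break
        parts.append(chunk)
    return "".join(parts)

rows = ["MVCC lets readers ", "and writers proceed ", "without blocking each other.", ""]
print(assemble_stream(rows))
```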

Structured JSON APIs

SELECT pg_llm_chat_json('qianwen-chat', 'Summarize PostgreSQL in one paragraph');

SELECT pg_llm_parallel_chat_json(
  'Which PostgreSQL features matter most for analytics workloads?',
  ARRAY['deepseek-r1-local', 'qianwen-chat'],
  '{"confidence_threshold": 0.65}'::jsonb
);

SELECT pg_llm_text2sql_json(
  'qianwen-chat',
  'Show the latest 10 orders',
  NULL,
  true,
  '{"enable_rag": true}'::jsonb
);

Session State

SELECT pg_llm_create_session(8);
SELECT pg_llm_update_session_state('session-id', '{"topic":"finance"}'::jsonb);
SELECT pg_llm_get_session('session-id');
SELECT * FROM pg_llm_get_session_messages('session-id');
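The exact merge semantics of pg_llm_update_session_state are not spelled out here; a common design is a shallow JSON merge, sketched below purely as an illustration of what updating `{"topic":"finance"}` might do (this is an assumption, not the extension's documented behavior):

```python
# Shallow-merge sketch for session state updates (assumed semantics).

def update_session_state(current: dict, patch: dict) -> dict:
    """Return a new state with top-level keys from patch overriding current."""
    merged = dict(current)
    merged.update(patch)
    return merged

state = {"topic": "finance", "language": "en"}
state = update_session_state(state, {"topic": "operations"})
print(state)   # {'topic': 'operations', 'language': 'en'}
```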

SQL Analysis And Reporting

SELECT pg_llm_execute_sql_with_analysis(
  'qianwen-chat',
  'SELECT region, SUM(revenue) AS total_revenue FROM sales GROUP BY region',
  '{}'::jsonb
);

SELECT pg_llm_generate_report(
  'qianwen-chat',
  'SELECT region, SUM(revenue) AS total_revenue FROM sales GROUP BY region',
  '{}'::jsonb
);

Knowledge Base And Feedback

SELECT pg_llm_add_knowledge(
  'ops-runbook',
  'PostgreSQL VACUUM reclaims dead tuples and updates visibility information.',
  '{"domain":"operations"}'::jsonb,
  '{"chunk_size": 128}'::jsonb
);
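The chunk_size option suggests that documents are split into fixed-size pieces before indexing. A rough Python sketch of character-based chunking follows; the extension's real strategy (token-based splitting, overlap, sentence boundaries) may well differ:

```python
# Character-based chunking sketch; pg_llm's actual splitting strategy is unknown
# and may use tokens, overlap, or sentence boundaries instead.

def chunk_text(text: str, chunk_size: int = 128) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

doc = "PostgreSQL VACUUM reclaims dead tuples and updates visibility information."
for piece in chunk_text(doc, chunk_size=40):
    print(repr(piece))
```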

SELECT * FROM pg_llm_search_knowledge('How does VACUUM help?', '{"limit": 3}'::jsonb);

SELECT pg_llm_record_feedback(
  '00000000-0000-0000-0000-000000000000'::uuid,
  5,
  'Helpful answer',
  '{"tag":"positive"}'::jsonb
);

Audit And Trace

SELECT * FROM pg_llm_get_audit_log('{"limit": 20}'::jsonb);
SELECT pg_llm_get_trace('00000000-0000-0000-0000-000000000000'::uuid);

Removing Models

SELECT pg_llm_remove_model('gpt4-chat');

Development Guide

Please refer to CONTRIBUTING.md for detailed development guidelines, including:

  • Code organization
  • Building for development
  • Development workflow
  • Adding new models
  • Coding standards
  • Commit conventions

Security Considerations

  • API keys are encrypted before storage
  • All sensitive information is handled securely
  • Access control is managed through PostgreSQL's permission system
  • Comprehensive audit logging

Contributing

  1. Fork the repository
  2. Create your feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

Coding Standards

  • Use C++20 features appropriately
  • Follow PostgreSQL coding conventions
  • Add comprehensive comments
  • Include unit tests for new features
  • Update documentation as needed

License

This project is licensed under the PostgreSQL License - see the LICENSE file for details.

Support

For issues and feature requests, please create an issue in the GitHub repository.
