digoal
2024-01-08
PostgreSQL , PolarDB , DuckDB , db4ai , ai4db , pgvector , openAI
pg_vectorize
The simplest way to do vector search in Postgres. Vectorize is a Postgres extension that automates the transformation and orchestration of text to embeddings, allowing you to do vector and semantic search on existing data with as little as two function calls.
Tongyi Qianwen offers a similar API that converts text directly into vectors; see: "Immersive Learning PostgreSQL|PolarDB 16: Embedding the Tongyi Qianwen LLM and a Text Vectorization Model to Give the Database AI Capabilities"
One function call to initialize your data. Another function call to search.
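A rough sketch of what those two calls look like. The function and argument names follow the pg_vectorize README from around the 0.7 releases and may differ in other versions; the products table, its columns, the job name, and the vectorize_demo_sql helper are all hypothetical, made up for illustration. The helper only prints the SQL, so it can be reviewed before being sent to a database where the extension and the OpenAI key are already set up.

```shell
# Hypothetical helper: prints demo SQL, does not touch any database by itself.
vectorize_demo_sql() {
    cat <<'SQL'
-- CASCADE also creates the required extensions if they are installed.
CREATE EXTENSION IF NOT EXISTS vectorize CASCADE;

-- Call 1: register a job that turns the text columns into embeddings.
-- "products" and its columns are an assumed example table.
SELECT vectorize.table(
    job_name    => 'product_search',
    "table"     => 'products',
    primary_key => 'product_id',
    columns     => ARRAY['product_name', 'description']
);

-- Call 2: semantic search against the embeddings of that job.
SELECT * FROM vectorize.search(
    job_name       => 'product_search',
    query          => 'accessories for mobile devices',
    return_columns => ARRAY['product_id', 'product_name'],
    num_results    => 3
);
SQL
}
```

Usage, assuming psql can reach the target database: vectorize_demo_sql | psql -d postgres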
Postgres Extensions:
- pg_cron == 1.5
- pgmq >= 0.30.0
- pgvector >= 0.5.0
And you'll need an OpenAI key:
- openai API key
Clone the project
docker exec -ti pg bash
cd /tmp
git clone --depth 1 https://github.com/tembo-io/pg_vectorize.git
Configure a cargo registry mirror, see: https://mirrors.ustc.edu.cn/help/crates.io-index.html
# export CARGO_HOME=/root
# mkdir -vp ${CARGO_HOME:-$HOME/.cargo}
# vi ${CARGO_HOME:-$HOME/.cargo}/config
[source.crates-io]
replace-with = 'ustc'
[source.ustc]
registry = "sparse+https://mirrors.ustc.edu.cn/crates.io-index/"
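The same mirror configuration can be written without opening an editor. This is a minimal sketch: the write_cargo_mirror_config function name is hypothetical, it overwrites any existing config file in the given directory, and it keeps the file name config used above (recent cargo releases prefer config.toml).

```shell
# Hypothetical helper: write the USTC mirror settings into <cargo_home>/config.
# Note: this replaces an existing config file rather than merging into it.
write_cargo_mirror_config() {
    cargo_home="$1"
    mkdir -p "$cargo_home"
    cat > "$cargo_home/config" <<'EOF'
[source.crates-io]
replace-with = 'ustc'

[source.ustc]
registry = "sparse+https://mirrors.ustc.edu.cn/crates.io-index/"
EOF
}
```

Usage: write_cargo_mirror_config "${CARGO_HOME:-$HOME/.cargo}"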
Install the extension
cd /tmp/pg_vectorize
grep pgrx Cargo.toml # shows the pgrx version
cargo install --locked --version 0.11.0 cargo-pgrx
cargo pgrx init # once PGRX_HOME has been created, press Ctrl+C immediately to exit
cargo pgrx init --pg14=`which pg_config` # the warnings can be ignored
PGRX_IGNORE_RUST_VERSIONS=y cargo pgrx install --pg-config `which pg_config`
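The cargo-pgrx version has to match the pgrx version that grep reports from Cargo.toml. As a small sketch, the version can be extracted instead of typed by hand. The pgrx_version helper is hypothetical and assumes the dependency is declared on one line in the simple form pgrx = "0.11.0" or pgrx = "=0.11.0"; a table-style declaration would need different parsing.

```shell
# Hypothetical helper: print the pgrx version declared in a Cargo.toml.
# Only handles the simple one-line string form of the dependency.
pgrx_version() {
    sed -n 's/^pgrx[[:space:]]*=[[:space:]]*"=\{0,1\}\([0-9][0-9.]*\)".*/\1/p' "$1" | head -n 1
}
```

Usage: cargo install --locked --version "$(pgrx_version Cargo.toml)" cargo-pgrx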
If you encounter this error: error: none of the selected packages contains these features: pg14, did you mean: pg15?
vi Cargo.toml
[features]
default = ["pg15"]
pg15 = ["pgrx/pg15", "pgrx-tests/pg15"]
# add a line for pg14
pg14 = ["pgrx/pg14", "pgrx-tests/pg14"]
pg_test = []
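The Cargo.toml change above can also be applied from a script. A minimal sketch, assuming the manifest contains a line starting with pg15 = as shown; the add_pg14_feature function name is hypothetical, and the function does nothing when a pg14 feature is already declared.

```shell
# Hypothetical helper: add a pg14 feature line right after the pg15 one.
add_pg14_feature() {
    manifest="$1"
    # Skip the edit if a pg14 feature already exists.
    if grep -q '^pg14 = ' "$manifest"; then
        return 0
    fi
    tmp="$(mktemp)"
    # Copy every line, and emit the pg14 line after the pg15 line.
    awk '
        { print }
        /^pg15 = / { print "pg14 = [\"pgrx/pg14\", \"pgrx-tests/pg14\"]" }
    ' "$manifest" > "$tmp" &&
    cat "$tmp" > "$manifest"   # write back in place, keeping file permissions
    rm -f "$tmp"
}
```

Usage: add_pg14_feature /tmp/pg_vectorize/Cargo.toml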
Finished dev [unoptimized + debuginfo] target(s) in 8m 36s
Installing extension
Copying control file to /usr/share/postgresql/14/extension/vectorize.control
Copying shared library to /usr/lib/postgresql/14/lib/vectorize.so
Discovering SQL entities
Discovered 10 SQL entities: 0 schemas (0 unique), 4 functions, 0 types, 4 enums, 2 sqls, 0 ords, 0 hashes, 0 aggregates, 0 triggers
Writing SQL entities to /usr/share/postgresql/14/extension/vectorize--0.7.0.sql
Copying extension schema upgrade file to /usr/share/postgresql/14/extension/vectorize--0.3.0--0.4.0.sql
Copying extension schema upgrade file to /usr/share/postgresql/14/extension/vectorize--0.4.0--0.5.0.sql
Copying extension schema upgrade file to /usr/share/postgresql/14/extension/vectorize--0.7.0--0.7.1.sql
Copying extension schema upgrade file to /usr/share/postgresql/14/extension/vectorize--0.6.0--0.6.1.sql
Copying extension schema upgrade file to /usr/share/postgresql/14/extension/vectorize--0.5.0--0.6.0.sql
Copying extension schema upgrade file to /usr/share/postgresql/14/extension/vectorize--0.2.0--0.3.0.sql
Finished installing vectorize
Package the extension files
docker cp pg:/usr/share/postgresql/14/extension/vectorize.control ~/pg14_amd64/
docker cp pg:/usr/lib/postgresql/14/lib/vectorize.so ~/pg14_amd64/
docker cp pg:/usr/share/postgresql/14/extension/vectorize--0.7.0.sql ~/pg14_amd64/
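The packaged files can later be dropped into another PostgreSQL 14 installation of the same architecture. This is only a sketch of that reverse step, which the steps above do not cover: the install_packaged_extension helper is hypothetical, and it assumes the target directories are the ones reported by pg_config --sharedir and pg_config --pkglibdir.

```shell
# Hypothetical helper: copy packaged vectorize files into a PostgreSQL tree.
#   $1 = directory holding vectorize.control, vectorize.so, vectorize--*.sql
#   $2 = share directory (what pg_config --sharedir prints)
#   $3 = library directory (what pg_config --pkglibdir prints)
install_packaged_extension() {
    src_dir="$1"
    share_dir="$2"
    lib_dir="$3"
    mkdir -p "$share_dir/extension" "$lib_dir"
    cp "$src_dir"/vectorize.control "$src_dir"/vectorize--*.sql "$share_dir/extension/" &&
    cp "$src_dir"/vectorize.so "$lib_dir/"
}
```

Usage: install_packaged_extension ~/pg14_amd64 "$(pg_config --sharedir)" "$(pg_config --pkglibdir)"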