digoal
2023-07-10
PostgreSQL , PolarDB , docker , 插件
制作PolarDB | PostgreSQL 开源docker镜像, 集成大量插件方便学习, 并推送到阿里云镜像服务.
推荐使用Dockfile和docker build来构建镜像. 使用容器 + docker commit构建的镜像非常大.
插件如下:
postgres=# create extension
Display all 104 possibilities? (y or n)
address_standardizer hdfs_fdw pgcrypto pg_surgery powa
"address_standardizer-3" hll pg_dirtyread pg_trgm q3c
address_standardizer_data_us hstore pgfincore pg_visibility refint
"address_standardizer_data_us-3" hypopg pg_freespacemap pg_wait_sampling roaringbitmap
adminpack insert_username pg_jieba pg_walinspect rum
amcheck intagg pgpool_adm pldbgapi seg
autoinc intarray pgpool_recovery plprofiler smlar
bloom ip4r pgpool_regclass plr sslinfo
btree_gin isn pg_prewarm pointcloud tablefunc
btree_gist lo pg_qualstats pointcloud_postgis table_log
citext ltree pg_repack postgis tcn
cube moddatetime pgroonga "postgis-3" tdigest
dblink mysql_fdw pgroonga_database postgis_raster timescaledb
dict_int ogr_fdw pgrouting "postgis_raster-3" timescaledb_toolkit
dict_xsyn old_snapshot pgrowlocks postgis_sfcgal tsm_system_rows
duckdb_fdw oracle_fdw pg_show_plans "postgis_sfcgal-3" tsm_system_time
earthdistance orafce pg_similarity postgis_tiger_geocoder unaccent
extra_window_functions pageinspect pg_sphere "postgis_tiger_geocoder-3" "uuid-ossp"
file_fdw parray_gin pg_stat_kcache postgis_topology vector
first_last_agg pgaudit pg_stat_statements "postgis_topology-3" xml2
fuzzystrmatch pg_buffercache pgstattuple postgres_fdw
本文图方便使用容器 + docker commit构建的镜像. 非常大.
使用debian上一个稳定版作为基础镜像.
docker pull debian:oldstable
IT-C02YW2EFLVDL:~ digoal$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
debian oldstable b3b1ed8e0d05 6 days ago 124MB
启动并进入debian容器
docker run -it --name debian debian:oldstable /bin/bash
配置国内源
sed -i "s@http://\(deb\|security\).debian.org@http://mirrors.aliyun.com@g" /etc/apt/sources.list
apt update
apt-get reinstall -y apt-transport-https ca-certificates
sed -i "s@http://mirrors.aliyun.com@https://mirrors.aliyun.com@g" /etc/apt/sources.list
apt update
安装postgresql-15
apt install -y lsb-release wget vim man
sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
apt-get install -y gnupg2
apt-get update
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmour -o /etc/apt/trusted.gpg.d/postgresql.gpg
apt-get update
apt-get install -y locales
localedef -i en_US -f UTF-8 en_US.UTF-8
apt-get -y install postgresql-15 postgresql-client-15 libpq-dev postgresql-server-dev-15 postgresql-doc-15 postgresql-contrib
安装postgresql常见插件
apt search postgres|grep postgresql-15|grep -v dbgsym|grep -v "debug symbols"
apt-get -y install postgresql-15-dirtyread postgresql-15-extra-window-functions postgresql-15-first-last-agg postgresql-15-hll postgresql-15-hypopg postgresql-15-ip4r postgresql-15-mysql-fdw postgresql-15-ogr-fdw postgresql-15-oracle-fdw postgresql-15-orafce postgresql-15-pg-qualstats postgresql-15-pg-stat-kcache postgresql-15-pg-wait-sampling postgresql-15-pgfincore postgresql-15-pgaudit postgresql-15-pgpool2 postgresql-15-pgrouting postgresql-15-pgrouting-doc postgresql-15-pgrouting-scripts postgresql-15-pgsphere postgresql-15-pgvector postgresql-15-pldebugger postgresql-15-pointcloud postgresql-15-plr postgresql-15-postgis-3 postgresql-15-postgis-3-scripts postgresql-15-powa postgresql-15-q3c postgresql-15-repack postgresql-15-rum postgresql-15-show-plans postgresql-15-similarity postgresql-15-tablelog postgresql-15-tdigest postgresql-15-wal2json postgresql-15-plprofiler pgagroal pgpool2 pgbouncer pgxnclient
timescaledb
echo "deb https://packagecloud.io/timescale/timescaledb/debian/ $(lsb_release -c -s) main" | tee /etc/apt/sources.list.d/timescaledb.list
wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | gpg --dearmor -o /etc/apt/trusted.gpg.d/timescaledb.gpg
apt update
apt install -y timescaledb-2-postgresql-15
groonga, 任意语言, 任意字符个数的模糊查询插件
wget https://packages.groonga.org/debian/groonga-apt-source-latest-bullseye.deb
apt install -y -V ./groonga-apt-source-latest-bullseye.deb
apt update
apt install -y postgresql-15-pgdg-pgroonga
配置环境变量
su - root
vi ~/.bashrc
# add by digoal
alias rm='rm -i'
alias cp='cp -i'
alias ll='ls -larth'
alias mv='mv -i'
export PGHOME=/usr/lib/postgresql/15
export PATH=$PGHOME/bin:$PATH
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH
export PGDATA=/var/lib/postgresql/15/pgdata
export PGUSER=postgres
export PGHOST=$PGDATA
export PGPORT=1921
export PGDATABASE=postgres
export LC_ALL=en_US.UTF-8
su - postgres
vi ~/.bash_profile
# add by digoal
alias rm='rm -i'
alias cp='cp -i'
alias ll='ls -larth'
alias mv='mv -i'
export PGHOME=/usr/lib/postgresql/15
export PATH=$PGHOME/bin:$PATH
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH
export PGDATA=/var/lib/postgresql/15/pgdata
export PGUSER=postgres
export PGHOST=$PGDATA
export PGPORT=1921
export PGDATABASE=postgres
export LC_ALL=en_US.UTF-8
还有几个插件需要手工安装, 没有在apt里面.
结巴分词
apt install -y git cmake
cd ~
git clone --depth 1 https://github.com/jaiminpan/pg_jieba
cd pg_jieba
git submodule update --init --recursive
mkdir build
cd build
cmake -DPostgreSQL_TYPE_INCLUDE_DIR=/usr/include/postgresql/15/server ..
make
make install
rb
cd ~
git clone --depth 1 https://github.com/ChenHuajun/pg_roaringbitmap
cd pg_roaringbitmap
USE_PGXS=1 make
USE_PGXS=1 make install
parray_gin
cd ~
git clone --depth 1 https://github.com/theirix/parray_gin
cd parray_gin
USE_PGXS=1 make
USE_PGXS=1 make install
相似搜索
cd ~
git clone --depth 1 https://github.com/jirutka/smlar
cd smlar
USE_PGXS=1 make
USE_PGXS=1 make install
duckdb fdw
cd ~
git clone --depth 1 https://github.com/alitrack/duckdb_fdw
cd duckdb_fdw
wget https://github.com/duckdb/duckdb/releases/download/v0.8.1/libduckdb-linux-amd64.zip
unzip -d . libduckdb-linux-amd64.zip
cp libduckdb.so $(pg_config --libdir)
USE_PGXS=1 make
USE_PGXS=1 make install
hdfs_fdw
cd ~
apt-get install -y default-jre
apt install -y openjdk-17-jdk
git clone --depth 1 https://github.com/EnterpriseDB/hdfs_fdw
cd ~/hdfs_fdw/libhive
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
export JDK_INCLUDE=$JAVA_HOME/include
export INSTALL_DIR=/usr/lib/postgresql/15/lib
export PATH=/usr/lib/postgresql/15/bin:$PATH
make
make install
cd ~/hdfs_fdw/libhive/jdbc
javac MsgBuf.java
javac HiveJdbcClient.java
jar cf HiveJdbcClient-1.0.jar *.class
cp HiveJdbcClient-1.0.jar /usr/lib/postgresql/15/lib
cd ~/hdfs_fdw
USE_PGXS=1 make
USE_PGXS=1 make install
madlib机器学习
cd ~
apt-get install -y m4
apt-get install -y python3-distutils
git clone --depth 1 https://github.com/apache/madlib
cd madlib
mkdir build
cd build
cmake ..
make -j 8
apt install -y python
未来使用如下命令安装madlib插件
# src/bin/madpack -s madlib -p postgres -c [user[/password]@][host][:port][/database] install
有几个插件编译有点问题, 暂时先不放进来:
- age(图), imgsmlr(图片相似), fixeddecimal(128 bit decimal), parquet_s3_fdw, pggraphblas(图), rdkit(化学分子)
配置OS资源限制
# vi /etc/security/limits.conf
# nofile超过1048576的话,一定要先将sysctl的fs.nr_open设置为更大的值,并生效后才能继续设置nofile.
* soft nofile 1024000
* hard nofile 1024000
* soft nproc unlimited
* hard nproc unlimited
* soft core unlimited
* hard core unlimited
* soft memlock unlimited
* hard memlock unlimited
初始化数据库实例
su - postgres
initdb -D $PGDATA -U postgres -E UTF8 --lc-collate=C --lc-ctype=en_US.UTF8
配置数据库参数文件
cd $PGDATA
pg_hba.conf
增加
host all all 0.0.0.0/0 scram-sha-256
postgresql.auto.conf
配置
listen_addresses = '0.0.0.0'
port = 1921
max_connections = 2000
unix_socket_directories = '., /var/run/postgresql'
shared_buffers = 128MB
dynamic_shared_memory_type = posix
vacuum_cost_delay = 0
bgwriter_delay = 20ms
bgwriter_lru_maxpages = 500
bgwriter_lru_multiplier = 5.0
max_parallel_workers_per_gather = 0
synchronous_commit = off
wal_compression = on
wal_writer_delay = 10ms
max_wal_size = 1GB
min_wal_size = 80MB
random_page_cost = 1.1
log_destination = 'csvlog'
logging_collector = on
log_truncate_on_rotation = on
log_timezone = 'Etc/UTC'
autovacuum = on
autovacuum_vacuum_cost_delay = 0ms
vacuum_freeze_table_age = 750000000
vacuum_multixact_freeze_table_age = 750000000
datestyle = 'iso, mdy'
timezone = 'Etc/UTC'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
创建阿里云个人镜像服务, 第一次需要设置密码.
https://cr.console.aliyun.com/cn-hangzhou/instances
设置名字空间 digoal , 创建仓库 opensource_database .
https://cr.console.aliyun.com/repository/cn-hangzhou/digoal/opensource_database/details
查询容器ID
IT-C02YW2EFLVDL:~ digoal$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
6cd83a7149c4 debian:oldstable "/bin/bash" 4 hours ago Up 4 hours debian
准备使用上述容器将打包镜像
docker help commit
Usage: docker commit [OPTIONS] CONTAINER [REPOSITORY[:TAG]]
Create a new image from a container's changes
Options:
-a, --author string Author (e.g., "John Hannibal Smith <hannibal@a-team.com>")
-c, --change list Apply Dockerfile instruction to the created image
-m, --message string Commit message
-p, --pause Pause container during commit (default true)
打包镜像
docker commit -a "digoal zhou <digoal@126.com>" -m "PostgreSQL with many valuable extensions." 6cd83a7149c4 digoal:pg15.3_with_exts_base
IT-C02YW2EFLVDL:~ digoal$ docker commit -a "digoal zhou <digoal@126.com>" -m "PostgreSQL with many valuable extensions." 6cd83a7149c4 digoal:pg15.3_with_exts_base
sha256:669d6ae270438d2de7f8b13cbe2a01d218af275b9c7f9532db0a4f3710eec115
IT-C02YW2EFLVDL:~ digoal$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
digoal pg15.3_with_exts_base 669d6ae27043 42 seconds ago 5.23GB
登陆阿里云docker registry
docker login --username=xxxxxxx registry.cn-hangzhou.aliyuncs.com
打标
docker tag 669d6ae27043 registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base
推送到阿里云镜像服务
docker push registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base
从Registry中拉取镜像
$ docker pull registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base
IT-C02YW2EFLVDL:~ digoal$ docker pull registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base
pg15.3_with_exts_base: Pulling from digoal/opensource_database
Digest: sha256:a42064fcc2e0b3c8bece12173b8b4139db3fa55e0fa82b19a7f0642c47e9c87a
Status: Image is up to date for registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base
registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base
IT-C02YW2EFLVDL:~ digoal$ docker images -a
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database pg15.3_with_exts_base 669d6ae27043 17 minutes ago 5.23GB
digoal pg15.3_with_exts_base 669d6ae27043 17 minutes ago 5.23GB
postgres 15.3-bullseye 68ca5f232776 6 days ago 379MB
debian oldstable b3b1ed8e0d05 6 days ago 124MB
用同样的方法可以创建自己的开源PolarDB容器, 安装好一些常用的插件. 甚至可以把很多数据库都塞到一个容器里, 便于对比学习.
# debian 最新稳定版本包含的pg是上一个稳定版本.
例如debian11包含的是pg14, debian12包含的是pg15.
# 1 查询debian 12.x最新版本?
[绕过docker hub查询debian image最新版本](../202307/20230710_02.md)
# 2 pull 镜像
docker pull debian:12.5
# 3 启动容器
docker run -it --name debian debian:12.5 /bin/bash
# 4 配置apt 源
mv /etc/apt/sources.list.d/debian.sources /etc/apt/sources.list.d/debian.sources.bak
echo "
# 默认注释了源码镜像以提高 apt update 速度,如有需要可自行取消注释
deb http://mirrors.aliyun.com/debian/ bookworm main contrib non-free non-free-firmware
# deb-src http://mirrors.aliyun.com/debian/ bookworm main contrib non-free non-free-firmware
deb http://mirrors.aliyun.com/debian/ bookworm-updates main contrib non-free non-free-firmware
# deb-src http://mirrors.aliyun.com/debian/ bookworm-updates main contrib non-free non-free-firmware
deb http://mirrors.aliyun.com/debian/ bookworm-backports main contrib non-free non-free-firmware
# deb-src http://mirrors.aliyun.com/debian/ bookworm-backports main contrib non-free non-free-firmware
deb http://mirrors.aliyun.com/debian-security bookworm-security main contrib non-free non-free-firmware
# deb-src http://mirrors.aliyun.com/debian-security bookworm-security main contrib non-free non-free-firmware
" > /etc/apt/sources.list
apt-get update
# 5 安装 postgresql apt 源
apt-get install -y locales
localedef -i en_US -f UTF-8 en_US.UTF-8
apt-get install -y lsb-release wget vim man gnupg2
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmour -o /etc/apt/trusted.gpg.d/postgresql.gpg
# wget --quiet -O - https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/apt/ACCC4CF8.asc | gpg --dearmour -o /etc/apt/trusted.gpg.d/postgresql.gpg
sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
apt-get update
# 6 安装postgresql 15
apt-get install -y postgresql-15 postgresql-client-15 libpq-dev postgresql-server-dev-15 postgresql-doc-15 postgresql-contrib