Skip to content

Latest commit

 

History

History
522 lines (373 loc) · 18.5 KB

20230710_02.md

File metadata and controls

522 lines (373 loc) · 18.5 KB

制作PolarDB | PostgreSQL 开源docker镜像, 集成大量插件方便学习, 并推送到阿里云镜像服务

作者

digoal

日期

2023-07-10

标签

PostgreSQL , PolarDB , docker , 插件


背景

制作PolarDB | PostgreSQL 开源docker镜像, 集成大量插件方便学习, 并推送到阿里云镜像服务.

推荐使用Dockfile和docker build来构建镜像. 使用容器 + docker commit构建的镜像非常大.

插件如下:

postgres=# create extension 
Display all 104 possibilities? (y or n)
address_standardizer              hdfs_fdw                          pgcrypto                          pg_surgery                        powa
"address_standardizer-3"          hll                               pg_dirtyread                      pg_trgm                           q3c
address_standardizer_data_us      hstore                            pgfincore                         pg_visibility                     refint
"address_standardizer_data_us-3"  hypopg                            pg_freespacemap                   pg_wait_sampling                  roaringbitmap
adminpack                         insert_username                   pg_jieba                          pg_walinspect                     rum
amcheck                           intagg                            pgpool_adm                        pldbgapi                          seg
autoinc                           intarray                          pgpool_recovery                   plprofiler                        smlar
bloom                             ip4r                              pgpool_regclass                   plr                               sslinfo
btree_gin                         isn                               pg_prewarm                        pointcloud                        tablefunc
btree_gist                        lo                                pg_qualstats                      pointcloud_postgis                table_log
citext                            ltree                             pg_repack                         postgis                           tcn
cube                              moddatetime                       pgroonga                          "postgis-3"                       tdigest
dblink                            mysql_fdw                         pgroonga_database                 postgis_raster                    timescaledb
dict_int                          ogr_fdw                           pgrouting                         "postgis_raster-3"                timescaledb_toolkit
dict_xsyn                         old_snapshot                      pgrowlocks                        postgis_sfcgal                    tsm_system_rows
duckdb_fdw                        oracle_fdw                        pg_show_plans                     "postgis_sfcgal-3"                tsm_system_time
earthdistance                     orafce                            pg_similarity                     postgis_tiger_geocoder            unaccent
extra_window_functions            pageinspect                       pg_sphere                         "postgis_tiger_geocoder-3"        "uuid-ossp"
file_fdw                          parray_gin                        pg_stat_kcache                    postgis_topology                  vector
first_last_agg                    pgaudit                           pg_stat_statements                "postgis_topology-3"              xml2
fuzzystrmatch                     pg_buffercache                    pgstattuple                       postgres_fdw      

例子

本文图方便使用容器 + docker commit构建的镜像. 非常大.

使用debian上一个稳定版作为基础镜像.

docker pull debian:oldstable  
  
IT-C02YW2EFLVDL:~ digoal$ docker images  
REPOSITORY   TAG             IMAGE ID       CREATED      SIZE  
debian       oldstable       b3b1ed8e0d05   6 days ago   124MB  

启动并进入debian容器

docker run -it --name debian debian:oldstable /bin/bash  

配置国内源

sed -i "s@http://\(deb\|security\).debian.org@http://mirrors.aliyun.com@g" /etc/apt/sources.list    
    
apt update    
    
apt-get reinstall -y apt-transport-https ca-certificates     
    
    
sed -i "s@http://mirrors.aliyun.com@https://mirrors.aliyun.com@g" /etc/apt/sources.list    
    
apt update    

安装postgresql-15

apt install -y lsb-release wget vim man   
  
sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'  
  
apt-get install -y gnupg2  
apt-get update  
  
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmour -o /etc/apt/trusted.gpg.d/postgresql.gpg
  
apt-get update  
  
apt-get install -y locales   
localedef -i en_US -f UTF-8 en_US.UTF-8  
  
  
apt-get -y install postgresql-15 postgresql-client-15 libpq-dev postgresql-server-dev-15 postgresql-doc-15 postgresql-contrib  

安装postgresql常见插件

apt search postgres|grep postgresql-15|grep -v dbgsym|grep -v "debug symbols"  
  
apt-get -y install postgresql-15-dirtyread postgresql-15-extra-window-functions postgresql-15-first-last-agg postgresql-15-hll postgresql-15-hypopg postgresql-15-ip4r postgresql-15-mysql-fdw postgresql-15-ogr-fdw postgresql-15-oracle-fdw postgresql-15-orafce postgresql-15-pg-qualstats postgresql-15-pg-stat-kcache postgresql-15-pg-wait-sampling postgresql-15-pgfincore postgresql-15-pgaudit postgresql-15-pgpool2 postgresql-15-pgrouting postgresql-15-pgrouting-doc postgresql-15-pgrouting-scripts postgresql-15-pgsphere postgresql-15-pgvector postgresql-15-pldebugger postgresql-15-pointcloud postgresql-15-plr postgresql-15-postgis-3 postgresql-15-postgis-3-scripts postgresql-15-powa postgresql-15-q3c postgresql-15-repack postgresql-15-rum postgresql-15-show-plans postgresql-15-similarity postgresql-15-tablelog postgresql-15-tdigest postgresql-15-wal2json postgresql-15-plprofiler pgagroal pgpool2 pgbouncer pgxnclient   

timescaledb

echo "deb https://packagecloud.io/timescale/timescaledb/debian/ $(lsb_release -c -s) main" | tee /etc/apt/sources.list.d/timescaledb.list  
wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | gpg --dearmor -o /etc/apt/trusted.gpg.d/timescaledb.gpg  
apt update  
apt install -y timescaledb-2-postgresql-15  

groonga, 任意语言, 任意字符个数的模糊查询插件

wget https://packages.groonga.org/debian/groonga-apt-source-latest-bullseye.deb  
apt install -y -V ./groonga-apt-source-latest-bullseye.deb  
apt update  
apt install -y postgresql-15-pgdg-pgroonga  

配置环境变量

su - root  
  
vi ~/.bashrc      
  
#  add by digoal      
alias rm='rm -i'      
alias cp='cp -i'      
alias ll='ls -larth'      
alias mv='mv -i'  
export PGHOME=/usr/lib/postgresql/15  
export PATH=$PGHOME/bin:$PATH      
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH  
export PGDATA=/var/lib/postgresql/15/pgdata     
export PGUSER=postgres      
export PGHOST=$PGDATA      
export PGPORT=1921      
export PGDATABASE=postgres      
export LC_ALL=en_US.UTF-8  
  
su - postgres  
  
vi ~/.bash_profile    
  
#  add by digoal      
alias rm='rm -i'      
alias cp='cp -i'      
alias ll='ls -larth'      
alias mv='mv -i'  
export PGHOME=/usr/lib/postgresql/15  
export PATH=$PGHOME/bin:$PATH      
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH  
export PGDATA=/var/lib/postgresql/15/pgdata     
export PGUSER=postgres      
export PGHOST=$PGDATA      
export PGPORT=1921      
export PGDATABASE=postgres      
export LC_ALL=en_US.UTF-8  

还有几个插件需要手工安装, 没有在apt里面.

结巴分词

apt install -y git cmake  
cd ~  
git clone --depth 1 https://github.com/jaiminpan/pg_jieba  
cd pg_jieba  
git submodule update --init --recursive  
mkdir build  
cd build  
cmake -DPostgreSQL_TYPE_INCLUDE_DIR=/usr/include/postgresql/15/server ..  
make  
make install  

rb

cd ~  
git clone --depth 1 https://github.com/ChenHuajun/pg_roaringbitmap  
cd pg_roaringbitmap  
USE_PGXS=1 make  
USE_PGXS=1 make install  

parray_gin

cd ~  
git clone --depth 1 https://github.com/theirix/parray_gin  
cd parray_gin  
USE_PGXS=1 make  
USE_PGXS=1 make install  

相似搜索

cd ~  
git clone --depth 1 https://github.com/jirutka/smlar  
cd smlar  
USE_PGXS=1 make  
USE_PGXS=1 make install  

duckdb fdw

cd ~  
git clone --depth 1 https://github.com/alitrack/duckdb_fdw  
cd duckdb_fdw  
wget https://github.com/duckdb/duckdb/releases/download/v0.8.1/libduckdb-linux-amd64.zip  
unzip -d . libduckdb-linux-amd64.zip   
cp libduckdb.so $(pg_config --libdir)   
USE_PGXS=1 make  
USE_PGXS=1 make install  

hdfs_fdw

cd ~  
apt-get install -y default-jre  
apt install -y openjdk-17-jdk  
git clone --depth 1 https://github.com/EnterpriseDB/hdfs_fdw  
  
cd ~/hdfs_fdw/libhive  
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64  
export JDK_INCLUDE=$JAVA_HOME/include  
export INSTALL_DIR=/usr/lib/postgresql/15/lib  
export PATH=/usr/lib/postgresql/15/bin:$PATH  
make  
make install  
  
cd ~/hdfs_fdw/libhive/jdbc  
javac MsgBuf.java  
javac HiveJdbcClient.java  
jar cf HiveJdbcClient-1.0.jar *.class  
cp HiveJdbcClient-1.0.jar /usr/lib/postgresql/15/lib  
  
cd ~/hdfs_fdw  
USE_PGXS=1 make  
USE_PGXS=1 make install  

madlib机器学习

cd ~  
apt-get install -y m4   
apt-get install -y python3-distutils  
git clone --depth 1 https://github.com/apache/madlib  
cd madlib  
mkdir build   
cd build   
cmake ..   
make -j 8  
apt install -y python  
  
未来使用如下命令安装madlib插件  
# src/bin/madpack -s madlib -p postgres -c [user[/password]@][host][:port][/database] install  

有几个插件编译有点问题, 暂时先不放进来:

  • age(图), imgsmlr(图片相似), fixeddecimal(128 bit decimal), parquet_s3_fdw, pggraphblas(图), rdkit(化学分子)

配置OS资源限制

# vi /etc/security/limits.conf  
    
# nofile超过1048576的话,一定要先将sysctl的fs.nr_open设置为更大的值,并生效后才能继续设置nofile.    
    
* soft    nofile  1024000    
* hard    nofile  1024000    
* soft    nproc   unlimited    
* hard    nproc   unlimited    
* soft    core    unlimited    
* hard    core    unlimited    
* soft    memlock unlimited    
* hard    memlock unlimited    

初始化数据库实例

su - postgres  
initdb -D $PGDATA -U postgres -E UTF8 --lc-collate=C --lc-ctype=en_US.UTF8   

配置数据库参数文件

cd $PGDATA  
  
pg_hba.conf  
增加  
host all all 0.0.0.0/0 scram-sha-256  
  
postgresql.auto.conf  
配置  
listen_addresses = '0.0.0.0'		  
port = 1921				  
max_connections = 2000			  
unix_socket_directories = '., /var/run/postgresql'	  
shared_buffers = 128MB			  
dynamic_shared_memory_type = posix	  
vacuum_cost_delay = 0			  
bgwriter_delay = 20ms			  
bgwriter_lru_maxpages = 500		  
bgwriter_lru_multiplier = 5.0		  
max_parallel_workers_per_gather = 0	  
synchronous_commit = off		  
wal_compression = on			  
wal_writer_delay = 10ms		  
max_wal_size = 1GB  
min_wal_size = 80MB  
random_page_cost = 1.1			  
log_destination = 'csvlog'		  
logging_collector = on		  
log_truncate_on_rotation = on		  
log_timezone = 'Etc/UTC'  
autovacuum = on			  
autovacuum_vacuum_cost_delay = 0ms	  
vacuum_freeze_table_age = 750000000  
vacuum_multixact_freeze_table_age = 750000000  
datestyle = 'iso, mdy'  
timezone = 'Etc/UTC'  
lc_messages = 'en_US.UTF-8'			  
lc_monetary = 'en_US.UTF-8'			  
lc_numeric = 'en_US.UTF-8'			  
lc_time = 'en_US.UTF-8'				  
default_text_search_config = 'pg_catalog.english'  

创建阿里云个人镜像服务, 第一次需要设置密码.

https://cr.console.aliyun.com/cn-hangzhou/instances

设置名字空间 digoal , 创建仓库 opensource_database .

https://cr.console.aliyun.com/repository/cn-hangzhou/digoal/opensource_database/details

查询容器ID

IT-C02YW2EFLVDL:~ digoal$ docker ps -a  
CONTAINER ID   IMAGE              COMMAND       CREATED       STATUS       PORTS     NAMES  
6cd83a7149c4   debian:oldstable   "/bin/bash"   4 hours ago   Up 4 hours             debian  

准备使用上述容器将打包镜像

docker help commit  
  
Usage:  docker commit [OPTIONS] CONTAINER [REPOSITORY[:TAG]]  
  
Create a new image from a container's changes  
  
Options:  
  -a, --author string    Author (e.g., "John Hannibal Smith <hannibal@a-team.com>")  
  -c, --change list      Apply Dockerfile instruction to the created image  
  -m, --message string   Commit message  
  -p, --pause            Pause container during commit (default true)  

打包镜像

docker commit -a "digoal zhou <digoal@126.com>" -m "PostgreSQL with many valuable extensions." 6cd83a7149c4 digoal:pg15.3_with_exts_base  
  
  
IT-C02YW2EFLVDL:~ digoal$ docker commit -a "digoal zhou <digoal@126.com>" -m "PostgreSQL with many valuable extensions." 6cd83a7149c4 digoal:pg15.3_with_exts_base  
sha256:669d6ae270438d2de7f8b13cbe2a01d218af275b9c7f9532db0a4f3710eec115  
  
IT-C02YW2EFLVDL:~ digoal$ docker images  
REPOSITORY   TAG                     IMAGE ID       CREATED          SIZE  
digoal       pg15.3_with_exts_base   669d6ae27043   42 seconds ago   5.23GB  

登陆阿里云docker registry

docker login --username=xxxxxxx registry.cn-hangzhou.aliyuncs.com  

打标

docker tag 669d6ae27043 registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base  

推送到阿里云镜像服务

docker push registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base   

从Registry中拉取镜像

$ docker pull registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base

IT-C02YW2EFLVDL:~ digoal$ docker pull registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base
pg15.3_with_exts_base: Pulling from digoal/opensource_database
Digest: sha256:a42064fcc2e0b3c8bece12173b8b4139db3fa55e0fa82b19a7f0642c47e9c87a
Status: Image is up to date for registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base
registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pg15.3_with_exts_base


IT-C02YW2EFLVDL:~ digoal$ docker images -a
REPOSITORY                                                     TAG                     IMAGE ID       CREATED          SIZE
registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database   pg15.3_with_exts_base   669d6ae27043   17 minutes ago   5.23GB
digoal                                                         pg15.3_with_exts_base   669d6ae27043   17 minutes ago   5.23GB
postgres                                                       15.3-bullseye           68ca5f232776   6 days ago       379MB
debian                                                         oldstable               b3b1ed8e0d05   6 days ago       124MB

用同样的方法可以创建自己的开源PolarDB容器, 安装好一些常用的插件. 甚至可以把很多数据库都塞到一个容器里, 便于对比学习.

20240227更新, 简化版 安装postgresql 15:

# debian 最新稳定版本包含的pg是上一个稳定版本. 
例如debian11包含的是pg14, debian12包含的是pg15. 


# 1 查询debian 12.x最新版本?
[绕过docker hub查询debian image最新版本](../202307/20230710_02.md)

# 2 pull 镜像
docker pull debian:12.5

# 3 启动容器
docker run -it --name debian debian:12.5 /bin/bash 

# 4 配置apt 源
mv /etc/apt/sources.list.d/debian.sources /etc/apt/sources.list.d/debian.sources.bak

echo "
# 默认注释了源码镜像以提高 apt update 速度,如有需要可自行取消注释
deb http://mirrors.aliyun.com/debian/ bookworm main contrib non-free non-free-firmware
# deb-src http://mirrors.aliyun.com/debian/ bookworm main contrib non-free non-free-firmware

deb http://mirrors.aliyun.com/debian/ bookworm-updates main contrib non-free non-free-firmware
# deb-src http://mirrors.aliyun.com/debian/ bookworm-updates main contrib non-free non-free-firmware

deb http://mirrors.aliyun.com/debian/ bookworm-backports main contrib non-free non-free-firmware
# deb-src http://mirrors.aliyun.com/debian/ bookworm-backports main contrib non-free non-free-firmware

deb http://mirrors.aliyun.com/debian-security bookworm-security main contrib non-free non-free-firmware
# deb-src http://mirrors.aliyun.com/debian-security bookworm-security main contrib non-free non-free-firmware
" > /etc/apt/sources.list  

apt-get update  

# 5 安装 postgresql apt 源
apt-get install -y locales   
localedef -i en_US -f UTF-8 en_US.UTF-8  

apt-get install -y lsb-release wget vim man gnupg2  

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmour -o /etc/apt/trusted.gpg.d/postgresql.gpg 
# wget --quiet -O - https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/apt/ACCC4CF8.asc | gpg --dearmour -o /etc/apt/trusted.gpg.d/postgresql.gpg  

sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'  

apt-get update 

# 6 安装postgresql 15
apt-get install -y postgresql-15 postgresql-client-15 libpq-dev postgresql-server-dev-15 postgresql-doc-15 postgresql-contrib

digoal's wechat