digoal
2023-09-12
PostgreSQL , PolarDB , 机器学习 , ai , 大模型 , 模型集市 , 向量数据库 , 向量 , 图像搜索 , 推荐系统 , 自然语言处理 , 情感分析 , 模型训练 , 谎言分析 , 消除重复问题 , 语义逻辑分析 , 词性分类 , 摘要提炼 ,
翻译 , 阅读理解 , 讲故事 , 上下文回答 , 填空
postgresml的dockerfile内容:
postgresml Dockerfile使用了nvidia的基础镜像, 目的是使用nvidia gpu 进行加速.
简单修改如下, 修改ubuntu源为国内的地址.
FROM nvidia/cuda:12.1.1-devel-ubuntu22.04
ENV PATH="/usr/local/cuda/bin:${PATH}"
RUN sed -i 's/archive.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list
RUN sed -i 's/cn.archive.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list
RUN sed -i 's/security.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list
RUN apt-get update && apt-get install -y lsb-release curl ca-certificates gnupg coreutils sudo openssl
RUN echo "deb [trusted=yes] https://apt.postgresml.org $(lsb_release -cs) main" > /etc/apt/sources.list.d/postgresml.list
RUN echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list
RUN curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor | tee /etc/apt/trusted.gpg.d/apt.postgresql.org.gpg >/dev/null
ENV TZ=UTC
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -y && apt-get install git postgresml-15 postgresml-dashboard -y
RUN git clone --branch v0.5.0 https://github.com/pgvector/pgvector && cd pgvector && echo "trusted = true" >> vector.control && make && make install
COPY entrypoint.sh /app/entrypoint.sh
COPY dashboard.sh /app/dashboard.sh
COPY --chown=postgres:postgres local_dev.conf /etc/postgresql/15/main/conf.d/01-local_dev.conf
COPY --chown=postgres:postgres pg_hba.conf /etc/postgresql/15/main/pg_hba.conf
ENTRYPOINT ["bash", "/app/entrypoint.sh"]
如果你的环境有nvidia gpu, 如何打包PostgresML docker镜像?
git clone --depth 1 -b v2.7.9 https://github.com/postgresml/postgresml
cd postgresml/docker
docker build -t="postgresml/15:v2.7.9" . 2>&1 | tee /tmp/docker_build.log
运行postgresml镜像:
docker run --platform linux/amd64 -d -it --gpus all --cap-add=SYS_PTRACE --cap-add SYS_ADMIN --privileged=true --name pgml --shm-size=1g -p 5433:5432 -p 8000:8000 postgresml/15:v2.7.9 sudo -u postgresml psql -d postgresml
docker exec -ti pgml bash
如果你的环境没有nvidia gpu, 换成普通镜像打包:
cd postgresml/docker
vi Dockerfile
FROM --platform=$TARGETPLATFORM ubuntu:22.04
MAINTAINER digoal zhou "dege.zzz@alibaba-inc.com"
ARG TARGETPLATFORM
ARG BUILDPLATFORM
RUN echo "I am running on $BUILDPLATFORM, building for $TARGETPLATFORM"
ENV TZ=UTC
ENV DEBIAN_FRONTEND=noninteractive
RUN sed -i 's/archive.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list
RUN sed -i 's/cn.archive.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list
RUN sed -i 's/security.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list
# FROM nvidia/cuda:12.1.1-devel-ubuntu22.04
# ENV PATH="/usr/local/cuda/bin:${PATH}"
RUN apt-get update && apt-get install -y lsb-release curl ca-certificates gnupg coreutils sudo openssl
RUN echo "deb [trusted=yes] https://apt.postgresml.org $(lsb_release -cs) main" > /etc/apt/sources.list.d/postgresml.list
RUN echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list
RUN curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor | tee /etc/apt/trusted.gpg.d/apt.postgresql.org.gpg >/dev/null
RUN apt-get update -y && apt-get install git postgresml-15 postgresml-dashboard -y
RUN git clone --branch v0.5.0 https://github.com/pgvector/pgvector && cd pgvector && echo "trusted = true" >> vector.control && make && make install
COPY entrypoint.sh /app/entrypoint.sh
COPY dashboard.sh /app/dashboard.sh
COPY --chown=postgres:postgres local_dev.conf /etc/postgresql/15/main/conf.d/01-local_dev.conf
COPY --chown=postgres:postgres pg_hba.conf /etc/postgresql/15/main/pg_hba.conf
ENTRYPOINT ["bash", "/app/entrypoint.sh"]
build docker image:
cd postgresml/docker
docker build --platform=linux/amd64 -t="postgresml/15:v2.7.9" . 2>&1 | tee /tmp/docker_build.log
运行postgresml镜像:
docker run --platform linux/amd64 -d -it --cap-add=SYS_PTRACE --cap-add SYS_ADMIN --privileged=true --name pgml --shm-size=1g -p 5433:5432 -p 8000:8000 postgresml/15:v2.7.9 sudo -u postgresml psql -d postgresml
docker exec -ti pgml bash
su - postgresml
$ psql
\psql (15.4 (Ubuntu 15.4-1.pgdg22.04+1))
Type "help" for help.
postgresml=# \dx
List of installed extensions
Name | Version | Schema | Description
---------+---------+------------+---------------------------------------
pgml | 2.7.9 | pgml | pgml: Created by the PostgresML team
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(2 rows)
打开web, 体验postgresml dashboard:
推送到阿里云镜像仓库:
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
postgresml/15 v2.7.9 064b5796d5f1 56 minutes ago 8.08GB
docker tag 064b5796d5f1 registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pgml15_v2.7.9
docker push registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pgml15_v2.7.9
你如果想体验postgresml, 可以我做好的这个镜像:
docker pull registry.cn-hangzhou.aliyuncs.com/digoal/opensource_database:pgml15_v2.7.9
OS USER: postgresml , postgres
DB SUPERUSER: root
DB USER: postgresml
DB PASSWORD: postgresml
DB NAME: postgresml
DB OWNER: postgresml
SCHEMA: pgml
EXTENSION: pgml
PORT: 5432
dashboard:
BIN: /usr/bin/pgml-dashboard
PORT: 8000
WEB: (在宿主机打开)
http://localhost:8000/
pgml需要在线连模型集市 huggingface 获取模型, 如果网络不通会报错.
postgresml=# SELECT pgml.transform(
'translation_en_to_fr',
inputs => ARRAY[
'Welcome to the future!',
'Where have you been all this time?'
]
) AS french;
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like t5-base is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
https://postgresml.org/docs/resources/developer-docs/installation
配置cargo源, 参考: https://mirrors.ustc.edu.cn/help/crates.io-index.html
# export CARGO_HOME=/root
# mkdir -vp ${CARGO_HOME:-$HOME/.cargo}
# vi ${CARGO_HOME:-$HOME/.cargo}/config
[source.crates-io]
replace-with = 'ustc'
[source.ustc]
registry = "sparse+https://mirrors.ustc.edu.cn/crates.io-index/"
# 1 install rust
# su - root
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source "$HOME/.cargo/env"
# 下载pgml项目
cd /tmp
git clone --depth 1 https://github.com/postgresml/postgresml
# 下载依赖代码
cd /tmp/postgresml
git submodule update --init --recursive
# 安装pgml插件
cd /tmp/postgresml/pgml-extension
# pgrx 版本请参考 Cargo.toml 文件决定
cargo install --locked --version 0.11.2 cargo-pgrx
cargo pgrx init # create PGRX_HOME 后, 立即ctrl^c 退出
cargo pgrx init --pg15=`which pg_config` # 不用管报警
PGRX_IGNORE_RUST_VERSIONS=y cargo pgrx install --release --pg-config `which pg_config`
https://github.com/docker-library/official-images/blob/master/library/ubuntu
https://github.com/postgresml/postgresml
https://github.com/postgresml/postgresml/tree/master/docker
https://postgresml.org/docs/guides/setup/v2/installation
《PostgresML=模型集市+向量数据库+自定义模型 : 用postgresml体验AI应用(图像搜索、推荐系统和自然语言处理)与向量检索》