Merged
87 changes: 87 additions & 0 deletions .github/workflows/build-docker.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
name: build-docker-images

on:
push:
branches: [ main ]
paths-ignore:
- "*.md"

pull_request:
branches: [ main ]
paths-ignore:
- "*.md"

# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:

env:
REGISTRY_URL: "docker.io" # docker.io or other registry URL, DOCKER_REGISTRY_USERNAME/DOCKER_REGISTRY_PASSWORD to be set in CI env.
BUILDKIT_PROGRESS: "plain" # Full logs for CI build.

# DOCKER_REGISTRY_USERNAME and DOCKER_REGISTRY_PASSWORD are required for docker image push; they should be set in CI secrets.
DOCKER_REGISTRY_USERNAME: ${{ secrets.DOCKER_REGISTRY_USERNAME }}
DOCKER_REGISTRY_PASSWORD: ${{ secrets.DOCKER_REGISTRY_PASSWORD }}

# used to sync image to mirror registry
DOCKER_MIRROR_REGISTRY_USERNAME: ${{ secrets.DOCKER_MIRROR_REGISTRY_USERNAME }}
DOCKER_MIRROR_REGISTRY_PASSWORD: ${{ secrets.DOCKER_MIRROR_REGISTRY_PASSWORD }}

jobs:
qpod_bigdata:
name: "bigdata"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: |
source ./tool.sh
build_image bigdata latest docker_bigdata/Dockerfile && push_image

qpod_elasticsearch:
name: "elasticsearch"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: |
source ./tool.sh
build_image elasticsearch latest docker_elasticsearch/Dockerfile && push_image

qpod_kafka_confluent:
name: "kafka"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: |
source ./tool.sh
build_image kafka latest docker_kafka_confluent/Dockerfile && push_image

qpod_greenplum:
name: "greenplum"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: |
source ./tool.sh
build_image greenplum latest docker_greenplum/Dockerfile && push_image
qpod_postgres:
name: "postgres"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: |
source ./tool.sh
build_image postgres latest docker_postgres/postgres-ext.Dockerfile && push_image

## Sync all images in this build (listed by "names") to mirror registry.
sync_images:
needs: ["qpod_bigdata", "qpod_elasticsearch", "qpod_kafka_confluent", "qpod_postgres", "qpod_greenplum"]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: |
source ./tool.sh
printenv > /tmp/docker.env
docker run --rm \
--env-file /tmp/docker.env \
-v $(pwd):/tmp \
-w /tmp \
qpod/docker-kit python /opt/utils/image-syncer/run_jobs.py
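
Note on the `sync_images` step: `printenv > /tmp/docker.env` hands every runner variable to the container, including `PATH` and `HOME`. A minimal sketch of a narrower export follows, assuming the sync job only needs the `DOCKER_*` and `REGISTRY_*` variables defined in `env:` above. The helper name `write_env_file` and the prefix list are hypothetical and not part of this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): write only variables whose names start
# with one of the given prefixes to a file usable with `docker run --env-file`.
write_env_file() {
  out="$1"; shift
  pattern="$(printf '%s|' "$@")"   # e.g. "DOCKER_|REGISTRY_|"
  pattern="^(${pattern%|})"        # e.g. "^(DOCKER_|REGISTRY_)"
  # Note: a multi-line value would lose its continuation lines here.
  printenv | grep -E "$pattern" > "$out"
}

# Demo with placeholder values; the prefixes are an assumption about what
# the sync job actually reads.
export REGISTRY_URL="docker.io" DOCKER_REGISTRY_USERNAME="demo-user"
env_file="$(mktemp)"
write_env_file "$env_file" DOCKER_ REGISTRY_
cat "$env_file"
```

In the workflow this would replace the `printenv` line with something like `write_env_file /tmp/docker.env DOCKER_ REGISTRY_`, if `run_jobs.py` turns out not to need any other variables.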
56 changes: 0 additions & 56 deletions .github/workflows/docker.yml

This file was deleted.

13 changes: 6 additions & 7 deletions README.md
@@ -1,18 +1,17 @@
# QPod Data Lab - Docker Image Stack
# QPod Data Lab - Docker Image Stack for BigData

[![License](https://img.shields.io/badge/License-BSD%203--Clause-green.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/QPod/data-lab/docker.yml?branch=main)](https://github.com/QPod/data-lab/actions/workflows/docker.yml)
[![Join the Gitter Chat](https://img.shields.io/gitter/room/nwjs/nw.js.svg)](https://gitter.im/QPod/)
[![Docker Pulls](https://img.shields.io/docker/pulls/qpod/qpod.svg)](https://hub.docker.com/r/qpod/qpod)
[![Docker Starts](https://img.shields.io/docker/stars/qpod/qpod.svg)](https://hub.docker.com/r/qpod/qpod)
[![Recent Code Update](https://img.shields.io/github/last-commit/QPod/data-lab.svg)](https://github.com/QPod/data-lab/stargazers)
[![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/QPod/lab-data/docker.yml?branch=main)](https://github.com/QPod/lab-data/actions/workflows/docker.yml)
[![Recent Code Update](https://img.shields.io/github/last-commit/QPod/lab-data.svg)](https://github.com/QPod/lab-data/stargazers)

Please generously STAR★ our project or donate to us! [![GitHub Starts](https://img.shields.io/github/stars/QPod/data-lab.svg?label=Stars&style=social)](https://github.com/QPod/data-lab/stargazers)
[![Donate-PayPal](https://img.shields.io/badge/Donate-PayPal-blue.svg)](https://paypal.me/haobibo)
[![Donate-AliPay](https://img.shields.io/badge/Donate-Alipay-blue.svg)](https://raw.githubusercontent.com/wiki/haobibo/resources/img/Donate-AliPay.png)
[![Donate-WeChat](https://img.shields.io/badge/Donate-WeChat-green.svg)](https://raw.githubusercontent.com/wiki/haobibo/resources/img/Donate-WeChat.png)

[Wiki & Document](https://github.com/QPod/docker-images/wiki) | [中文使用指引(含中国地区镜像)](https://github.com/QPod/docker-images/wiki/QPod%E4%B8%AD%E6%96%87%E6%8C%87%E5%BC%95)
Discussion and contributions are welcome:
[![Join the Discord Chat](https://img.shields.io/badge/Discuss_on-Discord-green)](https://discord.gg/kHUzgQxgbJ)
[![Open an Issue on GitHub](https://img.shields.io/github/issues/QPod/lab-data)](https://github.com/QPod/lab-data/issues)

## Building blocks for data lake and pipelines projects

3 changes: 2 additions & 1 deletion docker_bigdata/Dockerfile
Expand Up @@ -20,6 +20,7 @@ RUN source /opt/utils/script-setup.sh \
&& echo "Install mysql client:" && setup_mysql_client \
&& echo "Install mongosh:" && setup_mongosh_client \
&& echo "Install redis-cli:" && setup_redis_client \
&& echo "Install pyflink:" && install_pip /opt/utils/list_install_pip_pyflink.txt \
&& echo "Install pyspark:" && install_pip /opt/utils/list_install_pip_pyspark.txt \
&& echo "Install pyflink:" && install_pip /opt/utils/list_install_pip_pyflink.txt \
&& pip install --no-deps apache-flink \
&& echo "Clean up" && list_installed_packages && install__clean
13 changes: 5 additions & 8 deletions docker_bigdata/work/list_install_pip_pyflink.txt
@@ -1,12 +1,9 @@
% from: https://github.com/apache/flink/blob/master/flink-python/setup.py
apache-flink
pemja
pandas
pyarrow
apache-beam
cloudpickle
avro-python3
requests
%py4j==0.10.9.3
%apache-beam==2.38.0
%cloudpickle==2.1.0
%avro-python3>=1.8.1,!=1.9.2,<1.10.0
%fastavro>=1.1.0,<1.4.8
%protobuf<3.18
%pemja==0.2.4
% apache-flink % temporary fix: installed separately without deps
55 changes: 55 additions & 0 deletions docker_elasticsearch/solr.Dockerfile
@@ -0,0 +1,55 @@
FROM ubuntu:latest

LABEL maintainer="haobibo@gmail.com"

USER root

ENV APACHE_DIST="http://archive.apache.org/dist" \
MAVEN_VERSION="3.6.3" \
SOLR_VERSION="8.3.1" \
SOLR_HOME="/data/solr" \
SOLR_LIB_DIR="/data/solr/.lib" \
SOLR_SERVER_LIB="/opt/solr/server/solr-webapp/webapp/WEB-INF/lib" \
PATH="/opt/solr/bin:/opt/maven/bin:$PATH"

RUN mkdir -p $SOLR_HOME $SOLR_LIB_DIR \
&& apt-get -y update --fix-missing && apt-get -y upgrade \
&& apt-get -qq install -y --no-install-recommends wget unzip lsof openjdk-11-jdk-headless \
&& apt-get autoremove && apt-get clean && rm -rf /var/lib/apt/lists/* \
&& install_zip() { wget -nv $1 -O /tmp/TMP.zip && unzip -q /tmp/TMP.zip -d /opt/ && rm /tmp/TMP.zip ; } \
&& install_zip "${APACHE_DIST}/maven/maven-3/${MAVEN_VERSION}/binaries/apache-maven-${MAVEN_VERSION}-bin.zip" && mv /opt/apache-maven-${MAVEN_VERSION} /opt/maven \
&& install_zip "${APACHE_DIST}/lucene/solr/${SOLR_VERSION}/solr-${SOLR_VERSION}.zip" && mv /opt/solr-${SOLR_VERSION} /opt/solr \
&& sed -i -e '/-Dsolr.clustering.enabled=true/ a SOLR_OPTS="$SOLR_OPTS -Denable.runtime.lib=true -Dsun.net.inetaddr.ttl=60 -Dsun.net.inetaddr.negative.ttl=60"' /opt/solr/bin/solr.in.sh \
&& echo 'SOLR_HOME=${SOLR_HOME}' >> /opt/solr/bin/solr.in.sh \
&& echo 'SOLR_PID_DIR=${SOLR_HOME}' >> /opt/solr/bin/solr.in.sh \
&& echo 'SOLR_LOGS_DIR=${SOLR_HOME}/logs' >> /opt/solr/bin/solr.in.sh \
&& echo 'SOLR_LOG_LEVEL=WARN' >> /opt/solr/bin/solr.in.sh \
&& echo '#!/bin/bash' >> /opt/solr/bin/start-solr.sh \
&& echo '[ -f "${SOLR_HOME}/solr.xml" ] || cp -R /opt/solr/server/solr/* ${SOLR_HOME}/' >> /opt/solr/bin/start-solr.sh \
&& echo 'cp -R ${SOLR_LIB_DIR}/*.jar ${SOLR_SERVER_LIB}/' >> /opt/solr/bin/start-solr.sh \
&& echo '/opt/solr/bin/solr start -force -f -c' >> /opt/solr/bin/start-solr.sh \
&& chmod +x /opt/solr/bin/start-solr.sh

RUN mvn_get() { mvn dependency:copy -DlocalRepositoryDirectory="/tmp/m2repo" -DoutputDirectory="${SOLR_SERVER_LIB}" -Djavax.net.ssl.trustStorePassword=changeit -Dartifact="$1"; } \
&& mvn_get "com.janeluo:ikanalyzer:2012_u6" \
&& mvn_get "com.hankcs:hanlp:portable-1.6.3" \
&& mvn_get "com.huaban:jieba-analysis:1.0.2" \
&& rm -Rf /tmp/* /opt/solr/docs/ \
&& ls -alh ${SOLR_SERVER_LIB}

RUN source /opt/utils/script-utils.sh \
&& VERSION_GRADLE="6.5.1" \
&& URL_GRADLE="https://downloads.gradle-dn.com/distributions/gradle-${VERSION_GRADLE}-bin.zip" \
&& install_zip ${URL_GRADLE} && mv /opt/gradle-* /opt/gradle \
&& ln -s /opt/gradle/bin/gradle /usr/bin/ \
&& echo "@ Version of Gradle:" && gradle --version


EXPOSE 8983 9983

WORKDIR /opt/solr

VOLUME /data/solr

ENTRYPOINT ["start-solr.sh"]
CMD ["start-solr.sh"]
13 changes: 8 additions & 5 deletions docker_greenplum/Dockerfile
Expand Up @@ -5,11 +5,13 @@ FROM ${BASE_NAMESPACE:+$BASE_NAMESPACE/}${BASE_IMG} AS builder

COPY rootfs /

RUN VERSION_GPDB_RELEASE="7.0.0-beta.3" \
&& source /opt/utils/script-utils.sh \
RUN set -x && source /opt/utils/script-utils.sh \
&& install_apt /opt/utils/install_list_greenplum.apt \
&& apt-get -qq install -yq --no-install-recommends gcc g++ bison flex cmake pkg-config ccache ninja-build \
&& install_tar_gz https://github.com/greenplum-db/gpdb/releases/download/${VERSION_GPDB_RELEASE}/${VERSION_GPDB_RELEASE}-src-full.tar.gz \
&& VERSION_GPDB_RELEASE=$(curl -sL https://github.com/greenplum-db/gpdb/releases.atom | grep 'releases/tag' | grep "7." | head -1 | grep -Po '\d[\d.]+' ) \
&& URL_GPDB="https://github.com/greenplum-db/gpdb/releases/download/${VERSION_GPDB_RELEASE}/${VERSION_GPDB_RELEASE}-src-full.tar.gz" \
&& echo "Downloading GPDB src release ${VERSION_GPDB_RELEASE} from: ${URL_GPDB}" \
&& install_tar_gz "${URL_GPDB}" \
&& cd /opt/gpdb_src \
&& PYTHON=/opt/conda/bin/python3 ./configure --prefix=/opt/gpdb --with-perl --with-python --with-libxml --with-gssapi --with-openssl \
&& sudo make -j16 && sudo make install -j16
Expand All @@ -25,7 +27,7 @@ ENV GPHOME="/opt/gpdb" \
GPDATA="/data/gpdb" \
GPUSER="gpadmin"

RUN source /opt/utils/script-utils.sh \
RUN set -x && source /opt/utils/script-utils.sh \
&& echo "source ${GPHOME}/greenplum_path.sh" >> /etc/profile \
&& useradd -u 1000 ${GPUSER} -s /bin/bash -d /home/${GPUSER} \
&& usermod -aG root ${GPUSER} \
Expand All @@ -48,7 +50,8 @@ RUN source /opt/utils/script-utils.sh \
&& echo "Clean up" && list_installed_packages && install__clean

USER ${GPUSER}
RUN [ -e ~/.ssh/id_rsa.pub ] || ssh-keygen -t rsa -b 4096 -N "" -C GreenplumDB -f ~/.ssh/id_rsa \
RUN set -x && whoami \
&& [ -e ~/.ssh/id_rsa.pub ] || ssh-keygen -t rsa -b 4096 -N "" -C GreenplumDB -f ~/.ssh/id_rsa \
&& cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys \
&& ssh-keygen -A -v \
&& chmod 600 ~/.ssh/authorized_keys \
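
Note on the release lookup in the builder stage: the pattern `\d[\d.]+` stops at the first hyphen, so a pre-release tag such as `7.0.0-beta.3` would be cut to `7.0.0`, and `grep "7."` matches a `7` followed by any character anywhere on the line. A minimal sketch of a stricter lookup follows, assuming the Atom feed exposes each tag in a quoted `.../releases/tag/<tag>"` link. The helper name `latest_tag_for_major` and the sample feed lines are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): print the newest release tag of a given
# major version from a GitHub releases Atom feed supplied on stdin.
latest_tag_for_major() {
  major="$1"
  # Keep only tag links of that major version, take the first (newest) one,
  # then print everything after "releases/tag/" up to the closing quote.
  grep -E "releases/tag/${major}[.]" \
    | head -n 1 \
    | sed -n 's#.*releases/tag/\([^"]*\)".*#\1#p'
}

# Made-up sample feed lines, for illustration only.
sample_feed='<link rel="alternate" type="text/html" href="https://github.com/greenplum-db/gpdb/releases/tag/7.0.0-beta.3"/>
<link rel="alternate" type="text/html" href="https://github.com/greenplum-db/gpdb/releases/tag/6.27.1"/>'

printf '%s\n' "$sample_feed" | latest_tag_for_major 7
```

Against the live feed the call would look like `curl -sL https://github.com/greenplum-db/gpdb/releases.atom | latest_tag_for_major 7`. Whether the build should pick up pre-release tags at all is a separate decision for the maintainers.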
7 changes: 4 additions & 3 deletions docker_greenplum/README.md
@@ -1,4 +1,4 @@
# GreenplumDB
# GreenplumDB

This is the Docker container for starting a GreenplumDB 7 cluster.
https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/landing-index.html
Expand All @@ -25,12 +25,13 @@ psql -d postgres -c "ALTER ROLE gpadmin WITH PASSWORD 'gpadmin';"

Please refer to the file `example/gpdb-single-vm/docker-compose.yml`.
Note: you need to create the folders `primary1` and `primary2` for the segment nodes in `/data/database/greenplum`:
```

```bash
mkdir -pv /data/database/greenplum/primary1
mkdir -pv /data/database/greenplum/primary2
```

# Debug
## Debug

```bash
# to build the docker image
22 changes: 22 additions & 0 deletions docker_postgres/postgres-ext.Dockerfile
@@ -0,0 +1,22 @@
# Distributed under the terms of the Modified BSD License.

ARG BASE_NAMESPACE
ARG BASE_IMG="base"
FROM ${BASE_NAMESPACE:+$BASE_NAMESPACE/}${BASE_IMG} as builder

ARG PG_MAJOR=15
FROM library/postgres:${PG_MAJOR:-latest}

LABEL maintainer="haobibo@gmail.com"

COPY work /opt/utils/
COPY --from=builder /opt /opt

RUN set -x \
&& apt-get update && apt-get install -y gettext \
apt-utils apt-transport-https ca-certificates gnupg2 dirmngr locales sudo lsb-release curl \
&& envsubst < /opt/utils/install_list_pgext.apt > /tmp/install_list_pgext.apt && mv /tmp/install_list_pgext.apt /opt/utils/install_list_pgext.apt \
&& . /opt/utils/script-utils.sh \
&& install_apt /opt/utils/install_list_base.apt \
&& install_apt /opt/utils/install_list_pgext.apt \
&& echo "Clean up" && list_installed_packages && install__clean
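
Note on rendering `install_list_pgext.apt`: a shell opens and truncates the output file before the command starts, so a filter that reads from and writes to the same path sees an empty input. A minimal sketch of a safer in-place rewrite follows. The helper name `rewrite_in_place` is hypothetical, and `sed` stands in for `envsubst` in the demo so that it runs without gettext installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): filter a file through a command and
# overwrite the file only after the command has finished successfully.
rewrite_in_place() {
  file="$1"; shift
  tmp="$(mktemp)" || return 1
  if "$@" < "$file" > "$tmp"; then
    cat "$tmp" > "$file"   # keeps the original file's permissions
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Demo: sed stands in for envsubst so the sketch runs without gettext.
list="$(mktemp)"
printf '%s\n' 'postgresql-${PG_MAJOR}-pgvector' > "$list"
rewrite_in_place "$list" sed 's/[$]{PG_MAJOR}/15/g'
cat "$list"
```

In the Dockerfile the equivalent call would be `rewrite_in_place /opt/utils/install_list_pgext.apt envsubst`, with `PG_MAJOR` taken from the environment of the `postgres` base image.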
11 changes: 11 additions & 0 deletions docker_postgres/work/install_list_pgext.apt
@@ -0,0 +1,11 @@
postgresql-contrib
postgresql-${PG_MAJOR}-postgis*
postgresql-${PG_MAJOR}-pgvector
postgresql-${PG_MAJOR}-cron
postgresql-${PG_MAJOR}-wal2json

% https://packagecloud.io/citusdata/community/${distro_name}/ ${distro_codename} main
% https://packagecloud.io/timescale/timescaledb/${distro_name}/ ${distro_codename} main
% https://apt.postgresml.org ${distro_codename} main
% timescaledb-2-postgresql-${PG_MAJOR}
% postgresql-${PG_MAJOR}-citus-12.1