This image was created to add extra configuration options to the deployment of Apache Pulsar components on Docker. We are not associated with Apache or Pulsar in any way. You can find the official image here.
- Maintained by: General Software Inc Open Projects
- Where to file issues: GitHub Issues
Apache Pulsar is a highly scalable, low latency messaging platform running on commodity hardware. It provides simple pub-sub semantics over topics, guaranteed at-least-once delivery of messages, automatic cursor management for subscribers, and cross-datacenter replication.
docker run -itd --name pulsar -e "STANDALONE=true" -p 8080:8080 -p 6650:6650 -p 8443:8443 -p 6651:6651 --restart on-failure gsiopen/pulsar:2.5.2
This image runs as the non-root user pulsar, who owns the /opt/pulsar folder.
By default, Pulsar's data is stored in /opt/pulsar/data. You can bind a local volume as follows:
docker run -itd --name pulsar -v /path/to/store/data:/opt/pulsar/data -e "STANDALONE=true" -p 8080:8080 -p 6650:6650 -p 8443:8443 -p 6651:6651 --restart on-failure gsiopen/pulsar:2.5.2
All CLI scripts are on the PATH, so you can invoke them with their respective commands and arguments as follows:
docker exec -it pulsar pulsar-admin brokers healthcheck
You can find out whether something went wrong while initializing the container with the following command:
docker logs pulsar
The remaining logs can be found in the logs folder, in files named pulsar-[service]-[hostname].log.
Although Pulsar can start an internal ZooKeeper server when launched standalone, the recommended deployment for a Pulsar cluster uses an external ZooKeeper cluster. Therefore, the environment variables below are mandatory to indicate where the ZooKeeper servers are.
ZOO_SERVERS: A comma-separated list of hostname:port entries for the ZooKeeper servers that will coordinate the Pulsar cluster.
CONFIG_STORE_SERVERS: A comma-separated list of hostname:port entries for the ZooKeeper servers that will coordinate the whole Pulsar instance. If you intend to deploy a single cluster, you can use the same value as ZOO_SERVERS.
The next variable is important as well, since it identifies your cluster when it is launched alongside other clusters.
CLUSTER_NAME: How you want to name your Pulsar cluster.
Before running Pulsar services, you must initialize some cluster metadata in the external ZooKeeper servers. So, one of your cluster's nodes must also set the following variables.
INITIALIZE_METADATA: Marks this container as responsible for initializing your cluster's metadata in the ZooKeeper servers.
Web URL of your cluster. By default, the container's hostname and the default port 8080 are used.
Web URL with TLS of your cluster. By default, the container's hostname and the default port 8443 are used.
Broker URL of your cluster. By default, the container's hostname and the default port 6650 are used.
Broker URL with TLS of your cluster. By default, the container's hostname and the default port 6651 are used.
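To make the four defaults above concrete, the following sketch prints the values that would result for a given hostname. The URL schemes (http, https, pulsar, pulsar+ssl) are the ones Pulsar conventionally pairs with these ports; the function name default_cluster_urls is hypothetical and is not part of the image.

```shell
# Hypothetical helper: print the default cluster URLs for a hostname.
# Falls back to the current machine's hostname when no argument is given.
# The body runs in a subshell so the 'host' variable does not leak.
default_cluster_urls() (
  host="${1:-$(hostname)}"
  printf 'http://%s:8080\n' "$host"        # web URL
  printf 'https://%s:8443\n' "$host"       # web URL with TLS
  printf 'pulsar://%s:6650\n' "$host"      # broker URL
  printf 'pulsar+ssl://%s:6651\n' "$host"  # broker URL with TLS
)

default_cluster_urls pulsar-1
```

If a node must be reachable under a different name or port, these are the values you would override.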
Example using docker-compose:
version: "3.7"
networks:
  private-net:
    name: private-net
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 192.168.1.0/24
services:
  zoo-1:
    image: gsiopen/zookeeper:3.6.1
    container_name: zoo-1
    hostname: zoo-1
    environment:
      - ZOO_MY_ID=1
      - ZOO_SERVERS=server.1=0.0.0.0:2888:3888;2181 server.2=zoo-2:2888:3888;2181 server.3=zoo-3:2888:3888;2181
    restart: on-failure
    networks:
      private-net:
        ipv4_address: 192.168.1.2
  zoo-2:
    image: gsiopen/zookeeper:3.6.1
    container_name: zoo-2
    hostname: zoo-2
    environment:
      - ZOO_MY_ID=2
      - ZOO_SERVERS=server.1=zoo-1:2888:3888;2181 server.2=0.0.0.0:2888:3888;2181 server.3=zoo-3:2888:3888;2181
    restart: on-failure
    networks:
      private-net:
        ipv4_address: 192.168.1.3
  zoo-3:
    image: gsiopen/zookeeper:3.6.1
    container_name: zoo-3
    hostname: zoo-3
    environment:
      - ZOO_MY_ID=3
      - ZOO_SERVERS=server.1=zoo-1:2888:3888;2181 server.2=zoo-2:2888:3888;2181 server.3=0.0.0.0:2888:3888;2181
    restart: on-failure
    networks:
      private-net:
        ipv4_address: 192.168.1.4
  pulsar-1:
    image: gsiopen/pulsar:2.5.2
    container_name: pulsar-1
    hostname: pulsar-1
    environment:
      - CLUSTER_NAME=pulsar
      - ZOO_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
      - CONFIG_STORE_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
      - INITIALIZE_METADATA=true
    depends_on:
      - zoo-1
      - zoo-2
      - zoo-3
    restart: on-failure
    networks:
      private-net:
        ipv4_address: 192.168.1.5
  pulsar-2:
    image: gsiopen/pulsar:2.5.2
    container_name: pulsar-2
    hostname: pulsar-2
    environment:
      - CLUSTER_NAME=pulsar
      - ZOO_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
      - CONFIG_STORE_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
    depends_on:
      - pulsar-1
    restart: on-failure
    networks:
      private-net:
        ipv4_address: 192.168.1.6
  pulsar-3:
    image: gsiopen/pulsar:2.5.2
    container_name: pulsar-3
    hostname: pulsar-3
    environment:
      - CLUSTER_NAME=pulsar
      - ZOO_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
      - CONFIG_STORE_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
    depends_on:
      - pulsar-1
    restart: on-failure
    networks:
      private-net:
        ipv4_address: 192.168.1.7
Pulsar uses the default configuration files in the /opt/pulsar/conf folder. You can bind an external folder containing your configuration files as follows:
docker run -itd --name pulsar -v /path/to/conf:/opt/pulsar/conf -e "STANDALONE=true" -p 8080:8080 -p 6650:6650 -p 8443:8443 -p 6651:6651 --restart on-failure gsiopen/pulsar:2.5.2
Configuration through the environment is controlled via the following environment variable groups, each identified by a PREFIX:
CONF_ZOO: affects zookeeper.conf
CONF_GLOBAL_ZOO: affects global_zookeeper.conf
CONF_BOOKKEEPER: affects bookkeeper.conf
CONF_BROKER: affects broker.conf
CONF_CLIENT: affects client.conf
CONF_DISCOVERY: affects discovery.conf
CONF_PROXY: affects proxy.conf
CONF_WEBSOCKET: affects websocket.conf
CONF_STANDALONE: affects standalone.conf
YAML_LOG4J2: affects log4j2.yml
YAML_FUNCTIONS_WORKER: affects functions_worker.yml
Set environment variables with the appropriate group prefix, in the form PREFIX_PROPERTY.
Due to restrictions imposed by docker and docker-compose on environment variable names, the following substitutions are applied to PROPERTY names:
_ => .
__ => _
___ => -
The following are some illustrative examples:
CONF_BROKER_functionsWorkerEnabled=true: sets the functionsWorkerEnabled property in broker.conf
YAML_FUNCTIONS_WORKER_pulsarFunctionsCluster=pulsar: sets the pulsarFunctionsCluster property in functions_worker.yml
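The substitution rules above can be sketched as a small shell function. This is only an illustration, not the image's actual entrypoint code: the function name translate_property is hypothetical, and the sketch assumes that longer runs of underscores are matched first and that the character @ never appears in a property name. The names my__prop___name and autopurge_purgeInterval are used purely as sample inputs.

```shell
# Hypothetical helper: map the PROPERTY part of a variable name to the
# property name written to the configuration file.
# Assumptions: longer underscore runs are matched first, and '@' never
# appears in a property name, so it can serve as a temporary placeholder
# that protects '__' from the later single-underscore rule.
translate_property() {
  printf '%s\n' "$1" | sed \
    -e 's/___/-/g' \
    -e 's/__/@/g' \
    -e 's/_/./g' \
    -e 's/@/_/g'
}

translate_property 'functionsWorkerEnabled'   # prints functionsWorkerEnabled
translate_property 'autopurge_purgeInterval'  # prints autopurge.purgeInterval
translate_property 'my__prop___name'          # prints my_prop-name
```

Under these assumptions, a variable such as CONF_ZOO_autopurge_purgeInterval would target the autopurge.purgeInterval property in zookeeper.conf.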
To use Pulsar's connectors, you must first configure the following parameters on all the nodes you want to participate as function workers.
CONF_BROKER_functionsWorkerEnabled=true
YAML_FUNCTIONS_WORKER_pulsarFunctionsCluster=`cluster's name`
Each function worker assigned by Pulsar to run a specific connector will look for the connector's file under the connectors folder. You can either bind a local volume containing the connectors you want or set the following environment variable:
CONNECTORS_URL: A space-separated list of URLs pointing to the connector files you want to download into the connectors folder.
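If you prefer the local-volume route, the connector files can be fetched on the host beforehand. The sketch below is one way to do that with curl; the function names connector_filename and download_connectors are hypothetical, and the sketch assumes that the file name is whatever follows the last slash of each URL.

```shell
# Hypothetical helper: derive a local file name from a connector URL by
# taking everything after the last '/'.
connector_filename() {
  printf '%s\n' "${1##*/}"
}

# Hypothetical helper: download each URL of a space-separated list into a
# directory. The body runs in a subshell so 'set -f' and 'exit' stay local.
download_connectors() (
  target_dir="$1"
  urls="$2"
  set -f                            # URLs may contain '?', so disable globbing
  mkdir -p "$target_dir" || exit 1
  for url in $urls; do              # unquoted on purpose: split on spaces
    curl -fsSL -o "$target_dir/$(connector_filename "$url")" "$url" || exit 1
  done
)

# Example call (not executed here):
#   download_connectors /path/to/connectors "https://example.org/a.nar https://example.org/b.nar"
connector_filename 'https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=pulsar/pulsar-2.5.2/connectors/pulsar-io-cassandra-2.5.2.nar'
```

The resulting directory can then be bound with -v to the connectors folder of the container. Its exact path inside the image is not stated here, so check it before relying on it.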
You will also need a YAML configuration file, with the options and parameters required by each connector, inside the container that will launch the connector. So, bind a local volume with the YAML file inside the /home/pulsar/ folder.
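For a Cassandra sink, such a file could look like the sketch below, written here with a shell heredoc. The keys follow the Cassandra sink example in the Pulsar documentation. Every value is a placeholder: cassandra-1 is a hypothetical host name, and the keyspace and table are assumed to exist already in your Cassandra cluster.

```shell
# Write an illustrative sink configuration to the current directory.
# All values are placeholders: cassandra-1 is a hypothetical host, and the
# keyspace and table must already exist in your Cassandra cluster.
cat > cassandra-sink.yml <<'EOF'
configs:
    roots: "cassandra-1:9042"
    keyspace: "pulsar_test_keyspace"
    columnFamily: "pulsar_test_table"
    keyname: "key"
    columnName: "col"
EOF
```

The file can then be bound into the container under /home/pulsar/.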
Let's see a full example of what a docker-compose.yml would look like if you want to run the Cassandra example:
# (Previous configuration remains untouched)
  pulsar-1:
    image: gsiopen/pulsar:2.5.2
    container_name: pulsar-1
    hostname: pulsar-1
    environment:
      - CLUSTER_NAME=pulsar
      - ZOO_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
      - CONFIG_STORE_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
      - INITIALIZE_METADATA=true
      - CONF_BROKER_functionsWorkerEnabled=true
      - YAML_FUNCTIONS_WORKER_pulsarFunctionsCluster=pulsar
      - CONNECTORS_URL=https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=pulsar/pulsar-2.5.2/connectors/pulsar-io-cassandra-2.5.2.nar
    command: "/bin/bash -c 'pulsar-admin sinks create --tenant public --namespace default --name cassandra-test-sink --sink-type cassandra --sink-config-file /home/pulsar/cassandra-sink.yml --inputs test_cassandra && tail -f /dev/null'"
    volumes:
      - /path/to/cassandra-sink.yml:/home/pulsar/cassandra-sink.yml
    depends_on:
      - zoo-1
      - zoo-2
      - zoo-3
    restart: on-failure
    networks:
      private-net:
        ipv4_address: 192.168.1.5
  pulsar-2:
    image: gsiopen/pulsar:2.5.2
    container_name: pulsar-2
    hostname: pulsar-2
    environment:
      - CLUSTER_NAME=pulsar
      - ZOO_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
      - CONFIG_STORE_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
      - CONF_BROKER_functionsWorkerEnabled=true
      - YAML_FUNCTIONS_WORKER_pulsarFunctionsCluster=pulsar
      - CONNECTORS_URL=https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=pulsar/pulsar-2.5.2/connectors/pulsar-io-cassandra-2.5.2.nar
    depends_on:
      - pulsar-1
    restart: on-failure
    networks:
      private-net:
        ipv4_address: 192.168.1.6
  pulsar-3:
    image: gsiopen/pulsar:2.5.2
    container_name: pulsar-3
    hostname: pulsar-3
    environment:
      - CLUSTER_NAME=pulsar
      - ZOO_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
      - CONFIG_STORE_SERVERS=zoo-1:2181,zoo-2:2181,zoo-3:2181
      - CONF_BROKER_functionsWorkerEnabled=true
      - YAML_FUNCTIONS_WORKER_pulsarFunctionsCluster=pulsar
      - CONNECTORS_URL=https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=pulsar/pulsar-2.5.2/connectors/pulsar-io-cassandra-2.5.2.nar
    depends_on:
      - pulsar-1
    restart: on-failure
    networks:
      private-net:
        ipv4_address: 192.168.1.7
Notice the use of command to submit the sink connector as soon as the broker is ready; add tail -f /dev/null to keep the container running.
View license information for the software contained in this image.
As with all Docker images, these likely also contain other software which may be under other licenses (such as Bash, etc from the base distribution, along with any direct or indirect dependencies of the primary software being contained).
As for any pre-built image usage, it is the image user's responsibility to ensure that any use of this image complies with any relevant licenses for all software contained within.