Provide out-of-the-box clustering #64
Of course, if you are able to get it running, it would be great to merge it, document it, or at least publish it for anybody else interested in clustering.
I managed to automatically cluster ejabberd using the Kubernetes API. The work is based on the VerneMQ Docker image start script.
Well, you can create an ad-hoc repository (or fork) for hosting the required files and documentation, and I can link to your work in the ejabberd Docker documentation so anybody interested can find it.
@karimhm would you mind sharing your script? Or at least reviewing mine? :) https://github.com/Robbilie/kubernetes-ejabberd If you have additional requirements, let me know; otherwise I would ask @badlop for a review too and create a PR for this repository here :)
My Dockerfile (note: `ready-probe.sh` must be copied into the image for the `chmod` to work):

```dockerfile
FROM ejabberd/ecs:20.12

ENV EJABBERD_HOSTS=localhost \
    EJABBERD_ERLANG_NODE="ejabberd@$(hostname -f)"

USER root
COPY docker-entrypoint.sh /docker-entrypoint.sh
COPY ready-probe.sh /ready-probe.sh
RUN apk add --no-cache py3-jinja2 curl jq \
    && rm -rf /var/cache/apk/* \
    && chmod +x /docker-entrypoint.sh /ready-probe.sh

# Set up the runtime environment
USER ejabberd
WORKDIR $HOME
ENTRYPOINT ["/docker-entrypoint.sh"]
```
And my docker-entrypoint.sh (variables quoted and tests made POSIX-sh compatible):

```sh
#!/bin/sh

readonly EJABBERD_READY_FILE=$HOME/.ejabberd_ready
readonly EJABBERD_CLUSTER_READY_FILE=$HOME/.ejabberd_cluster_ready

# Mark ejabberd as not ready so the `ready-probe.sh` script can detect it.
if [ -e "$EJABBERD_READY_FILE" ]; then
    rm "$EJABBERD_READY_FILE"
fi

## Configuration files

# `ejabberd.yml`
readonly EJABBERD_CONFIG_TEMPLATE=$TEMPLATES_DIR/ejabberd.yml.tpl
readonly EJABBERD_CONFIG_FILE=$HOME/conf/ejabberd.yml

# `ejabberdctl.cfg`
readonly EJABBERD_CTL_CONFIG_TEMPLATE=$TEMPLATES_DIR/ejabberdctl.cfg.tpl
readonly EJABBERD_CTL_CONFIG_FILE=$HOME/conf/ejabberdctl.cfg

readonly JINJA_CMD="import os;
import sys;
import jinja2;
sys.stdout.write(jinja2.Template(sys.stdin.read()).render(env=os.environ))
"

# Render both templates against the process environment.
python3 -c "$JINJA_CMD" < "$EJABBERD_CONFIG_TEMPLATE" > "$EJABBERD_CONFIG_FILE"
python3 -c "$JINJA_CMD" < "$EJABBERD_CTL_CONFIG_TEMPLATE" > "$EJABBERD_CTL_CONFIG_FILE"

## Clustering

join_cluster() {
    # No need to look for a cluster to join if we joined one before.
    if [ -e "$EJABBERD_CLUSTER_READY_FILE" ]; then
        echo "[entrypoint_script] Skip joining cluster, already joined."
        # Mark ejabberd as ready
        touch "$EJABBERD_READY_FILE"
        return 0
    fi

    if [ "$EJABBERD_CLUSTER_KUBERNETES_DISCOVERY" = "TRUE" ]; then
        local kubernetes_cluster_name=${EJABBERD_KUBERNETES_CLUSTER_NAME:-cluster.local}
        local kubernetes_namespace=${EJABBERD_KUBERNETES_NAMESPACE:-$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace)}
        local kubernetes_label_selector=${EJABBERD_KUBERNETES_LABEL_SELECTOR:-cluster.local}
        local kubernetes_subdomain=${EJABBERD_KUBERNETES_SUBDOMAIN:-$(curl --silent -X GET $INSECURE --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt "https://kubernetes.default.svc.$kubernetes_cluster_name/api/v1/namespaces/$kubernetes_namespace/pods?labelSelector=$kubernetes_label_selector" -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" | jq -r '.items[0].spec.subdomain' | tr '\n' '\0')}

        if [ "$kubernetes_subdomain" = "null" ]; then
            EJABBERD_KUBERNETES_HOSTNAME=$EJABBERD_KUBERNETES_POD_NAME.$kubernetes_namespace.svc.$kubernetes_cluster_name
        else
            EJABBERD_KUBERNETES_HOSTNAME=$EJABBERD_KUBERNETES_POD_NAME.$kubernetes_subdomain.$kubernetes_namespace.svc.$kubernetes_cluster_name
        fi

        local join_cluster_result=0
        local pod_names=$(curl --silent -X GET $INSECURE --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt "https://kubernetes.default.svc.$kubernetes_cluster_name/api/v1/namespaces/$kubernetes_namespace/pods?labelSelector=$kubernetes_label_selector" -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" | jq -r '.items[].spec.hostname' | tr '\n' ' ')

        for pod_name in $pod_names; do
            if [ "$pod_name" = "null" ]; then
                echo "[entrypoint_script] No Kubernetes pods were found. This might happen because the current pod is the first pod."
                echo "[entrypoint_script] Skip joining cluster."
                touch "$EJABBERD_CLUSTER_READY_FILE"
                # Mark ejabberd as ready
                touch "$EJABBERD_READY_FILE"
                break
            fi
            if [ "$pod_name" != "$EJABBERD_KUBERNETES_POD_NAME" ]; then
                local node_to_join="ejabberd@$pod_name.$kubernetes_subdomain.$kubernetes_namespace.svc.$kubernetes_cluster_name"
                echo "[entrypoint_script] Will join cluster node: '$node_to_join'"
                local response=$("$HOME"/bin/ejabberdctl ping "$node_to_join")
                while [ "$response" != "pong" ]; do
                    echo "[entrypoint_script] Waiting for node: $node_to_join..."
                    sleep 5
                    response=$("$HOME"/bin/ejabberdctl ping "$node_to_join")
                done
                "$HOME"/bin/ejabberdctl join_cluster "$node_to_join"
                join_cluster_result=$?
                break
            else
                echo "[entrypoint_script] Skip joining current node: $pod_name"
            fi
        done

        if [ $join_cluster_result -eq 0 ]; then
            echo "[entrypoint_script] ejabberd joined the cluster successfully"
            touch "$EJABBERD_CLUSTER_READY_FILE"
            # Mark ejabberd as ready
            touch "$EJABBERD_READY_FILE"
        else
            echo "[entrypoint_script] ejabberd failed to join the cluster"
            exit 2
        fi
    else
        echo "[entrypoint_script] Kubernetes clustering is not enabled"
        # Mark ejabberd as ready
        touch "$EJABBERD_READY_FILE"
    fi
}

## Termination

EJABBERD_PID=0

terminate() {
    local net_interface=$(route | grep '^default' | grep -o '[^ ]*$')
    local ip_address=$(ip -4 addr show "$net_interface" | grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | head -n 1)
    if [ "$EJABBERD_PID" != 0 ]; then
        # Leave the cluster before terminating
        if [ -n "$EJABBERD_KUBERNETES_HOSTNAME" ]; then
            NODE_NAME_TO_TERMINATE=ejabberd@$EJABBERD_KUBERNETES_HOSTNAME
        else
            NODE_NAME_TO_TERMINATE=ejabberd@$ip_address
        fi
        echo "[entrypoint_script] Leaving cluster '$NODE_NAME_TO_TERMINATE'"
        NO_WARNINGS=true "$HOME"/bin/ejabberdctl leave_cluster "$NODE_NAME_TO_TERMINATE"
        "$HOME"/bin/ejabberdctl stop > /dev/null
        "$HOME"/bin/ejabberdctl stopped > /dev/null
        kill -s TERM "$EJABBERD_PID"
        exit 0
    fi
}

trap terminate TERM

## Start ejabberd
"$HOME"/bin/ejabberdctl foreground &
EJABBERD_PID=$!
"$HOME"/bin/ejabberdctl started
join_cluster
wait $EJABBERD_PID
```
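For reference, the templating step driven by `JINJA_CMD` can be exercised on its own. This sketch pipes a hypothetical one-line `ejabberdctl.cfg` fragment (the real templates are not shown in this thread) through the same `python3`/`jinja2` one-liner; it assumes a `python3` with the `jinja2` package installed, as the Dockerfile does via `py3-jinja2`:

```shell
# Same one-liner as JINJA_CMD in the entrypoint.
JINJA_CMD="import os;
import sys;
import jinja2;
sys.stdout.write(jinja2.Template(sys.stdin.read()).render(env=os.environ))
"

# Hypothetical node name, just to show how 'env' is exposed to templates.
export EJABBERD_ERLANG_NODE="ejabberd@ejabberd-0.ejabberd.default.svc.cluster.local"

# Render a made-up template fragment against the process environment.
rendered=$(printf "ERLANG_NODE={{ env['EJABBERD_ERLANG_NODE'] }}" | python3 -c "$JINJA_CMD")
echo "$rendered"
```

The same pattern works for any template: every environment variable becomes reachable as `env['NAME']` inside the template.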
And the ready-probe.sh used by the readiness probe (since the probe executes the script rather than sourcing it, it must `exit`, not `return`):

```sh
#!/bin/sh
# Ready only when ejabberd is running AND the entrypoint has marked it ready.
if "$HOME"/bin/ejabberdctl status > /dev/null 2>&1 && [ -e "$HOME"/.ejabberd_ready ]; then
    exit 0
else
    exit 3
fi
```
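The probe's exit code is all Kubernetes sees: 0 marks the pod ready, anything else not ready. The two-condition logic can be demonstrated locally with stand-ins (`true` plays the role of `ejabberdctl status`, and a temp file plays the role of `$HOME/.ejabberd_ready`; both are assumptions for the sketch):

```shell
# Temp file standing in for the ready marker created by the entrypoint.
ready_file=$(mktemp)

check_ready() {
  # 'true' stands in for a successful `ejabberdctl status`.
  if true && [ -e "$ready_file" ]; then
    echo ready
  else
    echo "not ready"
  fi
}

check_ready       # marker file exists
rm "$ready_file"
check_ready       # marker file gone
```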
A subset of the Kubernetes deployment looks as follows:

```yaml
env:
  - name: EJABBERD_CLUSTER_KUBERNETES_DISCOVERY
    value: "TRUE"
  - name: EJABBERD_KUBERNETES_POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: EJABBERD_KUBERNETES_LABEL_SELECTOR
    value: "app=ejabberd"
readinessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - /ready-probe.sh
  initialDelaySeconds: 15
  periodSeconds: 15
```
Did you check out my solution? It's a bit leaner, doesn't need Kubernetes API access, etc.
Using DNS for node discovery is not solid and often breaks (caused by OS and networking equipment configuration). A distribution coordinator (used to find or register the first node) is needed; in my example the Kubernetes API is used due to its simplicity and reliability. It is possible to use other distribution coordinators such as:
Oh well, I am using the script right now, and since headless service functionality is intended exactly for this use case, I figured it would be stable enough, since it's a core Kubernetes feature 🤔
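Whichever discovery mechanism wins, both approaches end up constructing the same style of Erlang node name. A sketch of that derivation, where the pod list stands in for the output of a headless-service DNS lookup or the Kubernetes API query (all concrete names below are made up):

```shell
# Hypothetical pod hostnames, as a DNS or API lookup might return them.
pods="ejabberd-0 ejabberd-1 ejabberd-2"
subdomain="ejabberd"     # headless service name
namespace="xmpp"
cluster="cluster.local"

# Build one Erlang node name per pod, matching the
# ejabberd@<pod>.<subdomain>.<namespace>.svc.<cluster> pattern used above.
nodes=""
for pod in $pods; do
  nodes="$nodes ejabberd@$pod.$subdomain.$namespace.svc.$cluster"
done
echo "$nodes"
```

Each resulting name is what gets handed to `ejabberdctl ping` and `ejabberdctl join_cluster`.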
This is possible since ejabberd 21.7. As an example exercise, this docker-compose.yml sets up two nodes: it registers an account on the main node, then instructs the replica node to join main. Only the replica is accessible to XMPP clients and for administration.

```yaml
version: '3.7'

services:

  main:
    image: ejabberd/ecs:latest
    environment:
      - ERLANG_NODE_ARG=ejabberd@main
      - ERLANG_COOKIE=dummycookie123
      - CTL_ON_CREATE=register admin localhost asd
      - CTL_ON_START=stats registeredusers ;
                     status
    command: ["foreground"]
    healthcheck:
      test: netstat -nl | grep -q 5222
      start_period: 5s
      interval: 5s
      timeout: 5s
      retries: 120

  replica:
    image: ejabberd/ecs:latest
    depends_on:
      main:
        condition: service_healthy
    healthcheck:
      test: netstat -nl | grep -q 5222
      start_period: 5s
      interval: 5s
      timeout: 5s
      retries: 120
    ports:
      - "5222:5222"
      - "5269:5269"
      - "5280:5280"
      - "5443:5443"
    environment:
      - ERLANG_NODE_ARG=ejabberd@replica
      - ERLANG_COOKIE=dummycookie123
      - CTL_ON_CREATE=join_cluster ejabberd@main
      - CTL_ON_START=stats registeredusers ;
                     check_password admin localhost asd ;
                     status
    command: ["foreground"]
```
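The `CTL_ON_CREATE` and `CTL_ON_START` values are semicolon-separated lists of `ejabberdctl` commands. A rough sketch of how such a value can be split into individual invocations (printing the commands instead of executing them; the image's actual parsing may differ):

```shell
# Mirrors the replica's CTL_ON_START value from the compose file above.
CTL_ON_START="stats registeredusers ; check_password admin localhost asd ; status"

# Split on ';' and emit one ejabberdctl invocation per command.
echo "$CTL_ON_START" | tr ';' '\n' | while read -r cmd; do
  [ -n "$cmd" ] && echo "ejabberdctl $cmd"
done
```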
@badlop I believe this should be documented on Docker Hub / ejabberd.
This is not really a solution for Kubernetes though (and frankly it's not a pretty solution for docker-compose either). In Kubernetes terms you would run a single pod as master and a deployment with the slaves. This is really not pretty; other software, like MongooseIM for example, takes the approach that basically everyone takes, which is a StatefulSet…
Ok, documented in 8e9f665. However, the content of https://hub.docker.com/r/ejabberd/ecs/ is a copy of that file and requires manual update, so it won't see an update until the next version...
I know almost nothing about Kubernetes, so I cannot review it, decide on it, or offer an alternative Kubernetes solution.
As Dan Greenburg explained:
Your improvements are welcome.
I did provide a link with an example upfront which could be used as a basis; karim posted their script as well :)
Is this still the best, official docker-compose.yml file?
I gave updating this yaml file a try but failed.
What exact file are you referring to?
Not necessary. This issue started with this question and complaint:
I showed that it is possible. Nobody said it is necessary.
Your compose file uses
Thanks badlop
You failed because you are mixing the official documentation and image with other examples that show configuration for other images that include other features. If you follow strictly what the docker-ejabberd README says, and the exact example configuration it links to, it works.
But this leads back here for the docker-compose file: #64 (comment). I was trying to have a single file to "paste-edit-run and the server's done"; that's why I posted an attempt at another docker-compose file. I'm making a shell script to do that on the host instead (like I've done to streamline the docker-mailserver installation: docker-mailserver/docker-mailserver#2839).
Ok, I've updated the docker-ejabberd README and the GitHub container documentation to no longer link to this place.
Great! If you, or anyone else reading this, is interested in taking this a step further: I think the following could be added to the docker-compose file to streamline server installation.
Could you make the docker-compose.yml from the docs into a file that can be fetched with wget? Possibly https://raw.githubusercontent.com/processone/ejabberd/master/docker-compose.yml? I've made a small script that does this in one copy and paste. The script doesn't currently work; something's wrong with the file access:

```
[critical] <0.174.0>@ejabberd_app:start/2:72 Failed to start ejabberd application: Failed to read YAML file '/opt/ejabberd/conf/ejabberd.yml': Syntax error on line 24 at position 2: did not find expected key
```
The above is based on another Docker container I use, docker-mailserver. Here is how they handle this: the hostname is specified like this,
and they assume the user has already taken care of running certbot on their host, so they just add the /etc/letsencrypt folder like this. I think this is a safe assumption; even the ejabberd.yml.example file assumes this in its examples.
Wow, that took a long time to debug. I was getting desperate; I spammed the .pem files into every path I could think of.
Turns out there's an extra space in the script. Curse YAML!! Anyway, it's further along; here is what I found out. First, for Let's Encrypt you need to add the alternate domains to your SSL certificates, one for each subdomain that XMPP requires:

```shell
certbot certonly --standalone --expand -d ${ejabber_hostname} -d proxy.${ejabber_hostname} -d pubsub.${ejabber_hostname} -d upload.${ejabber_hostname} -d conference.${ejabber_hostname}
```

And if you have existing certs, certbot will just create new folders like
So here is the procedure for now. 1st step:
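The certbot invocation above repeats `${ejabber_hostname}` for every subdomain; a small hypothetical helper could build those `-d` arguments instead (the subdomain list is taken from the command above, the helper itself is mine):

```shell
ejabber_hostname="example.org"  # stand-in value

# Build "-d <domain>" pairs for the base host and each XMPP subdomain.
domains=""
for sub in "" proxy. pubsub. upload. conference.; do
  domains="$domains -d ${sub}${ejabber_hostname}"
done

echo "certbot certonly --standalone --expand$domains"
```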
Now paste the yml text from the container docs (starting with `version: '3.7'` and `services:`).
Press CTRL+X, ENTER, y to save, ENTER.
2nd step: change ejabber_hostname and ejabber_admin_password, then repeat with the script below
and execute the script.
This will dump you immediately into the logs; press Ctrl+C to quit. Here is my log sample,
So, from this point, it looks like STUN/TURN doesn't work. Oh, wait a second...
Well... it looks like the certificate is valid, so why is ejabberd complaining?
The keys match...
The public keys match...
The dates are good...
Oh, what's going on there? It might not be an ejabberd problem in this case. Oh well, the server works, my Sunday has disappeared, enough of this...
Just in case anybody would like to try this helm chart: it is under development, and feedback and testing are very much welcome.
The official ejabberd community server Docker image does not offer any way to form a cluster. It is hard to find resources online regarding clustering ejabberd.