Replica set example that uses PetSet #184

Closed
wants to merge 2 commits into from
75 changes: 75 additions & 0 deletions 3.2/root/usr/bin/run-mongod-pet
@@ -0,0 +1,75 @@
#!/bin/bash
#
# Run mongod in a PetSet-based replica set. See
# https://github.com/sclorg/mongodb-container/blob/master/examples/petset/README.md
# for a description of how this is intended to work.
#
# Note that this differs from the `run-mongod-replication` script in at least
# these aspects:
# - It does not attempt to remove the host from the replica set configuration
# when it is terminating. That is by design, because, in a PetSet, when a
# pod/container terminates and is restarted by OpenShift, it will always have
# the same hostname. Removing hosts from the configuration affects replica set
# elections and can impact the replica set stability.
# - The original replication example uses MONGODB_INITIAL_REPLICA_COUNT to wait
# for a certain number of pods to come up and then initializes the replica set.
# This example, instead, initializes the replica set when the first pod starts.

set -o errexit
set -o nounset
set -o pipefail

source "${CONTAINER_SCRIPTS_PATH}/common.sh"

function usage() {
  echo "You must specify the following environment variables:"
  echo " MONGODB_USER"
  echo " MONGODB_PASSWORD"
  echo " MONGODB_DATABASE"
  echo " MONGODB_ADMIN_PASSWORD"
  echo " MONGODB_KEYFILE_VALUE"
  echo " MONGODB_REPLICA_NAME"
  echo "Optional variables:"
  echo " MONGODB_SERVICE_NAME (default: mongodb)"
  echo "MongoDB settings:"
  echo " MONGODB_NOPREALLOC (default: true)"
Contributor

MongoDB 3.2 uses the WiredTiger storage engine; these options are for the old MMAPv1 engine. @rhcarvalho @php-coder Will OpenShift somehow support using data created by an old version in a PetSet deployment?

Contributor

Yes, I keep seeing in the logs that the config file contains values that do not apply.
AFAIK we haven't played much with that.

Contributor

The config file can contain them (they take effect only if a different storage engine is specified or data from the old engine is present).

My question was mainly whether to print them in 'help' (this code is not shared with anyone else).

Collaborator

+1, if they aren't relevant let's drop them.

Contributor Author

> @rhcarvalho @php-coder OpenShift will support to somehow use data created by old version in petset deployment?

IMHO we shouldn't. PetSet is a new thing, and I assume that anyone who tries it will start from scratch. Also, if a user wants to use old data, they can provide a custom config file.

@bparees Please approve or reject this.

  echo " MONGODB_SMALLFILES (default: true)"
  echo " MONGODB_QUIET (default: true)"
  exit 1
}

function cleanup() {
  echo "=> Shutting down MongoDB server ..."
  pkill -INT mongod || :
  wait_for_mongo_down
  exit 0
}

trap 'cleanup' SIGINT SIGTERM

# If the user provides their own config file, use it and do not generate a new one
if [ ! -s "${MONGODB_CONFIG_PATH}" ]; then
  # Generate config file for MongoDB
  envsubst < "${CONTAINER_SCRIPTS_PATH}/mongodb.conf.template" > "${MONGODB_CONFIG_PATH}"
fi

if ! [[ -v MONGODB_USER && -v MONGODB_PASSWORD && -v MONGODB_DATABASE && -v MONGODB_ADMIN_PASSWORD && -v MONGODB_KEYFILE_VALUE && -v MONGODB_REPLICA_NAME ]]; then
  usage
fi

mongo_common_args="-f ${MONGODB_CONFIG_PATH}"

# Attention: setup_keyfile may modify value of mongo_common_args!
Contributor

May? Is it somehow possible that it won't?

Contributor Author

Yes, if the custom config contains the keyFile parameter.

Contributor

I missed this. Thanks.
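
To illustrate the exchange above, here is a minimal sketch of a helper that leaves the argument string untouched when the custom config already sets `keyFile`. It is not the actual `setup_keyfile` from common.sh; the name `keyfile_args` and the `grep` pattern are assumptions made for illustration only.

```shell
# Hypothetical helper, NOT the real setup_keyfile from common.sh.
# Prints the mongod arguments, appending --keyFile only when the config
# file does not already contain a keyFile setting.
#   $1: path to the config file
#   $2: path to the key file
#   $3: current argument string
keyfile_args() {
  local config_path="$1" keyfile_path="$2" args="$3"
  if grep -qE '^[[:space:]]*keyFile[[:space:]]*[:=]' "${config_path}"; then
    # The custom config already points at a key file: change nothing.
    echo "${args}"
  else
    echo "${args} --keyFile ${keyfile_path}"
  fi
}
```

With a config that has no `keyFile` line the function appends the option; with one that has it, the arguments come back unchanged, which is the "may modify" case discussed here.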

setup_keyfile

"${CONTAINER_SCRIPTS_PATH}/init-petset-replset.sh" &

# TODO: capture exit code of `init-petset-replset.sh` and exit with an error if
# the initialization failed, so that the container will be restarted and the
# user can gain more visibility that there is a problem in a way other than just
# inspecting log messages.

# Make sure env variables don't propagate to mongod process.
unset MONGODB_USER MONGODB_PASSWORD MONGODB_DATABASE MONGODB_ADMIN_PASSWORD
mongod ${mongo_common_args} --replSet "${MONGODB_REPLICA_NAME}" &
Contributor

In run-mongod-replication, this same code is run in a subshell so that the trapped code in cleanup still has access to those variables. Did you make sure we're not breaking things (in particular, wait_for_mongo_down) with this change?

Contributor Author

As far as I can see, wait_for_mongo_down doesn't use any of these variables, so it shouldn't break anything.
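
The subshell approach mentioned in this thread can be sketched as follows. This only illustrates the shell behavior and is not code from `run-mongod-replication`; `DEMO_SECRET` and `show` are made-up names. A variable unset inside `( ... )` disappears only for the child, so a trap handler in the parent can still read it.

```shell
DEMO_SECRET="s3cret"

# Prints the value of DEMO_SECRET, or <unset> if it is not defined.
show() {
  echo "${DEMO_SECRET:-<unset>}"
}

# Unsetting inside a subshell only affects the subshell ...
in_child="$( (unset DEMO_SECRET; show) )"
# ... while the parent shell still sees the original value.
in_parent="$(show)"

echo "child: ${in_child}, parent: ${in_parent}"
```

In the script under review the `unset` happens in the main shell, which is fine as long as `cleanup` and `wait_for_mongo_down` never read those variables.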

wait
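
The TODO in this script about capturing the exit code of `init-petset-replset.sh` could be approached with `wait <pid>`, which returns the exit status of that specific background job. A minimal sketch under that assumption, where `init_job` and the `sleep` process are stand-ins for the real init script and `mongod`:

```shell
# Stand-in for init-petset-replset.sh: fails with status 3.
init_job() { return 3; }

init_job &
init_pid=$!

# Stand-in for the long-running server process.
sleep 1 &
server_pid=$!

# `wait PID` returns the exit status of that specific background job.
init_status=0
wait "${init_pid}" || init_status=$?

if [ "${init_status}" -ne 0 ]; then
  echo "initialization failed with status ${init_status}"
  # Stop the server so the container exits and can be restarted.
  kill "${server_pid}" 2>/dev/null || :
fi
wait "${server_pid}" 2>/dev/null || :
```

Whether the container should really exit in that case is a design decision for the image; the sketch only shows that the status is available to the parent script.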
5 changes: 5 additions & 0 deletions 3.2/root/usr/share/container-scripts/mongodb/common.sh
@@ -259,3 +259,8 @@ function setup_keyfile() {
  chmod 0600 ${MONGODB_KEYFILE_PATH}
  mongo_common_args+=" --keyFile ${MONGODB_KEYFILE_PATH}"
}

# info prints a message prefixed by date and time.
function info() {
  printf "=> [%s] %s\n" "$(date +'%a %b %d %T')" "$*"
}
159 changes: 159 additions & 0 deletions 3.2/root/usr/share/container-scripts/mongodb/init-petset-replset.sh
@@ -0,0 +1,159 @@
#!/bin/bash

set -o errexit
set -o nounset
set -o pipefail

source "${CONTAINER_SCRIPTS_PATH}/common.sh"

# This is a full hostname that will be added to replica set
# (for example, "replica-2.mongodb.myproject.svc.cluster.local")
readonly MEMBER_HOST="$(hostname -f)"

# Description of possible statuses: https://docs.mongodb.com/manual/reference/replica-states/
readonly WAIT_PRIMARY_STATUS='
    while (rs.status().startupStatus || (rs.status().hasOwnProperty("myState") && rs.status().myState != 1)) {
        printjson(rs.status());
        sleep(1000);
    };
    printjson(rs.status());
'
readonly WAIT_PRIMARY_OR_SECONDARY_STATUS="
    var mbrs;
    while (!mbrs || mbrs.length == 0 || !(mbrs[0].state == 1 || mbrs[0].state == 2)) {
        printjson(rs.status());
        sleep(1000);
        mbrs = rs.status().members.filter(function(el) {
            return el.name.indexOf(\"${MEMBER_HOST}:\") > -1;
        });
    };
    print(mbrs[0].stateStr);
"

# Outputs available endpoints (hostnames) to stdout.
# This also includes hostname of the current pod.
#
# Uses the following global variables:
# - MONGODB_SERVICE_NAME (optional, defaults to 'mongodb')
function find_endpoints() {
    local service_name="${MONGODB_SERVICE_NAME:-mongodb}"

    # Extract host names from lines like this: "10 33 0 mongodb-2.mongodb.myproject.svc.cluster.local."
    dig "${service_name}" SRV +search +short | cut -d' ' -f4 | rev | cut -c2- | rev
}

# TODO: unify this and `mongo_initiate` from common.sh
# Initializes the replica set configuration. It is safe to call this function if
# a replica set is already configured.
#
# Arguments:
# - $1: host address[:port]
#
# Uses the following global variables:
# - MONGODB_REPLICA_NAME
# - MONGODB_ADMIN_PASSWORD
# - WAIT_PRIMARY_STATUS
function initiate() {
    local host="$1"

    if mongo --eval "quit(db.isMaster().setName == '${MONGODB_REPLICA_NAME}' ? 0 : 1)" --quiet; then
        info "Replica set '${MONGODB_REPLICA_NAME}' already exists, skipping initialization"
        return
    fi

    local config="{_id: '${MONGODB_REPLICA_NAME}', members: [{_id: 0, host: '${host}'}]}"

    info "Initiating MongoDB replica using: ${config}"
    mongo admin --eval "rs.initiate(${config});${WAIT_PRIMARY_STATUS}" --quiet

    info "Creating MongoDB users ..."
    mongo_create_admin
    mongo_create_user "-u admin -p ${MONGODB_ADMIN_PASSWORD}"

    info "Successfully initialized replica set"
}

# Adds a host to the replica set configuration. It is safe to call this function
# if the host is already in the configuration.
#
# Arguments:
# - $1: host address[:port]
#
# Global variables:
# - MAX_ATTEMPTS
# - SLEEP_TIME
# - MONGODB_REPLICA_NAME
# - MONGODB_ADMIN_PASSWORD
# - WAIT_PRIMARY_OR_SECONDARY_STATUS
function add_member() {
  local host="$1"
Contributor Author

@rhcarvalho Why does this function use 2-space indentation? All other functions use 4 spaces.

Contributor

Because all of our code base uses 2 spaces, and I don't know why you used 4 😛

Contributor Author

When were you going to open my eyes to that?! :-|

  info "Adding ${host} to replica set ..."

  local script
  script="
    for (var i = 0; i < ${MAX_ATTEMPTS}; i++) {
      var ret = rs.add('${host}');
      if (ret.ok) {
        quit(0);
      }
      // ignore error if host is already in the configuration
      if (ret.code == 103) {
        quit(0);
      }
      sleep(${SLEEP_TIME});
    }
    printjson(ret);
    quit(1);
  "

  # TODO: replace this with a call to `replset_addr` from common.sh, once it returns host names.
Contributor

@rhcarvalho Why are hostnames necessary?

Contributor Author

Because it allows the replica set to survive a restart of its members.

Contributor

> Because it allows to the replica set to survive a restart of its members.

Yes, this is true for the replica set config.

But from my point of view, it does not matter when connecting to the replica set.

Contributor

Maybe it could be slower, because DNS resolution has to be done before connecting ;-)

Contributor

> But IMPOV it does not matter while connecting replset.

That's a good point. 👍

It might work without changing replset_addr. I'll try.

Thanks!

Contributor

@php-coder how about we use $(repl_addr) here? I have tried it before and it worked.

Contributor

We decided to leave this to a follow up.
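
For context on the address being discussed: the string passed to `--host` has the form `<replica set name>/<host1>,<host2>,...`. The sketch below builds it from `dig ... SRV +short` style lines read on stdin. The sample records and the replica set name `rs0` are invented, `srv_to_hosts` is a hypothetical name, and `sed` is used to drop the trailing dot instead of the `rev | cut | rev` pipeline in the diff.

```shell
# Turns SRV answer lines ("priority weight port target.") into a
# comma-separated list of host names without the trailing dot.
srv_to_hosts() {
  cut -d' ' -f4 | sed 's/\.$//' | paste -s -d, -
}

# Invented sample of what the SRV lookup might return.
sample_srv='10 33 0 mongodb-0.mongodb.myproject.svc.cluster.local.
10 33 0 mongodb-1.mongodb.myproject.svc.cluster.local.'

hosts="$(printf '%s\n' "${sample_srv}" | srv_to_hosts)"
replset_addr="rs0/${hosts}"
echo "${replset_addr}"
```

Because the client only uses this list as a seed to discover the current members, either host names or IP addresses would do for connecting; host names matter in the replica set configuration itself, as noted in the thread.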

  local endpoints
  endpoints="$(find_endpoints | paste -s -d,)"

  if [ -z "${endpoints}" ]; then
    info "ERROR: couldn't add host to replica set!"
    info "CAUSE: DNS lookup for '${MONGODB_SERVICE_NAME:-mongodb}' returned no results."
    return 1
Contributor

Returning zero or non-zero here makes no difference, right? I believe users have to go read the logs to realize that something is wrong, correct?

Contributor Author

Yes, that's correct.

Contributor Author

@rhcarvalho When we decided to move the retry logic into JavaScript, we were actually wrong: there is a non-zero possibility that a potential member can't connect to the replica set at all because the primary isn't ready yet.

Contributor Author

@rhcarvalho Never mind. My test failed not because of this, but probably because it exceeded the timeout.

Contributor Author

php-coder, Oct 17, 2016

The timeout wasn't the cause either. In one of my improvements I replaced grep -x with =, and that was wrong. Indeed, the best is the enemy of the good :-(
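
The difference between the two checks can be reproduced in isolation. The wait script prints status documents before the final state, so the captured output presumably spans several lines (the sample text below is made up). A string comparison with `=` against the whole output then fails, while `grep -x` succeeds as soon as any single line matches exactly.

```shell
# Simulated command output: a status dump followed by the final state.
output='{ "ok" : 1 }
SECONDARY'

# Whole-string comparison: false, because the output has more than one line.
if [ "${output}" = "SECONDARY" ]; then
  eq_result=match
else
  eq_result=no-match
fi

# grep -x matches when at least one complete line equals the pattern.
if printf '%s\n' "${output}" | grep -xq 'SECONDARY'; then
  grep_result=match
else
  grep_result=no-match
fi

echo "=: ${eq_result}, grep -x: ${grep_result}"
```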

  fi

  local replset_addr
  replset_addr="${MONGODB_REPLICA_NAME}/${endpoints}"

  if ! mongo admin -u admin -p "${MONGODB_ADMIN_PASSWORD}" --host "${replset_addr}" --eval "${script}" --quiet; then
Contributor

omron93, Oct 6, 2016

@rhcarvalho Do you know how long (or for how many attempts) replica set addressing keeps trying to connect?

A PRIMARY may not have been elected yet, and then I am afraid the connection will fail.

Otherwise, waiting for the primary is necessary.

Contributor

We need to add retries to add_member.

Contributor

@rhcarvalho Or wait until the replica set is accepting connections before calling add_member.
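
One way to add the retries suggested here is a small shell wrapper around the `mongo` call. The sketch below is an assumption about how that could look, not what was merged; `retry` and `flaky` are hypothetical names, and the attempt count and delay are plain arguments.

```shell
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "${i}" -le "${attempts}" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "${delay}"
  done
  return 1
}

# Demo command: fails twice, then succeeds (state kept in a temp file).
counter_file="$(mktemp)"
echo 0 > "${counter_file}"
flaky() {
  local n
  n=$(( $(cat "${counter_file}") + 1 ))
  echo "${n}" > "${counter_file}"
  [ "${n}" -ge 3 ]
}

if retry 5 0 flaky; then
  echo "succeeded after $(cat "${counter_file}") attempts"
fi
```

In `add_member`, the wrapped command would be the `mongo admin ... --host "${replset_addr}"` invocation, so that a connection failure before a primary is elected is retried instead of being fatal.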

    info "ERROR: couldn't add host to replica set!"
    return 1
  fi

  info "Successfully added to replica set"
  info "Waiting for PRIMARY/SECONDARY status ..."

  local rs_status_out
  rs_status_out="$(mongo admin -u admin -p "${MONGODB_ADMIN_PASSWORD}" --host "${replset_addr}" --eval "${WAIT_PRIMARY_OR_SECONDARY_STATUS}" --quiet || :)"

  if ! echo "${rs_status_out}" | grep -xqs '\(SECONDARY\|PRIMARY\)'; then
    info "ERROR: failed waiting for PRIMARY/SECONDARY status. Command output was:"
    echo "${rs_status_out}"
    echo "==> End of the error output <=="
Contributor

This is yet another format for log messages, is this something you're intending to have in the image or just debugging to be removed before merge?

Contributor Author

I want to have it in the image because it will be very useful if it fails. Otherwise it's almost impossible to tell where the command's output ends.
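
The idea of marking where captured output ends can be wrapped in a small helper. This is only a sketch; `print_delimited` is a hypothetical name, and the end marker is the one used in the diff.

```shell
# Prints a heading, the captured output, and an explicit end marker so the
# captured text can be told apart from the log lines that follow it.
print_delimited() {
  local heading="$1" body="$2"
  printf '%s\n' "${heading}"
  printf '%s\n' "${body}"
  printf '%s\n' "==> End of the error output <=="
}

print_delimited "ERROR: command output was:" 'line one
line two'
```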

    return 1
  fi

  info "Successfully joined replica set"
}

info "Waiting for local MongoDB to accept connections ..."
wait_for_mongo_up &>/dev/null

# PetSet pods are named with a predictable name, following the pattern:
# $(petset name)-$(zero-based index)
# MEMBER_ID is computed by removing the prefix matching "*-", i.e.:
# "mongodb-0" -> "0"
# "mongodb-1" -> "1"
# "mongodb-2" -> "2"
readonly MEMBER_ID="${HOSTNAME##*-}"

# Initialize replica set only if we're the first member
if [ "${MEMBER_ID}" = '0' ]; then
    initiate "${MEMBER_HOST}"
else
    add_member "${MEMBER_HOST}"
fi
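
The `${HOSTNAME##*-}` expansion removes the longest prefix matching `*-`, so it keeps working when the PetSet name itself contains dashes. A quick check, with `member_id` as a hypothetical wrapper and invented pod names:

```shell
# Prints the ordinal at the end of a PetSet pod name.
member_id() {
  local pod_name="$1"
  echo "${pod_name##*-}"
}

member_id "mongodb-0"        # prints 0
member_id "my-mongo-db-12"   # prints 12
```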