Added support for claiming nodes as part of installation. (netdata#10084)

* Added support for claiming nodes as part of installation.

This adds four new options to the `netdata-installer.sh` script:

* `--claim-token`
* `--claim-rooms`
* `--claim-uri`
* `--claim-proxy`

These directly correspond to the `-token`, `-rooms`, `-uri`, and `-proxy`
options for the `netdata-claim.sh` script. They have the following
associated logic:

* If any are specified and the `--disable-cloud` option is also
  specified, we bail and tell the user to either enable the cloud or
  remove the claiming options.
* If only some but not all of the token, rooms, and uri options are
  specified, we bail and tell the user that they must pass all three.
* If all three of the token, rooms, and uri are specified, we invoke the
  `netdata-claim.sh` script for the install itself as one of the last
  steps in the installation process, using the values passed to these
  options.

This allows users to directly claim the agent as part of the install,
which is useful for automated installation scenarios.

* Add missing space as suggested by @knatsakis

* Properly handle installs in /.

* Properly handle unprefixed installs.

* Fix another spelling error in an option name.

* Properly fix option naming.

* Move claiming into kickstart script instead of netdata-installer.

This makes us more future-proof.

The required changes also fix some buggy behavior in the option parsing
code in the kickstart scripts.

* Fix checksums.

* Sanely handle the daemon not running during the claiming process.

* Silence incorrect shellcheck warning.

* Simplify condition as suggested by @vkalintiris.

* Clean up old changes that should not be here anymore.

These are leftovers from an earlier revision and are not actually
needed.

* Add ID generation logic to the claiming script.

This lets it reliably claim nodes which have not yet had the daemon run.

Also fixes a consistency issue in the claiming logic in the Docker
entrypoint.
Ferroin authored and stelfrag committed Mar 9, 2021
1 parent bfe7067 commit 8a95b89
Showing 6 changed files with 230 additions and 102 deletions.
42 changes: 35 additions & 7 deletions claim/netdata-claim.sh.in
@@ -85,15 +85,22 @@ ERROR_MESSAGES[17]="Service Unavailable"

 # Exit code: 18 - Agent unique id is not generated yet.

+NETDATA_RUNNING=1
+
 get_config_value() {
   conf_file="${1}"
   section="${2}"
   key_name="${3}"
-  config_result=$(@sbindir_POST@/netdatacli 2>/dev/null read-config "$conf_file|$section|$key_name"; exit $?)
-  # shellcheck disable=SC2181
-  if [ "$?" != "0" ]; then
-    echo >&2 "cli failed, assume netdata is not running and query the on-disk config"
-    config_result=$(@sbindir_POST@/netdata 2>/dev/null -W get2 "$conf_file" "$section" "$key_name" unknown_default)
+  if [ "${NETDATA_RUNNING}" -eq 1 ]; then
+    config_result=$(@sbindir_POST@/netdatacli 2>/dev/null read-config "$conf_file|$section|$key_name"; exit $?)
+    result="$?"
+    if [ "${result}" -ne 0 ]; then
+      echo >&2 "Unable to communicate with Netdata daemon, querying config from disk instead."
+      NETDATA_RUNNING=0
+    fi
   fi
+  if [ "${NETDATA_RUNNING}" -eq 0 ]; then
+    config_result=$(@sbindir_POST@/netdata 2>/dev/null -W get2 "$conf_file" "$section" "$key_name" unknown_default)
+  fi
   echo "$config_result"
 }
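The fallback in `get_config_value` above can be illustrated in isolation. The following is a minimal sketch, not the script's actual code: `query_daemon`, `query_disk`, `DAEMON_RUNNING`, and `get_value` are hypothetical stand-ins for `netdatacli read-config`, `netdata -W get2`, `NETDATA_RUNNING`, and `get_config_value`, so that the pattern can run without Netdata installed.

```shell
#!/bin/sh
# Hypothetical stand-ins for the two real lookups.
query_daemon() { return 1; }                # pretend the daemon is unreachable
query_disk()   { echo "value-from-disk"; }  # pretend the on-disk lookup works

DAEMON_RUNNING=1

get_value() {
  # Ask the daemon first, but only while we still believe it is running.
  if [ "${DAEMON_RUNNING}" -eq 1 ]; then
    if ! result="$(query_daemon "$@" 2>/dev/null)"; then
      echo >&2 "daemon unreachable, falling back to disk"
      DAEMON_RUNNING=0
    fi
  fi
  # Either the caller said the daemon is down, or the query above failed.
  if [ "${DAEMON_RUNNING}" -eq 0 ]; then
    result="$(query_disk "$@")"
  fi
  echo "${result}"
}

value="$(get_value netdata global "run as user")"
echo "${value}"   # prints: value-from-disk
```

One property of this pattern worth keeping in mind: when the function is called inside `$(...)`, as in the sketch, it runs in a subshell, so the flag it clears is not visible to the caller afterwards. Setting the flag up front (as the `-daemon-not-running` option in this commit does) avoids repeating the failed daemon query on every call.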
@@ -141,13 +148,33 @@ NETDATA_USER=$(get_config_value netdata global "run as user")
 [ -z "$EUID" ] && EUID="$(id -u)"


+gen_id() {
+  local id
+
+  id="$(uuidgen)"
+
+  if [ "${id}" = "8a795b0c-2311-11e6-8563-000c295076a6" ] || [ "${id}" = "4aed1458-1c3e-11e6-a53f-000c290fc8f5" ]; then
+    gen_id
+  else
+    echo "${id}"
+  fi
+}
+
 # get the MACHINE_GUID by default
 if [ -r "${MACHINE_GUID_FILE}" ]; then
   ID="$(cat "${MACHINE_GUID_FILE}")"
   MGUID=$ID
-else
-  echo >&2 "netdata.public.unique.id is not generated yet or not readable. Please run agent at least once before attempting to claim. Agent generates this file on first startup. If the ID is generated already make sure you have rights to read it (Filename: ${MACHINE_GUID_FILE})."
+elif [ -f "${MACHINE_GUID_FILE}" ]; then
+  echo >&2 "netdata.public.unique.id is not readable. Please make sure you have rights to read it (Filename: ${MACHINE_GUID_FILE})."
   exit 18
+else
+  if mkdir -p "${MACHINE_GUID_FILE%/*}" && /bin/echo -n "$(gen_id)" > "${MACHINE_GUID_FILE}"; then
+    ID="$(cat "${MACHINE_GUID_FILE}")"
+    MGUID=$ID
+  else
+    echo >&2 "Failed to write new machine GUID. Please make sure you have rights to write to ${MACHINE_GUID_FILE}."
+    exit 18
+  fi
 fi

 # get token from file
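The three-way branch above (readable file, unreadable file, missing file) can be tried against a scratch directory. This is a rough sketch under stated assumptions rather than the claiming script itself: `new_uuid` and `ensure_guid_file` are hypothetical helpers, the `/proc/sys/kernel/random/uuid` fallback is an addition for systems without `uuidgen`, and a loop replaces the recursive retry. The two rejected IDs are the ones the diff rejects.

```shell
#!/bin/sh
# Hypothetical helper: prefer uuidgen, fall back to the Linux kernel's UUID source.
new_uuid() {
  if command -v uuidgen >/dev/null 2>&1; then
    uuidgen
  elif [ -r /proc/sys/kernel/random/uuid ]; then
    cat /proc/sys/kernel/random/uuid
  else
    return 1
  fi
}

# Generate an ID, retrying if one of the two rejected values comes back.
gen_id() {
  while :; do
    id="$(new_uuid)" || return 1
    case "${id}" in
      8a795b0c-2311-11e6-8563-000c295076a6|4aed1458-1c3e-11e6-a53f-000c290fc8f5) continue ;;
      *) echo "${id}"; return 0 ;;
    esac
  done
}

# Print the ID stored in the given file, creating the file if it is missing.
ensure_guid_file() {
  guid_file="${1}"
  if [ -r "${guid_file}" ]; then
    cat "${guid_file}"
  elif [ -f "${guid_file}" ]; then
    echo >&2 "${guid_file} exists but is not readable"
    return 18
  else
    new_id="$(gen_id)" || return 18
    mkdir -p "${guid_file%/*}" || return 18
    printf '%s' "${new_id}" > "${guid_file}" || return 18
    cat "${guid_file}"
  fi
}

demo_dir="$(mktemp -d)"
first="$(ensure_guid_file "${demo_dir}/registry/netdata.public.unique.id")"
second="$(ensure_guid_file "${demo_dir}/registry/netdata.public.unique.id")"
echo "first=${first} second=${second}"
rm -rf "${demo_dir}"
```

The second call finds the file written by the first, so both variables hold the same ID. `printf '%s'` is used in place of `/bin/echo -n` only because its behavior is the same across shells.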
@@ -174,6 +201,7 @@ do
     -noproxy) NOPROXY=yes ;;
     -noreload) RELOAD=0 ;;
     -user=*) NETDATA_USER=${arg:6} ;;
+    -daemon-not-running) NETDATA_RUNNING=0 ;;
     *) echo >&2 "Unknown argument ${arg}"
       exit 1 ;;
   esac
1 change: 1 addition & 0 deletions packaging/docker/run.sh
@@ -25,6 +25,7 @@ if [ -n "${NETDATA_CLAIM_URL}" ] && [ -n "${NETDATA_CLAIM_TOKEN}" ] && [ ! -f /v
     -url "${NETDATA_CLAIM_URL}" \
     ${NETDATA_CLAIM_ROOMS:+-rooms "${NETDATA_CLAIM_ROOMS}"} \
     ${NETDATA_CLAIM_PROXY:+-proxy "${NETDATA_CLAIM_PROXY}"} \
+    -daemon-not-running
 fi

 exec /usr/sbin/netdata -u "${DOCKER_USR}" -D -s /host -p "${NETDATA_LISTENER_PORT}" -W set web "web files group" root -W set web "web files owner" root "$@"
127 changes: 85 additions & 42 deletions packaging/installer/kickstart-static64.sh
@@ -12,6 +12,10 @@
 # --local-files Use a manually provided tarball for the installation
 # --allow-duplicate-install do not bail if we detect a duplicate install
 # --reinstall if an existing install would be updated, reinstall instead
+# --claim-token specify a token to use for claiming the newly installed instance
+# --claim-url specify a URL to use for claiming the newly installed instance
+# --claim-rooms specify a list of rooms to claim the newly installed instance to
+# --claim-proxy specify a proxy to use while claiming the newly installed instance
 #
 # Environment options:
 #
@@ -224,56 +228,81 @@ NETDATA_INSTALLER_OPTIONS=""
 NETDATA_UPDATES="--auto-update"
 RELEASE_CHANNEL="nightly"
 while [ -n "${1}" ]; do
-  if [ "${1}" = "--dont-wait" ] || [ "${1}" = "--non-interactive" ] || [ "${1}" = "--accept" ]; then
-    opts="${opts} --accept"
-    shift 1
-  elif [ "${1}" = "--dont-start-it" ]; then
-    NETDATA_INSTALLER_OPTIONS="${NETDATA_INSTALLER_OPTIONS:+${NETDATA_INSTALLER_OPTIONS} }${1}"
-    shift 1
-  elif [ "${1}" = "--no-updates" ]; then
-    NETDATA_UPDATES=""
-    shift 1
-  elif [ "${1}" = "--auto-update" ]; then
-    true # This is the default behaviour, so ignore it.
-    shift 1
-  elif [ "${1}" = "--stable-channel" ]; then
-    RELEASE_CHANNEL="stable"
-    NETDATA_INSTALLER_OPTIONS="${NETDATA_INSTALLER_OPTIONS:+${NETDATA_INSTALLER_OPTIONS} }${1}"
-    shift 1
-  elif [ "${1}" = "--disable-telemetry" ]; then
-    NETDATA_INSTALLER_OPTIONS="${NETDATA_INSTALLER_OPTIONS:+${NETDATA_INSTALLER_OPTIONS} }${1}"
-    shift 1
-  elif [ "${1}" = "--local-files" ]; then
-    NETDATA_UPDATES="" # Disable autoupdates if using pre-downloaded files.
-    shift 1
-    if [ -z "${1}" ]; then
-      fatal "Option --local-files requires extra information. The desired tarball full filename is needed"
-    fi
+  case "${1}" in
+    "--dont-wait") opts="${opts} --accept" ;;
+    "--non-interactive") opts="${opts} --accept" ;;
+    "--accept") opts="${opts} --accept" ;;
+    "--dont-start-it")
+      NETDATA_INSTALLER_OPTIONS="${NETDATA_INSTALLER_OPTIONS:+${NETDATA_INSTALLER_OPTIONS} }${1}"
+      NETDATA_CLAIM_EXTRA="${NETDATA_CLAIM_EXTRA} -daemon-not-running"
+      ;;
+    "--no-updates") NETDATA_UPDATES="" ;;
+    "--stable-channel")
+      RELEASE_CHANNEL="stable"
+      NETDATA_INSTALLER_OPTIONS="${NETDATA_INSTALLER_OPTIONS:+${NETDATA_INSTALLER_OPTIONS} }${1}"
+      ;;
+    "--disable-telemetry") NETDATA_INSTALLER_OPTIONS="${NETDATA_INSTALLER_OPTIONS:+${NETDATA_INSTALLER_OPTIONS} }${1}";;
+    "--local-files")
+      NETDATA_UPDATES="" # Disable autoupdates if using pre-downloaded files.
+      if [ -z "${2}" ]; then
+        fatal "Option --local-files requires extra information. The desired tarball full filename is needed"
+      fi

-    NETDATA_LOCAL_TARBALL_OVERRIDE="${1}"
-    shift 1
-    if [ -z "${1}" ]; then
-      fatal "Option --local-files requires a pair of the tarball source and the checksum file"
-    fi
+      NETDATA_LOCAL_TARBALL_OVERRIDE="${2}"

-    NETDATA_LOCAL_TARBALL_OVERRIDE_CHECKSUM="${1}"
-    shift 1
-  elif [ "${1}" = "--allow-duplicate-install" ]; then
-    NETDATA_ALLOW_DUPLICATE_INSTALL=1
-    shift 1
-  elif [ "${1}" = "--reinstall" ]; then
-    NETDATA_REINSTALL=1
-    shift 1
-  else
-    echo >&2 "Unknown option '${1}' or invalid number of arguments. Please check the README for the available arguments of ${0} and try again"
-    exit 1
-  fi
+      if [ -z "${3}" ]; then
+        fatal "Option --local-files requires a pair of the tarball source and the checksum file"
+      fi
+
+      NETDATA_LOCAL_TARBALL_OVERRIDE_CHECKSUM="${3}"
+      shift 2
+      ;;
+    "--allow-duplicate-install") NETDATA_ALLOW_DUPLICATE_INSTALL=1 ;;
+    "--reinstall") NETDATA_REINSTALL=1 ;;
+    "--claim-token")
+      NETDATA_CLAIM_TOKEN="${2}"
+      shift 1
+      ;;
+    "--claim-rooms")
+      NETDATA_CLAIM_ROOMS="${2}"
+      shift 1
+      ;;
+    "--claim-url")
+      NETDATA_CLAIM_URL="${2}"
+      shift 1
+      ;;
+    "--claim-proxy")
+      NETDATA_CLAIM_EXTRA="${NETDATA_CLAIM_EXTRA} -proxy ${2}"
+      shift 1
+      ;;
+    *)
+      echo >&2 "Unknown option '${1}' or invalid number of arguments. Please check the README for the available arguments of ${0} and try again"
+      exit 1
+  esac
+  shift 1
 done

 if [ ! "${DO_NOT_TRACK:-0}" -eq 0 ] || [ -n "$DO_NOT_TRACK" ]; then
   NETDATA_INSTALLER_OPTIONS="${NETDATA_INSTALLER_OPTIONS:+${NETDATA_INSTALLER_OPTIONS} }--disable-telemetry"
 fi

+if [ -n "${NETDATA_DISABLE_CLOUD}" ]; then
+  if [ -n "${NETDATA_CLAIM_TOKEN}" ] || [ -n "${NETDATA_CLAIM_ROOMS}" ] || [ -n "${NETDATA_CLAIM_URL}" ]; then
+    run_failed "Cloud explicitly disabled but automatic claiming requested."
+    run_failed "Either enable Netdata Cloud, or remove the --claim-* options."
+    exit 1
+  fi
+fi
+
+# shellcheck disable=SC2235,SC2030
+if ( [ -z "${NETDATA_CLAIM_TOKEN}" ] && [ -n "${NETDATA_CLAIM_URL}" ] ) || ( [ -n "${NETDATA_CLAIM_TOKEN}" ] && [ -z "${NETDATA_CLAIM_URL}" ] ); then
+  run_failed "Invalid claiming options, both a claiming token and URL must be specified."
+  exit 1
+elif [ -z "${NETDATA_CLAIM_TOKEN}" ] && [ -n "${NETDATA_CLAIM_ROOMS}" ]; then
+  run_failed "Invalid claiming options, claim rooms may only be specified when a token and URL are specified."
+  exit 1
+fi
+
 # Netdata Tarball Base URL (defaults to our Google Storage Bucket)
 [ -z "$NETDATA_TARBALL_BASEURL" ] && NETDATA_TARBALL_BASEURL=https://storage.googleapis.com/netdata-nightlies
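The token/URL/rooms checks in this hunk amount to a small truth table: token and URL must be given together, and rooms are only accepted alongside them. A minimal sketch of that rule as a standalone function follows; `validate_claim_opts` is a hypothetical name, the token and URL values are placeholders, and the grouping uses plain `if`/`elif` branches instead of subshells.

```shell
#!/bin/sh
# Returns 0 when the combination of claiming options is acceptable, 1 otherwise.
validate_claim_opts() {
  token="${1}"
  url="${2}"
  rooms="${3}"

  if [ -n "${token}" ] && [ -z "${url}" ]; then
    echo >&2 "a claiming token was given without a URL"
    return 1
  elif [ -z "${token}" ] && [ -n "${url}" ]; then
    echo >&2 "a claiming URL was given without a token"
    return 1
  elif [ -z "${token}" ] && [ -n "${rooms}" ]; then
    echo >&2 "rooms may only be given together with a token and URL"
    return 1
  fi
  return 0
}

# Placeholder values, not real credentials or endpoints.
if validate_claim_opts "example-token" "https://claim.example" ""; then
  echo "accepted"
fi
if ! validate_claim_opts "" "" "room1" 2>/dev/null; then
  echo "rejected"
fi
```

Passing no claiming options at all is also accepted, which matches the hunk: claiming is simply skipped later when no token is set.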

@@ -365,4 +394,18 @@ if [ $? -eq 0 ]; then
   fi
 else
   echo >&2 "NOTE: did not remove: ${TMPDIR}/netdata-latest.gz.run"
+  exit 1
 fi
+
+# --------------------------------------------------------------------------------------------------------------------
+
+if [ -n "${NETDATA_CLAIM_TOKEN}" ]; then
+  progress "Attempting to claim agent to ${NETDATA_CLAIM_URL}"
+  NETDATA_CLAIM_PATH=/opt/netdata/bin/netdata-claim.sh
+
+  if "${NETDATA_CLAIM_PATH}" -token="${NETDATA_CLAIM_TOKEN}" -rooms="${NETDATA_CLAIM_ROOMS}" -url="${NETDATA_CLAIM_URL}" ${NETDATA_CLAIM_EXTRA}; then
+    progress "Successfully claimed node"
+  else
+    run_failed "Unable to claim node, you must do so manually."
+  fi
+fi
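The claiming step above passes each value as a single `-name=value` argument. As an illustration of how such an argument list can be assembled in POSIX sh without arrays, here is a small sketch that builds it with `set --` and omits `-rooms=` when no rooms were given. `fake_claim` and `run_claim` are hypothetical names; `fake_claim` only prints what it receives and stands in for `netdata-claim.sh`, and the token, URL, and room values are placeholders.

```shell
#!/bin/sh
# Hypothetical stand-in for netdata-claim.sh: prints each argument in brackets.
fake_claim() {
  for a in "$@"; do
    printf '[%s]\n' "$a"
  done
}

run_claim() {
  token="${1}"
  url="${2}"
  rooms="${3}"

  # Rebuild the positional parameters as the argument list for the claim call.
  set -- "-token=${token}" "-url=${url}"
  if [ -n "${rooms}" ]; then
    set -- "$@" "-rooms=${rooms}"
  fi

  # Quoting "$@" keeps every value as exactly one argument, spaces included.
  fake_claim "$@"
}

run_claim "example-token" "https://claim.example" "room 1,room2"
```

Each bracketed line in the output corresponds to one argument, so a value containing a space (as in the placeholder room list) still arrives intact.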
