Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for claiming nodes as part of installation. #10084

Merged
merged 13 commits into from Mar 8, 2021

Conversation

Ferroin
Copy link
Member

@Ferroin Ferroin commented Oct 16, 2020

Summary

This adds four new options to the kickstart.sh and kickstart-static64.sh scripts:

  • --claim-token
  • --claim-rooms
  • --claim-url
  • --claim-proxy

These directly correspond to the -token, -rooms, -url, and -proxy options for the netdata-claim.sh script. They have the following associated logic:

  • If any are specified and the --disable-cloud option is also specified, we bail and tell the user to either enable the cloud or
    remove the claiming options.
  • If only one of the token and url options is specified, or the room or proxy options are specified without both of the token or url options, bail early and tell the user to pass a valid set of options.
  • Otherwise, pass the required options to the netdata-claim.sh script after the installation is otherwise complete.

This allows users to directly claim the agent as part of the install, which is useful for automated installation scenarios.

Component Name

area/packaging

Test Plan

This can be verified by using the above-mentioned options with a fresh install to claim nodes as they are installed. Only the modified kickstart scripts are needed for this, they will work with the existing stable and nightly builds.

Additional Information

Fixes: #8819

This also fixes some buggy behavior in the kickstart scripts’ option parsing code that would cause them to behave differently depending on option order.

@github-actions github-actions bot added the area/packaging Packaging and operating systems support label Oct 16, 2020
knatsakis
knatsakis previously approved these changes Oct 19, 2020
netdata-installer.sh Outdated Show resolved Hide resolved
@Ferroin Ferroin changed the base branch from master to develop October 19, 2020 11:51
@Ferroin Ferroin changed the base branch from develop to master October 20, 2020 11:28
@Ferroin Ferroin marked this pull request as draft October 26, 2020 17:27
@Ferroin Ferroin force-pushed the installer-claiming branch 2 times, most recently from 1295d4b to 9112ce3 Compare October 28, 2020 11:39
@Ferroin Ferroin marked this pull request as ready for review October 28, 2020 13:11
manos-saratsis
manos-saratsis previously approved these changes Oct 28, 2020
manos-saratsis
manos-saratsis previously approved these changes Nov 2, 2020
Copy link

@manos-saratsis manos-saratsis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ferroin I tested the happy path both for installation through kickstart and through building the agent. It worked fine. I did not test corner cases plus have not tested it without setting the War Room parameter.

I have not reviewed the code.

knatsakis
knatsakis previously approved these changes Nov 6, 2020
@Ferroin
Copy link
Member Author

Ferroin commented Nov 13, 2020

Rebased to fix the conflict in netdata-installer.sh, no functional changes to the code.

knatsakis
knatsakis previously approved these changes Nov 15, 2020
@Ferroin
Copy link
Member Author

Ferroin commented Feb 26, 2021

@Ferroin last time I test this i have problem mentioned above and also was in direct contact with you over Slack (then put it here as well for the record). Was this solved?

As I understood at the time, the issue was how you were using the local-files option, not the code here, but you were’t getting an error message because of how the option parsing is handled (we can’t easily differentiate between parameters for an option and subsequent options).

The full --local-files option syntax is --local-files <netdata-source-tarball> <tarball-checksum-file> <go.d-plugin> <go.d.plugin-config-tarball> <install-required-packages.sh>.

We really need to just split this out as a set of options instead of one option with five parameters, and probably make everything optional (and add support for all the other stuff we would need local copies of...).

@underhood
Copy link
Contributor

@Ferroin I see so this is a problem but not directly related to this PR.... Do you have a test plan here then? If I remember correctly kickstart here downloaded claim script from the master but this PR modifies the claim script...

@Ferroin
Copy link
Member Author

Ferroin commented Feb 26, 2021

@Ferroin I see so this is a problem but not directly related to this PR.... Do you have a test plan here then? If I remember correctly kickstart here downloaded claim script from the master but this PR modifies the claim script...

If you provide the full set of local files with that option, then it will use the claiming script in the provided source tarball.

So the general test plan is to generate a source tarball from this PR using make dist, prepare the correct checksum for that (just dump the output of sha256sum run on the tarball into a text file), and then use those with the --local-files option (with the regular go.d files pulled from the latest release and the install-required-packages.sh file from this PR or the master branch) with the kickstart script from this PR.

@underhood
Copy link
Contributor

underhood commented Feb 26, 2021

thx, I will retest that today then. Also, we should make a note to fix that option parsing (in future PR)...

@Ferroin
Copy link
Member Author

Ferroin commented Feb 26, 2021

thx, I will retest that today then. Also, we should make a note to fix that option parsing (in future PR)...

I’ve opened an issue to just redo the whole handling of local files. It’s a major pain point for some users, and the current implementation is horribly broken in a number of other ways.

underhood
underhood previously approved these changes Feb 26, 2021
@@ -25,6 +25,7 @@ if [ -n "${NETDATA_CLAIM_URL}" ] && [ -n "${NETDATA_CLAIM_TOKEN}" ] && [ ! -f /v
-url "${NETDATA_CLAIM_URL}" \
${NETDATA_CLAIM_ROOMS:+-rooms "${NETDATA_CLAIM_ROOMS}"} \
${NETDATA_CLAIM_PROXY:+-proxy "${NETDATA_CLAIM_PROXY}"}
-daemon-not-running
Copy link
Member

@ilyam8 ilyam8 Feb 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. That condition is not correct

https://github.com/Ferroin/netdata/blob/b8ff8fbed3a2c1e670af64a668552069b3071289/packaging/docker/run.sh#L23-L24

i dont know what is /var/lib/netdata/claim.d/claimed_id. Perhaps some old dir?

  1. There is no NETDATA_CLAIM_ROOMS check. Is it ok to have no rooms. Does it work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. No, just me using the wrong name.
  2. AIUI from discussions about how claiming works that I0ve had with a couple of people, we do not need to specify rooms. Not doing so claims the node, but does not add it to any of the rooms in the space, but it can be manually added later through Netdata Cloud.

@knatsakis knatsakis self-requested a review March 8, 2021 09:20
knatsakis
knatsakis previously approved these changes Mar 8, 2021
This adds four new options to the `netdata-installer.sh` script:

* `--claim-token`
* `--claim-rooms`
* `--claim-uri`
* `--claim-proxy`

These directly correspond to the `-token`, `-rooms`, `-uri`, and `-proxy`
options for the `netdata-claim.sh` script. They have the following
associated logic:

* If any are specified and the `--disable-cloud` option is also
  specified, we bail and tell the user to either enable the cloud or
  remove the claiming options.
* If only some but not all of the token, rooms, and uri options  are
  specified, we bail and tell the user that they must pass all three.
* If all three of the token, rooms, and uri are specified, we invoke the
  `netdata-claim.sh` script for the install itself as one of the last
  steps in the installation process, using the values passed to these
  options.

This allows users to directly claim the agent as part of the install,
which is useful for automated installation scenarios.
This makes us more future-proof.

The required changes also fix some buggy behavior in the option parsing
code in the kickstart scripts.
These are leftovers from an earlier revision, they are not actually
needed.
This lets it reliably claim nodes which have not yet had the daemon run.

Also fixes a consistency issue in the claiming logic in the Docker
entrypoint.
@Ferroin Ferroin dismissed stale reviews from knatsakis and underhood via 932b4cc March 8, 2021 12:21
@Ferroin Ferroin merged commit 0a47bbd into netdata:master Mar 8, 2021
@Ferroin Ferroin deleted the installer-claiming branch March 8, 2021 13:12
stelfrag pushed a commit to stelfrag/netdata that referenced this pull request Mar 9, 2021
)

* Added support for claiming nodes as part of installation.

This adds four new options to the `netdata-installer.sh` script:

* `--claim-token`
* `--claim-rooms`
* `--claim-uri`
* `--claim-proxy`

These directly correspond to the `-token`, `-rooms`, `-uri`, and `-proxy`
options for the `netdata-claim.sh` script. They have the following
associated logic:

* If any are specified and the `--disable-cloud` option is also
  specified, we bail and tell the user to either enable the cloud or
  remove the claiming options.
* If only some but not all of the token, rooms, and uri options  are
  specified, we bail and tell the user that they must pass all three.
* If all three of the token, rooms, and uri are specified, we invoke the
  `netdata-claim.sh` script for the install itself as one of the last
  steps in the installation process, using the values passed to these
  options.

This allows users to directly claim the agent as part of the install,
which is useful for automated installation scenarios.

* Add missing space as suggested by @knatsakis

* Properly handle installs in /.

* Properly handle unprefixed installs.

* Fix another spelling error in an option name.

* Properly fix option naming.

* Move claiming into kickstart script instead of netdata-installer.

This makes us more future-proof.

The required changes also fix some buggy behavior in the option parsing
code in the kickstart scripts.

* Fix checksums.

* Sanely handle the daemon not running during the claiming process.

* Silence incorrect shellcheck warning.

* Simplify condition as suggested by @vkalintiris.

* Clean up old changes that should not be here anymore.

These are leftovers from an earlier revision, they are not actually
needed.

* Add ID generation logic to the claiming script.

This lets it reliably claim nodes which have not yet had the daemon run.

Also fixes a consistency issue in the claiming logic in the Docker
entrypoint.
Saruspete pushed a commit to Saruspete/netdata that referenced this pull request Mar 17, 2021
)

* Added support for claiming nodes as part of installation.

This adds four new options to the `netdata-installer.sh` script:

* `--claim-token`
* `--claim-rooms`
* `--claim-uri`
* `--claim-proxy`

These directly correspond to the `-token`, `-rooms`, `-uri`, and `-proxy`
options for the `netdata-claim.sh` script. They have the following
associated logic:

* If any are specified and the `--disable-cloud` option is also
  specified, we bail and tell the user to either enable the cloud or
  remove the claiming options.
* If only some but not all of the token, rooms, and uri options  are
  specified, we bail and tell the user that they must pass all three.
* If all three of the token, rooms, and uri are specified, we invoke the
  `netdata-claim.sh` script for the install itself as one of the last
  steps in the installation process, using the values passed to these
  options.

This allows users to directly claim the agent as part of the install,
which is useful for automated installation scenarios.

* Add missing space as suggested by @knatsakis

* Properly handle installs in /.

* Properly handle unprefixed installs.

* Fix another spelling error in an option name.

* Properly fix option naming.

* Move claiming into kickstart script instead of netdata-installer.

This makes us more future-proof.

The required changes also fix some buggy behavior in the option parsing
code in the kickstart scripts.

* Fix checksums.

* Sanely handle the daemon not running during the claiming process.

* Silence incorrect shellcheck warning.

* Simplify condition as suggested by @vkalintiris.

* Clean up old changes that should not be here anymore.

These are leftovers from an earlier revision, they are not actually
needed.

* Add ID generation logic to the claiming script.

This lets it reliably claim nodes which have not yet had the daemon run.

Also fixes a consistency issue in the claiming logic in the Docker
entrypoint.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/docs area/packaging Packaging and operating systems support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for auto claiming a freshly installed NetAgent agent via kickstart
8 participants