Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for removal of agent configs #11

Merged
merged 4 commits into from
Nov 12, 2018
Merged

Conversation

itzg
Copy link
Contributor

@itzg itzg commented Nov 2, 2018

What

Implements https://jira.rax.io/browse/SALUS-61 by supporting the "remove" operation within a configuration instruction initiated by the Ambassador.

How

Refactored the code in the filebeat and telegraf agent code into the new handleContentConfigurationOp function in agents.go. The handling of the remove operation was also added into the combined method, which in turn removes the files named for the ID of the agent config.

The existing EnsureRunning function of SpecificAgentRunner was renamed to EnsureRunningState and generalized to support both starting and stopping of agents. Agents are stopped when no configs remain for the agent to run. FYI some agents, like telegraf, actually fail with an error when run without any input plugins configured.

I also refactored some handling of the agent process out from the specific agent handling code and into CommandHandler. That was mainly to allow for further mocking of that interface during unit tests of ProcessConfig and EnsureRunningState rather than relying on executing real processes, etc.

How to test

make test

Why

n/a

TODO

none

Copy link
Contributor

@jjbuchan jjbuchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go is still kinda foreign to me. Won't be able to a good review until I actually start contributing to this, but nothing obvious stood out.

Just left some random comments that I'll look at when online again or will chat to you.

EnsureRunning(ctx context.Context)
// EnsureRunningState is called after installation of an agent and after each call to ProcessConfig
// It must ensure the agent process is running if configs and executable are available
// It must also ensure that that the process is stopped if no configuration remains
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's interesting. By agent, I assume you mean telegraf or something?

It seems fair enough, the only thing I wonder is mostly for Windows cases, SCOM often monitors for processes stopping and will alert automatically. I believe this is unexpected service terminations rather than a clean stop, but maybe worth keeping in mind.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. One job of Envoy is to be an "agent manager" where telegraf and filebeat are the two agents supported so far.

This comment is referring to how the Envoy watches the agent's child process. The SCOM scenario is not quite related, but still an interesting one that we should implement with a general "process monitor".

return false
}
_, ok := err.(*noAppliedConfigsError)
return ok
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to ask about this. Does err.(*noAppliedConfigsError) only return true of false? seems odd if it does, and seems odd that ok is the variable name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a type assertion which can result in either one value (and panic, if the wrong type) or two values where the second is a boolean indicating if the type assertion was "ok" (hence the naming convention).

Go is at least consistent in that the map indexing operator also supports an optional, second boolean result value and the convention is "ok" there too.

@@ -113,7 +128,7 @@ func downloadExtractTarGz(outputPath, url string, exePath string) error {

_, err = io.Copy(file, tarReader)
if err != nil {
file.Close()
_ = file.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have to assign a variable anytime the function returns something?

is this just a best practice thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IntelliJ was flagging this as a warning because the returned error was not being considered. Since errors on closing files tend to be uninteresting, I took the cheesy solution of explicitly ignoring the returned value with a blank identifier.

// Returns true if the configuration was applied
func handleContentConfigurationOp(op *telemetry_edge.ConfigurationOp, configInstancePath string) bool {
switch op.GetType() {
case telemetry_edge.ConfigurationOp_CREATE, telemetry_edge.ConfigurationOp_MODIFY:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this how indentation is supposed to work with go switch statements? do we actually use a linter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than a linter I'd like for us to enable the "gofmt" option on commit. With that said, yeah, apparently this is the official indentation style for switch statements, but not what I personally prefer :).

case telemetry_edge.ConfigurationOp_REMOVE:
err := os.Remove(configInstancePath)
if err != nil {
if os.IsNotExist(err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to check what this does. is it seeing if the error returned is a does not exist error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Since error in Go is an opaque interface type the convention seems to be providing help functions like that one to test for the type of error. It's a little awkward but a consistent idiom.

@itzg
Copy link
Contributor Author

itzg commented Nov 12, 2018

@jjbuchan if you get a chance, a quick re-review of this would be good

@itzg
Copy link
Contributor Author

itzg commented Nov 12, 2018

@jjbuchan sorry for the re-review request earlier...now it's really ready for a re-review

@itzg itzg merged commit 97649be into master Nov 12, 2018
@itzg itzg deleted the itzg/support-config-remove branch November 12, 2018 22:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants