Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move health checks to kubernetes package and extract logging #1677

Merged
merged 8 commits into from
Sep 10, 2020

Conversation

ANeumann82
Copy link
Member

The CLI now shares code with the manager which used a simple log - this is a hotfix to not spam stdout in the CLI.

We need a better solution to handle logging in code that is used by CLI and the manager though.

What this PR does / why we need it:

Fixes #1672

Signed-off-by: Andreas Neumann aneumann@mesosphere.com

…is is a hotfix to not spam stdout in the CLI

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Copy link
Member

@kensipe kensipe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pkg/engine where health resides should NOT be accessible to the CLI. it is for the manager. clog is a client util for controlling verbosity which should not be used or introduced at the server.
and we shouldn't init-ing what was designed to be client libs in the manager.

The output of the server logs include date and time output and are directed to stderr... those are lost when using clog because the output experience of clog is for end users not for logs.

@kensipe
Copy link
Member

kensipe commented Sep 4, 2020

further analysis...

This file (being sub package to engine) as engine errors... what is the cli suppose to do with that and why?

It appears that all "log" messages are the result of a return... perhaps the calling code should be responsible for what to log... or a wrapping function (1 for client and 1 for server).

I would suggest

  1. moving health to a kubernetes packaging (not engine)
  2. remove engine specific errors
  3. if IsDeleted is to remain... perhaps call the package status. resulting in status.IsDeleted() and status.IsHealthy()
  4. provide a wrapper for the server and client with different logging
  • resource package looks like a kubernetes util package as well.

It makes sense that the client and the server will need kubernetes lib
It doesn't make sense that the client will make call invocations against engine code

@kensipe
Copy link
Member

kensipe commented Sep 4, 2020

the engine.ErrFatalExecution for the terminal job... that is a huge assumption that this code is always call from the engine... it is a mistake for it to be in the file IMO.

@zen-dog
Copy link
Contributor

zen-dog commented Sep 7, 2020

that is a huge assumption that this code is always call from the engine... it is a mistake for it to be in the file IMO.

It's rather the other way around - health.go was always internal to the execution engine. And then at some point kudo init started using this package more and more. This makes a lot of sense from the code-reusability perspective but requires some more work.

In the hindsight, it is lamentable that we use two different loggers on the client and server-side. I'd rather we would consolidate to ease code reuse.

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

# Conflicts:
#	pkg/engine/health/health.go
@ANeumann82
Copy link
Member Author

In the hindsight, it is lamentable that we use two different loggers on the client and server-side. I'd rather we would consolidate to ease code reuse.

Agreed - We have a single code base for both manager and cli, I think we should use a common logging solution.

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

# Conflicts:
#	pkg/engine/health/health.go
@zen-dog
Copy link
Contributor

zen-dog commented Sep 7, 2020

Fun fact: I ran into just now:

❯ ./bin/kubectl-kudo get operatorversions
2020/09/07 18:13:31 CRD operators.kudo.dev is now healthy
2020/09/07 18:13:31 CRD operatorversions.kudo.dev is now healthy
2020/09/07 18:13:31 CRD instances.kudo.dev is now healthy
List of current installed operatorversions in namespace "default":
.
├── child-0.0.1
├── dummy-0.1.0
├── first-operator-0.2.0
├── flink-0.2.1
├── flink-demo-0.1.5
├── kafka-1.2.0
├── parent-0.1.0
└── zookeeper-0.3.0

works great except for "CRD * is now healthy" part which looks weird 🤷

@ANeumann82
Copy link
Member Author

Some additional findings - maybe it'd be better to make a new issue out of this, but we have a discussion here already.

I've looked at klog again today, and why we're not using it, and I found no reason not to use it:

  • It doesn't have any big dependencies (the only thing I see in its go.mod file is the logr interface
  • It provides V level control
  • It provides logging to files (or stdout, or stderr)
  • We have to custom define the flags to expose to the user anyway, so it won't clutter or obfuscate the CLI
  • We can disable the "Log-Line prefix" per default for the CLI and enable it for the manager
  • We can use it in the manager and the CLI
  • We can (for the manager) use klog.CopyStandardLogTo to output go "log.Printf" calls to use klog
  • It has LoggingStats (which is nice for unit tests, as we can simply assert that nothing was logged)

So, we could either

  • Replace clog with klog everywhere - Most clean option, but quite a few changes
  • Use clog as a wrapper for klog - Not that many changes, but looks like a band-aid, we don't really need a wrapper

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
@ANeumann82 ANeumann82 changed the title Use clog in health.go Move health checks to kubernetes package and extract logging Sep 8, 2020
@ANeumann82
Copy link
Member Author

further analysis...

This file (being sub package to engine) as engine errors... what is the cli suppose to do with that and why?

It appears that all "log" messages are the result of a return... perhaps the calling code should be responsible for what to log... or a wrapping function (1 for client and 1 for server).

I would suggest

  1. moving health to a kubernetes packaging (not engine)
  2. remove engine specific errors
  3. if IsDeleted is to remain... perhaps call the package status. resulting in status.IsDeleted() and status.IsHealthy()
  4. provide a wrapper for the server and client with different logging
  • resource package looks like a kubernetes util package as well.

It makes sense that the client and the server will need kubernetes lib
It doesn't make sense that the client will make call invocations against engine code

I have refactored the code to look like this, the resource package is still in the engine, although it would make sense to move it in a different PR.

Copy link
Member

@kensipe kensipe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a lot of work here that is moving in the right direction...
It is great that a shared resource "status"/"health" is moved outside of the engine code path...
The shared resource needs to be agnostic to the calling code. It currently id not. it specifically has clog code in the health.go.

If we were to break this 1 project into 3 projects.... one might expect:

  1. cli project
  2. manager project
  3. shared project

one would also expect that

  1. cli has clog
  2. manager has a server logger
  3. shared code would either NOT log (allowing logging by calling code) or a shared logger (which we should avoid for now... it comes with it's own set of additional challenges)

@kensipe
Copy link
Member

kensipe commented Sep 8, 2020

after further review... it looks like perhaps that was 1 leftover mistake... all references except the 1 are removed... I will look into refactoring (see if the condition is captured) etc.

Signed-off-by: Ken Sipe <kensipe@gmail.com>
Copy link
Member

@kensipe kensipe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few non-breaking nits...
first kudos for:

  1. nice work on reducing the scope of toUnstructured!!! 👏
  2. nice teasing apart cli and manager code

nits:

  1. the v1beta1 -> kudoapi can we not add it here... it is a separate concern
  2. adding return var names would make it much easier to read...
  3. godocs for isHealthy would be much appreciated... especially around calling code expectations

nice work

}
}
return false, ""
return false, "", nil
}

func IsDeleted(client client.Client, discovery discovery.CachedDiscoveryInterface, objs []runtime.Object) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all these status APIs have changed to bool, msg, error except this one. It is odd to have an IsXYZ without a bool... or no error means yes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I wasn't fully sure if I should move this as well - It has a different signature, uses a client and fetches resources, works on a slice of objects, etc...

But I think it makes sense. I'll refactor it to work on a single object and adjust the signature to match the others.

objUnstructured := &unstructured.Unstructured{Object: unstructMap}
// IsHealthy returns whether an object is healthy and a corresponding message
// Must be implemented for each type.
func IsHealthy(obj runtime.Object) (bool, string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we add more godoc here... developer less familiar needs to read all code to understand you signature. rules would also be helpful... there is a lot of complexity here... rules like: is it possible for bool to be true with an error? I would think not. Is there a case of a string being empty other than error?

IMO it would also be helpful to provide name variables here which would help with context. (healthy bool, msg string, err error) is more readable to me... otherwise it is a hunt for what is the string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also mention that only certain common objects are checked for health. Unknown objects will be considered healthy by default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@nfnt nfnt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Some small suggestions plus the nits @kensipe mentioned.

objUnstructured := &unstructured.Unstructured{Object: unstructMap}
// IsHealthy returns whether an object is healthy and a corresponding message
// Must be implemented for each type.
func IsHealthy(obj runtime.Object) (bool, string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also mention that only certain common objects are checked for health. Unknown objects will be considered healthy by default.

pkg/kudoctl/kudoinit/manager/manager.go Outdated Show resolved Hide resolved
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
@ANeumann82 ANeumann82 merged commit 8dccafc into main Sep 10, 2020
@ANeumann82 ANeumann82 deleted the an/use-clog-in-health branch September 10, 2020 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unexpected "Log" messages at the CLI
4 participants