Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: loglevel=2 and loglevel=1 messages not getting logged (WAS: cluster up: builds do not get loglevel=2 by default) #9247

Closed
csrwng opened this issue Jun 9, 2016 · 42 comments · Fixed by #9332

Comments

@csrwng
Copy link
Contributor

csrwng commented Jun 9, 2016

Only glog.V(0).Info messages are printed out when running a build in a cluster that has been started by 'oc cluster up'. The default should be loglevel 2.

Version

v1.3.0-alpha.1-190

Steps To Reproduce

Start a new cluster with 'oc cluster up'
Start a new build 'oc new-app https://github.com/openshift/ruby-hello-world.git'
Examine build logs

Current Result

glog.V(1) messages from builder are not printed out

Expected Result

glog.V(1) and glog.V(2) messages are printed out

@csrwng
Copy link
Contributor Author

csrwng commented Jun 10, 2016

@bparees - so I found the difference is not 'cluster up', rather this pull: #8924
Right now 'cluster up' will use the latest images, while 'openshift start' will use the last released image as you showed me today.

Bottom line is that glog.V(>0) messages will no longer show up in the log by default. Which I think is the more correct behavior, since if you look at the build pod, it's getting passed a --loglevel=0 and prior to #8924 we were ignoring that.

@bparees
Copy link
Contributor

bparees commented Jun 10, 2016

hm, then we may need to revisit what's being logged at level "1" because a lot of those messages should always be seen.

@csrwng
Copy link
Contributor Author

csrwng commented Jun 14, 2016

For now we will change anything logged at level 1 or 2 to "0". Does that sound acceptable?

@csrwng csrwng changed the title cluster up: builds do not get loglevel=2 by default build: loglevel=2 and loglevel=1 messages not getting logged (WAS: cluster up: builds do not get loglevel=2 by default) Jun 14, 2016
@csrwng csrwng assigned jupierce and unassigned csrwng Jun 14, 2016
@bparees
Copy link
Contributor

bparees commented Jun 14, 2016

@csrwng it sounds like the right starting point for the PR. we can review what messages that ends up including in the PR.

@smarterclayton
Copy link
Contributor

Reopening this, see my comments in the PR but this undid the very careful work to hide a bunch of info that is not useful for most users.

Regular users of the platform don't need this info unless they ask for it.

@smarterclayton
Copy link
Contributor

smarterclayton commented Jun 18, 2016

Here's some rules of thumb for how to be a good logging agent that should define the foundation of what the builders should be logging

  1. Build log output is a UI. Every single word you print should be communicating useful info to the user.
  2. Don't tell users things they can't do anything about - you waste their attention on meaningless info, which hides critical info
  3. Never, ever, EVER print JSON output to a user who didn't ask for it
  4. Don't tell me the same thing twice in two different ways
  5. If you're about to do something long, tell me what you're doing before you start, and if it's very long give me a bit of progress.
  6. Keep in mind relative importance - if you output a line with 80 characters, and a line with 10 characters, most people are going to focus on the 80 character line... but if that 10 character line says "we failed", then the focus isn't in the right place.

For builds specifically, if you took a brand new programmer off the street (someone using node.js for less than a month) and you made them look at the log output, they shouldn't see anything they couldn't figure out as being related to what their goal is

Also, we have rules about what log goes at what levels (see the upstream kubernetes docs/devel/logging.md file).

  1. V(2) is the default and is what most users should see most of the time (this is your UI)
  2. V(1) is for an abbreviated version of V(2), omit anything that doesn't directly correspond to the action being taken (probably user output, success, or failed)
  3. V(0) shouldn't be used unless it's useful to every single user in every single scenario. Basically, errors.

Some stuff that got added back in and some reasons why it's bad:

The value of ALLOWED_UIDS is [1-]
The value of DROP_CAPS is [KILL,MKNOD,SETGID,SETUID,SYS_CHROOT]

Not useful for end users. Should be v(4).

Image "centos/php-56-centos7@sha256:51c3e2b08bd9fadefccd6ec42288680d6d7f861bdbfbd2d8d24960621e4e27f5" not available locally, pulling ...

It's great to say we're going to be pulling.... but most of the info there is meaningless to a normal user. The digest in particular is very problematic - it takes up 80 characters and will cause wrapping in most cases, yet conveys no real info. We might make an exception to show it in this case because then a user could go back and use it... but a user isn't actually going to be able to recreate the build locally, so the full digest only belongs at v(3) or above. At V(1), what info do we actually need to convey? Pulling image "<short summary>"....

Creating a new S2I builder with build config: "Builder Name:\t\t\tApache 2.4 with PHP 5.6\nBuilder Image:\t\t\tcentos/php-56-centos7@sha256:51c3e2b08bd9fadefccd6ec42288680d6d7f861bdbfbd2d8d24960621e4e27f5\nBuilder Image Version:\t\t186a9c8caca0ef9eec4125247979876bf2c937b4\nBuilder Base Version:\t\taac5e76\nSource:\t\t\t\tfile:///tmp/s2i-build607401544/upload/src\nOutput Image Tag:\t\texample/cakephp-mysql-example-1:0d661744\nEnvironment:\t\t\tOPENSHIFT_BUILD_NAME=cakephp-mysql-example-1,OPENSHIFT_BUILD_NAMESPACE=example,OPENSHIFT_BUILD_SOURCE=https://github.com/openshift/cakephp-ex.git\nIncremental Build:\t\tdisabled\nRemove Old Build:\t\tdisabled\nBuilder Pull Policy:\t\tif-not-present\nPrevious Image Pull Policy:\talways\nQuiet:\t\t\t\tdisabled\nLayered Build:\t\t\tdisabled\nWorkdir:\t\t\t/tmp/s2i-build607401544\nDocker NetworkMode:\t\tcontainer:4be62ce657ac063cd020206e83adfe4253d2f8aa12077168e466554d8befd46c\nDocker Endpoint:\t\tunix:///var/run/docker.sock\n"

No.

Downloading "https://github.com/openshift/cakephp-ex.git" ...
Cloning source from https://github.com/openshift/cakephp-ex.git

Saying the same thing two different ways. The first one is better. The second one is redundant and should not be printed at v2.

I0618 21:07:53.282588       1 install.go:123] DEPRECATED: Use .s2i/bin instead of .sti/bin
I0618 21:07:53.282883       1 install.go:251] Using "assemble" installed from "<source-dir>/.sti/bin/assemble"
I0618 21:07:53.282953       1 install.go:251] Using "run" installed from "image:///usr/libexec/s2i/run"
I0618 21:07:53.282973       1 install.go:251] Using "save-artifacts" installed from "image:///usr/libexec/s2i/save-artifacts"

This is just something that snuck in, but shouldn't be logged here. V(4), maybe V(3).

Using provided push secret for pushing 172.30.209.18:5000/example/cakephp-mysql-example:latest image

This is kind of useful, but unfortunately is not actionable by a user. How does knowing this help me to verify my build succeeded or failed?

@smarterclayton
Copy link
Contributor

Bumping the priority of this up to p1 - I'm trying to record a screencast for summit that is demonstrating how awesome and succinct we are from cluster up all the way to the UI, and so I need this fixed by Monday afternoon, because we aren't.

@smarterclayton
Copy link
Contributor

Also, the warning message we print about mismatched versions is not actionable by any user of the platform, and is just a debugging message for OpenShift developers. It should be at v(3). No one will ever be able to do anything about it who isn't a cluster admin.

@bparees
Copy link
Contributor

bparees commented Jun 18, 2016

Also, the warning message we print about mismatched versions is not actionable by any user of the platform, and is just a debugging message for OpenShift developers. It should be at v(3). No one will ever be able to do anything about it who isn't a cluster admin.

they can't do anything about it, but when they send us their "my build isn't working, here's the output" log, it's nice that we can immediately see that warning without having to go back to them and get more verbose logging. Warnings aren't necessarily for users to fix, they are for users to be aware that "something isn't right" and perhaps to consult their admin or support.

Afterall, that is not a message any normal users should be seeing if they've got a proper install.

@smarterclayton
Copy link
Contributor

they can't do anything about it, but when they send us their "my build isn't working, here's the output" log

Who is that? OpenShift Online users? When is this ever going to be actioned? Why does the exact commit matter? What if we release a new version of the master, but don't rev the sti-builder (which is what happens all the time)?

It might be useful rarely. It's not useful enough that it's worth clogging up user output with. It's also spurious the majority of the time it's shown. If you can reduce the false-positive rate significantly, I might agree it's worth keeping.

@bparees
Copy link
Contributor

bparees commented Jun 19, 2016

Who is that? OpenShift Online users? When is this ever going to be actioned? Why does the exact commit matter? What if we release a new version of the master, but don't rev the sti-builder (which is what happens all the time)?

we don't do releases w/o doing that, do we? So any OSE or origin release user should have matching levels. And online should also have matching levels.

I would only expect this to be showing up for people doing active openshift development, if it's showing up for users who did a proper install of a packaged release then i agree, something has gone wrong and it needs to be revisited.

@smarterclayton
Copy link
Contributor

OSE does not rev sit-builder in all cases. Origin does not rev sti-builder in all cases.

@bparees
Copy link
Contributor

bparees commented Jun 19, 2016

loglevel 3 it is, then.

@smarterclayton
Copy link
Contributor

Another alternative on a build error is to always print that if it's true - i.e. at the very end.

@smarterclayton
Copy link
Contributor

I kind of regret --build-loglevel now - users having to select a level is very unintuitive. I think we probably want to have a --verbose flag instead and set the level ourselves (to 3 or 4 or similar).

@jupierce
Copy link
Contributor

@smarterclayton @bparees Hopefully these two PRs cover what is required for Monday's screencast.

@smarterclayton
Copy link
Contributor

Can you paste the output in one of the PRs so we can look?

@jupierce
Copy link
Contributor

Sorry about the delay on the logs. I'm not able to get a clone of the current source-to-image repo running in my development environment. Will work with Ben.

@jupierce
Copy link
Contributor

s2i test was blocked briefly by openshift/source-to-image#520 . @bparees was able to get me going again. PR openshift/source-to-image#519 has now been accepted. Now working on integrating this into my openshift/origin dev environment.

@jupierce
Copy link
Contributor

Output at all log levels with new s2i and logging tweaks: https://gist.github.com/jupierce/303fa169792c2a646976d3ff097f4be8

Noticing a few printf %!q(MISSING) entries.

@bparees
Copy link
Contributor

bparees commented Jun 22, 2016

the MISSING items appear to be a bug in @smarterclayton's glog wrapping logic, because i don't see anything else wrong with:
glog.V(4).Infof("Starting S2I build from %s/%s BuildConfig ...", s.build.Namespace, s.build.Name)

and

glog.V(4).Infof("The value of %s is [%s]", api.DropCapabilities, dropCaps)

@smarterclayton
Copy link
Contributor

There should be no glog prefixes at v1 or v2...

@smarterclayton
Copy link
Contributor

Means some code is not using the right package and is logging its own errors (instead of returning errors).

@smarterclayton
Copy link
Contributor

Log level: 1
Executing: oc start-build test --follow --wait --build-loglevel=1
test-4
Using locally available image "centos/ruby-22-centos7:latest"
Using locally available image "centos/ruby-22-centos7:latest"
Using locally available image "centos/ruby-22-centos7:latest"

What is this?

I0622 03:10:14.696765       1 sti.go:142] Preparing to build myproject/test-4:c4a8059a
Downloading "https://github.com/openshift/origin" ...
E0622 03:10:24.792138       1 install.go:245] script "save-artifacts" found by the script URL handler, but failed to install: failed to retrieve https://raw.githubusercontent.com/openshift/origin/master/test/extended/testdata/build-secrets/s2i/save-artifacts, response code 404
I0622 03:10:24.792258       1 sti.go:151] Clean build will be performed
I0622 03:10:24.792272       1 sti.go:165] Running "assemble" in "myproject/test-4:c4a8059a"
I0622 03:10:24.792286       1 sti.go:319] No user environment provided (no environment file found in application sources)

Should have no glog prefix.

relative-secret1=secret1
relative-secret2=secret2
relative-secret3=secret3
secret1=secret1
secret2=secret2
secret3=secret3
I0622 03:10:26.137404       1 sti.go:319] No user environment provided (no environment file found in application sources)

This is not a message a user cares about.

I0622 03:10:27.564004       1 sti.go:296] Successfully built myproject/test-4:c4a8059a

What is this? Not useful at 2.

Pushing image 172.30.179.236:5000/myproject/test:latest ...
error: The build myproject/test-4 status is "Failed"

@bparees
Copy link
Contributor

bparees commented Jun 22, 2016

@smarterclayton most of s2i is still using glog, you only changed the docker package w/ your wrapper pr.

@smarterclayton
Copy link
Contributor

Ok. So something else that needs to get fixed.

@bparees
Copy link
Contributor

bparees commented Jun 22, 2016

@jupierce
Copy link
Contributor

Sure thing.

@jupierce
Copy link
Contributor

jupierce commented Jun 23, 2016

s2i to use utilglog in openshift/source-to-image#524

@jupierce
Copy link
Contributor

Integrated fix for Missing Infof parameters into PR #9428 . The utilglog package is duplicated in source-to-image and origin (I fixed the problem in both places). Would be make sense to make this a Godep?

@bparees
Copy link
Contributor

bparees commented Jun 23, 2016

Would be make sense to make this a Godep?

probably not, we'd either have to either:

  1. create a separate repo for just the utilglog package (ugh)
    or
  2. make the origin code just use the utilglog package directly out of s2i (ugh-ish)

so while duplicating it also kinda sucks, for now i think we're going to stick with that.

@smarterclayton
Copy link
Contributor

SGTM, least bad of all options.

On Thu, Jun 23, 2016 at 5:30 PM, Ben Parees notifications@github.com
wrote:

Would be make sense to make this a Godep?

probably not, we'd either have to either:

  1. create a separate repo for just the utilglog package (ugh)
    or
  2. make the origin code just use the utilglog package directly out of s2i
    (ugh-ish)

so while duplicating it also kinda sucks, for now i think we're going to
stick with that.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#9247 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ABG_p7RLkM7myimtrt3IMRrCWsEn2gcpks5qOvsTgaJpZM4Ix7uM
.

@csrwng
Copy link
Contributor Author

csrwng commented Jun 24, 2016

why not pass in a logger in the s2i config? If no logger is passed then s2i can use a default one that's just a glog passthrough? We also need s2i to display the same image pull progress that we have in origin. I don't think the solution is to copy the image progress class to s2i as well. I'd rather pass a writer in the config to use for the pull call.

@csrwng csrwng closed this as completed Jun 24, 2016
@csrwng csrwng reopened this Jun 24, 2016
@bparees
Copy link
Contributor

bparees commented Jun 24, 2016

why not pass in a logger in the s2i config? If no logger is passed then s2i can use a default one that's just a glog passthrough?

not sure what problem you're trying to solve. we want passthrough behavior in both openshift and s2i. s2i has to be able to run w/o openshift present, so passing something from openshift to s2i doesn't help. Same for the pull progress implementation.

@csrwng
Copy link
Contributor Author

csrwng commented Jun 24, 2016

Repeat for image progress.

@csrwng
Copy link
Contributor Author

csrwng commented Jun 24, 2016

Why do we want the logging behavior to be the same in s2i standalone as when it's called from origin? It's a different use case.

@csrwng
Copy link
Contributor Author

csrwng commented Jun 24, 2016

For example, image progress for standalone s2i can be the default docker progress because most likely it's going to be output to a terminal, and it's not necessarily a bad experience.

@bparees
Copy link
Contributor

bparees commented Jun 24, 2016

we can discuss that for a followup. right now this p1 has been open for a week and it's easily solved by updating the existing s2i code to use the existing s2i log wrapper implementation with the correct log levels, so i'd like to get that landed before we start re-architecting things. @smarterclayton is talking about ripping glog out entirely anyway, so this isn't the right place to try to invent the future of our logging structure.

@jupierce
Copy link
Contributor

@csrwng @bparees
Should this be closed now?

@bparees
Copy link
Contributor

bparees commented Jul 11, 2016

@jupierce i've lost track. all the s2i and origin changes to make us log basic useful stuff at level 0 are now merged/bumped/etc?

@jupierce
Copy link
Contributor

jupierce commented Jul 11, 2016

@bparees Yes. Just wasn't sure if this has satisfied all parties concerned.

@bparees
Copy link
Contributor

bparees commented Jul 11, 2016

i'm satisfied :)

@bparees bparees closed this as completed Jul 11, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants