build: loglevel=2 and loglevel=1 messages not getting logged (WAS: cluster up: builds do not get loglevel=2 by default) #9247

csrwng · 2016-06-09T12:46:04Z

Only glog.V(0).Info messages are printed out when running a build in a cluster that has been started by 'oc cluster up'. The default should be loglevel 2.

Version

v1.3.0-alpha.1-190

Steps To Reproduce

Start a new cluster with 'oc cluster up'
Start a new build 'oc new-app https://github.com/openshift/ruby-hello-world.git'
Examine build logs

Current Result

glog.V(1) messages from builder are not printed out

Expected Result

glog.V(1) and glog.V(2) messages are printed out

csrwng · 2016-06-10T03:34:26Z

@bparees - so I found the difference is not 'cluster up', rather this pull: #8924
Right now 'cluster up' will use the latest images, while 'openshift start' will use the last released image as you showed me today.

Bottom line is that glog.V(>0) messages will no longer show up in the log by default. Which I think is the more correct behavior, since if you look at the build pod, it's getting passed a --loglevel=0 and prior to #8924 we were ignoring that.

bparees · 2016-06-10T03:38:15Z

hm, then we may need to revisit what's being logged at level "1" because a lot of those messages should always be seen.

csrwng · 2016-06-14T13:37:10Z

For now we will change anything logged at level 1 or 2 to "0". Does that sound acceptable?

bparees · 2016-06-14T14:39:38Z

@csrwng it sounds like the right starting point for the PR. we can review what messages that ends up including in the PR.

smarterclayton · 2016-06-18T21:13:58Z

Reopening this, see my comments in the PR but this undid the very careful work to hide a bunch of info that is not useful for most users.

Regular users of the platform don't need this info unless they ask for it.

smarterclayton · 2016-06-18T21:29:50Z

Here's some rules of thumb for how to be a good logging agent that should define the foundation of what the builders should be logging

Build log output is a UI. Every single word you print should be communicating useful info to the user.
Don't tell users things they can't do anything about - you waste their attention on meaningless info, which hides critical info
Never, ever, EVER print JSON output to a user who didn't ask for it
Don't tell me the same thing twice in two different ways
If you're about to do something long, tell me what you're doing before you start, and if it's very long give me a bit of progress.
Keep in mind relative importance - if you output a line with 80 characters, and a line with 10 characters, most people are going to focus on the 80 character line... but if that 10 character line says "we failed", then the focus isn't in the right place.

For builds specifically, if you took a brand new programmer off the street (someone using node.js for less than a month) and you made them look at the log output, they shouldn't see anything they couldn't figure out as being related to what their goal is

Also, we have rules about what log goes at what levels (see the upstream kubernetes docs/devel/logging.md file).

V(2) is the default and is what most users should see most of the time (this is your UI)
V(1) is for an abbreviated version of V(2), omit anything that doesn't directly correspond to the action being taken (probably user output, success, or failed)
V(0) shouldn't be used unless it's useful to every single user in every single scenario. Basically, errors.

Some stuff that got added back in and some reasons why it's bad:

The value of ALLOWED_UIDS is [1-]
The value of DROP_CAPS is [KILL,MKNOD,SETGID,SETUID,SYS_CHROOT]

Not useful for end users. Should be v(4).

Image "centos/php-56-centos7@sha256:51c3e2b08bd9fadefccd6ec42288680d6d7f861bdbfbd2d8d24960621e4e27f5" not available locally, pulling ...

It's great to say we're going to be pulling.... but most of the info there is meaningless to a normal user. The digest in particular is very problematic - it takes up 80 characters and will cause wrapping in most cases, yet conveys no real info. We might make an exception to show it in this case because then a user could go back and use it... but a user isn't actually going to be able to recreate the build locally, so the full digest only belongs at v(3) or above. At V(1), what info do we actually need to convey? Pulling image "<short summary>"....

Creating a new S2I builder with build config: "Builder Name:\t\t\tApache 2.4 with PHP 5.6\nBuilder Image:\t\t\tcentos/php-56-centos7@sha256:51c3e2b08bd9fadefccd6ec42288680d6d7f861bdbfbd2d8d24960621e4e27f5\nBuilder Image Version:\t\t186a9c8caca0ef9eec4125247979876bf2c937b4\nBuilder Base Version:\t\taac5e76\nSource:\t\t\t\tfile:///tmp/s2i-build607401544/upload/src\nOutput Image Tag:\t\texample/cakephp-mysql-example-1:0d661744\nEnvironment:\t\t\tOPENSHIFT_BUILD_NAME=cakephp-mysql-example-1,OPENSHIFT_BUILD_NAMESPACE=example,OPENSHIFT_BUILD_SOURCE=https://github.com/openshift/cakephp-ex.git\nIncremental Build:\t\tdisabled\nRemove Old Build:\t\tdisabled\nBuilder Pull Policy:\t\tif-not-present\nPrevious Image Pull Policy:\talways\nQuiet:\t\t\t\tdisabled\nLayered Build:\t\t\tdisabled\nWorkdir:\t\t\t/tmp/s2i-build607401544\nDocker NetworkMode:\t\tcontainer:4be62ce657ac063cd020206e83adfe4253d2f8aa12077168e466554d8befd46c\nDocker Endpoint:\t\tunix:///var/run/docker.sock\n"

No.

Downloading "https://github.com/openshift/cakephp-ex.git" ...
Cloning source from https://github.com/openshift/cakephp-ex.git

Saying the same thing two different ways. The first one is better. The second one is redundant and should not be printed at v2.

I0618 21:07:53.282588       1 install.go:123] DEPRECATED: Use .s2i/bin instead of .sti/bin
I0618 21:07:53.282883       1 install.go:251] Using "assemble" installed from "<source-dir>/.sti/bin/assemble"
I0618 21:07:53.282953       1 install.go:251] Using "run" installed from "image:///usr/libexec/s2i/run"
I0618 21:07:53.282973       1 install.go:251] Using "save-artifacts" installed from "image:///usr/libexec/s2i/save-artifacts"

This is just something that snuck in, but shouldn't be logged here. V(4), maybe V(3).

Using provided push secret for pushing 172.30.209.18:5000/example/cakephp-mysql-example:latest image

This is kind of useful, but unfortunately is not actionable by a user. How does knowing this help me to verify my build succeeded or failed?

smarterclayton · 2016-06-18T21:30:31Z

Bumping the priority of this up to p1 - I'm trying to record a screencast for summit that is demonstrating how awesome and succinct we are from cluster up all the way to the UI, and so I need this fixed by Monday afternoon, because we aren't.

smarterclayton · 2016-06-18T22:15:06Z

Also, the warning message we print about mismatched versions is not actionable by any user of the platform, and is just a debugging message for OpenShift developers. It should be at v(3). No one will ever be able to do anything about it who isn't a cluster admin.

bparees · 2016-06-18T23:39:01Z

Also, the warning message we print about mismatched versions is not actionable by any user of the platform, and is just a debugging message for OpenShift developers. It should be at v(3). No one will ever be able to do anything about it who isn't a cluster admin.

they can't do anything about it, but when they send us their "my build isn't working, here's the output" log, it's nice that we can immediately see that warning without having to go back to them and get more verbose logging. Warnings aren't necessarily for users to fix, they are for users to be aware that "something isn't right" and perhaps to consult their admin or support.

Afterall, that is not a message any normal users should be seeing if they've got a proper install.

smarterclayton · 2016-06-19T00:13:24Z

they can't do anything about it, but when they send us their "my build isn't working, here's the output" log

Who is that? OpenShift Online users? When is this ever going to be actioned? Why does the exact commit matter? What if we release a new version of the master, but don't rev the sti-builder (which is what happens all the time)?

It might be useful rarely. It's not useful enough that it's worth clogging up user output with. It's also spurious the majority of the time it's shown. If you can reduce the false-positive rate significantly, I might agree it's worth keeping.

bparees · 2016-06-19T00:31:56Z

Who is that? OpenShift Online users? When is this ever going to be actioned? Why does the exact commit matter? What if we release a new version of the master, but don't rev the sti-builder (which is what happens all the time)?

we don't do releases w/o doing that, do we? So any OSE or origin release user should have matching levels. And online should also have matching levels.

I would only expect this to be showing up for people doing active openshift development, if it's showing up for users who did a proper install of a packaged release then i agree, something has gone wrong and it needs to be revisited.

smarterclayton · 2016-06-19T00:37:29Z

OSE does not rev sit-builder in all cases. Origin does not rev sti-builder in all cases.

bparees · 2016-06-19T00:50:19Z

loglevel 3 it is, then.

smarterclayton · 2016-06-19T00:52:02Z

Another alternative on a build error is to always print that if it's true - i.e. at the very end.

smarterclayton · 2016-06-19T00:52:57Z

I kind of regret --build-loglevel now - users having to select a level is very unintuitive. I think we probably want to have a --verbose flag instead and set the level ourselves (to 3 or 4 or similar).

jupierce · 2016-06-19T21:19:10Z

@smarterclayton @bparees Hopefully these two PRs cover what is required for Monday's screencast.

smarterclayton · 2016-06-21T16:05:22Z

Can you paste the output in one of the PRs so we can look?

jupierce · 2016-06-21T16:25:45Z

Sorry about the delay on the logs. I'm not able to get a clone of the current source-to-image repo running in my development environment. Will work with Ben.

jupierce · 2016-06-21T19:35:02Z

s2i test was blocked briefly by openshift/source-to-image#520 . @bparees was able to get me going again. PR openshift/source-to-image#519 has now been accepted. Now working on integrating this into my openshift/origin dev environment.

jupierce · 2016-06-22T03:15:32Z

Output at all log levels with new s2i and logging tweaks: https://gist.github.com/jupierce/303fa169792c2a646976d3ff097f4be8

Noticing a few printf %!q(MISSING) entries.

bparees · 2016-06-22T03:58:50Z

the MISSING items appear to be a bug in @smarterclayton's glog wrapping logic, because i don't see anything else wrong with:
glog.V(4).Infof("Starting S2I build from %s/%s BuildConfig ...", s.build.Namespace, s.build.Name)

and

glog.V(4).Infof("The value of %s is [%s]", api.DropCapabilities, dropCaps)

smarterclayton · 2016-06-22T04:16:33Z

There should be no glog prefixes at v1 or v2...

smarterclayton · 2016-06-22T04:17:00Z

Means some code is not using the right package and is logging its own errors (instead of returning errors).

smarterclayton · 2016-06-22T04:18:22Z

Log level: 1
Executing: oc start-build test --follow --wait --build-loglevel=1
test-4
Using locally available image "centos/ruby-22-centos7:latest"
Using locally available image "centos/ruby-22-centos7:latest"
Using locally available image "centos/ruby-22-centos7:latest"

What is this?

I0622 03:10:14.696765       1 sti.go:142] Preparing to build myproject/test-4:c4a8059a
Downloading "https://github.com/openshift/origin" ...
E0622 03:10:24.792138       1 install.go:245] script "save-artifacts" found by the script URL handler, but failed to install: failed to retrieve https://raw.githubusercontent.com/openshift/origin/master/test/extended/testdata/build-secrets/s2i/save-artifacts, response code 404
I0622 03:10:24.792258       1 sti.go:151] Clean build will be performed
I0622 03:10:24.792272       1 sti.go:165] Running "assemble" in "myproject/test-4:c4a8059a"
I0622 03:10:24.792286       1 sti.go:319] No user environment provided (no environment file found in application sources)

Should have no glog prefix.

relative-secret1=secret1
relative-secret2=secret2
relative-secret3=secret3
secret1=secret1
secret2=secret2
secret3=secret3
I0622 03:10:26.137404       1 sti.go:319] No user environment provided (no environment file found in application sources)

This is not a message a user cares about.

I0622 03:10:27.564004       1 sti.go:296] Successfully built myproject/test-4:c4a8059a

What is this? Not useful at 2.

Pushing image 172.30.179.236:5000/myproject/test:latest ...
error: The build myproject/test-4 status is "Failed"

bparees · 2016-06-22T04:45:18Z

@smarterclayton most of s2i is still using glog, you only changed the docker package w/ your wrapper pr.

smarterclayton · 2016-06-22T13:19:27Z

Ok. So something else that needs to get fixed.

bparees · 2016-06-22T15:03:15Z

@jupierce can you please update the rest of s2i to use the utilglog wrapper as seen here:
https://github.com/openshift/source-to-image/blob/master/pkg/docker/util.go#L16
https://github.com/openshift/source-to-image/blob/master/pkg/docker/util.go#L22

jupierce · 2016-06-22T16:18:00Z

Sure thing.

jupierce · 2016-06-23T19:38:55Z

s2i to use utilglog in openshift/source-to-image#524

jupierce · 2016-06-23T21:07:45Z

Integrated fix for Missing Infof parameters into PR #9428 . The utilglog package is duplicated in source-to-image and origin (I fixed the problem in both places). Would be make sense to make this a Godep?

bparees · 2016-06-23T21:30:57Z

Would be make sense to make this a Godep?

probably not, we'd either have to either:

create a separate repo for just the utilglog package (ugh)
or
make the origin code just use the utilglog package directly out of s2i (ugh-ish)

so while duplicating it also kinda sucks, for now i think we're going to stick with that.

smarterclayton · 2016-06-23T23:38:18Z

SGTM, least bad of all options.

On Thu, Jun 23, 2016 at 5:30 PM, Ben Parees notifications@github.com
wrote:

Would be make sense to make this a Godep?

probably not, we'd either have to either:

create a separate repo for just the utilglog package (ugh)
or

make the origin code just use the utilglog package directly out of s2i
(ugh-ish)

so while duplicating it also kinda sucks, for now i think we're going to
stick with that.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#9247 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ABG_p7RLkM7myimtrt3IMRrCWsEn2gcpks5qOvsTgaJpZM4Ix7uM
.

csrwng · 2016-06-24T02:21:56Z

why not pass in a logger in the s2i config? If no logger is passed then s2i can use a default one that's just a glog passthrough? We also need s2i to display the same image pull progress that we have in origin. I don't think the solution is to copy the image progress class to s2i as well. I'd rather pass a writer in the config to use for the pull call.

bparees · 2016-06-24T03:18:15Z

why not pass in a logger in the s2i config? If no logger is passed then s2i can use a default one that's just a glog passthrough?

not sure what problem you're trying to solve. we want passthrough behavior in both openshift and s2i. s2i has to be able to run w/o openshift present, so passing something from openshift to s2i doesn't help. Same for the pull progress implementation.

csrwng · 2016-06-24T04:08:14Z

Add a field of a new interface, type Logger to s2i.Config (https://github.com/openshift/source-to-image/blob/master/pkg/api/types.go#L44).
In s2i's Strategy method (https://github.com/openshift/source-to-image/blob/master/pkg/build/strategies/strategies.go#L19), check for a nil Logger (case when s2i is run standalone), and provide a default implementation that knows nothing about origin's glog util. It could simply use the default glog Logger methods.
Modify all s2i code to use the Logger in the config for its logging.
In the builder, when creating the s2i.Config, provide a Logger implementation that is based on origin's glog utility (https://github.com/openshift/origin/blob/master/pkg/build/builder/sti.go#L169-L191).

Repeat for image progress.

csrwng · 2016-06-24T04:12:13Z

Why do we want the logging behavior to be the same in s2i standalone as when it's called from origin? It's a different use case.

csrwng · 2016-06-24T04:17:49Z

For example, image progress for standalone s2i can be the default docker progress because most likely it's going to be output to a terminal, and it's not necessarily a bad experience.

bparees · 2016-06-24T04:32:15Z

we can discuss that for a followup. right now this p1 has been open for a week and it's easily solved by updating the existing s2i code to use the existing s2i log wrapper implementation with the correct log levels, so i'd like to get that landed before we start re-architecting things. @smarterclayton is talking about ripping glog out entirely anyway, so this isn't the right place to try to invent the future of our logging structure.

jupierce · 2016-07-11T14:39:56Z

@csrwng @bparees
Should this be closed now?

bparees · 2016-07-11T15:11:48Z

@jupierce i've lost track. all the s2i and origin changes to make us log basic useful stuff at level 0 are now merged/bumped/etc?

jupierce · 2016-07-11T17:42:11Z

@bparees Yes. Just wasn't sure if this has satisfied all parties concerned.

bparees · 2016-07-11T18:07:16Z

i'm satisfied :)

csrwng mentioned this issue Jun 9, 2016

builders: simplified image progress reporting #9212

Merged

csrwng self-assigned this Jun 9, 2016

danmcp added priority/P2 component/build labels Jun 9, 2016

csrwng changed the title ~~cluster up: builds do not get loglevel=2 by default~~ build: loglevel=2 and loglevel=1 messages not getting logged (WAS: cluster up: builds do not get loglevel=2 by default) Jun 14, 2016

csrwng assigned jupierce and unassigned csrwng Jun 14, 2016

jupierce mentioned this issue Jun 14, 2016

need to dump build logs on any extended test failure #9235

Closed

jupierce mentioned this issue Jun 14, 2016

Changed glog V(1) and V(2) to V(0) #9332

Merged

openshift-bot closed this as completed in #9332 Jun 15, 2016

smarterclayton reopened this Jun 18, 2016

smarterclayton added priority/P1 and removed priority/P2 labels Jun 18, 2016

This was referenced Jun 19, 2016

Changing logging statments to Clayton's spec #9428

Merged

Changing messaging to Clayton's spec openshift/source-to-image#519

Merged

csrwng closed this as completed Jun 24, 2016

csrwng reopened this Jun 24, 2016

bparees closed this as completed Jul 11, 2016

build: loglevel=2 and loglevel=1 messages not getting logged (WAS: cluster up: builds do not get loglevel=2 by default) #9247

build: loglevel=2 and loglevel=1 messages not getting logged (WAS: cluster up: builds do not get loglevel=2 by default) #9247

Comments

csrwng commented Jun 9, 2016

Version

Steps To Reproduce

Current Result

Expected Result

csrwng commented Jun 10, 2016

bparees commented Jun 10, 2016

csrwng commented Jun 14, 2016

bparees commented Jun 14, 2016

smarterclayton commented Jun 18, 2016

smarterclayton commented Jun 18, 2016 • edited

smarterclayton commented Jun 18, 2016

smarterclayton commented Jun 18, 2016

bparees commented Jun 18, 2016

smarterclayton commented Jun 19, 2016

bparees commented Jun 19, 2016

smarterclayton commented Jun 19, 2016

bparees commented Jun 19, 2016

smarterclayton commented Jun 19, 2016

smarterclayton commented Jun 19, 2016

jupierce commented Jun 19, 2016

smarterclayton commented Jun 21, 2016

jupierce commented Jun 21, 2016

jupierce commented Jun 21, 2016

jupierce commented Jun 22, 2016

bparees commented Jun 22, 2016

smarterclayton commented Jun 22, 2016

smarterclayton commented Jun 22, 2016

smarterclayton commented Jun 22, 2016

bparees commented Jun 22, 2016

smarterclayton commented Jun 22, 2016

bparees commented Jun 22, 2016

jupierce commented Jun 22, 2016

jupierce commented Jun 23, 2016 • edited

jupierce commented Jun 23, 2016

bparees commented Jun 23, 2016

smarterclayton commented Jun 23, 2016

csrwng commented Jun 24, 2016

bparees commented Jun 24, 2016

csrwng commented Jun 24, 2016

csrwng commented Jun 24, 2016

csrwng commented Jun 24, 2016

bparees commented Jun 24, 2016

jupierce commented Jul 11, 2016

bparees commented Jul 11, 2016

jupierce commented Jul 11, 2016 • edited

bparees commented Jul 11, 2016

smarterclayton commented Jun 18, 2016 •

edited

jupierce commented Jun 23, 2016 •

edited

jupierce commented Jul 11, 2016 •

edited