Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Journald support #39

Merged
merged 4 commits into from
Feb 1, 2017
Merged

Conversation

Random-Liu
Copy link
Member

@Random-Liu Random-Liu commented Nov 10, 2016

Fixes #3.
Fixes #14.
This PR depends on google/cadvisor#1531. I'll remove the 3rd commit and update Godeps after the cadvisor patch is merged.

This PR finished the journald support.

Building sdjournal needs libsystemd-dev or libsystemd-journal-dev which may not be easy to get on some os distros. So I added a knob to disable it:

make # wil build journald support by default
ENABLE_JOURNALD=0 make # will not build journald support, thus no libsystemd needed

@adohe @dchen1107
/cc @kubernetes/sig-node


This change is Reviewable

@Random-Liu
Copy link
Member Author

@adohe Could you help me review this? :)

@Random-Liu
Copy link
Member Author

Random-Liu commented Nov 11, 2016

@adohe Wait, there is still something wrong. It seems that there is still something wrong with the statically linked binary.

@adohe-zz
Copy link
Contributor

@Random-Liu this cadvisor#1531 has merged, please remove the third commit and update Godep.

func getLogReader(cfg WatcherConfig) (io.ReadCloser, error) {
var errs []error
for name, getter := range logReaderGetters {
r, err := getter(cfg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the meaning of name here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do!

@adohe-zz
Copy link
Contributor

LGTM. @Random-Liu 👍

@Random-Liu
Copy link
Member Author

Random-Liu commented Nov 12, 2016

@adohe In fact there is a problem with current solution. go-systemd/sdjournal replies on the shared library libsystemd.so (See here), which could not be found inside the container. It is on the node with systemd installed. Even I've made the node-problem-detector binary statically linked, libsystemd.so can't be linked, because it's loaded at runtime with dlopen in the code.

There are several ways to solve this:

  • Option 1: Mount corresponding host path to make sure we can find the shared library with dlopen inside the container. => Problem: The shared library may be in different paths on different os distros.
  • Option 2: Add the shared library inside the container when building the image. => Problem: We are not sure whether the library will be compatible with systemd on the host, at least for systemd < 209 and systemd >= 209, they need different shared libraries.
  • Option 3: Mount and use journalctl binary directly, discard current solution.
    • Mount all necessary path and binary into the container. => Problem: I've tried this, and couldn't make it work till now.
    • Just mount the whole root fs as readonly inside the container, and run kernel monitor in a chroot. => I've tried this, and this at least works.

@adohe Do you have any thoughts? :)

@euank
Copy link
Contributor

euank commented Nov 14, 2016

@Random-Liu I have another option that solves that problem: Don't use journald! See #41 for that solution 😀

@adohe-zz
Copy link
Contributor

@euank as kernel monitor just cares about the kernel log message, access kmsg directly sounds like a good option.

@jfilak
Copy link

jfilak commented Nov 18, 2016

@Random-Liu What are the problems with Option 2? ABRT watches for kernel oopses in journald too. The ABRT container ships with own libsystemd and we haven't experienced any problem with backward compatibility.

Option 3 works for me on Fedora 24 (Atomic Host) with systemd-229-16.fc24.x86_64 and docker-1.10.3-54.gite03ddb8.fc24.x86_64:

docker run -it --privileged --rm -v /:/host fedora sh
journalctl -f -D /host/var/log/journal

@Random-Liu
Copy link
Member Author

Random-Liu commented Nov 19, 2016

Option 3 works for me on Fedora 24 (Atomic Host) with systemd-229-16.fc24.x86_64 and docker-1.10.3-54.gite03ddb8.fc24.x86_64

@jfilak Thanks for your information! I'm also using fedora now, because it has newer systemd (229) and supports +LZ which is at least needed in GCI.

My PR is almost ready. I'll send an update soon.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 20, 2016
@Random-Liu
Copy link
Member Author

Random-Liu commented Nov 20, 2016

@adohe @euank @jfilak Updated the PR.

  • The 1st commit added Plugin interface, and implemented syslog and journald plugins.
  • The 2nd commit updated Makefile to compile and build container image with the new code. Note that:
    • We added a build flag journald and a env ENALBE_JOURNALD to enable/disable building journald support. We do this because:
      • Compile go-systemd/sdjournal needs libsystemd-dev or libsystemd-journal-dev.
      • Container image become much bigger to support journald, because of the base image and extra dependencies.
    • We are using fedora as the base image now, because it has newer systemd (229) and supports +LZ which is at least required by GCI.
  • The 3rd commit updated the Godeps to include Fix a bug in tail. google/cadvisor#1531.
    • Note that cadvisor is using a different go-systemd version. This confused godeps. This PR has already included all dependencies, but in the future we may need to run godep save ./... twice when updating godeps. (This needs to be resolved, same with Unable to build code, missing godep #7)
  • The 4th commit updated the Readme.md to reflect the new change in kernelmonitor.

I've manually tested journald support of this PR, and tested file based log support with the NPD k8s e2e test.
We need a k8s e2e test testing the journald support.

@dchen1107
/cc @kubernetes/sig-node


# NOTE that enable journald will increase the image size.
ifeq ($(ENABLE_JOURNALD), 1)
# Enable journald bulid tag.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicking: a typo in the comment - s/bulid/build/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. Thanks!


container: node-problem-detector
cp Dockerfile.in Dockerfile
sed -i 's|BASEIMAGE|$(BASEIMAGE)|g' Dockerfile
sed -i 's|SETUP|$(SETUP)|g' Dockerfile
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicking: I would rather use @baseimage@ and @setup@ (Autotools dialect) in the Dockerfile file, so it would be much clearer that these strings will be replaced. And I usually create a solo rule for the file transformations as follows:

Dockerfile: Dockerfile.in
    sed -e 's|BASEIMAGE|$(BASEIMAGE)|g' \
        -e 's|SETUP|$(SETUP)|g' \
        $^ >$@

What is supposed to be the value of the SETUP variable?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: the Dockerfile target must be .PHONY target in your case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. Will do!

In fact SETUP can be removed. Initially I planed to use ubuntu, and needs a SETUP to download the libsystemd-dev. However, later I decided to use fedora because systemd on ubuntu doesn't support +LZ. Because systemd is already in fedora image, an extra SETUP is not needed.

test:
go test -v -race ./pkg/... $(BUILD_TAGS)

.PHONY: all
Copy link

@jfilak jfilak Nov 20, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question: Why do you make only the all target PHONY target? I believe that the other targets that should run unconditionally should be made PHONY targets too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because pushing to gcr.io/google_containers needs some special authentication. I don't think we want to do that every time we do make. :)

Copy link
Contributor

@euank euank Nov 21, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PHONY isn't about whether it happens every time, just about whether it treats the target name as a file to check the existence of (e.g. if you do touch clean; make clean it won't run the commands because it treats "clean" as a filename and sees that the target already exists... unless you make clean PHONY).

node-problem-detector isn't a PHONY necessarily, but everything else should be.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@euank Thanks for the clarification. I'm really a Makefile fresher. :)

Will review your PR and rebase this one.

@Random-Liu
Copy link
Member Author

Addressed comments. :)

Copy link
Contributor

@euank euank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it's intended that this plugin interface can work for kmsg.. but it couldn't.

I think it would be a cleaner isolation for both reading and parsing of the messages to be abstracted behind one interface, not have reading straddle the interface boundary.


PKG_SOURCES := $(shell find pkg -name '*.go')

node-problem-detector: ./bin/node-problem-detector

./bin/node-problem-detector: $(PKG_SOURCES) node_problem_detector.go
CGO_ENABLED=0 GOOS=linux godep go build -a -installsuffix cgo -ldflags '-w' -o node-problem-detector
CGO_ENABLED=$(CGO_ENABLED) GOOS=linux go build -o ./bin/node-problem-detector -a -ldflags '-w' $(BUILD_TAGS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we ever want CGO_ENABLED=0? Only distribution we support is the container, and it'll work fine in the container either way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the binary is dynamically linked, and there is no corresponding dynamic library in the container, it won't work, right?

test:
go test -timeout=1m -v -race ./pkg/... $(BUILD_TAGS)

.PHONY: all
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: shoulda been rebased out

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. :(


test:
go test -timeout=1m -v -race ./pkg/...
Dockerfile: Dockerfile.in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a .gitignore for Dockerfile?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't add Dockerfile to PHONY targets, users will be confused. Consider this scenario:

make
# The Dockerfile file didn't exist and make created it.
# @BASEIMAGE@ is Fedora and systemd-journal is enabled.
make ENABLE_JOURNALD=0
# The Dockerfile file already exists and make skipped the Dockerfile target. 
# However the user believes that @BASEIMAGE@ is set to "alpine:3.4" and
# systemd-journal is disabled.

Copy link
Member Author

@Random-Liu Random-Liu Nov 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfilak I see. You are right. Thanks for explaining. :)
Will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@andyxning andyxning Feb 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Random-Liu @jfilak

According to #39(discussion), we should also make ./bin/node-problem-detector target phony or otherwise make will think that ./bin/node-problem-detector is already exists and will not recompile it with journald disabled. WDYT.

continue
}
if line == "" {
time.Sleep(100 * time.Millisecond)
// Trim tailing `\n`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant comment imo.

Also, strings.TrimSpace would be fine I expect

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

)

// Plugin is the interface of a log parsing plugin.
type Plugin interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an interface that is just: StreamLogMessages() (<-chan *types.KernelMessage, error) would be better.

As is, this interface won't work for a kmsg plugin due to the reading abstraction. A read of the kmsg device has special semantics which are lost by the attempts to buffer and so on above.

Copy link
Member Author

@Random-Liu Random-Liu Nov 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The advantage of current interface is:

  • Less code change, because current code is acting as.
  • Plugin can share some code, such as the read loop.

However, delegating the whole logic to the plugin is actually more flexible:

  • We can support kmsg, which has different read logic.
  • We can use sdjournal api directly instead of the reader interface which has an extra buffer inside.

Comparing these 2 options, I prefer the later. :)

/cc @adohe

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the later too. kmsg has different read logical, which could be regarded as a new kernel log watcher, but not suitable for current buffer approach plugin.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@k8s-ci-robot
Copy link
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://github.com/kubernetes/kubernetes/wiki/CLA-FAQ to sign the CLA.

Once you've signed, please reply here (e.g. "I signed it!") and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 18, 2017
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jan 18, 2017
Copy link

@ankon ankon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried a build of this, but could not get it to work.

One thing that didn't work at all was the magic with /etc/localtime, and I ended up using an init container to create a symlink instead of trying to mount the file directly.

# ENABLE_JOURNALD enables build journald support or not. Building journald support needs libsystemd-dev
# or libsystemd-journal-dev.
# TODO(random-liu): Build NPD inside container.
ENABLE_JOURNALD?=1
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: spaces around the assignment operators missing to keep this in-sync with the other variables here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK. Will fix. Thanks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -30,14 +79,17 @@ version:
@echo $(VERSION)

./bin/node-problem-detector: $(PKG_SOURCES)
GOOS=linux go build -o bin/node-problem-detector \
CGO_ENABLED=$(CGO_ENABLED) GOOS=linux go build -o bin/node-problem-detector \
-ldflags '-w -extldflags "-static" -X $(PKG)/pkg/version.version=$(VERSION)' \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the comment above the "-static" here looks fishy, and probably should only be here when not building with journald support?

Copy link
Member

@andyxning andyxning Jan 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankon
Yes, you're right. -static should only be used when journald is disabled. For more info please consult the discussions about this review

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankon Good catch. This is introduced during last rebase. As is mentioned in #54 (review), we should not try to static link when journald is enabled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@Random-Liu
Copy link
Member Author

One thing that didn't work at all was the magic with /etc/localtime, and I ended up using an init container to create a symlink instead of trying to mount the file directly.

What problem did you hit?

@ankon
Copy link

ankon commented Jan 19, 2017

What problem did you hit?

It (not sure who exactly - docker, kubelet, kubernetes?) refused to start the pod, because it couldn't mkdir the directory /etc/localtime (ending in the usual CrashLoopBackOff behavior).

FWIW:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:57:05Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"1970-01-01T00:00:00Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"linux/amd64"}
$ minikube version
minikube version: v0.15.0
$ uname -r
4.9.3-200.fc25.x86_64

@Random-Liu
Copy link
Member Author

It (not sure who exactly - docker, kubelet, kubernetes?) refused to start the pod, because it couldn't mkdir the directory /etc/localtime (ending in the usual CrashLoopBackOff behavior).

Could you post the log here? In fact, I don't know whether npd works for minikube. :)

@dchen1107
Copy link
Member

LGTM

@dchen1107 dchen1107 added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 1, 2017
@dchen1107 dchen1107 merged commit b66c4df into kubernetes:master Feb 1, 2017
@dchen1107 dchen1107 mentioned this pull request Feb 1, 2017
@Random-Liu Random-Liu deleted the journald-support branch February 1, 2017 21:00
@Random-Liu
Copy link
Member Author

@dchen1107 Thanks a lot for reviewing! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants