[WIP] stage1/logging: introduce a new journal logging mode #3727
Can one of the admins verify this patch?
Thoughts on this? I'm trying to find a good way to allow app sidecars (containerized inside the pod) to act as log forwarders for the pod. There's the existing |
WIP. However, I'm still interested in design feedback and/or alternatives.
It is similar to the existing `log` mode that forwards all logs to systemd, with the difference that output will not also show up on the console (StandardOutput|Error=journal instead of journal+console). In addition to that, a read-only view of the pod journal will be bind mounted inside each app, so one or more apps can decide to forward output from the pod by reading that journal. Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
a2016bf to 92504f1
... in their respective stage2/rootfs/etc/machine-id paths. This allows apps to read the Pod journal when it is available. systemd has a similar mechanism with systemd-firstboot.service and systemd-machine-id-setup(1). The stage1 image could use that service, since nspawn is already passing the container_uuid[1] ENV var, but writing the machine-id files directly was simpler than bringing in a new systemd service/dependency. [1]: https://www.freedesktop.org/wiki/Software/systemd/ContainerInterface/ Signed-off-by: Fabio Kung <fabio.kung@gmail.com>
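A minimal sketch of how a machine ID could be derived from the pod UUID before being written to those paths. It assumes the ID is simply the canonical UUID with the dashes removed, giving the 32 lowercase hex characters that machine-id(5) expects. The helper name `machineIDFromUUID` and the sample UUID are hypothetical and are not taken from the patch.

```go
package main

import (
	"fmt"
	"strings"
)

// machineIDFromUUID (hypothetical helper) turns a canonical UUID string into
// machine-id form: 32 lowercase hexadecimal characters without dashes.
func machineIDFromUUID(uuid string) (string, error) {
	id := strings.ToLower(strings.ReplaceAll(uuid, "-", ""))
	if len(id) != 32 {
		return "", fmt.Errorf("unexpected machine ID length %d, want 32", len(id))
	}
	for _, c := range id {
		if !strings.ContainsRune("0123456789abcdef", c) {
			return "", fmt.Errorf("invalid character %q in machine ID", c)
		}
	}
	return id, nil
}

func main() {
	// Sample UUID made up for illustration only.
	id, err := machineIDFromUUID("6f1b0c2e-8d3a-4f57-9a10-2b7c4d5e6f80")
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

The file content would then be this string followed by a newline, as in the `machineIDBytes` line of the diff.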
It is unfinished, yes. The use case is to have a custom stdin/stdout/stderr handler for scenarios where we don't want to go through journald (e.g. custom log forwarders) or we need to attach/detach to apps (e.g. detachable interactive sessions). Those forwarders/attachers are in the host environment though, not in the pod itself.
Apart from the systemd version incompatibility, I'm not much of a fan of the bind mount, as it generically tries to sidestep and bend apps' FS separation for a very specific purpose. Also, you'll soon realize you can't distinguish stdout vs stderr lines. Some other options worth considering:
/cc @alban as they do similar stage1-dependency tricks and may also need some iottymux progress to get rid of journal2cri. |
machineIDBytes := append([]byte(machineID), '\n')
if err := ioutil.WriteFile(mPath, machineIDBytes, 0644); err != nil {
	return nil, nil, errwrap.Wrap(errors.New("error writing /etc/machine-id"), err)
}
if err := user.ShiftFiles([]string{mPath}, &p.UidRange); err != nil {
	return nil, nil, errwrap.Wrap(errors.New("error shifting /etc/machine-id"), err)
}
// also create /etc/machine-id inside each app, so they can read the pod journal when available
This seems useful standalone too, I would suggest you split it out into its own PR. Better to do a stat() before and skip it if present though, to avoid messing with the app rootfs.
Sure, I can split that into a different PR.
Doing the stat before would break some very common cases though. Unfortunately, many common base images have an /etc/machine-id baked in as an artifact of how they were built. E.g.:
rkt --insecure-options=image run docker://debian:jessie -- -c "cat /etc/machine-id"
[106023.438284] debian[5]: 5d12f049cbc39eb705e94d4b4fe4e580
Thanks for the feedback! I'm aware of the stdout/err separation issue with systemd's I'm also not as worried with the bind mounts because we do similar things for pod volumes shared across all apps. This would be a special (implicit?) case of that. As for the options you suggest, I considered some of them. Some more thoughts inline:
We run arbitrary code, and it would be pretty hard to enforce and/or prescribe that for all apps. It's a different avenue that we may explore at some point, but for now it is not very realistic.
I started this that way, but soon realized that customizing units generated by stage1-coreos is not trivial. It is currently done by the This may be a good area to focus on a different PR: extensibility of systemd units generated by stage1-coreos. In which case I'd be happy to maintain this as an alternative stage1 image, instead of baking it onto vanilla rkt if you are not interested in having apps reading from the pod journal.
Ordering on app start is something I will need to work on anyway, because even with journald, graceful shutdown of a Pod means we need to stop the log uploader sidecar last, and guarantee that all shutdown logs get shipped as well. My reservation about using the |
@fabiokung the following code adds some mounts via systemd units for each app without patching |
@alban ah, drop-ins on |
Closing this, I'll keep it as a separate stage1 image, and break up the |