-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: add support for journal-gatewayd #967
Conversation
@@ -234,23 +234,68 @@ func (a *Bootstrap) addSystemdUnits(uri string, templateData *bootstrapTemplateD | |||
} | |||
|
|||
for _, childInfo := range children { | |||
name := childInfo.Name() | |||
file, err := data.Assets.Open(path.Join(uri, name)) | |||
dir := path.Join(uri, childInfo.Name()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to add a unit test to test this part of the code?
[Service] | ||
ExecStart= | ||
ExecStart=/usr/lib/systemd/systemd-journal-gatewayd \ | ||
--key=/opt/openshift/tls/journal-gatewayd-ca.key \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we are using the same set of cert/key pair for server,client-authn and client-authz can we drop the -ca
@@ -357,6 +359,10 @@ func (a *Bootstrap) addParentFiles(dependencies asset.Parents) { | |||
dependencies.Get(asset) | |||
a.Config.Storage.Files = append(a.Config.Storage.Files, ignition.FilesFromAsset(rootDir, 0600, asset)...) | |||
} | |||
|
|||
journalCertKey := &tls.JournalCA{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this feels out of place in addSystemdUnits
it seems like it belongs here 907a22d#diff-2d9b655c8052ca2e3bdd9004778c75bcR357
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can't go there because the key and cert need to be readable by journal-gatewayd (which doesn't run as root). I should probably also change the owner of these files... I hope that works on RHCOS...
Can you pull 0bc0117 out into it's own PRs (and maybe split the auto-generated SVG rebuild into a separate commit?)? Those look like orthogonal changes that we can land without further discussion (once CI gets working again). |
unit := igntypes.Unit{Name: name, Contents: string(data)} | ||
if _, ok := enabled[name]; ok { | ||
unit.Enabled = util.BoolToPtr(true) | ||
if info.IsDir() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a check here that says "if the directory name doesn't end with .d
we ignore it" (possibly with a debug-level log)? From the systemd docs, it looks like they require all drop-in directories to have a .d
suffix. But I'm not all that familiar with systemd, maybe I'm just misreading the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to add a trace-level log line for that case. Trace should really only be used by developers which seems appropriate for this kind of message. Should I mention the "trace" level in the help output, or just keep that goodie for ourselves?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I mention the "trace" level in the help output, or just keep that goodie for ourselves?
I'd mention it. Users should recognize it as "the level after debug" even if they're non-devs, so I don't expect it will be too distracting.
docs/user/troubleshooting.md
Outdated
journalctl --unit=bootkube.service | ||
``` | ||
1. If SSH is available, the following command can be run on the bootstrap node: `journalctl --unit=bootkube.service` | ||
2. Regardless of whether or not SSH is available, the following command can be run: `curl --insecure --cert ${INSTALL_DIR}/tls/journal-gatewayd-client.crt --key ${INSTALL_DIR}/tls/journal-gatewayd-client.key 'https://${BOOTSTRAP_IP}:19531/entries?follow&_SYSTEMD_UNIT=bootkube.service'` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the installer connect to this automatically and stream it into the installer logs (like we do for Terraform) instead of polling for the API? I'm fine leaving that to follow-up work, but I don't think we need to document this manual approach if we're planning on adding a built-in approach in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. I'd prefer to always stream these logs, but I'm not sure how to handle the case where a network connection is not available. Do we just optimistically stream them and ignore it otherwise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... but I'm not sure how to handle the case where a network connection is not available.
We currently rely on network connections to the load balancers for our Kubernetes API and event waits. When those work, I expect this access to the bootstrap journal will usually work as well. When it doesn't (access from inside a restrictive firewall?), I'm fine logging a warning and falling back to the Kubernetes API wait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I once wrote a simple package to talk to gatewayd, in case you ever do want to do that: coreos/go-systemd#86
Also: Having an installer sub-command to that does the equivalent curl
would be pretty nice.
Let's see if the registry gateway issue has been resolved... |
@@ -205,7 +208,7 @@ func (a *Bootstrap) addStorageFiles(base string, uri string, templateData *boots | |||
} else { | |||
mode = 0600 | |||
} | |||
ign := ignition.FileFromBytes(strings.TrimSuffix(base, ".template"), mode, data) | |||
ign := ignition.FileFromBytes(strings.TrimSuffix(base, ".template"), "root", mode, data) | |||
if filename == ".bash_history" { | |||
ign.User = &igntypes.NodeUser{Name: ignitionUser} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could set a variable for the user name and override for .bash_history
before feeding into FileFromBytes
. But certainly not a blocker; we can always do something like that later if we want ;).
/retest |
That's not one I've seen before, but it sounds related to the recent registry failures we've seen. /retest |
This looks awesome! |
/retest |
This allows systemd drop-ins to be included in the bootstrap data directory. The drop-ins must be located under an appropriately-named directory, the same way that systemd requires on the host.
This gives the ability to specify the owner of the files. So far, only the root user is used, but support for other users will be needed in order to support journal-gatewayd. This service doesn't run as root, so in order to read its TLS assets, they either need to be world-readable or they need to be owned by the correct user.
The key for the root CA should not end up in the cluster. This is most easily prevented by ensuring that it doesn't end up on the bootstrap node at all. This change also makes the root CA certificate world-readable.
This allows the user to stream logs from the bootstrap node without the need for SSH. This read-only view into the bootstrap process is convenient for those who wish not to use SSH in their cluster. Right now, this requires the user to manually invoke curl to fetch the logs, but this could be done automatically by the installer in the future.
Rebased onto master. @wking can I get another one of those |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: abhinavdahiya, crawford The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@sferich888, there were some troubleshooting-doc changes here. |
@@ -0,0 +1,6 @@ | |||
[Service] | |||
ExecStart= | |||
ExecStart=/usr/lib/systemd/systemd-journal-gatewayd \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this service defined? Where is it's source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this service defined? Where is it's source.
The service this is a drop-in for? I'd guess in a package RHCOS pulls in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this was offered by the kubelet? And it's debug port.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is journal-gatewayd, which is included (at my request) in RHCOS.
@@ -0,0 +1,6 @@ | |||
[Service] | |||
ExecStart= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this line be dropped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All systemd directives act as lists and each entry adds another item to the list. Simple services (like journal-gatewayd) can only have a single item for ExecStart, so we have to clear the list with an empty entry.
This allows the user to stream logs from the bootstrap node without the
need for SSH. This read-only view into the bootstrap process is
convenient for those who wish not to use SSH in their cluster. Right
now, this requires the user to manually invoke curl to fetch the logs,
but this could be done automatically by the installer in the future.