Ignition - Remote Logging for Console-less cloud servers #2513

Open
pctj101 opened this Issue Oct 19, 2018 · 7 comments

pctj101 commented Oct 19, 2018

Issue Report

Feature Request

Enhance Ignition to log failures remotely, to aid in debugging problems encountered during boot.

Environment

AWS EC2 Servers which do NOT have a console and often do not have content in system logs available. Console scrollback is also not available.

Desired Feature

Add a configuration option in Ignition to log remotely or to serve logs on request. For example:
https://www.freedesktop.org/software/systemd/man/systemd-journal-gatewayd.html

Other Information

Because it is important that Ignition doesn't mask failures, we must lock up CoreOS and not allow SSH in for debugging (as that may be a non-obvious failure).

However, in AWS, we can't drop down to console to execute journalctl commands and thus failure logs are also masked.

Instead we could make journals remotely available by streaming them to an external journal collector or opening up a port to request journals for debugging.
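As a rough sketch of the "open up a port" option, here is how a client might pull journal entries from a host running systemd-journal-gatewayd. It assumes the gateway is reachable on its default port 19531 and that it is actually running at the point where Ignition fails (Ignition runs in the initramfs, so that is a big assumption). The unit name passed in the usage note is only an example.

```python
import json
import urllib.request
from urllib.parse import urlencode

GATEWAYD_PORT = 19531  # default port of systemd-journal-gatewayd


def journal_entries_url(host, unit=None, current_boot=True):
    """Build the URL for the gateway's /entries endpoint."""
    params = []
    if current_boot:
        params.append("boot")  # flag parameter: limit entries to the current boot
    if unit:
        # Field match, the same idea as `journalctl _SYSTEMD_UNIT=...`
        params.append(urlencode({"_SYSTEMD_UNIT": unit}))
    query = "&".join(params)
    return f"http://{host}:{GATEWAYD_PORT}/entries" + (f"?{query}" if query else "")


def fetch_journal(host, unit=None, timeout=10):
    """Fetch journal entries as a list of dicts (one JSON object per line)."""
    req = urllib.request.Request(
        journal_entries_url(host, unit),
        headers={"Accept": "application/json"},  # ask for JSON instead of plain text
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        lines = resp.read().decode("utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]
```

Something like `fetch_journal("10.0.0.5", unit="ignition-disks.service")` would then return the entries for that unit, provided the port is open in the security group.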

An example of a failure could be where the Ignition configuration is syntactically correct, but requests a nonexistent drive or network interface and thus fails. Perhaps it tries to format a drive label that doesn't exist, or uses a device path that is different on a new type of cloud server. These would all pass validation but fail to boot, leaving the sysadmin completely in the dark.
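To make the distinction concrete, here is a small sketch of a config that parses cleanly but names a device that may not exist on a given host. The JSON layout follows the 2.x config spec as I understand it (`storage.filesystems[].mount.device`), and the `DATA` label and the `missing_devices` helper are made up for illustration. They are not part of Ignition.

```python
import json
import os


def missing_devices(config_text, exists=os.path.exists):
    """Return filesystem device paths named in the config that are absent on this host."""
    config = json.loads(config_text)  # raises ValueError on a syntax error
    filesystems = config.get("storage", {}).get("filesystems", [])
    devices = [fs["mount"]["device"] for fs in filesystems if "mount" in fs]
    return [dev for dev in devices if not exists(dev)]


# Syntactically valid, but /dev/disk/by-label/DATA (a hypothetical label)
# only exists if some disk actually carries that label.
EXAMPLE = """
{
  "ignition": {"version": "2.2.0"},
  "storage": {
    "filesystems": [
      {"mount": {"device": "/dev/disk/by-label/DATA", "format": "ext4"}}
    ]
  }
}
"""
```

A check like this can only run on the target machine itself, which is why a validator run beforehand cannot catch this class of failure.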

dustymabe commented Oct 19, 2018

The only thing I can think of here would be to have Ignition try to mount a boot partition and store its logs there on failure. Then you could mount the disk on another instance and read them.

But really, I feel like viewing the serial console log should show the failure (assuming there exists a console=ttyS0 kernel cmdline arg).

Member

crawford commented Oct 19, 2018

I had opened a similar issue long ago (I cannot find it now). My intended use case was to support bare metal machines that don't have lights-out management. Writing the logs to a disk isn't always possible though (Container Linux supports disk-less boot), so the idea was to stream the logs elsewhere.

I'm curious about your environment, @pctj101. I didn't know AWS let you create instances without a console log. Which instance type are you using?

pctj101 commented Oct 21, 2018

I'll take another look tomorrow.

Perhaps I wasn't waiting long enough for the system logs to show up. So far, my console log is typically completely blank after boot. If it takes several minutes for AWS console logs to catch up, perhaps we should log to something a bit more... 'real-time'.

pctj101 commented Oct 22, 2018

I did find that AWS takes about 4 minutes to propagate console logs to the web UI. I'm not sure if there's a better way that doesn't make Ignition too heavyweight. What do you think?

Now that I've verified logs appear, this kind of falls solidly into the "nice to have" bucket (rather than "need to have")
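Given the delay, one workaround is simply to poll for the console output until it shows up. This is a minimal sketch, assuming `ec2` behaves like `boto3.client("ec2")` and that the response carries the console text under the `Output` key once it is available. The timeout and interval defaults are arbitrary.

```python
import time


def wait_for_console_output(ec2, instance_id, timeout=600, interval=15, sleep=time.sleep):
    """Poll until the instance has console output or the timeout elapses.

    `ec2` is expected to behave like boto3.client("ec2").
    Returns the console text, or None if nothing appeared in time.
    """
    waited = 0
    while True:
        response = ec2.get_console_output(InstanceId=instance_id)
        output = response.get("Output")  # key is absent while no output is available
        if output:
            return output
        if waited >= timeout:
            return None
        sleep(interval)
        waited += interval
```

With boto3 installed and credentials configured, `wait_for_console_output(boto3.client("ec2"), "i-...")` would block for up to ten minutes and return whatever the serial console has captured.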

ajeddeloh commented Oct 22, 2018

Ignition would need to learn to save its own logs (i.e. keep its own internal copy in addition to what is logged to the journal). It could then POST those logs, along with a success or failure status, to a URL provided in the ignition section of the config, whether the run succeeded or failed. We do something similar on Packet, which has a timeline you can post custom events to. I don't think that AWS has something similar. Thoughts?
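To illustrate the shape of such a report, here is a minimal sketch in Python (Ignition itself is written in Go, so this is purely illustrative). The payload fields `status` and `log`, and both function names, are assumptions. No such option or format exists in Ignition today.

```python
import json
import urllib.request


def build_report_request(url, succeeded, log_lines):
    """Build a POST request carrying the run status and the buffered log lines."""
    payload = {
        "status": "success" if succeeded else "failure",  # hypothetical field names
        "log": list(log_lines),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_report(url, succeeded, log_lines, timeout=10):
    """Send the report and return the HTTP status code."""
    request = build_report_request(url, succeeded, log_lines)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status
```

Keeping the request construction separate from the network call means the payload can be inspected without a server, and a failed POST should probably be logged and ignored so that reporting never masks the original error.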

pctj101 commented Oct 23, 2018

That's a cool thought. On AWS, CloudWatch is a reasonable place to POST data to, since that's where AWS seems to send all logs by default. Lambda functions, for example, log to CloudWatch and it's "almost real-time" (much less lag than console logs at least).

pctj101 commented Oct 23, 2018

I don't know the details, but perhaps it's something along this idea:
https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_PutLogEvents.html
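A rough sketch of what that call might look like, assuming `logs_client` behaves like `boto3.client("logs")` and that the log group and stream already exist. The helper names are made up. As I recall the API wants events in chronological order with millisecond timestamps and caps a batch at 10,000 events. The per-batch byte limit is not handled here.

```python
MAX_EVENTS_PER_BATCH = 10_000  # per-call event limit, as I recall from the API docs


def to_log_events(entries):
    """Convert (unix_seconds, message) pairs into a PutLogEvents logEvents list."""
    events = [
        {"timestamp": int(seconds * 1000), "message": message}  # API expects milliseconds
        for seconds, message in entries
    ]
    events.sort(key=lambda event: event["timestamp"])  # chronological order
    return events


def put_ignition_logs(logs_client, group, stream, entries):
    """Upload entries in batches; returns the number of events sent."""
    events = to_log_events(entries)
    for start in range(0, len(events), MAX_EVENTS_PER_BATCH):
        logs_client.put_log_events(
            logGroupName=group,
            logStreamName=stream,
            logEvents=events[start:start + MAX_EVENTS_PER_BATCH],
        )
    return len(events)
```

The instance would also need credentials (for example an instance profile) allowing `logs:PutLogEvents`, which is extra setup that a plain POST to a URL would not require.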
