Log hostname/actual IP for instances requesting ignition files on first boot #397

Closed
michaelgugino opened this issue Jul 7, 2020 · 5 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@michaelgugino
Contributor

A recurring problem for the MAO is that a machine is provisioned but never joins the cluster. Invariably, this is blamed on the machine-api because of a lack of understanding of how Ignition and the MCS/MCO work.

We need to track the hostname and/or private IP address of machines that request the first-boot Ignition file so we can identify where a failure has occurred. The MCS currently logs the IP addresses it sees at the HTTP connection level, but these usually belong to a VIP or a NAT and are not useful. The actual hostname and/or private IP should be captured in either the HTTP headers or the request payload, and we should track these requests in a way that lets them be easily associated with a particular Machine object. Machine objects know via the cloud providers what the hostname and/or IP addresses of an instance will be, so having the MCS report this information somewhere useful will allow us to build tooling that makes cluster-joining problems easier to diagnose.
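As a rough illustration of the proposal, the Python sketch below assumes the booting instance reports its identity in two request headers, and shows how such a record could be matched against the addresses a Machine object already knows. Everything named here is hypothetical: the header names `X-Instance-Hostname` and `X-Instance-Private-IP`, the simplified `Machine` class, and both helper functions. Nothing in this issue confirms that Ignition sends such headers, and the MCS itself is not written in Python.

```python
from dataclasses import dataclass, field
from typing import Mapping, Optional

# Hypothetical header names, chosen only for illustration.
HOSTNAME_HEADER = "X-Instance-Hostname"
PRIVATE_IP_HEADER = "X-Instance-Private-IP"


@dataclass
class Machine:
    """Simplified stand-in for a Machine object (hypothetical shape)."""
    name: str
    addresses: set = field(default_factory=set)  # hostnames and IPs from the cloud provider


def client_identity(headers: Mapping[str, str], remote_addr: str) -> dict:
    """Collect what the server can learn about the requesting instance."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v.strip() for k, v in headers.items()}
    return {
        # Connection-level address; often a VIP or NAT, kept for reference.
        "remote_addr": remote_addr,
        "hostname": lowered.get(HOSTNAME_HEADER.lower()) or None,
        "private_ip": lowered.get(PRIVATE_IP_HEADER.lower()) or None,
    }


def match_machine(identity: dict, machines: list) -> Optional[Machine]:
    """Return the first machine whose known addresses include a reported value."""
    reported = {identity["hostname"], identity["private_ip"]} - {None}
    for machine in machines:
        if reported & machine.addresses:
            return machine
    return None
```

Keeping the connection-level address alongside the self-reported values means a record is still logged when the headers are absent; in that case `match_machine` simply returns `None`.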

@runcom
Member

runcom commented Jul 8, 2020

The actual hostname and/or private IP should be captured in either the HTTP headers or the request payload, and we should track these requests in a way that lets them be easily associated with a particular Machine object. Machine objects know via the cloud providers what the hostname and/or IP addresses of an instance will be, so having the MCS report this information somewhere useful will allow us to build tooling that makes cluster-joining problems easier to diagnose.

Sounds good to me. We can work together on this, since it seems the above is missing from the MachineAPI implementation, right? Or can we go ahead and check the headers we already get in the MCS?
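One way to check which headers already arrive would be to log every header of an incoming request next to the connection-level peer address. The sketch below does this with Python's standard `http.server` module. The handler is a hypothetical stand-in for experimentation, not the MCS, and `describe_request` is an illustrative helper name.

```python
import logging
from http.server import BaseHTTPRequestHandler


def describe_request(remote_addr, path, headers):
    """Render one log line with the peer address, path, and all headers."""
    # Sort the (name, value) pairs so log lines are stable and easy to compare.
    rendered = ", ".join(f"{name}={value!r}" for name, value in sorted(headers))
    return f"peer={remote_addr} path={path} headers=[{rendered}]"


class HeaderLoggingHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server that only records what it receives."""

    def do_GET(self):
        # self.headers.items() yields the (name, value) pairs of the request.
        logging.info(
            describe_request(self.client_address[0], self.path, self.headers.items())
        )
        self.send_response(204)  # No Content: this sketch serves no config.
        self.end_headers()
```

Passing `HeaderLoggingHandler` to `http.server.HTTPServer` and calling `serve_forever()` would run it locally. Comparing the logged `peer` value with the headers would show whether any header carries the real hostname or private IP when the peer address is only a VIP or NAT.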

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label on Oct 25, 2020
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Nov 24, 2020
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci-robot

@openshift-bot: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.