Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read properly information from localhost API when it's needed #98

Closed
wants to merge 1 commit into from

Conversation

yboaron
Copy link
Contributor

@yboaron yboaron commented Sep 15, 2020

For Keepalived unicast configuration, we should be able to read nodes details from the localhost:kube-apiserver

With this PR we should be able to read nodes information also from localhost.

How to verify the fix?

  1. ssh into one of the masters
  2. Change server value in kubeconfig file
    a. sudo vi /var/lib/kubelet/kubeconfig
    b. change server: https://api-int.ostest.test.metalkube.org:6443 to some dummy server for example
    server: https://192.168.111.199:6443
  3. Wait for ~60 seconds and check the logs of haproxy-monitor container, we shouldn't see this error

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 15, 2020
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 15, 2020
@yboaron
Copy link
Contributor Author

yboaron commented Sep 15, 2020

/test e2e-metal-ipi

@yboaron
Copy link
Contributor Author

yboaron commented Sep 15, 2020

/retitle Read properly information from localhost API when it's needed

@openshift-ci-robot openshift-ci-robot changed the title [WIP] Fix bug when reading information from localhost API Read properly information from localhost API when it's needed Sep 15, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 15, 2020
@cybertron
Copy link
Member

So if I'm reading the code correctly, the problem was that not passing kubeconfig in the localhost case was not valid? The change lgtm (I don't like overloading variable meanings the way we were before anyway), but I just want to make sure I understand. Particularly since I'm having trouble standing up a local cluster to check out the behavior of this code. :-/

@yboaron
Copy link
Contributor Author

yboaron commented Sep 16, 2020

So if I'm reading the code correctly, the problem was that not passing kubeconfig in the localhost case was not valid? The change lgtm (I don't like overloading variable meanings the way we were before anyway), but I just want to make sure I understand. Particularly since I'm having trouble standing up a local cluster to check out the behavior of this code. :-/

That's correct.
I saw some certificate authentication errors in the monitor logs when we try to read from the localhost, I'm almost sure that I verified reading from localhost when I tested that back in days, could be that I missed something :-( .

So, to hit the bug you can ssh to one of the masters and change the server value in /var/lib/kubelet/kubeconfig to some dummy server (e.g: server: https://192.168.111.199:6443) and you should see this error on haproxy-monitor logs.

To verify the bug fix repeat on the same scenario and you shouldn't see this error message on haproxy-monitor logs.

@cybertron
Copy link
Member

Yeah, I finally got a cluster up yesterday and reproduced the issue and the fix. I think we need this for #96 because that's predicated on being able to use the local api to update the keepalived config. Do you want to open a bug for this so we can get it merged in 4.6? It pretty much breaks our keepalived logic right now. I guess the other option would be to just include this in one of our other changes, but on its own this is a smaller change that is probably easier to get merged.

@yboaron
Copy link
Contributor Author

yboaron commented Sep 16, 2020

Yeah, I finally got a cluster up yesterday and reproduced the issue and the fix. I think we need this for #96 because that's predicated on being able to use the local api to update the keepalived config. Do you want to open a bug for this so we can get it merged in 4.6? It pretty much breaks our keepalived logic right now. I guess the other option would be to just include this in one of our other changes, but on its own this is a smaller change that is probably easier to get merged.

I already added this commit yesterday to #92 (metal-ipi CI test was red without this fix) , which Brad already LGTM , so I think we can merge #92 , what do you think?

@bcrochet
Copy link
Member

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 16, 2020
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bcrochet, yboaron

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@celebdor
Copy link
Contributor

/close

@openshift-ci-robot
Copy link
Contributor

@celebdor: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants