Add connectivity checker controller #856
@sanchezl can you please comment on how EventRecorder gets initialized in other operators?
can you please comment how EventRecorder gets initialized on other operators?
We use a framework for most of our operator commands that creates the event.Recorder for us. I don't see a proper event.Recorder in use in any of the existing controllers, but the pki controller does have a loggingRecorder that simply writes to stdout. You can start with that, or use events.NewRecorder() to create a new recorder that actually creates events.
@rcarrillocruz See rcarrillocruz#1 for some thoughts on the context.
Running this patch with the cluster network operator running locally, it gets stuck at: I1030 10:16:01.494974 252285 base_controller.go:66] Waiting for caches to sync for ConnectivityCheckController
/assign @sttts
Looks like one more informer factory to start.
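The symptom above (a controller stuck waiting for caches to sync) is what one would expect when a controller waits on an informer whose factory was never started. The sketch below simulates that with a hypothetical `fakeFactory` type; it does not use client-go, and all names in it are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// demoTimeout bounds the wait so the sketch terminates instead of hanging.
const demoTimeout = 50 * time.Millisecond

// fakeFactory is a hypothetical stand-in for a shared informer factory.
// Its cache only counts as synced once Start has been called.
type fakeFactory struct {
	mu     sync.Mutex
	synced bool
}

// Start marks the factory's informers as running and synced.
func (f *fakeFactory) Start() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.synced = true
}

// HasSynced reports whether the factory's cache is ready.
func (f *fakeFactory) HasSynced() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.synced
}

// waitForCacheSync polls until every factory reports synced, or returns
// false once the timeout has elapsed.
func waitForCacheSync(timeout time.Duration, factories ...*fakeFactory) bool {
	deadline := time.Now().Add(timeout)
	for {
		all := true
		for _, f := range factories {
			if !f.HasSynced() {
				all = false
				break
			}
		}
		if all {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(5 * time.Millisecond)
	}
}

func main() {
	kube := &fakeFactory{}
	operator := &fakeFactory{}

	kube.Start() // the second factory is forgotten here
	fmt.Println("synced:", waitForCacheSync(demoTimeout, kube, operator))

	operator.Start() // starting every factory unblocks the wait
	fmt.Println("synced:", waitForCacheSync(demoTimeout, kube, operator))
}
```

Without a timeout, the first wait would block indefinitely, which matches the behaviour reported when running the operator locally.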
We will need to tweak the period at which each check is run; right now it is every second (in library-go).
I need to templatize the manifests so that the RBAC and check-endpoints resources are only added when the flag $check_endpoints is enabled.
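As a rough sketch of what flag-gated manifests could look like, the example below uses Go's standard `text/template` package. The manifest contents, resource names, and the `CheckEndpoints` field are hypothetical and only stand in for the $check_endpoints flag; they are not the operator's actual manifests or render data.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// manifestTmpl is a hypothetical manifest. The Role is emitted only
// when CheckEndpoints is true.
const manifestTmpl = `apiVersion: v1
kind: ServiceAccount
metadata:
  name: network-diagnostics
{{- if .CheckEndpoints}}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: check-endpoints
{{- end}}
`

// renderData carries the flag into the template (assumed field name).
type renderData struct {
	CheckEndpoints bool
}

// render executes the template with the flag set as requested.
func render(checkEndpoints bool) (string, error) {
	tmpl, err := template.New("manifest").Parse(manifestTmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, renderData{CheckEndpoints: checkEndpoints}); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	for _, enabled := range []bool{false, true} {
		out, err := render(enabled)
		if err != nil {
			fmt.Println("render failed:", err)
			return
		}
		fmt.Printf("# CheckEndpoints=%v\n%s\n", enabled, out)
	}
}
```

The `{{-` trim markers keep the rendered YAML free of blank lines when the conditional block is skipped.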
I also need to add this for SDN.
Yeah, it would be good to think about how this could be made more generic. Do we even need to care about the underlying network provider?
/retest
This leverages PodNetworkConnectivityChecks to perform network checks periodically.
The checks are:
From regular pod to kubeapiserver/openshiftapiserver service and endpoints and LBs.
From regular pod to pods on every node.
From regular pod to a test hello-openshift service.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: danielmellado, danwinship, rcarrillocruz
/retest
/test e2e-gcp
/test e2e-aws-ovn-windows-custom-vxlan
/test e2e-gcp
/retest
/override ci/prow/e2e-openstack-ovn
@abhat: Overrode contexts on behalf of abhat: ci/prow/e2e-openstack-ovn, ci/prow/e2e-ovn-ipsec-step-registry
/hold
@rcarrillocruz I noticed that the network operator reconcile loop is showing a bunch of errors about an unregistered GVK.
Holding off for now until you can ack that these are benign or we have a fix.
/hold cancel
Looks like the openstack-ovn job has been failing all over the place, which has nothing to do with the PR itself. Cancelling the hold.
/test e2e-agnostic-upgrade
/test e2e-gcp
Add network diagnostics controller to CNO
This leverages PodNetworkConnectivityChecks (https://github.com/openshift/enhancements/blob/241bfa6388a2bde7d78c4be7f0479fe99784daed/enhancements/kube-apiserver/stability-point-to-point-network-check.md) to perform network checks periodically.
The checks are:
From regular pod to kubeapiserver/openshiftapiserver service and endpoints and LBs.
From regular pod to pods on every node running a hello-openshift service.
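To make the idea of a point-to-point check concrete, here is a minimal sketch that assumes a single check boils down to a TCP connection attempt with a timeout. The helper names (`checkTCP`, `startLocalTarget`) and the loopback listener are made up for this example; the actual PodNetworkConnectivityCheck implementation lives in library-go and may work differently.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// checkTimeout bounds a single connection attempt (assumed value).
const checkTimeout = time.Second

// checkTCP returns nil if a TCP connection to addr can be opened within
// the timeout, and the dial error otherwise.
func checkTCP(addr string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return err
	}
	return conn.Close()
}

// startLocalTarget listens on an ephemeral loopback port so the sketch
// has something to connect to. It returns the address and a stop func.
func startLocalTarget() (string, func(), error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	return ln.Addr().String(), func() { ln.Close() }, nil
}

func main() {
	addr, stop, err := startLocalTarget()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	fmt.Println("reachable while listening:", checkTCP(addr, checkTimeout) == nil)

	stop()
	fmt.Println("reachable after stop:", checkTCP(addr, checkTimeout) == nil)
}
```

In a cluster, the target address would be a service, endpoint, load balancer, or pod address rather than a local listener, and the result would be recorded on the check resource instead of printed.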