Merge 2021-07-18 #609
Conversation
…ide Pod Signed-off-by: Robert Bost <rbost@redhat.com>
Signed-off-by: Christoph Stäbler <cstabler@redhat.com>
Since we do not have a config parser that checks whether HybridOverlay.ClusterSubnets is set when HybridOverlay.Enabled is true, we should check that there are cluster subnets in the config before adding the lr-in-policy. Since this field has no default value, we otherwise end up creating policies that have nil dst fields. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
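A minimal sketch of the guard described above, using hypothetical simplified stand-in types (the real check operates on ovn-kubernetes's hybrid overlay config):

```go
package main

import "fmt"

// Hypothetical, simplified stand-in for the ovn-kubernetes config type;
// field names mirror the fields mentioned in the commit message.
type HybridOverlayConfig struct {
	Enabled        bool
	ClusterSubnets []string
}

// shouldAddHybridOverlayPolicy guards the lr-in-policy creation: without
// cluster subnets the policy would end up with a nil dst field.
func shouldAddHybridOverlayPolicy(cfg HybridOverlayConfig) bool {
	return cfg.Enabled && len(cfg.ClusterSubnets) > 0
}

func main() {
	cfg := HybridOverlayConfig{Enabled: true} // ClusterSubnets left unset
	fmt.Println(shouldAddHybridOverlayPolicy(cfg)) // prints false: skip the policy
}
```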
- Support in Makefile (e.g. OCI_BIN=podman make fedora) - Support in kind.sh with flag "-ep podman" Signed-off-by: Joel Takvorian <jtakvori@redhat.com>
ovnkube-trace will now look for ovnkube pods in all namespaces and it will pick the namespace of the first matching pod that it finds. A manual override with -ovn-config-namespace still takes precedence. Signed-off-by: Andreas Karis <ak.karis@gmail.com>
For now, the document only features a short introduction to multicast. Follow-up commits will add information on how to configure the cluster to use multicast, and a description of the OVN north entities used to implement it. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
…atching-pods-2278
Check /sys/class/net for ifindex when ip command is not available inside Pod
When a gw pod gets the external gateway annotation, it adds the specific routes to the external gateway for existing pods, but it does not remove the SNAT that was added when the pod was created. Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
Signed-off-by: Alexander Constantinescu <aconstan@redhat.com>
Improve logging message when address move fails Signed-off-by: Andreas Karis <ak.karis@gmail.com>
Right now we add the vipProtocol to the list of vipProtocols to remove only if the load balancer is not the idling one. The problem is that we update the load balancer just before this check, with the effect of not removing the VIP from the existing balancers. This causes the VIP to stay on both the idling and non-idling load balancers, with unpredictable effects. Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
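The fix boils down to an ordering question: compute the set of VIPs to remove against the load balancer's current state, and only then apply the update. A simplified, hypothetical model (not the actual ovn-kubernetes code):

```go
package main

import "fmt"

// staleVIPs decides what to remove based on the balancer's *current*
// contents. In the bug described above, the balancer was updated before
// this decision was made, so the stale VIP survived on both balancers.
func staleVIPs(lbVIPs map[string]bool, keep map[string]bool) []string {
	var stale []string
	for vip := range lbVIPs {
		if !keep[vip] {
			stale = append(stale, vip)
		}
	}
	return stale
}

func main() {
	lb := map[string]bool{"172.30.9.247:80": true, "172.30.55.216:80": true}
	keep := map[string]bool{"172.30.9.247:80": true}
	// 1. Decide removals against the pre-update state...
	toRemove := staleVIPs(lb, keep)
	// 2. ...then apply the update.
	for _, vip := range toRemove {
		delete(lb, vip)
	}
	fmt.Println(len(lb)) // 1
}
```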
…ging nicstobridge.go: Improve logging message when address move fails
…more If ovnkube-master is a bit slow and the service is already gone from the apiserver and informer cache, we still want to clean it up. But we don't want to pass a nil service to Eventf(), which will cause the following panic:

I0708 15:46:33.636318 1 services_controller.go:518] Deleting service cluster-density-503fcc1a-1066-44d9-bef7-43292b046b06-151/deployment-2pod-151-1
E0708 15:46:35.468114 1 utils.go:93] Error deleting VIP [172.30.55.216:80] on OVN LoadBalancer [01801284-6384-498c-8ef8-a08a6e63c77f 072075fd-eb15-47cc-bb17-05c8b4b49634 ... f4b120db-e21f-4d74-87cc-abd2a88779c3]
I0708 15:46:35.468207 1 services_controller.go:223] Finished syncing service deployment-2pod-123-1 on namespace cluster-density-503fcc1a-1066-44d9-bef7-43292b046b06-123 : 15.673423702s
E0708 15:46:35.468297 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 8824 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(...)
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(...)
panic(...)
k8s.io/api/core/v1.(*Service).GetObjectKind(0x0, ...)
k8s.io/client-go/tools/reference.GetReference(...)
    vendor/k8s.io/client-go/tools/reference/ref.go:59
k8s.io/client-go/tools/record.(*recorderImpl).generateEvent(...)
k8s.io/client-go/tools/record.(*recorderImpl).Event(...)
k8s.io/client-go/tools/record.(*recorderImpl).Eventf(...)
    vendor/k8s.io/client-go/tools/record/event.go:353
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn/controller/services.(*Controller).syncServices(...)
    pkg/ovn/controller/services/services_controller.go:246
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn/controller/services.(*Controller).processNextWorkItem(...)
    pkg/ovn/controller/services/services_controller.go:184
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn/controller/services.(*Controller).worker(...)
    pkg/ovn/controller/services/services_controller.go:173
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(...)
k8s.io/apimachinery/pkg/util/wait.JitterUntil(...)
k8s.io/apimachinery/pkg/util/wait.Until(...)
created by github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn/controller/services.(*Controller).Run
    pkg/ovn/controller/services/services_controller.go:161

panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xd3aa05]

Signed-off-by: Dan Williams <dcbw@redhat.com>
Due to lock contention, and because our namespace handler is single-threaded, namespace additions may pile up and take a while. Log how long they take, and also when namespaces are added and deleted so we can better debug these issues. Signed-off-by: Dan Williams <dcbw@redhat.com>
…-patching-pods-2278 Add unit test for Kube.SetAnnotationsOnPod
Add information to the multicast doc explaining how to enable it on a given namespace. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
Show ACL examples that: - allow multicast traffic per namespace - block all multicast traffic (at lower priority than the ACL above) Furthermore, also show the port groups and address sets per namespace. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
Show the multicast-related changes to each node's logical switch: - set snooping - set querier - specify src MAC address - specify src IP address Also show the changes introduced to the cluster router by activating the multicast relay option. Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
Adds libovsdb clients to the OVN, egressip, services and unidle controllers, and the relevant harness for each of their tests. Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Add libovsdb clients to controllers & test harness
services: log errors and don't panic if the service doesn't exist any more
namespace: track how long namespace addition takes
Add hbo-lr-policy only if config.HybridOverlay.ClusterSubnets is set
When the master is restarted, every node is processed both via the initial fetch of existing nodes and via its add-node event. If a node already had subnet annotations, we were updating the allocated-subnets metric again. This commit checks whether new subnets were actually allocated and only then calls the code that updates the metrics associated with allocated subnets. Signed-off-by: Aniket Bhat <anbhat@redhat.com>
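A simplified model of the fix, with illustrative names only: the allocator reports whether anything was newly allocated, and the metric is bumped only in that case, so a master restart (which replays existing nodes) does not inflate it:

```go
package main

import "fmt"

// subnetAllocator is a toy stand-in for the real host-subnet allocator.
type subnetAllocator struct {
	byNode           map[string]string
	allocatedSubnets int // the metric
}

// ensureNodeSubnet reports whether a subnet was newly allocated; the
// metric is only incremented on a genuinely new allocation.
func (a *subnetAllocator) ensureNodeSubnet(node, subnet string) (newlyAllocated bool) {
	if _, ok := a.byNode[node]; ok {
		return false // node already annotated; do not touch the metric
	}
	a.byNode[node] = subnet
	a.allocatedSubnets++
	return true
}

func main() {
	a := &subnetAllocator{byNode: map[string]string{}}
	a.ensureNodeSubnet("node1", "10.244.0.0/24") // initial add event
	a.ensureNodeSubnet("node1", "10.244.0.0/24") // replayed after restart
	fmt.Println(a.allocatedSubnets) // 1, not 2
}
```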
Signed-off-by: Dan Williams <dcbw@redhat.com>
In some scale scenarios it takes 20s to repair services on startup:

2021-07-08T15:46:57.601Z|01774|ovn_dbctl|INFO|Running command run --if-exists -- remove load_balancer 13a0737b-9859-4654-9a45-df034a2d1098 vips "\"172.30.9.247:80\""
I0708 15:46:57.658232 1 repair.go:132] Deleting non-existing Kubernetes vip 172.30.135.31:80 from OVN TCP load balancer 13a0737b-9859-4654-9a45-df034a2d1098
...
I0708 15:47:08.113637 1 repair.go:132] Deleting non-existing Kubernetes vip 10.0.141.143:30666 from OVN TCP load balancer aaa40b67-8c2b-4c29-bdf7-8d0802b89c49
2021-07-08T15:47:08.117Z|02031|ovn_dbctl|INFO|Running command run --if-exists -- remove load_balancer aaa40b67-8c2b-4c29-bdf7-8d0802b89c49 vips "\"10.0.141.143:30666\""
I0708 15:47:08.477075 1 repair.go:47] Finished repairing loop for services: 21.111423982s

A good chunk of that time is sequential calls to delete stale VIPs in the repair loop. Batch them instead. Signed-off-by: Dan Williams <dcbw@redhat.com>
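The batching idea can be sketched as grouping stale VIPs by load balancer, so each balancer gets a single remove command instead of one call per VIP. The helper name and command format below are illustrative, loosely mimicking the ovn_dbctl log lines above:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// batchRemovals groups stale VIPs by load balancer UUID and builds one
// remove command per balancer, instead of one sequential call per VIP.
func batchRemovals(stale map[string]string /* vip -> lb UUID */) []string {
	byLB := map[string][]string{}
	for vip, lb := range stale {
		byLB[lb] = append(byLB[lb], fmt.Sprintf("%q", vip))
	}
	var cmds []string
	for lb, vips := range byLB {
		sort.Strings(vips)
		cmds = append(cmds, fmt.Sprintf("-- remove load_balancer %s vips %s", lb, strings.Join(vips, " ")))
	}
	sort.Strings(cmds)
	return cmds
}

func main() {
	stale := map[string]string{
		"172.30.9.247:80":    "13a0737b",
		"172.30.135.31:80":   "13a0737b",
		"10.0.141.143:30666": "aaa40b67",
	}
	// Two batched commands instead of three sequential calls.
	for _, c := range batchRemovals(stale) {
		fmt.Println(c)
	}
}
```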
services: batch LoadBalancer VIP deletions when possible
Fix duplicate incrementing of subnet allocation metric
Kind script: add support for podman
Fix lgw external ip
/retest
/retest
This happens when the pod was already created but a new event for the pod is generated. I managed to see it after a manual ovnkube-master restart. Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
addGWRoutesForPod: don't fail if the routes are already added
ovnkube-trace: Autodetermine ovnNamespace
We need to pass the CA data itself between ovnkube-node and the cnishim since the node is containerized and the shim is not, and the path could be different between the two since they have different filesystem namespaces. So we might as well just read the CA file and pass data around internally, rather than using a file path. Signed-off-by: Dan Williams <dcbw@redhat.com>
Passing the Kube API authentication data via the CNI config file has two problems: 1) the CA file path may differ for the cniserver (because it's containerized) and for the cnishim running outside a container; 2) it's better not to leak authentication info into the host filesystem, even though the CNI config file should have restricted permissions. To solve both issues, pass the Kube API authentication data back from the cniserver (running in ovnkube-node) to the cnishim in the JSON response instead of writing it to a file on disk. This commit reverts parts of: d397166 cni: cancel pod sandbox add requests if the pod's UID or MAC changes Signed-off-by: Dan Williams <dcbw@redhat.com>
cni: pass Kube API auth via cnishim response, not CNI config file
No synthetic failures /retest
@dcbw aws-ovn showed one instance of:
/retest
/retest
another instance on vsphere-ovn:
They are definitely less frequent now @dcbw
/retest
@trozet: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/override ci/prow/e2e-gcp-ovn
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dcbw, trozet The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@dcbw: Overrode contexts on behalf of dcbw: ci/prow/e2e-gcp-ovn In response to this:
@fedepaol @jcaamano @astoycos @abhat @alexanderConstantinescu @dcbw FYI