GW: Ignore headless services in syncServices #3943

Merged
merged 1 commit into ovn-org:master from tssurya:fix-headless-svc-restarts on Oct 3, 2023

@tssurya tssurya commented Sep 29, 2023

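- What this PR does and why is it needed
On restart, syncServices in the gateway code tried to add iptables rules for
headless services. A headless service has no cluster IP, so the rule was built
with "-d <nil>", iptables rejected it, and ovnkube-node crashed: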

I0929 16:34:08.992237   17788 iptables.go:35] Chain: "OVN-KUBE-ITP" in table: "mangle" already exists, skipping creation: running [/usr/sbin/iptables -t mangle -N OVN-KUBE-ITP --wait]: exit status 1: iptables: Chain already exists.
I0929 16:34:08.997235   17788 iptables.go:27] Adding rule in table: mangle, chain: OVN-KUBE-ITP with args: "-p TCP -d <nil> --dport 80 -j MARK --set-xmark 0x1745ec" for protocol: 0 
I0929 16:34:08.998672   17788 iptables.go:35] Chain: "OVN-KUBE-ITP" in table: "mangle" already exists, skipping creation: running [/usr/sbin/iptables -t mangle -N OVN-KUBE-ITP --wait]: exit status 1: iptables: Chain already exists.
E0929 16:34:09.000870   17788 factory.go:845] Failed (will retry) while processing existing *v1.Service items: gateway sync services failed: failed to add iptables mangle/OVN-KUBE-ITP rule "-p TCP -d <nil> --dport 80 -j MARK --set-xmark 0x1745ec": running [/usr/sbin/iptables -t mangle -C OVN-KUBE-ITP -p TCP -d <nil> --dport 80 -j MARK --set-xmark 0x1745ec --wait]: exit status 2: iptables v1.8.7 (legacy): host/network `<nil>' not found
    State:       Running
      Started:   Fri, 29 Sep 2023 18:32:18 +0200
    Last State:  Terminated
      Reason:    Error
      Message:      9868 reflector.go:293] Stopping reflector *v1.EgressFirewall (0s) from github.com/ovn-org/ovn-kubernetes/go-controller/pkg/crd/egressfirewall/v1/apis/informers/externalversions/factory.go:116
I0929 16:32:04.355513    9868 reflector.go:293] Stopping reflector *v1.EgressIP (0s) from github.com/ovn-org/ovn-kubernetes/go-controller/pkg/crd/egressip/v1/apis/informers/externalversions/factory.go:131
I0929 16:32:04.356755    9868 reflector.go:293] Stopping reflector *v1.EgressQoS (0s) from github.com/ovn-org/ovn-kubernetes/go-controller/pkg/crd/egressqos/v1/apis/informers/externalversions/factory.go:131
I0929 16:32:04.355627    9868 reflector.go:293] Stopping reflector *v1alpha1.BaselineAdminNetworkPolicy (0s) from sigs.k8s.io/network-policy-api/pkg/client/informers/externalversions/factory.go:132
I0929 16:32:04.355308    9868 handler.go:215] Removed *v1.EgressFirewall event handler 9
I0929 16:32:04.356191    9868 reflector.go:293] Stopping reflector *v1.EgressService (0s) from github.com/ovn-org/ovn-kubernetes/go-controller/pkg/crd/egressservice/v1/apis/informers/externalversions/factory.go:131
I0929 16:32:04.356993    9868 reflector.go:293] Stopping reflector *v1.NetworkPolicy (0s) from k8s.io/client-go/informers/factory.go:150
I0929 16:32:04.356233    9868 metrics.go:506] Stopping metrics server 172.19.0.6:9410
I0929 16:32:04.357225    9868 metrics.go:502] Metrics server has stopped serving at address "172.19.0.6:9410"
I0929 16:32:04.356331    9868 reflector.go:293] Stopping reflector *v1.AdminPolicyBasedExternalRoute (0s) from github.com/ovn-org/ovn-kubernetes/go-controller/pkg/crd/adminpolicybasedroute/v1/apis/informers/externalversions/factory.go:116
F0929 16:32:04.357564    9868 ovnkube.go:136] failed to start node network manager: failed to start default node network controller: error waiting for node readiness: gateway init failed to start watching services: watchResource for resource *factory.serviceForGateway. Failed addHandlerFunc: context deadline exceeded
=============== pid 9868 terminated ==========

      Exit Code:    6
      Started:      Fri, 29 Sep 2023 18:31:03 +0200
      Finished:     Fri, 29 Sep 2023 18:32:04 +0200
    Ready:          False
    Restart Count:  2
    Requests:
      cpu:      100m
      memory:   300Mi
    Readiness:  exec [/usr/bin/ovn-kube-util readiness-probe -t ovnkube-node] delay=30s timeout=30s period=60s #success=1 #failure=3

This PR makes syncServices skip headless services, so they are also ignored on restart.

- Special notes for reviewers

- How to verify it
Tested on KIND. Without this fix, the ovnkube-node pod crashes and doesn't recover.
Note that headless services are already skipped when services are added or deleted; the check was simply missed in the syncServices code.

- Description for the changelog
Don't process headless services

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
@tssurya tssurya changed the title OVNK/GW: Ignore headless services in syncServices GW: Ignore headless services in syncServices Sep 29, 2023
@coveralls
Coverage Status

coverage: 49.909% (+0.05%) from 49.861% when pulling 44a7018 on tssurya:fix-headless-svc-restarts into ac329f8 on ovn-org:master.

@trozet trozet merged commit 6a546f4 into ovn-org:master Oct 3, 2023
26 of 29 checks passed