Unable to connect to service from local network after livenessProbe has been started #56887

Closed
rogovst opened this Issue Dec 6, 2017 · 9 comments

rogovst commented Dec 6, 2017

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

/kind feature

/sig network

What happened:

kubectl get nodes
NAME STATUS AGE VERSION
dsdocker01 Ready 17h v1.7.6
dsdocker02 Ready 43d v1.7.6
dsdocker03 Ready 43d v1.7.6
dsdocker04 Ready 19d v1.7.6

Custom sysctl:

kernel.printk_ratelimit = 10
kernel.printk_ratelimit_burst = 20
net.bridge.bridge-nf-call-iptables = 1
net.core.message_burst = 10
net.core.message_cost = 20
net.core.netdev_max_backlog = 50000
net.core.optmem_max = 40960
net.core.rmem_default = 16777216
net.core.rmem_max = 16777216
net.core.somaxconn = 20000
net.core.wmem_default = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_max_orphans = 30260
net.ipv4.tcp_max_syn_backlog = 30000
net.ipv4.tcp_max_tw_buckets = 2000000
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1

I have a test pod with a service:

---
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx-controller
spec:
  replicas: 1
  selector:
    name: nginx-dev
  template:
    metadata:
      labels:
        name: nginx-dev
    spec:
      nodeSelector:
        stage: development
      containers:
        - name: nginx-dev
          image: nginx
          ports:
            - containerPort: 80
            - containerPort: 443
          livenessProbe:
            initialDelaySeconds: 60
            timeoutSeconds: 5
            periodSeconds: 30
            httpGet:
              scheme: HTTP
              path: /
              port: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-dev
spec:
  ports:
    - name: http
      port: 80
      protocol: TCP
    - name: https
      port: 443
      targetPort: 443
      protocol: TCP
  selector:
    name: nginx-dev
  type: NodePort

After launch, the pod was scheduled on the dsdocker01 node, and I could connect to the corresponding NodePort from any host on my local network:

# telnet dsdocker01 32393
Trying 10.183.217.17...
Connected to dsdocker01.
Escape character is '^]'.
^]

# telnet dsdocker02 32393
Trying 10.183.217.31...
Connected to dsdocker02.
Escape character is '^]'.
^]

Once initialDelaySeconds has elapsed and the livenessProbe starts, I can no longer connect from the local network to that port on the node where the pod is running. I can still connect to the same port on the other Kubernetes nodes:

# telnet dsdocker01 32393
Trying 10.183.217.17...
telnet: connect to address 10.183.217.17: Connection timed out

# telnet dsdocker02 32393
Trying 10.183.217.31...
Connected to dsdocker02.
Escape character is '^]'.
^]

tcpdump output from container:

root@nginx-controller-16v6j:/# tcpdump -Nnn -vv -i eth0
13:48:18.007075 IP (tos 0x0, ttl 63, id 19177, offset 0, flags [DF], proto TCP (6), length 60)
    172.30.65.1.48706 > 172.30.65.41.80: Flags [S], cksum 0xbd1b (correct), seq 865411857, win 27200, options [mss 1360,sackOK,TS val 3345967608 ecr 0,nop,wscale 7], length 0
13:48:19.008246 IP (tos 0x0, ttl 63, id 19178, offset 0, flags [DF], proto TCP (6), length 60)
    172.30.65.1.48706 > 172.30.65.41.80: Flags [S], cksum 0xb931 (correct), seq 865411857, win 27200, options [mss 1360,sackOK,TS val 3345968610 ecr 0,nop,wscale 7], length 0

It seems the container simply does not answer these packets.

I can also reach this port locally from the Kubernetes node itself:

[root@DSDOCKER01 ~]# telnet dsdocker01 32393
Trying 127.0.1.1...
Connected to dsdocker01.
Escape character is '^]'.
^]

What you expected to happen:
No idea for now

How to reproduce it (as minimally and precisely as possible):
Create 1 master and 2 minions.
Launch a pod with a livenessProbe and a NodePort service.
Try to access the NodePort (on the minion where the pod is running) from the local network after the livenessProbe has started.
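
The last step can be scripted so that every node is checked the same way. This is a minimal sketch, not part of the original report: it assumes bash and the coreutils `timeout` command are available on the client, and the function name `probe_port` and the 3-second limit are illustrative choices.

```shell
#!/usr/bin/env bash
# Sketch: report whether a TCP port accepts connections on a given host.
# probe_port and the 3-second timeout are hypothetical, chosen for illustration.

probe_port() {
  local host=$1 port=$2
  # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT;
  # `timeout` bounds the wait so a silently dropped SYN does not hang the check.
  if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
    echo "${host}:${port} open"
  else
    echo "${host}:${port} unreachable"
  fi
}

# Example, using the node names and NodePort from the report:
#   for node in dsdocker01 dsdocker02 dsdocker03 dsdocker04; do
#     probe_port "$node" 32393
#   done
```

Running the loop before and after initialDelaySeconds should make it easy to see which node stops answering and when.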

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.6", GitCommit:"4bc5e7f9a6c25dc4c03d4d656f2cefd21540e28c", GitTreeState:"clean", BuildDate:"2017-09-18T12:25:36Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: VMware vCloud
  • OS (e.g. from /etc/os-release): CentOS Linux release 7.4.1708 (Core)
  • Kernel (e.g. uname -a): 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 20:32:50 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: by hand (Ansible)
  • Others:

rogovst commented Dec 6, 2017

/sig network

Member

dims commented Dec 6, 2017

/sig network

Member

thockin commented Jan 6, 2018

I can't reproduce this.

Member

thockin commented Jan 6, 2018

When you run tcpdump, can you use -i any ?

Member

thockin commented Jan 6, 2018

You're not using externalTrafficPolicy, are you? #57922

rogovst commented Jan 10, 2018

When you run tcpdump, can you use -i any ?

root@nginx-controller-2t3gt:/#  tcpdump -Nnn -vv -i any
10:28:55.918413 IP (tos 0x10, ttl 61, id 17064, offset 0, flags [DF], proto TCP (6), length 60)
    172.30.86.1.41906 > 172.30.86.2.80: Flags [S], cksum 0x8f28 (correct), seq 772432295, win 27200, options [mss 1355,sackOK,TS val 47715380 ecr 0,nop,wscale 7], length 0
10:28:56.926832 IP (tos 0x10, ttl 61, id 17065, offset 0, flags [DF], proto TCP (6), length 60)
    172.30.86.1.41906 > 172.30.86.2.80: Flags [S], cksum 0x8b37 (correct), seq 772432295, win 27200, options [mss 1355,sackOK,TS val 47716389 ecr 0,nop,wscale 7], length 0
10:28:58.974737 IP (tos 0x10, ttl 61, id 17066, offset 0, flags [DF], proto TCP (6), length 60)
    172.30.86.1.41906 > 172.30.86.2.80: Flags [S], cksum 0x8337 (correct), seq 772432295, win 27200, options [mss 1355,sackOK,TS val 47718437 ecr 0,nop,wscale 7], length 0
10:29:03.006792 IP (tos 0x10, ttl 61, id 17067, offset 0, flags [DF], proto TCP (6), length 60)
    172.30.86.1.41906 > 172.30.86.2.80: Flags [S], cksum 0x7377 (correct), seq 772432295, win 27200, options [mss 1355,sackOK,TS val 47722469 ecr 0,nop,wscale 7], length 0

You're not using externalTrafficPolicy, are you? #57922
No, I am not using it:

# kubectl describe svc/nginx-dev
Name:			nginx-dev
Namespace:		uat
Labels:			<none>
Annotations:		<none>
Selector:		name=nginx-dev
Type:			NodePort
IP:			172.40.76.176
Port:			http	80/TCP
NodePort:		http	30808/TCP
Endpoints:		172.30.86.2:80
Port:			https	443/TCP
NodePort:		https	31317/TCP
Endpoints:		172.30.86.2:443
Session Affinity:	None
Events:			<none>
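
In the capture above, the same SYN (seq 772432295) arrives several times and no SYN-ACK is ever sent back. A rough sketch for spotting that pattern in saved `tcpdump -nn -vv` output follows; the function name `count_syn_retransmits` and the file name in the usage comment are hypothetical, and the parsing assumes the two-line-per-packet layout shown in this thread.

```shell
#!/usr/bin/env bash
# Sketch: count repeated SYNs (same source, destination and sequence number)
# in `tcpdump -nn -vv` text output read from stdin.

count_syn_retransmits() {
  awk '
    /Flags \[S\],/ {
      # On the detail line, $1 is "src.port" and $3 is "dst.port:".
      dst = $3
      sub(/:$/, "", dst)
      seq = ""
      for (i = 1; i <= NF; i++) {
        if ($i == "seq") { seq = $(i + 1); sub(/,$/, "", seq) }
      }
      seen[$1 " > " dst " seq " seq]++
    }
    END {
      for (k in seen) {
        if (seen[k] > 1) printf "%s retransmitted %d times\n", k, seen[k] - 1
      }
    }
  '
}

# Example, with the capture saved to a file first:
#   count_syn_retransmits < capture.txt
```

A SYN that is retransmitted on the usual 1 s, 2 s, 4 s backoff without any reply suggests the packet reaches the pod's network namespace but is dropped before the listener answers.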

Member

thockin commented Feb 24, 2018

It's not clear to me what that tcpdump is showing - traffic between 2 containers on a single node? Via nodeport or direct? Dumping from which namespace?

https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/

rogovst commented Mar 22, 2018

The problem was net.ipv4.tcp_tw_recycle = 1.
After removing that setting, everything works as expected.
Thank you.
Please close the issue.
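
For anyone hitting the same symptom, a small sketch for finding where the setting is enabled. `tcp_tw_recycle` is known to cause problems for clients behind NAT and was removed in Linux 4.12, so the key only exists on older kernels such as the 3.10 kernel in this report. A plausible explanation here, not confirmed in the thread, is that masqueraded NodePort traffic and the kubelet probes reach the pod from the same source address with unrelated TCP timestamps, so the kernel discards the SYNs that look stale. The function name `find_tw_recycle` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list config lines that enable net.ipv4.tcp_tw_recycle (any non-zero value).
# Prints matches as FILE:LINE:TEXT and returns non-zero when nothing is found.

find_tw_recycle() {
  grep -HnE '^[[:space:]]*net\.ipv4\.tcp_tw_recycle[[:space:]]*=[[:space:]]*[1-9]' "$@" 2>/dev/null
}

# Example:
#   find_tw_recycle /etc/sysctl.conf /etc/sysctl.d/*.conf
#   sysctl net.ipv4.tcp_tw_recycle         # show the running value
#   sysctl -w net.ipv4.tcp_tw_recycle=0    # disable until the next reboot (needs root)
```

After removing the line from the config file, `sysctl -w` applies the change to the running kernel without a reboot.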

Member

thockin commented Mar 23, 2018
