Canary ingress sharing the same backend service as the main ingress causes 503 Service Unavailable (when in the same namespace) #3952

Closed
kidlj opened this issue Apr 1, 2019 · 6 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

kidlj commented Apr 1, 2019

Is this a BUG REPORT or FEATURE REQUEST? (choose one):

BUG REPORT

NGINX Ingress controller version:

image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.23.0

Kubernetes version (use kubectl version):

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-04T04:48:03Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"darwin/amd64"}

Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.6", GitCommit:"ab91afd7062d4240e95e51ac00a18bd58fddd365", GitTreeState:"clean", BuildDate:"2019-02-26T12:49:28Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
$ cat /etc/os-release 
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

What happened:

When a canary ingress (weight, header, or cookie based) is configured in the same namespace as the main ingress and points to the same backend service as the main ingress, the controller responds with 503 Service Unavailable for 100% of requests to the configured ingress path.

What you expected to happen:

Normally a canary ingress would point at a different version of the service, but pointing it at the same service as the main ingress should not cause 503 responses.
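
For context, a typical canary setup points the canary ingress at a separate Service backed by the new version of the workload, so the controller builds a distinct upstream for it. A minimal sketch (the v2 names and the second Deployment they imply are illustrative, not part of this issue):

apiVersion: v1
kind: Service
metadata:
  name: http-svc-v2            # hypothetical Service for the canary version
  namespace: echo-production
spec:
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: http-svc-v2           # pods from a second Deployment running the new image

The canary ingress would then reference serviceName: http-svc-v2 instead of http-svc.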

How to reproduce it (as minimally and precisely as possible):

The two ingresses are in the same namespace:

$ kubectl get ingress -n=echo-production
NAME              HOSTS      ADDRESS   PORTS   AGE
http-svc          echo.com             80      6h45m
http-svc-canary   echo.com             80      163m

The main ingress:

$ kubectl get ingress -n=echo-production http-svc -o yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{"kubernetes.io/ingress.class":"nginx"},"name":"http-svc","namespace":"echo-production"},"spec":{"rules":[{"host":"echo.com","http":{"paths":[{"backend":{"serviceName":"http-svc","servicePort":80}}]}}]}}
    kubernetes.io/ingress.class: nginx
  creationTimestamp: "2019-04-01T03:31:54Z"
  generation: 1
  name: http-svc
  namespace: echo-production
  resourceVersion: "2909034"
  selfLink: /apis/extensions/v1beta1/namespaces/echo-production/ingresses/http-svc
  uid: b1e085cb-542e-11e9-a0a9-fa163e7b2db1
spec:
  rules:
  - host: echo.com
    http:
      paths:
      - backend:
          serviceName: http-svc
          servicePort: 80
status:
  loadBalancer: {}

The http-svc-canary ingress is in the same namespace as the http-svc ingress and uses the same backend service as the main ingress:

$ kubectl get ingress -n=echo-production http-svc-canary -o yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "30"
  creationTimestamp: "2019-04-01T07:34:05Z"
  generation: 1
  name: http-svc-canary
  namespace: echo-production
  resourceVersion: "2929466"
  selfLink: /apis/extensions/v1beta1/namespaces/echo-production/ingresses/http-svc-canary
  uid: 872f3ff7-5450-11e9-a0a9-fa163e7b2db1
spec:
  rules:
  - host: echo.com
    http:
      paths:
      - backend:
          serviceName: http-svc
          servicePort: 80
status:
  loadBalancer: {}

With echo.com pointed at the controller IP in /etc/hosts:

$ curl -s http://echo.com
<html>
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx/1.15.9</center>
</body>
</html>
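
To confirm the 503 comes from nginx rather than from the echoserver pod, one quick check (assuming the Service and pod themselves are healthy) is to bypass the ingress and hit the Service directly:

# forward a local port to the http-svc Service and request it without going through nginx
$ kubectl -n echo-production port-forward svc/http-svc 8080:80 &
$ curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8080/

If the backend is healthy this returns 200, which would indicate the 503 above is generated by the ingress controller itself.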

The http-svc yaml:

$ cat demo-echo-service.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: http-svc
spec:
  replicas: 1
  selector:
    matchLabels:
      app: http-svc
  template:
    metadata:
      labels:
        app: http-svc
    spec:
      containers:
      - name: http-svc
        image: gcr.io/kubernetes-e2e-test-images/echoserver:2.1
        ports:
        - containerPort: 8080
        env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP

---

apiVersion: v1
kind: Service
metadata:
  name: http-svc
  labels:
    app: http-svc
spec:
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: http-svc

Anything else we need to know:

  1. The main ingress and the canary ingress are in the same namespace;
  2. The main ingress and the canary ingress use the same backend service;
  3. 100% of requests return 503;
  4. The ingress controller pod is running normally.
ElvinEfendi (Member) commented:

Yeah this is not good - patches are welcome :) Otherwise we will address it sometime later.

joshsouza commented:

Just to add some confirmation/information here that may be helpful (I'm still troubleshooting a slightly different problem, but I believe it's related):

It appears to me that the canary settings are tied to the upstream/backend that they use, and the first ingress rule that generates the upstream's config will "win" in defining that upstream. This prevents you from using the same backend for multiple ingresses when any one of them uses canary (since the upstream will be created without the canary traffic-shaping settings); conversely, I imagine that if the canary created the upstream first, you might inadvertently propagate traffic-shaping rules to non-canary ingresses. (I haven't confirmed that yet; it just seems logical.)

I don't know if there's a fix short of creating a separate "canary" upstream for canary ingresses, but I haven't looked far enough into the code to know whether that's appropriate.

Hopefully this information is useful for others, or for investigating this further.
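
If that reading is correct, one possible workaround (untested here; the alias Service name is illustrative) is to give the canary ingress its own Service object selecting the same pods, so the controller generates a separate upstream for the canary instead of reusing the main one:

apiVersion: v1
kind: Service
metadata:
  name: http-svc-canary        # hypothetical alias Service used only by the canary ingress
  namespace: echo-production
spec:
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: http-svc              # same pods as the main http-svc Service

The canary ingress would then use serviceName: http-svc-canary while still routing to the same pods.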

fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Sep 25, 2019
fejta-bot commented:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Oct 26, 2019
fejta-bot commented:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

k8s-ci-robot (Contributor) commented:

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
