Ingress controller not starting #2994

Closed
Foxsa opened this Issue Aug 27, 2018 · 20 comments

Foxsa commented Aug 27, 2018

NGINX Ingress controller version: 0.18.0

Kubernetes version (use kubectl version): v1.11.1

Environment:

  • Cloud provider or hardware configuration: HW
  • OS (e.g. from /etc/os-release): Ubuntu 18.04.1
  • Kernel (e.g. uname -a): 4.15.0-29

What happened:
After upgrading the controller from 0.17.1 to 0.18.0, it does not start and writes the following to the logs:

-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:    0.18.0
  Build:      git-7b20058
  Repository: https://github.com/kubernetes/ingress-nginx.git
-------------------------------------------------------------------------------

nginx version: nginx/1.15.2
W0827 09:10:13.940665       9 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0827 09:10:13.940907       9 main.go:191] Creating API client for https://10.96.0.1:443
I0827 09:10:13.982783       9 main.go:235] Running in Kubernetes cluster version v1.11 (v1.11.1) - git (clean) commit b1b29978270dc22fecc592ac55d903350454310a - platform linux/amd64
I0827 09:10:14.002798       9 main.go:100] Validated ingress-nginx/default-http-backend as the default backend.
I0827 09:10:14.153962       9 nginx.go:255] Starting NGINX Ingress controller
 ResourceVersion:"2636042", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress kube-system/dashboard
I0827 09:10:15.263295       9 backend_ssl.go:68] Adding Secret "kube-system/wild" to the local store
I0827 09:10:15.263483       9 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"consent", UID:"250ae80d-9aee-11e8-8514-a4bf012da507", APIVersion:"extensions/v1beta1", ResourceVersion:"2636043", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress default/consent
I0827 09:10:15.264786       9 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"static-content", UID:"fe34ea2d-a52d-11e8-8514-a4bf012da507", APIVersion:"extensions/v1beta1", ResourceVersion:"3058967", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress default/static-content
I0827 09:10:15.354651       9 nginx.go:684] Starting TLS proxy for SSL Passthrough
F0827 09:10:15.354735       9 nginx.go:696] listen tcp :443: bind: permission denied

How to reproduce it (as minimally and precisely as possible):
Config:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ingress-nginx
  template:
    metadata:
      labels:
        app: ingress-nginx
      annotations:
        prometheus.io/port: '10254'
        prometheus.io/scrape: 'true'
    spec:
      serviceAccountName: nginx-ingress-serviceaccount
      containers:
        - name: nginx-ingress-controller
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.18.0
          args:
            - /nginx-ingress-controller
            - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
            - --configmap=$(POD_NAMESPACE)/nginx-configuration
            - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
            - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
            - --publish-service=$(POD_NAMESPACE)/ingress-nginx
            - --annotations-prefix=nginx.ingress.kubernetes.io
            - --enable-ssl-passthrough
          securityContext:
            capabilities:
              drop:
                - ALL
              add:
                - NET_BIND_SERVICE
            # www-data -> 33
            runAsUser: 33
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          ports:
            - name: http
              containerPort: 80
            - name: https
              containerPort: 443
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
  labels:
    app: ingress-nginx
data:
  client-body-buffer-size: 32M
  hsts: "true"
  proxy-body-size: 1G
  proxy-buffering: "off"
  proxy-read-timeout: "600"
  proxy-send-timeout: "600"
  server-tokens: "false"
  ssl-redirect: "false"
  upstream-keepalive-connections: "50"
  use-proxy-protocol: "false"

Anything else we need to know:

@aledbf aledbf added the kind/bug label Aug 27, 2018

JessieAMorris Aug 28, 2018

I've been getting a similar issue though it seems to only be with TCP services.

[nginx-ingress-public-controller-6d876f78f7-8vkkt] Error: exit status 1 
[nginx-ingress-public-controller-6d876f78f7-8vkkt] nginx: the configuration file /tmp/nginx-cfg188813845 syntax is ok 
[nginx-ingress-public-controller-6d876f78f7-8vkkt] 2018/08/28 17:44:40 [emerg] 124#124: bind() to 0.0.0.0:23 failed (2: No such file or directory) 
[nginx-ingress-public-controller-6d876f78f7-8vkkt] nginx: [emerg] bind() to 0.0.0.0:23 failed (2: No such file or directory) 
[nginx-ingress-public-controller-6d876f78f7-8vkkt] nginx: configuration file /tmp/nginx-cfg188813845 test failed

I rolled back to 0.17.1 and it's working correctly again. I'm not sure what changed between 0.17.1 and 0.18.0.

mkjpryor-stfc Aug 28, 2018

I am also seeing the bind error with 0.18.0 on GKE. Rolling back to 0.17.1 fixes it.

@Foxsa Foxsa changed the title from Ingress controller no starting to Ingress controller not starting Aug 30, 2018

Foxsa Aug 31, 2018

I tried updating from 0.17.1 to 0.19.0, with the same result.
However, on another cluster I successfully updated from 0.13.0 to 0.19.0.

rochacon Aug 31, 2018

@Foxsa From your logs [1], it seems like this could be related to SSL passthrough. Does the other cluster you just mentioned also have it enabled?

[1]

I0827 09:10:15.354651       9 nginx.go:684] Starting TLS proxy for SSL Passthrough
F0827 09:10:15.354735       9 nginx.go:696] listen tcp :443: bind: permission denied

mkjpryor-stfc Sep 1, 2018

@Foxsa @rochacon

I also have SSL pass-through enabled, so maybe that is the common factor...

timm088 Sep 2, 2018

Confirmed for me as well, on 0.18.0 and 0.19.0 when enabling ssl-passthrough (it was my only option for TLS termination at the pod level when using gRPC).

VampireDaniel Sep 3, 2018

same issue,

NGINX Ingress controller
  Release:    0.19.0
  Build:      git-05025d6
I0903 03:08:42.221794       9 backend_ssl.go:68] Adding Secret "kube-system/ingress-nginx-tls-certs" to the local store
I0903 03:08:42.317294       9 nginx.go:686] Starting TLS proxy for SSL Passthrough
F0903 03:08:42.317441       9 nginx.go:698] listen tcp :443: bind: permission denied

Foxsa Sep 3, 2018

@rochacon Yes, SSL pass-through is disabled on that other cluster. That could well be the case.

aledbf Sep 3, 2018

Member

I can confirm this is an issue when SSL pass-through is enabled. The error is related to the securityContext settings, where we use runAsUser: 33. To run as that user, we use authbind, which works just fine for binaries not generated with Go. To fix this issue, I am planning to remove the Go proxy that enables SSL pass-through and use the SNI preread feature from NGINX.

Edit: as a workaround, using runAsUser: 0 and adding the flag --enable-dynamic-certificates should fix this error (using 0.19.0)
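
For reference, here is a rough sketch of how that workaround might look when applied to the container spec from the original report. It is only a fragment, not a tested manifest: the image tag is assumed to be 0.19.0 based on the note above, the args list is shortened to the relevant flags, and the thread does not say whether the capabilities block also needs to change, so it is left as in the original report. Keep in mind that running the container as root has security implications.

```yaml
# Sketch only: fragment of spec.template.spec from the Deployment in the
# original report, with the two changes described in the workaround.
containers:
  - name: nginx-ingress-controller
    # Assumption: same image as the original report, with the 0.19.0 tag.
    image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.19.0
    args:
      - /nginx-ingress-controller
      - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
      - --configmap=$(POD_NAMESPACE)/nginx-configuration
      - --enable-ssl-passthrough
      # Flag added as part of the workaround.
      - --enable-dynamic-certificates
    securityContext:
      capabilities:
        drop:
          - ALL
        add:
          - NET_BIND_SERVICE
      # Changed from 33 (www-data) to 0 (root) as part of the workaround.
      runAsUser: 0
```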

nrobert13 Sep 3, 2018

I tried this workaround, but I'm getting:
Error generating self-signed certificate: could not create temp pem file /etc/ingress-controller/ssl/default-fake-certificate.pem: open /etc/ingress-controller/ssl/default-fake-certificate.pem993917147: permission denied

running with:

        args:
          - /nginx-ingress-controller
          - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
          - --configmap=$(POD_NAMESPACE)/nginx-configuration
          - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
          - --udp-services-configmap=$(POD_NAMESPACE)/udp-services
          - --publish-service=$(POD_NAMESPACE)/ingress-nginx
          - --annotations-prefix=nginx.ingress.kubernetes.io
          - --enable-ssl-passthrough
          - --enable-dynamic-certificates
          - --enable-dynamic-configuration
          - --enable-ssl-chain-completion=false
          - --update-status

stephen-dahl Sep 3, 2018

@aledbf I am using

helm install stable/nginx-ingress --name ingress-nginx --namespace ingress-nginx --set controller.extraArgs.enable-ssl-passthrough="" --set controller.extraArgs.enable-dynamic-certificates=""

How do I set runAsUser?

aledbf Sep 4, 2018

Member

To all users affected by this issue: please use quay.io/kubernetes-ingress-controller/nginx-ingress-controller-amd64:fix-tcp-udp

stephen-dahl Sep 4, 2018

I can't use that tag:

helm install stable/nginx-ingress --name ingress-nginx --namespace ingress-nginx --set controller.extraArgs.enable-ssl-passthrough="" --set controller.image.repository="quay.io/kubernetes-ingress-controller/nginx-ingress-controller-amd64" --set controller.image.tag="fix-tcp-udp"
Error: render error in "nginx-ingress/templates/controller-deployment.yaml": template: nginx-ingress/templates/controller-deployment.yaml:51:22: executing "nginx-ingress/templates/controller-deployment.yaml" at <semverCompare ">=0.9...>: error calling semverCompare: Invalid Semantic Version

GeertJohan Sep 6, 2018

The tag fix-tcp-udp is not usable from helm. A better workaround seems to be to use helm install ..etc. with --version 0.17.1 to select the previous version of the nginx-ingress chart.

aledbf Sep 6, 2018

Member

@stephen-dahl @GeertJohan that's on purpose. This image is just a temporary workaround until the next release. Install the helm chart and then use kubectl set image to switch to the fix-tcp-udp image.

OrlinVasilev Sep 13, 2018

Will that fix be ported to a mainstream release, something like 0.19.1?

aledbf Sep 13, 2018

Member

@OrlinVasilev already done here #3038 (part of 0.20)

OrlinVasilev Sep 13, 2018

@aledbf - where can I find the release date of 0.20?

aledbf Sep 13, 2018

Member

@OrlinVasilev We release every three or four weeks. The next release should be in a week or two.
Please check https://github.com/kubernetes/ingress-nginx/projects/27

ReSearchITEng Oct 3, 2018

@stephen-dahl @GeertJohan that's on purpose. This image is just a temporary workaround until the next release. Install the helm chart and then use kubectl set image to use the fix-tcp-udp image.

Should anyone need the command:

kubectl -n kube-system set image deployments/nginx-ingress-controller *=quay.io/kubernetes-ingress-controller/nginx-ingress-controller-amd64:fix-tcp-udp
