
tcp-services-configmap and udp-services-configmap are not applied upon upgrade to k8s version using nginx-0.49.3-rancher1 #35943

Closed
snasovich opened this issue Dec 21, 2021 · 13 comments

@snasovich (Collaborator)

Issue description:
Upon upgrading a Rancher-provisioned Kubernetes cluster to a version that uses rancher/nginx-ingress-controller:nginx-0.49.3-rancher1 (e.g. Kubernetes v1.20.12 or v1.19.16), the following args are absent from the nginx-ingress-controller DaemonSet, and any configured tcp-services and udp-services stop working (see the DaemonSet sketch after the list):

  • --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
  • --udp-services-configmap=$(POD_NAMESPACE)/udp-services
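
A minimal sketch of where these flags normally sit on the controller container (abbreviated; the container name and surrounding fields are illustrative, the two flags are as reported above):

spec:
  template:
    spec:
      containers:
        - name: nginx-ingress-controller
          args:
            - /nginx-ingress-controller
            # The two flags missing after the upgrade:
            - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
            - --udp-services-configmap=$(POD_NAMESPACE)/udp-services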

Business impact:
Any configured tcp-services and udp-services break

Troubleshooting steps:
N/A

Repro steps:

  • Provision a Rancher v2.6.2 instance and a single node custom cluster with v1.20.11-rancher1-1 (or v1.19.13-rancher1-1)
  • Add an nginx Deployment to the default Namespace with a ClusterIP service to port 80
  • Add 8080: default/nginx:80 to the tcp-services ConfigMap in the ingress-nginx Namespace (see the ConfigMap sketch after these steps)
  • Confirm access to nginx works via :8080
  • Upgrade to v1.20.12-rancher1-1 (or v1.19.16-rancher1-1)
  • Confirm access to nginx no longer works via :8080
  • Observe missing args on nginx-ingress-controller DaemonSet 
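
For reference, the tcp-services ConfigMap from the third step would look roughly like this (a sketch; per the upstream ingress-nginx docs on exposing TCP/UDP services, the data key is the exposed port and the value is <namespace>/<service name>:<service port>):

apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
data:
  # <exposed port>: <namespace>/<service name>:<service port>
  "8080": default/nginx:80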

Actual behavior:
tcp-services and udp-services stop working

Expected behavior:
tcp-services and udp-services continue to work

Files, logs, traces:
N/A

Additional notes:
The workaround is to set the tcp-services-configmap and udp-services-configmap arguments via extra_args:

ingress:
  extra_args:
    tcp-services-configmap: $(POD_NAMESPACE)/tcp-services
    udp-services-configmap: $(POD_NAMESPACE)/udp-services
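
(This extra_args block lives under ingress in the cluster's YAML configuration, e.g. RKE1's cluster.yml or the cluster's YAML in Rancher.)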

JIRA ID: SURE-3636, SURE-3696

@snasovich snasovich added internal area/ingress [zube]: To Triage area/provisioning-rke1 Provisioning issues with RKE1 team/hostbusters The team that is responsible for provisioning/managing downstream clusters + K8s version support labels Dec 21, 2021
@snasovich snasovich added this to the v2.6.4 - Triaged milestone Dec 21, 2021
@snasovich (Collaborator, Author)

This could be a regression in nginx-0.49.3-rancher1 (https://github.com/rancher/kontainer-driver-metadata/blob/dev-v2.6/rke/templates/nginx-v0.49.3.go) compared to nginx-v0.48.1.go, in which case it should be a higher priority.
Scheduling for 2.6.4 to test this.

@kinarashah (Member) commented Jan 20, 2022

Issue:
This is a regression between the upstream templates nginx-ingress-v0.30.0 and controller-v0.31.0.
The best guess is that the flags were removed when upstream refactored the code to maintain charts and static deploy files.

The current default behavior for charts is that the flags are passed by default:
https://github.com/kubernetes/ingress-nginx/blob/main/charts/ingress-nginx/templates/_params.tpl#L17.

But RKE1 uses static deploy YAML files, from which the flags and ConfigMaps were removed.

Solution

  • The fix in KDM is to add the ConfigMaps and flags back to the nginx-ingress template. Since we do not update existing templates after they're released, this fix will be available in new templates introduced with the following Kubernetes versions:

     1.19.16-rancher1-3
     1.20.15-rancher1-1
     1.21.9-rancher1-1
     1.22.6-rancher1-1
    
  • Release-note the workaround for existing versions. The workaround mentioned in the issue works as expected:

    Workaround is to set tcp-services-configmap and udp-services-configmap arguments via extra_args:
    
    ingress:
       extra_args:
          tcp-services-configmap: $(POD_NAMESPACE)/tcp-services
          udp-services-configmap: $(POD_NAMESPACE)/udp-services
    

    Upgrading to any of the following Kubernetes versions will result in tcp-services-configmap and udp-services-configmap not being passed:

    v1.19.x:  >=1.19.16-rancher1-1, <1.19.16-rancher1-3
    v1.20.x:  >=1.20.12-rancher1-1, <1.20.15-rancher1-1
    v1.21.x:  <1.21.9-rancher1-1
    v1.22.x:  <1.22.6-rancher1-1
    

@vivek-shilimkar (Member) commented Feb 2, 2022

Validated on Rancher v2.6.3.

Observed the following results:

Validation scenario 1: k8s cluster uses nginx versions 0.35-rancher2 and 0.49.3-rancher1

  • Provisioned a single node custom cluster with v1.18.20-rancher1-3.
  • Added an nginx Deployment to the default Namespace with a ClusterIP service to port 80
  • Added 8080: default/nginx:80 to the tcp-services ConfigMap in the ingress-nginx Namespace
  • Confirmed access to nginx works via :8080
  • Upgraded to v1.19.16-rancher1-2 (nginx version 0.49.3-rancher1)
  • Access to nginx no longer works via :8080.
  • The following arguments are missing from the nginx-ingress-controller DaemonSet:

--tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
--udp-services-configmap=$(POD_NAMESPACE)/udp-services

  • The workaround is needed to restore access to nginx via :8080.

Validation scenario 2: k8s cluster uses nginx version 0.49.3-rancher1

  • Provisioned a single node custom cluster with v1.19.16-rancher1-2.
  • Added an nginx Deployment to the default Namespace with a ClusterIP service to port 80
  • Added 8080: default/nginx:80 to the tcp-services ConfigMap in the ingress-nginx Namespace
  • Access to nginx failed via :8080
  • Added the following two arguments to the nginx-ingress-controller DaemonSet:

--tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
--udp-services-configmap=$(POD_NAMESPACE)/udp-services

  • Access to nginx works via :8080
  • Upgraded to v1.20.14-rancher2-2 (nginx version = 0.49.3-rancher1)
  • Access to nginx still works via :8080.

Validation scenario 3: k8s cluster uses nginx version 1.1.0-rancher1

  • Provisioned a single node custom cluster with v1.19.16-rancher1-3.
  • Added an nginx Deployment to the default Namespace with a ClusterIP service to port 80
  • Added 8080: default/nginx:80 to the tcp-services ConfigMap in the ingress-nginx Namespace
  • Confirmed access to nginx works via :8080
  • Upgraded to v1.20.14-rancher2-2 (nginx version = 1.1.0-rancher1)
  • Access to nginx still works via :8080.
  • No workaround required.

@sowmyav27 (Contributor)

Waiting on @kinarashah to add this issue to the release notes to close it out.

@sowmyav27 (Contributor)

@vivek-infracloud Can you validate this use case:

  • Deploy the 1.21 and 1.22 versions on dev-v2.6 KDM on 2.6.3, and validate that the nginx version is 1.1.0-rancher1 and that the ConfigMaps in the ingress-nginx Namespace are available.

@vivek-shilimkar (Member)

Validated on Rancher v2.6.3 with KDM pointing to dev-v2.6.

Observed the following results:

  • Provisioned single node custom clusters with v1.21.9-rancher1-1 and v1.22.6-rancher1-1.
  • Validated that the nginx version is 1.1.0-rancher1 and that the ConfigMaps are present in the ingress-nginx namespace.
  • Added an nginx Deployment to the default Namespace with a ClusterIP service to port 80
  • Added 8080: default/nginx:80 to the tcp-services ConfigMap in the ingress-nginx Namespace
  • Access to nginx doesn't work via :8080

Modified the nginx-ingress-controller template in order for nginx access via :8080 to work: added hostNetwork: true and waited for the cluster to update (a minimal sketch follows).
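
A minimal sketch of that change as it lands on the DaemonSet pod spec (abbreviated; surrounding fields omitted):

spec:
  template:
    spec:
      # Bind the controller directly to the node's network namespace so
      # tcp-services ports are reachable on the node IP:
      hostNetwork: true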

  • Access to nginx works via :8080

@samkulkarni20 (Contributor)

@kinarashah On inspection of the nginx template files in KDM, I noticed that they set hostNetwork: true on the nginx controller if NetworkMode is set to hostNetwork. @vivek-infracloud mentioned he followed the exact same process for the 1.22 and 1.21 clusters as he did to provision the 1.19 and 1.20 clusters, but there's still a difference in the final result.

@abhi1693 commented Feb 4, 2022

This issue also impacted Rancher v2.5.12

@kinarashah (Member)

@samkulkarni20 Thanks for looking into it. We changed the network mode to hostPort intentionally for k8s >=1.21; with hostPort, only the controller's declared HTTP/HTTPS ports are published on the node, so additional tcp-services ports are not reachable on the node IP unless hostNetwork is enabled. For reference, here's the docs issue (https://github.com/rancher/docs/issues/3420) and the issue we fixed it for, which has more context: #33792. So this behavior is expected, and the issue can be closed as is.

Could we test this for dev-v2.5 as well, since the same new template was also introduced for 2.5.x? I created a backport issue for the validation: #36433

@vivek-shilimkar (Member)

Based on @kinarashah's comment, the behaviour described above is expected.

Hence, closing the issue.

@yingbo-wu

With RKE v1.3.7 and v1.22.6-rancher1-1, the problem remains.

@snasovich snasovich changed the title tcp-services-configmap and udp-services-configmap upon upgrade to k8s version using nginx-0.49.3-rancher1 tcp-services-configmap and udp-services-configmap are not applied upon upgrade to k8s version using nginx-0.49.3-rancher1 Feb 27, 2022
@kinarashah (Member)

@samkulkarni20 In reference to the above comment, could we test this on 1.22.6-rancher1-1 with RKE v1.3.7 and make sure it works? cc @slickwarren

@rishabhmsra (Contributor)

Validated this on k8s 1.22.6-rancher1-1 using RKE v1.3.7.
Observed the following results:

  • Provisioned a single node v1.22.6-rancher1-1 k8s cluster using RKE 1.3.7
  • Imported it into Rancher v2.6.3
  • Validated the nginx version; it's nginx-1.1.0-rancher1
  • Created an nginx Deployment (ClusterIP, port 80) in the default namespace
  • Added 8080: default/nginx:80 to the tcp-services ConfigMap in the ingress-nginx Namespace
  • Access to nginx via <worker-node-public-ip>:8080 -> did NOT work

Created another v1.22.6-rancher1-1 cluster using RKE 1.3.7 by following the steps below:

  • After rke config, added network_mode: hostNetwork (as mentioned here) under ingress in cluster.yml (see the cluster.yml sketch after these steps), then ran rke up
  • Once the cluster provisioned successfully, imported it into Rancher v2.6.3
  • Validated the nginx version; it's nginx-1.1.0-rancher1
  • hostNetwork: true was set on the nginx-ingress-controller DaemonSet
  • Created an nginx Deployment (ClusterIP, port 80) in the default namespace
  • Added 8080: default/nginx:80 to the tcp-services ConfigMap in the ingress-nginx Namespace
  • Access to nginx via <worker-node-public-ip>:8080 -> working
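
For reference, a minimal cluster.yml sketch of that ingress setting (field names per the RKE1 ingress docs; other settings omitted):

# cluster.yml (RKE1), abbreviated
ingress:
  provider: nginx
  # Run the controller with hostNetwork instead of the hostPort default
  # used for k8s >= 1.21:
  network_mode: hostNetwork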
