[BUG] cannot drop the username and password of a private registry in the secret cattle-system/cattle-private-registry in the downstream cluster once it is set on RKE1 downstream cluster #45605

Closed
jiaqiluo opened this issue May 25, 2024 · 6 comments
Assignees: jiaqiluo
Labels: internal, kind/bug, team/hostbusters
Milestone: v2.9-Next1

@jiaqiluo (Member)

Rancher Server Setup

  • Rancher version: SUSE Rancher 2.7.9, SUSE Rancher v2.8-Next2
  • Installation option (Docker install/Helm Chart): any
    • If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc):
  • Proxy/Cert Details:

Information about the Cluster

  • Kubernetes version: any
  • Cluster Type (Local/Downstream): Downstream
    • If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider): RKE1 node-driver cluster

User Information

  • What is the role of the user logged in? Admin/Cluster Owner

Describe the bug

When the private_registries parameter is set for a downstream (DS) cluster, Rancher creates a cattle-private-registry secret in the cattle-system namespace of that DS cluster containing the Docker config file.

If you ever set the username and password and later want to unset both of them, the secret in the DS cluster will still contain the credential data.
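
For illustration, the data/.dockerconfigjson key of that secret holds a standard Docker config JSON. Below is a minimal Go sketch of that payload's shape; the registry URL and credentials are placeholders, not values from this issue:

// Sketch of the payload stored under data/.dockerconfigjson in the
// cattle-system/cattle-private-registry secret. The registry URL and
// credentials below are placeholders.
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// dockerConfig mirrors the layout of a kubernetes.io/dockerconfigjson secret.
type dockerConfig struct {
	Auths map[string]registryAuth `json:"auths"`
}

type registryAuth struct {
	Username string `json:"username,omitempty"`
	Password string `json:"password,omitempty"`
	Auth     string `json:"auth,omitempty"` // base64("username:password")
}

func main() {
	cfg := dockerConfig{Auths: map[string]registryAuth{
		"registry.example.com": {
			Username: "user",
			Password: "pass",
			Auth:     base64.StdEncoding.EncodeToString([]byte("user:pass")),
		},
	}}
	out, _ := json.MarshalIndent(cfg, "", "  ")
	fmt.Println(string(out))
	// The bug: after unsetting the username and password in the cluster
	// config, this entry keeps its credential fields instead of being
	// reduced to an anonymous {"registry.example.com": {}} entry.
}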

To Reproduce

  • Create a K8s cluster and install Rancher v2.7 or v2.8.
  • Create a private registry and set the repository to public.
  • Create a DS cluster with Terraform (tested on v3.2.0 and v4.1.0) with private_registries set to your private registry, without credentials.
  • Update the DS cluster manually from the UI and set the username and password for the private registry.
  • Let the cluster reconcile and update the cattle-private-registry secret.
  • Run the Terraform code again to update the configuration and unset the username and password for the private registry.

Result

Observe that the secret is still in the DS cluster with the credentials set.

Expected Result

The secret does not contain the credentials after the cluster configuration changes.

SURE-8429

@jiaqiluo jiaqiluo added the kind/bug Issues that are defects reported by users or that we know have reached a real release label May 25, 2024
@jiaqiluo jiaqiluo self-assigned this May 25, 2024
@jiaqiluo jiaqiluo added team/hostbusters The team that is responsible for provisioning/managing downstream clusters + K8s version support internal labels May 25, 2024
@jiaqiluo jiaqiluo changed the title [BUG] cannot drop the username and password from a private registry once it is set on RKE1 downstream cluster [BUG] cannot drop the username and password of a private registry in the secret cattle-system/cattle-private-registry in the downstream cluster once it is set on RKE1 downstream cluster May 25, 2024
@jiaqiluo jiaqiluo added this to the v2.9-Next1 milestone Jun 12, 2024
@jiaqiluo (Member Author)

Root cause

When updating the private registries on an RKE1 downstream cluster, Rancher always skips any entry whose password is empty. It assumes an empty password means the credentials have already been migrated to a Secret, so skipping the entry avoids wiping the password out of that Secret. This logic works well in most cases, with one exception: once the username and password are set for a private registry on an RKE1 downstream cluster, it is impossible to unset both values at the same time, for example when the registry no longer requires a login or the credentials were set by mistake in the first place.

What was fixed, or what changes have occurred

The logic is updated such that, when updating the private registries on an RKE1 downstream cluster, Rancher now skips a private registry only if it meets all of the following conditions (see the sketch after this list):

  • its password is empty
  • it can be found in the list of existing private registries
  • its username is unchanged
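
A minimal Go sketch of the corrected skip condition; the package, type, and function names below are hypothetical and do not come from the actual Rancher source:

// Illustrative sketch of the corrected skip logic; names are hypothetical.
package provisioning

type privateRegistry struct {
	URL      string
	User     string
	Password string
}

// shouldSkip reports whether an incoming registry entry should be left
// untouched so its password, already migrated to the Secret, survives.
// All three conditions must now hold; previously an empty password alone
// was enough, which made the credentials impossible to unset.
func shouldSkip(incoming privateRegistry, existing map[string]privateRegistry) bool {
	if incoming.Password != "" {
		return false // an explicit password is always applied
	}
	current, ok := existing[incoming.URL]
	if !ok {
		return false // not a known registry, so nothing to preserve
	}
	return incoming.User == current.User // username unchanged => keep stored credentials
}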

Areas or cases that should be tested

A matrix of cases can be derived from creating/updating a DS RKE1 cluster with/without a private registry that does/doesn't have a username and/or password. In all cases, the cattle-private-registry Secret, whose name is recorded at .State.privateRegistrySecret on the management cluster, should be updated properly. A table-driven outline of this matrix is sketched below.
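
As a rough illustration only, the matrix could be enumerated in a Go table-driven test; everything below (package name, case list, assertion hooks) is hypothetical and not taken from Rancher's actual test suite:

// Hypothetical table-driven outline of the test matrix.
package registry_test

import "testing"

func TestCattlePrivateRegistrySecret(t *testing.T) {
	cases := []struct {
		name            string
		user, password  string
		wantCredentials bool // should the Secret contain username/password?
	}{
		{"no credentials", "", "", false},
		{"username only", "user", "", true},
		{"username and password", "user", "pass", true},
		{"credentials set then unset", "", "", false},
	}
	for _, c := range cases {
		t.Run(c.name, func(t *testing.T) {
			// Create or update a DS RKE1 cluster with c.user/c.password,
			// then assert that the Secret named in .State.privateRegistrySecret
			// contains credentials iff c.wantCredentials.
		})
	}
}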

What areas could experience regressions?

The same as the above.

Are the repro steps accurate/minimal?

Yes.
Note that it is not necessary to use Terraform, since the bug is on the Rancher side, not in the terraform-provider-rancher2.

@jiaqiluo (Member Author)

The issue can be validated on the latest v2.9-head tag.

@markusewalker (Contributor)

This issue is waiting for an alpha/RC to properly test.

@markusewalker markusewalker added the Waiting for RC Waiting for an RC before this ticket can move. label Jun 21, 2024
@markusewalker markusewalker removed the Waiting for RC Waiting for an RC before this ticket can move. label Jul 3, 2024
@markusewalker (Contributor) commented Jul 3, 2024

QA TEST PLAN

# Scenario
1 Unset private registry username from downstream RKE1 cluster
2 Unset private registry password from downstream RKE1 cluster
3 Unset both the private registry username and password from downstream RKE1 cluster

@markusewalker (Contributor) commented Jul 3, 2024

Reproduced the issue on v2.8.4 for scenario 1. See below:

#  Scenario                                                        Result
1  Unset private registry username from downstream RKE1 cluster    Reproduced

REPRODUCTION STEPS

  1. Set up Rancher v2.8.3.
  2. Provisioned a downstream RKE1 node driver cluster with an authenticated registry.
  3. Took note of the cattle-private-registry secret and its values.
  4. Updated the private registry to unset the username.
  5. Validated that the cattle-private-registry secret is still present and that data/.dockerconfigjson still exists with the same value.

Now that the issue has been reproduced, I will attempt to validate it with a tag that has the fix.

@markusewalker (Contributor)

Validated that this is addressed in v2.9.0-alpha7. See details below:

ENVIRONMENT DETAILS

  • Rancher install: Docker
  • Rancher version: v2.9.0-alpha7

TEST RESULT

#  Scenario                                                                             Result
1  Unset private registry username from downstream RKE1 cluster                         PASS
2  Unset private registry password from downstream RKE1 cluster                         PASS
3  Unset both the private registry username and password from downstream RKE1 cluster   PASS

VALIDATION STEPS

Scenario 1

  1. Set up Rancher via the rancher2 Terraform provider. Sample main.tf below:
terraform {
  required_providers {
    rancher2 = {
      source  = "rancher/rancher2"
      version = "4.1.0"
    }
  }
}
provider "rancher2" {
  api_url   = var.rancher_api_url
  token_key = var.rancher_admin_bearer_token
  insecure  = true
}

########################
# CREATE RKE1 CLUSTER
########################
resource "rancher2_cluster" "cluster" {
  name                                                       = var.cluster_name
  default_pod_security_admission_configuration_template_name = var.default_pod_security_admission_configuration_template_name
  rke_config {
    kubernetes_version = var.kubernetes_version
    network {
      plugin = var.network_plugin
    }
    private_registries {
      url      = var.private_registry_url
      user     = var.private_registry_username
      password = var.private_registry_password
    }
  }
}

########################
# CREATE NODE TEMPLATE
########################
resource "rancher2_node_template" "node_template" {
  name = var.node_template_name
  engine_insecure_registry = [var.insecure_registry]
  amazonec2_config {
    access_key     = var.aws_access_key
    secret_key     = var.aws_secret_key
    ami            = var.aws_ami
    region         = var.aws_region
    security_group = [var.aws_security_group_name]
    subnet_id      = var.aws_subnet_id
    vpc_id         = var.aws_vpc_id
    zone           = var.aws_zone
    root_size      = var.aws_root_size
    instance_type  = var.aws_instance_type
  }
}

########################
# CREATE ETCD NODE POOL
########################
resource "rancher2_node_pool" "etcd_node_pool" {
  cluster_id       = rancher2_cluster.cluster.id
  name             = var.etcd_node_pool_name
  hostname_prefix  = var.node_hostname_prefix
  node_template_id = rancher2_node_template.node_template.id
  quantity         = var.etcd_node_pool_quantity
  control_plane    = false
  etcd             = true
  worker           = false
}

########################
# CREATE CP NODE POOL
########################
resource "rancher2_node_pool" "control_plane_node_pool" {
  cluster_id       = rancher2_cluster.cluster.id
  name             = var.control_plane_node_pool_name
  hostname_prefix  = var.node_hostname_prefix
  node_template_id = rancher2_node_template.node_template.id
  quantity         = var.control_plane_node_pool_quantity
  control_plane    = true
  etcd             = false
  worker           = false
}

########################
# CREATE WORKER NODE POOL
########################
resource "rancher2_node_pool" "worker_node_pool" {
  cluster_id       = rancher2_cluster.cluster.id
  name             = var.worker_node_pool_name
  hostname_prefix  = var.node_hostname_prefix
  node_template_id = rancher2_node_template.node_template.id
  quantity         = var.worker_node_pool_quantity
  control_plane    = false
  etcd             = false
  worker           = true
}
  2. Provisioned a downstream RKE1 node driver cluster with an authenticated registry.
  3. Took note of the cattle-private-registry secret and its values.
  4. Updated the registry to unset the username.
  5. Validated that the cattle-private-registry secret no longer has the same value in data/.dockerconfigjson.
    • It is updated with the username removed.
  6. Reverted the change and ensured the value in data/.dockerconfigjson reverted as well.

Scenario 2

  1. Repeated scenario 1, but unset the password instead of the username.
    • The cattle-private-registry secret is updated so that data/.dockerconfigjson reflects the change.

Scenario 3

  1. Repeated scenario 1, but unset both the username and password.
    • The cattle-private-registry secret is updated so that data/.dockerconfigjson reflects the change.
