This repository has been archived by the owner on Jun 13, 2023. It is now read-only.

Module manager cannot fetch module OCI images from local registry in k3d setup as registry aliases of k3s are not respected #136

Closed
kwiatekus opened this issue Oct 14, 2022 · 4 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

@kwiatekus
kwiatekus commented Oct 14, 2022

Description

When I try to set up a local environment in single-cluster mode (k3d), I get errors when enabling the keda module:
Module manager cannot connect to the local registry containing the module image.

The detailed steps I followed are listed below.

Steps to reproduce

  1. Check out https://github.com/kyma-project/keda-manager
  2. Set up a local k3d cluster and a local docker registry
k3d cluster create kyma --registry-create registry.localhost:0.0.0.0:5001
  3. Add an /etc/hosts entry to register the local docker registry under the registry.localhost name
127.0.0.1 registry.localhost
  4. Export ENVs pointing to the module and module image registries
export IMG_REGISTRY=registry.localhost:5001/unsigned/operator-images
export MODULE_REGISTRY=registry.localhost:5001/unsigned
  5. Build the Keda module
make module-build
  6. Build the Keda manager image
make module-image
  7. Verify that the module and the manager's image are pushed to the local registry
curl registry.localhost:5001/v2/_catalog
{"repositories":["unsigned/component-descriptors/kyma.project.io/module/keda","unsigned/operator-images/keda-operator"]}
  8. Edit the template.yaml file and change the target to control-plane
spec:
  target: control-plane
  9. Install modular kyma on the k3d cluster
kyma alpha deploy  --template=./template.yaml

- Kustomize ready
- Lifecycle Manager deployed
- Module Manager deployed
- Modules deployed
- Kyma CR deployed
- Kyma deployed successfully!

Kyma is installed in version:
Kyma installation took:		18 seconds

Happy Kyma-ing! :)

Kyma installation is ready, but no module is activated yet

kubectl get kymas.operator.kyma-project.io -A
NAMESPACE    NAME           STATE   AGE
kcp-system   default-kyma   Ready   71s

Keda Module is a known module, but not activated

kubectl get moduletemplates.operator.kyma-project.io -A 
NAMESPACE    NAME                  AGE
kcp-system   moduletemplate-keda   2m24s
  10. Enable Keda in Kyma

Edit Kyma CR ...

kubectl edit kymas.operator.kyma-project.io -n kcp-system default-kyma

..to add Keda module

spec:
  modules:
  - name: keda
  11. Inspect the logs from module manager:
kubectl logs -n kcp-system module-manager-controller-manager-68559c85cf-jssgw manager -f

1.665754880968694e+09	ERROR	Reconciler error	{"controller": "manifest", "controllerGroup": "operator.kyma-project.io", "controllerKind": "Manifest", "manifest": {"name":"default-kyma-keda","namespace":"kcp-system"}, "namespace": "kcp-system", "name": "default-kyma-keda", "reconcileID": "48060106-abe8-4e8e-be16-57565aab77f3", "error": "Get \"https://registry.localhost:5001/v2/\": dial tcp 172.22.0.3:5001: connect: connection refused; Get \"http://registry.localhost:5001/v2/\": dial tcp 172.22.0.3:5001: connect: connection refused"}

Troubleshoot

When I inspect the registry configuration on my k3d cluster, I see:

cat /etc/rancher/k3s/registries.yaml 
mirrors:
  registry.localhost:5000:
    endpoint:
    - http://registry.localhost:5000
  registry.localhost:5001:
    endpoint:
    - http://registry.localhost:5000
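
The mapping above can be checked mechanically. The following is a minimal sketch, assuming the file keeps the simple two-space layout shown here; `mirror_endpoint` is a hypothetical helper name, not part of k3s or k3d. It prints the first endpoint configured for a given mirror name.

```shell
# mirror_endpoint NAME: read a k3s registries.yaml on stdin and print the
# first endpoint listed for the mirror called NAME (hypothetical helper).
# Assumes the flat layout shown above: mirror keys indented by two spaces.
mirror_endpoint() {
  awk -v name="$1" '
    # A mirror key is indented by exactly two spaces and ends with a colon.
    /^  [^ ].*:$/ {
      key = $0
      sub(/^  /, "", key)
      sub(/:$/, "", key)
      in_mirror = (key == name)
      next
    }
    # The first list item below the matching mirror key is its endpoint.
    in_mirror && /^ *- / {
      sub(/^ *- */, "")
      print
      exit
    }
  '
}
```

It could be fed from the k3d node, for example `docker exec k3d-kyma-server-0 cat /etc/rancher/k3s/registries.yaml | mirror_endpoint registry.localhost:5001`, where the container name `k3d-kyma-server-0` is an assumption based on k3d's usual naming for a cluster called kyma.
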
@jakobmoellerdev
jakobmoellerdev commented Oct 14, 2022

Thanks for the detailed issue description.

I went through this step by step, and the following things came up:

  1. When trying to download kyma while no bin directory exists, the Make command make module-build fails
  2. Building the image and running the tests takes over 1 minute. I suggest you look into your test setup again (this is a long time for controller-runtime tests)
    (github.com/kyma-project/keda-manager/operator/controllers 75.286s)
  3. (probably not happening to most) When invoking make module-image, you might receive the following error:
failed to do request: Head "https://registry.localhost:5001/v2/unsigned/operator-images/keda-operator/blobs/sha256:f7cbc5ea5f8675be2357a6b6915b69d73f9ab94941d02f90ead5ae9421eeaaa3": http: server gave HTTP response to HTTPS client

=> this is because after adding the entry to /etc/hosts, docker might be in containerd mode, which is still experimental. Make sure to disable this setting:
Under Experimental Features in Docker Desktop, make sure that Use containerd for pulling and storing images is disabled.

  4. This is your actual problem:
    Pods inside the k3d cluster reach the registry over the docker-internal network, where it listens on port 5000; 5001 is only the port exposed on the host. A lookup of registry.localhost:5001 from inside the cluster therefore fails, and the mapping has to use 5000 instead of 5001. You can verify this by running
kubectl run -i --tty clustercurl --image=alpine --restart=Never -- sh -c 'apk --update add curl && curl registry.localhost:5001/v2/_catalog'; kubectl delete pod clustercurl
kubectl run -i --tty clustercurl --image=alpine --restart=Never -- sh -c 'apk --update add curl && curl registry.localhost:5000/v2/_catalog'; kubectl delete pod clustercurl

and observing that 5000 connects while 5001 does not.

The reason is that while k3s DOES respect /etc/rancher/k3s/registries.yaml and applies it to every image pull inside the cluster, module-manager is not part of k3s and therefore does not know how to alias registry.localhost:5001. There is not much we can do here except improve the documentation or introduce aliasing in module-manager (though I think local registries are the only use case).

There is an easy fix for you right now:

After creating the ModuleTemplate, edit the ModuleTemplate with kubectl edit moduletemplate moduletemplate-keda -n kcp-system
and change the existing repository context in spec.descriptor.component:

repositoryContexts:                                                                           
- baseUrl: registry.localhost:5000/unsigned                                                   
  componentNameMapping: urlPath                                                               
  type: ociRegistry

This will result in the ModuleManager picking up the internal port instead of the external port, and it will be able to resolve the registry correctly.
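
The manual edit can also be scripted. The sketch below only touches `baseUrl:` lines and rewrites the external port 5001 to the internal port 5000; `to_internal_registry` is a hypothetical helper name, and it assumes the repository context appears on a single `baseUrl:` line as in the snippet above.

```shell
# to_internal_registry: filter YAML on stdin and point any baseUrl that uses
# the host-facing registry port (5001) at the docker-internal port (5000).
# Hypothetical helper; lines without "baseUrl:" pass through unchanged.
to_internal_registry() {
  sed '/baseUrl:/ s/registry\.localhost:5001/registry.localhost:5000/'
}
```

One possible way to apply it to the live resource, untested against this setup, is `kubectl get moduletemplate moduletemplate-keda -n kcp-system -o yaml | to_internal_registry | kubectl apply -f -`.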

Note that when I tested your walkthrough (kyma-project/keda-manager@a399b60), module-manager still did not have the permissions in local mode for creating CRDs (since in local mode it uses a service account, while in remote mode an administrative kubeconfig is expected):

chart installation failure for kcp-system/default-kyma-keda!!! : could not render existing resources from manifest: could not get information about the resource kedas.operator.kyma-project.io / : customresourcedefinitions.apiextensions.k8s.io \"kedas.operator.kyma-project.io\" is forbidden: User \"system:serviceaccount:kcp-system:module-manager-manager\" cannot get resource \"customresourcedefinitions\" in API group \"apiextensions.k8s.io\" at the cluster scope

This can be circumvented by giving the ClusterRole module-manager-manager-role additional permissions (workaround for now) with kubectl edit clusterrole module-manager-manager-role:

- apiGroups:                                                                                                                  
  - "*"                                                                                                                       
  resources:                                                                                                                  
  - "*"                                                                                                                       
  verbs:                                                                                                                      
  - "*"

This will then lead to a successful installation of keda-manager.
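
To confirm whether the service account from the error message has the permission before and after the ClusterRole change, a check along these lines may help. It is a sketch only: `sa_subject` and `can_get_crds` are hypothetical helper names, and the call relies on `kubectl auth can-i` with user impersonation via `--as`, which requires that your own user is allowed to impersonate.

```shell
# sa_subject NAMESPACE NAME: print the user name Kubernetes assigns to a
# service account, in the form seen in the error message above.
sa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# can_get_crds NAMESPACE NAME: ask the API server whether that service
# account may read CustomResourceDefinitions (prints "yes" or "no").
can_get_crds() {
  kubectl auth can-i get customresourcedefinitions.apiextensions.k8s.io \
    --as "$(sa_subject "$1" "$2")"
}
```

For the setup in this issue the call would be `can_get_crds kcp-system module-manager-manager`.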

I want to note that the default configuration of keda-manager is currently misconfigured, as it tries to install RBACs for external.metrics.k8s.io/v1beta1, which does not exist in a default Kubernetes cluster. This leads to module-manager erroring out on the reconciliation watch for this configuration. It would be great if you could double-check what the requirement is here and when you really need this CRD. It could very well be that module-manager is incorrectly watching these resources (FYI @adityabhatia It might be worth offering an automation/configuration guide to give module manager more privileges. Alternatively we could think of a reconciliation loop in lifecycle-manager that adjusts the permissions of module-manager if it parses a module in control-plane mode as it knows module-manager needs more permissions.)

Please note that this RBAC change is not suitable for production and should only be used for testing, to work around a least-privilege module-manager in local mode. In the real world, module-manager in local mode should be installed with the correct privilege set by the runtime administrator (who knows which modules will be installed on the cluster). We are still figuring out the best solution for automating this, so stay tuned until we have a better answer for the RBAC escalation. For remote management, we will work with kubeconfigs assigned to a service account which will be auto-provisioned based on the modules. (@janmedrek FYI this is a big - if not the biggest - problem scenario for RBAC Provisioning with an Operator for remote clusters)

@jakobmoellerdev jakobmoellerdev self-assigned this Oct 14, 2022
@jakobmoellerdev jakobmoellerdev added the kind/bug Categorizes issue or PR as related to a bug. label Oct 14, 2022
@jakobmoellerdev jakobmoellerdev changed the title Module manager cannot fetch module OCI images from loca registry in k3d setup Module manager cannot fetch module OCI images from local registry in k3d setup as registry aliases of k3s are not respected Oct 14, 2022
@kwiatekus
Author

Thanks @jakobmoellersap. Adjusting the template and adding extra permissions to module manager made it work.
And thanks for the additional feedback.

@m00g3n
m00g3n commented Oct 18, 2022

Hey @jakobmoellersap, many thx for your input.

  1. When trying to download kyma and there is no bin directory, the Make command make module-build errors

This was inherited (together with the fixed OS) from the sample project; we will address it here.

  1. When building the image and running the tests, they run in over 1 minute, I suggest you look into your test setup again (this is a long time for controller-runtime tests)

We have not yet had time to fine-tune the project (however, we've prepared a PR to migrate to ginkgo v2, which will be a first step)

  1. This is your actual problem ...

It sounds like it could be documented and/or added to the Makefile in the sample project.
This may become a common issue once more teams start to create modules.

Note that while I tested your walkthrough (kyma-project/keda-manager@a399b60), module-manager still does not have the permissions in local mode for creating CRDs (since in local-mode it uses service account, while in remote mode

We already have this issue open - we should discuss RBACs there.
Anyway, we plan to create an overlay in kustomize to have RBACs for testing (if needed); the current solution is temporary.

I want to note that currently there is a misconfiguration of the default configuration of keda-manager

We are aware of this keda issue; however, it's beyond the scope of this issue.

@jakobmoellerdev
A quick update on this ticket:

With this said, I would consider this issue closed. Please feel free to reopen if you face further issues with the ports.
