Application in any namespace | Synced with NO resources deployed #11638

Open
ironoa opened this issue Dec 9, 2022 · 43 comments
Labels
apps-in-any-namespace (Issues related to the "Apps in any namespace" feature introduced in 2.5) · bug/status:cannot-reproduce (Cannot reproduce issue yet) · bug (Something isn't working)

Comments

@ironoa

ironoa commented Dec 9, 2022

Describe the bug

Maybe someone can help me debug an issue I'm facing:

I have an ArgoCD instance deployed with helm, chart version 5.16.2, app version v2.5.4

I'm trying to enable the application-in-any-namespace feature... basically I added the application.namespaces: '*' config as a configs.params value.
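
For reference, a minimal sketch of what the Helm values described above might look like. It assumes the argo-helm chart renders `configs.params` into the argocd-cmd-params-cm ConfigMap; the `'*'` wildcard is the value from this report, not a recommendation.

```yaml
# values.yaml fragment for the argo-cd Helm chart (sketch; layout assumed
# from the description above, not copied from the reporter's file).
configs:
  params:
    # Namespaces, in addition to the install namespace, in which Argo CD
    # is allowed to pick up Application resources. '*' allows all of them.
    application.namespaces: '*'
```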

When I deploy an Application in a namespace which is not the one where argocd is installed, the application now gets recognized (thanks to the above mentioned config), but it becomes immediately synced without deploying anything else... I'm also not getting any errors anywhere apparently...

FYI

  • I defined a dedicated AppProject with the spec.sourceNamespaces: '*' config.
  • The very same Application deployed in the argocd (my std) namespace produces the expected result instead

Any hints? Thanks

To Reproduce

  • deploy Argocd helm chart, configure the value configs.params with application.namespaces: '*'
  • deploy an AppProject with the spec.sourceNamespaces: '*' config
  • deploy an Application (which uses the Project defined in the previous step) in a namespace which is not the Argo one
  • the Application will get recognized, but no further resources will get deployed
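
As an illustration of the third step, an Application living outside the argocd namespace could be sketched as follows. The Application and namespace names are borrowed from the logs quoted later in this thread; the project name, repository URL, and path are hypothetical placeholders, since the report does not show them.

```yaml
# Sketch only: an Application created outside the Argo CD install namespace.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: test-argo-namespace
  # Not the namespace where Argo CD itself is installed.
  namespace: test-argo-namespace
spec:
  # Hypothetical project name; the AppProject must allow this namespace
  # through spec.sourceNamespaces.
  project: my-project
  source:
    repoURL: https://example.com/org/repo.git  # hypothetical
    path: manifests                            # hypothetical
    targetRevision: HEAD
  destination:
    server: https://kubernetes.default.svc
    namespace: test-argo-namespace
```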

Expected behavior

When I deploy an Application in a namespace other than the one where argocd is installed, I expect its resources (pods, svc, ...) to be deployed. Instead, the application gets recognized (thanks to the above-mentioned config) but immediately becomes synced without deploying anything else, and I can't find any errors anywhere.

The very same Application deployed in the argocd (my std) namespace produces the expected result instead

Screenshots

image

Additional context

I opened an issue also here

ironoa added the bug label Dec 9, 2022
@sommerit

sommerit commented Dec 19, 2022

@ironoa
Have you tried deploying with

server:
  extraArgs:
    - --application-namespaces="*"

and

controller:
  extraArgs:
    - --application-namespaces="*"

in the Helm chart?

@crenshaw-dev
Collaborator

application.namespaces should be picked up by both the server and the controller. Did you restart the components after applying that config?

Aside: try to very, very quickly narrow the *s, especially in the AppProject, to something more specific once things are up and running. 🙂

@ironoa
Author

ironoa commented Dec 21, 2022

> application.namespaces should be picked up by both the server and the controller. Did you restart the components after applying that config?

indeed, application.namespaces parameter is taken by both the controller and the server

In particular, so far:

  • I verified that both the server and the controller pods have the ARGOCD_APPLICATION_NAMESPACES environment variable set (I exec'd into the pods and echoed it)
  • I set application.namespaces to *
  • I set spec.sourceNamespaces to * among other fields in the AppProject:
    • image
  • I tried to set application.namespaces and spec.sourceNamespaces to a specific namespace (another test) rather than *
  • I restarted everything multiple times
  • I tried also to edit the argocd-server cluster role to make it similar to the namespaced argocd-server role (maybe I have to do something smarter here?)

The outcome is always:

  • the App is detected, whether I create it via app-of-apps (root in argocd, app in test-namespace) or deploy the app manually in test-namespace

  • the manifest is detected

    • image
  • everything looks healthy, nothing gets deployed

    • image
  • same app, deployed in argocd namespace works as expected

    • image
  • PS: if the ARGOCD_APPLICATION_NAMESPACES env vars are not properly set, the app doesn't even get detected (tested, and as expected)

The biggest problem is that I haven't been able to spot an error message anywhere so far.
=> Any ideas on how to debug this?

@Alegrowin

Was able to make v2.5.5 work by adding the application.namespaces to the argocd-cmd-params-cm.

I had no other choice than to kill the pods to make them reload the config.
Might look into rolling in the https://github.com/stakater/Reloader soon.

Hope that helps!

@ironoa
Author

ironoa commented Dec 21, 2022

> @ironoa Have you tried deploying with
>
> server:
>   extraArgs:
>     - --application-namespaces="*"
>
> and
>
> controller:
>   extraArgs:
>     - --application-namespaces="*"
>
> in the Helm chart?

just to be meticulous, tried also that: not working

@Alegrowin

Using kustomize, it would look something like this:

---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

bases:
  - github.com/argoproj/argo-cd//manifests/ha/cluster-install?ref=v2.5.5

configMapGenerator:
  - behavior: merge
    literals:
      - server.insecure="true"
      - application.namespaces="*"
    name: argocd-cmd-params-cm
---
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: test
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  clusterResourceWhitelist:
    - group: '*'
      kind: '*'
  destinations:
    - namespace: 'test-dev'
      server: 'https://kubernetes.default.svc'
    - namespace: 'test-uat'
      server: 'https://kubernetes.default.svc'
    - namespace: 'test-prd'
      server: 'https://kubernetes.default.svc'
  sourceRepos:
    - '*'
  sourceNamespaces:
  - test-*

@Alegrowin

@ironoa did you remove your pods after changing the configmap?

If not, try this

kubectl scale statefulset/argocd-application-controller --replicas=0  -n argocd 
kubectl scale deployment/argocd-server --replicas=0  -n argocd 

kubectl scale statefulset/argocd-application-controller --replicas=1  -n argocd 
kubectl scale deployment/argocd-server --replicas=1  -n argocd 

@ironoa
Author

ironoa commented Dec 21, 2022

> Was able to make v2.5.5 work by adding the application.namespaces to the argocd-cmd-params-cm.
>
> I had no other choice than to kill the pods to make them reload the config. Might look into rolling in the https://github.com/stakater/Reloader soon.
>
> Hope that helps!

not sure what you mean, I just restarted every pod with kubectl delete --all pods in the argocd namespace, no luck

image

btw thanks for the support, I really don't know how to debug this (yet)

@Alegrowin

Can you run
kubectl get configmap argocd-cmd-params-cm -n argocd -o yaml

and validate it contains

data:
  application.namespaces: '*'

@ironoa
Author

ironoa commented Dec 21, 2022

> @ironoa did you remove your pods after changing the configmap?
>
> If not, try this
>
> kubectl scale statefulset/argocd-application-controller --replicas=0 -n argocd
> kubectl scale deployment/argocd-server --replicas=0 -n argocd
>
> kubectl scale statefulset/argocd-application-controller --replicas=1 -n argocd
> kubectl scale deployment/argocd-server --replicas=1 -n argocd

I tried that too, just to be meticulous, but as I already said here, one of the first things I did was to exec into the pods and verify by echoing that the ARGOCD_APPLICATION_NAMESPACES env variable is set for both the server and the controller

@Alegrowin

Maybe upgrading to Helm chart 5.16.9, which uses v2.5.5, might help?

@ironoa
Author

ironoa commented Dec 21, 2022

> Can you run kubectl get configmap argocd-cmd-params-cm -n argocd -o yaml
>
> and validate it contains
>
> data:
>   application.namespaces: '*'

kubectl get configmap argocd-cmd-params-cm -n argocd -o yaml | grep application.namespaces

image

@ironoa
Author

ironoa commented Dec 21, 2022

> Maybe upgrading to Helm chart 5.16.9, which uses v2.5.5, might help?

I'm there already, both with the chart and the argo version

image

@ironoa
Author

ironoa commented Dec 21, 2022

@Alegrowin apart from this scenario, what are possible causes for an app being present and considered healthy, but with nothing deployed underneath?

So far it happened to me when:

  • the custom values of a plugin were misconfigured (wrong indentation, etc.) => but that led to the manifest being empty and the repo server sidecars issuing errors... FYI, weirdly enough, from the UI the app looked healthy (with nothing deployed, or warning you that syncing would delete everything)

In this case the manifest is there and looks as it should, and indeed the very same app deployed in the argocd namespace works

@ironoa
Author

ironoa commented Dec 22, 2022

@Alegrowin I've just spotted these 2 errors:

"time=\"2022-12-21T22:29:47Z\" level=error msg=\"Unable to create audit event: events is forbidden: User \\\"system:serviceaccount:argocd:argocd-server\\\" cannot create resource \\\"events\\\" in API group \\\"\\\" in the namespace \\\"test-argo-namespace\\\"\" application=test-argo-namespace dest-namespace=test-argo-namespace dest-server=\"https://kubernetes.default.svc\" reason=ResourceDeleted type=Normal\n"
"time=\"2022-12-21T22:29:47Z\" level=error msg=\"finished streaming call with code Unknown\" error=\"error getting app resource tree: cache: key is missing\" grpc.code=Unknown grpc.method=WatchResourceTree grpc.service=application.ApplicationService grpc.start_time=\"2022-12-21T22:29:35Z\" grpc.time_ms=12039.218 span.kind=server system=grpc\n"

both coming from the server component... nothing else other than that matches level=error
PS test-argo-namespace is exactly the namespace I'm using to test/debug this issue (referred in the first error log)

@Alegrowin

@ironoa I was able to reproduce; here's something that might help

Using kustomize, I basically patched the argocd-server ClusterRole resource to look like the argocd-server Role, since it is now managing resources in other namespaces as well.

Since you are using Helm, the template will need to change (feature flag maybe?) so that sa/argocd-server can update resources in other namespaces.

This ClusterRole needs to look like this Role: https://artifacthub.io/packages/helm/argo/argo-cd?modal=template&template=argocd-server/role.yaml
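
A rough sketch of such a ClusterRole, assuming it should mirror the rules of the namespaced argocd-server Role as they are listed in the `kubectl describe` output further down this thread. This is an illustration, not the chart's actual template: applied as-is it would replace the ClusterRole's rules, so any rules the stock ClusterRole already carries would have to be kept alongside these.

```yaml
# Sketch only: cluster-wide equivalents of the namespaced argocd-server Role
# rules. Note that this grants access to secrets and configmaps in every
# namespace, which is broad; narrow it where possible.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: argocd-server
rules:
  - apiGroups: ["argoproj.io"]
    resources: ["applications", "applicationsets", "appprojects"]
    verbs: ["create", "get", "list", "watch", "update", "delete", "patch"]
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]
  # Matches the "cannot create resource events" error logged by argocd-server.
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "list"]
```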

@Alegrowin

And also this in case you are using kustomize.

@crenshaw-dev

@ironoa
Author

ironoa commented Dec 23, 2022

> @ironoa I was able to reproduce; here's something that might help
>
> Using kustomize, I basically patched the argocd-server ClusterRole resource to look like the argocd-server Role, since it is now managing resources in other namespaces as well.

FYI If you are right then this commit/pr has never been enough and needs to be extended...

Still something is missing... I also tried to restart all the pods and recreate the application but it doesn't work... the Application looks healthy but it doesn't deploy anything...

LAST SEEN   TYPE     REASON            OBJECT                             MESSAGE
59m         Normal   ResourceUpdated   application/test-argo-namespace    Updated sync status:  -> Synced
59m         Normal   ResourceUpdated   application/test-argo-namespace    Updated health status:  -> Healthy

Here are my role and (modified) cluster role descriptions:

#role
PolicyRule:
  Resources                    Non-Resource URLs  Resource Names  Verbs
  ---------                    -----------------  --------------  -----
  applications.argoproj.io     []                 []              [create get list watch update delete patch]
  applicationsets.argoproj.io  []                 []              [create get list watch update delete patch]
  appprojects.argoproj.io      []                 []              [create get list watch update delete patch]
  configmaps                   []                 []              [create get list watch update patch delete]
  secrets                      []                 []              [create get list watch update patch delete]
  events                       []                 []              [create list]
#cluster role
PolicyRule:
  Resources                    Non-Resource URLs  Resource Names  Verbs
  ---------                    -----------------  --------------  -----
  applications.argoproj.io     []                 []              [create get list watch update delete patch]
  applicationsets.argoproj.io  []                 []              [create get list watch update delete patch]
  appprojects.argoproj.io      []                 []              [create get list watch update delete patch]
  configmaps                   []                 []              [create get list watch update patch delete]
  secrets                      []                 []              [create get list watch update patch delete]
  events                       []                 []              [create list]

am I missing something ?

@ironoa
Author

ironoa commented Dec 23, 2022

"time=\"2022-12-21T22:29:47Z\" level=error msg=\"finished streaming call with code Unknown\" error=\"error getting app resource tree: cache: key is missing\" grpc.code=Unknown grpc.method=WatchResourceTree grpc.service=application.ApplicationService grpc.start_time=\"2022-12-21T22:29:35Z\" grpc.time_ms=12039.218 span.kind=server system=grpc\n"

=> https://github.com/argoproj/argo-cd/blob/master/server/application/application.go#L1383

@lusu007
Contributor

lusu007 commented Dec 28, 2022

Something similar happens in my Argo-CD setup. I'm using AVP to replace placeholders in Secrets with values from Vault.
AVP has a discover command to find Helm charts in which it could replace placeholders. When a Helm chart is detected in an ArgoCD Application, ArgoCD syncs the application with no resources.
image

@jannfis
Member

jannfis commented Dec 28, 2022

Hi folks, sorry for chiming in so late.

Have you set the resource tracking method to annotation or annotation+label? I believe what you see is the following:

  • Two applications with a similar name, but deployed to different namespaces
  • One of them (in argocd namespace) is synced and has resources
  • The other (in different namespace) sees the resources from the other application (by label tracking), but those are not permitted in the app's AppProject and thus not displayed

This has not been properly documented in 2.5, but will be in 2.6:

https://argo-cd.readthedocs.io/en/latest/operator-manual/app-any-namespace/#switch-resource-tracking-method
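
For anyone looking for where that setting lives: a minimal sketch, assuming Argo CD is installed in the argocd namespace and the tracking method is set through the `application.resourceTrackingMethod` key of the argocd-cm ConfigMap.

```yaml
# Sketch: switch resource tracking from the default label method to
# annotation-based tracking (annotation+label is the other non-default option).
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd  # assumed install namespace
  labels:
    app.kubernetes.io/part-of: argocd
data:
  application.resourceTrackingMethod: annotation
```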

@lusu007
Contributor

lusu007 commented Dec 28, 2022

Hey @jannfis,
thank you for your response. I had not changed the setting until now. But after changing it to annotation+label I, unfortunately, see no difference in ArgoCD's behavior.

I set CreateNamespace=true in my setup and the namespaces are also not created.
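
For context, the option mentioned here is normally set as a sync option on the Application. A fragment along these lines is assumed; the rest of the Application spec is omitted because it is not shown in this comment.

```yaml
# Fragment of an Application manifest (sketch): ask Argo CD to create the
# destination namespace during sync if it does not exist yet.
spec:
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
```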

@ironoa
Author

ironoa commented Dec 30, 2022

Hi @jannfis, thanks for your answer.

To clarify:

> • Two applications with a similar name, but deployed to different namespaces

This assumption does not match my case: I have just one Application, with a unique name, deployed in a unique namespace (tested also with a namespace named "t", very short). Anyway, following your advice, I have so far carried out multiple tests with the resource tracking method set to annotation, annotation+label, and label (default). Unfortunately that was not the solution.

FYI I've been trying to be very meticulous here. What I've done as a complementary test (only one app at a time) was to take the very same Application and deploy it in the argocd namespace, to assess that the problem is not the Application definition but just the fact that it is deployed in a custom namespace. Full test description here

Works
image
Doesn't work: detected, synced and healthy with no resources deployed
image

@ironoa
Author

ironoa commented Dec 30, 2022

> Something similar happens in my Argo-CD setup. I'm using AVP to replace placeholders in Secrets with values from Vault. AVP has a discover command to find Helm charts in which it could replace placeholders. When a Helm chart is detected in an ArgoCD Application, ArgoCD syncs the application with no resources.
> image

Hey @lusu007, not sure this is related, although the outcome is very similar. Are you actually deploying the Application in a namespace which is not the default one (i.e. argocd)?

I'm using AVP as well and I faced the same issue you mentioned (again, your issue is probably not related to this one, if I'm right). To DEBUG: if you are using the sidecar approach, try checking the logs of one of the sidecars; you might find an error there. If you are using the configmap approach, try checking the logs of the repo-server pod.
FYI I had errors in there (wrong indentation in the plugin inline values, wrong vault setup, ...), and the outcome was indeed a healthy and synced status (error handling with plugins could be improved imho). I actually mentioned that here above

Just to clarify, for this Application I'm obviously not using any plugin, I'm focused only on the application-in-any-namespace feature. The main problem here is that I don't see any errors anywhere so I cannot really debug the issue

@lusu007
Contributor

lusu007 commented Jan 3, 2023

Hey @ironoa, thank you for your detailed answers.

I tried to debug AVP further with a colleague.
We found out that Helm has read-only permissions to the cache folder. After moving the cache folders with the Helm env variables (HELM_CACHE_HOME, HELM_CONFIG_HOME, HELM_DATA_HOME) to /tmp/helm/... the permissions issue was fixed.
After that, all that was left to do was to either add the repositories in the init script or run helm dep update instead of helm dep build.

Now everything works. Thanks a lot for the detailed help!

A working AVP config:

apiVersion: argoproj.io/v1alpha1
kind: ConfigManagementPlugin
metadata:
  name: argocd-vault-plugin-helm
spec:
  # https://argocd-vault-plugin.readthedocs.io/en/stable/usage/#with-helm
  allowConcurrency: true
  discover:
    find:
      command:
        - sh
        - "-c"
        - "find . -name 'Chart.yaml' && find . -name 'values.yaml'"
  init:
    command:
      - sh
      - "-c"
      - "helm dependency update"
  generate:
    command:
      - sh
      - "-c"
      - "helm template $ARGOCD_APP_NAME --include-crds . | argocd-vault-plugin generate -"
  lockRepo: false

@ironoa
Author

ironoa commented Jan 30, 2023

version v2.5.9, still facing the issue...

@jannfis
Member

jannfis commented Jan 31, 2023

Sorry for chiming in again a little late.

Just a question based on the comments I've read so far: do the affected apps all have manifests generated by a plugin?

@jannfis
Member

jannfis commented Jan 31, 2023

> Just to clarify, for this Application I'm obviously not using any plugin, I'm focused only on the #9755 feature. The main problem here is that I don't see any errors anywhere so I cannot really debug the issue

OK, so probably not :)

@jannfis
Member

jannfis commented Jan 31, 2023

@ironoa Referring to #11638 (comment), are there really no resources deployed to the cluster, or are they just not visible from the UI?

@jannfis
Member

jannfis commented Jan 31, 2023

Also, what's weird about the two screenshots is that the one that is not working lacks the sync status field, i.e.:

image

vs

image

@ironoa
Author

ironoa commented Jan 31, 2023

> @ironoa Referring to #11638 (comment), are there really no resources deployed to the cluster, or are they just not visible from the UI?

No resources deployed to the cluster, just the Application resource in the target namespace

@rshiva777

Hello All,

I faced a similar issue too: after creating an application (Helm chart) in argocd, none of the resources were displayed. The cause was a wrongly formatted values.yaml file.

Fixing the syntax of values.yaml resolved my issue.

@entanglesoftware

entanglesoftware commented Feb 8, 2023

image

Hi @crenshaw-dev,

I am facing the same issue. Please help us out or guide us so that we can debug and raise a PR to improve the product. Let me know if you need more details. Unfortunately, We are stuck here. Many thanks in advance!!

@jannfis
Member

jannfis commented Feb 9, 2023

I'm having a hard time reproducing this issue. For me, it works as expected.

Have all of you facing this problem installed Argo CD through the Helm chart?

@entanglesoftware

Hi @jannfis, I am not sure about the others, but I am using it. We are using the Helm chart to deploy the resources. We follow these steps:

argo-install:
	helm repo add argocd https://argoproj.github.io/argo-helm
	helm repo update
	kubectl apply -f ${ARGO_REPO}/namespace.yaml
	helm upgrade --install argocd argocd/argo-cd --namespace argocd \
	--values ${ARGO_REPO}/install.yaml

@entanglesoftware

Could Istio be the reason for it? I am using a service mesh because I need to connect with ingress.

@entanglesoftware

Hi @jannfis ,

I tried with the kubectl installation too. But it's still not deploying.

@entanglesoftware

Thanks @jannfis , I got it working, the issue was in helm chart.

@armantoko

what was the issue on the helm chart?
I am facing the same issue trying to deploy lightdash
would help a lot

@entanglesoftware

Well, first try to deploy a popular Helm chart, like the Bitnami one for Apache server; if that works, you have confirmed that the issue is your Helm chart. In my case the Helm chart itself wasn't the problem either; I just hadn't included all the files needed for deployment.

jannfis added the apps-in-any-namespace and bug/status:cannot-reproduce labels Apr 4, 2023
@jdoylei

jdoylei commented Apr 14, 2023

@ironoa - Hi! I just recently spent time testing apps-in-any-namespace in our Argo CD instance, specifically to enable secure app-of-apps. It's working for us with Argo CD v.2.5.16 (the multi-tenant HA namespace install). I had a couple suggestions. At one point, we had a similar situation in the GUI (synced Application appears with an empty tree, no child resources), but I think not for the same reason. You mentioned that your child resources are not actually being deployed to the cluster, but in our case the child resource (another Application) was deployed to the cluster, it's just that Argo CD was filtering it out when syncing the cluster state.

  • Do you have any resource.inclusions or resource.exclusions set up in argocd-cm? We were whitelisting resources and needed to make sure Application was included for the local cluster, and we had a tricky issue where "https://kubernetes.default.svc" did not work for this, it would only match if we used the API server IP address.
  • It sounds like you revisited the argocd-server role a couple times. We had to ensure that Application and Event access was granted, to both argocd-server and argocd-application-controller, including view access at the cluster level. If it's possible in your environment, maybe you could try temporarily binding argocd-server and argocd-application-controller to "cluster-admin", to eliminate any possibility of RBAC errors.
  • What does the cluster setting for in-cluster/https://kubernetes.default.svc look like? Is the namespace field "All namespaces" or specific namespaces? (E.g. if you look at argocd cluster get in-cluster.)
  • Similar to above (argocd cluster get in-cluster), what status does it give you for the cluster? Is the connection Successful? What are the counts of applications, APIs, and resources? When our resource.inclusions/exclusions settings were messing up the cluster sync, we wouldn't get the correct counts here.
  • We had trouble at times getting things to appear or be included by virtue of the cluster name or URL matching in different places. Are you currently using a wildcard * in your Project's destination and "server: https://kubernetes.default.svc" in your Application's destination? Is there any change if you make them exactly the same?
  • What does the CLI report for your Application? Can you see any child resources running argocd app get test-argo-namespace? Or running argocd app resources test-argo-namespace? When we were having trouble with cluster syncing, we would get a list of child resources from "app get" but an empty list from "app resources".
  • Can you see any sync warnings or errors from kubectl? E.g. kubectl get Application -A -o jsonpath='{range .items[*]}{.metadata.namespace} {.metadata.name}{"\n"}{.status.conditions}{"\n"}{"\n"}{end}'
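
To make the first point concrete, a sketch of a `resource.inclusions` entry in argocd-cm that whitelists Application for the local cluster. The install namespace is an assumption, and a real whitelist would contain every other resource kind Argo CD should manage as well, since anything not listed is ignored. As noted above, in our environment the cluster URL shown here did not match and the API server IP address was needed instead.

```yaml
# Sketch only: include Application resources for the in-cluster destination.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd  # assumed install namespace
data:
  resource.inclusions: |
    - apiGroups:
        - argoproj.io
      kinds:
        - Application
      clusters:
        # May have to be the API server IP address instead, see above.
        - https://kubernetes.default.svc
```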

Hope this helps!

@bertrandmartel

@Alegrowin Thanks a lot, I had a similar issue with the following error when attempting to create an app on another namespace:

FATA[0001] rpc error: code = PermissionDenied desc = error creating application: applications.argoproj.io is forbidden: User "system:serviceaccount:argocd:argocd-server" cannot create resource "applications" in API group "argoproj.io" in the namespace "****" 

I've edited ClusterRole using:

kubectl edit ClusterRole argocd-server -n argocd

and replaced the array item where apiGroups contain argoproj.io with:

- apiGroups:
  - argoproj.io
  resources:
  - applications
  - applicationsets
  - appprojects
  verbs:
  - create
  - get
  - list
  - watch
  - update
  - delete
  - patch

@ervinb

ervinb commented Jun 16, 2023

Same thing happened to me when using a CMP (running helm template) with a chart which is using Helm's lookup function.
