
Cannot sync applications - clusterrole insufficient permissions #385

Closed
castleadmin opened this issue Jul 31, 2021 · 6 comments
Labels
documentation (Improvements or additions to documentation), fixed-in-latest (Fix is already made and waiting for new release)

Comments

@castleadmin

castleadmin commented Jul 31, 2021

Describe the bug
Syncing applications doesn't work because of insufficient permissions.
The argocd operator and the argocd resource have been created inside the argocd namespace.

The argocd operator provides the following cluster roles:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: argocd-argocd-argocd-application-controller
  ...
rules:
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - '*'
    resources:
      - '*'
  - verbs:
      - get
      - list
    nonResourceURLs:
      - '*'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: argocd-argocd-argocd-server
  ...
rules:
  - verbs:
      - get
    apiGroups:
      - '*'
    resources:
      - '*'
  - verbs:
      - list
    apiGroups:
      - ''
    resources:
      - events
  - verbs:
      - get
    apiGroups:
      - ''
    resources:
      - pods
      - pods/log

The argocd-argocd-argocd-application-controller clusterrole only grants the verbs get, list, and watch on all resources. Therefore, the controller isn't able to create resources for the argocd applications during synchronization.
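
One way to confirm this from the command line (the service account name below is an assumption for an ArgoCD instance named argocd in the argocd namespace; verify the real name with kubectl get sa -n argocd):

# Ask the API server whether the controller's service account may create
# resources outside its own namespace; with the read-only clusterrole above
# this should print "no".
kubectl auth can-i create deployments \
  --as=system:serviceaccount:argocd:argocd-argocd-application-controller \
  -n default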

The clusterroles created by the argocd operator differ from the clusterroles defined in the install.yaml of argocd.

argocd install.yaml clusterroles

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: application-controller
    app.kubernetes.io/name: argocd-application-controller
    app.kubernetes.io/part-of: argocd
  name: argocd-application-controller
rules:
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - '*'
- nonResourceURLs:
  - '*'
  verbs:
  - '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: server
    app.kubernetes.io/name: argocd-server
    app.kubernetes.io/part-of: argocd
  name: argocd-server
rules:
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - delete
  - get
  - patch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - list
- apiGroups:
  - ""
  resources:
  - pods
  - pods/log
  verbs:
  - get

To Reproduce
Steps to reproduce the behavior:

  1. install the argocd-operator in the argocd namespace
  2. create the argocd resource inside of the argocd namespace
  3. using the argocd UI, create an application that is deployed to another namespace
  4. try to sync the application

Expected behavior
argocd should be able to create resources in other namespaces

@shubhamagarwal19
Contributor

Hey @seed-248, thanks for reporting this!
The current implementation allows users to deploy ArgoCD instances with minimal permissions. By default, an ArgoCD instance may only manage resources in its own namespace.
If a user wishes to give an ArgoCD instance permission to manage resources in all namespaces, they need to set an environment variable on the argocd-operator deployment, ARGOCD_CLUSTER_CONFIG_NAMESPACES, whose value is the list of namespaces in which those ArgoCD instances are installed.
For example, if the ArgoCD instance is installed in the namespace argocd:

  env:
  - name: ARGOCD_CLUSTER_CONFIG_NAMESPACES
    value: argocd

This would give the argocd instance deployed in the namespace argocd permission to manage resources in all namespaces.
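
If the operator was installed without OLM, a minimal sketch for setting the variable on a running operator (assuming the deployment is named argocd-operator in the argocd namespace; adjust names to your installation):

# Patch the env var into the operator deployment; the operator should then
# regenerate the cluster roles with full permissions for the listed namespaces.
kubectl -n argocd set env deployment/argocd-operator \
  ARGOCD_CLUSTER_CONFIG_NAMESPACES=argocd

For OLM-managed installations the variable should instead be set on the Subscription, as shown in the next comment.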

shubhamagarwal19 added the documentation label on Aug 2, 2021
@castleadmin
Author

Hey @shubhamagarwal19,
thank you for pointing out that the permissions can be influenced via the ARGOCD_CLUSTER_CONFIG_NAMESPACES environment variable.

I tried to get it running by setting the env variable in the OLM subscription, but I haven't been successful so far:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: argocd
  namespace: argocd
spec:
  channel: alpha
  name: argocd-operator
  source: operatorhubio-catalog
  sourceNamespace: olm
  config:
    env:
    - name: ARGOCD_CLUSTER_CONFIG_NAMESPACES
      value: argocd

The environment variable is available inside the argocd-operator pod.

[argocd-operator@argocd-operator-84fd6bd794-jlrk7 /]$ echo $ARGOCD_CLUSTER_CONFIG_NAMESPACES
argocd

However, the cluster role for the argocd-application-controller still looks the same:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: argocd-argocd-argocd-application-controller
  ...
rules:
  - verbs:
      - get
      - list
      - watch
    apiGroups:
      - '*'
    resources:
      - '*'
  - verbs:
      - get
      - list
    nonResourceURLs:
      - '*'

@shubhamagarwal19
Contributor

@seed-248, ahhh my bad!! The issue has been fixed, but the fix didn't make the cut for the v0.0.15 release.
We are planning to release a new version of the argocd-operator soon, so hopefully that will resolve this!

shubhamagarwal19 added the fixed-in-latest label on Aug 3, 2021
@RoboticHuman

RoboticHuman commented Nov 2, 2021

@shubhamagarwal19, is there a temporary workaround at the moment to allow this use case, even if it requires manual intervention?

Edit: never mind, I was able to solve the problem temporarily by assigning an extra clusterrolebinding with the required permissions.
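
For reference, a sketch of such a stop-gap binding (the service account name is an assumption for an ArgoCD CR named argocd in the argocd namespace; check kubectl get sa -n argocd for the real name):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  # Temporary workaround until the fixed operator release is available.
  name: argocd-application-controller-cluster-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin  # or a narrower custom ClusterRole with the verbs you need
subjects:
- kind: ServiceAccount
  name: argocd-argocd-application-controller  # assumed name; verify first
  namespace: argocd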

mbaldessari added a commit to mbaldessari/common that referenced this issue Dec 14, 2021
Make sure that argo can manage cluster-wide resources from the relevant namespaces

Otherwise we can error out with:

  Cluster level ClusterRoleBinding vault-server-binding cannot be managed
  when in namespaced mode

By setting ARGOCD_CLUSTER_CONFIG_NAMESPACES to the namespaces where the
argocd instance is installed, we allow it to create cluster-wide
resources in all namespaces.

Tested this and now I am correctly able to invoke the vault helm chart
from argo without the above error.

References:
- argoproj/argo-cd#5886
- argoproj-labs/argocd-operator#385
@iam-veeramalla
Collaborator

Hi @seed-248, you can now make use of the ARGOCD_CLUSTER_CONFIG_NAMESPACES env var. Please upgrade to the latest version.
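
After upgrading, the effect can be verified by inspecting the application controller's clusterrole (name taken from the snippets earlier in this thread), which should now contain more than the read-only get/list/watch verbs:

kubectl get clusterrole argocd-argocd-argocd-application-controller -o yaml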

claudiol added a commit to hybrid-cloud-patterns/example that referenced this issue Mar 1, 2022
(Squashed commit message omitted; among many unrelated pattern changes it includes the same "Make sure that argo can manage cluster-wide resources from the relevant namespaces" change as the commit above, setting ARGOCD_CLUSTER_CONFIG_NAMESPACES and referencing argoproj/argo-cd#5886 and this issue.)
mbaldessari added a commit to validatedpatterns/medical-diagnosis that referenced this issue Mar 22, 2022
(Squashed commit message omitted; it largely duplicates the commit message above, including the ARGOCD_CLUSTER_CONFIG_NAMESPACES change that references this issue.)
@keskad

keskad commented Apr 28, 2022

The official documentation lacks information about the ARGOCD_CLUSTER_CONFIG_NAMESPACES parameter - could it be added to the documentation? Is this parameter officially supported, and can we rely on it not being removed in the future without notice?
