Common update #99
Merged
mbaldessari merged 138 commits into validatedpatterns:main on Jul 20, 2023
Conversation
…annotations:
labels:
  openshift.io/node-selector: ""
annotations:
  openshift.io/cluster-monitoring: "true"
When running the `pattern.sh` script multiple times, many exited podman containers are left behind on the machine. Adding the `--rm` parameter to `podman run` makes podman delete each container automatically once it exits, leaving the machine cleaner.
Avoid proliferation of exited containers when using the pattern.sh script
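A minimal sketch of the kind of invocation the script performs, with the new flag added; the image name and mount are illustrative, not taken from this PR:

# With --rm the container is removed as soon as it exits,
# so repeated runs no longer accumulate entries in `podman ps -a`
podman run -it --rm \
  -v "${HOME}:/pattern-home" \
  quay.io/hybridcloudpatterns/utility-container \
  make install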
Generating the ICSP and allowing insecure registries is best done prior to the helm upgrade, and requires VPN access to registry-proxy.engineering.redhat.com.
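A hedged sketch of such an ImageContentSourcePolicy; the resource name and the mirror/source mapping are illustrative, not taken from this PR:

apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: brew-registry        # hypothetical name
spec:
  repositoryDigestMirrors:
    - mirrors:
        - registry-proxy.engineering.redhat.com/rh-osbs
      source: registry.redhat.io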
WIP: Add labels and annotations support for namespaces (RFE)
This is especially useful when multiple people are working on a pattern
and have been using different names:
$ make help |grep Pattern:
Pattern: multicloud-gitops
$ make NAME=foobar help |grep Pattern:
Pattern: foobar
Allow overriding the pattern's name
vault helm 0.24.1
After merging validatedpatterns/patterns-operator@235b303 it is now possible to pick a different catalogSource for the gitops operator. This is needed to allow CI to install the gitops operator from an IIB.
This variable can be set to pass additional helm arguments from the command line, i.e. we can set things without having to tweak values files. It is now possible to run something like the following:

./pattern.sh make install \
  EXTRA_HELM_OPTS="--set main.gitops.operatorSource=iib-49232"
This reverts commit ab5532a.
This reverts commit 2797699.
ArgoCD will retry syncing an application 5 times by default in case of errors and then give up. So if an application contains a reference to a CRD that has not been installed yet (say, because it will be installed by another application), it will error out and retry later. By default this happens a maximum of 5 times [1]; after that the application gives up, stays in Degraded mode and eventually moves to Failed. In this case a manual sync will usually fix the application just fine (i.e. as long as the missing CRD has been installed in the meantime).

To solve this we could add complex preSync Jobs that wait for the needed resources, but that fundamentally breaks the simplicity of things and introduces unneeded dependencies. In this change we just increase the default retry limit to something larger (20) that should cover most cases. The retry limit functionality is currently rather undocumented, but it is defined at [2] and also shown at [3].

In our patterns' case the concrete issue happened as follows:
1. ESO ClusterSecrets were often not synced / degraded
2. We introduced a Job in a preSync hook for the ESO chart that would wait for vault to be ready before applying the rest of ESO
3. MCG started failing because the config-demo app had already tried to sync 5 times and failed every time, since the ESO CRDs were not installed yet (due to ESO waiting on vault)

So instead of adding yet another job, let's just retry a lot more often. We picked 20 as a sane default because that should have Argo retry for about 60 minutes (3 minutes is the default maximum backoff). Tested with two MCG installations (with the ESO Job hook included) and both worked out of the box, whereas before I got three failures out of three installs.

[1] https://github.com/argoproj/argo-cd/blob/master/controller/appcontroller.go#L1680
[2] https://github.com/argoproj/argo-cd/blob/master/manifests/crds/application-crd.yaml#L1476
[3] https://github.com/argoproj/argo-cd/blob/master/docs/operator-manual/application.yaml#L202C18-L202C100
Revert ESO/Vault Job and add a higher default number of retries
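A minimal sketch of what the raised retry limit looks like in an Argo CD Application spec; the backoff values shown are Argo CD's documented defaults, not settings introduced here:

spec:
  syncPolicy:
    automated: {}
    retry:
      limit: 20            # default is 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m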
We can only split out bits of yaml that reference $.* variables. This is because the snippets in _helpers.tpl are passed a single context (either $ or .) and cannot use both the way the top-level templates can.
I only removed the variables that are defined identically in the ApplicationSet and in the helper, leaving the other ones as-is: their purpose is not entirely clear to me and I do not want to risk breaking things.
When HOME is set we replace it with '~' in this debug message, because
when run from inside the container HOME is /pattern-home, which is
confusing for users. Printing '~' at the start of the string is less
confusing.
Before:
ok: [localhost] => {
"msg": "/home/michele/.config/hybrid-cloud-patterns/values-secret-multicloud-gitops.yaml"
}
After:
ok: [localhost] => {
"msg": "~/.config/hybrid-cloud-patterns/values-secret-multicloud-gitops.yaml"
}
Tweak the load secret debug message to be clearer
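A hedged sketch of a debug task doing this substitution; the task and the variable name values_secret_path are hypothetical, only the HOME-to-'~' replacement reflects the change described above:

- name: Print the resolved values-secret path   # hypothetical task
  ansible.builtin.debug:
    msg: "{{ values_secret_path | regex_replace('^' ~ lookup('env', 'HOME'), '~') }}"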
If it is somewhere under /tmp or outside the HOME folder, bail out explaining why. This has caused a few silly situations where the user would save the KUBECONFIG file under /tmp. Since bind-mounting /tmp seems like the wrong thing to do in general, we at least bail out with a clear error message. To do this we rely on bash functionality, so let's just switch the script to bash.

Tested as follows:

export KUBECONFIG=/tmp/kubeconfig
./scripts/pattern-util.sh make help
/tmp/kubeconfig is pointing outside of the HOME folder, this will make it unavailable from the container. Please move it somewhere inside your /home/michele folder, as that is what gets bind-mounted inside the container

export KUBECONFIG=~/kubeconfig
./scripts/pattern-util.sh make help
Pattern: common
Usage: make <target>
...
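A minimal sketch of such a check, assuming bash pattern matching and reusing the error wording above; the actual logic in pattern-util.sh may differ:

# Refuse to continue if KUBECONFIG points outside of the user's HOME
if [ -n "${KUBECONFIG}" ] && [[ "${KUBECONFIG}" != ${HOME}/* ]]; then
    echo "${KUBECONFIG} is pointing outside of the HOME folder, this will make it unavailable from the container."
    echo "Please move it somewhere inside your ${HOME} folder, as that is what gets bind-mounted inside the container."
    exit 1
fi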
Provision sno
Sanely handle cluster pools with no clusters (yet)
We currently have a small inconsistency: we use common/clustergroup to point Argo CD to this chart, but the name inside the chart is 'pattern-clustergroup'. This is irrelevant today, but it becomes problematic in the future when migrating to helm charts inside proper helm repos. So let's fix the name to match the folder. Tested successfully on MCG.
Clustergroup Chart.yaml name change
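A minimal sketch of the relevant Chart.yaml change; other fields are omitted:

apiVersion: v2
name: clustergroup   # was 'pattern-clustergroup'; now matches the common/clustergroup folder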
Check if the KUBECONFIG file is inside /tmp
Currently with the following values snippet:
managedClusterGroups:
  exampleRegion:
    name: group-one
    acmlabels:
      - name: clusterGroup
        value: group-one
    helmOverrides:
      - name: clusterGroup.isHubCluster
        value: false
    clusterPools:
      exampleAWSPool:
        size: 1
        name: aws-ap-bandini
        openshiftVersion: 4.12.24
        baseDomain: blueprints.rhecoeng.com
        controlPlane:
          count: 1
          platform:
            aws:
              type: m5.2xlarge
        workers:
          count: 0
        platform:
          aws:
            region: ap-southeast-2
        clusters:
          - One
You will get a clusterClaim that points to the wrong pool:

NAMESPACE                 NAME            POOL
open-cluster-management   one-group-one   aws-ap-bandini

This is wrong because the clusterPool name is generated using the
pool name + "-" + the group name:
{{- $pool := . }}
{{- $poolName := print .name "-" $group.name }}
But the clusterPoolName inside the clusterClaim only uses
"$pool.name", which makes the clusterClaim ineffective because that
pool does not exist.
Switch to using the same poolName that is being used when creating the
clusterPool.
Fix the clusterPoolName in clusterClaims
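A minimal sketch of the fix in the clusterClaim template, deriving the pool name the same way the clusterPool does; variable names are taken from the snippet above and the surrounding template is omitted:

{{- $poolName := print $pool.name "-" $group.name }}
spec:
  clusterPoolName: {{ $poolName }}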
Updated common to the latest.