Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chart changes to fix race condition during helm install #75

Merged
merged 8 commits into from
Feb 23, 2024

Conversation

rajatjindal
Copy link
Contributor

@rajatjindal rajatjindal commented Feb 18, 2024

When creating SpinAppExecutor, it fails as the webhook service is not available yet (because spin-operator deployment pods needs to be running, which needs certificate volume, which depends on Certificate resource, which is a post-install hook for helm).

the fix is to create SpinAppExecutor as post-install hook (with weight: "3" to ensure it runs after certificate resource is created).

The fix is

  • to pull cert-manager helm chart as a pre-requisite instead of dependency of spin-operator chart
  • remove post-install helm hooks from cert-issuer and webhook-cert
  • install spin-app-executor as post-install hook (once webhook service is ready)
  • install spin-app-executor in default namespace (as our examples refer to default namespace)

I also added a smoke test for this. but it needs a publically available image for the operator (which we publish on ttl.sh as part of PR). we just need to find a way to have helm-install-smoke-test have a dependency on the action that publish ephermal docker image for PR.

Copy link

github-actions bot commented Feb 18, 2024

This PR now has an image available for testing:

  ttl.sh/spoopy-operator-pr-:24h

Copy link
Contributor

@vdice vdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you @rajatjindal!

@rajatjindal
Copy link
Contributor Author

I was attempting to write a smoke test for this and realized it may not be the complete fix yet. I will be spending some more time on this today.

@rajatjindal rajatjindal changed the title add hook annotations to spin-shim-executor resource to fix race condition chart changes to fix race condition during helm install Feb 20, 2024
@rajatjindal rajatjindal force-pushed the fix-helm-install branch 2 times, most recently from 77170ca to 71771bc Compare February 20, 2024 16:52
@endocrimes
Copy link
Contributor

Needs a rebase (and probably the github workflow updating to point to go1.22)

@rajatjindal
Copy link
Contributor Author

Needs a rebase (and probably the github workflow updating to point to go1.22)

done

Copy link
Contributor

@vdice vdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally and timing issues appear to be resolved but --wait is essential on the helm install command... see associated comment...

charts/spin-operator/Chart.yaml Outdated Show resolved Hide resolved
@@ -2,4 +2,8 @@ apiVersion: core.spinoperator.dev/v1
kind: SpinAppExecutor
metadata:
name: containerd-shim-spin
namespace: default
Copy link
Contributor

@vdice vdice Feb 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We either need to add these same annotations to the canonical source (https://github.com/spinkube/spin-operator/blob/main/config/samples/shim-executor.yaml) or perhaps remove this logic and maintain a separate, chart-specific version. (Otherwise, these changes are lost on the next chart (re-)generation by helmify.)

Copy link
Contributor

@vdice vdice Feb 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like if I don't add --wait to the helm install command, I get the timing error:

helm upgrade --install \
		-n spin-operator \
		--create-namespace \
		--set controllerManager.manager.image.repository=vdice/spin-operator \
		--set controllerManager.manager.image.tag=latest \
		spin-operator charts/spin-operator
Release "spin-operator" does not exist. Installing it now.
Error: failed post-install: warning: Hook post-install spin-operator/templates/containerd-shim-spin-executor.yaml failed: 1 error occurred:
	* Internal error occurred: failed calling webhook "mspinappexecutor.kb.io": failed to call webhook: Post "https://spin-operator-webhook-service.spin-operator.svc:443/mutate-core-spinoperator-dev-v1-spinappexecutor?timeout=10s": no endpoints available for service "spin-operator-webhook-service"


make: *** [helm-install] Error 1

Do we ensure docs/CI/etc all use --wait to avoid this error? Or do we look at not including this CR in the helm chart and rather documenting how it needs to be installed in any namespace that users wish to run SpinApps? (Well, we should doc that behavior anyways... looks like this is being tracked by #55)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: install->uninstall->install, currently the SpinAppExecutor CR is stuck terminating when deleted (currently tracked in #7 (comment)) so the subsequent install fails with:

Release "spin-operator" does not exist. Installing it now.
Error: failed post-install: warning: Hook post-install spin-operator/templates/containerd-shim-spin-executor.yaml failed: 1 error occurred:
	* object is being deleted: spinappexecutors.core.spinoperator.dev "containerd-shim-spin" already exists

I guess this is a heads up for now? i.e. should resolve once this CR is reliably removed...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, --wait is indeed required if we want the post-install hook of setting up CR succeed. My suggestion would be to add --wait and also make it clear using doc that CR is required in all namespaces where we wish to install spinapps.

one other way could be to make available a helm chart variable "spin-app-namespaces", we can then iterate over that list and install CR automatically during helm release installation. wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We either need to add these same annotations to the canonical source (https://github.com/spinkube/spin-operator/blob/main/config/samples/shim-executor.yaml) or perhaps remove this logic and maintain a separate, chart-specific version. (Otherwise, these changes are lost on the next chart (re-)generation by helmify.)

👍 thank you for catching this. Made the change to the canonical source.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one other way could be to make available a helm chart variable "spin-app-namespaces", we can then iterate over that list and install CR automatically during helm release installation. wdyt?

That's a creative idea to explore in the future!

@@ -84,6 +84,21 @@ make install
kubectl apply -f spin-runtime-class.yaml
```

Install the cert-manager helm chart (if you already have cert-manager running, you can skip this step)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.github/workflows/helm-install-smoketest.yaml Outdated Show resolved Hide resolved
.github/workflows/helm-install-smoketest.yaml Outdated Show resolved Hide resolved
@rajatjindal
Copy link
Contributor Author

fixes #74 and #20

.github/workflows/helm-install-smoketest.yaml Outdated Show resolved Hide resolved
@@ -2,4 +2,8 @@ apiVersion: core.spinoperator.dev/v1
kind: SpinAppExecutor
metadata:
name: containerd-shim-spin
namespace: default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one other way could be to make available a helm chart variable "spin-app-namespaces", we can then iterate over that list and install CR automatically during helm release installation. wdyt?

That's a creative idea to explore in the future!

Signed-off-by: Rajat Jindal <rajatjindal83@gmail.com>
…p during helm chart install

Signed-off-by: Rajat Jindal <rajatjindal83@gmail.com>
Signed-off-by: Rajat Jindal <rajatjindal83@gmail.com>
@rajatjindal rajatjindal force-pushed the fix-helm-install branch 2 times, most recently from 638485a to 8116df0 Compare February 22, 2024 12:52
Signed-off-by: Rajat Jindal <rajatjindal83@gmail.com>
Signed-off-by: Rajat Jindal <rajatjindal83@gmail.com>
Signed-off-by: Rajat Jindal <rajatjindal83@gmail.com>
Signed-off-by: Rajat Jindal <rajatjindal83@gmail.com>
Signed-off-by: Rajat Jindal <rajatjindal83@gmail.com>
@rajatjindal rajatjindal merged commit 38b316a into main Feb 23, 2024
7 of 8 checks passed
@rajatjindal rajatjindal deleted the fix-helm-install branch February 23, 2024 02:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants