ci: ensure docker buildx removes the running nodes #448

Merged
davidspek merged 1 commit into main from fix-buildx-cleanup
Sep 12, 2023

Conversation

davidspek
Contributor

Summary

I've noticed that builder deployments on our cluster aren't getting cleaned up properly. This PR solves that problem.

@davidspek added the bug-fix label ("This pull request fixes a bug") Sep 12, 2023
Signed-off-by: David van der Spek <vanderspek.david@gmail.com>
@davidspek merged commit 23ae1e8 into main Sep 12, 2023
11 checks passed
@davidspek deleted the fix-buildx-cleanup branch September 12, 2023 16:57
davidspek added a commit that referenced this pull request Sep 12, 2023
Signed-off-by: David van der Spek <vanderspek.david@gmail.com>
zreigz added a commit that referenced this pull request Sep 26, 2023
* Bump cluster-api-migration

* bump migrator

* Bump cluster-api-migration

* bump migrator

* bump migrator

* bump migrator

* Update GCP migration config

* optimize imports

* Remove --cluster-api flag

* update google bootstrap flags

* Fix deploy logic

* bump migrator

* update destroy bootstrap flags for google provider

* Check if cluster exists

* update destroy steps

* Fix deploy

* Add logging

* Add missing new line

* Fix log types

* Add client ID and secret to init survey

* remove cluster resources during destroy

* Fix wait command

* Remove plural clusters watch command

* Run go mod tidy

* Fix unit tests

* Print step numbers for bootstrap and migration

* Remove plural cluster watch command and some unused code

* Remove build step and update descriptions for CAPI deploy

* Refactor deploy and migration steps

* Refactor destroy steps

* Add destroy logs

* Refactor

* Move CAPI related logic from cmd to pkg

* Extract common code

* Move checks

* Fix minor import issue

* Cleanup

* Remove unused flag
Remove duplicated command

* Minor improvements

* Add TODO

* add post install step

* Update cluster readiness check

* Fix merge conflicts

* Update migration configuration for gcp

* Export execute steps function

* Refactor

* Refactor migration

* Add tests for common functions

* Improve GCP preflight checks

* Add tests for migration functions

* Update messaging

* add kind provider

* Raise destroy timeout

* Refactor cilium.go

* Fix resource group and storage account name validation

* Add command to check if chart is installed

* save kubeconfig

* add kind configuration

* fix kind configuration

* fix docker destroy

* normalize kind

* update e2e test

* update github action

* bump kind action

* create bootstrap namespace

* create bootstrap namespace

* add extra debug

* do not run migrate when cluster already migrated

* read sa email from credentials file

* add vendor dir to gitignore

* fix import cycle

* add PLURAL_DISABLE_MP_TABLE_VIEW env for machine pools view

* remove bootstrap operator dependencies

* cilium update

* refactor

* split e2e tests

* change name

* Refactor e2e workflows

* distinguish between regular and cluster api

* distinguish between regular and cluster api - fix

* distinguish between regular and cluster api - improvement

* distinguish between regular and cluster api - improvement

* distinguish between regular and cluster api - improvement

* add e2e test for cluster api

* enable list view for destroy

* add e2e test to check installed packages

* fix linter

* Update github.com/gin-gonic/gin to avoid CVE

* Bump dependencies

* Read Go version from go.mod in CI

* Bump dependencies

* improve error handling for deploy/destroy cluster

* e2e update machine pool

* Refactor storage account code

* Fixes

* Fix unit tests

* Fix kind delete

* remove role permissions check for gcp SA and use local CLI ADC for migration and bootstrapping

* fetch AvailabilityZones

* fix unit test

* Use Microsoft Graph SDK to create service principal and get client ID and secret

* set bootstrapMode flag for gcp during the bootstrap phase

* fix fetching zones

* Add proper role assignment to Azure service principal

* fix execute not showing error and add workaround for tf value templating issue

* Minor improvements

* read gcp credentials from adc file

* Fix client ID

* change migrate to run deploy at the end and run gcp in bootstrapMode during migrate

* update gcp permissions check

* set azure bootstrap mode flag

Signed-off-by: David van der Spek <vanderspek.david@gmail.com>

* Add commit flag at the end of running migrate (#436)

It's very likely a large number of users will forget to manage their git, we should just remove that possibility w/ this.

* do not use bootstrap mode for the gcp migration

* improve gcp permissions check messaging

* Fix typo

* Enable OIDC issuer for Azure clusters

* add some todo comments

Signed-off-by: David van der Spek <vanderspek.david@gmail.com>

* Create temporary service principal with password during deploy and destroy

* Refactor

* e2e update machine pool version

* Fix destroy

* update bootstrap step building logic

* add plural build-values REPO

* init bubbletea tui

* revert bootstrap step changes

* Resolve Helm issue

* Extract methods from bootstrap steps

* Fix destroy

* Modify aws auth configmap manually to solve migration chicken-egg (#437)

* Modify aws auth configmap manually to solve migration chicken-egg

This allows us to reusably modify the aws-auth configmap for eks from the client, which should help resolve some migration-time issues

* add to migrate steps

* Add secret list and create funcs

* Add kube initializer with context

* add feature flag for CAPI stuff

* fix build

* Add kube initializer with context

* set aws credentials

* cleanup build values command

* use dynamic credentials for GCP without storing them on the repo

* lint fix

* Refactor

* Rename file

* allow overriding enable field of helm modules

* Fix var name

* Simplify migration

* Restore uninstall azure-identity package step

* update gcp permissions check name

* fix nil pointer error when listing uninstalled package

* improve fetching AZs

* bump migrator version (#440)

Signed-off-by: David van der Spek <vanderspek.david@gmail.com>

* fix gcp provider name

* remove credentials

* Properly normalize Google -> GCP provider name and add migration step to update google provider name to gcp

* update go.sum

* make genmock

* Fix executor println (#443)

This was always saying "actionName <app>" instead of the passed action name.

* bump migrator

* small refactor

* Bump migrator version

* fix null replacement

* Deprecate values.yaml migration

* bump migrator

* Fix Azure destroy after migration

* Refactor step filtering

* Fix Azure identity bug

* add posthog feature call timeout and fix caching

* cleanup some steps

* Switch google to gcp during init

* Update messaging for GCP

* bump migrator

* update go.sum

* fix linters

* ci: ensure docker buildx removes the running nodes (#448)

Signed-off-by: David van der Spek <vanderspek.david@gmail.com>

* Add semver validation for required bootstrap tf/helm modules on migration (#445)

There are now some requirements for performing a migration tied to our helm/tf.  This will at least guarantee they're installed at migrate time.

* remove default values from migration values.yaml

* go mod tidy

* update AZs during migration

* disable external-dns and plural-certmanager-webhook

* Do not delete bootstrap cluster on failed deploy

* fix disabling plural-certmanager-webhook

Signed-off-by: David van der Spek <vanderspek.david@gmail.com>

* also disable external dns on gcp and azure

Signed-off-by: David van der Spek <vanderspek.david@gmail.com>

* Update step handling

* Add retry mechanism

* Fix step numbering

* Fix unit tests

* Further improvements

* Use map to store provider tags

* add move state backup and restore to capi deploy

* Further improvements

* Fix OIDC issuer step

* Fix typo

* add initial step confirm support

* move capi backup to .plural dir and add multi-cluster backup support

* Remove tui package

* add conditional recovery steps when cluster issues are detected

---------

Signed-off-by: David van der Spek <vanderspek.david@gmail.com>
Co-authored-by: Lukasz Zajaczkowski <zreigz@gmail.com>
Co-authored-by: Sebastian Florek <sebastian@plural.sh>
Co-authored-by: David van der Spek <vanderspek.david@gmail.com>
Co-authored-by: michaeljguarino <mguarino46@gmail.com>
Co-authored-by: David van der Spek <28541758+DavidSpek@users.noreply.github.com>
michaeljguarino pushed a commit that referenced this pull request Aug 28, 2024
Signed-off-by: David van der Spek <vanderspek.david@gmail.com>
michaeljguarino added a commit that referenced this pull request Aug 28, 2024