feat(migrations): Migrations supporting shared managed service accounts. #1022

Merged 3 commits on Mar 22, 2021

ajbw
Contributor

@ajbw ajbw commented Mar 10, 2021

Per spinnaker/spinnaker#6319 and spinnaker/orca#4083, this PR contains
migrations to support shared managed service accounts.

Please see spinnaker/orca#4083 for background on what shared managed
service accounts are, and what they're for.

This PR modifies the existing DeleteDanglingServiceAccountsMigration
migration, and adds a new SharedManagedServiceAccountsMigration
migration.

DeleteDanglingServiceAccountsMigration now supports shared managed
service accounts, and recognises two new configuration options. The
first new configuration option,
migrations.deleteDanglingManagedServiceAccounts, allows users to
enable this migration even after its original timebox (which expired on
31 December 2020). It needs to be possible to enable this at will, so
that after migrating all pipelines to shared managed service accounts,
users can delete old, unused per-pipeline managed service accounts.
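
As a rough illustration of how the flag relates to the timebox, here is a minimal sketch in Python (not the actual front50 code). The function name and the constant are hypothetical, and treating 31 December 2020 as the last valid day is an assumption.

```python
from datetime import date

# Assumption: the original timebox ran up to and including 31 December 2020.
TIMEBOX_END = date(2020, 12, 31)


def dangling_cleanup_is_valid(today: date, delete_dangling_flag: bool) -> bool:
    """Return True if the cleanup migration should run.

    delete_dangling_flag stands in for
    migrations.deleteDanglingManagedServiceAccounts. When it is set, the
    migration runs regardless of the date; otherwise only the original
    timebox applies.
    """
    return delete_dangling_flag or today <= TIMEBOX_END
```

With the flag left off, this behaves like the old timeboxed migration. With the flag on, the date no longer matters.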

The second new configuration option in
DeleteDanglingServiceAccountsMigration,
migrations.deleteDanglingSharedManagedServiceAccounts, allows users to
delete unused shared managed service accounts. This situation can arise
when shared managed service accounts are disabled across the environment,
or when a pipeline trigger has been modified so that a new combination
of roles is in use and the shared managed service account associated
with the original combination of roles is no longer required. In our
environment this happens relatively infrequently, so we don't run with
this migration enabled all the time, but it's available should it become
necessary to clean up shared managed service accounts left over from old
role combinations.
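
To make "unused" concrete, here is a minimal sketch of the idea in Python (not the migration's real implementation). It assumes pipelines are plain dictionaries whose triggers name their service account in a runAsUser field. The function name and the account names in the usage note are hypothetical.

```python
def find_dangling_service_accounts(service_accounts, pipelines):
    """Return the service account names that no pipeline trigger runs as."""
    in_use = {
        trigger["runAsUser"]
        for pipeline in pipelines
        for trigger in pipeline.get("triggers", [])
        if trigger.get("runAsUser")
    }
    # Anything that exists but is not referenced by a trigger is dangling.
    return sorted(set(service_accounts) - in_use)
```

For example, if a trigger that used to run as a hypothetical shared-dev account is changed so that it now runs as shared-dev-ops, and nothing else references shared-dev, then shared-dev is the only name returned.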

This PR also adds a new SharedManagedServiceAccountsMigration
migration, to allow users who have enabled shared managed service
accounts in orca to migrate all existing pipelines away from
per-pipeline managed service accounts across to shared managed service
accounts. This behaviour is controlled via a new configuration option,
migrations.migrateToSharedManagedServiceAccounts.
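
The following Python sketch illustrates the general shape of such a migration under stated assumptions; it is not the code in this PR. The naming scheme (a SHA-256 digest of the sorted, lower-cased roles), both suffix constants, and the roles and runAsUser field names are assumptions made for illustration. See spinnaker/orca#4083 for how shared accounts are actually derived.

```python
import hashlib

# Assumptions for illustration only; the real suffixes may differ.
PER_PIPELINE_SUFFIX = "@managed-service-account"
SHARED_SUFFIX = "@shared-managed-service-account"


def shared_account_name(roles):
    """Derive one deterministic account name per distinct set of roles."""
    normalized = sorted({role.lower() for role in roles})
    digest = hashlib.sha256("\n".join(normalized).encode("utf-8")).hexdigest()
    return f"{digest}{SHARED_SUFFIX}"


def migrate_pipeline(pipeline):
    """Return a copy of the pipeline whose per-pipeline accounts are shared."""
    roles = pipeline.get("roles") or []
    if not roles:
        return pipeline  # no roles, so no managed service account to migrate
    shared = shared_account_name(roles)
    triggers = []
    for trigger in pipeline.get("triggers", []):
        run_as = trigger.get("runAsUser") or ""
        # Only rewrite per-pipeline managed accounts; leave others untouched.
        if run_as.endswith(PER_PIPELINE_SUFFIX):
            trigger = {**trigger, "runAsUser": shared}
        triggers.append(trigger)
    return {**pipeline, "triggers": triggers}
```

Because the name depends only on the set of roles, every pipeline with the same role combination ends up on the same account. That is what lets thousands of per-pipeline accounts collapse into a handful of shared ones.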

To illustrate how these migrations might be used, this is the way we
rolled these changes out in our organisation:

  1. enable orca's new useSharedManagedServiceAccounts flag (added in
    spinnaker/orca#4083). This causes all new pipelines to be
    configured with a shared managed service account from the outset.
  2. set front50 replicas to 1, and enable front50 migrations and
    the new migrateToSharedManagedServiceAccounts flag. This causes
    all existing pipelines to be updated to use shared managed service
    accounts. (We opted to run with a single front50 replica during
    this process as it seems the MigrationRunner does not have any
    concurrency protection.)
  3. disable front50's migrateToSharedManagedServiceAccounts flag.
  4. enable front50's new deleteDanglingManagedServiceAccounts flag,
    to remove all now-unused per-pipeline managed service accounts.
  5. disable front50's deleteDanglingManagedServiceAccounts flag and
    disable front50 migrations altogether. Set front50 back to 3
    replicas (our usual production configuration).
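
The front50 side of the steps above can be sketched as a sequence of configuration states, together with a small check of the ordering constraints they imply. This is only an illustration in Python: the key migrationsEnabled is a hypothetical stand-in for however front50 migrations are switched on, and the orca flag from step 1 is left out because it lives in a different service.

```python
MIGRATE = "migrateToSharedManagedServiceAccounts"
DELETE = "deleteDanglingManagedServiceAccounts"

# Usual production state before the rollout (3 replicas, migrations off).
BASE = {"replicas": 3, "migrationsEnabled": False, MIGRATE: False, DELETE: False}

# What changes in front50 at each of the five steps.
CHANGES = [
    {},  # step 1: only the orca flag changes
    {"replicas": 1, "migrationsEnabled": True, MIGRATE: True},
    {MIGRATE: False},
    {DELETE: True},
    {DELETE: False, "migrationsEnabled": False, "replicas": 3},
]


def rollout_states(base, changes):
    """Apply each step's changes cumulatively and return every state."""
    states, current = [], dict(base)
    for change in changes:
        current = {**current, **change}
        states.append(current)
    return states


def check_rollout(states):
    """Return a list of ordering problems; an empty list means none found."""
    problems, migrated = [], False
    for number, state in enumerate(states, start=1):
        if state["migrationsEnabled"] and state["replicas"] != 1:
            problems.append(f"step {number}: migrations on with several replicas")
        if state[MIGRATE] and state[DELETE]:
            problems.append(f"step {number}: migrating and deleting at once")
        if state[DELETE] and not migrated:
            problems.append(f"step {number}: deleting before migrating")
        migrated = migrated or (state["migrationsEnabled"] and state[MIGRATE])
    return problems
```

Running check_rollout(rollout_states(BASE, CHANGES)) on the plan above reports no problems, and the final state is identical to BASE. The single-replica check encodes our caution about MigrationRunner concurrency rather than a documented requirement.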

In our production environment, migrateToSharedManagedServiceAccounts
took around 15 seconds to migrate 4,000 pipelines, and the subsequent
deleteDanglingManagedServiceAccounts took around 10 seconds to delete
around 4,000 unused managed service accounts.

Note that we performed all of these steps in quick succession (as we
were attempting to fix major performance issues in our production
environment). After the deleteDanglingManagedServiceAccounts step, we
found that pipelines and stages that had been started before the
maintenance could not be restarted. We would therefore recommend warning
users in advance, or leaving a generous gap between the initial
migration to shared managed service accounts and the subsequent deletion
of old per-pipeline managed service accounts.

Alexandra Spillane and others added 2 commits March 10, 2021 11:25
@ajordens ajordens added the ready to merge Approved and ready for merge label Mar 22, 2021
@mergify mergify bot added the auto merged label Mar 22, 2021
@mergify mergify bot merged commit 70d43f0 into spinnaker:master Mar 22, 2021
pemmasanikrishna pushed a commit to pemmasanikrishna/front50 that referenced this pull request Sep 21, 2021
…ts. (spinnaker#1022)

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
3 participants