Skip to content

Commit

Permalink
Feat static bootstrap stack names (#712)
Browse files Browse the repository at this point in the history
* Feat change to static bootstrap stack names

**Why?**

Initially, ADF would generate bootstrap stack names that included the name
of the OU at the end. For example, for an OU named banking, it would generate
the global `adf-global-base-banking` stack.

This, however, makes it harder to harden ADF. As it would need access rights
to deploy and manage CloudFormation stacks with a wildcard at the end.
Instead of listing a limited number of stack names.

Additionally, it makes it harder to write an SCP to limit who can update these
stacks as well.

**What?**

* Instead of using the OU name, the bootstrap stacks will be named:
  `adf-(global|regional)-base-bootstrap`.
* Exception being the `adf-(global|regional)-base-deployment` stack,
  as this stack contains the resources that ADF needs to operate.
  As well as the `adf-global-base-adf-build` stack that gets deployed to the
  management account.
  Renaming these stacks would require uninstalling ADF and reinstalling it from
  scratch. Hence these are kept as-is.
* Tightened the IAM policies that grant access to manage the bootstrap stacks.
* Added a functionality to delete deprecated stacks automatically and upgrade
  to the new stack name via the `aws-deployment-framework-bootstrap` pipeline.
* When a deprecated bootstrap stack is deleted, it will first delete the
  global-iam stack if required. As the global-iam stack adds policies to the
  roles that are created in the bootstrap stack. Therefore, the global-iam
  stack should be removed before the bootstrap stack can be deleted in the
  global region.
* Fix CloudFormation Stack/ChangeSet waiter error capture, to report back the
  account, region, and stack name that ran into a failure when needed.

* Fix /adf_version param lookup to /adf/adf_version
  • Loading branch information
sbkok committed Apr 8, 2024
1 parent 0cad5db commit 738dc7b
Show file tree
Hide file tree
Showing 8 changed files with 823 additions and 43 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -86,11 +86,12 @@ Resources:
- cloudformation:CreateChangeSet
- cloudformation:CreateStack
- cloudformation:CreateUploadBucket
- cloudformation:DeleteStack
- cloudformation:DeleteChangeSet
- cloudformation:DescribeStacks
- cloudformation:DeleteStack
- cloudformation:DescribeChangeSet
- cloudformation:DescribeStacks
- cloudformation:ExecuteChangeSet
- cloudformation:ListStacks
- cloudformation:SetStackPolicy
- cloudformation:SignalResource
- cloudformation:UpdateStack
Expand Down Expand Up @@ -125,23 +126,31 @@ Resources:
- !Sub "arn:${AWS::Partition}:ssm:*:${AWS::AccountId}:parameter/adf/*"
- Effect: Allow
Action:
- iam:CreateRole
- iam:CreatePolicy
- iam:UpdateAssumeRolePolicy
- iam:CreateRole
- iam:DeleteRole
- iam:DeleteRolePolicy
- iam:GetRole
- iam:GetRolePolicy
- iam:DeleteRole
- iam:TagRole
- iam:PutRolePolicy
- iam:DeleteRolePolicy
- iam:TagRole
- iam:UntagRole
- iam:UpdateAssumeRolePolicy
Resource:
- !Sub "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/adf-cloudformation-role"
- !Sub "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/adf-cloudformation-deployment-role"
- !Sub "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/adf-codecommit-role"
- !Sub "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/adf-automation-role"
- !Sub "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/adf-readonly-automation-role"
- !Sub "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/adf-update-cross-account-access-role"
- !Sub "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/${CrossAccountAccessRole}"
- !Sub "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/adf-terraform-role"
- Effect: "Allow"
Action:
- iam:DeleteRole
- iam:DeleteRolePolicy
- iam:UntagRole
Resource:
- !Sub "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/${CrossAccountAccessRole}"
- !Sub "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/${CrossAccountAccessRole}-readonly"
Roles:
- !Ref OrganizationsRole
Original file line number Diff line number Diff line change
Expand Up @@ -310,6 +310,7 @@ def worker_thread(
account_id=account_id
)
try:
cloudformation.delete_deprecated_base_stacks()
cloudformation.create_stack()
if region == config.deployment_account_region:
cloudformation.create_iam_stack()
Expand Down Expand Up @@ -498,6 +499,7 @@ def main(): # pylint: disable=R0915
s3_key_path="adf-bootstrap/" + account_path,
account_id=deployment_account_id
)
cloudformation.delete_deprecated_base_stacks()
cloudformation.create_stack()
update_deployment_account_output_parameters(
deployment_account_region=config.deployment_account_region,
Expand All @@ -520,6 +522,7 @@ def main(): # pylint: disable=R0915
s3_key_path='adf-build',
account_id=ACCOUNT_ID
)
cloudformation.delete_deprecated_base_stacks()
cloudformation.create_stack()
threads = []
account_ids = [
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
import re
import os

from botocore.exceptions import WaiterError, ClientError
from botocore.exceptions import WaiterError, ClientError, ValidationError
from botocore.config import Config
import tenacity

Expand All @@ -27,6 +27,9 @@
# A stack name can contain only alphanumeric characters (case sensitive)
# and hyphens.
CFN_UNACCEPTED_CHARS = re.compile(r"[^-a-zA-Z0-9]")
ADF_GLOBAL_IAM_STACK_NAME = 'adf-global-base-iam'
ADF_GLOBAL_BOOTSTRAP_STACK_NAME = 'adf-global-base-bootstrap'
ADF_GLOBAL_ADF_BUILD_STACK_NAME = 'adf-global-base-adf-build'


class StackProperties:
Expand Down Expand Up @@ -62,6 +65,30 @@ class StackProperties:
'DELETE_IN_PROGRESS': 'stack_delete_complete',
'REVIEW_IN_PROGRESS': 'change_set_create_complete',
}
all_except_deleted_states = [
'CREATE_IN_PROGRESS',
'CREATE_FAILED',
'CREATE_COMPLETE',
'ROLLBACK_IN_PROGRESS',
'ROLLBACK_FAILED',
'ROLLBACK_COMPLETE',
'DELETE_IN_PROGRESS',
'DELETE_FAILED',
'UPDATE_IN_PROGRESS',
'UPDATE_COMPLETE_CLEANUP_IN_PROGRESS',
'UPDATE_COMPLETE',
'UPDATE_FAILED',
'UPDATE_ROLLBACK_IN_PROGRESS',
'UPDATE_ROLLBACK_FAILED',
'UPDATE_ROLLBACK_COMPLETE_CLEANUP_IN_PROGRESS',
'UPDATE_ROLLBACK_COMPLETE',
'REVIEW_IN_PROGRESS',
'IMPORT_IN_PROGRESS',
'IMPORT_COMPLETE',
'IMPORT_ROLLBACK_IN_PROGRESS',
'IMPORT_ROLLBACK_FAILED',
'IMPORT_ROLLBACK_COMPLETE',
]

def __init__(
self,
Expand Down Expand Up @@ -109,9 +136,22 @@ def get_parameters(self):
return []

def _get_stack_name(self):
raw_stack_name = f'adf-{self._get_geo_prefix()}-base-{self.ou_name}'
stack_suffix = (
self.ou_name if self.ou_name in ['deployment', 'adf-build']
else 'bootstrap'
)
raw_stack_name = f'adf-{self._get_geo_prefix()}-base-{stack_suffix}'
return CFN_UNACCEPTED_CHARS.sub("-", raw_stack_name)

def _get_valid_stack_names(self):
valid_stack_names = [self._get_stack_name()]
if self.region == self.deployment_account_region:
valid_stack_names.append(ADF_GLOBAL_IAM_STACK_NAME)
valid_stack_names.append(ADF_GLOBAL_BOOTSTRAP_STACK_NAME)
valid_stack_names.append(ADF_GLOBAL_ADF_BUILD_STACK_NAME)

return valid_stack_names


class WaitException(Exception):
pass
Expand Down Expand Up @@ -196,7 +236,7 @@ def _wait_stack(self, waiter_type, stack_name):
'MaxAttempts': 45
}
)
except ClientError as client_error:
except (WaiterError, ClientError) as client_error:
LOGGER.error(
"%s in %s - Failed to wait for stack %s error %s",
self.account_id,
Expand Down Expand Up @@ -226,14 +266,18 @@ def _wait_change_set(self):
'MaxAttempts': 20
}
)
except ClientError as client_error:
LOGGER.error(
"%s in %s - Failed to wait for change set of %s error %s",
self.account_id,
self.region,
self.stack_name,
client_error,
)
except (WaiterError, ClientError) as error:
if not CloudFormation._change_set_failed_due_to_empty(
error.last_response["Status"],
error.last_response["StatusReason"],
):
LOGGER.error(
"%s in %s - Failed to wait for change set of %s error %s",
self.account_id,
self.region,
self.stack_name,
error,
)
raise

def _get_waiter_type(self):
Expand Down Expand Up @@ -450,7 +494,7 @@ def create_iam_stack(self):
self.template_url = self.s3.fetch_s3_url(
self._create_template_path(self.s3_key_path, 'global-iam')
)
self.stack_name = 'adf-global-base-iam'
self.stack_name = ADF_GLOBAL_IAM_STACK_NAME
self._wait_if_in_progress()
waiter = self._get_waiter_type()
create_change_set = self._create_change_set()
Expand Down Expand Up @@ -496,17 +540,153 @@ def get_stack_regional_outputs(self):
}

def delete_all_base_stacks(self, wait_override=False):
for stack in paginator(self.client.list_stacks):
if bool(
re.search(
'adf-(global|regional)-base',
stack.get('StackName'))):
if stack.get(
'StackStatus') in StackProperties.clean_stack_status:
LOGGER.warning(
'Removing Stack: %s',
stack.get('StackName'))
self.delete_stack(stack.get('StackName'), wait_override)
self._delete_base_stacks(
wait_override=wait_override,
)

def delete_deprecated_base_stacks(self):
self._delete_base_stacks(
wait_override=True,
deprecated_only=True,
)

def _delete_base_stacks(
self,
wait_override=False,
deprecated_only=False,
):
deleted_any = False
bootstrap_stack_found = False
for stack in paginator(
self.client.list_stacks,
StackStatusFilter=StackProperties.all_except_deleted_states,
):
matches_search = bool(
re.search(
'adf-(global|regional)-base',
stack.get('StackName'),
)
)
if not matches_search:
continue
if len(stack.get('ParentId', '')) > 0:
# Skip nested stacks
continue

if deleted_any and stack.get('StackName') == ADF_GLOBAL_IAM_STACK_NAME:
# We deleted the IAM stack already
continue

should_be_deleted = (
not deprecated_only
or stack.get('StackName') not in self._get_valid_stack_names()
)
if not should_be_deleted:
if stack.get('StackName') != ADF_GLOBAL_IAM_STACK_NAME:
bootstrap_stack_found = True
continue

if stack.get('StackStatus') == 'DELETE_COMPLETE':
# Nothing to do here
continue

LOGGER.debug(
'Base stack should be deleted: %s',
stack.get('StackName'),
)

should_delete_iam_stack = (
not deleted_any
and self.region == self.deployment_account_region
and stack.get('StackName') != ADF_GLOBAL_IAM_STACK_NAME
)
if should_delete_iam_stack:
# Remove the IAM stack before deleting an ADF global stack
# If we are deleting a bootstrap stack, we need to assume this
# might hosts the roles that get policies attached by the
# global-iam stack. Since the policies need to be deleted
# before one can delete the role, we need to delete the global
# IAM stack first.
self._delete_iam_stack_if_exists()

self._delete_stack_or_instruct_user(
stack_name=stack.get('StackName'),
stack_status=stack.get('StackStatus'),
wait_override=wait_override,
)
deleted_any = True

if deprecated_only and not bootstrap_stack_found and not deleted_any:
# If we did not find any bootstrap stack but we did run into the
# global IAM stack, then we should delete the global IAM stack.
# As the policies that the CloudFormation stack manages would
# need to be recreated and applied to new IAM Roles as created
# by a upcoming bootstrap stack.
self._delete_iam_stack_if_exists()

def _get_stack_status(self, name):
try:
LOGGER.debug(
"%s in %s - Retrieve stack status of: %s",
self.account_id,
self.region,
name,
)
response = self.client.describe_stacks(
StackName=name,
)
if response and len(response.get('Stacks', [])) > 0:
return response['Stacks'][0]['StackStatus']
return None
except (ClientError, ValidationError) as error:
if error.response['Error']['Code'] == 'ValidationError':
LOGGER.debug(
"%s in %s - Stack does not exist: %s",
self.account_id,
self.region,
name,
)
# If the stack does not exist, a ValidationError is raised.
return None # None implies missing
LOGGER.error(
"%s in %s - Retrieve stack status of: %s failed (%s): %s",
self.account_id,
self.region,
name,
error.response['Error']['Code'],
error.response['Error']['Message'],
)
raise

def _delete_iam_stack_if_exists(self):
iam_stack_status = self._get_stack_status(ADF_GLOBAL_IAM_STACK_NAME)
if iam_stack_status:
self._delete_stack_or_instruct_user(
stack_name=ADF_GLOBAL_IAM_STACK_NAME,
stack_status=iam_stack_status,
wait_override=True,
)

def _delete_stack_or_instruct_user(
self,
stack_name,
stack_status,
wait_override,
):
clean_stack_status = (
stack_status in StackProperties.clean_stack_status
)
if clean_stack_status:
LOGGER.warning('Removing stack: %s', stack_name)
self.delete_stack(stack_name, wait_override)
return

LOGGER.warning(
'Please remove stack %s manually, state %s implies that it '
'cannot be deleted automatically',
stack_name,
stack_status,
)

def get_stack_output(self, value):
try:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -94,7 +94,7 @@ def _build_param_name(name, adf_only=True):
slash_name = name if name.startswith('/') else f"/{name}"
add_prefix = (
adf_only
and not slash_name.startswith(PARAMETER_PREFIX)
and not slash_name.startswith(f"{PARAMETER_PREFIX}/")
)
param_prefix = PARAMETER_PREFIX if add_prefix else ''
return f"{param_prefix}{slash_name}"
Expand Down
Loading

0 comments on commit 738dc7b

Please sign in to comment.