Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defaults & configuration policy #25

Closed
2 of 7 tasks
eladb opened this issue Dec 8, 2019 · 50 comments
Closed
2 of 7 tasks

Defaults & configuration policy #25

eladb opened this issue Dec 8, 2019 · 50 comments
Labels
framework status/stale The RFC did not get any significant enough progress or tracking and has become stale.

Comments

@eladb
Copy link
Contributor

eladb commented Dec 8, 2019

PR Champion
- @eladb

(See #163 for a previous attempt at an RFC.)

Description

Allow users to specify or install some code that controls the defaults of AWS resources or applies some policy or validation at the stack or app level.

Progress

  • Tracking Issue Created
  • RFC PR Created
  • Core Team Member Assigned
  • Initial Approval / Final Comment Period
  • Ready For Implementation
    • implementation issue 1
  • Resolved
@eladb
Copy link
Contributor Author

eladb commented Apr 6, 2020

Design concept:

We add a method to Stack with the following implementation (name is meh):

public getPropsForResource(className: string, props: any): any {
  return props;
}

Then, we somehow enforce that all L2s will call this method when they are initialized and pass in the props that were passed to them, along with the full name of the class (i.e. aws-s3.Bucket).

This will allow users to override this method at the stack level (e.g. their base MyCompanyStack class) and implement policy around default values. For example, they can enforce that all Buckets are created with encryption:

public getPropsForResource(className: string, props: any): any {
  if (className === 'aws-s3.Bucket') {
    if (props.encryption && props.encryption !== ENCRYPTED) { throw new Error('buckets must be encrypted'); }
  props.encryption = ENCRYPTED;
  }

  return props; // passthrough
}

Something along those lines... Just an option.

@nirvana124
Copy link

I have tried to create one example of enforcing user to follow organization defaults and if you want you can create new resources or modify existing resource using Aspects.
https://github.com/nirvana124/aws-cdk-library-example

@eladb eladb added status/review Proposal pending review/revision and removed status/proposed Newly proposed RFC labels Jun 23, 2020
@njlynch njlynch removed their assignment Oct 2, 2020
@njlynch njlynch added status/proposed Newly proposed RFC and removed status/review Proposal pending review/revision labels Oct 2, 2020
@Dzhuneyt
Copy link

Dzhuneyt commented Dec 7, 2020

Isn't this already possible to be implemented using Aspects?

@eladb
Copy link
Contributor Author

eladb commented Dec 8, 2020

@Dzhuneyt Aspects can (potentially) mutate constructs after they have been initialized but there are many initialization properties that cannot be modified after the instance is created.

@JordanDeBeer
Copy link

One thing I've experimented with is in my attempt to create an "organization standard library" is trying to add my own logic to the core.Construct.onValidate() method. I couldn't get it working in a way that was intuitive in the short amount of time I spent on it, but I feel like exposing some sort of API for adding validation functions to this method could be useful, or at the very least insightful.

@eladb
Copy link
Contributor Author

eladb commented Dec 8, 2020

One thing I've experimented with is in my attempt to create an "organization standard library" is trying to add my own logic to the core.Construct.onValidate() method. I couldn't get it working in a way that was intuitive in the short amount of time I spent on it, but I feel like exposing some sort of API for adding validation functions to this method could be useful, or at the very least insightful.

Ha! We recently changed the api for validations and you should be able to just do this:

this.node.addValidation(...);

See: https://docs.aws.amazon.com/cdk/api/latest/docs/constructs.Node.html#addwbrvalidationvalidation

@JordanDeBeer
Copy link

would an upstream base validation aspect to support easily adding these validations to a given construct type be useful for this rfc?

I am thinking something like this (pseudo code-ish):

class ValidatorAspect implements core.IAspect {
  type: constructs.IConstructs;
  fn: constructs.IValidation
  constructor(type: constructs.IConstruct, fn: constructs.IValidation) {
    this.type = type
    this.fn = fn
  }

  public visit(node: constructs.IConstruct): void {
    const n = node as constructs.Node;
    if (node instanceof this.type) {
      n.addValidation({
        validate: this.fn
      });
    }
  }
}

it's more reactive compared to the getPropsForResource design, but i can see both being useful. I'd be happy to work on adding something like this if there's interest.

@msmjack
Copy link

msmjack commented Apr 15, 2021

@ericzbeard gave an awesome talk to my company a few months ago. He mentioned that companies keep trying to make a CDK L2.5 library. (That's what I wanted to do first too!) He mentioned that this was not recommended. This concept really seems to fix that issue.

In his blog post, he has a section about "Don't use constructs for compliance"

Another common pattern we have seen, particularly among enterprise customers, is creating a collection of construct libraries based on the L2 constructs included with the AWS CDK, with a 1-1 mapping of subclasses. For example, you might create a class called MyCompanyBucket that extends s3.Bucket and MyCompanyFunction that extends lambda.Function. Inside those subclasses, you implement your company’s security best practices like encrypting data at rest, or requiring the use of certain security policies. This pattern serves a need, but it comes with a significant drawback: as the ecosystem of AWS CDK constructs, such as AWS Solutions Constructs, grows and becomes more useful, your developer community will be effectively cut off from it if they can’t use code that instantiates the base L2s.

Investigate the usage of service control polices and permissions boundaries at the organization level to enforce your security guardrails. Use aspects or tools like CFN Guard to make assertions about the properties of infrastructure elements before deployment.

While, I totally agree that we should use tools like CFN Guard, there is still a need and it sounds like Aspects is not the full answer.

Aspects can (potentially) mutate constructs after they have been initialized but there are many initialization properties that cannot be modified after the instance is created.

I want to sell the idea that the current level of upvotes (8) does not reflect the actual need for this concept. From the blog post, this sounds like this might "serve a need" to a "common pattern" but doesn't have the drawbacks and headaches of a company specific wrapper.

@AjkayAlan
Copy link

I can vouch for @maria-jackson's thoughts above. Currently in my organization, we are struggling to find a good balance. On the one hand, we want to provide the ability for engineers to create infrastructure which follows industry best practice. On the other hand, we want engineers to build infrastructure easily which is safe, secure, and adheres to our organization requirements. For example, one of our organization policies is that no S3 bucket should be public by default. Presently we really have 3 ways to accomplish this:

  1. Provide an "L2.5" construct library, which directly goes against @ericzbeard's guidance, and for good reason.
  2. Run proactive scans against deployed infrastructure in our AWS account, or use SCP's.
  3. Create a Cfn-Guard ruleset. Require engineers to scan their CDK output using it in their pipeline

While cfn-guard would potentially solve this, the downside is that it would likely get you feedback in the pipeline (unless you run it locally). Instead, if we were able to bake these checks into CDK synthing (or just into the native constructs via a hook like onValidate), we would likely get quicker feedback to the engineer writing the code.

@msmjack
Copy link

msmjack commented Apr 23, 2021

None of these is integral to the solution, but they would be nice to have. Take the pseudo code as an illustration of an idea, not a preferred implementation. (I do not understand CDK or Typescript enough)

Easy to Read Configuration Code

I'd really like the reusable library to be really easy to read and look a lot like how you pass CDK parameters. I want it to be very clear what people are getting and very clear to our Security folks what the defaults are.

aws-s3.Bucket.defaults({Encryption: KMS_MANAGED});
aws-ec2.Volume.defaults({encrypted: True});
my-company.AwesomeL3.defaults({Foo: BAR});

L1, L2, and L3 Default Configurations

As the CDK community expands aws-solutions-constructs, cdk-patterns, and their own L3 libraries, I see a need to provide defaults to L3, not just L2s.

For example, I use aws-apigateway-dynamodb and pretend that it defaults to no encryption. While I could pass the value of the decryption every time to the L3 construct, I don't want to. If the solution just provides L2 defaults, those would be overwritten every time.

  1. Code Provided L3
  2. Company Provided L3 default
  3. Library Provided L3 default
  4. Code Provided L2 default
  5. Company Provided L2 default
  6. aws-cdk provided L2 default
  7. Code Provided L1 default
  8. Company Provided L1 default
  9. CloudFormation Defaults

Multiple Default Configurations

I'm also wondering how multiple defaults would work. If my company had C-L2, my BU had B-L2, and my team had T-L2. I'd have things in my BU that I'd want to come before my company, but I'd also like the latest updates from my company.

aws-s3.Bucket.defaults({
	Encryption: KMS_MANAGED
}).applyAdditionalDefaults(C-L2);

Easy Translation from cfn-guard

It would be cool if your cfn-guard rules could generate your CDK defaults.

Like, this could translated to something very specific
AWS::EC2::Volume Encrypted == false

But this could not
AWS::EC2::Volume Size == 101 |OR| AWS::EC2::Volume Size == 99

@ericzbeard
Copy link
Contributor

Thanks for the input, @maria-jackson !

@shawnsparks-work
Copy link

@maria-jackson How does the defaults approach help when using libraries like aws-solutions-constructs? You would have to create each resource manually and pass them into the L3 constructs which certainly loses some of the benefit of those libraries.

@msmjack
Copy link

msmjack commented May 25, 2021

The solution below meets "Allow users to specify or install some code that controls the defaults of AWS resources," but not "or applies some policy or validation at the stack or app level." It more meets the Best practices for building company default constructs #3235, but as that is considered a duplicate of this issue I thought it was appropriate to add the discussion here.

Looking at https://github.com/awslabs/aws-solutions-constructs, here's the pattern I'd like the aws-cdk maintainers thoughts on as an alternative to the L2.5 library for default properties for "Defaults & configuration polic[ies]" or "compliant constructs."

  1. Create a library with default properties, ex DefaultLogGroupProps()
  2. Create version of overrideProps()
  3. Pass the defaults (or your overrides) to override the other properties.
_logGroupProps = overrideProps(DefaultLogGroupProps(), logGroupProps);

const logGroup = new logs.LogGroup(scope, _logGroupId, _logGroupProps);

@alexpulver
Copy link

alexpulver commented Aug 22, 2021

I also like the AWS Solutions Constructs approach (thanks @maria-jackson for the great summary!).

The company could have its own libraries with the various defaults. The security team can own company_cdk_security, operations team (if the company has operations as a centralized function) can own company_cdk_operations, and so on. It would be great if CDK could generate these libraries from cfn-guard rules. Alternatively, the central teams can create simple collections, such as the example below (in Python). Then,cfn-guard rules will validate the eventual result as part of the deployment process.

I think that another important capability is to layer multiple defaults, and allow the application owners to reset some properties before creating the construct. For example, the defaults for public website might include enforce_ssl = True. For a specific use case, the application owner can get an exception to turn this off, while keeping the rest of the defaults.

Note: I added non-existing .update(other_props: object) and .from_props(props: object) methods to illustrate how the above might look like without using the hacky implementation that I experimented with :). The .from_props(props: object) method is only needed in Python, since the current implementation uses keyword arguments to pass properties.

Edit: If you want to experiment with that approach, I created a simple example: https://github.com/alexpulver/company-guardrails

company_cdk_security/aws_s3.py

from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk


class BucketPropsCollection:
    public_access = s3.BucketProps(enforce_ssl=True, public_read_access=True)

    __expiration_lifecycle_rule = s3.LifecycleRule(expiration=cdk.Duration.days(30))
    retention_policy = s3.BucketProps(lifecycle_rules=[__expiration_lifecycle_rule])

app.py

from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk

# import company_cdk_solutions
from company_cdk_security import aws_s3 as security_s3
# from company_cdk_operations import aws_s3 as operations_s3
# from company_cdk8s_security import deployment as security_deployment

app = cdk.App()

stack = cdk.Stack(app, "Website")

# I am using Python's dict.update([other]) semantics:
# https://docs.python.org/3/library/stdtypes.html#dict.update
bucket_props = security_s3.BucketPropsCollection.public_access
bucket_props.update(security_s3.BucketPropsCollection.retention_policy)
bucket_props.update(s3.BucketProps(versioned=True, enforce_ssl=False))
bucket = s3.Bucket.from_props(stack, "Bucket", bucket_props)

app.synth()

@skinny85
Copy link
Contributor

skinny85 commented Aug 24, 2021

I really like this discussion, and I agree with the general direction where this is going!

I have a few comments on the details of each of the proposed solutions.


@maria-jackson I think, instead of having an overrideProps() function, you can take advantage of the fact that DefaultLogGroupProps() is also a function (BTW, I think calling it defaultLogGroupProps() is more idiomatic in TypeScript/JavaScript). So, you can have defaultLogGroupProps() take in an optional LogGroupProps object. In case LogGroupProps has any required properties, you can use the Partial TypeScript type, as in function defaultLogGroupProps(props: Partial<LogGroupProps> = {}) { ....

So, the code becomes:

const logGroup = new logs.LogGroup(this, 'LogGroup', defaultLogGroupProps({
  retention: logs.RetentionDays.TWO_MONTHS, // we can override any property of defaultLogGroupProps() this way
});

Which I think looks great.


@alexpulver I think you want to make these functions as well - making them class fields can have weird side-effects, as they are mutable.

You don't have the nice Partial feature from TypeScript, but I believe you can emulate it using keyword arguments:

class BucketPropsCollection:
  def public_access_props(**kwargs):
    return s3.BucketProps(enforce_ssl=True, public_read_access=True, **kwargs)

And then, in your CDK app:

from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk
# import company_cdk_solutions
from company_cdk_security import aws_s3 as security_s3


app = cdk.App()
stack = cdk.Stack(app, "Website")

bucket = s3.Bucket(stack, "Bucket", security_s3.BucketPropsCollection.public_access_props(
  versioned=True,
  enforce_ssl=False,
))

app.synth()

I'm sure we can get some nice types here too, but I'm not that familiar with Python to add them 🙂.

@alexpulver
Copy link

alexpulver commented Aug 25, 2021

Thanks for the feedback! I changed to functions; it makes sense :).

In Python, when specifying the same parameter twice, interpreter bombs. For example, it can be enforce_ssl as part of **kwargs besides explicit use by BucketPropsCollection.public_access(). I couldn't think of an easy way for policy library vendors to work around that. Open to suggestions :).

As I mentioned above, I think we should support layered/chained construction of properties. For example, there could be methods to support compliance with FedRAMP (see example at the end from AWS Config conformance pack sample template), SOC, and other standards. If I need to comply with both FedRAMP and SOC, as a policy library consumer, I want to apply both, and trust the AWS CDK API to do the right/defined thing.

@skinny85 if I iterate a bit on the experience you suggested, it would be something like below - what do you think?

cdk.Props.merge is an imaginary new class and a static method :). It gets an arbitrary number of props objects. All objects should be of the same type, otherwise it is an error. The method reads props in the first and next instances, then creates a new instance with all values merged. If there is a same prop in both instances, the latter takes precedence. The same then happens for next instance and so on. I implemented an example of this approach in the cdk_props.py module. The reason I suggested cdk.Props namespace, is because I think that logic can apply to all props classes. That will also allow all props classes to apply whatever logic they have on the props arguments.

Another reason for having the merge method as part of AWS CDK API is that multiple vendors could produce policy libraries, and might not agree on the same API for merging prop values. I am thinking of something like Managed rules for AWS Web Application Firewall.

from aws_cdk import aws_s3 as s3
from aws_cdk import core as cdk

from company_cdk_security import aws_s3 as security_s3

app = cdk.App()

stack = cdk.Stack(app, "Website")

bucket_props = cdk.Props.merge(
    security_s3.BucketPropsCollection.public_access(),
    security_s3.BucketPropsCollection.retention_policy(),
    s3.BucketProps(versioned=True, enforce_ssl=False),
)
bucket = s3.Bucket.from_props(stack, "Bucket", bucket_props)

app.synth()

This way, policy library providers would just define collections of policies, and let the clients to combine them in the desired order.

FedRAMP compliance controls example:
image

@shawnsparks-work
Copy link

shawnsparks-work commented Aug 31, 2021

It is a black and white choice from the security and architecture point of view, no?

No, it's not black and white unfortunately. There are policies which should apply to 99% of apps. When you have 100+ apps, there's going to be an app where the risk of not complying to a particular control is worth the reward and an exception gets put in place.

@OperationalFallacy
Copy link

@shawnsparks-work I agree, and probably more than 1%, those should be ok with "choose your own adventure", no?
Maybe you can give an example or two?

@GriffinMB
Copy link

Ultimately, I think there are a couple of different components to this discussion, which might benefit from splitting it out into multiple topics.

There's a library-level component, like an Aspect-style enforcement or L2.5 constructs. There's also an organization-level component, which would allow an Org to enforce best practices uniformly. I'm working on a formal proposal for the second piece, but to give a high level summary:

This proposal attempts to address the Org component with a new L3 Construct and CLI command. It enables static analysis locally and as part of a CI/CD pipeline, which allow developers and organizations to enforce best-practices for their cloud infrastructure.

The CLI command (cdk check) synthesizes templates, and POSTs them to an API endpoint (configurable in cdk.json). The API service validates the template, and returns results. The API service is generic, and can be built with any tool (e.g. cfn-nag, cfn-guard, in-house scanner, etc) as long as the API conforms to the specification. Users can also configure CDK to run scans on synth via cdk.json.

The L3 construct is a conforming implementation of the API, and uses cloudformation-guard for analysis.

Benefits of an API over a Library for Orgs

  • Updates are rolled out without dev-teams needing to change anything
  • A centralized place that can be hit locally, in CI/CD, etc
  • Orgs can add additional functionality, like integration with Access Analyzer

@alexpulver
Copy link

Thanks @GriffinMB ! How would you suggest to handle exceptions from the enforcement?

@alexpulver
Copy link

Additionally, how can checks/enforcements from multiple domains/vendors be combined? For example, there are operational and security best practices (e.g. Well-Architected pillars) that can be checked for. Additionally, different ISVs might want to provide rule packs, and a project can use several of them.

@GriffinMB
Copy link

Exceptions can be handled in a lot of ways ha. We're working on a solution to that and will include it in the full proposal. There's a kind of natural way to do it by encoding exception handling in guard rules, but that's not the solution we're looking at right now since it isn't necessarily as flexible.

Different domains can be managed by rulesets. When the API gets a scan request, it will check the template against all rulesets the user has permissions for. Different Orgs, teams, etc would only have permissions to the rulesets applicable to them. This is something else that'll have more detail in the full proposal, but it would look something like:

import * as check from '@aws-cdk/check';

// Principal with permissions to call the API
const account = new iam.AccountPrincipal(...);

// An API service using API Gateway and Lambda Functions
const validator = new check.Validator(this, 'myValidator');

// Grant the principal permissions to validate any ruleset.
validator.grantValidateAllRulesets(account);

Then, when a principal associated with the account hits the API, their template would be checked against all available rulesets. If you want to only grant permissions to specific rulesets, it would look like this:

// Grant a principal permissions to use the validator API, and check
// their templates against "security-ruleset"
validator.grantValidatorRuleset('security-ruleset', principal)

On the user side of things, if they want to kick off a local scan and only care about a subset of rulesets they have access to, they can use the --ruleset flag (cdk check --ruleset security-ruleset).

@GriffinMB
Copy link

Also, the idea is that the deployment of this would ideally be in a separate account, where you have a single stack that creates and manages the Validator, Rulesets, and permissions for each.

Users can also add additional Validator endpoints if an org has multiple Validators with different Well Architected Rulesets.

@alexpulver
Copy link

alexpulver commented Sep 3, 2021

When you say "Aspect-style enforcement" - do you mean checks and surfacing issues, like cdk-nag does, or mutating the application in place without developer control?

@GriffinMB
Copy link

I mean something along the lines of this: #25 (comment)

More generally, I just mean any solution where Construct validation is done on the Construct objects directly, during synthesis or compilation, rather than on build artifacts. I'm trying to differentiate between the (as I see it) two different problem spaces that can be addressed:

  • Libraries, which are useful for local development or code review, and may provide additional defaults or developer utility (what I'm referring to as Aspect-style)
  • A service-based solution, amenable to scalable organization-wide enforcement (what I'm proposing)

@rix0rrr
Copy link
Contributor

rix0rrr commented Oct 6, 2021

Working that the props level might work, but I can't help but notice that what we're looking for is Dependency injection.

We might want to substitute an internal call to new kms.Key() with something like new mycompany.SecureKmsKey() (or whatever). For that we would need some IoC container-like mechanism, and never call new kms.Key() ever again. Instead, we would call something like:

const key = Injector.of(this).obtain(kms.Key);

We now have the ability to

  • Inject other classes (potentially even replacing L1s?)
  • Reuse class instances (return the same key for all constructs that would otherwise each have created their own key)

This is strictly more powerful, and convenient APIs that leave the class the same but just override some props defaults could still be built on top of this.

@alexpulver
Copy link

I looked deeper into cfn-guard, and I think that having cfn-guard validate command add support for something like a CDKTemplate type would be sufficient (besides the existing CFNTemplate type added in 2.0.3 version). The CDKTemplate type can add the CDK path to the rules evaluation output, making it easier for developers to identify the relevant construct in the source code. I opened an enhancement issue for that (see a shorter summary of the issue below). Beyond that, cfn-guard rules should probably serve as a source of truth. Given the CDKTemplate (or similar capability) is available, I would retract my suggestion for a need in an AWS CDK-specific policy-as-code evaluation capability. It should also play nice with CDK Pipelines when used in validation steps.

For an example, look at the following CloudFormation logical ID: APILambdaFunctionServiceRole3284E859. The AWS CDK generates it for a resource of user management backend (an example CDK app) production environment that is deployed by a CI/CD pipeline. CDK path for that logical ID is UserManagementBackend-Pipeline/UserManagementBackend-Prod/Stateless/API/LambdaFunction/ServiceRole/Resource. Parts before /API/... refer to CI/CD pipeline stack, deployment stage and stateless stack, respectively. As an AWS CDK developer, I can quickly understand what source code I should look at to make changes, because I provided these construct IDs when instantiating the construct classes (see deployment.py snippet below).

I think that can help organizations working with both raw CloudFormation and AWS CDK. It can also be a basis for a service-based solution like @GriffinMB mentioned.

What do you think?

deployment.py

class UserManagementBackend(cdk.Stage):
    def __init__(
        self,
        scope: cdk.Construct,
        id_: str,
        *,
        database_dynamodb_billing_mode: dynamodb.BillingMode,
        api_lambda_reserved_concurrency: int,
        **kwargs: Any,
    ):
        super().__init__(scope, id_, **kwargs)

        stateful = cdk.Stack(self, "Stateful")
        database = Database(
            stateful, "Database", dynamodb_billing_mode=database_dynamodb_billing_mode
        )
        stateless = cdk.Stack(self, "Stateless")
        api = API(
            stateless,
            "API",
            database_dynamodb_table_name=database.dynamodb_table.table_name,
            lambda_reserved_concurrency=api_lambda_reserved_concurrency,
        )

@mnoumanshahzad
Copy link

@GriffinMB thanks for sharing valuable insights on how you are approaching the problem spaces. A lot of your comments are making sense to me.

What I am confused about is the enforcement part of things.

How I currently understand it:

  • A developer will need to include the validator to their CDK code
  • The cdk check command will scan and return a report on compliance with rulesets
  • In case some compliance is not met, the developer has to modify their CDK code

How is the above workflow different than the CDK Aspects?

  • A developer can include an organisation-wide CDK Aspects library
  • The cdk synth / cdk deploy command will allow the Aspects to run validation
  • In case some compliance is not met, the developer will have to modify their CDK code

The CDK Aspects library can be developed internally at its own pace and the developers are not blocked by it.

@mnoumanshahzad
Copy link

Probably I am missing out on something or misunderstanding the intent of the discussion

On the matter of providing defaults, one can provide a custom >=L2 construct with defaults of choice and it is the matter of exposing the resources encapsulated within a construct so that the CDK escape hatches can be leveraged.

On the construct provided on an organisation level

export class MyDummyConstruct extends Construct {

  public readonly role: IRole;
  public readonly bucket: IBucket;

  constructor(scope: Construct, id:string, props: MyDummyProps) {
    super(scope, id);
    
    this.role =. new Role(this, "Role", {... can enter the sane defaults ...})
    this.bucket = new Bucket(this, "Bucket", {... can enter the sane defaults ...})
  }
}

On the consumption side, making use of the escape hatch

const myDummyResource = new MyDummyConstruct(this, "DummyConstruct", {... some props ...})

// The underlying resource is accessed and behavior other than the sane default is applied
myDummyResource.bucket.applyRemovalPolicy(RemovalPolicy.DESTROY)

Additionally, the props for a construct can also be used as a mechanism to override the sane defaults.

Would be helpful to clarify my thoughts if someone can point out the drawbacks and in blind spots in this approach.

@alexpulver
Copy link

alexpulver commented Dec 28, 2021

My 2 cents on using escape hatches :). Each construct library has a public API. To me, using escape hatches is like using internal implementation details of a library. These are subject to change without notice, because they are not part of the API contract. And this can break my code unexpectedly, so I prefer to avoid this.

I'd rather prefer the library to add the required functionality (e.g. expose the underlying resources through an intentional public API). If the library generates many resources (like ApplicationLoadBalancedFargateService), it will take each consumer a significant effort to understand the underlying construct tree and find the relevant resource to patch. That breaks the intent of a higher-level construct library - encapsulate the details through abstraction to save library consumers time.

I think the organizational constructs should support at least the following properties to avoid being a blocker while putting guardrails in place:

  • Provide AWS resource-level constructs with company defaults. There are always exceptions to patterns, and this will allow builders to create their own solution architecture, while still using the provided defaults
  • AWS resource-level constructs with company defaults should favor inheritance over composition. This will allow library consumers to adopt new service features as they come out. Using composition allows the organization to block this on purpose and requires builders to wait for vetting from the library owners and a new library version.
  • AWS resource-level constructs with company defaults should allow to override the defaults through constructor properties. Like @shawnsparks-work mentioned in Defaults & configuration policy #25 (comment), there are use cases where an exception to defaults is acceptable. Allowing to override the defaults through constructor properties allows to set an explicit API contract between library provider and consumer (like @mnoumanshahzad mentioned in Defaults & configuration policy #25 (comment) as well).

There are additional topics being discussed in this thread, unless I missed something :):

  • Provide policy libraries to help builders configure resources faster. For example, one application needs to comply with FedRAMP and NIST 800-53, whereas another application should comply with PCI-DSS, and another with none. I don't think the defaults mentioned above should always include all the controls. It might have cost implications, for example capturing S3 access logs where not required. Policy libraries approach allows the organization to implement the requirements once, and builders to use them when needed. The policy libraries should be a collection of properties for AWS resource-level constructs. The reason is that using AWS-resource level constructs directly will not allow to create a single S3 bucket (for example) that supports several compliance standards at the same time. That is because each compliance standard can focus on a distinct set of properties, and there is a need for a union between them.
  • Provide an organization with capability to write rules once for multiple infrastructure as code tools. AWS CloudFormation Guard 2.0 (cfn-guard) is an example of such tool - it allows to write policy rules for any JSON- and YAML-formatted file such as Kubernetes configurations, Terraform JSON configurations, and CloudFormation templates. Builders can synthesize their AWS CDK code to CloudFormation, and run cfn-guard rules on the resulting template for find policy violations. The violations can be fixed (e.g. using policy libraries) or suppressed (if this is an accepted risk).
  • Provide an organization with capability to make the latest rules automatically available for all projects. Dependabot-style pull requests for policy/rule libraries or @GriffinMB's approach (Defaults & configuration policy #25 (comment)) can be ways to implement this. Alternatively, if the company has a process for validating any infrastructure before deployment, that can be another way to verify that latest rules were used to check the infrastructure code.
  • Provide builders with capability to suppress rules violations. Let's say I used AWS resource-level constructs with defaults and a policy library to further configure its properties. I need to make an exception - e.g. turn off S3 access logs. The policy rule will report this as a violation, and I need to suppress it. For example, cdk-nag provides several ways for suppressing a rule.

@dontirun
Copy link

dontirun commented Jan 4, 2022

There are additional topics being discussed in this thread, unless I missed something :):

I wonder how much of this could be implemented through the current and possibly extended functionality of the cdk watch and the AWS Toolkit for VS Code for fast feedback.

If cdk watch added functionality to read cfn-guard policy, it could provide fast feedback with those organization specific rules, like it already does with aspects.

The AWS Toolkit for VS Code could read cfn-guard policy and autofill those defaults in IaC like cdk, cfn, Terraform, etc on instantiation of the specific resources.

@awsmjs
Copy link
Contributor

awsmjs commented Dec 15, 2023

Closing this ticket as it is not a current priority.

@awsmjs awsmjs closed this as completed Dec 15, 2023
@mrgrain mrgrain added status/stale The RFC did not get any significant enough progress or tracking and has become stale. and removed status/proposed Newly proposed RFC labels Dec 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
framework status/stale The RFC did not get any significant enough progress or tracking and has become stale.
Projects
None yet
Development

No branches or pull requests