Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC 470: Aurora Serverless v2 support #472

Closed
wants to merge 11 commits into from

Conversation

pahud
Copy link

@pahud pahud commented Jan 3, 2023

This is a request for comments about Aurora Serverless v2 support. See #470 for
additional details.

APIs will be signed off by @vinayak-kukreja


By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache-2.0 license

Copy link
Contributor

@corymhall corymhall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a stop-gap approach to adding serverless v2 support. Maybe that's what we want to do, but personally I think we need to redesign the entire construct. From a users perspective I need to know too much outside of how the construct works.

Regardless, I would like to see more discussion around specific use cases and maybe when you should/should not use this construct. For example, I don't think any of the discussion in the original PR has been addressed aws/aws-cdk#22446

### Using Aurora Serverless v2 for existing provisioned workloads

Aurora Serverless v2 allows you to add one or more Aurora Serverless v2 DB instances to
the existing cluster as reader DB instances. Make sure the cluster engine is compatible with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only reader? I thought you can have writer as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @corymhall

If user has provisioned the cluster with the existing DatabaseCluster L2 construct with at least 1 provisioned instances, any additional serverless instance added into this cluster would always be the reader.

To create a serverless writer, first we need create the cluster with no provisioned instances with instance: 0 and the first serverless instance added into this cluster would be the writer.

Copy link
Contributor

@vinayak-kukreja vinayak-kukreja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is not a great customer experience. I agree with Cory that to use this construct the customers would need to understand a lot of the context of what parts are needed to make a cluster work with serverless v2 configuration. Our API is not clearly stating the intent here.

For instance, if I am a customer who would like to have a database cluster that supports Aurora Serverless V2 configuration, would it not be easier to just know that a construct does this for me and abstracting me from the internals of how that is possible. Mentioning serverlessV2Scaling here seems odd to me since that would already be my intent.

I believe we are trying to solve two different intent here,

Customers who already have a database cluster

For them to utilize serverless v2 configuration, they would need to,

  1. Make sure there database engine version aligns with what is supported by v2
  2. Enabling the serverlessV2Scaling configuration
  3. Make sure the region is supported(Which is common for both)

Things to consider here would be,

  • Is it easy to switch database engine versions? Does that cause a stack update or block us on using certain features?
  • Can we provide the customers with validations or defaults around database engine versions and regions?

Customers who would like to create a new database cluster

These customers might not be aware of these recent changes. Would it not make sense to provide them with a default configured cluster that supports v2 configuration.

Things to consider here would be,

  • The cost the customer would incur. From my understanding, serverless v2 instances have a minimum ACU(Aurora Capacity Unit) of 0.5 which would add to their cost. We should warn the customer when they are doing this or mention it somewhere.
  • What use cases would be common here? If we are to provide a default configuration what would that look like? Would it be one serverless writer and one serverless reader? What would that cost?

I would like to learn how the customers utilize this new configuration? What could a default scenario look like that can target to solve a commonly used pattern?

For instance,
For an Aurora Serverless v2 cluster, you can choose whether to encrypt the cluster volume. For an Aurora Serverless v1 cluster, the cluster volume is always encrypted. You can choose the encryption key, but you can't disable encryption.

Does default encryption make sense for us? What should be our VPC configuration? Just some things to figure out defaults for. Let me know if this makes sense.

Comment on lines 49 to 61
const cluster = new rds.DatabaseCluster(this, 'Database', {
...
instances: 2,
// make sure the engine is compatible with serverless v2.
engine: rds.DatabaseClusterEngine.auroraMysql({ version: rds.AuroraMysqlEngineVersion.VER_3_02_1 }),
// required before you are allowed to add serverless instances
serverlessV2Scaling: {
maxCapacity: 4,
minCapacity: 0.5,
},
});
// add a serverless reader
cluster.addServerlessInstance('InstanceId')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The serverlessV2Scaling and instances is confusing to me. I believe as a customer this is too much to know and does not reflect intent in the API.

For instance, what does instances mean if I have serverlessV2Scaling mentioned. I would assume those are two serverless v2 instances being created by the construct. But these are provisioned instances right?

And we are also assuming the customer to know that there is no way to provision serverlessInstance: 1 in the API and we are doing it via addServerlessInstance(). Let me know if I am getting this right?

And as you mentioned restrictions on region and database engines that are supported by v2. There should either be validations or defaults around it if we are opting for v2 and this responsibility should not lie with the customer to configure.

@mergify mergify bot dismissed stale reviews from vinayak-kukreja and corymhall January 10, 2023 18:10

Pull request has been modified.

@pahud
Copy link
Author

pahud commented Jan 10, 2023

HI @corymhall and @vinayak-kukreja

I've updated this RFC to propose a new construct instead. Please check it out again.

Copy link
Contributor

@vinayak-kukreja vinayak-kukreja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should add a section about use cases to this. We still need to understand the intent behind the API. That would help us figure out how we can make the user experience better with our abstractions.

I can reach out to Aurora team to setup a conversation thread so that you can gather this information.
Has there been any community insights on the use cases or documented anywhere else?

## Aurora Serverless v2 support

Aurora Serverless v2 is an on-demand, autoscaling configuration for Amazon Aurora.
Aurora Serverless v2 helps to automate the processes of monitoring the workload and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have helper methods supporting metrics here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As DatabaseClusterV2 extends the DatabaseClusterBase abstract class and implement IDatabaseCluster so all helper methods defined in the base class should be available for DatabaseClusterV2. If we need to add additional helper methods for this new construct we can consider either add in the abstract class or this class. As this is a new construct without breaking changes, I think it's always possible to add any helper methods if necessary but doesn't have to be do that in the first PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And Yes! I will add a section for all known use cases.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Cases added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding those.

In regards to the helper methods, what I want to convey is to think about how certain functions enable a better customer experience. They can be implemented later but the RFC would be a good place to list them out if any.


```ts
new rds.DatabaseClusterV2(stack, 'cluster', {
engine,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have some validations in place for Serverless V2 instances? I believe these do not work with all versions of database engines.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we should validate the engine: rds.IClusterEngine to make sure it's compatible with serverless v2 and throw if necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, do you believe we can have others? I remember you mentioned this is limited to certain regions as well?

Can these be documented in the RFC?

```ts
new rds.DatabaseClusterV2(stack, 'cluster', {
engine,
vpc,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any defaults for VPC that would make sense for our database here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes ec2.IVpc would suffice.

engine,
vpc,
// writer(serverless)
writer: { serverless: true },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does serverless: true mean here? A serverless instance has minimum and maximum ACUs right? What are those defaults for us? How can the customer modify them? Do we plan to add to the documentation about expenses this would add?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default ACU is defined in the serverlessV2Scaling prop for DatabaseClusterV2 which is optional and the default is min: 0.5 max: 1, and that is the minimal required value. We are not allowed to define ACU for individual serverless instances. The only available configuration is the serverlessV2Scaling for the cluster.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, could you please tell me why the maximum is 1? I see it can be higher here.

We are not allowed to define ACU for individual serverless instances.

By this you mean the config of min and max that we would set would apply to all serverless v2 instances. Right?

Copy link
Author

@pahud pahud Jan 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we should discuss what is the best defaults of min and max. The minimal possible values from cost-saving perspective is min:0.5 and max:1

  • the minimal value of min is 0.5
  • the value of max should be greater than the value of min hence 1

With those defaults, the default cluster would have minimal required values with the best cost saving while customers can always specify any higher values as they wish. With that being said, I think the defaults make sense. Thoughts?

could be either serverless or privisioned.

```ts
new rds.DatabaseClusterV2(stack, 'cluster', {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you mention the defaults for this API and any validations that would be implemented?

What would happen if I do,

new rds.DatabaseClusterV2(stack, 'cluster');

Copy link
Author

@pahud pahud Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The minimal required props of this construct would be engine and vpc. The engine prop is rds.IClusterEngine and we can check in the construct if the engine is compatible with serverless v2 and throw if necessary.

vpc is required in the InstanceProps for the old DatabaseCluster but we don't need the whole InstanceProps. We just need the vpc: ec2.IVpc.

check out this POC for more implementation details.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the confusion. What I meant to ask was what configuration of cluster gets created if we are just relying on defaults? For instance, do we generate 2 provisioned instances in the default scenario

I was unable to figure this out in the POC. Could you point me to the implementation and also add that information in the RFC?

[Using Aurora Serverless v2](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2.html)
for more details.

### Create a new cluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can any helper methods make it easier or more clear for the customer to use this resource?

Copy link
Author

@pahud pahud Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As DatabaseClusterV2 extends the DatabaseClusterBase abstract class and implement IDatabaseCluster so all helper methods defined in the base class should be available for DatabaseClusterV2. If there's new requirement to add any new helper method we can consider add in the base class so the old DatabaseCluster class can inherit as well. As DatabaseClusterV2 aims to create a new ServerlessV2-enabled cluster with a writer and optional readers, which could be declared in the construct property, it is not clear to me to have the requirement to define additional helper methods in the L2 construct class rather than the base class. If there's any, please let me know.

Comment on lines +56 to +58
{ instanceType: ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.LARGE) },
// reader 2(serverless)
{ serverless: true },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks odd to me. We have more flexibility provisioning an instance but serverless is just a boolean? Can this experience be modeled to be similar?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need an interface for writer and readers and they can be either provisioned or serverless and the default is provisioned.

For a serverless instance, the minimal required property would be serverless: true since the default is provisioned while for a provisioned instanced the default could be just {} but you can optionally specify custom instanceType which is not allowed in serverless.

I think this could satisfy both serverless and provisioned here but I am open to different design for this interface.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there could be an issue with large clusters with our current API with list of readers and writers.

For instance, imagine a cluster with 50 readers. Instantiating with current API would look quite large and have redundant information as well.

Also, could you tell me what do you mean by custom instance type is not allowed in serverless? From what I understand, we can have ACU's modified for serverless v2 instances?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my current design, if you are creating a large cluster with 50 or more readers, you need to pass a list of instance options to the readers prop. And each element in the list could have different instance types or instance options. Do you expect any other user experience for this scenario?

new DatabaseClusterV2(scope, id, {
  writer,
  readers: readerList,
}

Also, could you tell me what do you mean by custom instance type is not allowed in serverless? From what I understand, we can have ACU's modified for serverless v2 instances?

In serverlessV2, there is no option to specify individual ACU for individual serverless instances. All serverless instances are sharing the same ServerlessV2ScalingConfiguration which is cluster-wise. Behind the scene the serverless instance is literally the CfnDbInstance which does not allow you to specify any specific ACU for it.

Comment on lines +121 to +122
Users that opt-in serverless v2 enabled clusters should use this class to create a new cluster
as well as writer and reader instances, which could be serverless, porivioned or mixed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can the user opt-in for this functionality? Is it just by using this construct?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. From my current proposed design, the DatabaseClusterV2 construct is only for serverlessV2-enabled clusters only. When you instantiate this class, a CfnDatabaseCluster would be provisioned with serverlessV2 enabled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you for clarifying. Could you please add this to the RFC too?

@mergify mergify bot dismissed vinayak-kukreja’s stale review January 17, 2023 22:37

Pull request has been modified.

@pahud
Copy link
Author

pahud commented Jan 17, 2023

Just added the Use Cases section and elaborated more details on the design. FYR @vinayak-kukreja

Comment on lines +27 to +48
## Use Cases

1. To create a new Aurora Serverless v2 cluster with one writer only. This writer could be either serverless or provisioned instance:

- serverless writer
- provisioned writer

2. To create a new Aurora Serverless v2 cluster with one writer and one reader as serverless or provisoined. This could be:

- serverless writer + serverless reader
- serverless writer + provisioned reader
- provisioned writer + serverless reader
- provisioned writer + provisioned reader

3. To create a new Aurora Serverless v2 cluster with one writer and multiple readers. This could be:

- serverless writer + all serverless readders
- serverless writer + all provisioned readers
- serverless writer + mixed readers
- provisioned writer + all serverless readers
- provisioned writer + all provisioned readers
- provisioned writer + mixed readers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding these.

Let me make sure the Aurora Serverless Service team also reviews this and if there are any more that they think could provide a better customer experience, then we can add those here as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After syncing with the Aurora team, listing here for visibility what this design needs to look out for,

  • Supporting a customer use case where the cluster could be empty i.e. without any instance and just the storage layer. This helps customers not actively using the database and remove the compute layer which adds costs for them. This could potentially also help in migrating from an existing cluster to a serverless v2 supported cluster.
  • Decide on a default scenario for provisioning.
  • Decide on default minimum and maximum ACUs for serverless v2 instance. Should there be different defaults based on what database engine is selected?
  • Helper methods to add/remove/upgrade instances.
  • Metrics to detect failover in instances to alarm over and general database cluster metrics that can be helpful for the customers.

During implementation,

  • Testing that provisioning is working with all supported database engines.
  • Callouts in documentation about
    • Unreachable writer instance due to DNS caching if the instance was replaced due to a failover.
    • Cluster instances sizes are recommended to be similar to better handle a failover scenario.

@vinayak-kukreja
Copy link
Contributor

Hey @pahud, replied to some of the comments thread. Let me know if you have any queries.

@pahud
Copy link
Author

pahud commented Jan 26, 2023

Per discussion with the core team offline, I am closing this RFC in favor of a new one.

@pahud pahud closed this Jan 26, 2023
@evgenyka evgenyka added the l2-request request for new L2 construct label Oct 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
l2-request request for new L2 construct
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants