Add rds clone cluster helpers #33547

sureshc · 2020-03-10T22:00:21Z

Add an RDS library with methods to clone and delete a cluster. Scheduled processes such as Contact Rollups or the MySQL --> Redshift export can use it to provision a clone of the production Aurora cluster, and developers can use it to simplify cloning a cluster for testing.

Background

https://codedotorg.atlassian.net/browse/INF-242

Reviewer Checklist:

Tests provide adequate coverage
Code is well-commented
New features are translatable or updates will not break translations
Relevant documentation has been added or updated
User impact is well-understood and desirable
Pull Request is labeled appropriately
Follow-up work items (including potential tech debt) are tracked and linked

sureshc · 2020-03-11T17:13:34Z

@hacodeorg Will these methods help if you need to create/delete a clone in Contact Rollups?

lib/cdo/aws/rds.rb

sureshc · 2020-03-13T23:12:39Z

lib/cdo/aws/rds.rb

+            db_instance_class: instance_type,
+            engine: source_cluster.engine,
+            db_cluster_identifier: clone_cluster_id,
+            db_parameter_group_name: source_writer_instance.db_parameter_groups[0].db_parameter_group_name


Should we duplicate the source cluster/instance Parameter Groups instead of re-using them? What's the risk of someone cloning the production cluster and then inadvertently modifying the production cluster when they experiment with changes to their clone's parameter group?

Summarizing our discussion offline, our options are:

1. Re-use source writer's parameter group (PG)

2. Create a new temporary PG whenever clone_cluster is called, copied from the source writer's PG

3. Create a separate, long-lived 'clone' PG to use for cloned clusters

We're leaning towards option 2 or 3, since option 1 risks someone modifying the clone's attached PG and negatively affecting the production cluster.

Option 2 would add an extra resource that would need to be deleted when deleting a cloned cluster.

Option 3 could cause conflicts if multiple cloned clusters need to be configured with different parameters.

I'd probably go with 3 for now, but I'm fine with either 2 or 3, whichever seems easier to implement/maintain.

I went with Option 2 because it ensures that an engineer cloning a cluster gets a new cluster that is as similar as possible to the one they're cloning, without the risk that they'll inadvertently change settings on the source cluster.
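A minimal sketch of what Option 2 implies for the clone/delete lifecycle, not the code from this PR. `RecordingClient` is a hypothetical stand-in for the AWS client so the example runs without credentials, and the helper names are assumptions. The `copy_db_parameter_group` and `delete_db_parameter_group` calls mirror the AWS SDK for Ruby operations of the same names, and the `aurorawriterdbparameters` suffix is taken from the review comments further down.

```ruby
# Hypothetical stand-in for Aws::RDS::Client that only records calls.
class RecordingClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def copy_db_parameter_group(params)
    @calls << [:copy_db_parameter_group, params]
  end

  def delete_db_parameter_group(params)
    @calls << [:delete_db_parameter_group, params]
  end
end

# Name of the parameter group owned by a clone (assumed naming scheme).
def writer_parameter_group_name(clone_cluster_id)
  "#{clone_cluster_id}-aurorawriterdbparameters"
end

# Option 2: copy the source writer's PG so the clone never shares it.
def copy_writer_parameter_group(client, source_group_name, clone_cluster_id)
  target = writer_parameter_group_name(clone_cluster_id)
  client.copy_db_parameter_group(
    source_db_parameter_group_identifier: source_group_name,
    target_db_parameter_group_identifier: target,
    target_db_parameter_group_description: target
  )
  target
end

# The extra cleanup step that Option 2 adds when a clone is deleted.
def delete_writer_parameter_group(client, clone_cluster_id)
  client.delete_db_parameter_group(
    db_parameter_group_name: writer_parameter_group_name(clone_cluster_id)
  )
end

client = RecordingClient.new
copied_name = copy_writer_parameter_group(client, 'production-writer-params', 'production-clone')
delete_writer_parameter_group(client, 'production-clone')
```

The copy gives each clone its own group to experiment with, at the cost of one more resource to delete alongside the cluster.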

wjordan

The slow test and the rake-task environment-variable argument defaults definitely need to be looked into; the rest of the comments/suggestions are more optional.

wjordan · 2020-03-17T21:14:45Z

wjordan · 2020-03-17T21:26:35Z

lib/cdo/aws/rds.rb

+    # @param source_cluster_id [String] DB cluster id of the cluster to clone.  Defaults to current environment's cluster.
+    # @param clone_cluster_id [String] DB cluster id to assign to clone.  Defaults to source cluster id + "-clone"
+    # @param instance_type [String]
+    def self.clone_cluster(


Instead of class methods all accepting a cluster_id / source_cluster_id parameter, this could be refactored into a Cluster class with instance methods and an id instance variable.

If it simplifies things, this could possibly even inherit from the existing Aws::RDS::DBCluster resource class (but not necessary).

I chose to leave these as class methods. Cluster clones are typically long-lived. I think it's rare that a caller is going to instantiate Cluster, call the clone method on it, and then later call the delete method on it.

Just to clarify, my suggestion wasn't about assuming a short-lived object lifecycle; I just thought it would result in more compact/simple code by removing some duplication across the class methods. Fine to leave as-is though, thanks for considering.

True. I agree that the implementation of these 2 methods would be more compact. I'm thinking that usage of these methods would be less compact (instantiate + invoke method).

Ah, I understand now, yeah that's a good point.
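The trade-off discussed in this thread can be sketched as follows. Both shapes are hypothetical illustrations that return strings instead of calling AWS; `ClassMethodStyle`, `Cluster`, and `clone_to` are names invented for this example, and only the default clone id (source id plus "-clone") comes from the documented parameters above.

```ruby
# Class-method style, as kept in the PR: every call passes the cluster id.
module ClassMethodStyle
  def self.clone_cluster(source_cluster_id:, clone_cluster_id: "#{source_cluster_id}-clone")
    "clone #{source_cluster_id} -> #{clone_cluster_id}"
  end

  def self.delete_cluster(cluster_id:)
    "delete #{cluster_id}"
  end
end

# Instance style, as suggested in review: the id is stored once.
class Cluster
  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Returns a Cluster object representing the clone.
  def clone_to(clone_id = "#{id}-clone")
    Cluster.new(clone_id)
  end

  def delete
    "delete #{id}"
  end
end

# One-shot usage is a single call with class methods...
class_style = ClassMethodStyle.clone_cluster(source_cluster_id: 'production')

# ...and an instantiate-then-invoke pair with the instance style.
instance_style = Cluster.new('production').clone_to
```

The instance style removes the repeated id parameter from the implementation, while the class-method style keeps one-off callers, such as a rake task, to a single call.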

wjordan · 2020-03-17T21:37:22Z

lib/rake/rds.rake

+    Cdo::RDS.clone_cluster(
+      source_cluster_id: ENV['SOURCE_CLUSTER_ID'],
+      clone_cluster_id: ENV['CLONE_CLUSTER_ID'],
+      instance_type: ENV['INSTANCE_TYPE']
+    )


I don't think keyword arguments fall back to their defaults when they are passed nil values, as the comments seem to imply.
To use the argument defaults instead of nil, I think you'd need to do something like:

Suggested change

-    Cdo::RDS.clone_cluster(
-      source_cluster_id: ENV['SOURCE_CLUSTER_ID'],
-      clone_cluster_id: ENV['CLONE_CLUSTER_ID'],
-      instance_type: ENV['INSTANCE_TYPE']
-    )
+    options = {
+      source_cluster_id: ENV['SOURCE_CLUSTER_ID'],
+      clone_cluster_id: ENV['CLONE_CLUSTER_ID'],
+      instance_type: ENV['INSTANCE_TYPE']
+    }
+    Cdo::RDS.clone_cluster(options.compact)

I would at least manually test this, just to make sure it's doing what's expected.

I've confirmed that keyword arguments support defaults, and I've tested this as well.

Are you sure that nil values don't override default keyword arguments? My basic testing suggests they do:

> def test(foo: 'bar'); foo end
=> :test
> test
=> "bar"
> test foo: ENV['NOT_SET']
=> nil

Ah! I understand your point. The logic I implemented definitely does not work correctly.
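A self-contained sketch of the behaviour settled in this thread. The `clone_cluster` method here is a hypothetical stand-in with illustrative defaults, not the real `Cdo::RDS.clone_cluster`, and a plain hash replaces `ENV` so the result does not depend on the shell. The double splat (`**`) is used so the call also works on Ruby 3, where a positional hash is no longer converted to keyword arguments.

```ruby
# Hypothetical stand-in; the default values are illustrative only.
def clone_cluster(source_cluster_id: 'default-cluster',
                  clone_cluster_id: "#{source_cluster_id}-clone",
                  instance_type: 'db.r4.large')
  {
    source_cluster_id: source_cluster_id,
    clone_cluster_id: clone_cluster_id,
    instance_type: instance_type
  }
end

# Simulated environment in which only SOURCE_CLUSTER_ID is set.
env = {'SOURCE_CLUSTER_ID' => 'production'}

options = {
  source_cluster_id: env['SOURCE_CLUSTER_ID'],
  clone_cluster_id: env['CLONE_CLUSTER_ID'],
  instance_type: env['INSTANCE_TYPE']
}

# Explicit nil values override the keyword defaults.
with_nils = clone_cluster(**options)

# Hash#compact drops the nil entries, so the defaults apply again.
with_defaults = clone_cluster(**options.compact)
```

With the nil entries left in, `clone_cluster_id` and `instance_type` arrive as nil; after `compact` they fall back to the defaults declared in the method signature.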

wjordan · 2020-03-17T21:57:37Z

lib/cdo/aws/rds.rb

+          rds_client.wait_until(
+            :db_instance_deleted,
+            {db_instance_identifier: instance.db_instance_identifier},
+            {max_attempts: 20, delay: 60}


This constant makes the unit test take 60 seconds to run despite the stubbed API requests. Make this delay a parameter (default 60) so it can be overridden in the unit test to 0.1 so it runs quickly.

Done! I was wondering why the test was taking so long to execute.

This line still doesn't seem to have been updated with the parameter

Done for realz, now.
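One way the requested change could look, sketched under assumptions. `FakeRdsClient` and `wait_for_instance_deletion` are hypothetical names; the fake only records arguments and never sleeps. The three-argument `wait_until(waiter_name, params, options)` shape and the `max_attempts` / `delay` options match the diff quoted above.

```ruby
# Hypothetical fake client: records waiter calls instead of polling AWS.
class FakeRdsClient
  attr_reader :waiter_calls

  def initialize
    @waiter_calls = []
  end

  def wait_until(waiter_name, params = {}, options = {})
    @waiter_calls << {waiter_name: waiter_name, params: params, options: options}
    true
  end
end

# The waiter settings are keyword parameters with production-friendly
# defaults, so a test can pass a much shorter delay.
def wait_for_instance_deletion(rds_client, instance_identifier, max_attempts: 20, delay: 60)
  rds_client.wait_until(
    :db_instance_deleted,
    {db_instance_identifier: instance_identifier},
    {max_attempts: max_attempts, delay: delay}
  )
end

client = FakeRdsClient.new
wait_for_instance_deletion(client, 'delete-me-instance')              # production defaults
wait_for_instance_deletion(client, 'delete-me-instance', delay: 0.1)  # fast unit test
```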

wjordan · 2020-03-17T22:27:15Z

shared/test/test_rds.rb

+    @clone_cluster_id = 'cluster-clone'
+    @cluster_to_delete_id = 'delete-me-cluster'
+
+    @source_cluster = {


Were these hashes manually composed, or auto-generated somehow? If manually composed, aside from the amount of time it must have taken to put these together (sunk cost), my concern moving forward is how we plan to maintain these fixtures over time as the RDS API changes or evolves (as it inevitably will). If we wanted to go the auto-generated fixtures route, recording/replaying the tests through VCR might be another option.

Is it possible to trim down these fixtures so they only contain the relevant fields necessary to properly test the implementation, and leave out the rest of the noise? That might make this test easier to read/maintain.

They were auto-generated. I invoked the APIs from the Ruby console and copied/pasted the responses (changing unique identifiers to anonymous values). I'll trim the unused attributes after implementing the parameter group cloning functionality.

It was useful to have anonymized copies of real API responses, particularly when I added functionality to copy/delete ParameterGroups. I've trimmed off most of the unused attributes on the stubbed data.
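A small sketch of the trimming step, assuming the recorded response is held as a plain hash. Every value below is invented for illustration, and `RELEVANT_CLUSTER_KEYS` is a hypothetical name; which fields the real tests need is not stated in this thread.

```ruby
# Fields the implementation under test is assumed to read (illustrative).
RELEVANT_CLUSTER_KEYS = %i[
  db_cluster_identifier
  engine
  db_cluster_parameter_group
  db_cluster_members
].freeze

# Anonymized response as it might be pasted from a console session.
recorded_response = {
  db_cluster_identifier: 'source-cluster',
  engine: 'aurora',
  db_cluster_parameter_group: 'source-cluster-params',
  db_cluster_members: [
    {db_instance_identifier: 'source-writer', is_cluster_writer: true}
  ],
  # Noise the test never reads:
  backup_retention_period: 7,
  preferred_backup_window: '08:00-08:30',
  storage_encrypted: true
}

# Hash#slice keeps only the listed keys (Ruby 2.5+).
trimmed_fixture = recorded_response.slice(*RELEVANT_CLUSTER_KEYS)
```

Keeping the list of relevant keys next to the fixture makes it clearer which parts of the API response the test actually depends on.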

wjordan

Looks good overall! Left comments on a few small things (mostly optional style nits).
I'd also prefer a squash-rebase (followed by a force-push of the branch) to remove the intermediate/WIP commits, but I don't think we've standardized on that as a team, so that's also optional.

wjordan · 2020-03-31T17:29:54Z

lib/cdo/aws/rds.rb

+            db_parameter_group_name: copy_source_writer_instance_parameter_group.db_parameter_group_name
+          }
+        )
+        # Wait 30 minutes.  As of mid-2019, it takes about 15 minutes to provision a clone of the production cluster.


This comment can be removed now that this constant is a parameter

wjordan · 2020-03-31T17:33:03Z

wjordan · 2020-03-31T17:45:31Z

lib/cdo/aws/rds.rb

+          {
+            source_db_cluster_parameter_group_identifier: source_cluster[:db_cluster_parameter_group],
+            target_db_cluster_parameter_group_description: "#{clone_cluster_id}-auroraclusterdbparameters",
+            target_db_cluster_parameter_group_identifier: "#{clone_cluster_id}-auroraclusterdbparameters",


(small/optional) you could make auroraclusterdbparameters (and also aurorawriterdbparameters) constants to make it more clear they're key (shared) strings and not merely descriptive

Because I decided not to combine these 2 methods into a Cluster Class, I think a constant wouldn't work here? But I did update the clone_cluster method to use variables instead of repeating the same string.
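For reference, Ruby constants defined on a module or class are visible from its class methods, so the suggestion does not depend on introducing a Cluster class. A minimal sketch, with `RdsNaming` and the constant and method names being hypothetical; only the two suffix strings come from the review comment above.

```ruby
module RdsNaming
  # Shared suffixes used when both creating and deleting parameter groups.
  CLUSTER_PARAMETER_GROUP_SUFFIX = 'auroraclusterdbparameters'.freeze
  WRITER_PARAMETER_GROUP_SUFFIX = 'aurorawriterdbparameters'.freeze

  # Class methods can reference the constants directly.
  def self.cluster_parameter_group_name(cluster_id)
    "#{cluster_id}-#{CLUSTER_PARAMETER_GROUP_SUFFIX}"
  end

  def self.writer_parameter_group_name(cluster_id)
    "#{cluster_id}-#{WRITER_PARAMETER_GROUP_SUFFIX}"
  end
end

cluster_group = RdsNaming.cluster_parameter_group_name('production-clone')
writer_group = RdsNaming.writer_parameter_group_name('production-clone')
```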

wjordan · 2020-03-31T17:48:41Z

lib/cdo/aws/rds.rb

+        source_writer_instance_identifier = source_cluster.
+          db_cluster_members.
+          select(&:is_cluster_writer).
+          first.


(style/nit) .find(&:x) could replace .select(&:x).first
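A quick illustration of the nit, using a hypothetical `Member` struct in place of the SDK's cluster member objects. Both forms return the first element for which the block is truthy; `find` stops at the first match instead of building an intermediate array.

```ruby
# Hypothetical stand-in for the entries in db_cluster_members.
Member = Struct.new(:db_instance_identifier, :is_cluster_writer)

members = [
  Member.new('reader-1', false),
  Member.new('writer-1', true),
  Member.new('reader-2', false)
]

via_select = members.select(&:is_cluster_writer).first
via_find = members.find(&:is_cluster_writer)
```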

wjordan · 2020-03-31T18:39:59Z

lib/cdo/aws/rds.rb

+        source_cluster = rds_client.describe_db_clusters({db_cluster_identifier: source_cluster_id}).db_clusters.first
+
+        copy_source_cluster_parameter_group = rds_client.copy_db_cluster_parameter_group(
+          {


(style/nit) you can remove braces when passing hash options as the last argument to a Ruby method
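The nit above can be checked with a small sketch. `describe` is a hypothetical method that takes a single options hash, similar in shape to the client calls in the diff; when a method declares no keyword parameters, Ruby collects trailing `key: value` arguments into one hash, so both spellings pass the same argument.

```ruby
# Hypothetical method taking one positional options hash.
def describe(params = {})
  params
end

with_braces = describe({db_cluster_identifier: 'source-cluster'})
without_braces = describe(db_cluster_identifier: 'source-cluster')
```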

sureshc added 11 commits March 2, 2020 15:56

stub RDS library

9156b25

stash

1f32ef3

Add delete_cluster method and improve/fix clone_cluster method.

d726305

Merge branch 'staging' into add-rds-clone-cluster-helpers

8d5df8e

Add custom waiter for deleting a cluster.

4e4a413

Increase delete cluster wait time.

7671ec8

Factories for tests.

0a61af2

Fix delete_cluster waiter and add clone_cluster test.

1e95527

Add client stubs and test for delete_cluster.

64e6bf5

Switch to using the error code instead of the fully qualified error class.

ed4291f

Fix attributes copied from source cluster.

037b4a0

sureshc requested review from hacodeorg and wjordan March 11, 2020 17:13

sureshc added 3 commits March 11, 2020 10:28

Merge branch 'staging' into add-rds-clone-cluster-helpers

7f1e01e

Add RDS policy to production daemon.

b7b3925

Add rake tasks that clone / delete clusters.

27917ee

hacodeorg reviewed Mar 11, 2020

sureshc marked this pull request as ready for review March 11, 2020 23:27

sureshc requested a review from Hamms March 13, 2020 16:56

sureshc commented Mar 13, 2020

wjordan suggested changes Mar 17, 2020

sureshc added 8 commits March 18, 2020 11:40

Merge branch 'staging' into add-rds-clone-cluster-helpers

ab7c8ff

Use a db_clone CloudFormation template to provision clone of a db cluster.

750dbc1

Remove debug logic.

63567b2

Add waiter configuration arguments to clone_cluster method and remove attempt to use Cloud Formation to provision cluster clone.

f872ef0

Fix syntax error.

6bb66d5

Fix it for real this time.

016442e

Add waiter arguments to delete_cluster method.

d718e86

Update clone_cluster method to create & use copies of the source cluster's Parameter Groups and update `delete_cluster` to delete its Parameter Groups (if it appears to own them).

431615a

sureshc added 3 commits March 27, 2020 15:39

Fix for missing environment variables resulting in arguments getting set to nil instead of their default values.

a8274fe

Fix stub.

76a472b

Trim stubbed API response data.

6c02392

sureshc requested a review from wjordan March 30, 2020 17:14

wjordan approved these changes Apr 1, 2020

Clean up based on feedback in Pull Request.

199fb4e

sureshc merged commit e8fb7db into staging Apr 1, 2020

sureshc mentioned this pull request Apr 2, 2020

Add permission to copy/delete ParameterGroups. #33999

Merged

7 tasks

This was referenced Apr 10, 2020

Fix clone cluster policy #34161

Merged

Fix clone cluster policy (again) #34171

Merged

fisher-alice deleted the add-rds-clone-cluster-helpers branch July 13, 2022 22:26
