Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rds clone cluster helpers #33547

Merged
merged 26 commits into from Apr 1, 2020
Merged

Add rds clone cluster helpers #33547

merged 26 commits into from Apr 1, 2020

Conversation

sureshc
Copy link
Contributor

@sureshc sureshc commented Mar 10, 2020

Add RDS library with methods to clone and delete a cluster. This can be used by scheduled processes such as Contact Rollups or the MySQL --> Redshift export to provision a clone of the production Aurora cluster or by developers to simplify cloning a cluster for testing.

Background

https://codedotorg.atlassian.net/browse/INF-242

Reviewer Checklist:

  • Tests provide adequate coverage
  • Code is well-commented
  • New features are translatable or updates will not break translations
  • Relevant documentation has been added or updated
  • User impact is well-understood and desirable
  • Pull Request is labeled appropriately
  • Follow-up work items (including potential tech debt) are tracked and linked

@sureshc
Copy link
Contributor Author

sureshc commented Mar 11, 2020

@hacodeorg Will these methods help, if you need to create/delete a clone in Contact Rollups?

@sureshc sureshc marked this pull request as ready for review March 11, 2020 23:27
@sureshc sureshc requested a review from Hamms March 13, 2020 16:56
db_instance_class: instance_type,
engine: source_cluster.engine,
db_cluster_identifier: clone_cluster_id,
db_parameter_group_name: source_writer_instance.db_parameter_groups[0].db_parameter_group_name
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we duplicate the source cluster/instance Parameter Groups instead of re-using them? What's the risk of someone cloning the production cluster and then inadvertently modifying the production cluster when they experiment with changes to their clone's parameter group?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summarizing our discussion offline, our options are:

  1. Re-use source writer's parameter group (PG)
  2. Create a new temporary PG whenever clone_cluster is called, copied from the source writer's PG
  3. Create a separate, long-lived 'clone' PG to use for cloned clusters

We're leaning towards option 2 or 3, since option 1 risks someone modifying the clone's attached PG and negatively affecting the production cluster.

  • Option 2 would add an extra resource that would need to be deleted when deleting a cloned cluster;
  • Option 3 could cause conflicts if multiple cloned clusters need to be configured with different parameters.

I'd probably go with 3 for now, but I'm fine with either 2 or 3, whichever seems easier to implement/maintain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with Option 2 because it ensures that engineer cloning a cluster gets a new cluster that is as similar as possible to the one they're cloning without risking that they'll inadvertently change settings on the source cluster.

Copy link
Contributor

@wjordan wjordan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The slow-test and the rake-task environment-variable argument defaults definitely need to be looked into, the rest of the comments/suggestions are more optional.

db_instance_class: instance_type,
engine: source_cluster.engine,
db_cluster_identifier: clone_cluster_id,
db_parameter_group_name: source_writer_instance.db_parameter_groups[0].db_parameter_group_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summarizing our discussion offline, our options are:

  1. Re-use source writer's parameter group (PG)
  2. Create a new temporary PG whenever clone_cluster is called, copied from the source writer's PG
  3. Create a separate, long-lived 'clone' PG to use for cloned clusters

We're leaning towards option 2 or 3, since option 1 risks someone modifying the clone's attached PG and negatively affecting the production cluster.

  • Option 2 would add an extra resource that would need to be deleted when deleting a cloned cluster;
  • Option 3 could cause conflicts if multiple cloned clusters need to be configured with different parameters.

I'd probably go with 3 for now, but I'm fine with either 2 or 3, whichever seems easier to implement/maintain.

# @param source_cluster_id [String] DB cluster id of the cluster to clone. Defaults to current environment's cluster.
# @param clone_cluster_id [String] DB cluster id to assign to clone. Defaults to source cluster id + "-clone"
# @param instance_type [String]
def self.clone_cluster(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of class methods all accepting a cluster_id / source_cluster_id parameter, this could be refactored into a Cluster class with instance methods and an id instance variable.

If it simplifies things, this could possibly even inherit from the existing Aws::RDS::DBCluster resource class (but not necessary).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose to leave these as class methods. Cluster clones are typically long lived. I think it's rare that a caller is going to instantiate Cluster, call then clone method on it, and then later call the delete method on it.

Copy link
Contributor

@wjordan wjordan Apr 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify my suggestion wasn't about assuming a longshort-lived object lifecycle, I just thought it would result in more compact/simple code by removing some duplication across the class methods. Fine to leave as-is though, thanks for considering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I agree that the implementation of these 2 methods would be more compact. I'm thinking that usage of these methods would be less compact (instantiate + invoke method).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I understand now, yeah that's a good point.

Comment on lines 12 to 16
Cdo::RDS.clone_cluster(
source_cluster_id: ENV['SOURCE_CLUSTER_ID'],
clone_cluster_id: ENV['CLONE_CLUSTER_ID'],
instance_type: ENV['INSTANCE_TYPE']
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think keyword-argument use defaults instead of passed nil values, as the comments seem to imply.
To use the argument defaults instead of nil I think you'd need to do something like:

Suggested change
Cdo::RDS.clone_cluster(
source_cluster_id: ENV['SOURCE_CLUSTER_ID'],
clone_cluster_id: ENV['CLONE_CLUSTER_ID'],
instance_type: ENV['INSTANCE_TYPE']
)
options = {
source_cluster_id: ENV['SOURCE_CLUSTER_ID'],
clone_cluster_id: ENV['CLONE_CLUSTER_ID'],
instance_type: ENV['INSTANCE_TYPE']
}
Cdo::RDS.clone_cluster(options.compact)

I would at least manually test this, just to make sure it's doing what's expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've confirmed that keyword arguments support defaults and have tested, as well, to confirm that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that nil values don't override default keyword arguments? My basic testing suggests they do:

> def test(foo: 'bar'); foo end
=> :test
> test
=> "bar"
> test foo: ENV['NOT_SET']
=> nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah! I understand, your point. The logic I implemented definitely does not work correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

rds_client.wait_until(
:db_instance_deleted,
{db_instance_identifier: instance.db_instance_identifier},
{max_attempts: 20, delay: 60}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constant makes the unit test take 60 seconds to run despite the stubbed API requests. Make this delay a parameter (default 60) so it can be overridden in the unit test to 0.1 so it runs quickly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! I was wondering why the test was taking so long to execute.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line still doesn't seem to have been updated with the parameter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done for realz, now.

@clone_cluster_id = 'cluster-clone'
@cluster_to_delete_id = 'delete-me-cluster'

@source_cluster = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were these hashes manually composed, or auto-generated somehow? If manually composed, aside from the amount of time it must have taken to put these together (sunk cost), my concern moving forward is how we plan to maintain these fixtures over time as the rds API changes or evolves (as it inevitably will). Or if we wanted to go the auto-generated fixtures route, recording/replaying the tests through vcr might be another option.

Is it possible to trim down these fixtures so they only contain the relevant fields necessary to properly test the implementation, and leave out the rest of the noise? That might make this test easier to read/maintain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They were auto-generated. I invoked the APIs from the Ruby console and copied/pasted the responses (changing unique identifiers to anonymous values). I'll trim the unused attributes after implementing the parameter group cloning functionality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was useful to have anonymized copies of real API responses, particularly when I added functionality to copy/delete ParameterGroups. I've trimmed off most of the unused attributes on the stubbed data.

@sureshc sureshc requested a review from wjordan March 30, 2020 17:14
Copy link
Contributor

@wjordan wjordan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall! Left comments on a few small things (mostly optional style nits).
I'd also prefer a squash-rebase (followed by a force-push of the branch) to remove the intermediate/wip commits- but I don't think we've standardized on that as a team so also optional.

db_parameter_group_name: copy_source_writer_instance_parameter_group.db_parameter_group_name
}
)
# Wait 30 minutes. As of mid-2019, it takes about 15 minutes to provision a clone of the production cluster.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment can be removed now that this constant is a parameter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

lib/cdo/aws/rds.rb Outdated Show resolved Hide resolved
rds_client.wait_until(
:db_instance_deleted,
{db_instance_identifier: instance.db_instance_identifier},
{max_attempts: 20, delay: 60}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line still doesn't seem to have been updated with the parameter

{
source_db_cluster_parameter_group_identifier: source_cluster[:db_cluster_parameter_group],
target_db_cluster_parameter_group_description: "#{clone_cluster_id}-auroraclusterdbparameters",
target_db_cluster_parameter_group_identifier: "#{clone_cluster_id}-auroraclusterdbparameters",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(small/optional) you could make auroraclusterdbparameters (and also aurorawriterdbparameters) constants to make it more clear they're key (shared) strings and not merely descriptive

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I decided not to combine these 2 methods into a Cluster Class, I think a constant wouldn't work here? But I did update the clone_cluster method to use variables instead of repeating the same string.

source_writer_instance_identifier = source_cluster.
db_cluster_members.
select(&:is_cluster_writer).
first.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(style/nit) .find(&:x) could replace .select(&:x).first

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

source_cluster = rds_client.describe_db_clusters({db_cluster_identifier: source_cluster_id}).db_clusters.first

copy_source_cluster_parameter_group = rds_client.copy_db_cluster_parameter_group(
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(style/nit) you can remove braces when passing hash options as the last argument to a Ruby method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@sureshc sureshc merged commit e8fb7db into staging Apr 1, 2020
@fisher-alice fisher-alice deleted the add-rds-clone-cluster-helpers branch July 13, 2022 22:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants