Support for moving snapshots to s3 #5
Conversation
Minor comments, please address.
Also, do add the unit test file for the newly added lambda.
snap_arn = snap['DBSnapshotArn']
snap_status = snapshots[0]['Status']
if snap_status == 'available':
    return snap_arn
Can we not directly return snap['DBSnapshotArn'] if the snapshot status is available?
The ARN will be available even in the creating state, so I couldn't place it inside the if case with the cluster support changes now. Let me know if that's not ok.
The above comment will address this issue as well, I believe. Do check and verify.
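A minimal sketch of the direct-return shape the reviewer is suggesting (the function name and test data here are illustrative, not the PR's actual code; the real lambda would build `snapshots` from a boto3 describe call):

```python
def snapshot_arn_if_available(snapshots):
    """Return the snapshot ARN directly once the snapshot is available.

    `snapshots` is assumed to be the 'DBSnapshots' list returned by
    rds.describe_db_snapshots; raise while the snapshot is still creating.
    """
    snap = snapshots[0]
    if snap['Status'] == 'available':
        # Return the ARN in place, instead of assigning it to a local
        # variable first, as suggested in the review.
        return snap['DBSnapshotArn']
    raise Exception(f"Snapshot is not available yet, status is {snap['Status']}")
```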
@namitad resolved the conflicts that came up. Do take a look at the PR; it has the review changes, and support for cluster snapshots is also added.
region = os.environ['Region']
rds = boto3.client('rds', region)
if is_cluster:
    snapshots_response = rds.describe_db_cluster_snapshots(DBClusterSnapshotIdentifier=snapshot_name)
The pattern followed is to have separate .py files for cluster and instance, as they have completely independent branches, and the isCluster check is added at the very beginning.
Please check the pattern followed and move the cluster-specific logic to its own file. Any common processing can be added in the common dir.
Refactored to have separate .py files for cluster and instance, with common functions in a util file.
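One way the shared helper could be shaped after that refactor (a sketch under assumptions; the function and parameter names are hypothetical, not the PR's actual util code):

```python
# util.py - common processing shared by the instance and cluster lambdas
def extract_snapshot_arn(snapshots, arn_key):
    """Validate a describe-snapshots result list and return the ARN.

    The instance module would pass arn_key='DBSnapshotArn'; the cluster
    module would pass arn_key='DBClusterSnapshotArn'. Everything else
    (single-match check, availability check) is identical for both.
    """
    assert len(snapshots) == 1, f"Expected exactly one snapshot, got {len(snapshots)}"
    snap = snapshots[0]
    if snap['Status'] != 'available':
        raise Exception(f"Snapshot is not available yet, status is {snap['Status']}")
    return snap[arn_key]
```

This keeps the cluster and instance lambdas as thin wrappers that differ only in the boto3 describe call and the ARN key they pass in.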
snap_arn = snap['DBSnapshotArn']
snap_status = snapshots[0]['Status']
if snap_status == 'available':
    return snap_arn
The above comment will address this issue as well, I believe. Do check and verify.
response = rds.start_export_task(
    ExportTaskIdentifier=export_id,
    SourceArn=snapshot_arn,
    S3BucketName=bucket_name,
Why don't we use the same bucket name that was provided by the user in https://github.com/intuit/Trapheus/blob/master/README.md -> Instructions -> Setup -> point 2. ?
We can use that too, but that bucket is meant for the Trapheus CloudFormation deployment files, right? So I did not want to store the snapshots there. What do you think, @namitad and @stationeros?
Also, I'm not sure how we can get that bucket name in our lambdas unless the user passes it in the params, which I don't think should be done.
The S3 bucket name needs to be an input, as it's not part of the state machine while executing sam deploy.
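For context, the export call under discussion takes the bucket name alongside several other parameters that boto3's start_export_task API requires (ExportTaskIdentifier, SourceArn, S3BucketName, IamRoleArn, KmsKeyId). A sketch of assembling them, with a hypothetical helper name not taken from the PR:

```python
def build_export_task_params(export_id, snapshot_arn, bucket_name,
                             iam_role_arn, kms_key_id):
    """Assemble the keyword arguments for rds.start_export_task.

    All five parameters are required by the API; bucket_name is the
    user-supplied input being discussed above.
    """
    return {
        'ExportTaskIdentifier': export_id,
        'SourceArn': snapshot_arn,
        'S3BucketName': bucket_name,
        'IamRoleArn': iam_role_arn,
        'KmsKeyId': kms_key_id,
    }

# Usage (sketch): response = rds.start_export_task(**build_export_task_params(...))
```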
src/common/python/constants.py
DELETE = "Delete"
RENAME = "Rename"
SNAPSHOT = "SnapshotCreation"
DB_RESTORE = "Restore"
CLUSTER_RESTORE = "ClusterRestore"
EXPORT_SNAPSHOT = "SnapshotExportTask"
Can we remove the "Task" from "SnapshotExportTask"? The name doesn't follow the convention.
yes, will do
done, 4d22883
@@ -0,0 +1,52 @@
import os

import time
Remove the whitespace between the imports.
@stationeros So the blank lines between imports are added by PyCharm's import layout when you run the Optimize Imports command.
rds = boto3.client('rds', region)
snapshots_response = rds.describe_db_cluster_snapshots(DBClusterSnapshotIdentifier=snapshot_name)
assert snapshots_response['ResponseMetadata'][
    'HTTPStatusCode'] == 200, f"Error fetching cluster snapshots: {snapshots_response}"
There is an illegal character "f" present on line 44. Can you please check that?
assert snapshots_response['ResponseMetadata'][
    'HTTPStatusCode'] == 200, f"Error fetching cluster snapshots: {snapshots_response}"
snapshots = snapshots_response['DBClusterSnapshots']
assert len(snapshots) == 1, f"More than one snapshot matches name {snapshot_name}"
I think @namitad already added info around it in the above comment. Just to add to it: f-strings have a shorter, more readable syntax and are also much faster. The con is that they require at least Python 3.6, but since the Trapheus doc already lists Python 3.7 as a prerequisite, I thought it wasn't a problem.
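A quick illustration of the f-string syntax being flagged, side by side with the equivalent older str.format call (Python 3.6+; the sample values are illustrative):

```python
snapshot_name = 'db-snapshot'

# Pre-3.6 style: positional placeholder filled by str.format
old_style = "More than one snapshot matches name {}".format(snapshot_name)

# f-string: the expression is embedded directly in the literal;
# the leading "f" is part of the string-literal syntax, not a stray character
new_style = f"More than one snapshot matches name {snapshot_name}"

assert old_style == new_style
```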
if snap_status == 'available':
    return snap['DBClusterSnapshotArn']
else:
    raise Exception(f"Snapshot is not available yet, status is {snap_status}")
Illegal character. Please check.
snapshot_id = instance_id + constants.SNAPSHOT_POSTFIX
snapshot_arn = get_instance_snapshot_arn(snapshot_id)
account_id = util.get_aws_account_id()
bucket_name = constants.RDS_SNAPSHOTS_BUCKET_NAME_PREFIX + account_id
@RiyaJohn is it assumed that the S3 bucket already exists? And if it doesn't, does the export task throw an exception?
Can we add a test case for this scenario?
The bucket will exist, as it is created as part of our deployment here:
Line 370 in 4d22883, SnapshotsBucket:
It doesn't throw an exception when the bucket doesn't exist, because the deployment always creates one (which IMO is correct, even better). IMO there is no need to handle a case that will not appear (having no export bucket), as that would assume the user tampering manually; systems usually don't handle those scenarios. Anyway, just my 2 cents.
@despot agreed. If it's being created as part of the deployment, I think we are ok. But considering the scenario that the S3 bucket can be deleted, do you think it's good to have that handled?
@namitad thanks for considering my opinion. I was trying to find some articles about what I already practiced within the team in the past, though I would need to dig deeper. The idea is that an application usually refers to doing one thing, and as soon as the team starts handling cases (including exception handling alone) for what a user can do outside of the prescribed way, the team heads in the direction of solving the problems of the universe. More pragmatically, the team wouldn't want to handle someone manually deleting a lambda function or the kmsKeyId, etc., as this is not what users should do. So I don't think a test should be written for the user deleting the S3 bucket. It all comes down to what you want the application to do: is a user deleting an S3 bucket a normal behavior that you expect to handle? I don't imagine this app like that. But if you do, then either handle the case (by, for instance, creating a new S3 bucket) or raise a custom exception so you remember to do this in the future. As I think this shouldn't be handled, IMO @RiyaJohn should not do anything more regarding the S3 bucket. But again, you are leading the project :), so let us know.
Description
Thanks for contributing this Pull Request. Make sure that you submit this Pull Request against the master branch of this repository, add a brief description, and tag the relevant issue(s) and PR(s) below.
Fixes #4