Add support for moving the snapshot to a S3 bucket #4
Comments
Hi! I would like to work on this issue. It would be great if I could get more info on where we plan to add this step in the state machine.
@namitad @stationeros this should be reopened, as there are 24 regions in total (https://aws.amazon.com/about-aws/global-infrastructure/#:~:text=AWS%20Global%20Infrastructure%20Map,Indonesia%2C%20Japan%2C%20and%20Spain.), of which only 5 support exporting a snapshot to S3 (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ExportSnapshot.html). The following should be done to get the rest of the ~80%: I have already created a PR for the non-cluster part: #34. Kindly review it, and if it is OK, I will do the cluster part as well.
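The workaround discussed here (copy the snapshot to a region that supports export, then export from there) can be sketched as a small planning step. This is only an illustration: `plan_export`, the step dictionaries, and the placeholder region names are hypothetical and are not taken from PR #34. The set of export-capable regions is passed in rather than hard-coded, since it should come from the AWS documentation linked above.

```python
def plan_export(source_region, export_regions, fallback_region):
    """Return the ordered steps needed to export a snapshot to S3.

    All names here are illustrative assumptions, not Trapheus APIs.
    """
    if source_region in export_regions:
        # Export is available where the snapshot already lives.
        return [{"action": "export", "region": source_region}]
    if fallback_region not in export_regions:
        raise ValueError(f"{fallback_region} does not support snapshot export")
    # Otherwise copy the snapshot across regions first, then export there.
    return [
        {"action": "copy", "from": source_region, "to": fallback_region},
        {"action": "export", "region": fallback_region},
    ]


# Placeholder region names; substitute the real supported regions.
steps = plan_export("region-x", {"region-a", "region-b"}, "region-a")
```

When the feature is not used, no copy step is planned, so the existing resources and execution time are unaffected.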
Thanks @despot for notifying us about this. While I am aware of the support of the snapshot export function typically being limited to five regions, due to the limit on the number of access control list (ACL) entries for a VPC, I am also mindful of the fact that the export across region will induce a certain lag primarily due to the longer network channels between regional data centers. IMHO the export to S3 feature can be looked at from the purview of offloading snapshots to S3 for long-term retention, typically suitable for someone who wants to create data lakes and have a way to query their snapshots easily. (P.S. I do see Trapheus branching into that someday as well.) However, as it stands today, this is not a good fit if you would like to import data back into RDS or use the exported data to restore an RDS instance or cluster. Having said that, if we look at our implementation, we do see a potential issue wherein the async export might take more time than the entire state machine takes to execute (and might run into a timeout), given the size of the data and various other constraints of different vendors. The chances of this increase severalfold in the case of a cross-region export. On that front, I have a few larger questions
Do let me know your thoughts.
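One way to keep a long-running async export from blocking the state machine is to poll instead of wait: a short task reports a coarse state, and a Wait/Choice loop re-invokes it until the export is finished. The sketch below assumes boto3-style clients passed in by the caller. The function names, the SNS topic, and the choice of SNS for notification are assumptions for illustration, not the current Trapheus design. The status strings are the export task statuses as I understand them from the RDS API.

```python
def export_task_state(rds_client, task_id):
    """Map the RDS export task status to a coarse state for a polling loop."""
    response = rds_client.describe_export_tasks(ExportTaskIdentifier=task_id)
    tasks = response.get("ExportTasks", [])
    if not tasks:
        return "NOT_FOUND"
    status = tasks[0].get("Status", "").upper()
    if status == "COMPLETE":
        return "DONE"
    if status in ("FAILED", "CANCELING", "CANCELED"):
        return "FAILED"
    # STARTING, IN_PROGRESS, or anything unrecognised: keep waiting.
    return "IN_PROGRESS"


def notify_if_finished(sns_client, topic_arn, task_id, state):
    """Publish a message once the export has reached a terminal state."""
    if state not in ("DONE", "FAILED"):
        return None  # still running or unknown task: nothing to announce yet
    # SNS requires subjects shorter than 100 characters.
    subject = f"RDS snapshot export {state}"[:99]
    message = f"Export task '{task_id}' finished with state {state}."
    response = sns_client.publish(
        TopicArn=topic_arn, Subject=subject, Message=message
    )
    return response.get("MessageId")
```

Because each invocation returns quickly, the time limit of a single step no longer depends on the size of the data being exported; only the overall loop does.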
Hi @stationeros, thanks for the review and thoughts. Legend: "While I am aware of the support of the snapshot export function typically being limited to five regions, due to the limit on the number of access control list (ACL) entries for a VPC, I am also mindful of the fact that the export across region will induce a certain lag primarily due to the longer network channels between regional data centers." Additionally, as it is built now, the user can decide whether or not to use the feature from my PR. If they don't use it, the resources, commands and time of execution stay the same. This is just a helper in case they need it. So no loss, just gain. "1. Is there a way to benchmark the cross region approach in terms of any standard performance metrics."
"2. Is there a way to notify users (via slack or email) that the export to S3 has been successful." Kind regards,
Thanks @despot, sorry I couldn't catch up on this because of vacation time. Agreed that more rigorous benchmarking does warrant some dedicated time, although I am glad that we have started the discussion in that direction. Also, to answer the point "Why wouldn't we allow someone to export to a S3 export supporting region, if they have the resources? Wouldn't it be better for the system to have this as an option regardless of how many use-cases there are for it?": I am just being ridiculously cautious about putting sleeper features in the codebase without any particular roadmap for delivering value from them, although that doesn't seem to be the case here, now that we have discovered the use case of building data lakes and the like. Your initial analysis with an empty RDS instance, albeit not very conclusive, does look like a step in the right direction. I will branch out a separate issue for the benchmark tests and hope to get to a better place in time.
... @stationeros and I hope you had a great vacation ;) @stationeros @namitad @RiyaJohn as we are collaborating, feel free to connect with me on LinkedIn: First name: Despot Last name: Jakimovski.
@despot Thanks for reaching out. Yes, I was finally able to catch a break in these desperate times :). Will be glad to connect.
Right now the snapshot that is created is stored in the source AWS account. We want to add support for storing this snapshot in an S3 bucket, thereby opening up additional DR use cases.
Please refer to the doc below.
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ExportSnapshot.html
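As a rough sketch of what the export step could look like, the function below starts an export through a boto3-style RDS client that the caller passes in. `start_snapshot_export` and its parameter names are hypothetical; the keyword arguments follow the RDS `start_export_task` call as I understand it from the AWS SDK documentation. The IAM role, KMS key, and bucket are assumed to exist already.

```python
def start_snapshot_export(rds_client, snapshot_arn, bucket, role_arn,
                          kms_key_id, task_id, prefix=""):
    """Kick off an asynchronous export of an RDS snapshot to S3.

    Returns the initial task status reported by the service; the export
    itself continues in the background.
    """
    params = {
        "ExportTaskIdentifier": task_id,
        "SourceArn": snapshot_arn,
        "S3BucketName": bucket,
        "IamRoleArn": role_arn,
        "KmsKeyId": kms_key_id,
    }
    if prefix:
        # Optional key prefix inside the bucket.
        params["S3Prefix"] = prefix
    response = rds_client.start_export_task(**params)
    return response.get("Status")
```

The call only starts the export, so a state machine step built around it would need a separate check for completion.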