HBASE-28957: Adding support for continuous Backup and Point-in-Time Recovery #7445
base: master
Conversation
…p to External Storage (#6633) * HBASE-28996: Implement Custom ReplicationEndpoint to Enable WAL Backup to External Storage * fix spotless error
…ckup (#6710) * HBASE-29025: Enhance the full backup command to support continuous backup * add new check for full backup command regards to continuous backup flag * minor fixes
…6848) Signed-off-by: Andor Molnár <andor@apache.org> Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
…ritical backups and propose correct approach (#6922) * improve the logic of backup deletion validation of PITR-critical backups * add new tests
Signed-off-by: Andor Molnar <andor@apache.org> Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
…nd (#7007) * Store bulkload files in daywise bucket as well * Integrate backup WAL cleanup logic with the delete command * address the review comments * address the review comments * address the review comments * add more unit tests to cover all cases * address the review comments
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Signed-off-by: Andor Molnár <andor@apache.org> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com> Reviewed by: Vinayak Hegde <vinayakph123@gmail.com> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
…up is Force Deleted (#7090) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
#7106) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
…inuous Backup (#7119) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
…tries handling to ReplicationEndpoint (#7145) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
…g Incremental Backup (#7166) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
…larity (#7171) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Signed-off-by: Andor Molnár <andor@apache.org>
…7239) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>
#7246) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Signed-off-by: Andor Molnár <andor@apache.org> Reviewed by: Kevin Geiszler <kevin.j.geiszler@gmail.com> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>
… backup (#7400) * Scan WALs to identify bulkload operations for incremental backup * Update unit test * Info log * Minor test fix * Address review comments * Spotless apply * Addressed review comment * spotless * Remove log * Retrigger CI --------- Co-authored-by: Ankit Solomon <asolomon@cloudera.com>
🎊 +1 overall
This message was automatically generated.
import org.apache.yetus.audience.InterfaceAudience;

@InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.REPLICATION)
public enum ReplicationResult {
I still do not think we need to introduce this new class...
The current implementation updates the WAL position every time we finish sending out a batch. In general I think this could be changed; FWIW, we can increase the batch size to reduce how often we persist the log position, so it is not a big problem if we do not always persist the position.
This way, we could add a callback for when we actually want to persist the log position, to commit the data to external storage. I think the logic would be clearer this way.
> I still do not think we need to introduce this new class...
> The current implementation updates the WAL position every time we finish sending out a batch. In general I think this could be changed; FWIW, we can increase the batch size to reduce how often we persist the log position, so it is not a big problem if we do not always persist the position.

I'm not sure I understand this. We're streaming the data to the S3 object, but we don't want to persist the WAL position every time a batch is shipped, only when we close the stream and open a new one.
In order to persist the WAL position every time a batch is shipped, the batch size would have to be equal to the desired S3 object size.

> This way, we could add a callback for when we actually want to persist the log position, to commit the data to external storage. I think the logic would be clearer this way.

I think this could work. The callback is passed to the shipper, which will call it when it wants to persist. @vinayakphegde @ankitsol wdyt?
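For illustration only, here is a minimal sketch of the callback idea being discussed, assuming the shipper hands the endpoint a "persist the position now" callback. The names (PositionCommitCallback, WalBackupEndpointSketch, onBackupFileClosed) are invented for this sketch and are not part of the PR.

```java
import java.io.IOException;

/**
 * Hypothetical callback handed to the replication endpoint by the shipper.
 * Invoking it tells the framework it is now safe to persist the WAL position.
 */
@FunctionalInterface
interface PositionCommitCallback {
  void commitPosition() throws IOException;
}

/** Sketch of how a backup endpoint might use such a callback. */
class WalBackupEndpointSketch {
  private final PositionCommitCallback commitCallback;

  WalBackupEndpointSketch(PositionCommitCallback commitCallback) {
    this.commitCallback = commitCallback;
  }

  /** Hypothetical hook, called when the current backup object (e.g. an S3 object) is closed. */
  void onBackupFileClosed() throws IOException {
    // The shipped data is now durable in external storage, so the framework
    // may advance and persist the WAL position.
    commitCallback.commitPosition();
  }
}
```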
What I mean is that our current mechanism is to persist the log position after every batch, right? But since we can change the batch size to tune how frequently the log position is persisted, logically there is no hard requirement that we 'must' persist the log position after every batch.
I think we can change the condition for when we need to persist the log position:
- Every time, or after a fixed number of batches (a number)
- Uncommitted size
- Time since the last commit
- Every time we move to a new file (a flag)
This would be a more general framework and would not require introducing a new concept.
WDYT?
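Purely as an illustration of how those four conditions could be combined, here is a rough, hypothetical commit-policy sketch; the class name, fields, and thresholds are invented and are not part of this PR or of the existing ReplicationSource code.

```java
/** Hypothetical policy deciding when the shipper should persist the WAL position. */
class LogPositionCommitPolicy {
  private final int batchesPerCommit;      // commit after a fixed number of batches (1 = every batch)
  private final long maxUncommittedBytes;  // commit once this much shipped data is uncommitted
  private final long maxCommitIntervalMs;  // commit after this much time since the last commit
  private final boolean commitOnNewFile;   // always commit when moving to a new WAL file

  private int batchesSinceCommit;
  private long bytesSinceCommit;
  private long lastCommitTimeMs = System.currentTimeMillis();

  LogPositionCommitPolicy(int batchesPerCommit, long maxUncommittedBytes,
      long maxCommitIntervalMs, boolean commitOnNewFile) {
    this.batchesPerCommit = batchesPerCommit;
    this.maxUncommittedBytes = maxUncommittedBytes;
    this.maxCommitIntervalMs = maxCommitIntervalMs;
    this.commitOnNewFile = commitOnNewFile;
  }

  /** Called after each shipped batch; returns true if the position should be persisted now. */
  boolean shouldCommit(long batchSizeBytes, boolean movedToNewFile) {
    batchesSinceCommit++;
    bytesSinceCommit += batchSizeBytes;
    long sinceLastCommitMs = System.currentTimeMillis() - lastCommitTimeMs;
    return batchesSinceCommit >= batchesPerCommit
      || bytesSinceCommit >= maxUncommittedBytes
      || sinceLastCommitMs >= maxCommitIntervalMs
      || (commitOnNewFile && movedToNewFile);
  }

  /** Called once the position has actually been persisted. */
  void onCommitted() {
    batchesSinceCommit = 0;
    bytesSinceCommit = 0;
    lastCommitTimeMs = System.currentTimeMillis();
  }
}
```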
Hi @Apache9, if I understand correctly what you're saying, taking your first option:

> Every time, or after a fixed number of batches (a number)

I assume that if the fixed number is 10, then for the first 9 batches we push, the endpoint consumes them and returns true, but we don't move the offset. On the 10th batch, the endpoint must persist the file and then return true. Once the endpoint returns true, we assume it has persisted, and we move the offset.
With this approach, we can stick to using only true/false without needing to pass the replication source to the endpoint for persistence.
We'll take a look into this and see if there could be any issues with this approach. Thanks!
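To make that counting interpretation concrete, here is a toy sketch assuming the endpoint writes shipped batches into a backup stream and makes the data durable on every Nth batch; the class and method names are made up for illustration and are not the PR's actual endpoint.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Toy sketch: write every batch to the backup stream, flush it on every Nth batch. */
class EveryNthBatchFlusher {
  private final int commitEveryNBatches;
  private final OutputStream backupStream; // e.g. a stream onto the current backup object
  private int batchesSinceFlush;

  EveryNthBatchFlusher(int commitEveryNBatches, OutputStream backupStream) {
    this.commitEveryNBatches = commitEveryNBatches;
    this.backupStream = backupStream;
  }

  /**
   * Writes the serialized batch and returns true to signal success. Only on the
   * Nth batch is the stream flushed; that is when the data must be durable,
   * because (per the discussion above) the framework would advance the WAL
   * offset after that call.
   */
  boolean onBatch(byte[] serializedBatch) throws IOException {
    backupStream.write(serializedBatch);
    batchesSinceFlush++;
    if (batchesSinceFlush >= commitEveryNBatches) {
      backupStream.flush();
      batchesSinceFlush = 0;
    }
    return true;
  }
}
```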
wchevreuil left a comment
Sorry, as this is quite a large PR, I have a few (maybe dumb) questions about it:
- Is this a per-table configuration?
- Do we need to define a specific replication peer for this?
- If we have normal replication enabled (a normal replication peer), will all CFs with REPLICATION_SCOPE = 1 also be targeted for this PITR?
- Do we keep PITR WAL files indefinitely?
- Should the PITR endpoint's replicate method consider whether the underlying FS supports sync before deciding which result to return?
- Does it support pause (disable_peer)?
- Can we include doc changes within this PR?
@wchevreuil I have answers to some of your questions:
Backups need to be enabled in the HBase conf file via
A replication peer is defined by default as
No, WALs outside of the PITR window get cleaned up automatically. They are also removed when the backup is deleted. This is tracked in HBASE-28992
I believe that is the plan. It's still a work in progress.
Negative. Continuous backup needs a special replication endpoint to be set for the peer.
The PITR endpoint assumes that
This is a very good question. Given that it's a standard replication endpoint, I would say sure, why not. This is something that we should validate sooner rather than later.
I suggest finishing the review and merging this patch first. Proper documentation and integration testing should come afterwards in separate PRs.
@anmolnar @wchevreuil We actually use that. When cleaning up WAL files, if all the files are removed (which happens when deleting the last full backup — the bare minimum required for PITR), we disable the peer. More details: BackupCommands.java#L900
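As a hedged illustration of that disable step, the sketch below uses the standard Admin#disableReplicationPeer API. The peer id comes from this conversation, while the wrapper class, method name, and the allBackupWalsRemoved flag are hypothetical; this is not the actual BackupCommands logic.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Admin;

class ContinuousBackupPeerMaintenance {
  // Default peer id used for continuous backup, as described in this conversation.
  private static final String CONTINUOUS_BACKUP_PEER = "continuous_backup_replication_peer";

  /**
   * If WAL cleanup removed every backed-up WAL (i.e. the last full backup,
   * the bare minimum required for PITR, was deleted), stop continuous backup
   * by disabling the replication peer.
   */
  static void maybeDisableContinuousBackupPeer(Admin admin, boolean allBackupWalsRemoved)
      throws IOException {
    if (allBackupWalsRemoved) {
      admin.disableReplicationPeer(CONTINUOUS_BACKUP_PEER);
    }
  }
}
```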
How does the PITR endpoint avoid backing up entries for CFs that have REPLICATION_SCOPE enabled (for normal replication) but are not meant to be in PITR backups? Is it possible to "opt out" of PITR at the CF/table level?
Will this trigger backups on all existing tables?
Can you elaborate further on how this peer is defined?
Ok, so we can only actually do PITR within that window period. Can you point me to the code in this PR where this is implemented?
@wchevreuil That is determined when starting the continuous backup. For example, if you want to start continuous backup for table1, you initiate a full backup with the continuous backup option enabled. This performs the full backup and adds the table to the continuous_backup_replication_peer (the default replication peer used for continuous backup). If the column families don't already have replication scope enabled, it's also set to true at this stage.
hbase/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/FullTableBackupClient.java Line 269 in 6aa212f
enableTableReplication(), updateContinuousBackupReplicationPeer(), addContinuousBackupReplicationPeer()
So, if you manually change the replication scope to true for a random table, it will not be replicated as part of the continuous backup, since it's not included in the continuous_backup_replication_peer.
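For orientation, here is a sketch of that flow using the existing BackupAdmin API. The table name and target directory are placeholders, and the commented-out withContinuousBackupEnabled(...) builder call is purely a stand-in for whatever option this PR actually adds to request continuous backup; the real option name may differ.

```java
import java.util.Collections;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.backup.BackupAdmin;
import org.apache.hadoop.hbase.backup.BackupRequest;
import org.apache.hadoop.hbase.backup.BackupType;
import org.apache.hadoop.hbase.backup.impl.BackupAdminImpl;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class StartContinuousBackupExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         BackupAdmin backupAdmin = new BackupAdminImpl(conn)) {
      BackupRequest request = new BackupRequest.Builder()
        .withBackupType(BackupType.FULL)
        .withTableList(Collections.singletonList(TableName.valueOf("table1")))
        .withTargetRootDir("hdfs:///backup-root")   // placeholder backup destination
        // Hypothetical: placeholder for the PR's actual continuous-backup option.
        // .withContinuousBackupEnabled(true)
        .build();
      // Runs the full backup; per the explanation above, with continuous backup
      // enabled the table is also added to continuous_backup_replication_peer
      // and its column families get replication scope enabled if needed.
      String backupId = backupAdmin.backupTables(request);
      System.out.println("Started backup: " + backupId);
    }
  }
}
```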
hbase/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/FullTableBackupClient.java Line 269 in 6aa212f
This contains the whole logic.
Here is where the PITR logic is defined: the validateAndRestore() method.
hbase/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/AbstractPitrRestoreHandler.java Line 68 in 6aa212f
And this is where the clean-up logic is defined — how backed-up WALs are cleaned up as part of continuous backup.
hbase/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupCommands.java Line 900 in 6aa212f
@wchevreuil You trigger a backup by running the
Ok, so a table is added to the PITR replication peer only after a full backup is created for it through the described command. Is there any action that may remove a given table from the list of tables tracked by the PITR replication peer?
@wchevreuil The only action I can think of is forcefully deleting the full backup (i.e. using
@vinayakphegde @ankitsol @anmolnar Can one of you please confirm?
Correct. Related code:
hbase/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupCommands.java Line 900 in 6aa212f
wchevreuil left a comment
LGTM, all my questions have been addressed.
Regarding the WAL tracking approach, I don't have a strong opinion. @Apache9, do you think this is a blocker for this merge, or could an alternative be worked on as a follow-up PR?
We're currently working on a pull request that should give us some insight into the alternative approach and let us decide which way to move forward. Hopefully it will be ready in a few days.
I suggest we try the approach that does not introduce the new 'ReplicationResult' concept, to see whether it could work, before merging this back. Once the new concept is in, it will be difficult to remove in the future, as later code may come to rely on it too...
Summary
This PR introduces Continuous Backup and Point-in-Time Recovery (PITR) support in HBase.
The feature enables continuous WAL archival and recovery to any specific point in time, providing stronger data protection and recovery flexibility.
Key Highlights
For detailed design and implementation, refer to the design document:
[Continuous Backup and Point-in-Time Recovery]