Conversation

@vinayakphegde
Contributor

No description provided.

for (FileStatus item : bulkContents) {
fs.delete(item.getPath(), true); // recursive delete of each child
}
System.out.println("Deleted all contents under Bulk Load directory: " + bulkloadDir);
Contributor

Is there a specific reason you chose to loop and delete the contents individually?
I think a single recursive fs.delete() here would reduce RPCs (like line 1072 in deleteOldWALFiles()).

Contributor Author

The main reason is to preserve the parent directories. In this case, I want to delete all the WAL directories and bulkload directories, but still keep the root directory intact. This way, when we restart the continuous backup to the same backup directory, the required directory structure will already be in place.

Contributor

If the required directory doesn't exist, e.g. on the first use of backup, what happens? Will we create it?

The problem is that listing and deleting per child on S3 can be expensive; that's why S3A has fs.s3a.multiobjectdelete.enable (used with a recursive fs.delete of the path). If we do it this way here and the list of bulkloaded files within the bulkload directory is huge, you will hit S3 throttling very easily, and we would need to write/fork the same batching implementation here.

So, instead of keeping the required directory structure in place, I suggested you handle it as a check on enable/re-enable: verify the required directory structure exists and create it if not, to avoid a large number of per-object delete calls.
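
For illustration, a minimal sketch of the two approaches under discussion, using the Hadoop FileSystem API from the snippet above; the class and method names here are hypothetical, not from the patch:

import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative sketch only; not the patch's code.
final class BulkloadDirCleanupSketch {

  // Current approach: one delete RPC per child, which keeps the bulkloadDir
  // parent itself in place.
  static void deleteChildrenIndividually(FileSystem fs, Path bulkloadDir) throws IOException {
    for (FileStatus item : fs.listStatus(bulkloadDir)) {
      fs.delete(item.getPath(), true);
    }
  }

  // Suggested alternative: a single recursive delete of the whole directory.
  // On S3A with fs.s3a.multiobjectdelete.enable, the underlying object deletes
  // are batched, so this avoids issuing one delete call per child.
  static void deleteWholeDirectory(FileSystem fs, Path bulkloadDir) throws IOException {
    fs.delete(bulkloadDir, true);
  }
}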

Contributor Author

One clarification from my side:

Our current backup directory structure looks like this:

-- wal_backup_directory/
     -- WALs/
          -- 23-08-2025/
               ... wal files
          -- 24-08-2025/
          -- 25-08-2025/
     -- bulk-load-files/
          -- 23-08-2025/
               ... bulkload files
          -- 24-08-2025/
          -- 25-08-2025/

As you can see, when we loop and delete, it’s done at the day-wise directory level, not at the individual file level. So the number of delete operations is relatively small.

Regarding re-creation: when we enable a replication peer, there isn’t any placeholder mechanism to re-create this directory structure. In the enable/disable flow, everything remains running; the only difference is that replication is paused and then resumed. Once enabled again, entries are sent to the replication endpoint directly.

It’s only during a restart that things are different. On restart, we instantiate the ContinuousBackupReplicationEndpoint again, and in its constructor we can re-create the directory structure if it’s missing. With enable/disable, this re-creation step isn’t possible.
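
A minimal sketch of that restart-time re-creation, assuming the WALs/ and bulk-load-files/ layout shown above; the class and method names are hypothetical, not the actual ContinuousBackupReplicationEndpoint code:

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative sketch only; not the endpoint's actual constructor logic.
final class BackupLayoutInitializerSketch {

  // mkdirs() is a no-op when the directory already exists, so calling this
  // when the endpoint is instantiated is safe on both first use and restart.
  static void ensureLayout(FileSystem fs, Path backupRootDir) throws IOException {
    fs.mkdirs(new Path(backupRootDir, "WALs"));
    fs.mkdirs(new Path(backupRootDir, "bulk-load-files"));
  }
}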

@taklwu taklwu requested a review from Copilot August 25, 2025 23:04
Copilot AI left a comment

Pull Request Overview

This PR enhances the backup cleanup process to handle bulk-loaded HFiles alongside WAL files during delete and cleanup operations. The implementation ensures that both WAL and bulk-load directories are managed consistently throughout the backup lifecycle.

Key changes:

  • Extended cleanup commands to process bulk-load directories in addition to WAL directories
  • Updated test infrastructure to create and verify both WAL and bulk-load folder structures
  • Added comprehensive verification to ensure both directory types are properly cleaned up

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

TestBackupCommands.java: Updated test to verify cleanup of bulk-load directories alongside WAL directories
TestBackupDeleteWithCleanup.java: Enhanced test setup and verification methods to handle both WAL and bulk-load directories
BackupCommands.java: Modified cleanup logic to delete old bulk-load files in addition to WAL files

if (dayStart + ONE_DAY_IN_MILLISECONDS - 1 < cutoffTime) {
System.out.println("Deleting outdated WAL directory: " + dirPath);
fs.delete(dirPath, true);
fs.delete(new Path(bulkloadDir, dirName), true);
Copilot AI Aug 25, 2025

[nitpick] Consider adding a corresponding print statement for bulk-load directory deletion, similar to line 1070 for WAL directories, to maintain consistent logging behavior.

Suggested change:
- fs.delete(new Path(bulkloadDir, dirName), true);
+ Path bulkloadPath = new Path(bulkloadDir, dirName);
+ System.out.println("Deleting corresponding bulk-load directory: " + bulkloadPath);
+ fs.delete(bulkloadPath, true);

Contributor

IMO this comment is right; please change the code accordingly.

Contributor

@taklwu taklwu left a comment

IMO @Kota-SH is right that bulkloadDir should be deleted recursively; please change it accordingly.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 33s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ HBASE-28957 Compile Tests _
+1 💚 mvninstall 3m 56s HBASE-28957 passed
+1 💚 compile 0m 34s HBASE-28957 passed
-0 ⚠️ checkstyle 0m 10s /buildtool-branch-checkstyle-hbase-backup.txt The patch fails to run checkstyle in hbase-backup
+1 💚 spotbugs 0m 34s HBASE-28957 passed
+1 💚 spotless 0m 47s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+1 💚 mvninstall 3m 5s the patch passed
+1 💚 compile 0m 32s the patch passed
+1 💚 javac 0m 32s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 9s /buildtool-patch-checkstyle-hbase-backup.txt The patch fails to run checkstyle in hbase-backup
+1 💚 spotbugs 0m 36s the patch passed
+1 💚 hadoopcheck 12m 11s Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚 spotless 0m 45s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 10s The patch does not generate ASF License warnings.
31m 37s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7239/2/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #7239
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux 952f54ade269 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision HBASE-28957 / ecf7c1b
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 83 (vs. ulimit of 30000)
modules C: hbase-backup U: hbase-backup
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7239/2/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 28s Docker mode activated.
-0 ⚠️ yetus 0m 4s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ HBASE-28957 Compile Tests _
+1 💚 mvninstall 3m 8s HBASE-28957 passed
+1 💚 compile 0m 20s HBASE-28957 passed
+1 💚 javadoc 0m 15s HBASE-28957 passed
+1 💚 shadedjars 5m 56s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 2m 55s the patch passed
+1 💚 compile 0m 20s the patch passed
+1 💚 javac 0m 20s the patch passed
+1 💚 javadoc 0m 13s the patch passed
+1 💚 shadedjars 5m 54s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 18m 56s hbase-backup in the patch passed.
39m 29s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7239/2/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #7239
Optional Tests javac javadoc unit compile shadedjars
uname Linux cc33b6e77084 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision HBASE-28957 / ecf7c1b
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7239/2/testReport/
Max. process+thread count 3775 (vs. ulimit of 30000)
modules C: hbase-backup U: hbase-backup
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7239/2/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@taklwu taklwu left a comment

LGTM

@taklwu taklwu merged commit 0b1ec92 into apache:HBASE-28957 Aug 26, 2025
1 check passed
anmolnar pushed a commit that referenced this pull request Sep 11, 2025
…7239)

Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>
anmolnar pushed a commit that referenced this pull request Nov 6, 2025
…7239)

Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>