HBASE-29524 Handle bulk-loaded HFiles in delete and cleanup process #7239

vinayakphegde · 2025-08-20T16:48:48Z

No description provided.

Kota-SH · 2025-08-21T15:58:07Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupCommands.java

+          for (FileStatus item : bulkContents) {
+            fs.delete(item.getPath(), true); // recursive delete of each child
+          }
+          System.out.println("Deleted all contents under Bulk Load directory: " + bulkloadDir);


Is there a specific reason you chose to loop and delete the contents individually?
I think a single recursive fs.delete() here would reduce RPCs (like in line 1072 in deleteOldWALFiles())

The main reason is to preserve the parent directories. In this case, I want to delete all the WAL directories and bulkload directories, but still keep the root directory intact. This way, when we restart the continuous backup to the same backup directory, the required directory structure will already be in place.

if the required directory isn't exist, e.g. the first using backup, what would that be ? will we create it?

the problem is that this list and per-call in s3 could be experience , that's why S3A has the feature of fs.s3a.multiobjectdelete.enable (with fs.delete path recursively), if we do this way here, and if the list of bulkloaded files are huge within the bulkload directory, you will hit s3 throttling very easily, and we will need to write/folk the same implementation here.

so, instead of having the same required directory structure in place, I suggested you handle it as an enable/reenable check for the required directory structure, and create it to avoid large mount of per-object delete call.

One clarification from my side:

Our current backup directory structure looks like this:

-- wal_backup_directory/ -- WALs/ -- 23-08-2025/ ... wal files -- 24-08-2025/ -- 25-08-2025/ -- bulk-load-files/ -- 23-08-2025/ ... bulkload files -- 24-08-2025/ -- 25-08-2025/

As you can see, when we loop and delete, it’s done at the day-wise directory level, not at the individual file level. So the number of delete operations is relatively small.

Regarding re-creation: when we enable a replication peer, there isn’t any placeholder mechanism to re-create this directory structure. In the enable/disable flow, everything remains running, the only difference is that replication is paused and then resumed. Once enabled again, entries are sent to the replication endpoint directly.

It’s only during a restart that things are different. On restart, we instantiate the ContinuousBackupReplicationEndpoint again, and in its constructor we can re-create the directory structure if it’s missing. With enable/disable, this re-creation step isn’t possible.

Copilot

Pull Request Overview

This PR enhances the backup cleanup process to handle bulk-loaded HFiles alongside WAL files during delete and cleanup operations. The implementation ensures that both WAL and bulk-load directories are managed consistently throughout the backup lifecycle.

Key changes:

Extended cleanup commands to process bulk-load directories in addition to WAL directories
Updated test infrastructure to create and verify both WAL and bulk-load folder structures
Added comprehensive verification to ensure both directory types are properly cleaned up

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
TestBackupCommands.java	Updated test to verify cleanup of bulk-load directories alongside WAL directories
TestBackupDeleteWithCleanup.java	Enhanced test setup and verification methods to handle both WAL and bulk-load directories
BackupCommands.java	Modified cleanup logic to delete old bulk-load files in addition to WAL files

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupCommands.java

Copilot · 2025-08-25T23:05:19Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupCommands.java

          if (dayStart + ONE_DAY_IN_MILLISECONDS - 1 < cutoffTime) {
            System.out.println("Deleting outdated WAL directory: " + dirPath);
            fs.delete(dirPath, true);
+            fs.delete(new Path(bulkloadDir, dirName), true);


[nitpick] Consider adding a corresponding print statement for bulk-load directory deletion, similar to line 1070 for WAL directories, to maintain consistent logging behavior.

Suggested change

fs.delete(new Path(bulkloadDir, dirName), true);

Path bulkloadPath = new Path(bulkloadDir, dirName);

System.out.println("Deleting corresponding bulk-load directory: " + bulkloadPath);

fs.delete(bulkloadPath, true);

IMO this comment is right, please change the code accordingly

taklwu

IMO @Kota-SH is giving the right comment on the bulkloadDir should be using delete recursively, please change accordingly .

taklwu · 2025-08-25T23:15:56Z

hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupCommands.java

+          for (FileStatus item : bulkContents) {
+            fs.delete(item.getPath(), true); // recursive delete of each child
+          }
+          System.out.println("Deleted all contents under Bulk Load directory: " + bulkloadDir);


if the required directory isn't exist, e.g. the first using backup, what would that be ? will we create it?

the problem is that this list and per-call in s3 could be experience , that's why S3A has the feature of fs.s3a.multiobjectdelete.enable (with fs.delete path recursively), if we do this way here, and if the list of bulkloaded files are huge within the bulkload directory, you will hit s3 throttling very easily, and we will need to write/folk the same implementation here.

so, instead of having the same required directory structure in place, I suggested you handle it as an enable/reenable check for the required directory structure, and create it to avoid large mount of per-object delete call.

Apache-HBase · 2025-08-26T10:12:42Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 33s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	hbaseanti	0m 0s		Patch does not have any anti-patterns.
			_ HBASE-28957 Compile Tests _
+1 💚	mvninstall	3m 56s		HBASE-28957 passed
+1 💚	compile	0m 34s		HBASE-28957 passed
-0 ⚠️	checkstyle	0m 10s	/buildtool-branch-checkstyle-hbase-backup.txt	The patch fails to run checkstyle in hbase-backup
+1 💚	spotbugs	0m 34s		HBASE-28957 passed
+1 💚	spotless	0m 47s		branch has no errors when running spotless:check.
			_ Patch Compile Tests _
+1 💚	mvninstall	3m 5s		the patch passed
+1 💚	compile	0m 32s		the patch passed
+1 💚	javac	0m 32s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
-0 ⚠️	checkstyle	0m 9s	/buildtool-patch-checkstyle-hbase-backup.txt	The patch fails to run checkstyle in hbase-backup
+1 💚	spotbugs	0m 36s		the patch passed
+1 💚	hadoopcheck	12m 11s		Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚	spotless	0m 45s		patch has no errors when running spotless:check.
			_ Other Tests _
+1 💚	asflicense	0m 10s		The patch does not generate ASF License warnings.
		31m 37s

Subsystem	Report/Notes
Docker	ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7239/2/artifact/yetus-general-check/output/Dockerfile
GITHUB PR	#7239
Optional Tests	dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname	Linux 952f54ade269 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	HBASE-28957 / `ecf7c1b`
Default Java	Eclipse Adoptium-17.0.11+9
Max. process+thread count	83 (vs. ulimit of 30000)
modules	C: hbase-backup U: hbase-backup
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7239/2/console
versions	git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Apache-HBase · 2025-08-26T10:20:32Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 28s		Docker mode activated.
-0 ⚠️	yetus	0m 4s		Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
			_ Prechecks _
			_ HBASE-28957 Compile Tests _
+1 💚	mvninstall	3m 8s		HBASE-28957 passed
+1 💚	compile	0m 20s		HBASE-28957 passed
+1 💚	javadoc	0m 15s		HBASE-28957 passed
+1 💚	shadedjars	5m 56s		branch has no errors when building our shaded downstream artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	2m 55s		the patch passed
+1 💚	compile	0m 20s		the patch passed
+1 💚	javac	0m 20s		the patch passed
+1 💚	javadoc	0m 13s		the patch passed
+1 💚	shadedjars	5m 54s		patch has no errors when building our shaded downstream artifacts.
			_ Other Tests _
+1 💚	unit	18m 56s		hbase-backup in the patch passed.
		39m 29s

Subsystem	Report/Notes
Docker	ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7239/2/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR	#7239
Optional Tests	javac javadoc unit compile shadedjars
uname	Linux cc33b6e77084 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	HBASE-28957 / `ecf7c1b`
Default Java	Eclipse Adoptium-17.0.11+9
Test Results	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7239/2/testReport/
Max. process+thread count	3775 (vs. ulimit of 30000)
modules	C: hbase-backup U: hbase-backup
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7239/2/console
versions	git=2.34.1 maven=3.9.8
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

taklwu

LGTM

…7239) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>

HBASE-29524: Handle bulk-loaded HFiles in delete and cleanup process

a2f32fc

This comment has been minimized.

Sign in to view

Kota-SH reviewed Aug 21, 2025

View reviewed changes

taklwu requested a review from Copilot August 25, 2025 23:04

Copilot AI reviewed Aug 25, 2025

View reviewed changes

taklwu requested changes Aug 25, 2025

View reviewed changes

enhance the code according to PR comments

ecf7c1b

taklwu approved these changes Aug 26, 2025

View reviewed changes

taklwu merged commit 0b1ec92 into apache:HBASE-28957 Aug 26, 2025
1 check passed

anmolnar pushed a commit that referenced this pull request Sep 11, 2025

HBASE-29524 Handle bulk-loaded HFiles in delete and cleanup process (#…

d9e5a44

…7239) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>

anmolnar pushed a commit that referenced this pull request Nov 6, 2025

HBASE-29524 Handle bulk-loaded HFiles in delete and cleanup process (#…

5d815b8

…7239) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Reviewed by: Kota-SH <shanmukhaharipriya@gmail.com>

-            fs.delete(new Path(bulkloadDir, dirName), true);
+            Path bulkloadPath = new Path(bulkloadDir, dirName);
+            System.out.println("Deleting corresponding bulk-load directory: " + bulkloadPath);
+            fs.delete(bulkloadPath, true);

HBASE-29524 Handle bulk-loaded HFiles in delete and cleanup process #7239

HBASE-29524 Handle bulk-loaded HFiles in delete and cleanup process #7239

Uh oh!

Conversation

vinayakphegde commented Aug 20, 2025

Uh oh!

This comment has been minimized.

This comment has been minimized.

Kota-SH Aug 21, 2025

Choose a reason for hiding this comment

Uh oh!

vinayakphegde Aug 21, 2025

Choose a reason for hiding this comment

Uh oh!

taklwu Aug 25, 2025

Choose a reason for hiding this comment

Uh oh!

vinayakphegde Aug 26, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Copilot AI Aug 25, 2025

Choose a reason for hiding this comment

Uh oh!

taklwu Aug 25, 2025

Choose a reason for hiding this comment

Uh oh!

taklwu left a comment

Choose a reason for hiding this comment

Uh oh!

taklwu Aug 25, 2025

Choose a reason for hiding this comment

Uh oh!

Apache-HBase commented Aug 26, 2025

Uh oh!

Apache-HBase commented Aug 26, 2025

Uh oh!

taklwu left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants