HBASE-28216 HDFS erasure coding support for table data dirs #5579

Merged: 7 commits, Dec 19, 2023

Conversation

bbeaudreault
Contributor

Since we require hadoop-3 for master and branch-3, I can use the EC APIs directly. If we want to backport to branch-2, we'll need to figure out how to do this with reflection.
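
As a rough illustration of the reflection approach mentioned for a possible branch-2 backport, the sketch below looks up a method by name at runtime and skips the call when the method is absent. `OptionalApi`, `FakeHadoop3Fs`, and `FakeHadoop2Fs` are hypothetical names invented for this example, and the stand-in method takes a plain string path. The real Hadoop 3 entry point would be `DistributedFileSystem#setErasureCodingPolicy(Path, String)`. This is not the code from the PR.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Hypothetical helper: call a method only if the running library version has it. */
final class OptionalApi {
  private OptionalApi() {
  }

  /**
   * Looks up a public method by name and parameter types, then invokes it.
   * Returns false (without calling anything) when the method does not exist.
   */
  static boolean invokeIfPresent(Object target, String name, Class<?>[] paramTypes,
    Object... args) {
    final Method method;
    try {
      method = target.getClass().getMethod(name, paramTypes);
    } catch (NoSuchMethodException e) {
      return false; // API not available, e.g. an older library on the classpath
    }
    try {
      method.invoke(target, args);
      return true;
    } catch (IllegalAccessException | InvocationTargetException e) {
      throw new IllegalStateException("Call to " + name + " failed", e);
    }
  }
}

/** Hypothetical stand-in for a filesystem class that has the EC API. */
final class FakeHadoop3Fs {
  String lastPolicy;

  public void setErasureCodingPolicy(String path, String policyName) {
    lastPolicy = path + "=" + policyName;
  }
}

/** Hypothetical stand-in for a filesystem class without the EC API. */
final class FakeHadoop2Fs {
}
```

The caller can treat a `false` return as "erasure coding is not supported on this Hadoop version" and either skip the step or reject the request.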

@Apache-HBase

This comment was marked as outdated.

Contributor

@jojochuang jojochuang left a comment

Looks good. One question: will hbase shell show erasure coding policy of a table too?

@bbeaudreault
Contributor Author

Great question! That actually slipped my mind and I will get that working next week

@Apache-HBase

This comment was marked as off-topic.


@BeforeClass
public static void beforeClass() throws Exception {
UTIL.startMiniDFSCluster(6); // 6 necessary for RS-6-3-1024k
Contributor

@NihalJain NihalJain Dec 17, 2023

Comment not related to test:
Just wondering, what happens if someone mistakenly sets this policy on a cluster with fewer than 6 nodes? Can the table/system be recovered/fixed?
Also, do we have any checks to ensure an EC policy cannot be configured if this condition is not satisfied? Is it worth adding?

Contributor

Great question. Typically an HDFS client will fail to write in this case (it would actually be better to have at least 9 nodes, as 6 nodes provide no redundancy). In the case of HBase, it looks like compaction will fail, in which case an administrator will need to step in and update the EC policy of the table.
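
To make the 6-versus-9 node arithmetic concrete, here is a small sketch that reads the data and parity unit counts out of a policy name such as `RS-6-3-1024k`. The `EcPolicy` class and its methods are hypothetical and not part of HDFS or HBase. The loss-tolerance calculation is a simplified model that assumes one block of a stripe per node.

```java
import java.util.Arrays;

/** Hypothetical parser for names shaped like CODEC-data-parity-cellsize, e.g. RS-6-3-1024k. */
final class EcPolicy {
  final String codec;
  final int dataUnits;
  final int parityUnits;
  final int cellSizeKb;

  private EcPolicy(String codec, int dataUnits, int parityUnits, int cellSizeKb) {
    this.codec = codec;
    this.dataUnits = dataUnits;
    this.parityUnits = parityUnits;
    this.cellSizeKb = cellSizeKb;
  }

  static EcPolicy parse(String name) {
    String[] parts = name.split("-");
    int n = parts.length;
    if (n < 4 || !parts[n - 1].endsWith("k")) {
      throw new IllegalArgumentException("Unrecognized policy name: " + name);
    }
    // The last three tokens are data units, parity units, and cell size.
    // Everything before them is the codec name, which may itself contain a dash.
    String codec = String.join("-", Arrays.copyOfRange(parts, 0, n - 3));
    int data = Integer.parseInt(parts[n - 3]);
    int parity = Integer.parseInt(parts[n - 2]);
    String cell = parts[n - 1];
    int cellKb = Integer.parseInt(cell.substring(0, cell.length() - 1));
    return new EcPolicy(codec, data, parity, cellKb);
  }

  /** Nodes needed to put every data and parity block of a stripe on its own node. */
  int nodesForFullStripe() {
    return dataUnits + parityUnits;
  }

  /** Simplified model: node losses survivable when each node holds one block of a stripe. */
  int toleratedNodeLosses(int nodes) {
    int placedBlocks = Math.min(nodes, nodesForFullStripe());
    return Math.max(0, placedBlocks - dataUnits);
  }

  /** Raw bytes stored per byte of user data. */
  double storageOverhead() {
    return (double) nodesForFullStripe() / dataUnits;
  }
}
```

Under this model, `RS-6-3-1024k` needs 9 nodes for a full stripe, tolerates 0 node losses on 6 nodes and 3 on 9 nodes, and stores 1.5 bytes per byte of data, compared with 3 for triple replication.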

Contributor

To check if the EC policy is compliant with rack/host setup, check out these two jiras:
https://issues.apache.org/jira/browse/HDFS-14061
https://issues.apache.org/jira/browse/HDFS-12946

It's going to be a sizeable change, so I'd suggest leaving that out of this PR.

Contributor Author

Oh ok, I didn't see those. For now I implemented a check in TableDescriptorChecker which sets the requested policy on a temp dir, and then tries to write to the temp dir. As you said, the write will fail. I have a test to validate that as well.

Will look at those 2 jiras for a follow-up improvement.
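
The probe-write idea described above (set the requested policy on a temp dir, try a write, treat a failed write as an unusable policy) could look roughly like the sketch below. It is not the actual `TableDescriptorChecker` code. `ProbeFs`, `EcPolicyProbe`, and `InMemoryProbeFs` are hypothetical names, and the in-memory filesystem only simulates a cluster that can place blocks for one policy.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical minimal filesystem view; not a Hadoop or HBase interface. */
interface ProbeFs {
  void mkdirs(String dir) throws IOException;
  void setErasureCodingPolicy(String dir, String policy) throws IOException;
  void writeFile(String path, byte[] data) throws IOException;
  void deleteRecursively(String dir) throws IOException;
}

final class EcPolicyProbe {
  private EcPolicyProbe() {
  }

  /** Rejects the policy if a small write under a temp dir carrying that policy fails. */
  static void checkPolicyUsable(ProbeFs fs, String tmpRoot, String policy) {
    String dir = tmpRoot + "/ec-probe-" + System.nanoTime();
    try {
      fs.mkdirs(dir);
      fs.setErasureCodingPolicy(dir, policy);
      fs.writeFile(dir + "/probe", new byte[] { 1 });
    } catch (IOException e) {
      throw new IllegalArgumentException(
        "Erasure coding policy " + policy + " is not usable: " + e.getMessage(), e);
    } finally {
      try {
        fs.deleteRecursively(dir); // always clean up the probe dir
      } catch (IOException ignored) {
        // best effort; a leftover temp dir should not mask the real result
      }
    }
  }
}

/** Hypothetical fake: writes succeed only under dirs that carry the one writable policy. */
final class InMemoryProbeFs implements ProbeFs {
  private final String writablePolicy;
  private final Map<String, String> policyByDir = new HashMap<>();

  InMemoryProbeFs(String writablePolicy) {
    this.writablePolicy = writablePolicy;
  }

  public void mkdirs(String dir) {
    policyByDir.put(dir, "");
  }

  public void setErasureCodingPolicy(String dir, String policy) {
    policyByDir.put(dir, policy);
  }

  public void writeFile(String path, byte[] data) throws IOException {
    String dir = path.substring(0, path.lastIndexOf('/'));
    String policy = policyByDir.get(dir);
    if (!writablePolicy.equals(policy)) {
      throw new IOException("cannot place blocks for policy " + policy);
    }
  }

  public void deleteRecursively(String dir) {
    policyByDir.remove(dir);
  }

  boolean isEmpty() {
    return policyByDir.isEmpty();
  }
}
```

Doing the cleanup in `finally` means a rejected policy does not leave probe directories behind.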

@NihalJain
Contributor

Hi @bbeaudreault, thanks for the PR; overall it looks good. I have also posted a few questions and review comments. Please let me know your opinion on them.

@bbeaudreault
Contributor Author

Thanks for the review @NihalJain! I believe I covered all of your feedback. I also made sure it works in the shell @jojochuang.

@NihalJain
Contributor

Hi @bbeaudreault, I added a new comment; please have a look. Otherwise the changes look good and all previous feedback seems to be covered. Thanks :)

@NihalJain
Contributor

+1 to the change.

Just realized, we may want to add a sample command that sets EC to the doc sections of create.rb and alter.rb. We usually add those for any new feature. Or it could be done as part of the documentation story as well.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Comment |
| --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 37s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 1s | No case conflicting files found. |
| +0 🆗 | prototool | 0m 0s | prototool was not available. |
| +1 💚 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 💚 | @author | 0m 0s | The patch does not contain any @author tags. |
| | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 24s | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 4m 3s | master passed |
| +1 💚 | compile | 5m 14s | master passed |
| +1 💚 | checkstyle | 1m 23s | master passed |
| +1 💚 | spotless | 0m 56s | branch has no errors when running spotless:check. |
| +1 💚 | spotbugs | 6m 41s | master passed |
| -0 ⚠️ | patch | 2m 34s | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 4m 16s | the patch passed |
| +1 💚 | compile | 6m 22s | the patch passed |
| +1 💚 | cc | 6m 22s | the patch passed |
| +1 💚 | javac | 6m 22s | the patch passed |
| +1 💚 | checkstyle | 1m 22s | the patch passed |
| -0 ⚠️ | rubocop | 0m 11s | The patch generated 1 new + 742 unchanged - 0 fixed = 743 total (was 742) |
| +1 💚 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 💚 | hadoopcheck | 13m 40s | Patch does not cause any errors with Hadoop 3.2.4 3.3.6. |
| +1 💚 | hbaseprotoc | 2m 19s | the patch passed |
| +1 💚 | spotless | 1m 2s | patch has no errors when running spotless:check. |
| +1 💚 | spotbugs | 7m 45s | the patch passed |
| | | | _ Other Tests _ |
| +1 💚 | asflicense | 0m 46s | The patch does not generate ASF License warnings. |
| | | 66m 6s | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5579/6/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #5579 |
| Optional Tests | dupname asflicense javac spotbugs hadoopcheck hbaseanti spotless checkstyle compile cc hbaseprotoc prototool rubocop |
| uname | Linux a3a03ab33af7 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 53e01b2 |
| Default Java | Eclipse Adoptium-11.0.17+8 |
| rubocop | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5579/6/artifact/yetus-general-check/output/diff-patch-rubocop.txt |
| Max. process+thread count | 81 (vs. ulimit of 30000) |
| modules | C: hbase-protocol-shaded hbase-client hbase-server hbase-shell U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5579/6/console |
| versions | git=2.34.1 maven=3.8.6 spotbugs=4.7.3 rubocop=1.37.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

@bbeaudreault
Contributor Author

> +1 to the change.
>
> Just realized, we may want to add a sample command that sets EC to the doc sections of create.rb and alter.rb. We usually add those for any new feature. Or it could be done as part of the documentation story as well.

If you don't mind, I might file a new jira for this. I took a look, and yes we put a bunch of examples in there but there doesn't seem to be any rhyme or reason. I wonder if we should improve the help text to list the possible options explicitly. We already hardcode the supported options in admin.rb -- update_tdb_from_arg, so it seems like we could refactor that a bit to print them to the help text.

@NihalJain
Contributor

> If you don't mind, I might file a new jira for this.

Sounds good. 🚀

Contributor

@jojochuang jojochuang left a comment

LGTM from my side. There are some comments that are worth addressing in follow-up jiras.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Comment |
| --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 28s | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
| | | | _ Prechecks _ |
| | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 39s | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 3m 3s | master passed |
| +1 💚 | compile | 1m 47s | master passed |
| +1 💚 | shadedjars | 5m 14s | branch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 52s | master passed |
| -0 ⚠️ | patch | 6m 32s | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 2m 39s | the patch passed |
| +1 💚 | compile | 1m 47s | the patch passed |
| +1 💚 | javac | 1m 47s | the patch passed |
| +1 💚 | shadedjars | 5m 13s | patch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 53s | the patch passed |
| | | | _ Other Tests _ |
| +1 💚 | unit | 0m 34s | hbase-protocol-shaded in the patch passed. |
| +1 💚 | unit | 1m 26s | hbase-client in the patch passed. |
| +1 💚 | unit | 216m 10s | hbase-server in the patch passed. |
| +1 💚 | unit | 7m 42s | hbase-shell in the patch passed. |
| | | 253m 0s | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5579/6/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
| GITHUB PR | #5579 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 8c2b59e08fe1 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 53e01b2 |
| Default Java | Eclipse Adoptium-11.0.17+8 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5579/6/testReport/ |
| Max. process+thread count | 4294 (vs. ulimit of 30000) |
| modules | C: hbase-protocol-shaded hbase-client hbase-server hbase-shell U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5579/6/console |
| versions | git=2.34.1 maven=3.8.6 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Comment |
| --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 25s | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
| | | | _ Prechecks _ |
| | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 2m 32s | master passed |
| +1 💚 | compile | 1m 30s | master passed |
| +1 💚 | shadedjars | 5m 14s | branch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 52s | master passed |
| -0 ⚠️ | patch | 6m 30s | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 2m 16s | the patch passed |
| +1 💚 | compile | 1m 32s | the patch passed |
| +1 💚 | javac | 1m 32s | the patch passed |
| +1 💚 | shadedjars | 5m 11s | patch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 49s | the patch passed |
| | | | _ Other Tests _ |
| +1 💚 | unit | 0m 25s | hbase-protocol-shaded in the patch passed. |
| +1 💚 | unit | 1m 21s | hbase-client in the patch passed. |
| +1 💚 | unit | 225m 25s | hbase-server in the patch passed. |
| +1 💚 | unit | 7m 15s | hbase-shell in the patch passed. |
| | | 259m 34s | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5579/6/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
| GITHUB PR | #5579 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 97ebebb22853 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 53e01b2 |
| Default Java | Temurin-1.8.0_352-b08 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5579/6/testReport/ |
| Max. process+thread count | 4683 (vs. ulimit of 30000) |
| modules | C: hbase-protocol-shaded hbase-client hbase-server hbase-shell U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5579/6/console |
| versions | git=2.34.1 maven=3.8.6 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

@bbeaudreault bbeaudreault merged commit 9f0625c into apache:master Dec 19, 2023
1 check passed
@bbeaudreault
Contributor Author

Thank you both for the reviews!

@bbeaudreault bbeaudreault deleted the HBASE-28216 branch December 19, 2023 19:54
bbeaudreault added a commit that referenced this pull request Dec 19, 2023
Signed-off-by: Nihal Jain <nihaljain@apache.org>
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
bbeaudreault added a commit to HubSpot/hbase that referenced this pull request Dec 19, 2023
)

Signed-off-by: Nihal Jain <nihaljain@apache.org>
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
bbeaudreault added a commit to HubSpot/hbase that referenced this pull request Jan 2, 2024
…ata dirs (apache#5579)

Signed-off-by: Nihal Jain <nihaljain@apache.org>
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
kadirozde pushed a commit to kadirozde/hbase that referenced this pull request Jan 5, 2024
)

Signed-off-by: Nihal Jain <nihaljain@apache.org>
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
bbeaudreault added a commit to HubSpot/hbase that referenced this pull request Jan 14, 2024
)

Signed-off-by: Nihal Jain <nihaljain@apache.org>
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
4 participants