HBASE-23882: Experimenting with low config settings. #1194

Closed
wants to merge 1 commit into from

markrmiller
Member

Here is some initial experimentation with bringing mini cluster settings down to scale.

To start, I've been shooting for pretty minimal.

I've pulled this out of an experimental branch, so some work may still be needed to find any tests for which these settings are too low.

I will continue to update this as I fine-tune. I also have some other changes around thread pool sizing that I'd like to dig out if I can find them again. I'll likely update again early next week.

@Apache-HBase

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 🆗 | reexec | 0m 32s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | No case conflicting files found. |
| +1 💚 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 💚 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| | | | _ branch-2 Compile Tests _ |
| +0 🆗 | mvndep | 0m 14s | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 5m 42s | branch-2 passed |
| +1 💚 | compile | 1m 23s | branch-2 passed |
| +1 💚 | checkstyle | 1m 39s | branch-2 passed |
| +1 💚 | shadedjars | 4m 37s | branch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 57s | branch-2 passed |
| +0 🆗 | spotbugs | 3m 43s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 💚 | findbugs | 4m 29s | branch-2 passed |
| | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 16s | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 5m 17s | the patch passed |
| +1 💚 | compile | 1m 21s | the patch passed |
| +1 💚 | javac | 1m 21s | the patch passed |
| +1 💚 | checkstyle | 1m 39s | the patch passed |
| +1 💚 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 💚 | shadedjars | 4m 37s | patch has no errors when building our shaded downstream artifacts. |
| +1 💚 | hadoopcheck | 16m 45s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2. |
| +1 💚 | javadoc | 0m 54s | the patch passed |
| +1 💚 | findbugs | 4m 53s | the patch passed |
| | | | _ Other Tests _ |
| +1 💚 | unit | 1m 11s | hbase-common in the patch passed. |
| -1 ❌ | unit | 87m 50s | hbase-server in the patch failed. |
| +1 💚 | asflicense | 0m 43s | The patch does not generate ASF License warnings. |
| | | 151m 45s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hbase.client.TestClientOperationInterrupt |
| | hadoop.hbase.security.access.TestNamespaceCommands |
| | hadoop.hbase.client.TestReplicasClient |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1194/1/artifact/out/Dockerfile |
| GITHUB PR | #1194 |
| JIRA Issue | HBASE-23882 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux b035aef41854 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/Base-PreCommit-GitHub-PR_PR-1194/out/precommit/personality/provided.sh |
| git revision | branch-2 / cb838bc |
| Default Java | 1.8.0_181 |
| unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1194/1/artifact/out/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1194/1/testReport/ |
| Max. process+thread count | 5668 (vs. ulimit of 10000) |
| modules | C: hbase-common hbase-server U: . |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1194/1/console |
| versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.

@saintstack
Contributor

Rerunning build to see if the failures repeat.

Contributor

@saintstack saintstack left a comment

I just added the old defaults for contrast. I think a few of these config changes could make a big difference. They need to go in hbase-*/src/test/resources/hbase-site.xml, though. In fact, if you look in that hbase-site.xml, you'll see some of these configs are already set lower.

// conf.setInt("dfs.datanode.max.transfer.threads", 5);
// conf.setInt("dfs.client.file-block-storage-locations.num-threads", 5);
}

Contributor

FYI, the way we usually do this for tests is to put these settings into hbase-*/src/test/resources/hbase-site.xml
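
To make the suggestion concrete, here is a rough sketch that renders a few of the overrides proposed in this PR as `<property>` entries in the Hadoop-style `<configuration>` layout that site files use. The class name `TestSiteXmlSketch` and its `render` method are hypothetical and exist only for illustration; the values are the ones under discussion here, not settled recommendations.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of HBase): renders overrides in the site-file layout. */
final class TestSiteXmlSketch {

  /**
   * Builds a <configuration> document with one <property> per override.
   * No XML escaping is done, so this only suits plain keys and values.
   */
  static String render(Map<String, String> overrides) {
    StringBuilder sb = new StringBuilder("<configuration>\n");
    for (Map.Entry<String, String> e : overrides.entrySet()) {
      sb.append("  <property>\n")
        .append("    <name>").append(e.getKey()).append("</name>\n")
        .append("    <value>").append(e.getValue()).append("</value>\n")
        .append("  </property>\n");
    }
    return sb.append("</configuration>\n").toString();
  }

  public static void main(String[] args) {
    // Values proposed in this PR; insertion order is kept for readability.
    Map<String, String> overrides = new LinkedHashMap<>();
    overrides.put("hbase.hfilearchiver.thread.pool.max", "2");
    overrides.put("zookeeper.recovery.retry", "5");
    overrides.put("hbase.master.procedure.threads", "1");
    System.out.print(render(overrides));
  }
}
```

Keeping the values in the test resources file means every test picks them up without each one having to call `conf.setInt(...)`.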

// can't set this for every test currently
// conf.set("hbase.regionserver.hostname", "127.0.0.1");

conf.setInt("hbase.hfilearchiver.thread.pool.max", 2);
Contributor

Default is 8.

// conf.set("hbase.regionserver.hostname", "127.0.0.1");

conf.setInt("hbase.hfilearchiver.thread.pool.max", 2);
conf.setInt("hbase.loadincremental.threads.max", 3);
Contributor

This one is sized by processor count. Rather than hard-coding it, keep it based on processor count but divide by 2 or 4, with a minimum of 3?
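
A minimal sketch of that idea, assuming a small helper that does not exist in HBase (`ScaledPoolSize.scaled` is a made-up name): divide the processor count by a chosen divisor and clamp the result to a floor.

```java
/** Hypothetical helper, not part of HBase: scales a pool size from the CPU count. */
final class ScaledPoolSize {

  /** Returns processors / divisor (integer division), but never less than minimum. */
  static int scaled(int processors, int divisor, int minimum) {
    if (divisor <= 0) {
      throw new IllegalArgumentException("divisor must be positive");
    }
    return Math.max(minimum, processors / divisor);
  }

  public static void main(String[] args) {
    int cpus = Runtime.getRuntime().availableProcessors();
    // The two divisors floated in the review, both with a floor of 3.
    System.out.println("divide by 2: " + scaled(cpus, 2, 3));
    System.out.println("divide by 4: " + scaled(cpus, 4, 3));
  }
}
```

The result could then replace the hard-coded 3, for example `conf.setInt("hbase.loadincremental.threads.max", ScaledPoolSize.scaled(cpus, 2, 3))`, so small build machines still get three threads while larger ones scale up.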

conf.setInt("hbase.loadincremental.threads.max", 3);


conf.setInt("hbase.client.sync.wait.timeout.msec", 30000);
Contributor

The default is 10 minutes, which is crazy for a test, yes. The general test timeout is 13 minutes, IIRC.
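
To put numbers on that, the sketch below computes how much of the overall test budget a single sync wait could consume. The 13-minute figure is the recollection stated in this comment, not a verified value, and `TimeoutBudgetSketch` is a hypothetical name used only for this illustration.

```java
import java.time.Duration;

/** Hypothetical illustration: compares a wait timeout against the whole-test budget. */
final class TimeoutBudgetSketch {

  /** Percentage of the budget that one full wait would use, rounded down. */
  static long percentOfBudget(Duration wait, Duration budget) {
    return wait.toMillis() * 100 / budget.toMillis();
  }

  public static void main(String[] args) {
    Duration testBudget = Duration.ofMinutes(13);     // assumed, per the comment above
    Duration defaultWait = Duration.ofMinutes(10);    // old default cited in the review
    Duration proposedWait = Duration.ofMillis(30000); // value proposed in this PR

    System.out.println("default:  " + percentOfBudget(defaultWait, testBudget) + "%");
    System.out.println("proposed: " + percentOfBudget(proposedWait, testBudget) + "%");
  }
}
```

With the default, one stuck wait uses most of the budget before the test can fail with a useful message; at 30 seconds, several waits can time out and the test still has time to report why.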



conf.setInt("hbase.client.sync.wait.timeout.msec", 30000);
conf.setInt("zookeeper.recovery.retry", 5);
Contributor

Default is 30. Too much for a test. This looks good.

// conf.setInt(HConstants.ZOOKEEPER_MAX_CLIENT_CNXNS, 5);

conf.setInt("hbase.hconnection.threads.max", 30);
conf.setInt("hbase.hconnection.threads.keepalivetime", 30);
Contributor

Default is 60.


conf.setInt("hbase.hconnection.threads.max", 30);
conf.setInt("hbase.hconnection.threads.keepalivetime", 30);
conf.setInt(HConstants.HBASE_CLIENT_MAX_TOTAL_TASKS, 10);
Contributor

Is this a repeat? Above you have it set to 3, if I'm reading this right.




conf.setInt(HConstants.REGION_SERVER_HANDLER_COUNT, 2);
Contributor

Default is 30. 2 is probably too few? Maybe not.



conf.setInt(HConstants.REGION_SERVER_HANDLER_COUNT, 2);
conf.setInt("hbase.master.procedure.threads", 1);
Contributor

Default is 16. Make it based off CPU count? Cutting it down is a good idea.

conf.setInt(HConstants.REGION_SERVER_HANDLER_COUNT, 2);
conf.setInt("hbase.master.procedure.threads", 1);

conf.setInt("dfs.namenode.handler.count", 5);
Contributor

Default is 10.

@Apache-HBase

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 🆗 | reexec | 0m 28s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | No case conflicting files found. |
| +1 💚 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 💚 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| | | | _ branch-2 Compile Tests _ |
| +0 🆗 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 5m 39s | branch-2 passed |
| +1 💚 | compile | 1m 21s | branch-2 passed |
| +1 💚 | checkstyle | 1m 41s | branch-2 passed |
| +1 💚 | shadedjars | 4m 36s | branch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 55s | branch-2 passed |
| +0 🆗 | spotbugs | 3m 39s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 💚 | findbugs | 4m 25s | branch-2 passed |
| | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 5m 21s | the patch passed |
| +1 💚 | compile | 1m 21s | the patch passed |
| +1 💚 | javac | 1m 21s | the patch passed |
| +1 💚 | checkstyle | 1m 39s | the patch passed |
| +1 💚 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 💚 | shadedjars | 4m 39s | patch has no errors when building our shaded downstream artifacts. |
| +1 💚 | hadoopcheck | 16m 53s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2. |
| +1 💚 | javadoc | 0m 56s | the patch passed |
| +1 💚 | findbugs | 4m 20s | the patch passed |
| | | | _ Other Tests _ |
| +1 💚 | unit | 1m 10s | hbase-common in the patch passed. |
| -1 ❌ | unit | 96m 34s | hbase-server in the patch failed. |
| +1 💚 | asflicense | 0m 51s | The patch does not generate ASF License warnings. |
| | | 160m 1s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hbase.client.TestClientOperationInterrupt |
| | hadoop.hbase.master.TestSplitWALManager |
| | hadoop.hbase.security.access.TestNamespaceCommands |
| | hadoop.hbase.client.TestReplicasClient |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.6 Server=19.03.6 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1194/2/artifact/out/Dockerfile |
| GITHUB PR | #1194 |
| JIRA Issue | HBASE-23882 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux e4e999ac9c50 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/Base-PreCommit-GitHub-PR_PR-1194/out/precommit/personality/provided.sh |
| git revision | branch-2 / 2c36216 |
| Default Java | 1.8.0_181 |
| unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1194/2/artifact/out/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1194/2/testReport/ |
| Max. process+thread count | 6086 (vs. ulimit of 10000) |
| modules | C: hbase-common hbase-server U: . |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1194/2/console |
| versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.

Contributor

@saintstack saintstack left a comment

I think this PR is stale now, @markrmiller. I used it for inspiration (stole from it) while editing our test default-site.xml files over in HBASE-23956.

Should we resolve this PR and its JIRA? (The JIRA keeps getting hit with build comments when PRs are left open.) Thanks.

@saintstack
Contributor

Closing out. I hijacked the changes here over in HBASE-23956.

@saintstack saintstack closed this May 4, 2020