HBASE-24271 Set values in conf/hbase-site.xml that enable running on LocalFileSystem out of the box #1597
I tried to include everyone who brought opinions to the previous PRs, Jiras, and dev thread.
9797287 to 5f7bd77
I'm surprised to not see the switch to change file:// to refer to RawLocalFileSystem instead of LocalFileSystem here. I assume that means that the system generally works OK? My memory is that doing WAL replay will cause trouble w/o RawLocalFS.
If you're just laying groundwork, that's also fine.
Nits. See below. Appreciate the edit on startup section.
hbase-common/src/main/java/org/apache/hadoop/hbase/util/CommonFSUtils.java
final boolean value = false;
LOG.warn("Cannot enforce durability guarantees while running on {}. Setting {}={} for"
  + " this FileSystem.", fs.getUri(), UNSAFE_STREAM_CAPABILITY_ENFORCE, value);
fs.getConf().setBoolean(UNSAFE_STREAM_CAPABILITY_ENFORCE, value);
Hurray
@@ -426,7 +386,7 @@ You can stop HBase the same way as in the <<quickstart,quickstart>> procedure, u
 [[quickstart_fully_distributed]]
-=== Advanced - Fully Distributed
+=== Fully Distributed for Production
Good
🎊 +1 overall
This message was automatically generated.
Nope, not touching any of that. I have no intention of making a |
Will make the above change, and restore all the cleanup you guys liked from the reverted commits.
conf/hbase-site.xml
-->
<property>
  <name>hbase.tmp.dir</name>
  <value>./tmp</value>
Hadoop Configuration supports environment variables. Let me make this even more explicit by changing this value to ${env.HBASE_HOME:-.}/tmp.
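To illustrate the `${env.HBASE_HOME:-.}/tmp` value suggested above, here is a minimal, self-contained Java sketch of how such a token could resolve. It is not Hadoop's implementation: `EnvExpansionSketch`, `expand`, and `resolve` are hypothetical names, and the environment is passed in as a plain map so the behavior is easy to see. The assumed semantics follow the shell-like convention documented for Hadoop's `Configuration`: `${env.NAME:-default}` falls back when the variable is unset or empty, while `${env.NAME-default}` falls back only when it is unset.

```java
import java.util.Map;

/**
 * Illustrative sketch only; NOT Hadoop's code. Class and method names are hypothetical.
 * Mimics the ${env.NAME}, ${env.NAME:-default}, and ${env.NAME-default} forms.
 */
class EnvExpansionSketch {

  /** Expands every ${env...} token in value, looking names up in the supplied map. */
  public static String expand(String value, Map<String, String> env) {
    StringBuilder out = new StringBuilder();
    int pos = 0;
    while (true) {
      int start = value.indexOf("${env.", pos);
      int end = start < 0 ? -1 : value.indexOf('}', start);
      if (start < 0 || end < 0) {
        // No further complete token: copy the remainder verbatim.
        return out.append(value.substring(pos)).toString();
      }
      out.append(value, pos, start);
      String resolved = resolve(value.substring(start + 6, end), env);
      // Leave the token untouched when the variable is unset and no default is given.
      out.append(resolved != null ? resolved : value.substring(start, end + 1));
      pos = end + 1;
    }
  }

  /** Resolves the text between "${env." and "}"; returns null if nothing applies. */
  public static String resolve(String expr, Map<String, String> env) {
    int dash = expr.indexOf('-');
    if (dash < 0) {
      return env.get(expr); // plain ${env.NAME}
    }
    // ":-" form: default applies when NAME is unset OR empty.
    // "-" form: default applies only when NAME is unset.
    boolean colonForm = dash > 0 && expr.charAt(dash - 1) == ':';
    String name = expr.substring(0, colonForm ? dash - 1 : dash);
    String fallback = expr.substring(dash + 1);
    String v = env.get(name);
    if (v == null || (colonForm && v.isEmpty())) {
      return fallback;
    }
    return v;
  }

  public static void main(String[] args) {
    String raw = "${env.HBASE_HOME:-.}/tmp";
    // HBASE_HOME unset: falls back to the current directory.
    System.out.println(expand(raw, Map.of())); // prints ./tmp
    // HBASE_HOME set: tmp dir lands under the install root (example path is made up).
    System.out.println(expand(raw, Map.of("HBASE_HOME", "/opt/hbase"))); // prints /opt/hbase/tmp
  }
}
```

With `HBASE_HOME` unset, the value resolves relative to the working directory, which matches the earlier `./tmp` setting; with it set, the temporary directory no longer depends on where the process was started.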
5f7bd77 to 69a9386
@joshelser @busbey other thoughts here?
You're doing good work :)
I think the natural progression is to keep pulling on this thread. With LocalFileSystem, things will assuredly be finicky around stop/start of RegionServers (especially by our tests) with the lack of that hflush. I understand if that's not your goal now, but it sounds like it could be a big win to try to flip some tests over and run with less overhead (from minidfscluster). If not you, someone else we can convince, maybe :)
I am not willing to veto this change. I agree that it is an improvement to ship configuration settings that get this behavior rather than hard code a change when we see LocalFileSystem.
@@ -426,7 +386,7 @@ You can stop HBase the same way as in the <<quickstart,quickstart>> procedure, u
 [[quickstart_fully_distributed]]
-=== Advanced - Fully Distributed
+=== Fully Distributed for Production
 In reality, you need a fully-distributed configuration to fully test HBase and to use it in real-world scenarios.
 In a distributed configuration, the cluster contains multiple nodes, each of which runs one or more HBase daemon.
We should make clear in this paragraph that this quickstart covers a necessary but not sufficient topic for a system that is "production ready."
Just to be clear that we are only explaining how to get a distributed HBase on top of a distributed filesystem and there exist other topics that need to be considered for a production deployment. Folks should still read through the section "The Important Configurations" and should ensure they have monitoring of metrics and log aggregation, for example.
Maybe drop my addition of "for Production" then? I thought it odd that our default operating configuration was flagged as "advanced". I agree there's more to a production deploy than just these configs. However, I also don't think it's our job to be prescriptive of what a production deployment looks like, re: monitoring, metrics, log aggregation, &c. I think anything more than a paragraph or two of "recommended" or "strongly encouraged" supporting infrastructure is beyond the scope of our document.
Any further input re: what you would like to see for a +1, @busbey? It sounds to me from your comment that there's a last mile that I'm missing. I'd rather put this to bed than need to revisit it.
69a9386 to f825b91
…n `LocalFileSystem` out of the box

Simplify the new user experience by shipping a configuration that enables a fresh checkout or tarball distribution to run in standalone mode without direct user configuration. This change restores the behavior we had when running on Hadoop 2.8 and earlier.

Patch for master includes an update to the book. This change will be omitted when backporting to earlier branches.

Signed-off-by: stack <stack@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>
f825b91 to 9e8107b
💔 -1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
Simplify the new user experience by shipping a configuration that enables
a fresh checkout or tarball distribution to run in standalone mode
without direct user configuration. This change restores the behavior
we had when running on Hadoop 2.8 and earlier.
Patch for master includes an update to the book. This change will be
omitted when backporting to earlier branches.