-
Notifications
You must be signed in to change notification settings - Fork 557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow configuring runtime directory #11772
Conversation
This allows users to configure a separate runtime directory. To keep the default behavior, if no runtime directory is configured, it will use the same old location in the data directory.
Test Results 996 files + 1 996 suites +1 1h 50m 1s ⏱️ + 7m 40s Results for commit 575e302. ± Comparison against base commit fce9b13. This pull request removes 432 and adds 804 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results. |
import org.junit.Rule; | ||
import org.junit.Test; | ||
|
||
public class BrokerDifferentRuntimeDirectoryTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ It seems overkill to start the whole broker to test this feature. But I'm not sure how else to test it. Unit testing PartitionFactory is not enough because we don't know if anywhere else runtime location is assumed to be in data directory. If you have other ideas to test please let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💭
Unit testing PartitionFactory is not enough because we don't know if anywhere else runtime location is assumed to be in data directory
We can't predict how configuration is used everywhere, or if anyone has hard-coded paths, that's true. We hard-code that path in our test utilities, for example. At the same time, I don't know if this is what should prevent us from writing a unit test. The PartitionFactory
is the only place where we create the StateController
instance that is used to load the state.
I guess we can't easily enforce that - that only there, we create the StateController
, and the location of the state is only accessed through that. If we could, would you feel more comfortable unit testing the PartitionFactory
?
Couldn't we say the same about the general data directory? I guess you could argue our QA tests ensure that, except since we use default paths, any hard-coded paths would go unnoticed 😅
I think it would be OK to unit test the PartitionFactory
to assert only that the state path is what's expected.
I get what you mean though. We could still have a QA which tests the user feature - that is, using a tmpfs volume for the runtime. That's the main usage goal, right? Although verifying that it's used appropriately is still breaking our abstraction.
The test below is also not testing everything anyway. It only checks the directory is not empty, but doesn't really assert that things are used properly, by the right partitions, etc. We can do a quick sync when you're back to make a final decision 👍
d1b74ce
to
5362f47
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Thanks! Feature seems good, I tested it out and it works as expected. Still have to test it with an actual different volume - did you try a manual benchmark with a separate volume mount?
❌ Could you create a documentation issue around this? Something which covers the expected use case (e.g. with a separate in memory volume), a sample for Kubernetes (should we do general or target Helm chart?), and also covering that any configuration change, when the previous directory was a shared volume (e.g. the data one) needs to be cleaned of the old runtime.
@@ -191,20 +191,28 @@ | |||
# This section allows to configure Zeebe's data storage. Data is stored in | |||
# "partition folders". A partition folder has the following structure: | |||
# | |||
# partition-0 (root partition folder) | |||
# ├── partition.json (metadata about the partition) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💭 Oh wow, this was old 😄
# └── runtime | ||
# └── yy.sst |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# └── runtime | |
# └── yy.sst | |
# └── runtime | |
# └── yy.sst |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess same below. Super minor 🙃
import org.junit.Rule; | ||
import org.junit.Test; | ||
|
||
public class BrokerDifferentRuntimeDirectoryTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💭
Unit testing PartitionFactory is not enough because we don't know if anywhere else runtime location is assumed to be in data directory
We can't predict how configuration is used everywhere, or if anyone has hard-coded paths, that's true. We hard-code that path in our test utilities, for example. At the same time, I don't know if this is what should prevent us from writing a unit test. The PartitionFactory
is the only place where we create the StateController
instance that is used to load the state.
I guess we can't easily enforce that - that only there, we create the StateController
, and the location of the state is only accessed through that. If we could, would you feel more comfortable unit testing the PartitionFactory
?
Couldn't we say the same about the general data directory? I guess you could argue our QA tests ensure that, except since we use default paths, any hard-coded paths would go unnoticed 😅
I think it would be OK to unit test the PartitionFactory
to assert only that the state path is what's expected.
I get what you mean though. We could still have a QA which tests the user feature - that is, using a tmpfs volume for the runtime. That's the main usage goal, right? Although verifying that it's used appropriately is still breaking our abstraction.
The test below is also not testing everything anyway. It only checks the directory is not empty, but doesn't really assert that things are used properly, by the right partitions, etc. We can do a quick sync when you're back to make a final decision 👍
import org.junit.Test; | ||
|
||
public class BrokerDifferentRuntimeDirectoryTest { | ||
private static final String STATE = "state"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Please make this temporary. If I run this locally it will create a state
directory in my working directory, i.e. broker/state
😅 So even if this failed, but I'd run the test before at some point, it will pass. Fine for CI I guess, but it will leave garbage on local machines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙈
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this 98db761, runtime will be determined from the brokerbase. So there is no need to create another temp directory in the test.
I've approved anyway, I don't see a big blocker, but let's sync when you're back before merging. |
5362f47
to
3302e73
Compare
@@ -42,6 +42,9 @@ public final class DataCfg implements ConfigurationEntry { | |||
@Override | |||
public void init(final BrokerCfg globalConfig, final String brokerBase) { | |||
directory = ConfigurationUtil.toAbsolutePath(directory, brokerBase); | |||
if (runtimeDirectory != null) { | |||
runtimeDirectory = ConfigurationUtil.toAbsolutePath(runtimeDirectory, brokerBase); | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@npepinpe This was added after your review. I think it makes sense to add this to be consistent with the data directory configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
Would be interesting to see its impact with normal loads. If it pushes write I/O spikes, then I expect it will impact the general system under load. Nothing too unexpected, but good to confirm. Do we have some idea for a threshold at which the impact becomes noticeable? Though I expect these things will always be relative 🤷 |
Not easy to determine. It hits the throughput drop due to growing rocksdb state. |
Also, in the normal benchmark workload with no growing state, different runtime disk and snapshotting doesn't have any noticeable impact on throughput. The potential impact of configuring a different disk is already documented in the config templates. |
bors merge |
Build succeeded: |
Description
Add new configuration for specifying runtime directory.
The default value for
zeebe.broker.data.runtime
is null. If it is null, runtime is stored in the same location as before in the data directory. If any other value is configured, it will be used in the following directory structure:Related issues
closes #6044