New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add path.shared_data
#12729
Add path.shared_data
#12729
Conversation
.put(InternalSettingsPreparer.IGNORE_SYSTEM_PROPERTIES_SETTING, true) // make sure we get what we set :) | ||
.put(ClusterName.SETTING, InternalTestCluster.clusterName("single-node-cluster", randomLong())) | ||
.put("path.home", createTempDir()) | ||
.put("path.shared_data", createTempDir().getParent()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the .getParent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted it to be a level higher than the temp directory, to capture the directory that other temp dirs are created in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not create one temp dir, and then a subdir off of that? There is nothing that guarantees tempdirs are all created side by side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the directory set for path.shared_data
and the custom directories set during tests when indices are created are uncoupled, if I set it to a specific directory in advance, every test (including where we randomly add a custom data path) would be required to know what the shared data path was already set to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like it better if custom data paths were explicitely created in a specific drectory. However it's an existing issue that your PR does not introduce, so could you just add a comment explaining why you set the custom data path this way and add a TODO to explicitely set custom index paths to be sub directories?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
@jpountz pushed new commits for your comments |
if (settings.get("path.shared_data") != null) { | ||
sharedDataFile = PathUtils.get(cleanPath(settings.get("path.shared_data"))); | ||
} else { | ||
sharedDataFile = homeFile.resolve("data"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it resolve to the same as path.data instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same as path.data
? Or do you mean it should append the cluster name onto it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean if path.data is set eg. in the config/elasticsearch.yml to be another location
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe sharedDataFile should just be null if not set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, part of the reason is that path.data
can be an array, and so I didn't want to just randomly pick one of the paths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to set it as null because if we remove custom_paths_enabled
we'll now get a NullPointerException if someone tries to use it. I think if we remove custom_paths_enabled
we can set this to a better default if you're concerned about it using the data path. Maybe just ${es.path.home}/data/custom
or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then maybe it should be null when not set, otherwise we request an unnecessary permission to the security manager, while we should be requesting as few permissions as possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can make it null and add better messaging in the validation I think.
@jpountz pushed a change to make |
@@ -300,7 +300,7 @@ public void run() { | |||
@Test | |||
public void testCustomDataPaths() throws Exception { | |||
String[] dataPaths = tmpPaths(); | |||
NodeEnvironment env = newNodeEnvironment(dataPaths, Settings.EMPTY); | |||
NodeEnvironment env = newNodeEnvironment(dataPaths, "/tmp", Settings.EMPTY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just checking that this test does not create files in this dir, otherwise we should rather use the java.io.tmpdir?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't create any files here, it just checks and resolves paths (unit test only)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
This looks good to me, but I'd like that someone else gives it a look too. @rjernst maybe you? |
LGTM too. Do you think #12776 is quick enough to get in today, so we can remove the old setting altogether for 2.x (since users will be forced to add path.shared_data anyways?) |
Yeah, this should be a really quick change. |
0b9ac50
to
6a9c86f
Compare
This allows `path.shared_data` to be added to the security manager while still allowing a custom `data_path` for indices using shadow replicas. For example, configuring `path.shared_data: /tmp/foo`, then created an index with: ``` POST /myindex { "index": { "number_of_shards": 1, "number_of_replicas": 1, "data_path": "/tmp/foo/bar/baz", "shadow_replicas": true } } ``` The index will then reside in `/tmp/foo/bar/baz`. `path.shared_data` defaults to `null` if not specified. Resolves elastic#12714 Relates to elastic#11065
6a9c86f
to
ff5ad39
Compare
This allows
path.shared_data
to be added to the security manager whilestill allowing a custom
data_path
for indices using shadow replicas.For example, configuring
path.shared_data: /tmp/foo
, then created anindex with:
The index will then reside in
/tmp/foo/bar/baz
.path.shared_data
defaults to${path.home}/data
if not specified.Resolves #12714
Relates to #11065