Add index.data_path setting #8819
I've left out documentation for this feature for now, as I suspect the settings/configuration may look different after feedback on this PR. I will write the documentation before merging.
Why is the templating needed? This seems like something a user should never mess with. For example, restoring a snapshotted index would then not work on a cluster that didn't have the same template setup.
The templating is a separate feature, added because we originally discussed shard-level folder settings; it's not required for anything. I agree it does add complexity, and I'd be totally fine with removing it. @clintongormley do you know if there's a reason we need to support custom templating for shard-specific folders, or can we stick with the default template all the time?
@@ -115,6 +115,10 @@ public ActionRequestValidationException validate() {
        if (number_of_replicas != null && number_of_replicas < 0) {
            validationException = addValidationError("index must have 0 or more replica shards", validationException);
        }
        String customTemplate = settings.get("data_template", settings.get(IndexMetaData.SETTING_DATA_PATH_TEMPLATE, null));
Can `data_template` be a constant?
@dakrone I left a bunch of comments, but everything is minor. I also wonder if we should remove the templating for now and maybe make it a separate PR, just to push out the discussion. What do you think?
@s1monw Sounds reasonable to me. I will add a commit to remove it and address the feedback.
da29e79 to 4c715a7
@s1monw I've removed the templating and added a custom data_path usage 30% of the time in
Left one comment; other than that, LGTM.
/**
 * Tests for custom data path locations and templates
 */
@ElasticsearchIntegrationTest.ClusterScope(scope = ElasticsearchIntegrationTest.Scope.SUITE, maxNumDataNodes=1, minNumDataNodes=1)
Why a single data node? If you need that, you can use `numDataNodes=1` instead.
ab221dc to 517736a
I've squashed, rebased (since master has changed heavily), and added a node-level setting. @s1monw, can you take one last look? None of the original logic has been changed.
@dakrone Why do we need one setting controlling another? They are both in yml, so anyone who can change that at the node level can change both?
Oh I see... one is at the index level and one is in the yml. Still, is it really necessary?
As to whether it's necessary: yes, because this is intended for mixed clusters where some indices use the
517736a to 0d31a3b
This allows specifying the path an index will be at. `index.data_path` is specified in the settings when creating an index, and cannot be dynamically changed.

An example request would look like:

    POST /myindex
    {
      "settings": {
        "number_of_shards": 2,
        "data_path": "/tmp/myindex"
      }
    }

And would put data in /tmp/myindex/0/index/0 and /tmp/myindex/0/index/1

Since this can be used to write data to arbitrary locations on disk, it requires enabling the `node.enable_custom_paths` setting in elasticsearch.yml on all nodes.
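The opt-in described above can be pictured with a minimal sketch. This is not the code in this PR: the class `CustomDataPathGate` and its method are hypothetical, plain maps stand in for the Elasticsearch settings objects, and treating an unset `node.enable_custom_paths` as disabled is an assumption inferred from "requires enabling".

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Elasticsearch.
final class CustomDataPathGate {
    // Setting names as given in the PR description.
    static final String INDEX_DATA_PATH = "index.data_path";
    static final String NODE_ENABLE_CUSTOM_PATHS = "node.enable_custom_paths";

    /** Returns the custom data path, or null when the index does not set one. */
    static String customDataPath(Map<String, String> nodeSettings, Map<String, String> indexSettings) {
        String path = indexSettings.get(INDEX_DATA_PATH);
        if (path == null) {
            return null; // the index uses the node's default data location
        }
        // Assumption: the node-level setting defaults to disabled when absent.
        String raw = nodeSettings.get(NODE_ENABLE_CUSTOM_PATHS);
        if (!Boolean.parseBoolean(raw)) {
            throw new IllegalStateException(
                INDEX_DATA_PATH + " is set but " + NODE_ENABLE_CUSTOM_PATHS + " is not enabled on this node");
        }
        return path;
    }

    public static void main(String[] args) {
        Map<String, String> index = Collections.singletonMap(INDEX_DATA_PATH, "/tmp/myindex");
        Map<String, String> node = Collections.singletonMap(NODE_ENABLE_CUSTOM_PATHS, "true");
        System.out.println(customDataPath(node, index));
    }
}
```

Checking at the node level rather than only at index creation reflects the mixed-cluster concern raised in the thread: every node that might host a shard has to agree to write outside its default data directory.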
0d31a3b to b2ec19a
This is having trouble on Windows, so I have reverted it for now while I look into it. I'll open a new PR when it is fixed.
This allows specifying the path an index will be at. `index.data_path` is specified in the settings when creating an index, and cannot be dynamically changed.

An example request would look like:

    POST /myindex
    {
      "settings": {
        "number_of_shards": 2,
        "data_path": "/tmp/myindex"
      }
    }

And would put data in /tmp/myindex/0/index/0 and /tmp/myindex/0/index/1

Since this can be used to write data to arbitrary locations on disk, it requires enabling the `node.enable_custom_paths` setting in elasticsearch.yml on all nodes.

I found that the `NodeEnvironment` abstraction works well for index-specific data paths, and passing the index settings in to the various methods gives us more flexibility in the future with regard to adding any other environment-specific settings.
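
To illustrate the idea of passing index settings into the environment's path methods, here is a rough sketch. It is not the real `NodeEnvironment` API: `NodeEnvSketch`, its constructor, and `shardPath` are hypothetical names. The custom layout reads the example paths above as `<data_path>/<node ordinal>/<index segment>/<shard id>`, which is an interpretation of that example, and the default layout used as a fallback is purely an assumption.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for an environment class; not the Elasticsearch NodeEnvironment.
final class NodeEnvSketch {
    private final Path defaultIndicesRoot; // assumed default location for index data
    private final int nodeOrdinal;         // assumed meaning of the first "0" in the example paths

    NodeEnvSketch(Path defaultIndicesRoot, int nodeOrdinal) {
        this.defaultIndicesRoot = defaultIndicesRoot;
        this.nodeOrdinal = nodeOrdinal;
    }

    /** Resolves a shard directory, honoring index.data_path when the index sets it. */
    Path shardPath(String indexSegment, int shardId, Map<String, String> indexSettings) {
        String custom = indexSettings.get("index.data_path");
        String shard = Integer.toString(shardId);
        if (custom == null) {
            // Assumed default layout: <default root>/<index segment>/<shard id>
            return defaultIndicesRoot.resolve(indexSegment).resolve(shard);
        }
        // Layout read off the example: <data_path>/<node ordinal>/<index segment>/<shard id>
        return Paths.get(custom)
            .resolve(Integer.toString(nodeOrdinal))
            .resolve(indexSegment)
            .resolve(shard);
    }

    public static void main(String[] args) {
        NodeEnvSketch env = new NodeEnvSketch(Paths.get("/var/data/indices"), 0);
        Map<String, String> settings = Collections.singletonMap("index.data_path", "/tmp/myindex");
        for (int shardId = 0; shardId < 2; shardId++) {
            System.out.println(env.shardPath("index", shardId, settings));
        }
    }
}
```

Because the settings travel with each call, a later environment-specific setting would only need a new lookup inside `shardPath` rather than a new method signature.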