Default max local storage nodes to one #19964

Merged
jasontedor merged 3 commits into elastic:master from the default-max-local-storage-nodes branch on Aug 12, 2016

Conversation

@jasontedor (Member) commented Aug 11, 2016

This commit defaults the max local storage nodes to one. The motivation for this change is that a default value greater than one is dangerous, as users sometimes end up unknowingly starting a second node and then think that they have encountered data loss.

Closes #19679, supersedes #19748
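For readers skimming the thread: node.max_local_storage_nodes controls how many nodes may share the directories under path.data, and after this change an unintended second node started against the same path fails instead of silently moving on to data/nodes/1. Anyone who actually wants several nodes on one path now has to opt in explicitly in elasticsearch.yml; a minimal sketch of that opt-in (the value 2 is only an illustration):

```yaml
# Allow up to two nodes to share the data directories under path.data.
# The default after this change is 1, so an accidental second node fails
# to start instead of quietly claiming data/nodes/1.
node.max_local_storage_nodes: 2
```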

Commit message:

This commit defaults the max local storage nodes to one. The motivation for this change is that a default value greater than one is dangerous, as users sometimes end up unknowingly starting a second node and then think that they have encountered data loss.
```diff
@@ -261,6 +261,7 @@ class ClusterFormationTasks {
         'node.attr.testattr'           : 'test',
         'repositories.url.allowed_urls': 'http://snapshot.test*'
     ]
+    esConfig['node.max_local_storage_nodes'] = node.config.numNodes
```
A reviewer (Member) commented on the added line:

Cute.

Commit message:

This commit adjusts the max local storage nodes setting for some tests:
 - the value provided to ESIntegTestCase is derived from the values of the annotations
 - the default value for the InternalTestCluster is calculated more carefully
 - the value for the tribe unit tests is adjusted to reflect that there are two clusters in play
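For context, here is a minimal hypothetical sketch of what an explicit per-test override of this setting looks like via the standard ESIntegTestCase hook; the actual commit derives the value from the test's annotations rather than hard-coding it, and the class name and value below are illustrative only:

```java
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.test.ESIntegTestCase;

// Hypothetical test class, shown only to illustrate the override point.
public class SharedDataPathIT extends ESIntegTestCase {
    @Override
    protected Settings nodeSettings(int nodeOrdinal) {
        return Settings.builder()
                .put(super.nodeSettings(nodeOrdinal))
                // let the nodes of this test cluster share one data path;
                // 2 is an arbitrary illustrative value
                .put("node.max_local_storage_nodes", 2)
                .build();
    }
}
```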
@rmuir (Contributor) commented Aug 12, 2016

> The motivation for this change is that a default value greater than one is dangerous as users sometimes end up unknowingly starting a second node and then think that they have encountered data loss.

Also, when locks are lost, the exception is masked and dropped on the floor and ES keeps on trucking. Oh, and did I mention they are filesystem locks?

Seems legitimately unsafe.

Commit message:

This commit simplifies the handling of max local storage nodes in integration tests by just setting the default max local storage nodes to be the maximum possible integer.
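In spirit, that simplification amounts to a test-infrastructure default along these lines; this is a sketch assuming the defaults are assembled with the regular Settings builder, not a quote of the actual change:

```java
import org.elasticsearch.common.settings.Settings;

class TestClusterDefaults {
    // Illustrative only: a default that lets any number of test nodes share
    // one data path, so no per-test bound needs to be computed.
    static Settings maxLocalStorageDefault() {
        return Settings.builder()
                .put("node.max_local_storage_nodes", Integer.MAX_VALUE)
                .build();
    }
}
```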
@nik9000 (Member) commented Aug 12, 2016

LGTM

@mikemccand (Contributor) commented
Can we just remove this feature entirely, so that users must always be explicit, when starting a node, about precisely which path.data that node gets to use, and that it must not be in use by any other node?

I think this is too much magic on ES's part, trying to have multiple nodes share path.data entries, and it has hit me (well, my brother, who then asked me WTF was happening) personally when he accidentally started up another node on the same box.

@jasontedor (Member, Author) commented
@mikemccand Doesn't this get us there? Now you have to be explicit about wanting multiple nodes that share the same path.data. Removing the feature doesn't buy us much from a code perspective (we still need the locking code anyway), and it will complicate the build.

@jasontedor jasontedor merged commit 1f0673c into elastic:master Aug 12, 2016
@jasontedor jasontedor deleted the default-max-local-storage-nodes branch August 12, 2016 13:26
@jasontedor (Member, Author) commented
Thanks for reviewing @nik9000. I've merged this @mikemccand, but that shouldn't stop discussion on your proposal!

@mikemccand (Contributor) commented
> @mikemccand Doesn't this get us there? Now you have to be explicit about wanting multiple nodes that share the same path.data.

This is definitely an improvement (thank you! progress not perfection!), but what I'm saying is I don't think such dangerous magic should even be an option in ES.

@rmuir (Contributor) commented Aug 12, 2016

Personally, I feel the design is broken anyway, as I've said over and over again. It relies on filesystem locking, which is unreliable by definition.

But worse, it's lenient. Start up ES, index some docs, and go nuke that node.lock. You can continue about your business, continue indexing docs, and so on. The only thing you will see is this:

```
[2016-08-12 09:37:25,300][WARN ][env                      ] [_Oe9-bX] lock assertion failed
java.nio.file.NoSuchFileException: /home/rmuir/workspace/elasticsearch/distribution/zip/build/distributions/elasticsearch-5.0.0-alpha6-SNAPSHOT/data/nodes/0/node.lock
```

Why is such an important exception dropped on the floor and merely translated into a logger WARN?

This feature is 100% unsafe.
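The pattern being objected to, sketched in hypothetical form (the class, method, and path here are illustrative, not the actual NodeEnvironment code): the lock assertion catches the I/O failure and downgrades it to a WARN log, so callers carry on as if the data directory were still exclusively owned:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.logging.Logger;

// Hypothetical illustration of the lenient behavior described above: a missing
// node.lock surfaces only as a WARN, and execution simply continues.
class LenientLockCheck {
    private static final Logger logger = Logger.getLogger("env");
    private final Path lockFile;

    LenientLockCheck(Path nodePath) {
        this.lockFile = nodePath.resolve("node.lock");
    }

    void assertEnvIsLocked() {
        try {
            if (Files.notExists(lockFile)) {
                throw new IOException("lock file missing: " + lockFile);
            }
        } catch (IOException e) {
            // the important exception is dropped on the floor here
            logger.warning("lock assertion failed: " + e);
        }
    }

    public static void main(String[] args) {
        // nuke data/nodes/0/node.lock and this still returns normally
        new LenientLockCheck(Paths.get("data/nodes/0")).assertEnvIsLocked();
    }
}
```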

@nik9000 (Member) commented Aug 12, 2016

I'm ok with removing the feature altogether. If we're already breaking backwards compatibility with the setting maybe we can just kill it? @clintongormley, what do you think?

I like that we did this now, though, because I have a feeling that even if we do decide to remove the feature, it'll take some time, since lots of tests and the Gradle build rely on it.

@rjernst (Member) commented Aug 12, 2016

The Gradle build does not depend on it. Integ tests have unique installations per node, and even fantasy-land tests create a unique temp dir per node for path.home, IIRC.

javanna added a commit to javanna/elasticsearch that referenced this pull request Nov 10, 2016
Given that the default is now 1, the comment in the config file was outdated. Also considering that the default value is production ready, we shouldn't list it among the values that need attention when going to production.

Relates to elastic#19964
javanna added a commit that referenced this pull request Nov 10, 2016
Given that the default is now 1, the comment in the config file was outdated. Also considering that the default value is production ready, we shouldn't list it among the values that need attention when going to production.

Relates to #19964