Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forgot to install a plugin can cause custom cluster state parts to be removed #21471

Open
martijnvg opened this issue Nov 10, 2016 · 9 comments
Assignees
Labels
>bug :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. stalled Team:Distributed Meta label for distributed team

Comments

@martijnvg
Copy link
Member

If a plugin adds custom cluster state parts and the plugin isn't installed on all nodes then it could cause custom metadata to be removed by a node just by reading it.

Brought up while reviewing #21243.

@martijnvg martijnvg added the >bug label Nov 10, 2016
@martijnvg martijnvg changed the title Forgetting install plugin could cause custom cluster state removal Forgot to install a plugin can cause custom cluster state parts to be removed Nov 10, 2016
@clintongormley
Copy link

On the other hand, if you uninstall a plugin, it's the only way of removing its custom metadata. Perhaps we should just recommend using the required plugins setting?

@bleskes
Copy link
Contributor

bleskes commented Nov 10, 2016

my kneejerk reaction in the review was to refuse starting when you can't read a local cluster state due to the lack of a plugin and make the plugin manager to remove/archive pieces of the cluster state. This is however not trivial at all as the plugin manager need to reach into ES storage which is something we didn't do before (and may require custom knowledge in the future where current plans will make it more complicated). There is also the question if we want the plugin manager to "archive" this like we do with other config.

Perhaps we should just recommend using the required plugins setting?

yeah.. it's fairly hidden now (it took a couple of iteration to find it, knowing it exists.) Maybe we should promote it to a note in the introductions section or the management intro of the plugins.

@martijnvg
Copy link
Member Author

Right, but that doesn't protected against unintended removal of custom metadata. Maybe we should go even further and require that plugins are also specified in the plugin.mandatory setting? (if a plugin isn't in the this list then we abort starting a node)

@clintongormley clintongormley added :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. and removed :Cluster labels Feb 13, 2018
@bleskes
Copy link
Contributor

bleskes commented Mar 20, 2018

We are working on addressing this as part of the node validaing in zen 2. @ywelsch @DaveCTurner do you have this captured in another place so we can close the issue?

@bleskes
Copy link
Contributor

bleskes commented Apr 24, 2018

@DaveCTurner ping re ^^ ?

@DaveCTurner
Copy link
Contributor

The approach we're currently investigating for Zen 2 now looks like it'll be fairly independent of this issue: it won't fix this, but nor will it significantly affect the work required to fix this.

@rjernst rjernst added the Team:Distributed Meta label for distributed team label May 4, 2020
@DaveCTurner DaveCTurner self-assigned this Jul 29, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner
Copy link
Contributor

DaveCTurner commented Aug 1, 2022

This is still a legitimate issue, we leniently ignore unknown fields when reading the Metadata from XContent:

try {
Custom custom = parser.namedObject(Custom.class, currentFieldName, null);
builder.putCustom(custom.getWriteableName(), custom);
} catch (NamedObjectNotFoundException ex) {
logger.warn("Skipping unknown custom object with type {}", currentFieldName);
parser.skipChildren();
}

Since it is now possible to manually remove named customs from the on-disk metadata (via elasticsearch-node remove-customs), I think we could reasonably remove this lenience and make the node fail to start if it can't read its whole metadata.

NB to avoid having to do manual surgery on the cluster state would require uninstallable plugins to support removing their custom metadata whilst still installed. Today I think there are no known uninstallable plugins, nor is there even a stable API by which plugins can manipulate their parts of the cluster state. I'll raise this point with the plugins folks.

@DaveCTurner DaveCTurner added :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. and removed :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. labels Aug 1, 2022
@DaveCTurner
Copy link
Contributor

I looked into how disruptive it might be to remove this lenience and found at least one blocker, see #88983.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. stalled Team:Distributed Meta label for distributed team
Projects
None yet
Development

No branches or pull requests

6 participants