Elasticsearch 8.11.1 RPM restarts process during update #102103
Comments
I've also run into this issue with the .deb installer. Normally I would preinstall the upgrade and then restart the nodes in an orderly fashion; however, this upgrade caused my entire cluster to restart at the same time on 8.11.1, and required manual intervention.
Yep, that's my exact workflow as well. Glad I started on a cluster where it didn't really matter.
Pinging @elastic/es-delivery (Team:Delivery)
I was unable to replicate this when upgrading 8.10.0 to 8.11.1. The only scenario in which the service should be restarted on upgrade is if RESTART_ON_UPGRADE=true is set in the package's sysconfig file. See: https://www.elastic.co/guide/en/elasticsearch/reference/current/rpm.html#rpm-configuring
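The linked page documents the RESTART_ON_UPGRADE setting in the package's sysconfig file. A quick way to check whether a node has it enabled (file paths per the packaging docs; the helper name is mine):

```shell
# check_restart_on_upgrade FILE
# Prints the RESTART_ON_UPGRADE line from a package sysconfig file,
# or a note that it is unset (the packages default it to false).
# RPM systems: /etc/sysconfig/elasticsearch
# deb systems: /etc/default/elasticsearch
check_restart_on_upgrade() {
  grep '^RESTART_ON_UPGRADE' "$1" 2>/dev/null \
    || echo "RESTART_ON_UPGRADE not set (package default: false)"
}

check_restart_on_upgrade /etc/sysconfig/elasticsearch
```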
The ones described above went from 8.11.0 to 8.11.1. I've since upgraded one additional cluster. It came from 8.10.4 (I think? – it was on the last 8.10.x version) and didn't have this issue.
I have the same, but in …
I wasn't able to reproduce this going from 8.11.0 to 8.11.1 either. There's nothing in our packaging scripts that would call …
Here's what I have: https://gist.github.com/ceeeekay/8e407092ef24ac89dec897c1c1748e1a It seems to be looking for …
This sounds suspiciously like the server actually errored during the upgrade and systemd attempted to restart it. It looks like we tried to load a class from a jar that no longer exists because it was replaced during the upgrade. This is likely a real problem. @rjernst Thoughts here?

This was likely always a problem. We don't really support in-place distribution upgrades, and we may even document that somewhere, but an in-place upgrade is exactly what a Linux package upgrade does. I'm wondering whether doing a package upgrade on a running node should be supported at all, or whether we should stop, upgrade, then restart to avoid this scenario.

Second, I'm wondering whether recent changes have made this more likely. Are we loading service providers more often, or not caching the results when we probably should? I would expect issues with loading classes from jars on disk to be rare for nodes that have been running for any length of time. It does indeed look like there were some recent changes in 8.11 in the specific stack trace linked above.
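Until that's decided, one defensive workaround is to take each node out of service before the package upgrade, so the running JVM never has its jars swapped underneath it. A sketch for RPM systems (commands are assumptions, not an official procedure; use apt-get on deb systems, and wait for cluster health between nodes):

```shell
# Sketch: stop the node, upgrade the package, start the node again.
# RUN='' executes the commands for real; RUN=echo does a dry run
# that only prints each step.
upgrade_node() {
  $RUN systemctl stop elasticsearch.service
  $RUN yum update -y elasticsearch      # dnf/apt-get on other systems
  $RUN systemctl start elasticsearch.service
}

RUN=echo upgrade_node   # dry run: print the three steps
```

Repeating this node by node is the usual rolling-upgrade shape; the point is only that the stop happens before the package replaces any files.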
FWIW in several years of upgrading ES I have never seen a node stop during the package upgrade, but it happened simultaneously to 12 nodes this time (I lost all my data nodes, one of my masters, and two ingest nodes). Usually, by the time I've finished all my rolling restarts, the last node will have been running fine for an hour after the package upgrade occurred.
Updating this to mention that the same is also happening for me with the upgrade to 8.11.2-1. Update: it also seems to be stumbling over this one: …
Looking up component versions through SPI should not change. This commit captures the component versions of the running node once during startup, rather than every time node info is called. closes elastic#102103
Since loading classes could in theory happen at any time, we won't ever be able to stop this error from occurring completely. However, in this specific case, I think it is more likely to happen now because the component versions are loaded via SPI every time the node info API is called. I've opened #103408 to fix that.
Hi, I got the same problem today upgrading from 8.11.1 to 8.11.3. I didn't have this issue on my last upgrade, from 8.10.4 to 8.11.1. Error from journalctl: …
Error from the elasticsearch server log at that same time: …
Looking up component versions through SPI should not change. This commit captures the component versions of the running node once during startup, rather than every time node info is called. closes #102103
The core issue causing the exception should be fixed in 8.12, but this still means that folks upgrading from 8.11.x to 8.12.x will be susceptible to this. Something to keep in mind.
Elasticsearch Version
8.11.1
Installed Plugins
No response
Java Version
bundled
OS Version
Rocky 9.2 5.14.0-70.26.1.el9_0.x86_64
Problem Description
While updating several nodes from ES 8.11.0 to 8.11.1, I noticed that the update tried to restart the elasticsearch process. In previous versions this was not the case; restarting was a separate manual step. Additionally, the automatic restart failed on every node I updated. A manual
systemctl restart elasticsearch
worked, however, and the nodes are back up.
Steps to Reproduce
Install the elasticsearch-8.11.1 RPM from
https://artifacts.elastic.co/packages/8.x/yum/
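For anyone reproducing this, a yum repo definition for that URL looks roughly like the following (contents assumed from the standard Elastic install guide, not copied from this report):

```ini
; /etc/yum.repos.d/elasticsearch.repo (assumed contents)
[elasticsearch]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=0
autorefresh=1
type=rpm-md
```

With that in place, `yum --enablerepo=elasticsearch install elasticsearch` installs the package, and running the equivalent `update` against a node whose service is still running is the in-place upgrade path described above.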
Logs (if relevant)