Can't running elasticsearch after update ubuntu and java #28349

Closed
longcnttbkhn opened this issue Jan 24, 2018 · 25 comments

@longcnttbkhn

Describe the feature: I can't run Elasticsearch after updating Ubuntu and Java.

Elasticsearch version (bin/elasticsearch --version): 2.4.1

Plugins installed: []

JVM version (java -version): OpenJDK 1.8.0_151

OS version (uname -a if on a Unix-like system): Linux 4.13.0-31-generic #34~16.04.1-Ubuntu x86_64 GNU/Linux

Description of the problem including expected versus actual behavior: Normally I can start Elasticsearch with the ./elasticsearch command. This morning, after updating Ubuntu and Java, Elasticsearch no longer starts.

@jpountz
Contributor

jpountz commented Jan 24, 2018

Please describe the problem you are having, otherwise it's very hard to help.

@exorbit

exorbit commented Jan 24, 2018

I have had the same problem since an OS update this morning. Before the update everything was fine.

Elasticsearch version (bin/elasticsearch --version): 2.4.6

JVM version (java -version): 1.8.0_131

OS version (uname -a if on a Unix-like system): Linux dellXPS 4.13.0-31-generic #34-Ubuntu 17.10

Description of the problem including expected versus actual behavior:
./elasticsearch did nothing. No message, no log entry.

I noticed that the update also installed a new version of Intel's microcode:
intel-microcode 3.20180108.0+really20170707ubuntu17.10.1

@simonwillnauer

Can you folks give us some logging information? Otherwise everything is a shot in the dark.

@exorbit

exorbit commented Jan 24, 2018

There is no logging anymore. I start ./elasticsearch in a terminal and after a second it returns to the prompt without any message. The log files stay empty, even at DEBUG level.
I also tried running Java directly, without any extra JVM params, and it also exits without a message:
/usr/lib/jvm/java-8-oracle/bin/java -Dfile.encoding=UTF-8 -Des.path.home=/home/xxx/elasticsearch-2.4.6 -cp /home/xxx/programs/elasticsearch-2.4.6/lib/elasticsearch-2.4.6.jar:/home/xxx/elasticsearch-2.4.6/lib/* org.elasticsearch.bootstrap.Elasticsearch start
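When a process returns to the prompt without printing anything, its exit status is often the only clue left. The following is a minimal Python sketch, not part of the original report, that decodes a return code into a readable description. The helper names `describe_exit` and `run_and_describe` are hypothetical; the only behavior relied on is that, on POSIX, `subprocess` reports a child killed by signal N as return code -N.

```python
import signal
import subprocess


def _signal_name(number):
    """Return the symbolic name for a signal number, or a fallback string."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return f"signal {number}"


def describe_exit(returncode):
    """Describe a subprocess return code in plain words."""
    if returncode < 0:
        # On POSIX, subprocess reports "terminated by signal N" as -N.
        return f"killed by {_signal_name(-returncode)}"
    if returncode > 128:
        # Shells conventionally report a signal death of a child as 128 + N.
        name = _signal_name(returncode - 128)
        return f"exited with status {returncode} (a shell would report {name} this way)"
    return f"exited with status {returncode}"


def run_and_describe(command):
    """Run a command (a list of arguments) and describe how it ended."""
    result = subprocess.run(command)
    return describe_exit(result.returncode)
```

For example, `run_and_describe(["./elasticsearch"])` would show whether the launcher exited normally or was terminated by a signal, which narrows down where to look next.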

@reggieb

reggieb commented Jan 24, 2018

Sorry - when I started writing my bug report there was only the first entry and I wasn't sure this was the same bug. But looking at the two more recent entries, I think it is. I do have a related system log error. See #28354

@jasontedor
Member

I think you should check your kernel log messages (e.g., dmesg).
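As a rough illustration of that suggestion, here is a small Python sketch that filters kernel log output for lines that usually indicate a kernel-level fault. The marker strings and the helper names (`suspicious_lines`, `read_kernel_log`) are assumptions chosen for illustration, not something taken from this thread, and the exact wording of kernel messages varies between kernel versions.

```python
import subprocess

# Assumed marker substrings; adjust them to the messages your kernel emits.
MARKERS = ("NX-protected", "BUG:", "Oops", "segfault")


def suspicious_lines(kernel_log, markers=MARKERS):
    """Return the log lines that contain at least one marker substring."""
    return [
        line
        for line in kernel_log.splitlines()
        if any(marker in line for marker in markers)
    ]


def read_kernel_log():
    """Return the output of `dmesg` as text (may require root on some systems)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return result.stdout
```

Calling `suspicious_lines(read_kernel_log())` right after a failed start keeps only the lines worth pasting into a bug report.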

@exorbit

exorbit commented Jan 24, 2018

Yep, this is the same issue with the same kernel output.
So, meltdown has killed my ES!

@jasontedor
Member

Sorry, this is not an Elasticsearch issue then. I suggest rolling back your kernel.

@exorbit

exorbit commented Jan 24, 2018

I also suspect the microcode update from intel is involved.

@longcnttbkhn
Author

Thanks for your support! For now I am using the Elasticsearch Docker image instead, and I hope the kernel gets fixed soon.

@jasontedor
Member

The kernel is that of your underlying host, it does not have anything to do with the image.

@dadoonet
Member

@exorbit

exorbit commented Jan 24, 2018

Booting Linux with the previous kernel, 4.13.0-25-generic, makes ES work fine again!

@centic9
Contributor

centic9 commented Jan 24, 2018

Same for us, 4.13.0-25-generic works, upgrading to 4.13.0-31-generic fails.

@chaudum
Contributor

chaudum commented Jan 24, 2018

The crash is triggered by a call to a non-existent syscall; that call was removed in ES 5.6.4.
This is the commit that fixes the bug: b1d5e85

@exorbit

exorbit commented Jan 24, 2018

Any chance to have this fix in ES 2?

@s1monw
Contributor

s1monw commented Jan 24, 2018

> Any chance to have this fix in ES 2?

We won't release another version of 2.x.

@jasontedor
Member

I am sorry, but that is not the issue. It might be a workaround for the issue (I have not verified), but it is not the bug. The bug here is in the kernel. The error message says that the kernel tried to execute a page marked NX, which the kernel should never do. Invoking a non-existent system call should never result in the kernel trying to execute a page marked NX. Instead, it should return -ENOSYS (see sys_ni_syscall in the kernel), except in Docker, where the process gets a SIGSYS if a system call filter catches the system call.
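The expected behavior described above can be observed from user space with a short, Linux-only Python sketch. It is an illustration under stated assumptions, not the check Elasticsearch performs: the syscall number `100_000` is an arbitrary value assumed to be unassigned, and `probe_syscall` is a hypothetical helper. From user space the kernel's -ENOSYS shows up as a return value of -1 with `errno` set to `ENOSYS`; under a system call filter the result may differ (for example `EPERM`, or the process may receive a signal).

```python
import ctypes
import errno

# Assumption: an arbitrary number well above any assigned Linux syscall number.
UNASSIGNED_SYSCALL_NR = 100_000


def probe_syscall(number):
    """Invoke raw syscall `number` with no arguments; return (result, errno)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ctypes.set_errno(0)
    result = libc.syscall(ctypes.c_long(number))
    return result, ctypes.get_errno()


def errno_name(code):
    """Return the symbolic errno name, e.g. 'ENOSYS', or the number as text."""
    return errno.errorcode.get(code, str(code))
```

On a healthy kernel without a syscall filter, `probe_syscall(UNASSIGNED_SYSCALL_NR)` is expected to return -1 together with `errno.ENOSYS`, and the process simply carries on.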

@shirgall

We're looking into this; the linux-azure kernel also has a similar issue. The Meltdown and Spectre changes may have introduced a regression.

@jasontedor
Member

Thanks @shirgall.

@skif48

skif48 commented Jan 25, 2018

@chaudum indeed, I gotta say that upgrading to 5.6.4 fixed the problem.

@jasontedor
Member

No, it didn’t. You have a buggy kernel; how can you trust it to do anything correctly?

@deppi

deppi commented Jan 25, 2018

If anyone is working with GCP instances, using Ubuntu 16.04 LTS, I downgraded the kernel from:

uname -a
Linux elasticsearch-1-vm 4.13.0-1007-gcp #10-Ubuntu SMP Fri Jan 12 13:56:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

To:

uname -a
Linux elasticsearch-1-vm 4.13.0-1006-gcp #9-Ubuntu SMP Mon Jan 8 21:13:15 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

To fix the issue with the GCP instances, I ran:

sudo apt remove 4.13.0-1007-gcp
sudo apt install 4.13.0-1006-gcp
exit

Then restart the instance in the Google Cloud console, SSH back in, and run:
sudo service elasticsearch start

@shirgall

shirgall commented Jan 27, 2018 via email

openstack-gerrit pushed a commit to openstack/openstack-helm-infra that referenced this issue Jan 29, 2018
There was a change in the upstream reference httpd image for
apache that changed how modules were built for apache.
This change adds the required fix to accommodate the change.
See issue here docker-library/httpd#87

The Elasticsearch image tag was updated to accommodate the kernel
versions used in the gate as part of the kernel update playbook
See elastic/elasticsearch#28349 (comment)

The openstack-exporter binary was changed to reflect changes made
to the openstack-exporter image

Change-Id: I1deb9e7cde794421dd33fade566c2a9fdb5007e6
@centic9
Contributor

centic9 commented Jan 29, 2018

FYI, the latest Ubuntu 17.10 kernel, version 4.13.0-32, seems to have fixed this.
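To summarize the kernel versions mentioned in this thread, here is a small Python sketch that looks up a kernel release string against those reports. It only encodes what commenters reported here and is not an authoritative list of affected kernels. The helper name `thread_status` is hypothetical, and matching 4.13.0-32 by prefix is an assumption, since the comment does not give the full release string.

```python
import platform

# Kernel releases exactly as reported by commenters in this thread.
REPORTED_BROKEN = {"4.13.0-31-generic", "4.13.0-1007-gcp"}
REPORTED_WORKING = {"4.13.0-25-generic", "4.13.0-1006-gcp"}
# Assumption: the fixed kernel is matched by prefix because no suffix was given.
REPORTED_FIXED_PREFIXES = ("4.13.0-32",)


def thread_status(release):
    """Classify a kernel release string according to the reports in this thread."""
    if release in REPORTED_BROKEN:
        return "reported broken"
    if release in REPORTED_WORKING:
        return "reported working"
    if release.startswith(REPORTED_FIXED_PREFIXES):
        return "reported fixed"
    return "not mentioned"


def current_kernel_status():
    """Classify the running kernel; inside a container this is the host's kernel."""
    return thread_status(platform.release())
```

Because a container shares the kernel of its host, `current_kernel_status()` gives the same answer inside a Docker container as on the host itself.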
