so-elastic-start times out waiting for elasticsearch #1695

Closed
petiepooo opened this issue Dec 27, 2019 · 7 comments

petiepooo commented Dec 27, 2019

so-kibana-start has a hardcoded timeout of 240 seconds waiting for elasticsearch to start (not the wait for the .kibana shard addressed in #1655, but the earlier wait for elasticsearch itself to respond).
On resource-constrained systems, elasticsearch can take longer than that to start, especially during boot, when it has to share CPU with multiple snort and barnyard2 processes that are still initializing. When elasticsearch has not come online within 240 seconds, the remaining elastic services are not started: most noticeably kibana, but also elastalert, logstash, and curator.
Would it be acceptable to modify the so-kibana-start script to try reading the timeout from /etc/nsm/securityonion.conf and, only if not found there, default to 240?
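
A minimal sketch of what that could look like (the variable name, config sourcing, and curl health check are illustrative assumptions, not the actual so-kibana-start code):

```sh
# Sketch only: let /etc/nsm/securityonion.conf override the hardcoded 240s wait.
ELASTICSEARCH_TIMEOUT=240                      # fall back to the current default
CONF=/etc/nsm/securityonion.conf
[ -f "$CONF" ] && . "$CONF"                    # conf may set ELASTICSEARCH_TIMEOUT

WAITED=0
until curl -s -o /dev/null http://localhost:9200; do
    sleep 1
    WAITED=$((WAITED + 1))
    if [ "$WAITED" -ge "$ELASTICSEARCH_TIMEOUT" ]; then
        echo "Timed out after ${ELASTICSEARCH_TIMEOUT}s waiting for elasticsearch" >&2
        exit 1
    fi
done
```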

petiepooo (Author) commented Dec 27, 2019

In the interim, I've doubled the time simply by changing sleep 1 to sleep 2 within that loop... it was easy for salt to do an in-place edit since there is only one call to sleep in the script. :)
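
Roughly equivalent to a one-line substitution like this (the script path is an assumption; Salt's file.replace state performs the same single replacement):

```sh
# Interim workaround sketch: double the per-iteration sleep, which doubles the
# effective wait. The path to so-kibana-start is assumed for this example.
sed -i 's/sleep 1/sleep 2/' /usr/sbin/so-kibana-start
```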

dougburks added a commit to Security-Onion-Solutions/securityonion-elastic that referenced this issue Dec 27, 2019

dougburks (Contributor) commented Dec 27, 2019

Hi @petiepooo ,

Sounds like a good idea. I've made the new default 480 seconds, and you can now override that default by setting ELASTICSEARCH_TIMEOUT in /etc/nsm/securityonion.conf. I've implemented this for so-kibana-start, and for so-elasticsearch-pipelines as well since it has a similar timeout. Please take a look at Security-Onion-Solutions/securityonion-elastic@e9d3421 and let me know what you think.
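
For example, a slower sensor could raise it further by adding a line like this to the conf (the value shown is only an illustration):

```sh
# Illustrative override of the new 480s default
echo 'ELASTICSEARCH_TIMEOUT=960' | sudo tee -a /etc/nsm/securityonion.conf
```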

Thanks!

dougburks self-assigned this Dec 27, 2019
dougburks added this to To do in 16.04.6.4 via automation Dec 27, 2019
dougburks moved this from To do to In progress in 16.04.6.4 Dec 27, 2019

petiepooo (Author) commented Dec 28, 2019

I believe that should work very well. Thank you for your quick response!

dougburks moved this from In progress to In Testing in 16.04.6.4 Jan 4, 2020

petiepooo (Author) commented Jan 14, 2020

So... another reboot, another timeout. Now I see that the so-boot process is being killed by systemd due to the timeout setting in /etc/systemd/system/securityonion.service.
Since that timeout also covers the time needed to launch the sguild server and sensor components, perhaps it could be increased from 300 to 600 as well?
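
A drop-in override would be one way to do that without editing the unit file directly; this sketch assumes the 300-second limit comes from a TimeoutStartSec-style directive in the [Service] section:

```sh
# Sketch only: raise the systemd start timeout for securityonion.service via a drop-in.
sudo mkdir -p /etc/systemd/system/securityonion.service.d
sudo tee /etc/systemd/system/securityonion.service.d/timeout.conf >/dev/null <<'EOF'
[Service]
TimeoutStartSec=600
EOF
sudo systemctl daemon-reload
```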

dougburks (Contributor) commented Jan 14, 2020

Created issue #1708 to increase the timeout in /etc/systemd/system/securityonion.service.

weslambert (Collaborator) commented Jan 17, 2020

Looks good so far from my testing 👍

weslambert moved this from In Testing to Tested in 16.04.6.4 Jan 17, 2020
dougburks closed this Feb 5, 2020
16.04.6.4 automation moved this from Tested to Done Feb 5, 2020