Allow startup without working Elasticsearch cluster #1289

Merged
kroepke merged 2 commits into master from issue-1136 on Jul 7, 2015

3 participants
@joschi
Contributor

joschi commented Jul 6, 2015

The previous implementation of IndexerSetupService (and related classes) would shut down Graylog if Elasticsearch wasn't available or the cluster health state was RED on startup. This made sense as long as Graylog didn't have a proper on-disk journal in which messages could be persistently buffered.

The new implementation simply acknowledges the unavailability of Elasticsearch and generates an appropriate system notification (and log entries) but does not shut down Graylog.

Fixes #1136
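
For readers skimming the thread, the change in behavior can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `StartupCheck`, `ClusterHealth`, and the `Supplier`-based health probe are hypothetical stand-ins, and only the notification names `ES_CLUSTER_RED` and `ES_UNAVAILABLE` are taken from the diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-ins; none of these are Graylog's real classes.
enum ClusterHealth { GREEN, YELLOW, RED }

enum NotificationType { ES_CLUSTER_RED, ES_UNAVAILABLE }

final class StartupCheck {
    private final List<NotificationType> notifications = new ArrayList<>();

    /**
     * Probes the cluster once. A failed probe or a RED cluster is logged and
     * recorded as a notification; the method never throws, so the caller can
     * keep starting up instead of shutting down.
     */
    boolean check(Supplier<ClusterHealth> healthProbe) {
        final ClusterHealth health;
        try {
            health = healthProbe.get();
        } catch (RuntimeException e) {
            System.err.println("Elasticsearch is unavailable: " + e.getMessage());
            notifications.add(NotificationType.ES_UNAVAILABLE);
            return false;
        }
        if (health == ClusterHealth.RED) {
            System.err.println("Elasticsearch cluster health is RED");
            notifications.add(NotificationType.ES_CLUSTER_RED);
            return false;
        }
        return true;
    }

    List<NotificationType> notifications() {
        return notifications;
    }
}
```

The return value only tells the caller whether indexing can start right away. In either failure case the process stays up, which is the point of the change now that messages can be buffered in the on-disk journal.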

joschi added some commits Jul 6, 2015

Allow startup without working Elasticsearch cluster
The previous implementation of IndexerSetupService (and related classes) would shut down Graylog if
Elasticsearch wasn't available or the cluster health state was RED on startup. This made sense as long
as Graylog didn't have a proper on-disk journal in which messages could be persistently buffered.

The new implementation simply acknowledges the unavailability of Elasticsearch and generates an
appropriate system notification (and log entries) but does not shut down Graylog.

Fixes #1136
@@ -114,11 +111,10 @@ protected void startUp() throws Exception {
Tools.silenceUncaughtExceptionsInThisThread();
LOG.debug("Starting indexer");
try {
node.start();
node.start();

@kroepke

kroepke Jul 7, 2015

Member

are we sure this and the next line never throw exceptions?
if they do, we would miss them and not print any error courtesy of Tools.silenceUncaughtExceptionsInThisThread();
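
To illustrate the concern, here is a small sketch of how an exception can disappear on a thread whose uncaught-exception handler does nothing. It assumes the silencing helper behaves like installing a no-op `Thread.UncaughtExceptionHandler`, which is not confirmed in this thread. `SilencedThreadDemo` and `runSilenced` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicReference;

final class SilencedThreadDemo {
    /**
     * Runs the task on a thread with a no-op uncaught-exception handler and
     * returns the exception the explicit catch block saw, or null if none.
     */
    static Throwable runSilenced(Runnable task) {
        AtomicReference<Throwable> caught = new AtomicReference<>();
        Thread t = new Thread(() -> {
            // Assumed equivalent of the silencing helper: swallow anything uncaught.
            Thread.currentThread().setUncaughtExceptionHandler((thread, e) -> { });
            try {
                task.run();
            } catch (RuntimeException e) {
                // Without this catch the failure would vanish without any log output.
                caught.set(e);
                System.err.println("Startup step failed: " + e);
            }
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return caught.get();
    }
}
```

If the `try`/`catch` inside the thread were removed, the thread would still terminate, but nothing would be printed and the caller would have no way to tell that the step failed.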

@joschi

joschi Jul 7, 2015

Contributor

The behavior is the same as before, as the outermost catch block just re-threw the exception: https://github.com/Graylog2/graylog2-server/pull/1289/files#diff-72fd860130aebd46f43cb3fa1fb32d18L193

@kroepke

Member

kroepke commented Jul 7, 2015

other than the comment above this looks really good 👍

@kroepke kroepke self-assigned this Jul 7, 2015

@@ -28,6 +28,8 @@
MULTI_MASTER,
NO_MASTER,
ES_OPEN_FILES,
ES_CLUSTER_RED,
ES_UNAVAILABLE,
NO_INPUT_RUNNING,

@bernd

bernd Jul 7, 2015

Member

This requires an update to the web interface as well, right?

@joschi

joschi added a commit to graylog-labs/graylog2-web-interface that referenced this pull request Jul 7, 2015

@bernd

Member

bernd commented Jul 7, 2015

I tested the branch with Elasticsearch 1.5. Seems to work fine!

I am getting the following message in the logs on a regular basis. Maybe we should have an "es alive" check in the IndexRotationThread as well? It currently just checks if the IndexerSetupService is running.

2015-07-07 11:18:17,557 ERROR: org.graylog2.periodical.IndexRotationThread - Couldn't point deflector to a new index
org.elasticsearch.discovery.MasterNotDiscoveredException: waited for [30s]
    at org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$4.onTimeout(TransportMasterNodeOperationAction.java:164)
    at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:231)
    at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:560)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
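
One possible shape for such an "es alive" guard, sketched under assumptions: `RotationTask`, its constructor arguments, and the two boolean probes are hypothetical and do not mirror the actual IndexRotationThread. The idea is only that the periodical skips its run when the cluster is unreachable, instead of waiting for the master-discovery timeout and logging a stack trace each time.

```java
import java.util.function.BooleanSupplier;

// Hypothetical periodical task; the names are illustrative only.
final class RotationTask {
    private final BooleanSupplier serviceRunning;   // existing check: is the setup service running?
    private final BooleanSupplier clusterReachable; // proposed extra check: is Elasticsearch alive?
    private final Runnable rotate;                  // the actual rotation work

    RotationTask(BooleanSupplier serviceRunning, BooleanSupplier clusterReachable, Runnable rotate) {
        this.serviceRunning = serviceRunning;
        this.clusterReachable = clusterReachable;
        this.rotate = rotate;
    }

    /** Returns true if rotation was attempted, false if this run was skipped. */
    boolean runOnce() {
        if (!serviceRunning.getAsBoolean()) {
            return false;
        }
        if (!clusterReachable.getAsBoolean()) {
            // One short log line instead of a timeout plus stack trace per run.
            System.err.println("Elasticsearch is not reachable, skipping index rotation");
            return false;
        }
        rotate.run();
        return true;
    }
}
```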
@joschi

Contributor

joschi commented Jul 7, 2015

@bernd Yes, thought about that as well. I would add this in another PR, though.

@bernd

Member

bernd commented Jul 7, 2015

> Yes, thought about that as well. I would add this in another PR, though.

Ack! LGTM 👍

kroepke added a commit that referenced this pull request Jul 7, 2015

Merge pull request #1289 from Graylog2/issue-1136
Allow startup without working Elasticsearch cluster

@kroepke kroepke merged commit ba92b16 into master Jul 7, 2015

2 checks passed

ci Jenkins build graylog2-server-integration-pr 28 has succeeded
continuous-integration/travis-ci/push The Travis CI build passed

@kroepke kroepke deleted the issue-1136 branch Jul 7, 2015
