
Infinite hang when Elasticsearch restarts #1655

Closed
mdconner opened this issue Aug 21, 2014 · 9 comments
@mdconner

Logstash appears to hang indefinitely when events are processed after Elasticsearch restarts.
Reproduced with Logstash 1.4.1 and Elasticsearch 1.1.1 on RHEL 6.5 using the following steps (also reproduced with the file input):

  1. Start Elasticsearch from command line and wait for "started" message
  2. Start Logstash from the command line as an agent with the following config file:
    input { stdin { } }
    output {
      stdout { }
      elasticsearch { cluster => "elasticsearch" }
    }
  3. Send test events (stdin)
  4. Stop Elasticsearch
  5. Send test events (stdin)
  6. Start Elasticsearch
  7. Send test events (stdin)

Logstash seems to recover successfully when no events are sent while Elasticsearch is down (i.e., when step 5 is skipped). Here is the entire console output (the exception occurs after Test 5):

[root@lnx65 bin]# ./logstash agent --verbose -f ../mdc_stdin.conf
Pipeline started {:level=>:info}
log4j, [2014-08-21T20:10:44.283] INFO: org.elasticsearch.node: [logstash-lnx65.localdomain-17807-2010] version[1.1.1], pid[17807], build[f1585f0/2014-04-16T14:27:12Z]
log4j, [2014-08-21T20:10:44.283] INFO: org.elasticsearch.node: [logstash-lnx65.localdomain-17807-2010] initializing ...
log4j, [2014-08-21T20:10:44.316] INFO: org.elasticsearch.plugins: [logstash-lnx65.localdomain-17807-2010] loaded [], sites []
log4j, [2014-08-21T20:10:47.029] INFO: org.elasticsearch.node: [logstash-lnx65.localdomain-17807-2010] initialized
log4j, [2014-08-21T20:10:47.029] INFO: org.elasticsearch.node: [logstash-lnx65.localdomain-17807-2010] starting ...
log4j, [2014-08-21T20:10:47.141] INFO: org.elasticsearch.transport: [logstash-lnx65.localdomain-17807-2010] bound_address {inet[/0:0:0:0:0:0:0:0:9301]}, publish_address {inet[/10.0.2.15:9301]}
log4j, [2014-08-21T20:10:50.358] INFO: org.elasticsearch.cluster.service: [logstash-lnx65.localdomain-17807-2010] detected_master [lnx65][El3W2hxoQvqkvIWo0fT4sg][lnx65.localdomain][inet[/10.0.2.15:9300]], added {[lnx65][El3W2hxoQvqkvIWo0fT4sg][lnx65.localdomain][inet[/10.0.2.15:9300]],}, reason: zen-disco-receive(from master [[lnx65][El3W2hxoQvqkvIWo0fT4sg][lnx65.localdomain][inet[/10.0.2.15:9300]]])
log4j, [2014-08-21T20:10:50.368] INFO: org.elasticsearch.discovery: [logstash-lnx65.localdomain-17807-2010] elasticsearch/SeDbmuOZTlq5DxxCnENasQ
log4j, [2014-08-21T20:10:50.373] INFO: org.elasticsearch.node: [logstash-lnx65.localdomain-17807-2010] started
New Elasticsearch output {:cluster=>"elasticsearch", :host=>nil, :port=>"9300-9305", :embedded=>false, :protocol=>"node", :level=>:info}
Automatic template management enabled {:manage_template=>"true", :level=>:info}
Using mapping template {:template=>"{ "template" : "logstash-*", "settings" : { "index.refresh_interval" : "5s" }, "mappings" : { "_default_" : { "_all" : {"enabled" : true}, "dynamic_templates" : [ { "string_fields" : { "match" : "*", "match_mapping_type" : "string", "mapping" : { "type" : "string", "index" : "analyzed", "omit_norms" : true, "fields" : { "raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256} } } } } ], "properties" : { "@version": { "type": "string", "index": "not_analyzed" }, "geoip" : { "type" : "object", "dynamic": true, "path": "full", "properties" : { "location" : { "type" : "geo_point" } } } } } }}", :level=>:info}

2014-08-21T18:10:53.394+0000 lnx65.localdomain
Test 1
2014-08-21T18:10:58.896+0000 lnx65.localdomain Test 1
Test 2
2014-08-21T18:11:01.376+0000 lnx65.localdomain Test 2
log4j, [2014-08-21T20:11:06.351] INFO: org.elasticsearch.discovery.zen: [logstash-lnx65.localdomain-17807-2010] master_left [[lnx65][El3W2hxoQvqkvIWo0fT4sg][lnx65.localdomain][inet[/10.0.2.15:9300]]], reason [transport disconnected (with verified connect)]
log4j, [2014-08-21T20:11:06.353] WARN: org.elasticsearch.discovery.zen: [logstash-lnx65.localdomain-17807-2010] master_left and no other node elected to become master, current nodes: {[logstash-lnx65.localdomain-17807-2010][SeDbmuOZTlq5DxxCnENasQ][lnx65.localdomain][inet[/10.0.2.15:9301]]{client=true, data=false},}
log4j, [2014-08-21T20:11:06.354] INFO: org.elasticsearch.cluster.service: [logstash-lnx65.localdomain-17807-2010] removed {[lnx65][El3W2hxoQvqkvIWo0fT4sg][lnx65.localdomain][inet[/10.0.2.15:9300]],}, reason: zen-disco-master_failed ([lnx65][El3W2hxoQvqkvIWo0fT4sg][lnx65.localdomain][inet[/10.0.2.15:9300]])

2014-08-21T18:11:14.608+0000 lnx65.localdomain

Test 3
Test 4
Test 5
Exception in thread "elasticsearch[logstash-lnx65.localdomain-17807-2010][generic][T#2]" org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];[SERVICE_UNAVAILABLE/2/no master];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:138)
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedRaiseException(ClusterBlocks.java:128)
at org.elasticsearch.action.bulk.TransportBulkAction.executeBulk(TransportBulkAction.java:197)
at org.elasticsearch.action.bulk.TransportBulkAction.access$000(TransportBulkAction.java:65)
at org.elasticsearch.action.bulk.TransportBulkAction$1.onFailure(TransportBulkAction.java:143)
at org.elasticsearch.action.support.TransportAction$ThreadedActionListener$2.run(TransportAction.java:117)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
log4j, [2014-08-21T20:11:45.493] INFO: org.elasticsearch.cluster.service: [logstash-lnx65.localdomain-17807-2010] detected_master [lnx65][7vpAU211RHemHzNw1u9EyQ][lnx65.localdomain][inet[/10.0.2.15:9300]], added {[lnx65][7vpAU211RHemHzNw1u9EyQ][lnx65.localdomain][inet[/10.0.2.15:9300]],}, reason: zen-disco-receive(from master [[lnx65][7vpAU211RHemHzNw1u9EyQ][lnx65.localdomain][inet[/10.0.2.15:9300]]])

Test 6
Test 7
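
The ClusterBlockException above is thrown on an Elasticsearch client thread (note the thread name in the trace), not inside Logstash's output. One hazard with that pattern, sketched below as a hypothetical illustration (this is not Logstash or Elasticsearch source), is that if the failure dies on the background thread instead of being reported back, whichever caller is waiting for a response waits forever, which would match the observed hang:

```ruby
# Hypothetical sketch: a worker whose error handler swallows the failure
# instead of pushing a result, leaving a would-be caller blocked forever.
result_q = Queue.new

worker = Thread.new do
  begin
    raise "blocked by: [SERVICE_UNAVAILABLE/2/no master]"
  rescue => e
    # Buggy path: the exception is dropped on this thread. Nothing is
    # pushed to result_q, so a caller blocked on result_q.pop never wakes.
    # result_q.push(e)   # <- the missing step that would unblock the caller
  end
end
worker.join

hung = result_q.empty?   # true: no success *or* failure was ever delivered
```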

@suyograo suyograo self-assigned this Aug 29, 2014
@hoelzro

hoelzro commented Sep 29, 2014

I've been seeing this issue as well.

@benediktarnold

I am seeing this issue with logstash 1.4.2 and es 1.1.1 too.

@tiagobnobrega

I am also seeing this issue. I have to restart Logstash as well to make everything work again (ES 1.1.1, Logstash 1.4.2).

@sbitpdc

sbitpdc commented Dec 9, 2014

I've been seeing this issue as well.

@cisco-sro

Same here with Logstash 1.4.2 and ES 1.4.2

@wangjingbomail

Yes, I have to restart every Logstash instance whenever I restart Elasticsearch, which is tedious.

@matthewparsons

I've got a suspicion the filter queue is blocking and not clearing again. Is there any way to better debug what is going on?
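
That suspicion can be illustrated with a minimal sketch (this is not Logstash's actual pipeline code): Logstash 1.4 connects its input, filter, and output stages with small SizedQueues, so if the output stops consuming (e.g. it is stuck waiting on Elasticsearch), pushes into the queue block and the backpressure propagates all the way back to stdin, which looks like a total hang:

```ruby
# Toy model of stage-to-stage backpressure via a bounded queue.
queue = SizedQueue.new(2)   # Logstash 1.4 reportedly used a capacity of 20

producer = Thread.new do
  5.times { |i| queue.push("event #{i}") }  # blocks once the queue is full
  :done
end

sleep 0.2
blocked_while_output_stalled = producer.alive?  # stuck on push: looks "hung"

# A recovered consumer drains the queue and the producer unblocks.
consumer = Thread.new { 5.times { queue.pop } }
producer.join
consumer.join
```

If this model is right, the pipeline is not dead, just wedged behind a full queue; it should resume the moment the output starts draining again, which is why the reports above say only a Logstash restart helps when the output never recovers.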

@skikd636

skikd636 commented May 1, 2015

I get similar behavior here, except it's not when ES restarts; it's when I have what I'll call a connection "hiccup" between Logstash and ES. The Logstash server goes into a continuous loop. I get no errors on the ES side, which I find strange. Here's the error I see in the Logstash log file:

{:timestamp=>"2015-05-01T06:41:17.479000-0400", :message=>"Failed to flush outgoing items", :outgoing_count=>2, :exception=>#<Errno::ECONNREFUSED: Connection refused - Connection refused>, :backtrace=>[
  "org/jruby/ext/socket/RubySocket.java:201:in `connect_nonblock'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:156:in `connect'",
  "org/jruby/RubyArray.java:1613:in `each'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/connection.rb:139:in `connect'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:406:in `connect'",
  "org/jruby/RubyProc.java:271:in `call'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/pool.rb:48:in `fetch'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:403:in `connect'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:319:in `execute'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/ftw-0.0.39/lib/ftw/agent.rb:217:in `post!'",
  "/opt/logstash/lib/logstash/outputs/elasticsearch_http.rb:228:in `post'",
  "/opt/logstash/lib/logstash/outputs/elasticsearch_http.rb:223:in `flush'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:219:in `buffer_flush'",
  "org/jruby/RubyHash.java:1339:in `each'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:216:in `buffer_flush'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:193:in `buffer_flush'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:112:in `buffer_initialize'",
  "org/jruby/RubyKernel.java:1521:in `loop'",
  "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.17/lib/stud/buffer.rb:110:in `buffer_initialize'"
], :level=>:warn}
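
The continuous loop is consistent with the retry behavior visible in this backtrace (`buffer_flush` looping inside `buffer_initialize`). A rough sketch of that pattern, written as an illustration rather than the actual Stud::Buffer source: on any flush error the buffer backs off briefly and retries the same batch, so a persistent ECONNREFUSED turns into an endless retry loop with no error ever surfacing on the ES side:

```ruby
# ToyBuffer is a hypothetical stand-in for a Stud::Buffer-style flusher.
class ToyBuffer
  attr_reader :attempts

  def initialize(&flusher)
    @flusher = flusher
    @attempts = 0
  end

  def flush(batch)
    begin
      @attempts += 1
      @flusher.call(batch)          # e.g. an HTTP POST of the bulk payload
    rescue StandardError
      sleep 0.01                    # brief backoff...
      retry                         # ...then retry the SAME batch, forever
    end
  end
end

# Simulate a connection "hiccup": three refusals, then success.
failures_left = 3
buffer = ToyBuffer.new do |batch|
  raise Errno::ECONNREFUSED if (failures_left -= 1) >= 0
  batch.size
end

buffer.flush(%w[a b])   # spins through the refusals, succeeds on attempt 4
```

Under this model the loop is by design (retry until the endpoint comes back); the bug reports above suggest that after certain failures the flush never succeeds again even once ES is reachable.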

@jordansissel
Contributor

For Logstash 1.5.0, we've moved all plugins to individual repositories, so I have moved this issue to logstash-plugins/logstash-output-elasticsearch#142. Let's continue the discussion there! :)
