Sensu server stops by itself #1368

Closed
kaushiksriram100 opened this issue Jul 12, 2016 · 6 comments

@kaushiksriram100
commented Jul 12, 2016

Seeing a weird issue on only one server in a 4-node Sensu cluster (version 0.21). The sensu-server process simply stops and then starts again by itself after some time. I presume it restarts because the chef-client daemon is running and may be starting it. I see mixed log messages. On one occasion the log shows this:

Any pointers?

{"timestamp":"2016-07-09T05:46:51.006604-0700","level":"fatal","message":"transport connection error","error":"rabbitmq channel closed"}
{"timestamp":"2016-07-09T05:46:51.006776-0700","level":"warn","message":"reconnecting to transport"}
/opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/amqp-1.5.0/lib/amqp/session.rb:738:in `send_frame': The connection is closed, you can't use it anymore! (AMQP::ConnectionClosedError)
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/amqp-1.5.0/lib/amqp/channel.rb:1022:in `acknowledge'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/amqp-1.5.0/lib/amqp/header.rb:35:in `ack'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/sensu-transport-3.3.0/lib/sensu/transport/rabbitmq.rb:74:in `acknowledge'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/sensu-transport-3.3.0/lib/sensu/transport/base.rb:117:in `ack'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/sensu-0.21.0/lib/sensu/server/process.rb:460:in `block (2 levels) in setup_results'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:976:in `call'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:976:in `block in run_deferred_callbacks'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:973:in `times'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:973:in `run_deferred_callbacks'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193:in `run_machine'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193:in `run'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/sensu-0.21.0/lib/sensu/server/process.rb:23:in `run'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/sensu-0.21.0/exe/sensu-server:10:in `<top (required)>'
from /opt/sensu/bin/sensu-server:23:in `load'
from /opt/sensu/bin/sensu-server:23:in `<main>'

In another scenario it shows these:
{"timestamp":"2016-07-12T13:53:01.032871-0700","level":"warn","message":"unsubscribing from keepalive and result queues"}
/opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:331:in `add_oneshot_timer': ran out of timers; use #set_max_timers to increase limit (RuntimeError)
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:331:in `add_timer'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/em/timers.rb:12:in `initialize'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/sensu-0.21.0/lib/sensu/utilities.rb:21:in `new'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/sensu-0.21.0/lib/sensu/utilities.rb:21:in `retry_until_true'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/sensu-0.21.0/lib/sensu/utilities.rb:23:in `block in retry_until_true'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193:in `call'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193:in `run_machine'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:193:in `run'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/sensu-0.21.0/lib/sensu/server/process.rb:23:in `run'
from /opt/sensu/embedded/lib/ruby/gems/2.2.0/gems/sensu-0.21.0/exe/sensu-server:10:in `<top (required)>'
from /opt/sensu/bin/sensu-server:23:in `load'
from /opt/sensu/bin/sensu-server:23:in `<main>'
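
For triaging a log like the excerpts above, where JSON events and raw Ruby stack trace frames end up in the same file, a small script can separate the two and list only the warn and fatal events. This is a minimal sketch using only Ruby's standard library. It assumes one JSON object per line, as in the excerpts; the method names `split_sensu_log` and `events_at_levels` are made up for illustration and are not part of Sensu.

```ruby
require "json"

# Hypothetical helper (not part of Sensu): split log lines into parsed
# JSON events and everything else (for example stack trace frames).
def split_sensu_log(lines)
  events = []
  other = []
  lines.each do |line|
    text = line.strip
    next if text.empty?
    begin
      parsed = JSON.parse(text)
      # Only JSON objects count as log events.
      if parsed.is_a?(Hash)
        events << parsed
      else
        other << text
      end
    rescue JSON::ParserError
      # Not JSON, so most likely part of a stack trace.
      other << text
    end
  end
  [events, other]
end

# Hypothetical helper: keep only events whose "level" is in the given list.
def events_at_levels(events, levels = %w[warn error fatal])
  events.select { |event| levels.include?(event["level"]) }
end

# Sample input taken from the excerpts in this issue.
sample = [
  '{"timestamp":"2016-07-09T05:46:51.006604-0700","level":"fatal","message":"transport connection error","error":"rabbitmq channel closed"}',
  '{"timestamp":"2016-07-09T05:46:51.006776-0700","level":"warn","message":"reconnecting to transport"}',
  "from /opt/sensu/bin/sensu-server:23:in `load'"
]

events, other = split_sensu_log(sample)
events_at_levels(events).each do |event|
  puts "#{event['timestamp']} [#{event['level']}] #{event['message']}"
end
puts "#{other.size} non-JSON line(s), likely stack trace frames"
```

To run it against a real file, `File.readlines` on the server log path could replace the `sample` array; the path depends on the install.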

@portertech

Member

commented Jul 13, 2016

@kaushiksriram100 are you using a proxy in front of RabbitMQ?

I have an open pull request for the second scenario, #1370 👍

@kaushiksriram100

Author

commented Jul 18, 2016

@portertech No, not using a proxy.

@portertech

Member

commented Jul 20, 2016

@kaushiksriram100 does the RabbitMQ log suggest why the channels are being closed?

@portertech

Member

commented Jul 20, 2016

@kaushiksriram100 regarding the timer issue, how many check definitions does the install have?

@kaushiksriram100

Author

commented Jul 21, 2016

@portertech This is a staging setup with 892 checks, but not all clients in staging are subscribed to run all of those checks, and staging has fewer clients. The production setup has the same 892 checks with more clients and more subscriptions.

The issue is currently observed in staging and in one of the production installs.

@portertech

Member

commented Mar 20, 2017

Closing this due to inactivity. Please feel free to create a new issue if this hasn't been resolved.

@portertech closed this Mar 20, 2017
