Refactor listener startup error handling #1730

Merged: 1 commit, Oct 10, 2018

michaelklishin

Proposed Changes

Functions that start listeners (Ranch supervisors) no longer
throw on errors. They simply return the first error encountered
and let the boot step handle it.

Since there is no way for boot steps to indicate errors, this is the
best we can do in this area without a much deeper refactoring of the boot
sequence.

They also log the error. Note that modern Ranch versions
log more reasonable messages when Ranch supervisors exit due to
a listen/bind socket operation error, e.g. when the address/port pair
is already in use.
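
For illustration only, here is a minimal sketch of the pattern described above: start listeners one by one, log and return the first error instead of throwing, and let the caller (the boot step) decide what to do. This is not the code in this PR. The module name `listener_sketch`, the functions `start_all/2` and `boot/2`, and the `StartFun` callback are all hypothetical; only the `{could_not_start_listener, Addr, Port, Reason}` shape is taken from the error output in this conversation.

```erlang
-module(listener_sketch).
-export([start_all/2, boot/2]).

%% Hypothetical sketch. StartFun stands in for whatever starts a single
%% listener (e.g. a Ranch supervisor) and is assumed to return
%% ok | {error, Reason}.
start_all(_StartFun, []) ->
    ok;
start_all(StartFun, [{Addr, Port} = Listener | Rest]) ->
    case StartFun(Listener) of
        ok ->
            start_all(StartFun, Rest);
        {error, Reason} ->
            %% Log the failure, then return the first error encountered
            %% instead of throwing. Remaining listeners are not started.
            error_logger:error_msg("Failed to start listener on ~p:~p: ~p~n",
                                   [Addr, Port, Reason]),
            {error, {could_not_start_listener, Addr, Port, Reason}}
    end.

%% Assumption: since a boot step has no way to report an error to its
%% caller, this sketch exits with a compact reason. The real boot step
%% may handle the error differently.
boot(StartFun, Listeners) ->
    case start_all(StartFun, Listeners) of
        ok              -> ok;
        {error, Reason} -> exit(Reason)
    end.
```

With a `StartFun` that fails with `eaddrinuse` for `{"::", 5672}`, `start_all/2` returns `{error, {could_not_start_listener, "::", 5672, eaddrinuse}}`, which is much easier to read than a nested `case_clause` stack trace.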

Types of Changes

  • Bug fix (non-breaking change which fixes issue #1711, "Improve error reporting for listeners that fail to start")
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation (correction or otherwise)
  • Cosmetics (whitespace, appearance)

Checklist

  • I have read the CONTRIBUTING.md document
  • I have signed the CA (see https://cla.pivotal.io/sign/rabbitmq)
  • All tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in related repositories

Further Comments

Closes #1711 (for now), covers #1729 for the server as a drive-by change.

[#160791138]
[#161136615]

@lukebakken

Previous error text 🤕

              Starting broker...
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,{{shutdown,{failed_to_start_child,{ranch_listener_sup,{acceptor,{0,0,0,0,0,0,0,0},5672}},{shutdown,{failed_to_start_child,ranch_acceptors_sup,{listen_error,{acceptor,{0,0,0,0,0,0,0,0},5672},eaddrinuse}}}}},{child,undefined,'rabbit_tcp_listener_sup_:::5672',{tcp_listener_sup,start_link,[{0,0,0,0,0,0,0,0},5672,ranch_tcp,[inet6,{backlog,128},{nodelay,true},{linger,{true,0}},{exit_on_close,false}],rabbit_connection_sup,[],{rabbit_networking,tcp_listener_started,[amqp,[{backlog,128},{nodelay,true},{linger,{true,0}},{exit_on_close,false}]]},{rabbit_networking,tcp_listener_stopped,[amqp,[{backlog,128},{nodelay,true},{linger,{true,0}},{exit_on_close,false}]]},10,\"TCP Listener\"]},transient,infinity,supervisor,[tcp_listener_sup]}}}},[{rabbit_networking,start_listener0,5,[{file,\"src/rabbit_networking.erl\"},{line,230}]},{rabbit_networking,'-start_listener/5-lc$^0/1-0-',5,[{file,\"src/rabbit_networking.erl\"},{line,221}]},{rabbit_networking,start_listener,5,[{file,\"src/rabbit_networking.erl\"},{line,222}]},{rabbit_networking,'-boot_tcp/1-lc$^0/1-0-',2,[{file,\"src/rabbit_networking.erl\"},{line,128}]},{rabbit_networking,boot_tcp,1,[{file,\"src/rabbit_networking.erl\"},{line,128}]},{rabbit_networking,boot,0,[{file,\"src/rabbit_networking.erl\"},{line,121}]},{rabbit_boot_steps,'-run_step/2-lc$^1/1-1-',1,[{file,\"src/rabbit_boot_steps.erl\"},{line,49}]},{rabbit_boot_steps,run_step,2,[{file,\"src/rabbit_boot_steps.erl\"},{line,52}]}]}}}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,{{shutdown,{failed_to_start_child,{ranch_listener

Crash dump is being written to: /tmp/rabbitmq-test-instances/rabbit@shostakovich/log/erl_crash.dump...done
make: *** [/home/lbakken/development/rabbitmq/rabbitmq-server/deps/rabbit_common/mk/rabbitmq-run.mk:226: run-broker] Error 1

With this patch:

              Starting broker...
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{{could_not_start_listener,\"::\",5672,eaddrinuse},{rabbit,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{{could_not_start_listener,"::",5672,eaddrinuse},{rabbit,start,[normal,[]]}}})

Crash dump is being written to: /tmp/rabbitmq-test-instances/rabbit@shostakovich/log/erl_crash.dump...done
make: *** [/home/lbakken/development/rabbitmq/rabbitmq-server/deps/rabbit_common/mk/rabbitmq-run.mk:226: run-broker] Error 1

I'd call that a big improvement 👍

@lukebakken lukebakken merged commit 100e7e1 into master Oct 10, 2018
@lukebakken lukebakken deleted the rabbitmq-server-1711 branch October 10, 2018 23:34
lukebakken added a commit that referenced this pull request Oct 10, 2018
Refactor listener startup error handling

(cherry picked from commit 100e7e1)