Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

registry hangs and is unresponsive: bosh deploy fails with execution expired #1234

Closed
voelzmo opened this issue Apr 21, 2016 · 3 comments
Closed

Comments

@voelzmo
Copy link
Contributor

voelzmo commented Apr 21, 2016

We saw recently a few deployments failing with the error of execution expired during writing the settings for a VM into the registry. The registry stopped responding at all, there was nothing in the logs.

Even when being on the Director itself, calling the registry didn't work. Something like

# curl  http://localhost:25777/instances/bla/settings

should fail with an error saying instance 'bla' not found, but just hangs until it runs into a read timeout.

So we attached a gdb and looked at the backtrace, seeing that excon couldn't open the ssl_socket and blocked forever in the rescue. Note that the documentation of IO.select states this piece of code as an explicit example on how to implement a blocking read

The gdb stacktracke:

(gdb) call (void) rb_backtrace()
    from /var/vcap/packages/registry/bin/bosh-registry:16:in `<main>'
    from /var/vcap/packages/registry/bin/bosh-registry:16:in `load'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/bosh-registry-1.3202.0/bin/bosh-registry:28:in `<top (required)>'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/bosh-registry-1.3202.0/lib/bosh/registry/runner.rb:18:in `run'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/bosh-registry-1.3202.0/lib/bosh/registry/runner.rb:34:in `start_http_server'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/thin-1.5.1/lib/thin/server.rb:159:in `start'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/thin-1.5.1/lib/thin/backends/base.rb:63:in `start'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/eventmachine-1.0.3/lib/eventmachine.rb:187:in `run'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/eventmachine-1.0.3/lib/eventmachine.rb:187:in `run_machine'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/thin-1.5.1/lib/thin/connection.rb:39:in `receive_data'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/thin-1.5.1/lib/thin/connection.rb:54:in `process'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/thin-1.5.1/lib/thin/connection.rb:79:in `pre_process'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/thin-1.5.1/lib/thin/connection.rb:79:in `catch'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/thin-1.5.1/lib/thin/connection.rb:81:in `block in pre_process'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/rack-1.6.4/lib/rack/urlmap.rb:50:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/rack-1.6.4/lib/rack/urlmap.rb:50:in `each'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/rack-1.6.4/lib/rack/urlmap.rb:66:in `block in call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:2021:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:181:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/show_exceptions.rb:21:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/rack-1.6.4/lib/rack/head.rb:13:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/rack-1.6.4/lib/rack/nulllogger.rb:9:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/frame_options.rb:31:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/base.rb:49:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/base.rb:49:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/json_csrf.rb:18:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/path_traversal.rb:16:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/rack-protection-1.5.3/lib/rack/protection/xss_header.rb:18:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:894:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:906:in `call!'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1066:in `invoke'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1066:in `catch'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1066:in `block in invoke'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:906:in `block in call!'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1081:in `dispatch!'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1066:in `invoke'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1066:in `catch'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1066:in `block in invoke'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1084:in `block in dispatch!'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:971:in `route!'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:971:in `each'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:972:in `block in route!'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1012:in `process_route'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1012:in `catch'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1014:in `block in process_route'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:974:in `block (2 levels) in route!'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:993:in `route_eval'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:974:in `block (3 levels) in route!'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:974:in `[]'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1610:in `block in compile!'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/sinatra-1.4.6/lib/sinatra/base.rb:1610:in `call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/bosh-registry-1.3202.0/lib/bosh/registry/api_controller.rb:22:in `block in <class:ApiController>'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/bosh-registry-1.3202.0/lib/bosh/registry/instance_manager.rb:29:in `read_settings'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/bosh-registry-1.3202.0/lib/bosh/registry/instance_manager.rb:54:in `check_instance_ips'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/bosh-registry-1.3202.0/lib/bosh/registry/instance_manager/openstack.rb:79:in `instance_ips'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/fog-1.34.0/lib/fog/openstack/models/compute/server.rb:151:in `private_ip_addresses'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/fog-1.34.0/lib/fog/openstack/models/compute/server.rb:127:in `floating_ip_addresses'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/fog-1.34.0/lib/fog/openstack/models/compute/server.rb:109:in `all_addresses'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/fog-1.34.0/lib/fog/openstack/requests/compute/list_all_addresses.rb:6:in `list_all_addresses'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/fog-1.34.0/lib/fog/openstack/compute.rb:355:in `request'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/fog-core-1.32.1/lib/fog/core/connection.rb:81:in `request'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/connection.rb:233:in `request'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/middlewares/base.rb:15:in `request_call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/middlewares/base.rb:15:in `request_call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/middlewares/base.rb:15:in `request_call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/middlewares/instrumentor.rb:22:in `request_call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/middlewares/mock.rb:47:in `request_call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/connection.rb:106:in `request_call'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/connection.rb:387:in `socket'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/connection.rb:387:in `new'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/ssl_socket.rb:119:in `initialize'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/ssl_socket.rb:122:in `rescue in initialize'
    from /var/vcap/packages/registry/gem_home/ruby/2.1.0/gems/excon-0.45.4/lib/excon/ssl_socket.rb:122:in `select'

This is an acepted bug in excon 0.45.4 and has been fixed by adding a timeout to IO.select in 0.49.

fog-core is already updated to consume excon 0.49. We're now waiting for fog and fog-openstack to be updated to consume fog-core 1.38.0.

While this prevents excon from blocking forever, it might be a good idea to have a detailed look at the registry on what else could be done to prevent one hanging call blocking the entire registry for everyone.

@cppforlife
Copy link
Contributor

Updating fog and excon sounds ok.

@voelzmo
Copy link
Contributor Author

voelzmo commented May 11, 2016

I've created https://www.pivotaltracker.com/story/show/119363951 to bump fog in bosh-registry.

beyhan pushed a commit that referenced this issue Jun 30, 2016
- consume fixes in excon 0.49, especially freezing connections
  in the registry
  - fixes #1234
- get rid of full fog (we only need fog-aws and fog-openstack)
- some other gems were taken in newer version by
  `bundle update bosh_openstack_cpi`

[#119363951](https://www.pivotaltracker.com/story/show/119363951)
Signed-off-by: Beyhan Veli <beyhan.veli@sap.com>
beyhan pushed a commit that referenced this issue Jun 30, 2016
- consume fixes in excon 0.49, especially freezing connections
  in the registry
  - fixes #1234
- get rid of full fog (we only need fog-aws and fog-openstack)
- some other gems were taken in newer version by
  `bundle update bosh_openstack_cpi`

[#119363951](https://www.pivotaltracker.com/story/show/119363951)

Signed-off-by: Beyhan Veli <beyhan.veli@sap.com>
beyhan pushed a commit that referenced this issue Jul 1, 2016
- consume fixes in excon 0.49, especially freezing connections
  in the registry
  - fixes #1234
- get rid of full fog (we only need fog-aws and fog-openstack)
- some other gems were taken in newer version by
  `bundle update bosh_openstack_cpi`

[#119363951](https://www.pivotaltracker.com/story/show/119363951)

Signed-off-by: Beyhan Veli <beyhan.veli@sap.com>
Kiemes pushed a commit that referenced this issue Sep 6, 2016
- consume fixes in excon 0.49, especially freezing connections
  in the registry
  - fixes #1234
- get rid of full fog (we only need fog-aws and fog-openstack)
- some other gems were taken in newer version by
  `bundle update bosh_openstack_cpi`

[#119363951](https://www.pivotaltracker.com/story/show/119363951)

Signed-off-by: Tom Kiemes <tom.kiemes@sap.com>
@voelzmo
Copy link
Contributor Author

voelzmo commented Sep 21, 2016

Fixed with bosh 257.14

@voelzmo voelzmo closed this as completed Sep 21, 2016
jianqiu pushed a commit to zhanggbj/bosh that referenced this issue Feb 25, 2017
- consume fixes in excon 0.49, especially freezing connections
  in the registry
  - fixes cloudfoundry#1234
- get rid of full fog (we only need fog-aws and fog-openstack)
- some other gems were taken in newer version by
  `bundle update bosh_openstack_cpi`

[#119363951](https://www.pivotaltracker.com/story/show/119363951)

Signed-off-by: Tom Kiemes <tom.kiemes@sap.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants