deployer 1.4 / microbosh 0.8.1 - timeout after 19mins of "waiting for the agent" #36

Closed
drnic opened this Issue · 10 comments

5 participants

@drnic
Deploy Micro BOSH
  unpacking stemcell (00:00:13)                                                 
  uploading stemcell (00:11:20)                                                 
  creating VM from ami-1ee17777 (00:00:27)
Waiting for the agent        |ooooo              | 3/11 00:19:07  ETA: 00:24:57
/usr/local/lib/ruby/gems/1.9.1/gems/agent_client-0.1.1/lib/agent_client/http_client.rb:46:in `rescue in request': cannot access agent (Connection refused - connect(2) (http://54.235.133.196:6868)) (Bosh::Agent::Error)

In the bosh logs:

I, [2013-02-07T07:00:57.923009 #18367] [0xb19ea0]  INFO -- : discovered bosh ip=54.235.133.196
I, [2013-02-07T07:00:58.004944 #18367] [0xb19ea0]  INFO -- : [AWS EC2 200 0.079246 0 retries] describe_addresses(:filters=>[{:name=>"instance-id",:values=>["i-cd96dbbd"]}])  

I, [2013-02-07T07:00:58.094337 #18367] [0xb19ea0]  INFO -- : [AWS EC2 200 0.08894 0 retries] describe_addresses(:filters=>[{:name=>"instance-id",:values=>["i-cd96dbbd"]}])  

D, [2013-02-07T07:01:19.092772 #18367] [0xb19ea0] DEBUG -- : tcp socket 54.235.133.196:22 SystemCallError: #<Errno::ETIMEDOUT: Connection timed out - connect(2)>
D, [2013-02-07T07:01:41.102731 #18367] [0xb19ea0] DEBUG -- : tcp socket 54.235.133.196:22 SystemCallError: #<Errno::ETIMEDOUT: Connection timed out - connect(2)>
D, [2013-02-07T07:02:03.116499 #18367] [0xb19ea0] DEBUG -- : tcp socket 54.235.133.196:22 SystemCallError: #<Errno::ETIMEDOUT: Connection timed out - connect(2)>
D, [2013-02-07T07:02:04.150902 #18367] [0xb19ea0] DEBUG -- : tcp socket 54.235.133.196:22 is readable
I, [2013-02-07T07:03:04.160184 #18367] [0xb19ea0]  INFO -- : Preparing for ssh tunnel: ssh -R 25888:127.0.0.1:25888 vcap@54.235.133.196
D, [2013-02-07T07:03:04.385738 #18367] [0xb19ea0] DEBUG -- : ssh vcap@54.235.133.196: ESTABLISHED
I, [2013-02-07T07:03:04.385907 #18367] [0xb19ea0]  INFO -- : `ssh -R 25888:127.0.0.1:25888 vcap@54.235.133.196` started: OK

It looks like the "bosh micro deploy" command gave up (timed out) just before the tunnel finally opened? I'm not sure I understand what's going on here.
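
One way to check the timing is to compute the gap between two of the log entries above from their timestamps. This is a minimal sketch, not part of bosh: `log_time` and `LOG_TS` are hypothetical names, and the timestamps are treated as UTC purely so they can be subtracted (the log does not state a time zone).

```ruby
require "time"

# Matches the timestamp in lines such as
#   I, [2013-02-07T07:00:57.923009 #18367] [0xb19ea0]  INFO -- : ...
LOG_TS = /\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+) #\d+\]/

# Hypothetical helper: returns the Time of a log line, or nil if the line
# carries no timestamp. The "Z" suffix is an assumption (see lead-in).
def log_time(line)
  match = LOG_TS.match(line)
  return nil unless match
  Time.iso8601("#{match[1]}Z")
end

discovered  = 'I, [2013-02-07T07:00:57.923009 #18367] [0xb19ea0]  INFO -- : discovered bosh ip=54.235.133.196'
established = 'D, [2013-02-07T07:03:04.385738 #18367] [0xb19ea0] DEBUG -- : ssh vcap@54.235.133.196: ESTABLISHED'

# Subtracting two Time objects yields the difference in seconds as a Float.
elapsed = log_time(established) - log_time(discovered)
puts format("%.1f seconds", elapsed)
```

For the two lines quoted, this reports roughly two minutes between discovering the IP and the ssh tunnel being established.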

@drnic

When I re-run bosh micro deploy ... I get this error:

Stopping agent services      |                   | 0/5 00:00:00  ETA: --:--:--
Error: nil value given for persistent disk id

@drnic

I was able to delete the deployment, so I'm deploying again.

@drnic

On the next deploy it passed this point.

Why does it happen?

@pmenglund

AWS slowness? I wonder if we can tune the ssh timeout setting?
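
To illustrate what a tunable timeout could look like, here is a minimal sketch of polling a TCP port until an overall deadline. It is not the bosh_deployer implementation: `wait_for_port` and its `timeout`, `interval`, and `connect_timeout` parameters are hypothetical names chosen for this example.

```ruby
require "socket"

# Hypothetical helper, not part of bosh_deployer.
# Polls host:port until a TCP connection succeeds or `timeout` seconds pass.
# Returns true once the port accepts a connection, false if the deadline passes.
def wait_for_port(host, port, timeout: 600, interval: 5, connect_timeout: 20)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    begin
      # The block form closes the socket when the block exits.
      Socket.tcp(host, port, connect_timeout: connect_timeout) { return true }
    rescue SystemCallError, SocketError
      # Covers Errno::ETIMEDOUT and Errno::ECONNREFUSED, the errors seen in
      # the logs above; fall through and retry until the deadline.
    end
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Example call (values are illustrative only):
#   wait_for_port("54.235.133.196", 22, timeout: 1800)
```

Separating the per-attempt `connect_timeout` from the overall `timeout` lets slow instances be given more total time without making each individual attempt hang longer.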

@drnic

When it worked correctly, the bosh logs still had the SystemCallError: #<Errno::ETIMEDOUT: Connection timed out - connect(2)> entries, so those are unrelated.

@drnic

Happened again today:

Stemcell info
-------------
Name:    micro-bosh-stemcell
Version: 0.8.1


Deploy Micro BOSH
  unpacking stemcell (00:00:12)                                                 
  uploading stemcell (00:10:37)                                                 
  creating VM from ami-a21786cb (00:00:30)                                      
Waiting for the agent        |ooooo              | 3/11 00:18:27  ETA: 00:23:08
/usr/local/lib/ruby/gems/1.9.1/gems/agent_client-0.1.1/lib/agent_client/http_client.rb:46:in `rescue in request': cannot access agent (Connection refused - connect(2) (http://54.235.133.196:6868)) (Bosh::Agent::Error)
    from /usr/local/lib/ruby/gems/1.9.1/gems/agent_client-0.1.1/lib/agent_client/http_client.rb:29:in `request'
    from /usr/local/lib/ruby/gems/1.9.1/gems/agent_client-0.1.1/lib/agent_client/http_client.rb:56:in `post_json'
    from /usr/local/lib/ruby/gems/1.9.1/gems/agent_client-0.1.1/lib/agent_client/http_client.rb:23:in `handle_method'
    from /usr/local/lib/ruby/gems/1.9.1/gems/agent_client-0.1.1/lib/agent_client/base.rb:17:in `method_missing'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_deployer-1.4.0/lib/deployer/instance_manager.rb:406:in `block in wait_until_agent_ready'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_deployer-1.4.0/lib/deployer/instance_manager.rb:394:in `wait_until_ready'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_deployer-1.4.0/lib/deployer/instance_manager.rb:406:in `wait_until_agent_ready'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_deployer-1.4.0/lib/deployer/instance_manager/aws.rb:122:in `wait_until_agent_ready'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_deployer-1.4.0/lib/deployer/instance_manager.rb:145:in `block in create'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_deployer-1.4.0/lib/deployer/instance_manager.rb:84:in `step'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_deployer-1.4.0/lib/deployer/instance_manager.rb:144:in `create'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_deployer-1.4.0/lib/deployer/instance_manager.rb:104:in `block in create_deployment'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_deployer-1.4.0/lib/deployer/instance_manager.rb:97:in `with_lifecycle'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_deployer-1.4.0/lib/deployer/instance_manager.rb:103:in `create_deployment'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_deployer-1.4.0/lib/bosh/cli/commands/micro.rb:171:in `perform'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_cli-1.0.3/lib/cli/command_handler.rb:57:in `run'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_cli-1.0.3/lib/cli/runner.rb:61:in `run'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_cli-1.0.3/lib/cli/runner.rb:18:in `run'
    from /usr/local/lib/ruby/gems/1.9.1/gems/bosh_cli-1.0.3/bin/bosh:16:in `<top (required)>'
    from /usr/local/bin/bosh:23:in `load'
    from /usr/local/bin/bosh:23:in `<main>'
       error  deploy micro bosh
@mkocher
Owner

How long is this normally taking for you when it works? I assume 19 minutes is much longer than normal?

@drnic
@nand2

I am having the same problem: a 38-minute timeout, and then a 7-minute timeout, with the same error and logs. This is with t1.micro instances. I'm now trying bigger instances to see if it makes any difference.

@nand2

Yes, it made a difference: with an m1.medium, it waited 3 minutes for the agent and then proceeded to the next steps.

@Amit-PivotalLabs

Hey @drnic, are you still having this issue?

@drnic

Closing this as those gems/stemcells are very old now.

@drnic drnic closed this