apollo-launch is stuck on "wait for weave to listen" the first time #662

Closed
sheerun opened this issue Mar 1, 2016 · 7 comments

Comments

Contributor

sheerun commented Mar 1, 2016

Hey,

I tried to deploy Apollo on DigitalOcean, but on the first run it gets stuck on "wait for weave to listen". The second time I run it, everything is OK. Any ideas?

Contributor Author

sheerun commented Mar 1, 2016

Here are the logs from one of the masters (46.101.9.14). The other masters were 46.101.9.249 and 188.166.157.176, and the slave is 46.101.9.201.

...
Mar 01 11:52:23 apollo-mesos-master-0 weave[1533]: 548656b1b629: Pull complete
Mar 01 11:52:23 apollo-mesos-master-0 weave[1533]: 1275dac6c4ff: Pull complete
Mar 01 11:52:23 apollo-mesos-master-0 weave[1533]: Digest: sha256:58c88a203a38fb0765ca7fe78d088fe88f6e7817be8dc1bb6395742cd94f570e
Mar 01 11:52:23 apollo-mesos-master-0 weave[1533]: Status: Downloaded newer image for weaveworks/weave:1.4.4
Mar 01 11:52:26 apollo-mesos-master-0 weave[1533]: 2e7762298c48f29c3d157a1f31be9b415e3597e3ebb70e118c9f91ea4835756a
Mar 01 11:52:26 apollo-mesos-master-0 docker[1986]: INFO: 2016/03/01 11:52:26.285640 EMSGSIZE on send, expecting PMTU update (IP packet was 60028 bytes, payload was 60020 bytes)
Mar 01 11:52:26 apollo-mesos-master-0 docker[1986]: INFO: 2016/03/01 11:52:26.286017 ->[188.166.157.176:44772|6e:c9:66:e2:75:05(apollo-mesos-master-2)]: connection fully established
Mar 01 11:52:26 apollo-mesos-master-0 docker[1986]: INFO: 2016/03/01 11:52:26.288430 sleeve ->[188.166.157.176:6783|6e:c9:66:e2:75:05(apollo-mesos-master-2)]: Effective MTU verified at 1438
Mar 01 11:52:26 apollo-mesos-master-0 docker[1986]: INFO: 2016/03/01 11:52:26.346471 Discovered remote MAC 1e:b7:14:e1:6f:c9 at 12:3b:09:08:70:ca(apollo-mesos-slave-0)
Mar 01 11:52:26 apollo-mesos-master-0 weave[1989]: 10.45.85.85
Mar 01 11:52:26 apollo-mesos-master-0 systemd[1]: Started Weave Network Router.
Mar 01 11:52:39 apollo-mesos-master-0 docker[1986]: INFO: 2016/03/01 11:52:39.367483 ->[127.0.0.1:47788] connection accepted
Mar 01 11:52:39 apollo-mesos-master-0 docker[1986]: ERRO: 2016/03/01 11:52:39.373020 ->[127.0.0.1:47788] connection shutting down due to error during handshake: failed to receive remote protocol header: EOF
Mar 01 11:52:47 apollo-mesos-master-0 systemd[1]: Stopping Weave Network Router...
Mar 01 11:52:47 apollo-mesos-master-0 docker[1986]: ERRO: 2016/03/01 11:52:47.933033 ->[46.101.9.201:6783|12:3b:09:08:70:ca(apollo-mesos-slave-0)]: connection shutting down due to error: read tcp4 46.101.9.201:6783: connection reset by peer
Mar 01 11:52:47 apollo-mesos-master-0 docker[1986]: INFO: 2016/03/01 11:52:47.933239 ->[46.101.9.201:6783|12:3b:09:08:70:ca(apollo-mesos-slave-0)]: connection deleted
Mar 01 11:52:47 apollo-mesos-master-0 docker[1986]: INFO: 2016/03/01 11:52:47.939850 Removed unreachable peer 12:3b:09:08:70:ca(apollo-mesos-slave-0)
Mar 01 11:52:48 apollo-mesos-master-0 docker[1986]: ERRO: 2016/03/01 11:52:48.052011 ->[46.101.9.249:6783|42:95:9b:0d:9c:58(apollo-mesos-master-1)]: connection shutting down due to error: read tcp4 46.101.9.249:6783: connection reset by peer
Mar 01 11:52:48 apollo-mesos-master-0 docker[1986]: INFO: 2016/03/01 11:52:48.052097 ->[46.101.9.249:6783|42:95:9b:0d:9c:58(apollo-mesos-master-1)]: connection deleted
Mar 01 11:52:48 apollo-mesos-master-0 docker[1986]: INFO: 2016/03/01 11:52:48.052958 Removed unreachable peer 42:95:9b:0d:9c:58(apollo-mesos-master-1)
Mar 01 11:52:48 apollo-mesos-master-0 docker[1986]: INFO: 2016/03/01 11:52:48.079459 === received SIGINT/SIGTERM ===
Mar 01 11:52:48 apollo-mesos-master-0 docker[1986]: *** exiting
Mar 01 11:52:48 apollo-mesos-master-0 systemd[1]: Stopped Weave Network Router.

Contributor Author

sheerun commented Mar 1, 2016

The problem seems to occur on the flush_handlers task.


bkarypid commented Mar 1, 2016

After getting the newest changes in the devel branch, I am also getting an error, 'ERROR: change handler (systemd reload) is not defined', the first time the task [weave | deploy weave service] runs. This happens on both AWS public and Rackspace. It was working fine before I merged. Running the Ansible playbook a second time goes through OK.

Member

tayzlor commented Mar 1, 2016

Having a look. This is highly likely related to commit bdf513a.

Member

tayzlor commented Mar 1, 2016

#664 addresses the systemd issue. It looks like I messed up a conflict resolution in bdf513a.

tayzlor added a commit that referenced this issue Mar 1, 2016
Member

tayzlor commented Mar 1, 2016

@sheerun I just brought up devel (patched with #664) on DigitalOcean and didn't experience any weave issues. Can you confirm you are running the devel branch? There is actually no "wait for weave to listen" task on that branch.

Contributor Author

sheerun commented Mar 1, 2016

I still experience this issue. It happens only the first time I run apollo-launch.
