Dashboard "dial tcp i/o timeout" error _possibly_ due to Weave networking #1246
Versions of kops:
How I set my cluster up:
I ran the command below and waited a while for all instances behind the ELBs to be
How I set up the dashboard and the version:
Obviously I did the
Dashboard works but I intermittently get the below error:
PS 2: I intend to use this for stuff that's semi-serious; I haven't gotten round to putting other services on there yet. I want to solve this first.
PS 3: I didn't have this issue when using the old networking (
What do other people say about this issue?
Nothing stands out to me except this comment and this one. The commenter talks about how "Source/Dest check was enabled on my minions, even though it shouldn't have been" (which we do, by the way). The other results in my search don't seem worth looking at (IMHO, of course).
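For anyone who wants to rule this out on their own nodes, here is a rough sketch of checking and disabling the EC2 source/dest check with the AWS CLI. The instance ID is a placeholder; this is illustrative, not something from the original report.

```shell
# Inspect the current source/dest-check setting on a node
# (instance ID below is a placeholder — substitute your own).
aws ec2 describe-instance-attribute \
  --instance-id i-0123456789abcdef0 \
  --attribute sourceDestCheck

# Disable it, which overlay networks like Weave generally require.
aws ec2 modify-instance-attribute \
  --instance-id i-0123456789abcdef0 \
  --no-source-dest-check
```

These commands need valid AWS credentials and region configuration, so they can only be run against a real account.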
What do I think?
The networking that
To add to the
Looking forward to your opinions. Feel free to close the issue if it's not
What do the logs say?:
Every time I hit the error I get this (see below) on one of the masters
I thought these would be helpful for knowing the state of the cluster
Awesome. The only thing I could get close to
Looks like we are using 1.8.1 ... I can see 1.8.2 has a bunch of bug fixes and minor improvements. It's possible that an upgrade would solve the problem we are having here ... in fact, in the changelog I just spotted this: "Fixed a bug where Kubernetes master could not contact pods weaveworks/weave#2673, weaveworks/weave#2683"
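One quick way to confirm which Weave image a cluster is actually running, assuming the standard weave-net DaemonSet in kube-system (names may differ per setup):

```shell
# Print the image tags used by the weave-net DaemonSet
# (assumes it lives in kube-system under the default name).
kubectl get daemonset weave-net -n kube-system \
  -o jsonpath='{.spec.template.spec.containers[*].image}'
```

If that prints 1.8.1 images, the DaemonSet would need to be updated to pull the 1.8.2 images before the fixes in that release apply.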
Yes. I still have this cluster up and can consistently replicate the error by refreshing the dashboard a couple of times ... after a few tries it will eventually take a while to load, then result in the aforementioned error (with an exception in the logs of the API server on one of the masters).
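The refresh-until-it-breaks loop can be scripted instead of clicking, which makes the failure easier to time. This is a sketch only: it assumes `kubectl proxy` on localhost:8001 and the dashboard service name/path used by Kubernetes clusters of that era.

```shell
# Start a local proxy to the API server (runs in the background).
kubectl proxy --port=8001 &

# Hit the dashboard repeatedly; slow responses or non-200 codes
# correspond to the intermittent timeout described above.
for i in $(seq 1 20); do
  curl -s -o /dev/null -w "%{http_code} %{time_total}s\n" --max-time 30 \
    http://localhost:8001/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/
done
```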
I've tried tearing down the cluster and bringing it back up, on the off-chance that maybe something wasn't configured correctly the first time round ... the issue still persisted.
Referenced this issue on Dec 23, 2016.
Figured it out. Easier way to check all 6
Check 'weave-net' pod 'weave-npc' container logs:
Check 'weave-net' pod 'weave' container logs:
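The two log checks above can be run like this; the pod name is a placeholder that has to be looked up first, since DaemonSet pods get generated suffixes.

```shell
# Find the weave-net pod on the node you care about.
kubectl get pods -n kube-system -o wide | grep weave-net

# Check the 'weave-npc' container logs (network policy controller).
kubectl logs -n kube-system <weave-net-pod> -c weave-npc

# Check the 'weave' container logs (the router itself).
kubectl logs -n kube-system <weave-net-pod> -c weave
```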
Above is a demonstration of their recommended way of troubleshooting connections, but I'm afraid it doesn't tell us much. Upgrading to Weave 1.8.2, as the logs suggest, seems to be the best option at this point, given the bug fixes and minor improvements in that release.