This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Peer connection attempts reported as failed #3398

Closed
erik-stephens opened this issue Sep 7, 2018 · 1 comment

Comments

@erik-stephens

What you expected to happen?

Peer connection attempts to self not to be reported as failed.

Background

We got alarmed by these. After some research, the takeaway is that it's noise. However, we also saw a lot of issues about peers not being able to join shortly after CONN_LIMIT. We don't scale nodes much, so it's less of a concern for us, but we still thought we should pay attention to this metric just in case. Our workaround is simply to treat N failed connections for N nodes in the cluster as normal.

Some Ideas

  1. Adding a new state just for this might be a no-go.
  2. Adding special handling in the metrics reporting to ignore when Info ~ "cannot connect to ourself" might be too brittle.
  3. Patch something in the kubernetes add-on so that a node doesn't try to connect to itself. The hope is that something in kubernetes land can better inform weave which peers to try.

If core devs provide a bit of guidance, we might be able to submit a PR.

Versions:

  • weave 2.4.0
  • kubernetes 1.11

Logs:

kubectl -n kube-system exec weave-net-fqv5t -c weave -- /home/weave/weave --local report | jq -r '.Router.Connections[] | select(.State != "established")'
{
  "Address": "10.210.1.78:6783",
  "Outbound": true,
  "State": "failed",
  "Info": "cannot connect to ourself, retry: never",
  "Attrs": null
}
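
For anyone who wants to script the workaround described under Background, here is a rough sketch. It assumes the report JSON has the `Router.Connections` shape shown above and that the self-connection case can be recognised by the `Info` text, which is the brittle part mentioned in idea 2. The function name and the marker constant are made up for illustration; nothing here is part of weave itself.

```python
import json

# Assumption: this substring identifies the self-connection case, as seen
# in the "Info" field of the report output above. It may change between releases.
SELF_CONNECT_MARKER = "cannot connect to ourself"


def split_failed_connections(report):
    """Split failed connections into (self_failures, real_failures).

    `report` is the parsed JSON from `weave --local report`.
    """
    connections = (report.get("Router") or {}).get("Connections") or []
    self_failures, real_failures = [], []
    for conn in connections:
        if conn.get("State") != "failed":
            continue
        if SELF_CONNECT_MARKER in (conn.get("Info") or ""):
            self_failures.append(conn)
        else:
            real_failures.append(conn)
    return self_failures, real_failures


# Example using the connection entry from the log output above.
sample = json.loads("""
{"Router": {"Connections": [
  {"Address": "10.210.1.78:6783", "Outbound": true, "State": "failed",
   "Info": "cannot connect to ourself, retry: never", "Attrs": null}
]}}
""")
ignored, actionable = split_failed_connections(sample)
print(len(ignored), "self-connection failures ignored;",
      len(actionable), "failures worth alerting on")
```

Alerting only on the second list avoids hard-coding "N failures for N nodes", at the cost of depending on the message text.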
@murali-reddy
Contributor

"If core devs provide a bit of guidance, we might be able to submit a PR."

@erik-stephens Apologies for the delay in responding. Please do go ahead and submit a PR.

Please take a look at the kube-utils program, which, when called, returns the current list of Kubernetes node IPs. launch.sh passes that list as an argument when it launches the main weave program.

So try excluding the node's own IP from the list returned by kube-utils.
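
The suggested change comes down to a simple filter. The sketch below only illustrates the logic; it is not the kube-utils code, and the function name and the way the node's own addresses are obtained are assumptions made for the example.

```python
def peers_without_self(node_ips, own_ips):
    """Return node_ips with the local node's own addresses removed.

    node_ips: IPs of all Kubernetes nodes (what kube-utils is described as returning).
    own_ips:  addresses belonging to the local node (hypothetical input; how the
              real program would discover them is not covered here).
    """
    own = set(own_ips)
    # Preserve the original order so the peer list stays stable between runs.
    return [ip for ip in node_ips if ip not in own]


# Example: the node at 10.210.1.78 should not list itself as a peer.
peers = peers_without_self(
    ["10.210.1.77", "10.210.1.78", "10.210.1.79"],
    ["10.210.1.78"],
)
print(" ".join(peers))
```

With the local address filtered out before the list reaches the main weave program, the router never attempts the self-connection, so no "cannot connect to ourself" entry shows up in the report.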

murali-reddy added a commit that referenced this issue Nov 21, 2018
node IP from the list of the peers passed to weaver

Fixes #3398
@bboreham bboreham added this to the 2.6 milestone Nov 4, 2019