team netdevice library
Latest commit 046fb6b (Feb 6, 2017) by Antti Tiainen: libteam: resynchronize ifinfo after lost RTNLGRP_LINK notifications
When there is a large number of interfaces (e.g. vlans), teamd loses
link notifications because it cannot read them as fast as the kernel
broadcasts them. This often prevents teamd from starting properly when
it is started concurrently while other links are being set up. It can
also fail while up and running, especially when the team device itself
has a lot of vlans under it.

This can easily be reproduced with a simple example (on an SMP system)
by manually adding a team device with a bunch of vlans, putting it up,
and starting teamd with the --take-over option:

  root@debian:~# ip link add name team0 type team
  root@debian:~# for i in `seq 100 150` ; do
  > ip link add link team0 name team0.$i type vlan id $i ; done
  root@debian:~# ip link set team0 up
  root@debian:~# cat teamd.conf
  {
    "device": "team0",
    "runner": {
      "name": "activebackup"
     },
    "ports": {
      "eth1": {},
      "eth2": {}
    }
  }
  root@debian:~# teamd -o -N -f teamd.conf

At this point, teamd gives no error message or other indication that
something is wrong. But the state does not look healthy:

  root@debian:~# teamdctl team0 state
  setup:
    runner: activebackup
  ports:
    eth1
      link watches:
        link summary: up
        instance[link_watch_0]:
          name: ethtool
          link: up
          down count: 0
  Failed to parse JSON port dump.
  command call failed (Invalid argument)

Checking the state dump shows that port eth2 is missing its info.
Running strace on teamd reveals one recvmsgs() call that returned -1
with errno ENOBUFS. What happened in this example is that when teamd
started, all vlans got carrier up, and the kernel flooded notifications
faster than teamd could read them. teamd then lost the events related
to port eth2 getting enslaved and brought up.

The socket that joins RTNLGRP_LINK notifications uses the default libnl
32k buffer size. Netlink messages are large (over 1k), so this buffer
fills up easily. The kernel neither knows nor cares whether notification
broadcasts were delivered. This cannot be fixed by simply increasing the
buffer size: no size is guaranteed to work in every use case, and with
hundreds of vlans it could require several megabytes of buffer (way over
the normal rmem_max limit).
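
To illustrate the failure mode (this is a minimal sketch, not code from
this commit), an rtnetlink event socket built with the public libnl API
and left at its default 32k receive buffer looks roughly like the
following; under a burst of link notifications the kernel reports
ENOBUFS, which libnl surfaces as -NLE_NOMEM:

  /* Minimal sketch of a libnl rtnetlink event socket, showing where
   * the overflow becomes visible. Error handling is trimmed. */
  #include <netlink/netlink.h>
  #include <netlink/socket.h>
  #include <netlink/errno.h>
  #include <linux/rtnetlink.h>
  #include <stdio.h>

  int main(void)
  {
          struct nl_sock *sock = nl_socket_alloc();

          nl_socket_disable_seq_check(sock);   /* async notifications */
          nl_connect(sock, NETLINK_ROUTE);
          nl_socket_add_membership(sock, RTNLGRP_LINK);
          /* receive buffer stays at libnl's 32k default unless changed
           * with nl_socket_set_buffer_size() */

          for (;;) {
                  int err = nl_recvmsgs_default(sock);

                  if (err == -NLE_NOMEM) {
                          /* kernel reported ENOBUFS: notifications were
                           * dropped and the cached ifinfo is now stale */
                          fprintf(stderr, "overflow: %s\n",
                                  nl_geterror(err));
                  }
          }
  }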

The only way to recover from this is to refresh the whole ifinfo list,
as it is invalidated at this point. It cannot easily be worked around
by refreshing only the team device and its ports, because the library
side might not have the ports linked due to the missed events, and it
does not know about the teamd configuration.

The return value of nl_recvmsgs_default() is now checked for the event
socket. In case of ENOBUFS (which libnl nicely changes to ENOMEM), the
whole ifinfo list is refreshed. get_ifinfo_list() now also checks for
removed interfaces in case a dellink event was missed. Previously, all
TEAM_IFINFO_CHANGE handlers processed events one by one, so they had to
be changed to support multiple ifinfo changes. For this, the ifinfo
changed flags are cleared and removed entries destroyed only after all
handlers have been called.
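
A hedged sketch of this recovery pattern is below. The
refresh_all_ifinfo() helper is hypothetical, standing in for libteam's
internal ifinfo resynchronization; only the nl_recvmsgs_default() /
NLE_NOMEM handling mirrors what the commit message describes:

  /* Sketch of the event-socket read path with overflow recovery.
   * refresh_all_ifinfo() is a hypothetical stand-in for the full
   * ifinfo list resynchronization described above. */
  #include <netlink/netlink.h>
  #include <netlink/errno.h>

  static int refresh_all_ifinfo(struct nl_sock *sock)
  {
          /* Re-dump all links (RTM_GETLINK), rebuild the ifinfo list,
           * drop entries whose dellink event was missed, and only then
           * run the TEAM_IFINFO_CHANGE handlers and clear the changed
           * flags. */
          return 0;
  }

  static int read_link_events(struct nl_sock *sock)
  {
          int err = nl_recvmsgs_default(sock);

          if (err == -NLE_NOMEM) {
                  /* Receive buffer overflowed (kernel ENOBUFS): some
                   * RTNLGRP_LINK notifications are gone for good, so
                   * the cached state is invalid and must be rebuilt. */
                  return refresh_all_ifinfo(sock);
          }
          return err;
  }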

Also, the nl_cli.sock_event receive buffer is increased to 96k, like
all other sockets, with the possibility to change this via an
environment variable.
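
Such a receive-buffer override might look like the sketch below. The
TEAM_EVENT_BUFSIZE variable name and the helper are assumptions for
illustration, not the commit's actual interface:

  /* Hypothetical helper: size the event socket receive buffer, with an
   * environment override. TEAM_EVENT_BUFSIZE is an assumed name, not
   * necessarily what libteam uses. */
  #include <netlink/socket.h>
  #include <stdlib.h>

  #define EVENT_SOCK_RCVBUF_DEFAULT (96 * 1024)

  static int set_event_sock_rcvbuf(struct nl_sock *sock)
  {
          int rcvbuf = EVENT_SOCK_RCVBUF_DEFAULT;
          const char *env = getenv("TEAM_EVENT_BUFSIZE");

          if (env && atoi(env) > 0)
                  rcvbuf = atoi(env);

          /* 0 keeps libnl's default for the transmit buffer */
          return nl_socket_set_buffer_size(sock, rcvbuf, 0);
  }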

Signed-off-by: Antti Tiainen <atiainen@forcepoint.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>


# SUBMITTING PATCHES / PULL REQUESTS - README!!! #

All github pull requests will be ignored! Please send the patches to the libteam mailing list, according to "SubmittingPatches" file.

# libteam - Library for controlling team network device #

The purpose of the Team driver is to provide a mechanism to team multiple NICs (ports) into one logical NIC (teamdev) at the L2 layer. The process is variously called "channel bonding", "Ethernet bonding", "channel teaming", "link aggregation", etc. This is already implemented in the Linux kernel by the bonding driver.

One thing to note is that the Team driver project does try to provide functionality similar to the bonding driver; however, it is architecturally quite different. The Team driver is modular, userspace-driven, very lean and efficient, and it has some distinct advantages over bonding. The way Team is configured also differs dramatically from the way bonding is.

## Install

    $ ./autogen.sh
    $ ./configure
    $ make
    $ sudo make install

## Authors

* Jiri Pirko <jiri@resnulli.us>

## Internet Resources

* Project Home:     http://www.libteam.org/
* Git Source Tree:  https://github.com/jpirko/libteam/
* Wiki:             https://github.com/jpirko/libteam/wiki
* Tutorial:         https://github.com/jpirko/libteam/wiki/Tutorial
* Documentation:    https://github.com/jpirko/libteam/wiki/Infrastructure-Specification

## License

Copyright (C) 2011-2015 Jiri Pirko <jiri@resnulli.us>

libteam is distributed under GNU Lesser General Public License version 2.1.
See the file "COPYING" in the source distribution for information on terms & conditions for accessing and otherwise using libteam.