net: bridge: avoid uselessly making offloaded ports promiscuous
The bridge driver's intention by making ports promiscuous is to turn off
their RX filters such that these ports receive packets with any MAC DA.

A quick survey of the kernel drivers that call
switchdev_bridge_port_offload() shows that either these do not implement
ndo_change_rx_flags() at all, or they explicitly ignore changes to
IFF_PROMISC (am65_cpsw_slave_set_promisc, cpsw_set_promiscious,
ocelot_set_rx_mode).

This makes sense, because hardware that is purpose-built to do L2
forwarding generally already knows it should accept any MAC DA on its
ports.

That is not to say that IFF_PROMISC makes no sense for switchdev drivers.
For example, DSA has the concept of multiple address databases (this is
achieved by effectively partitioning the FDB: reserve a database - FID -
for each port operating as standalone, a FID for each VLAN-unaware
bridge, a FID for each bridge VLAN). The address database of a
standalone port is managed through the standard dev->uc and dev->mc
lists and is used to filter towards the host the addresses required for
local termination. The bridge-related address databases are managed
using switchdev (SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE).
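The FDB partitioning described above can be sketched as a toy FID
allocator. This is only an illustration under stated assumptions: the
names db_kind, db_key, fid_table, fid_get and db_managed_by_switchdev are
hypothetical and are not part of DSA or of any driver.

```c
#include <stdbool.h>

/* Hypothetical model, not the DSA API: who owns an address database. */
enum db_kind {
	DB_STANDALONE_PORT,	/* one database per standalone port */
	DB_BRIDGE_VLAN_UNAWARE,	/* one database per VLAN-unaware bridge */
	DB_BRIDGE_VLAN,		/* one database per bridge VLAN */
};

struct db_key {
	enum db_kind kind;
	int port;	/* meaningful for DB_STANDALONE_PORT */
	int bridge;	/* meaningful for both bridge kinds */
	int vid;	/* meaningful for DB_BRIDGE_VLAN */
};

#define MAX_FIDS 16

struct fid_table {
	struct db_key keys[MAX_FIDS];
	int count;
};

/* Two keys name the same database if the fields relevant to their kind match. */
static bool db_key_equal(const struct db_key *a, const struct db_key *b)
{
	if (a->kind != b->kind)
		return false;

	switch (a->kind) {
	case DB_STANDALONE_PORT:
		return a->port == b->port;
	case DB_BRIDGE_VLAN_UNAWARE:
		return a->bridge == b->bridge;
	case DB_BRIDGE_VLAN:
		return a->bridge == b->bridge && a->vid == b->vid;
	}

	return false;
}

/* Return the FID reserved for this database, reserving a new one if
 * needed. Returns -1 when the (toy) table is full.
 */
static int fid_get(struct fid_table *t, struct db_key key)
{
	for (int i = 0; i < t->count; i++)
		if (db_key_equal(&t->keys[i], &key))
			return i;

	if (t->count == MAX_FIDS)
		return -1;

	t->keys[t->count] = key;
	return t->count++;
}

/* Standalone databases are fed from dev->uc and dev->mc; the
 * bridge-related ones are fed through switchdev FDB notifications.
 */
static bool db_managed_by_switchdev(const struct db_key *key)
{
	return key->kind != DB_STANDALONE_PORT;
}
```

In this model a standalone port, a VLAN-unaware bridge and two VLANs of
another bridge end up with four distinct FIDs, and only the first of
them is populated from the port's own address lists.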

IFF_PROMISC is intrinsically connected to dev->uc and dev->mc (see the
implementation of __dev_set_rx_mode which puts the interface in
promiscuous mode if the unicast list isn't empty but the device doesn't
support IFF_UNICAST_FLT), and therefore to what DSA implements as the
standalone port address database (there, an entry in dev->uc means
"forward it to CPU", the absence of it means "drop it", and promiscuity
means "put the CPU in the flood mask of packets with unknown MAC DA").
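The semantics listed in the parenthesis above can be modelled in a few
lines. This is a simplified sketch, not kernel code: standalone_port,
host_filter and needs_promisc_fallback are hypothetical names, and the
second helper only paraphrases the __dev_set_rx_mode behaviour described
in this message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6

enum host_action { HOST_DROP, HOST_FORWARD };

/* Hypothetical model of a standalone port's address database. */
struct standalone_port {
	const unsigned char (*uc)[MAC_LEN];	/* models dev->uc */
	size_t uc_count;
	bool promisc;				/* models IFF_PROMISC */
};

/* An entry in the list means "forward it to CPU", the absence of it
 * means "drop it", and promiscuity makes unknown addresses reach the
 * CPU as well.
 */
static enum host_action host_filter(const struct standalone_port *p,
				    const unsigned char da[MAC_LEN])
{
	for (size_t i = 0; i < p->uc_count; i++)
		if (memcmp(p->uc[i], da, MAC_LEN) == 0)
			return HOST_FORWARD;

	return p->promisc ? HOST_FORWARD : HOST_DROP;
}

/* A device that cannot filter unicast addresses has to fall back to
 * promiscuous mode as soon as its unicast list is not empty.
 */
static bool needs_promisc_fallback(size_t uc_count, bool supports_unicast_flt)
{
	return uc_count > 0 && !supports_unicast_flt;
}
```

The point of the model is that both inputs (the address list and the
promiscuity flag) describe the standalone database only; nothing in it
refers to the bridge-related databases.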

In contrast, there is no IFF_PROMISC equivalent for the FDB entries
notified through switchdev (and therefore for the bridge-related address
databases), because none is needed.

In this model, the bridge driver, which is only trying to secure its
reception of packets, is in fact overstepping, because it manages
something which is outside of its competence: the host flooding of the
standalone port database, when in fact that database will not be the one
used by packets handled by the bridging service.

In turn, this prevents further optimizations from being applied in
particular to DSA, and in general to any switchdev driver. A desirable
goal is to eliminate host flooding of packets which are known to be
unnecessary and only dropped later in software [1].

In an ideal world with ideal hardware:
(a) flooding would be controlled per FID rather than per port
(b) egress flooding towards a certain port would be controlled
    independently for each ingress port, rather than globally,
    regardless of ingress port

When (a) does not hold true, the bridge will force the port to keep host
flooding enabled, even if this is not otherwise needed (there is no
station behind a "foreign interface" that requires software forwarding;
the only packets sent by the accelerator to the CPU are for termination
purposes).

When (b) does not hold true, it means that a 4-port switch where 1 port
is standalone and 3 are bridged (again with no foreign interface) will
have host flooding enabled for all 4 ports (including the standalone
port, because the bridge is keeping host flooding enabled, and all ports
are serviced by the same CPU port).
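The 4-port scenario can be reproduced with a toy model of the flood
control towards the CPU port. The structure and function names below are
hypothetical; the sketch assumes that each port carries a single "wants
host flooding" request and compares ideal hardware (per ingress port
control) with hardware that has one global control for the CPU port.

```c
#include <stdbool.h>

#define NUM_PORTS 4

/* Hypothetical per-port state of a toy switch. */
struct port_state {
	bool bridged;		/* port is under a bridge */
	bool wants_host_flood;	/* e.g. promiscuity was requested on it */
};

/* Ideal hardware: flooding towards the CPU port is decided per ingress
 * port, so each port only pays for its own request.
 */
static bool host_flood_ideal(const struct port_state ports[NUM_PORTS],
			     int ingress)
{
	return ports[ingress].wants_host_flood;
}

/* Hardware where (b) does not hold: there is a single control for the
 * CPU port, so one request enables host flooding for every ingress port.
 */
static bool host_flood_shared(const struct port_state ports[NUM_PORTS],
			      int ingress)
{
	(void)ingress;	/* the hardware cannot tell ingress ports apart */

	for (int i = 0; i < NUM_PORTS; i++)
		if (ports[i].wants_host_flood)
			return true;

	return false;
}
```

With port 0 standalone and not requesting flooding, and ports 1-3
bridged and made promiscuous by the bridge, the ideal model keeps host
flooding off for port 0 while the shared model turns it on for all 4
ports.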

Since DSA is a framework and not just a driver for a single device,
these nonidealities do hold true, and the bridge unnecessarily setting
IFF_PROMISC on its ports is a real roadblock towards disabling host
flooding in practical scenarios.

The proposed solution is to make the bridge driver stop touching port
promiscuity for offloaded switchdev ports, and let them manage
promiscuity by themselves as they see fit. It can achieve this by
looking at net_bridge_port :: offload_count, which is updated
voluntarily by switchdev drivers using switchdev_bridge_port_offload().

br_manage_promisc() is already called by nbp_update_port_count() on a
port join/leave, and the implicit assumption is that
switchdev_bridge_port_offload() has already been called by that time
(from netdev_master_upper_dev_link).
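The resulting per-port policy can be summarized as a pure function. This
is a user-space sketch for illustration only: bridge_model, port_model
and bridge_wants_port_promisc are hypothetical names, with is_auto
standing in for br_auto_port(p) and vlan_filtering for
br_vlan_enabled(br->dev).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bridge and port state that
 * br_manage_promisc() looks at.
 */
struct bridge_model {
	bool promisc;		/* models br->dev->flags & IFF_PROMISC */
	bool vlan_filtering;	/* models br_vlan_enabled(br->dev) */
	int auto_cnt;		/* models br->auto_cnt */
};

struct port_model {
	int offload_count;	/* models p->offload_count */
	bool is_auto;		/* models br_auto_port(p) */
};

/* Return true if the bridge should request promiscuity on this port. */
static bool bridge_wants_port_promisc(const struct bridge_model *br,
				      const struct port_model *p)
{
	/* Offloaded ports manage promiscuity by themselves. */
	if (p->offload_count > 0)
		return false;

	/* A promiscuous bridge needs all unknown traffic from its ports. */
	if (br->promisc)
		return true;

	/* Without VLAN filtering, all ports are made promiscuous. */
	if (!br->vlan_filtering)
		return true;

	/* With at most one auto-port, the ingress configuration is
	 * statically known and promiscuity is not needed.
	 */
	if (br->auto_cnt == 0 || (br->auto_cnt == 1 && p->is_auto))
		return false;

	return true;
}
```

The first check is the only new decision; the remaining ones restate
the existing policy in early-return form.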

[1] https://www.youtube.com/watch?v=B1HhxEcU7Jg

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
vladimiroltean authored and intel-lab-lkp committed Apr 8, 2022
1 parent 0c56b50 commit 2b24e24
Showing 1 changed file with 39 additions and 24 deletions.
63 changes: 39 additions & 24 deletions net/bridge/br_if.c
@@ -135,34 +135,49 @@ static void br_port_clear_promisc(struct net_bridge_port *p)
 void br_manage_promisc(struct net_bridge *br)
 {
 	struct net_bridge_port *p;
-	bool set_all = false;
-
-	/* If vlan filtering is disabled or bridge interface is placed
-	 * into promiscuous mode, place all ports in promiscuous mode.
-	 */
-	if ((br->dev->flags & IFF_PROMISC) || !br_vlan_enabled(br->dev))
-		set_all = true;
 
 	list_for_each_entry(p, &br->port_list, list) {
-		if (set_all) {
+		/* Offloaded ports have a separate address database for
+		 * forwarding, which is managed through switchdev and not
+		 * through dev_uc_add(), so the promiscuous concept makes no
+		 * sense for them. Avoid updating promiscuity in that case.
+		 */
+		if (p->offload_count) {
+			br_port_clear_promisc(p);
+			continue;
+		}
+
+		/* If bridge is promiscuous, unconditionally place all ports
+		 * in promiscuous mode too. This allows the bridge device to
+		 * locally receive all unknown traffic.
+		 */
+		if (br->dev->flags & IFF_PROMISC) {
+			br_port_set_promisc(p);
+			continue;
+		}
+
+		/* If vlan filtering is disabled, place all ports in
+		 * promiscuous mode.
+		 */
+		if (!br_vlan_enabled(br->dev)) {
 			br_port_set_promisc(p);
-		} else {
-			/* If the number of auto-ports is <= 1, then all other
-			 * ports will have their output configuration
-			 * statically specified through fdbs. Since ingress
-			 * on the auto-port becomes forwarding/egress to other
-			 * ports and egress configuration is statically known,
-			 * we can say that ingress configuration of the
-			 * auto-port is also statically known.
-			 * This lets us disable promiscuous mode and write
-			 * this config to hw.
-			 */
-			if (br->auto_cnt == 0 ||
-			    (br->auto_cnt == 1 && br_auto_port(p)))
-				br_port_clear_promisc(p);
-			else
-				br_port_set_promisc(p);
+			continue;
 		}
+
+		/* If the number of auto-ports is <= 1, then all other ports
+		 * will have their output configuration statically specified
+		 * through fdbs. Since ingress on the auto-port becomes
+		 * forwarding/egress to other ports and egress configuration is
+		 * statically known, we can say that ingress configuration of
+		 * the auto-port is also statically known.
+		 * This lets us disable promiscuous mode and write this config
+		 * to hw.
+		 */
+		if (br->auto_cnt == 0 ||
+		    (br->auto_cnt == 1 && br_auto_port(p)))
+			br_port_clear_promisc(p);
+		else
+			br_port_set_promisc(p);
 	}
 }
