Vladimir-Oltea…
Commits on Mar 18, 2021
-
net: bridge: switchdev: let drivers inform which bridge ports are off…
…loaded On reception of an skb, the bridge checks if it was marked as 'already forwarded in hardware' (checks if skb->offload_fwd_mark == 1), and if it is, it puts a mark of its own on that skb, with the switchdev mark of the ingress port. Then during forwarding, it enforces that the egress port must have a different switchdev mark than the ingress one (this is done in nbp_switchdev_allowed_egress). Non-switchdev drivers don't report any physical switch id (neither through devlink nor .ndo_get_port_parent_id), therefore the bridge assigns them a switchdev mark of 0, and packets coming from them will always have skb->offload_fwd_mark = 0. So there aren't any restrictions. Problems appear due to the fact that DSA would like to perform software fallback for bonding and team interfaces that the physical switch cannot offload. +-- br0 -+ / / | \ / / | \ / / | \ / / | \ / / | \ / | | bond0 / | | / \ swp0 swp1 swp2 swp3 swp4 There, it is desirable that the presence of swp3 and swp4 under a non-offloaded LAG does not preclude us from doing hardware bridging beteen swp0, swp1 and swp2. The bandwidth of the CPU is often times high enough that software bridging between {swp0,swp1,swp2} and bond0 is not impractical. But this creates an impossible paradox given the current way in which port switchdev marks are assigned. When the driver receives a packet from swp0 (say, due to flooding), it must set skb->offload_fwd_mark to something. - If we set it to 0, then the bridge will forward it towards swp1, swp2 and bond0. But the switch has already forwarded it towards swp1 and swp2 (not to bond0, remember, that isn't offloaded, so as far as the switch is concerned, ports swp3 and swp4 are not looking up the FDB, and the entire bond0 is a destination that is strictly behind the CPU). But we don't want duplicated traffic towards swp1 and swp2, so it's not ok to set skb->offload_fwd_mark = 0. - If we set it to 1, then the bridge will not forward the skb towards the ports with the same switchdev mark, i.e. not to swp1, swp2 and bond0. Towards swp1 and swp2 that's ok, but towards bond0? It should have forwarded the skb there. So the real issue is that bond0 will be assigned the same switchdev mark as {swp0,swp1,swp2}, because the function that assigns switchdev marks to bridge ports, nbp_switchdev_mark_set, recurses through bond0's lower interfaces until it finds something that implements devlink. A solution is to give the bridge explicit hints as to what switchdev mark it should use for each port. Currently, the bridging offload is very 'silent': a driver registers a netdevice notifier, which is put on the netns's notifier chain, and which sniffs around for NETDEV_CHANGEUPPER events where the upper is a bridge, and the lower is an interface it knows about (one registered by this driver, normally). Then, from within that notifier, it does a bunch of stuff behind the bridge's back, without the bridge necessarily knowing that there's somebody offloading that port. It looks like this: ip link set swp0 master br0 | v bridge calls netdev_master_upper_dev_link | v call_netdevice_notifiers | v dsa_slave_netdevice_event | v oh, hey! it's for me! | v .port_bridge_join What we do to solve the conundrum is to be less silent, and emit a notification back. Something like this: ip link set swp0 master br0 | v bridge calls netdev_master_upper_dev_link | v bridge: Aye! I'll use this call_netdevice_notifiers ^ ppid as the | | switchdev mark for v | this port, and zero dsa_slave_netdevice_event | if I got nothing. | | v | oh, hey! it's for me! | | | v | .port_bridge_join | | | +------------------------+ switchdev_bridge_port_offload(swp0) Then stacked interfaces (like bond0 on top of swp3/swp4) would be treated differently in DSA, depending on whether we can or cannot offload them. The offload case: ip link set bond0 master br0 | v bridge calls netdev_master_upper_dev_link | v bridge: Aye! I'll use this call_netdevice_notifiers ^ ppid as the | | switchdev mark for v | bond0. dsa_slave_netdevice_event | Coincidentally (or not), | | bond0 and swp0, swp1, swp2 v | all have the same switchdev hmm, it's not quite for me, | mark now, since the ASIC but my driver has already | is able to forward towards called .port_lag_join | all these ports in hw. for it, because I have | a port with dp->lag_dev == bond0. | | | v | .port_bridge_join | for swp3 and swp4 | | | +------------------------+ switchdev_bridge_port_offload(bond0) And the non-offload case: ip link set bond0 master br0 | v bridge calls netdev_master_upper_dev_link | v bridge waiting: call_netdevice_notifiers ^ huh, switchdev_bridge_port_offload | | wasn't called, okay, I'll use a v | switchdev mark of zero for this one. dsa_slave_netdevice_event : Then packets received on swp0 will | : not be forwarded towards swp1, but v : they will towards bond0. it's not for me, but bond0 is an upper of swp3 and swp4, but their dp->lag_dev is NULL because they couldn't offload it. Basically we can draw the conclusion that the lowers of a bridge port can come and go, so depending on the configuration of lowers for a bridge port, it can dynamically toggle between offloaded and unoffloaded. Therefore, we need an equivalent switchdev_bridge_port_unoffload too. This patch changes the way any switchdev driver interacts with the bridge. From now on, everybody needs to call switchdev_bridge_port_offload, otherwise the bridge will treat the port as non-offloaded and allow software flooding to other ports from the same ASIC. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> -
net: dsa: return -EOPNOTSUPP when driver does not implement .port_lag…
…_join The DSA core has a layered structure, and even though we end up returning 0 (success) to user space when setting a bonding/team upper that can't be offloaded, some parts of the framework actually need to know that we couldn't offload that. For example, if dsa_switch_lag_join returns 0 as it currently does, dsa_port_lag_join has no way to tell a successful offload from a software fallback, and it will call dsa_port_bridge_join afterwards. Then we'll think we're offloading the bridge master of the LAG, when in fact we're not even offloading the LAG. In turn, this will make us set skb->offload_fwd_mark = true, which is incorrect and the bridge doesn't like it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
net: dsa: don't set skb->offload_fwd_mark when not offloading the bridge
DSA has gained the recent ability to deal gracefully with upper interfaces it cannot offload, such as the bridge, bonding or team drivers. When such uppers exist, the ports are still in standalone mode as far as the hardware is concerned. But when we deliver packets to the software bridge in order for that to do the forwarding, there is an unpleasant surprise in that the bridge will refuse to forward them. This is because we unconditionally set skb->offload_fwd_mark = true, meaning that the bridge thinks the frames were already forwarded in hardware by us. Since dp->bridge_dev is populated only when there is hardware offload for it, but not in the software fallback case, let's introduce a new helper that can be called from the tagger data path which sets the skb->offload_fwd_mark accordingly to zero when there is no hardware offload for bridging. This lets the bridge forward packets back to other interfaces of our switch, if needed. Without this change, sending a packet to the CPU for an unoffloaded interface triggers this WARN_ON: void nbp_switchdev_frame_mark(const struct net_bridge_port *p, struct sk_buff *skb) { if (skb->offload_fwd_mark && !WARN_ON_ONCE(!p->offload_fwd_mark)) BR_INPUT_SKB_CB(skb)->offload_fwd_mark = p->offload_fwd_mark; } Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> -
net: ocelot: replay switchdev events when joining bridge
The premise of this change is that the switchdev port attributes and objects offloaded by ocelot might have been missed when we are joining an already existing bridge port, such as a bonding interface. The patch pulls these switchdev attributes and objects from the bridge, on behalf of the 'bridge port' net device which might be either the ocelot switch interface, or the bonding upper interface. The ocelot_net.c belongs strictly to the switchdev ocelot driver, while ocelot.c is part of a library shared with the DSA felix driver. The ocelot_port_bridge_leave function (part of the common library) used to call ocelot_port_vlan_filtering(false), something which is not necessary for DSA, since the framework deals with that already there. So we move this function to ocelot_switchdev_unsync, which is specific to the switchdev driver. The code movement described above makes ocelot_port_bridge_leave no longer return an error code, so we change its type from int to void. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
net: ocelot: call ocelot_netdevice_bridge_join when joining a bridged…
… LAG Similar to the DSA situation, ocelot supports LAG offload but treats this scenario improperly: ip link add br0 type bridge ip link add bond0 type bond ip link set bond0 master br0 ip link set swp0 master bond0 We do the same thing as we do there, which is to simulate a 'bridge join' on 'lag join', if we detect that the bonding upper has a bridge upper. Again, same as DSA, ocelot supports software fallback for LAG, and in that case, we should avoid calling ocelot_netdevice_changeupper. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
net: ocelot: support multiple bridges
The ocelot switches are a bit odd in that they do not have an STP state to put the ports into. Instead, the forwarding configuration is delayed from the typical port_bridge_join into stp_state_set, when the port enters the BR_STATE_FORWARDING state. I can only guess that the implementation of this quirk is the reason that led to the simplification of the driver such that only one bridge could be offloaded at a time. We can simplify the data structures somewhat, and introduce a per-port bridge device pointer and STP state, similar to how the LAG offload works now (there we have a per-port bonding device pointer and TX enabled state). This allows offloading multiple bridges with relative ease, while still keeping in place the quirk to delay the programming of the PGIDs. We actually need this change now because we need to remove the bogus restriction from ocelot_bridge_stp_state_set that ocelot->bridge_mask needs to contain BIT(port), otherwise that function is a no-op. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
net: dsa: replay VLANs installed on port when joining the bridge
Currently this simple setup: ip link add br0 type bridge vlan_filtering 1 ip link add bond0 type bond ip link set bond0 master br0 ip link set swp0 master bond0 will not work because the bridge has created the PVID in br_add_if -> nbp_vlan_init, and it has notified switchdev of the existence of VLAN 1, but that was too early, since swp0 was not yet a lower of bond0, so it had no reason to act upon that notification. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
net: dsa: replay port and local fdb entries when joining the bridge
When a DSA port joins a LAG that already had an FDB entry pointing to it: ip link set bond0 master br0 bridge fdb add dev bond0 00:01:02:03:04:05 master static ip link set swp0 master bond0 the DSA port will have no idea that this FDB entry is there, because it missed the switchdev event emitted at its creation. Ido Schimmel pointed this out during a discussion about challenges with switchdev offloading of stacked interfaces between the physical port and the bridge, and recommended to just catch that condition and deny the CHANGEUPPER event: https://lore.kernel.org/netdev/20210210105949.GB287766@shredder.lan/ But in fact, we might need to deal with the hard thing anyway, which is to replay all FDB addresses relevant to this port, because it isn't just static FDB entries, but also local addresses (ones that are not forwarded but terminated by the bridge). There, we can't just say 'oh yeah, there was an upper already so I'm not joining that'. So, similar to the logic for replaying MDB entries, add a function that must be called by individual switchdev drivers and replays local FDB entries as well as ones pointing towards a bridge port. This time, we use the atomic switchdev notifier block, since that's what FDB entries expect for some reason. Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
net: dsa: replay port and host-joined mdb entries when joining the br…
…idge I have udhcpcd in my system and this is configured to bring interfaces up as soon as they are created. I create a bridge as follows: ip link add br0 type bridge As soon as I create the bridge and udhcpcd brings it up, I have some other crap (avahi) that starts sending some random IPv6 packets to advertise some local services, and from there, the br0 bridge joins the following IPv6 groups: 33:33:ff:6d:c1:9c vid 0 33:33:00:00:00:6a vid 0 33:33:00:00:00:fb vid 0 br_dev_xmit -> br_multicast_rcv -> br_ip6_multicast_add_group -> __br_multicast_add_group -> br_multicast_host_join -> br_mdb_notify This is all fine, but inside br_mdb_notify we have br_mdb_switchdev_host hooked up, and switchdev will attempt to offload the host joined groups to an empty list of ports. Of course nobody offloads them. Then when we add a port to br0: ip link set swp0 master br0 the bridge doesn't replay the host-joined MDB entries from br_add_if, and eventually the host joined addresses expire, and a switchdev notification for deleting it is emitted, but surprise, the original addition was already completely missed. The strategy to address this problem is to replay the MDB entries (both the port ones and the host joined ones) when the new port joins the bridge, similar to what vxlan_fdb_replay does (in that case, its FDB can be populated and only then attached to a bridge that you offload). However there are 2 possibilities: the addresses can be 'pushed' by the bridge into the port, or the port can 'pull' them from the bridge. Considering that in the general case, the new port can be really late to the party, and there may have been many other switchdev ports that already received the initial notification, we would like to avoid delivering duplicate events to them, since they might misbehave. And currently, the bridge calls the entire switchdev notifier chain, whereas for replaying it should just call the notifier block of the new guy. But the bridge doesn't know what is the new guy's notifier block, it just knows where the switchdev notifier chain is. So for simplification, we make this a driver-initiated pull for now, and the notifier block is passed as an argument. To emulate the calling context for mdb objects (deferred and put on the blocking notifier chain), we must iterate under RCU protection through the bridge's mdb entries, queue them, and only call them once we're out of the RCU read-side critical section. Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> -
net: dsa: sync ageing time when joining the bridge
The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME attribute is only emitted from: sysfs/ioctl/netlink -> br_set_ageing_time -> __set_ageing_time therefore not at bridge port creation time, so: (a) drivers had to hardcode the initial value for the address ageing time, because they didn't get any notification (b) that hardcoded value can be out of sync, if the user changes the ageing time before enslaving the port to the bridge Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> -
net: dsa: sync multicast router state when joining the bridge
Make sure that the multicast router setting of the bridge is picked up correctly by DSA when joining, regardless of whether there are sandwiched interfaces or not. The SWITCHDEV_ATTR_ID_BRIDGE_MROUTER port attribute is only emitted from br_mc_router_state_change. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
net: dsa: sync up VLAN filtering state when joining the bridge
This is the same situation as for other switchdev port attributes: if we join an already-created bridge port, such as a bond master interface, then we can miss the initial switchdev notification emitted by the bridge for this port. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
net: dsa: sync up with bridge port's STP state when joining
It may happen that we have the following topology: ip link add br0 type bridge stp_state 1 ip link add bond0 type bond ip link set bond0 master br0 ip link set swp0 master bond0 ip link set swp1 master bond0 STP decides that it should put bond0 into the BLOCKING state, and that's that. The ports that are actively listening for the switchdev port attributes emitted for the bond0 bridge port (because they are offloading it) and have the honor of seeing that switchdev port attribute can react to it, so we can program swp0 and swp1 into the BLOCKING state. But if then we do: ip link set swp2 master bond0 then as far as the bridge is concerned, nothing has changed: it still has one bridge port. But this new bridge port will not see any STP state change notification and will remain FORWARDING, which is how the standalone code leaves it in. Add a function to the bridge which retrieves the current STP state, such that drivers can synchronize to it when they may have missed switchdev events. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
net: dsa: inherit the actual bridge port flags at join time
DSA currently assumes that the bridge port starts off with this constellation of bridge port flags: - learning on - unicast flooding on - multicast flooding on - broadcast flooding on just by virtue of code copy-pasta from the bridge layer (new_nbp). This was a simple enough strategy thus far, because the 'bridge join' moment always coincided with the 'bridge port creation' moment. But with sandwiched interfaces, such as: br0 | bond0 | swp0 it may happen that the user has had time to change the bridge port flags of bond0 before enslaving swp0 to it. In that case, swp0 will falsely assume that the bridge port flags are those determined by new_nbp, when in fact this can happen: ip link add br0 type bridge ip link add bond0 type bond ip link set bond0 master br0 ip link set bond0 type bridge_slave learning off ip link set swp0 master br0 Now swp0 has learning enabled, bond0 has learning disabled. Not nice. Fix this by "dumpster diving" through the actual bridge port flags with br_port_flag_is_set, at bridge join time. We use this opportunity to split dsa_port_change_brport_flags into two distinct functions called dsa_port_inherit_brport_flags and dsa_port_clear_brport_flags, now that the implementation for the two cases is no longer similar. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
net: dsa: pass extack to dsa_port_{bridge,lag}_join
This is a pretty noisy change that was broken out of the larger change for replaying switchdev attributes and objects at bridge join time, which is when these extack objects are actually used. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
net: dsa: call dsa_port_bridge_join when joining a LAG that is alread…
…y in a bridge DSA can properly detect and offload this sequence of operations: ip link add br0 type bridge ip link add bond0 type bond ip link set swp0 master bond0 ip link set bond0 master br0 But not this one: ip link add br0 type bridge ip link add bond0 type bond ip link set bond0 master br0 ip link set swp0 master bond0 Actually the second one is more complicated, due to the elapsed time between the enslavement of bond0 and the offloading of it via swp0, a lot of things could have happened to the bond0 bridge port in terms of switchdev objects (host MDBs, VLANs, altered STP state etc). So this is a bit of a can of worms, and making sure that the DSA port's state is in sync with this already existing bridge port is handled in the next patches. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
-
Merge branch 'octeon-tc-offloads'
Naveen Mamindlapalli says: ==================== Add tc hardware offloads This patch series adds support for tc hardware offloads. Patch #1 adds support for offloading flows that matches IP tos and IP protocol which will be used by tc hw offload support. Also added ethtool n-tuple filter to code to offload the flows matching the above fields. Patch #2 adds tc flower hardware offload support on ingress traffic. Patch #3 adds TC flower offload stats. Patch #4 adds tc TC_MATCHALL egress ratelimiting offload. * tc flower hardware offload in PF driver The driver parses the flow match fields and actions received from the tc subsystem and adds/delete MCAM rules for the same. Each flow contains set of match and action fields. If the action or fields are not supported, the rule cannot be offloaded to hardware. The tc uses same set of MCAM rules allocated for ethtool n-tuple filters. So, at a time only one entity can offload the flows to hardware, they're made mutually exclusive in the driver. Following match and actions are supported. Match: Eth dst_mac, EtherType, 802.1Q {vlan_id,vlan_prio}, vlan EtherType, IP proto {tcp,udp,sctp,icmp,icmp6}, IPv4 tos, IPv4{dst_ip,src_ip}, L4 proto {dst_port|src_port number}. Actions: drop, accept, vlan pop, redirect to another port on the device. The Hardware stats are also supported. Currently only packet counter stats are updated. * tc egress rate limiting support Added TC-MATCHALL classifier offload with police action applied for all egress traffic on the specified interface. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> -
octeontx2-pf: TC_MATCHALL egress ratelimiting offload
Add TC_MATCHALL egress ratelimiting offload support with POLICE action for entire traffic going out of the interface. Eg: To ratelimit egress traffic to 100Mbps $ ethtool -K eth0 hw-tc-offload on $ tc qdisc add dev eth0 clsact $ tc filter add dev eth0 egress matchall skip_sw \ action police rate 100Mbit burst 16Kbit HW supports a max burst size of ~128KB. Only one ratelimiting filter can be installed at a time. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> -
octeontx2-pf: add tc flower stats handler for hw offloads
Add support to get the stats for tc flower flows that are offloaded to hardware. To support this feature, added a new AF mbox handler which returns the MCAM entry stats for a flow that has hardware stat counter enabled. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
octeontx2-pf: Add tc flower hardware offload on ingress traffic
This patch adds support for tc flower hardware offload on ingress traffic. Since the tc-flower filter rules use the same set of MCAM rules as the n-tuple filters, the n-tuple filters and tc flower rules are mutually exclusive. When one of the feature is enabled using ethtool, the other feature is disabled in the driver. By default the driver enables n-tuple filters during initialization. The following flow keys are supported. -> Ethernet: dst_mac -> L2 proto: all protocols -> VLAN (802.1q): vlan_id/vlan_prio -> IPv4: dst_ip/src_ip/ip_proto{tcp|udp|sctp|icmp}/ip_tos -> IPv6: ip_proto{icmpv6} -> L4(tcp/udp/sctp): dst_port/src_port The following flow actions are supported. -> drop -> accept -> redirect -> vlan pop The flow action supports multiple actions when vlan pop is specified as the first action. The redirect action supports redirecting to the PF/VF of same PCI device. Redirecting to other PCI NIX devices is not supported. Example #1: Add a tc filter rule to drop UDP traffic with dest port 80 # ethtool -K eth0 hw-tc-offload on # tc qdisc add dev eth0 ingress # tc filter add dev eth0 protocol ip parent ffff: flower ip_proto \ udp dst_port 80 action drop Example #2: Add a tc filter rule to redirect ingress traffic on eth0 with vlan id 3 to eth6 (ex: eth0 vf0) after stripping the vlan hdr. # ethtool -K eth0 hw-tc-offload on # tc qdisc add dev eth0 ingress # tc filter add dev eth0 parent ffff: protocol 802.1Q flower \ vlan_id 3 vlan_ethtype ipv4 action vlan pop action mirred \ ingress redirect dev eth6 Example #3: List the ingress filter rules # tc -s filter show dev eth4 ingress Example #4: Delete tc flower filter rule with handle 0x1 # tc filter del dev eth0 ingress protocol ip pref 49152 \ handle 1 flower Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> -
octeontx2-pf: Add ip tos and ip proto icmp/icmpv6 flow offload support
Add support for programming the HW MCAM match key with IP tos, IP(v6) proto icmp/icmpv6, allowing flow offload rules to be installed using those fields. The NPC HW extracts layer type, which will be used as a matching criteria for different IP protocols. The ethtool n-tuple filter logic has been updated to parse the IP tos and l4proto for HW offloading. l4proto tcp/udp/sctp/ah/esp/icmp are supported. See example usage below. Ex: Redirect l4proto icmp to vf 0 queue 0 ethtool -U eth0 flow-type ip4 l4proto 1 action vf 0 queue 0 Ex: Redirect flow with ip tos 8 to vf 0 queue 0 ethtool -U eth0 flow-type ip4 tos 8 vf 0 queue 0 Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Commits on Mar 17, 2021
-
net: macb: simplify clk_init with dev_err_probe
On some platforms, e.g., the ZynqMP, devm_clk_get can return -EPROBE_DEFER if the clock controller, which is implemented in firmware, has not been probed yet. As clk_init is only called during probe, use dev_err_probe to simplify the error message and hide it for -EPROBE_DEFER. Signed-off-by: Michael Tretter <m.tretter@pengutronix.de> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Marek Behún says: ==================== Add support for mv88e6393x family of Marvell after 2 months I finally had time to send v17 of Amethyst patches. This series is tested on Marvell CN9130-CRB. Changes since v16: - dropped patches adding 5gbase-r, since they are already merged - rebased onto net-next/master - driver API renamed set_egress_flood() method into 2 methods for ucast/mcast floods, so this is fixed Changes from v15: - put 10000baseKR_Full back into phylink_validate method for Amethyst, it seems I misunderstood the meaning behind things and removed it from v15 - removed erratum 3.7, since the procedure is done anyway in mv88e6390_serdes_pcs_config - renumbered errata 3.6 and 3.8 to 4.6 and 4.8, according to newer version of the errata document - refactored errata code a little and removed duplicate macro definitions (for example MV88E6390_SGMII_CONTROL is already called MV88E6390_SGMII_BMCR) Changes from v14: - added my Signed-off-by tags to Pavana's patches, since I am sending them (as suggested by Andrew) - added documentation to second patch adding 5gbase-r mode (as requested by Russell) - added Reviewed-by tags - applied Vladimir's suggestions: - reduced indentation level in mv88e6xxx_set_egress_port and mv88e6393x_serdes_port_config - removed 10000baseKR_Full from mv88e6393x_phylink_validate - removed PHY_INTERFACE_MODE_10GKR from mv88e6xxx_port_set_cmode Changes from v13: - added patch that wraps .set_egress_port into mv88e6xxx_set_egress_port, so that we do not have to set chip->*gress_dest_port members in every implementation of this method - for the patch that adds Amethyst support: - added more information into commit message - added these methods for mv88e6393x_ops: .port_sync_link .port_setup_message_port .port_max_speed_mode (new implementation needed) .atu_get_hash .atu_set_hash .serdes_pcs_config .serdes_pcs_an_restart .serdes_pcs_link_up - this device can set upstream port per port, so implement .port_set_upstream_port instead of .set_cpu_port - removed USXGMII cmode (not yet supported, working on it) - added debug messages into mv88e6393x_port_set_speed_duplex - added Amethyst errata 4.5 (EEE should be disabled on SERDES ports) - fixed 5gbase-r serdes configuration and interrupt handling - refactored mv88e6393x_serdes_setup_errata - refactored mv88e6393x_port_policy_write - added patch implementing .port_set_policy for Amethyst ==================== Signed-off-by: David S. Miller <davem@davemloft.net> -
net: dsa: mv88e6xxx: implement .port_set_policy for Amethyst
The 16-bit Port Policy CTL register from older chips is on 6393x changed to Port Policy MGMT CTL, which can access more data, but indirectly and via 8-bit registers. The original 16-bit value is divided into first two 8-bit register in the Port Policy MGMT CTL. We can therefore use the previous code to compute the mask and shift, and then - if 0 <= shift < 8, we access register 0 in Port Policy MGMT CTL - if 8 <= shift < 16, we access register 1 in Port Policy MGMT CTL There are in fact other possible policy settings for Amethyst which could be added here, but this can be done in the future. Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Pavana Sharma <pavana.sharma@digi.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
net: dsa: mv88e6xxx: add support for mv88e6393x family
The Marvell 88E6393X device is a single-chip integration of a 11-port Ethernet switch with eight integrated Gigabit Ethernet (GbE) transceivers and three 10-Gigabit interfaces. This patch adds functionalities specific to mv88e6393x family (88E6393X, 88E6193X and 88E6191X). The main differences between previous devices and this one are: - port 0 can be a SERDES port - all SERDESes are one-lane, eg. no XAUI nor RXAUI - on the other hand the SERDESes can do USXGMII, 10GBASER and 5GBASER (on 6191X only one SERDES is capable of more than 1g; USXGMII is not yet supported with this change) - Port Policy CTL register is changed to Port Policy MGMT CTL register, via which several more registers can be accessed indirectly - egress monitor port is configured differently - ingress monitor/CPU/mirror ports are configured differently and can be configured per port (ie. each port can have different ingress monitor port, for example) - port speed AltBit works differently than previously - PHY registers can be also accessed via MDIO address 0x18 and 0x19 (on previous devices they could be accessed only via Global 2 offsets 0x18 and 0x19, which means two indirections; this feature is not yet leveraged with thiis commit) Co-developed-by: Ashkan Boldaji <ashkan.boldaji@digi.com> Signed-off-by: Ashkan Boldaji <ashkan.boldaji@digi.com> Signed-off-by: Pavana Sharma <pavana.sharma@digi.com> Co-developed-by: Marek Behún <kabel@kernel.org> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
net: dsa: mv88e6xxx: wrap .set_egress_port method
There are two implementations of the .set_egress_port method, and both of them, if successful, set chip->*gress_dest_port variable. To avoid code repetition, wrap this method into mv88e6xxx_set_egress_port. Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Pavana Sharma <pavana.sharma@digi.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
net: dsa: mv88e6xxx: change serdes lane parameter type from u8 type t…
…o int Returning 0 is no more an error case with MV88E6393 family which has serdes lane numbers 0, 9 or 10. So with this change .serdes_get_lane will return lane number or -errno (-ENODEV or -EOPNOTSUPP). Signed-off-by: Pavana Sharma <pavana.sharma@digi.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
ethernet/microchip:remove unneeded variable: "ret"
remove unneeded variable: "ret". Signed-off-by: dingsenjie <dingsenjie@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
ethernet/broadcom:remove unneeded variable: "ret"
remove unneeded variable: "ret". Signed-off-by: dingsenjie <dingsenjie@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
net: stmmac: add per-queue TX & RX coalesce ethtool support
Extending the driver to support per-queue RX and TX coalesce settings in order to support below commands: To show per-queue coalesce setting:- $ ethtool --per-queue <DEVNAME> queue_mask <MASK> --show-coalesce To set per-queue coalesce setting:- $ ethtool --per-queue <DEVNAME> queue_mask <MASK> --coalesce \ [rx-usecs N] [rx-frames M] [tx-usecs P] [tx-frames Q] Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> -
Vladimir Oltean says: ==================== DSA/switchdev documentation fixups These are some small fixups after the recently merged documentation update. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Documentation: networking: dsa: mention that the master is brought up…
… automatically Since commit 9d5ef19 ("net: dsa: automatically bring up DSA master when opening user port"), DSA manages the administrative status of the host port automatically. Update the configuration steps to reflect this. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Documentation: networking: dsa: demote subsections to simple emphasiz…
…ed words "make htmldocs" complains: configuration.rst:165: WARNING: duplicate label networking/dsa/configuration:single port, other instance in (...) configuration.rst:212: WARNING: duplicate label networking/dsa/configuration:bridge, other instance in (...) configuration.rst:252: WARNING: duplicate label networking/dsa/configuration:gateway, other instance in (...) And for good reason, because the "single port", "bridge" and "gateway" use cases are replicated twice, once for normal taggers and twice for DSA_TAG_PROTO_NONE. So when trying to reference these sections via a hyperlink such as: https://www.kernel.org/doc/html/latest/networking/dsa/configuration.html#single-port it will always reference the first occurrence, and never the second one. This change makes the "single port", "bridge" and "gateway" configuration examples consistent with the formatting used in the "Configuration showcases" subsection. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Documentation: networking: dsa: add missing new line in devlink section
"make htmldocs" produces these warnings: Documentation/networking/dsa/dsa.rst:468: WARNING: Unexpected indentation. Documentation/networking/dsa/dsa.rst:477: WARNING: Block quote ends without a blank line; unexpected unindent. Fixes: 8411abb ("Documentation: networking: dsa: mention integration with devlink") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Documentation: networking: switchdev: add missing "and" word
Even though this is clear from the context, it is nice to actually be grammatically correct. Fixes: 0f22ad4 ("Documentation: networking: switchdev: clarify device driver behavior") Reported-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>