Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KSZ9477 HSR driver for 4.9 Linux #98

Open
lmajewski opened this issue Jul 28, 2023 · 22 comments
Open

KSZ9477 HSR driver for 4.9 Linux #98

lmajewski opened this issue Jul 28, 2023 · 22 comments

Comments

@lmajewski
Copy link

I would like to ask for some clarification regarding HSR support on EVB-KSZ9477 board (Linux v4.9).

  1. In the "KSZ9477 Implementation Algorithm and Programming" at AN3474
    Especially in "TX PACKET DUPLICATION FROM HOST TO SWITCH" paragraph - there is clearly stated that the TX HSR frames duplication can be offloaded to KSZ9477 switch.
    To do that the "HSR Port Map" register must be programmed and also the tag for KSZ9477 must have bits set for both HSR ports.

Please clarify mine questions:

  • From mine observation it looks like the current 4.9 HSR driver is not supporting this offloading feature - there are two frames sent with
    the same sequence number and different KSZ9477 engress port set in tag
  • The documentation is not clarifying how things work with this offloading - to be more specific - HSR frames shall only differ with path (i.e. send port number) and FCS. I would expect, that I will send single frame with sequence number and KSZ9477 tag with assigned two HSR ports for sending the frame. Hence the question - is KSZ9477 internally adjusting the "path" field in HSR tag?

Thanks in advance for help.

@triha2work
Copy link
Collaborator

There is no automatic HSR frame generation. The path field is always A. Hardware duplicate discard only works on sequence id and source MAC address. So if the the source MAC address is different as if sent from 2 independent network devices then the feature will not work. Also using just even number nodes like 2 will not trigger the effect as the two frames likely arrive at the same time.

@lmajewski
Copy link
Author

@triha2work - Thanks for your reply.

Yes - it also looks to me that there is no automatic HSR field generation in the KSZ9477 - this must be done in SoC SW.

My question is - how the frame duplication works? In the 4.9 driver each HSR frame is sent from SoC with tail tag assigned to separate port. In other words - the frame duplication offloading in KSZ9477 is not utilized. Am I correct?

If I would like to use this offloading (i.e. frame duplication) - then I would need to enable ports in "HSR Port Map" register and modify tail tag to has bits set for those HSR aware ports? Is this enough?

I also would like to ask about supported HSR version. From mine observation - the "Supervision Frames" with EtherType=0x88FB (PRP) are send by the 4.9 Linux driver. However, when I dump skb data in Linux 4.9 HSR driver just before sending it by MAC there is no frames with EtherType=0x892F. Why such frames are not observed (the HSR seems to work as I can see ongoing transmission with unplugged one cable)?

To sum up - could you clarify which HSR version (v0 or v1) is supported by the KSZ9477 IC and the Linux 4.9 driver?

Last but not least - for current mainline (v6.5-rc3) I can observe, that when I do use pure SW based HSR (CONFIG_HSR=y) with DSA driver - the feature is not working and I notice a lot of "error" frames with "Link Down/Up" observed all the time.

Do you have any idea why this happens?

Thanks in advance for your help.

@triha2work
Copy link
Collaborator

The frame duplication is just a normal feature in KSZ switches where the host can send a frame to any port. In this case the same frame is sent to ports 1 and 2 as defined as HSR ports. There is no specific switch hardware register to enable this feature except the tail tagging, which is always required to implement the DSA driver.
The HSR driver code used by the non-DSA driver is very old. It came from 3.18. At that time the V1 implementation was not formalized. I am able to use the HSR code in 6.1 to replace the old one, but the performance is worse than before so I am hesitated to do that.
The standard HSR implementation sits on top of network device. A new HSR device is created using 2 network devices like lan1 and lan2. So I do not know why there is issue. I can try that out.
The HSR implementation in the non-DSA switch driver was changed from setting on the very top of network devices to the very bottom of the network device. When a normal frame is sent to the MAC driver it is passed to the switch driver to transform it into a HSR frame. The port destination is always updated to send to both HSR ports. When the HSR frame is received from the MAC it is passed to the switch driver to check for duplicate and remove the HSR tag.
It was implemented that way because of concern of using PTP in HSR environment.

@lmajewski
Copy link
Author

The frame duplication is just a normal feature in KSZ switches where the host can send a frame to any port. In this case the same frame is sent to ports 1 and 2 as defined as HSR ports.

Ok, this is clear - the KSZ9477 is copying 1 to 1 the HSR header when it duplicates it.

There is no specific switch hardware register to enable this feature except the tail tagging, which is always required to implement the DSA driver.

I see - the duplication is enabled when both ports (previously enabled in "HSR PORT MAP" register) are marked in KSZ9477 TAG

The HSR driver code used by the non-DSA driver is very old. It came from 3.18. At that time the V1 implementation was not formalized.

So the 4.9 (non DSA driver) supports only HSR v0 ?

I am able to use the HSR code in 6.1 to replace the old one, but the performance is worse than before so I am hesitated to do that.

Which code is that? Is the 4.9 ported driver? Or is it the only-SW-based HSR implementation already present in the Linux kernel (enabled with CONFIG_HSR)?

The standard HSR implementation sits on top of network device. A new HSR device is created using 2 network devices like lan1 and lan2.

I'm using with v6.5-rc3:
ifconfig lan1 up;ifconfig lan2 up
ip link add name hsr0 type hsr slave1 lan1 slave2 lan2 supervision 45
ethtool -K hsr0 hsr-dup-offload on --> Not yet implemented (work in progress)
ifconfig hsr0 192.168.0.1 up

So I do not know why there is issue. I can try that out.

I would expect to have the in-SW HSR working with KSZ9477 (on EVB-KSZ9477) with only using CONFIG_HSR and CONFIG_DSA

The HSR implementation in the non-DSA switch driver was changed from setting on the very top of network devices to the very bottom of the network device. When a normal frame is sent to the MAC driver it is passed to the switch driver to transform it into a HSR frame.

Yes - I've noticed it. The 4.9 Linux driver is modifying frames to insert by itself the HSR header. Moreover, it is not using the DSA framework.

The port destination is always updated to send to both HSR ports.

Are you sure? Maybe, I've misunderstood the Linux 4.9 driver code, but it looks like the HSR driver is sending two separate frames (with only one egress port set in tag). It looks like this driver is NOT relying on in-KSZ9477 frame duplication feature.

When the HSR frame is received from the MAC it is passed to the switch driver to check for duplicate and remove the HSR tag.

This must be supported anyway, as there is a chance to have two frames with the same sequence number passed to MAC, as in some circumstances the KSZ9477 will not drop the duplicate frame.

It was implemented that way because of concern of using PTP in HSR environment.

The PTP is not utilized in my use case. I do only need HSR support in the newest mainline.

@triha2work
Copy link
Collaborator

The non-DSA switch driver uses its built-in HSR driver, not the one from the kernel. That was updated to use v1. If you want to use the kernel HSR driver then you need to set up the non-DSA driver to create independent network devices, just like the DSA driver.
As explained before the kernel HSR support uses a virtual device on top of 2 network devices. The non-DSA driver HSR support is located at the bottom and uses 1 regular eth0 network device. The HSR support is set up in U-Boot with these parameters: "setenv eth1_ports 3; setenv eth1_proto hsr."

@lmajewski
Copy link
Author

The non-DSA switch driver uses its built-in HSR driver, not the one from the kernel. That was updated to use v1

Ok. That is for Linux 4.9 HSR driver from Microchip?

If you want to use the kernel HSR driver then you need to set up the non-DSA driver to create independent network devices, just like the DSA driver.

Is this sentence regarding 4.9 Microchip driver? Or is it regarding newest Linux 6.5-rc3? I guess that the former? Am I correct?

The non-DSA driver HSR support is located at the bottom and uses 1 regular eth0 network device.

Yes, I've noticed that on 4.9 driver only eth0 is present.

The HSR support is set up in U-Boot with these parameters: "setenv eth1_ports 3; setenv eth1_proto hsr."

Yes, correct. I'm able to setup KSZ9477 EVB board to work in this mode with 4.9 Linux kernel (from Microchip).

@triha2work
Copy link
Collaborator

The drivers in the GitHub repository are non-DSA drivers. There are DSA drivers in the 4.9 kernel but they are for testing only. The official DSA drivers should come from the mainline kernels, like 5.4 and up.
I just tried the kernel HSR driver with non-DSA switch driver in kernel 5.10 and did not see any problem. I can try the DSA driver later, but that should not cause any problem as the DSA operation is much simpler.
The non-DSA switch driver always run HSR in v1, as Wireshark only displays HSR frame in that version correctly, so the switch driver never tried to operate in v0. I noted the HSR driver gets the protocol version from the "ip link" command but the one I used does not support that parameter, so I just forced the version to v1 from default v0 in the HSR driver.
The performance of kernel HSR driver is very bad, especially the newer version like in 5.10. The TCP transmit performance is only about 65 Mbps while the non-DSA HSR driver gets about 145 Mbps.
There are some bugs in 6.1 version, but they were probably fixed in 6.5.

@lmajewski
Copy link
Author

The drivers in the GitHub repository are non-DSA drivers. There are DSA drivers in the 4.9 kernel but they are for testing only. The official DSA drivers should come from the mainline kernels, like 5.4 and up.

The kernel DSA driver (without HSR) for KSZ9477-EVB works as expected without issues on 6.5-rcX.

I just tried the kernel HSR driver with non-DSA switch driver in kernel 5.10 and did not see any problem. I can try the DSA driver later, but that should not cause any problem as the DSA operation is much simpler.

The issue here is the permanent link status change (Link UP/DOWN) in the logs for HSR enabled interfaces. Only a fraction of HSR frames are delivered. It looks like the network subsystem seems to not be working reliable.

The non-DSA switch driver always run HSR in v1, as Wireshark only displays HSR frame in that version correctly, so the switch driver never tried to operate in v0.

Yes. I can confirm that.

I noted the HSR driver gets the protocol version from the "ip link" command but the one I used does not support that parameter,

Yes. The 'ip" command is outdated in KSZ9477-EVB BSP setup (I'm using the newest one built from buildbot).

so I just forced the version to v1 from default v0 in the HSR driver.

I also did the same.

The performance of kernel HSR driver is very bad, especially the newer version like in 5.10.

Could you share more details about your's setup to test the performance of mainline kernel's HSR and DSA drivers (enabled together)? With 6.5-rcX the SW based HSR driver (with DSA) is not usable (as mentioned above).

The TCP transmit performance is only about 65 Mbps while the non-DSA HSR driver gets about 145 Mbps. There are some bugs in 6.1 version,

Could you be more specific here ?

but they were probably fixed in 6.5.

Can you point me to some bug reports? Are those for the non-DSA HSR driver only?

  • Just to explain mine approach:
  1. I expect that the KSZ9477-EVB shall work with in kernel's DSA and HSR driver (even with very poor performance).
  2. The drivers/net/dsa/microchip/ksz* shall have the ksz9477_hsr_join added to it to enable in KSZ9477 IC HW features (like frame duplication and filtering)

@triha2work
Copy link
Collaborator

Are you using the KSZ9477 evaluation board with SAMA5D3? I do not see the link issues you mentioned. And they should not be related to the HSR driver. Are you connecting the switch to some other switch and somehow the PHY settings are not compatible and there are link issues?
The HSR driver bug is still there in linux 6.5.0-rc6. It cannot decode the supervisor frame correctly and so every time a unicast frame is sent an error message is displayed from hsr_addr_subst_dest(). I do not know if the code is supposed to handle only v0 supervision frame. But the original code in 5.4 is correct. Maybe during refactoring later the developer just used the wrong structure. After the fix I was able to have nuttcp transmit throughput of 57 Mbps and receive throughput of 130 Mbps. The transmit throughput can be improved by tweaking the Cadence MAC driver, as it implemented a workaround that drastically reduces the TCP transmit performance.
The KSZ9477 HSR hardware duplicate drop function can be enabled all the time. If no HSR frame is used then that function is not activated. I do not know how the DSA driver handles the frame duplication though, as the tail tag processing does not know anything about the hardware configuration. The new PTP implementation may add a flag to signal HSR use. The hardware duplicate drop function only works for the same source MAC address. If lan1 and lan2 have different MAC address then the hardware cannot do anything.
Another hardware feature that needs to be enabled is self address filtering so that a broadcast frame will not be put in a loop. Before the DSA API has a set_address function so the DSA driver can program that in the switch, but that function was removed. There is a new WoL implementation which uses the device tree to specify a MAC address so that address can be programmed to the switch. That allows enabling the self address filtering function. The original set_address API is better, but the DSA maintainers may not want to add that function back.

@lmajewski
Copy link
Author

lmajewski commented Aug 23, 2023

Are you using the KSZ9477 evaluation board with SAMA5D3? I do not see the link issues you mentioned. And they should not be related to the HSR driver. Are you connecting the switch to some other switch and somehow the PHY settings are not compatible and there are link issues?

I'm using the KSZ9477-EVB board with SAMA5D3 (KSZ9477-EVB)

I connect the cables to Port1 and Port2 between two KSZ9477-EVB boards. The driver with HSR from 4.9 (and proper configuration passed when kernel boots: console=ttyS0,115200 earlyprintk root=/dev/mmcblk0p2 rw rootwait spi-ksz9897.multi_dev=1 spi-ksz9897.eth1_ports=3 spi-ksz9897.eth1_vlan=0x7e spi-ksz9897.eth1_proto=hsr spi-ksz9897.eth2_vlan=0x7f spi-ksz9897.eth2_proto=redbox) works as expected.

I've found that the HSR frames have tag with 0x3 - so yes, KSZ9477 duplicates frames in HW. In struct ksz_hsr_info the info->member is set to 3.

When I use the same HW setup with 6.5-rcX kernels on both KSZ9477-EVB boards the very frequent link up/down are shown - hence I guess that this is issue with the kernel.

On the other hand - when 4.9 (with non-DSA HSR driver) is used on one KSZ9477-EVB board and 6.5-rc7 on the other - there are no issues with HSR.

The HSR driver bug is still there in linux 6.5.0-rc6. It cannot decode the supervisor frame correctly and so every time a unicast frame is sent an error message is displayed from hsr_addr_subst_dest().

When I connect two KSZ9477-EVB with running v6.5-rc7 the link up/down issue is present also with:
ksz-switch spi1.0 lan1: hsr_addr_subst_dest: Unknown node

I do not know if the code is supposed to handle only v0 supervision frame. But the original code in 5.4 is correct.

I will test is 5.4 is working correctly with the KSZ9477-EVB board.

After the fix I was able to have nuttcp transmit throughput of 57 Mbps and receive throughput of 130 Mbps. The transmit throughput can be improved by tweaking the Cadence MAC driver, as it implemented a workaround that drastically reduces the TCP transmit performance.

Do you plan to publish those fixes ? (to kernel mailing list or some web available Microchip repository)

The KSZ9477 HSR hardware duplicate drop function can be enabled all the time. If no HSR frame is used then that function is not activated.

It is set by default in Global HSR AME Control Register 0 (0x0644) - bit 7 (HSR_DUPLICATE_DISCARD_ENB).

The hardware duplicate drop function only works for the same source MAC address. If lan1 and lan2 have different MAC address then the hardware cannot do anything.

Thanks for clarification - then I must set the same MAC address for lan1 and lan2.

Another hardware feature that needs to be enabled is self address filtering so that a broadcast frame will not be put in a loop.

I will look on this more closely - thanks for pointing this out.

@triha2work
Copy link
Collaborator

I finally saw the link problem you mentioned. It is so bad the board is unusable. You should use 6.1 until the problem is fixed. It is caused by EEE and a special DSP PHY register setting. The latest kernel code enables EEE after the main PHY driver thinks EEE support is there. The workaround is to clear the EEE register first before the main PHY driver reads it. An alternative is to the use the non-DSA driver in another board which has EEE disabled so that the link problem does not happen.
For the HSR driver the wrong code works for v0. Maybe that is why the problem was not caught before. The updated code works for both.

@lmajewski
Copy link
Author

lmajewski commented Aug 24, 2023

I finally saw the link problem you mentioned. It is so bad the board is unusable. You should use 6.1 until the problem is fixed.

I would prefer to use 6.5-rc7 and then manually disable EEE via ethtool:
ethtool --show-eee lan1
ethtool --show-eee lan2
ethtool --set-eee lan1 eee off
ethtool --set-eee lan2 eee off

This fixes the problem with Link Stability (without the need to modify driver).

Exactly this issue is mentioned in errata Module 4.

I've prepared proper patches for mainline kernel.

@lmajewski
Copy link
Author

lmajewski commented Aug 28, 2023

Another hardware feature that needs to be enabled is self address filtering so that a broadcast frame will not be put in a loop.

With this one I'm a bit puzzled. The AN3474 note in PREVENTING PACKET LOOP IN THE RING BY SELF-ADDRESS FILTERING paragraph mentions about Port Control 2 register's bit 3.

Unfortunately, in the data sheet I cannot find it. It looks like it is not documented.

In the sources I've found:
./drivers/net/dsa/microchip/ksz9477_reg.h:1233:#define REG_PORT_MAC_CTRL_2 0x0402
but it is not referenced in the 4.9 kernel (and newest main line) at all.
Moreover, in the 'data sheet' only Port MAC Control 1 Register is described in point 5.2.5.2.

Is this the correct register to set bit 3 in it to enable per port self-filtering ?
If yes - could you point me where I can find description of this register?

I'm asking as (according to AN3474) to have this feature enabled, I do need to enable the self MAC filtering globally (this I've found) AND per HSR port (this I'm looking for).

@triha2work
Copy link
Collaborator

Maybe you got tired after reading all the document. The AN3474 note states correctly the names that are in the datasheet. Switch Lookup Engine Control 1 Register is 0x0311. Port Control 2 Register is 0xNB00. The datasheet has links in the descriptions of these registers so you can jump between them.
This feature is generic switch feature and is not a part of HSR design.

@lmajewski
Copy link
Author

Ach..... Now I do see.

The Port Control 2 Register is defined as REG_PORT_LUE_CTRL in the 4.9 and newest main line.

Thanks for pointing this out.

@lmajewski
Copy link
Author

The patches now has been posted to ML (@triha2work you were added to receive them :-) )

The performance measurements:

  • HW - two KSZ9477-EVB boards (port1 and port2 used as HSR) with the same kernel v6.5-rc7
  • SW used:
    nuttcp -S --nofork
    nuttcp -vv -T 60 -r 192.168.0.2
    nuttcp -vv -T 60 -t 192.168.0.2
  • Code: KSZ9477-EVB Linux repository
  • Tested HSR v0 and v1
  • Results:
    With KSZ9477 offloading support added: RX: 100 Mbps TX: 98 Mbps
    With no offloading (last 4 patches removed) RX: 63 Mbps TX: 63 Mbps

@lmajewski
Copy link
Author

@triha2work - I'm wondering why KSZ9477 doesn't support in HW in any way forwarding HSR frames between switch ports? I've read somewhere that this may contribute to some slow down when many HSR aware devices work in a "ring".

Do you have any thoughts about this issue?

@triha2work
Copy link
Collaborator

Are you referring to the DSA driver behavior? The hardware and non-DSA driver do work correctly in forwarding HSR frames. The behavior of DSA driver is at initialization all ports act independently and requires a software bridge device to forward frames. After the ports are joined to the bridge they are linked to forward frames inside the switch. There is an indication to the bridge device to stop forwarding the frames itself.
I think the HSR join may need further implementation to link the ports just like joining a bridge. The HSR driver does have an indication to stop this forwarding itself.
There is a limitation of the hardware duplicate drop feature. It only works for same source MAC address and sequence id. When the number of nodes in the HSR network is even it will fail to activate for the node furthest from the node sending the frames. This happens when there are 2 nodes. For this reason it is better to always add a third switch, which can be a dump switch, to activate this duplicate drop feature when doing HSR demo.

@lmajewski
Copy link
Author

Are you referring to the DSA driver behavior? The hardware and non-DSA driver do work correctly in forwarding HSR frames. The behavior of DSA driver is at initialization all ports act independently and requires a software bridge device to forward frames. After the ports are joined to the bridge they are linked to forward frames inside the switch. There is an indication to the bridge device to stop forwarding the frames itself.

Yes, the frame forwarding can be setup in HW and then when bridging is used the DSA driver would offload it.

I think the HSR join may need further implementation to link the ports just like joining a bridge. The HSR driver does have an indication to stop this forwarding itself.

This is really interesting, as HSR driver has NETIF_F_HW_HSR_FWD flag, which can be enabled in the HSR core. I just need to dig where in 4.9 the implicit "bridging" is enabled for HSR ports.

There is a limitation of the hardware duplicate drop feature. It only works for same source MAC address and sequence id. When the number of nodes in the HSR network is even it will fail to activate for the node furthest from the node sending the frames. This happens when there are 2 nodes.

Why the further node? The App Note says that it shall work for more than 2 nodes. According to the note - the problem is when two packets arrive together at almost the same time.

Or is there any other problem?

For this reason it is better to always add a third switch, which can be a dump switch, to activate this duplicate drop feature when doing HSR demo.

Yes, the HSR works better with more than 2 HSR nodes.

@triha2work
Copy link
Collaborator

The switch forwarding is simply manipulating the port membership bits. The default value is 0x7f for 7-port switch. It is set to 0x21 for port 1; 0x22, port 2; 0x24, port 3; and so on at the beginning. When port 1 and 2 are joined their membership values becomes 0x23. When all 6 ports are joined the value goes back to 0x7f. In HSR the HSR port 1 and 2 have this membership value: 0x23.

@lmajewski
Copy link
Author

The switch forwarding is simply manipulating the port membership bits. The default value is 0x7f for 7-port switch. It is set to 0x21 for port 1; 0x22, port 2; 0x24, port 3; and so on at the beginning. When port 1 and 2 are joined their membership values becomes 0x23. When all 6 ports are joined the value goes back to 0x7f. In HSR the HSR port 1 and 2 have this membership value: 0x23.

When I call ksz9477_hsr_join then the REG_PORT_VLAN_MEMBERSHIP__4 value for both HSR ports is 0x20.
(the REG_SW_QM_CTRL__4 has correct value of 0x82).

I guess that both HSR ports (1,2) shall have the value of REG_PORT_VLAN_MEMBERSHIP__4 equal to 0x23 as you suggested (i.e. frames are passed between CPU port and HSR ports in KSZ9477 internal matrix).

Then the NETIF_F_HW_HSR_FWD net device capability flag could be set as frames from HSR port 1 are passed to CPU port and HSR port 2. In the current v6.5-rc7 Linux this flag disables promiscuity mode for HSR ports (1,2).

I will have more info regarding the performance impact (about frames passing) when I form the HSR mesh (with 3 devices).

@lmajewski
Copy link
Author

@triha2work - If I may ask about the RedBox setup for the HSR driver [*].

Why in the drivers/net/ethernet/micrel/ksz_hsr.c in prep_hsr() function the vlan (in KSZ9477 switch IC) is setup for RedBox access?

Please also correct me, but it looks like in your driver there is some kind of "bridge" HSR driver (L2), which decides (based on configuration) if the HSR frame shall be forwarded to port marked as RedBox one (e.g. sw_ins_hsr() at ksz_sw_9897.c).

Have you tried to upstream the HSR RedBox functionality to Linux mainline?

I've poked around mainline Linux sources and I did not spot any attempt of such work done yet.
The current status is to have hsr0 (for HSR ring) and lan3 (for "normal" ETH). Communication between them can be performed on L3 level without issues (with proper routing).

However, "bridging" those two interfaces on L2 level is not working out of the box.

Note:
[*] - "EVB KSZ9477 Software Setup Guide" (point 7.4)
setenv eth1_vlan 0x7e
setenv eth2_proto redbox
setenv eth2_vlan 0x7f

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants