Add Support to use KSZ9893R in Iris 2.0 Hardware #3

Open
bvacaliuc opened this issue Dec 5, 2018 · 68 comments

Comments

@bvacaliuc
Contributor

Add SPI-mode support for the KSZ9893R, using the existing KSZ9897 kernel driver in ksz_spi.c as a reference.

@bvacaliuc
Contributor Author

Carlos Aguero wrote:

Regarding this driver, there is an effort on GitHub to provide a driver for Linux 4.9 (Microchip-Ethernet/EVB-KSZ9477#2). A release was expected by the end of November 2018, but it seems to be delayed. We have tried to make contact and offer support on this task, but have had no response.

If there is no response and we need to do the port to Linux 4.9 ourselves, we need to start planning this task, since it is not trivial.

Yes, we are happy for you to develop/collaborate on a driver for the KSZ9893R that can be pushed upstream.

@edisonf

edisonf commented Jan 11, 2019

The driver has been added to this branch:

https://github.com/uvdl/yocto-ornl/tree/feature/ksz9893-support

The driver is currently building but it has not been tested.

Because the driver was written for an older kernel, some features caused conflicts and were disabled:

KSZ_PTP
KSZ_DSA
KSZ_MRP
KSZ_HSR

If any of these are needed, the build errors will have to be fixed.

@edisonf

edisonf commented Jan 11, 2019

Also, some days ago Microchip started pushing the drivers for kernel 4.9. They can be found here:

https://github.com/Microchip-Ethernet/EVB-KSZ9477/tree/master/KSZ/linux-drivers/ksz9897/linux-4.9

@bvacaliuc
Contributor Author

First impression is hopeful:

[    1.294718] ksz9897 spi0.0: chip id 0x00989360
[    1.300178] libphy: Switch MII bus: probed
[    1.340347] spi_imx 2008000.ecspi: probed
[    1.361589] 2188000.ethernet supply phy not found, using dummy regulator
[    1.383527] pps pps0: new PPS source ptp0
[    1.399194] libphy: fec_enet_mii_bus: probed
[    1.402804] fec 2188000.ethernet eth0: registered PHC device 0

but later, this:

[    7.721924] fec 2188000.ethernet eth0: no PHY, assuming direct connection to switch
[    7.728374] libphy: PHY fixed-0:00 not found
[    7.731350] fec 2188000.ethernet eth0: could not attach to PHY
[    8.262363] fec 2188000.ethernet eth0: no PHY, assuming direct connection to switch
[    8.268880] libphy: PHY fixed-0:00 not found
[    8.271886] fec 2188000.ethernet eth0: could not attach to PHY

I can see led D23 lit, so that means +3V3_USB is on and possibly stayed on.

@bvacaliuc
Contributor Author

I am encouraged by the information presented in linux-drivers/ksz9897/linux-4.9/doc.

However, in order to make use of these programs, they need to be compiled and added to the build. So we are going to need to add setup_sw.c and regs_bin at the least.

@edisonf

edisonf commented Jan 14, 2019

I'll create a yocto build from Rocko which uses kernel 4.9, then we can add these latest drivers and the utilities.

@edisonf

edisonf commented Jan 15, 2019

I already added the latest drivers in branch:
https://github.com/uvdl/yocto-ornl/tree/feature/yocto_sumo

@bvacaliuc
Contributor Author

Very encouraging:

[    1.759739] ksz9897 spi0.0: chip id 0x00989360
[    1.765041] libphy: Switch MII bus: probed
[    1.808655] spi_imx 2008000.ecspi: probed
[    1.812303] libphy: Fixed MDIO Bus: probed
[    1.833075] 2188000.ethernet supply phy not found, using dummy regulator
[    1.841057] pps pps0: new PPS source ptp0
[    1.855578] libphy: fec_enet_mii_bus: probed
[    1.859193] fec 2188000.ethernet eth0: registered PHC device 0
...
[    7.398921] Generic PHY 2188000.ethernet-1:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=2188000.ethernet-1:00, irq=-1)
[    7.409602] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready

And upon login, got this:

root@var-som-mx6-ornl:~# ethtool eth0
Settings for eth0:
        Supported ports: [ TP MII ]
        Supported link modes:   Not reported
        Supported pause frame use: Symmetric
        Supports auto-negotiation: No
        Advertised link modes:  Not reported
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Speed: 10Mb/s
        Duplex: Half
        Port: MII
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Link detected: no
root@var-som-mx6-ornl:~# ethtool -i eth0
driver: fec
version: Revision: 1.0
firmware-version:
expansion-rom-version:
bus-info: eth0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no
root@var-som-mx6-ornl:~# ethtool -d eth0
Offset          Values
------          ------
0x0000:         00 00 00 00 00 00 00 00 00 00 80 00 00 00 00 00
0x0010:         00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00
0x0020:         00 00 00 00 12 01 00 f0 00 00 00 00 00 00 00 00
0x0030:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0040:         00 00 12 60 1a 00 00 00 00 00 00 00 00 00 00 00
0x0050:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0060:         00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 00
0x0070:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0080:         00 00 00 00 46 00 c0 47 00 00 00 00 00 00 00 00
0x0090:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00a0:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00b0:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00c0:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00d0:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00e0:         00 00 00 00 1a 7a dc f8 08 88 b9 2e 00 00 01 00
0x00f0:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0100:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0110:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0120:         00 00 40 00 00 00 80 00 00 00 00 00 00 00 00 00
0x0130:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0140:         00 00 00 00 00 01 00 00 00 00 00 00 00 06 00 00
0x0150:         00 05 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0160:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0170:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0180:         00 80 04 3c 00 c0 04 3c c0 07 00 00 00 00 00 00
0x0190:         00 00 00 00 00 00 00 00 04 00 00 00 04 00 00 00
0x01a0:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x01b0:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x01c0:         00 00 00 00 86 00 00 00 00 00 00 00 00 00 00 00
0x01d0:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x01e0:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x01f0:         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0200:         <all 0 from here to 0x3ff0>

@bvacaliuc
Contributor Author

However, while the interface can be given an IP address and so on, no traffic flows, and it does not correctly identify the RGMII link as 100Mb/s (it is stuck at 10Mb/s and does not function). I get this:

root@var-som-mx6-ornl:~# ethtool -s eth0 speed 1000
Cannot advertise speed 1000
root@var-som-mx6-ornl:~# ethtool -s eth0 xcvr external
Cannot set new settings: Success
  not setting transceiver
root@var-som-mx6-ornl:~# mii-tool -v eth0
  No MII transceiver present!.
root@var-som-mx6-ornl:~# mii-tool -R eth0
resetting the transceiver...
root@var-som-mx6-ornl:~# mii-tool -v eth0
  No MII transceiver present!.

I do not believe that the RGMII interface is functioning. I was reading about this, and there are some references that may apply:

@bvacaliuc
Contributor Author

On today's telcon, Edison mentioned that he thought it might be incorrect to use the fec controller. I went back to the reference DART .dts as well as another board reference, and they both instantiate an &fec controller.

In both cases, when the boards boot, a line of the form:

Configuring network interfaces... fec 2188000.ethernet eth0: Freescale FEC PHY driver [Micrel KSZ9021 Gigabit PHY] (mii_bus:phy_addr=2188000.ethernet:03, irq=-1)

is emitted. It is pretty clear that the FEC is the interface on the iMX6 that talks to the RGMII.

@bvacaliuc
Contributor Author

bvacaliuc commented Jan 21, 2019

On Friday, January 18, 2019 2:05 PM, Edison Fernandez wrote:

I was taking a look at the connection for the KSZ (both physical and in the device tree) and realized that both the iMX6 and the KSZ have a MAC. I then researched how a MAC-to-MAC connection would be configured and wired for the iMX6 and found this thread:

https://community.nxp.com/thread/316374

They are using a different chip but it looks similar to the KSZ:
https://www.marvell.com/switching/assets/marvell_linkstreet_88E6350r_product_brief.pdf

It looks like for this specific case the RGMII connection should be:

RXD - TXD
RX_CTL - TX_CTL
...

And if I interpreted your schematics correctly, you have RX-RX, TX-TX, ...

We talked during our call about the DART using the KSZ9031, and I also found the Sabrelite, which uses the KSZ9021, but none of these chips has a MAC, so there is no restriction when connecting them to the iMX6.

Could you please verify this? This looks like an issue to me, but I think it is better if you analyze the situation, given that we don't do HW work.

I reference the below items:

  1. KSZ9563R
  2. KSZ9893R
  3. EVB-KSZ9893 User Guide, pg. 23
  4. DM320114 User Guide

From [3] there is a note:

For the xMII pins, Micrel parts naming convention is referenced to the PHY side regardless of whether the switch is in MAC mode or PHY mode. That is, "TX" named pins are inputs and "RX" named pins are outputs for MAC mode and PHY mode.

Also [2, Sec 4.10, pg. 52] says:

Note that the signals on the KSZ9893R MAC interfaces are named as they would be for a PHY: the TX direction is into the KSZ9893R, while the RX direction is out of the KSZ9893R, as if to a host processor with integrated MAC. Signal connection to such a “MAC” device is TX-to-TX, and RX-to-RX. An external PHY (such as the Microchip KSZ9031RNX) may be connected to either port, but in that case the signal connection will be RX-to-TX, and TX-to-RX.

This rather conflicts with [1, TABLE 4-30, pg. 58][2, TABLE 4-30, pg. 55] describing the pin direction unless one interprets the MAC to be the MAC inside the KSZ9893R, in which case it is consistent with the schematic note and the note on pg. 52. In either case, it seems the choice of RGMII TX-to-TX and RX-to-RX is correct.
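
To make the naming rule concrete, here is a small Python sketch of the convention quoted above. The function name `ksz_rgmii_wiring` and the abbreviated signal groups are purely illustrative and are not taken from the datasheet; it only encodes "TX-to-TX for a MAC peer, RX-to-TX for a PHY peer".

```python
# KSZ9893R xMII pins are named from the PHY's point of view:
# "TX*" pins are inputs to the switch, "RX*" pins are outputs.
SIGNALS = ["D[3:0]", "_CTL", "_CLK"]  # abbreviated, illustrative signal groups


def ksz_rgmii_wiring(peer):
    """Return (switch_pin, peer_pin) pairs for a peer of type 'mac' or 'phy'."""
    if peer == "mac":
        # A host MAC drives its own TX pins, so same-named pins connect.
        cross = False
    elif peer == "phy":
        # An external PHY also treats TX as an input, so the names cross over.
        cross = True
    else:
        raise ValueError("peer must be 'mac' or 'phy'")
    pairs = []
    for direction in ("TX", "RX"):
        other = {"TX": "RX", "RX": "TX"}[direction] if cross else direction
        for sig in SIGNALS:
            pairs.append((direction + sig, other + sig))
    return pairs


for sw_pin, host_pin in ksz_rgmii_wiring("mac"):
    print(f"KSZ9893R {sw_pin:8s} <-> i.MX6 {host_pin}")
```

With `peer="mac"` every pair is same-named, which is the wiring used on the Iris2 schematic.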

However, during my investigation on the RGMII interface, this discussion was observed:

We have connected KSZ8775CLXIC using RGMII interface to IMX.6. It did not work because as it turns out IMX.6 needs reference clock connected to ENET_REF_CLK which we did not connect. This switch does not have 125MHz reference clock output. We can use internal clock from IMX6 but it needs to be routed externally from GPIO_16 to ENET_REF_CLK...

Igor wrote:

if PHY can't provide 125M clock, recommended to connect GPIO_16 to ENET_CLK_REF
please look at example on
https://community.freescale.com/message/425432#425432

What this appears to require is a 125MHz clock to be provided to the ENET_REF_CLK input on J9.52. On the DART-MX6, GPIO_16 is routed to J8.49 which we connected as the TIMEPULSE from the GPS. It seems that what is needed is:

  1. Choose another pin for GPS TIMEPULSE
  2. Connect J8.49 to J9.52 to allow for configuration of 125MHz clock reference input to the i.MX6Q

I am going to try and make a wire on one of our boards from J8.49 to J9.52 (GPS TIMEPULSE is also an output but it is not enabled at power up; I’m hoping it will not cause problems).

@edisonf

edisonf commented Jan 21, 2019

I didn't know about the naming convention in Micrel parts. If that's the case the connection should be correct.
I also read about the ENET_REF_CLK signal you mentioned. I'll take a look at the device tree and make sure the configuration is right for the KSZ and clock signals.

@bvacaliuc
Contributor Author

Ok. I did the rework and cut the TIMEPULSE from the GPS. If you can figure out what the .dts should be to send the 125MHz clock out of GPIO_16, that would be super!

@edisonf

edisonf commented Jan 22, 2019

I just pushed some changes that aim to enable the 125MHz clock through GPIO_16, you can see the commit here:

uvdl/yocto-ornl@6c3a273

I took this thread as a reference.

The only configuration I'm missing is IOMUXC_ENET_REF_CLK_SELECT_INPUT.

I also included in the image the imx-test package which has memtool so you can read/write the iMX6 registers.

To get the current value of IOMUXC_ENET_REF_CLK_SELECT_INPUT, you can run:

/unit_tests/memtool -32 0x20E083C 1
E
Reading 0x1 count starting at address 0x020E083C

0x020E083C:  00000000

We are currently getting a 0x0, but according to the thread, it should be a 0x01. You can change that by running:

/unit_tests/memtool -32 0x20E083C=0x01

Finally, you can also check the CCM_ANALOG_PLL_ENETn registers by running:

/unit_tests/memtool -32 0x20C80E0 4
E
Reading 0x4 count starting at address 0x020C80E0

0x020C80E0:  80002003 80002003 80002003 80002003

You should get 80002003 which is the expected value.

@bvacaliuc
Contributor Author

I confirm that I see a 125MHz signal (looks like a sine-wave) on GPIO_16 upon boot with imx6q-iris2.dtb. Here is what I get from the above:

root@var-som-mx6-ornl:~# /unit_tests/memtool -32 0x20E083C 1
E
Reading 0x1 count starting at address 0x020E083C

0x020E083C:  00000001

root@var-som-mx6-ornl:~# /unit_tests/memtool -32 0x20C80E0 4
E
Reading 0x4 count starting at address 0x020C80E0

0x020C80E0:  80002003 80002003 80002003 80002003

Looks good so far. I configured eth0 for static IP, and sent some ICMP traffic. When I use ethtool -S eth0 to get statistics, they are still all 0. So we are still not working. Here is what ip says:

root@var-som-mx6-ornl:~# ip address list dev eth0
3: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether f8:dc:7a:22:af:f9 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.10/24 scope global eth0
       valid_lft forever preferred_lft forever

Looks like it is not able to recover the RXCLK.

I do have a support ticket open with Microchip to review our schematic. In the meantime, I think the plan is to configure the KSZ9893 into Remote Loopback [2, Sec. 5.2.1.7, pg. 102] and begin to experiment with RGMII delay values to find the sweet spot (if it can exist...).

@edisonf

edisonf commented Jan 23, 2019

I think we need to define a fixed link in fec.
According to fixed-link.txt:

Fixed link Device Tree binding
------------------------------

Some Ethernet MACs have a "fixed link", and are not connected to a
normal MDIO-managed PHY device. For those situations, a Device Tree
binding allows to describe a "fixed link".

@bvacaliuc
Contributor Author

So kind of like this? imx6q-iris2-b.dts

@edisonf

edisonf commented Jan 23, 2019

Yes, like that.

@bvacaliuc
Contributor Author

Progress...

[    1.759580] ksz9897 spi0.0: chip id 0x00989360
[    1.764913] libphy: Switch MII bus: probed
[    1.808495] spi_imx 2008000.ecspi: probed
[    1.812117] libphy: Fixed MDIO Bus: probed
[    1.833253] 2188000.ethernet supply phy not found, using dummy regulator
[    1.855695] libphy: fec_enet_mii_bus: probed
[    1.859298] fec 2188000.ethernet eth0: registered PHC device 0
[    8.194353] Generic PHY fixed-0:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=fixed-0:00, irq=-1)
[    9.204011] fec 2188000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off

Then, I set up a static IP and sent some ICMP traffic out. I made the default gateway a machine running tcpdump. I did NOT get any packets (no ARP table). I then composed some raw Ethernet frames and set the right SRC/DST addresses. Still no traffic.

root@var-som-mx6-ornl:~# arp -an
? (192.168.1.1) at <incomplete> on eth0
root@var-som-mx6-ornl:~# ethtool eth0
Settings for eth0:
        Supported ports: [ TP MII ]
        Supported link modes:   1000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Link partner advertised link modes:  1000baseT/Full
        Link partner advertised pause frame use: No
        Link partner advertised auto-negotiation: No
        Speed: 1000Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Link detected: yes
root@var-som-mx6-ornl:~# ethtool -S eth0
NIC statistics:
     tx_dropped: 0
     tx_packets: 186
     tx_broadcast: 132
     tx_multicast: 51
     tx_crc_errors: 0
     tx_undersize: 0
     tx_oversize: 0
     tx_fragment: 0
     tx_jabber: 0
     tx_collision: 0
     tx_64byte: 30
     tx_65to127byte: 22
     tx_128to255byte: 20
     tx_256to511byte: 9
     tx_512to1023byte: 105
     tx_1024to2047byte: 0
     tx_GTE2048byte: 0
     tx_octets: 72672
     IEEE_tx_drop: 0
     IEEE_tx_frame_ok: 186
     IEEE_tx_1col: 0
     IEEE_tx_mcol: 0
     IEEE_tx_def: 0
     IEEE_tx_lcol: 0
     IEEE_tx_excol: 0
     IEEE_tx_macerr: 0
     IEEE_tx_cserr: 0
     IEEE_tx_sqe: 0
     IEEE_tx_fdxfc: 0
     IEEE_tx_octets_ok: 72672
     rx_packets: 0
     rx_broadcast: 0
     rx_multicast: 0
     rx_crc_errors: 0
     rx_undersize: 0
     rx_oversize: 0
     rx_fragment: 0
     rx_jabber: 0
     rx_64byte: 0
     rx_65to127byte: 0
     rx_128to255byte: 0
     rx_256to511byte: 0
     rx_512to1023byte: 0
     rx_1024to2047byte: 0
     rx_GTE2048byte: 0
     rx_octets: 0
     IEEE_rx_drop: 0
     IEEE_rx_frame_ok: 0
     IEEE_rx_crc: 0
     IEEE_rx_align: 0
     IEEE_rx_macerr: 0
     IEEE_rx_fdxfc: 0
     IEEE_rx_octets_ok: 0

So it looks like we're not getting anything on the RX side of the switch. I'll keep exploring the RGMII delay modes. Should we be concerned that it's finding a "Generic PHY" and not the "KSZ9893"?
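
For repeated runs, the TX-versus-RX check above can be scripted instead of eyeballed. The following is a minimal Python sketch, not part of any build here; the helper names are my own, and the sample text is a trimmed copy of the `ethtool -S eth0` output above.

```python
def summarize_ethtool_stats(text):
    """Parse `ethtool -S` style "name: value" lines into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def rx_is_dead(stats):
    """True when frames were transmitted but none were received."""
    return stats.get("tx_packets", 0) > 0 and stats.get("rx_packets", 0) == 0


# Trimmed copy of the counters reported above.
sample = """NIC statistics:
     tx_packets: 186
     tx_octets: 72672
     rx_packets: 0
     rx_crc_errors: 0
"""
stats = summarize_ethtool_stats(sample)
print(stats["tx_packets"], stats["rx_packets"], rx_is_dead(stats))
```

Note that rx_crc_errors is also zero in the capture, which may suggest the MAC is seeing no frames at all rather than corrupted ones.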

@edisonf

edisonf commented Jan 25, 2019

Just added phy-handle to the FEC node. Please give it a try and let's see how it goes.

@bvacaliuc
Contributor Author

So I booted up with the updated image and first thing I observed is that the 125MHz clock went away. It also doesn't configure eth0 anymore. I checked the registers with memtool:

root@var-som-mx6-ornl:~# /unit_tests/memtool -32 0x20E083C 1
E
Reading 0x1 count starting at address 0x020E083C

0x020E083C:  00000001

root@var-som-mx6-ornl:~# /unit_tests/memtool -32 0x20C80E0 4
E
Reading 0x4 count starting at address 0x020C80E0

0x020C80E0:  00001003 00001003 00001003 00001003

Oops. You said this had to be 0x80002003, so I did:

root@var-som-mx6-ornl:~# /unit_tests/memtool -32 0x20C80E0=0x80002003
Writing 32-bit value 0x80002003 to address 0x020C80E0
root@var-som-mx6-ornl:~# /unit_tests/memtool -32 0x20C80E0 4
E
Reading 0x4 count starting at address 0x020C80E0

0x020C80E0:  80002003 80002003 80002003 80002003

And I got the 125MHz clock back on the scope. Hmmm. Why did that go away? I did a diff between the previous and current .dts and it seems that almost every phandle got bumped by one:
2019-01-23_vs_2019-01-26.txt
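
As a rough aid for reading those two register values, here is a Python sketch that splits CCM_ANALOG_PLL_ENET into fields. The bit positions are my understanding of the i.MX6 reference manual and the Linux clk-pllv3 code, so treat them as assumptions to verify. The four identical words in the memtool dump are, as far as I know, the register followed by its SET/CLR/TOG aliases.

```python
# Bit positions are assumptions (i.MX6 reference manual / Linux clk-pllv3);
# verify against the manual before relying on them.
DIV_SELECT_MHZ = {0: 25, 1: 50, 2: 100, 3: 125}


def decode_pll_enet(value):
    """Split a CCM_ANALOG_PLL_ENET value into the fields of interest."""
    return {
        "ref_mhz": DIV_SELECT_MHZ[value & 0x3],  # bits 1:0 DIV_SELECT
        "powerdown": bool(value & (1 << 12)),     # bit 12 POWERDOWN
        "enable": bool(value & (1 << 13)),        # bit 13 ENABLE
        "bypass": bool(value & (1 << 16)),        # bit 16 BYPASS
        "lock": bool(value & (1 << 31)),          # bit 31 LOCK (status)
    }


# The two values observed with memtool above.
for raw in (0x80002003, 0x00001003):
    print(f"0x{raw:08X}", decode_pll_enet(raw))
```

Under those assumptions, 0x00001003 reads as "125MHz selected, but powered down, not enabled and not locked", which would be consistent with the clock disappearing from the scope.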

@bvacaliuc
Contributor Author

Here are the boot logs from both:
yocto-boot-2019-01-23.txt
yocto-boot-2019-01-26A.txt

Looks like the latter is missing the phy-reset signal on gpio-25, so probably the phy-handle setting isn't right somehow.

@bvacaliuc
Contributor Author

We need to study this system to understand how to setup the .dtb files for using the KSZ9893R chip:
EVB-KSZ9893-Evaluation-Board-User-Guide
SAMA5D3-Ethernet-Development-System-Users-Guide

@bvacaliuc
Contributor Author

I got a response from Variscite on our schematic. There is an explanation for why RGMII does not work:

  1. RGMII lines are referenced to +2.5V, but we are providing +3.3V to AVDDH on the KSZ9893.
  2. Confirm need for 125MHz clock on SOM pin J3.52, recommend adding an option for a full oscillator in addition to looping back GPIO_16
  3. RGMII series termination can be achieved by choosing DSE values on CPU pins in the .dtsi files. In their design, they have a series 33ohm resistor on RGMII2_TCLK (and their reference schematic has a series 33ohm resistor on RGMII2_RCLK near the PHY).
  4. They reference NXP AN5078 for explanations on how to use the pin settings.

@bvacaliuc
Contributor Author

I got the shims to translate the voltage on the RGMII signals and to give 125MHz to J3.52 (ENET_REF_CLK). Not working. It is actually worse than on Jan 24 because all stats remain '0'. There is also a regression from Jan 25: the phy-handle addition seems to cause it not to configure an eth0 at all. I had to build uvdl/yocto-ornl@8af38e6 in order to get an interface.

Anyway, I do not think we have the correct .dtsi expression yet. We do not need to loop back GPIO_16 as the shim has a 125MHz oscillator on it. I believe we have all the correct signalling now. I am concerned that we no longer observe TX statistics as we did in #3 (comment). Here are two boot logs, one from a build of uvdl/yocto-ornl@develop and one from a build of uvdl/yocto-ornl@8af38e6.

boot-yocto-sd-2019-03-13-devel-R0-1-shim.log
boot-yocto-sd-2019-03-13-8af38e6-R0-1-shim.log

What do you think is the next step to diagnose this?

@bvacaliuc
Contributor Author

Ok. I have integrated Edison's patch to instrument the ksz9893 driver. I had to make a branch of the kernel, because the Yocto build kept rejecting my efforts to integrate the patch or build it locally.

The kernel branch is on commit uvdl/linux-imx@eaadda0. Here is the boot log:

boot-yocto-sd-2019-03-15-debug_ksz9893-R0-1-shim.log

So interpreting the trace, I see the following:

  1. FEC_QUIRK_ERR006687 is added
  2. fep->clk_enet_out = NULL
  3. fep->clk_ref = NULL
  4. fep->clk_2x_txclk = NULL
  5. got 3 irqs, wake_irq is one of them
  6. FEC_QUIRK_SINGLE_MDIO is added
  7. fec_enet_mii_init fails @3938/3993

Ah, bummer, we didn't instrument this function. There are 4 sections. The first exits silently if there is no mii bus. The second exits with a dev_err(), and we don't see that, so that's not it. The third could exit silently if it could not allocate a bus. The fourth exits silently if it could not register an mdio bus. My money is on the fourth because:

	node = of_get_child_by_name(pdev->dev.of_node, "mdio");
	if (node) {
		err = of_mdiobus_register(fep->mii_bus, node);
		of_node_put(node);
	} else if (fep->phy_node && !fep->fixed_link) {
		err = -EPROBE_DEFER;
	} else {
		err = mdiobus_register(fep->mii_bus);
	}

	if (err)
		goto err_out_free_mdiobus;

What is this hard-coded mdio on line 2159? We are connected on SPI...
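
For what it's worth, the "mdio" string in that snippet is the name of a device-tree child node that of_get_child_by_name() looks for under the FEC node, not a statement about how the switch is managed. The Python sketch below is only a model of the if/else chain above (the function and argument names are mine), assuming fep->phy_node is what gets populated from phy-handle.

```python
def mii_register_path(has_mdio_child, has_phy_node, has_fixed_link):
    """Mirror the if/else chain above and name the branch that would run."""
    if has_mdio_child:
        return "of_mdiobus_register"
    if has_phy_node and not has_fixed_link:
        return "-EPROBE_DEFER"
    return "mdiobus_register"


# phy-handle set, no "mdio" child node, no fixed-link: probe is deferred.
print(mii_register_path(False, True, False))
# fixed-link present: falls through to plain mdiobus_register().
print(mii_register_path(False, True, True))
# An "mdio" child node takes priority over everything else.
print(mii_register_path(True, True, False))
```

So, under that assumption, a FEC node with a phy-handle but neither an "mdio" child nor a fixed-link would land on the -EPROBE_DEFER branch.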

@bvacaliuc
Contributor Author

These two discussions seem relevant:
http://lists.infradead.org/pipermail/barebox/2014-May/019197.html
https://lore.kernel.org/lkml/971bb29592c6ff4e786a988d850e1956@agner.ch/

Do we need to name some tag in our device tree to have it reference 'mdio'? Or is it something else for SPI?

@bvacaliuc
Contributor Author

So upon instrumenting fec_enet_mii_init(), we find that it fails in the above code with an -EPROBE_DEFER. Hmm:
http://lists.infradead.org/pipermail/linux-arm-kernel/2012-August/113446.html

I continue to instrument as the investigation continues.

@bvacaliuc
Contributor Author

So with uvdl/linux-imx@418e464, the issue appears to be that the driver is looking for a fixed-link child and it's not getting it.

boot-yocto-sd-2019-03-16-debug_ksz9893-R0-1-shim.log
(see lines 334-336).

Strangely, though, later on at line 349 something does match. So I'm rather confused. Do we need to spinlock?

@bvacaliuc
Contributor Author

Based on 'fsl-fec.txt', I made some mods to 'imx6q-iris2-R0.dts' in uvdl/linux-imx@d9ef2d1

I regenerated the kernel using https://github.com/uvdl/yocto-ornl/blob/project/debug_ksz9893/Makefile#L157, but it failed to update the .dtb. I will have to try it manually tomorrow.

@bvacaliuc
Contributor Author

So over the weekend, I did two things:

  1. Finished out the branch, https://github.com/uvdl/linux-imx/tree/project/ksz9897-microchip
    1. this is a branch that tries to follow the example from https://github.com/Microchip-Ethernet/EVB-KSZ9477/tree/master/KSZ/kernels/linux-4.9.143
    2. the modifications Microchip made to drivers/net/ethernet/cadence/macb are merged into drivers/net/ethernet/freescale/fec (as best I could and as far as I understood them).
    3. I put the added code in two files ksz_fec.h and ksz_fec.c so that it is easier to understand what the needed additions are to fec_main.c.
  2. Merged-up work that Edison was doing in his testing branch into my testing branch, https://github.com/uvdl/linux-imx/tree/project/debug_ksz9893
    1. Edison put his changes in fec_main.c completely

Observations: The behavior remains the same in either branch: FEC comes up, attaches to PHY, appears to communicate with it, but no traffic is accounted for in the driver (and no traffic makes it to the network). Using 2 (commit uvdl/linux-imx@dbc1e6d), I collected the following logs:
boot-yocto-dbc1e6dd56421d702389d46f930c931758f1d82b-noshim-cleanup.log
boot-yocto-dbc1e6dd56421d702389d46f930c931758f1d82b-shim-cleanup.log

(The logs are "cleaned up" because some of the printk's overran program output on the terminal.) What we see is that the shim (which translates the RGMII voltage reference) had no appreciable effect on the operation.

With the shim installed, I am able to probe the RGMII and SPI signals. I had previously confirmed that SPI transactions are occurring and that data is flowing (though the traces show a lot of overshoot on the edges of all signals - probably need to adjust termination settings in the pinmux). In the below, I captured ENET_REF_CLK (125MHz oscillator to GPIO1[23]), RX_CLK, RX_CTL, TX_CLK. (The latter three are in the +3.3V domain on the Iris2 R0 board.)

TEK00000
TEK00001

From the above, what I can surmise is that:

  • REFCLK is at low level - possibly too low if it is expected to come in at +3.3V logic.
  • RXCLK (driven by KSZ9893) seems good.
  • RXCTL (driven by KSZ9893) is mostly low with a 1V ripple
  • TXCLK (driven by IMX6) is not good. Seems to have a 0.5V ripple with a 1V bias.

I think the next step here is to disconnect the 125MHz CLK from the SHIM and do the GPIO_16 route-around trick to have the IMX6 generate the 125MHz CLK at its correct voltage.

@edisonf

edisonf commented Apr 1, 2019

That's a problem. I assumed the clock from the SHIM was working. I'll take a look at the kernel and see what support it has for generating the 125MHz clock.
Will get back to you soon.

@edisonf

edisonf commented Apr 1, 2019

The kernel and devicetree seem to have the logic for generating the 125MHz clock.
You can see some comments here.

Could you try this support with this commit ?

@edisonf

edisonf commented Apr 1, 2019

Another thing I would like to try is using multi_dev=1 (see section 8.1.2).
Taking a look at the driver, an entry called multi_dev should be created under /sys/module/MODULE_NAME/parameters/.
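
A quick way to confirm whether that entry exists and what it is set to is sketched below in Python. The helper is hypothetical, and "ksz_module" is only a placeholder for the real MODULE_NAME, which I have not checked.

```python
from pathlib import Path


def read_module_param(module, name, sysfs_root="/sys/module"):
    """Return the parameter's current value as a string, or None if absent."""
    path = Path(sysfs_root) / module / "parameters" / name
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        return None


# "ksz_module" is a placeholder; substitute the real MODULE_NAME.
print(read_module_param("ksz_module", "multi_dev"))
```

A result of None would mean the parameter (or the module directory) is not present, for example because the driver was built without it.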

@bvacaliuc
Contributor Author

On Monday, April 01, 2019 2:53 PM, Edison Fernandez wrote:

Actually, could you please send me the log before you do the refclk work?

Ok, here is the log before the refclk adjustments:
boot-yocto-5cce0f757dccd11453a5af00373a233cbdf2d65a.log

ksz_mii_read()/ksz_mii_write() were spamming the console, so I implemented the RLL reporting method to only show observed changes.
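
The idea of reporting only changes can be modelled in a few lines. This Python sketch is not the driver code (which is C in the kernel branch); the class name and the register values fed to it are made up for illustration.

```python
class ChangeLogger:
    """Remember the last value per (phy, reg) and report only when it changes."""

    def __init__(self):
        self._last = {}

    def record(self, phy, reg, value):
        key = (phy, reg)
        if self._last.get(key) == value:
            return None  # unchanged: stay quiet
        self._last[key] = value
        return f"phy {phy} reg {reg}: 0x{value:04x}"


log = ChangeLogger()
# Hypothetical sequence of reads of the same register.
for phy, reg, value in [(2, 2, 0x0022), (2, 2, 0x0022), (2, 2, 0x0000), (2, 2, 0x0022)]:
    line = log.record(phy, reg, value)
    if line:
        print(line)
```

Four reads produce three lines here, because the repeated second read is suppressed.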

@bvacaliuc
Contributor Author

As I was setting up for the refclk test, I noticed that the .dtsi file was still set up for switching the GPIO_16, so I made a branch, debug/enet-ref-clk-gpio_16, and reverted the two lines from the commit you referenced. The log/scope observation does not show any difference (i.e. the TXCLK did not suddenly fire up...), but I post it here for completeness:
boot-yocto-ecc8cc36c98e97a29dcbfc5ee550e0c04f14c8ed.log

@edisonf

edisonf commented Apr 1, 2019

Could you try using memtool to check the GPIO configuration as you did here?

@bvacaliuc
Contributor Author

It was still running from last night, so I did the commands:

root@var-som-mx6-ornl:~# /unit_tests/memtool -32 0x20E083C 1
E
Reading 0x1 count starting at address 0x020E083C

0x020E083C:  00000001

root@var-som-mx6-ornl:~# /unit_tests/memtool -32 0x20C80E0 4
E
Reading 0x4 count starting at address 0x020C80E0

0x020C80E0:  80002003 80002003 80002003 80002003

For completeness, I attach the console log. There are periodic changes to the phy 2 reg 2 that are captured for whatever that is worth:
boot-yocto-ecc8cc36c98e97a29dcbfc5ee550e0c04f14c8ed-memtest.log

Now, this was supposed to be the non-GPIO_16 run, but it behaves the same as the GPIO_16 one. I pulled up the imx6q-iris2-R0.dts that was made from the .dtb and it has this:

					enetgrp {
						fsl,pins = <0x58 0x36c 0x0 0x1 0x0 0x1b0b0 0x5c 0x370 0x0 0x1 0x0 0x1b0b0 0x60 0x374 0x0 0x1 0x0 0x1b0b0 0x64 0x378 0x0 0x1 0x0 0x1b0b0 0x68 0x37c 0x0 0x1 0x0 0x1b0b0 0x74 0x388 0x0 0x1 0x0 0x1b0b0 0x1d4 0x4e8 0x0 0x1 0x0 0xb0f1 0x84 0x398 0x844 0x1 0x0 0x1b0b0 0x70 0x384 0x848 0x1 0x0 0x1b0b0 0x78 0x38c 0x84c 0x1 0x0 0x1b0b0 0x7c 0x390 0x850 0x1 0x0 0x1b0b0 0x80 0x394 0x854 0x1 0x0 0x1b0b0 0x6c 0x380 0x858 0x1 0x0 0x1b0b0 0x1dc 0x4f0 0x0 0x5 0x0 0x831 0x248 0x618 0x83c 0x2 0x1 0x1b0d9>;
						linux,phandle = <0x2e>;
						phandle = <0x2e>;
					};

The problem is that the imx6qdl-iris2-R0.dtsi has this:

                pinctrl_enet: enetgrp {
                        fsl,pins = <
                                MX6QDL_PAD_RGMII_TXC__RGMII_TXC         0x0001b0b0
                                MX6QDL_PAD_RGMII_TD0__RGMII_TD0         0x0001b0b0
                                MX6QDL_PAD_RGMII_TD1__RGMII_TD1         0x0001b0b0
                                MX6QDL_PAD_RGMII_TD2__RGMII_TD2         0x0001b0b0
                                MX6QDL_PAD_RGMII_TD3__RGMII_TD3         0x0001b0b0
                                MX6QDL_PAD_RGMII_TX_CTL__RGMII_TX_CTL   0x0001b0b0
                                MX6QDL_PAD_ENET_REF_CLK__ENET_TX_CLK    0x0001b0b0
                                MX6QDL_PAD_RGMII_RXC__RGMII_RXC         0x0001b0b0
                                MX6QDL_PAD_RGMII_RD0__RGMII_RD0         0x0001b0b0
                                MX6QDL_PAD_RGMII_RD1__RGMII_RD1         0x0001b0b0
                                MX6QDL_PAD_RGMII_RD2__RGMII_RD2         0x0001b0b0
                                MX6QDL_PAD_RGMII_RD3__RGMII_RD3         0x0001b0b0
                                MX6QDL_PAD_RGMII_RX_CTL__RGMII_RX_CTL   0x0001b0b0
                                MX6QDL_PAD_ENET_CRS_DV__GPIO1_IO25      0x00000831 /* PHY Reset */
                        >;
                };

So I am not expecting the 0xb0f1 and the 0x1b0d9 codes. This tells me that it did not regenerate the .dtb file following the change to the .dtsi file. Grrr. It is SO FRUSTRATING that this build system doesn't handle dependencies properly. I rebuilt the kernel using make kernel from here:
https://github.com/uvdl/yocto-ornl/blob/project/debug_ksz9893/Makefile#L158
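
To make the stale-.dtb comparison less error-prone, here is a Python sketch that splits the decompiled fsl,pins cells into pin entries and flags pad settings that the .dtsi does not use. It assumes the usual i.MX layout of six cells per pin (mux_reg, conf_reg, input_reg, mux_mode, input_val, pad config); the cell values are copied from the decompiled enetgrp node above.

```python
# Cells copied from the decompiled enetgrp node, six per pin (assumed layout:
# mux_reg, conf_reg, input_reg, mux_mode, input_val, pad config).
CELLS = [
    0x58, 0x36c, 0x0, 0x1, 0x0, 0x1b0b0,
    0x5c, 0x370, 0x0, 0x1, 0x0, 0x1b0b0,
    0x60, 0x374, 0x0, 0x1, 0x0, 0x1b0b0,
    0x64, 0x378, 0x0, 0x1, 0x0, 0x1b0b0,
    0x68, 0x37c, 0x0, 0x1, 0x0, 0x1b0b0,
    0x74, 0x388, 0x0, 0x1, 0x0, 0x1b0b0,
    0x1d4, 0x4e8, 0x0, 0x1, 0x0, 0xb0f1,
    0x84, 0x398, 0x844, 0x1, 0x0, 0x1b0b0,
    0x70, 0x384, 0x848, 0x1, 0x0, 0x1b0b0,
    0x78, 0x38c, 0x84c, 0x1, 0x0, 0x1b0b0,
    0x7c, 0x390, 0x850, 0x1, 0x0, 0x1b0b0,
    0x80, 0x394, 0x854, 0x1, 0x0, 0x1b0b0,
    0x6c, 0x380, 0x858, 0x1, 0x0, 0x1b0b0,
    0x1dc, 0x4f0, 0x0, 0x5, 0x0, 0x831,
    0x248, 0x618, 0x83c, 0x2, 0x1, 0x1b0d9,
]
EXPECTED_CONFIGS = {0x1b0b0, 0x831}  # the only pad settings in the .dtsi above


def decode_pins(cells):
    """Split the flat cell list into 6-cell pin entries."""
    if len(cells) % 6:
        raise ValueError("fsl,pins length is not a multiple of 6")
    keys = ("mux_reg", "conf_reg", "input_reg", "mux_mode", "input_val", "config")
    return [dict(zip(keys, cells[i:i + 6])) for i in range(0, len(cells), 6)]


pins = decode_pins(CELLS)
print(len(pins), "pins in the .dtb; the .dtsi lists 14")
for pin in pins:
    if pin["config"] not in EXPECTED_CONFIGS:
        print({k: hex(v) for k, v in pin.items()})
```

The extra fifteenth entry has input_reg 0x83c and input_val 0x1, which lines up with the 0x20E083C register read with memtool earlier, so it looks like the GPIO_16 routing was still in that .dtb.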

I think I captured the steps you recommended from your Friday, March 15, 2019 12:25 PM email. Could you double check?

@edisonf

edisonf commented Apr 2, 2019

The steps you followed to build the kernel are correct. Just keep in mind that the dtb will be under your $(KERNEL_BUILD)/arch/arm/boot/dts directory.

Regarding the refclk, I'm confused. Was it running on your board?

@bvacaliuc
Contributor Author

Ah, ok. So I use the make archive to pack up the files on the AWS instance and transfer them to my local machine. I need another step in make kernel to copy them to the $(YOCTO_DIR)/$(YOCTO_ENV)/tmp/deploy/images/$(MACHINE) directory. I have already started a full rebuild so I need to wait for that to complete...

So refclk as probed is passed in to J3.52 (GPIO1[23]) on the DART-MX6. So yes, it is running, but I do not think it has enough drive for some reason. I'm going to probe an unloaded shim to see what the clock generator chip is doing (it's a DSC1001DL5-125.0000).

@edisonf

edisonf commented Apr 2, 2019

Based on macb.c I added the sw_init function to the fec driver. You can see the commit here.

Could you please give it a try?

@bvacaliuc
Contributor Author

Ok, first of all, let me report on what I have learned so far from the hardware debug. It turns out that my shim had the incorrect part number for the +3.3V regulator to the clock chip. It was only producing +1.5V, so that explains the low output level. For the purpose of the test, I bypassed the regulator and supplied the oscillator from +3.7V thru a 36 ohm resistor. Here is the difference:

shim-bypass-regulator-enet-ref-clock

Ok, so now with a correctly functioning REFCLK on a new shim, I performed the boot. First observation is that until the eth0 interface is brought up the situation looks like this:

pre-eth0-up

Upon login, when the eth0 link comes up, the observation is thus:

eth0-up

Then when I do a ping 192.168.1.1, I get this, when triggering on TXCTL:

ping-on-eth0

There is no reply, but I suspect this has more to do with the setup of the KSZ9893 than anything else. I will try your commit next, but I'm happy that we are making progress!

@bvacaliuc
Contributor Author

Ok, well with your commits, the kernel is panicking. I made a fix/enable-ksz9893 branch to try and resolve it, but I only got as far as the panic in port_get_link_speed(). Here is the log from uvdl/linux-imx@81c3dee on that branch:
boot-yocto-81c3deefca12b5b536958155886131b0df8fa8c2-panic.log

Basically, delayed work is running with incomplete setup. One clue is that the traps I put into ksz_mii_read() show that sw->ops is NULL for phy_id 0 during the ethtool call. Eventually it gets into port_get_link_speed() and fails on an element that I have not been able to trap.

I had to make two commits to get it past bootup: uvdl/linux-imx@03ed24a, uvdl/linux-imx@5a0ec85

@edisonf

edisonf commented Apr 3, 2019

Could you please give this commit a try?

It does the phy_connect in the fec_enet_mii_probe function instead of the ksz sw initialization.

@bvacaliuc
Contributor Author

Hmm. So I am having a hard time reproducing the state that I had when I posted #3 (comment). This would have been done against the kernel state in uvdl/linux-imx@ecc8cc3.

When I run that same kernel build today, I get the following when I ping:
TEK00013

Granted, I am using a dodgy circuit to power the oscillator, but could that cause TXCLK to STOP? And why is it tracking TXCTL during pings? (I expect TXCTL to trigger during ping as the iMX6 MAC is sending to the KSZ9893, but I expect TXCLK to be a stable 125 MHz synced to ENET_REF_CLK.)

Here is the same condition, but probing the REFCLK oscillator supply:
TEK00012

Notice the very strong ripple on the power supply. That cannot be good, but can it explain the behavior of TXCLK?

@bvacaliuc
Contributor Author

I posted this invitation on the NXP forum for collaboration. If you are coming here from that, WELCOME!
We have switch hardware, diagnostics and are highly motivated to get this working! :)

@bvacaliuc
Contributor Author

Activity during the last couple of days:

On Thursday, April 04, 2019 3:59 PM, Edison Fernandez wrote:

Could you please verify what's in phy_interface of the sw.0:xx buses under /sys/bus/mdio_bus/devices/ ? I just want to double check the switch is configured as we think it is.

/sys/bus/mdio_bus/devices/fixed-0:00:
of_node  phy_has_fixups  phy_id  phy_interface  power  subsystem  uevent

/sys/bus/mdio_bus/devices/sw.0:00:
driver  phy_has_fixups  phy_id  phy_interface  power  subsystem  uevent

/sys/bus/mdio_bus/devices/sw.0:01:
driver  phy_has_fixups  phy_id  phy_interface  power  subsystem  uevent

/sys/bus/mdio_bus/devices/sw.0:02:
driver  phy_has_fixups  phy_id  phy_interface  power  subsystem  uevent

/sys/bus/mdio_bus/devices/sw.0:03:
driver  phy_has_fixups  phy_id  phy_interface  power  subsystem  uevent

cat /sys/bus/mdio_bus/devices/*/phy_interface
/sys/bus/mdio_bus/devices/fixed-0:00/phy_interface: gmii
/sys/bus/mdio_bus/devices/sw.0:00/phy_interface: rgmii-id
/sys/bus/mdio_bus/devices/sw.0:01/phy_interface: gmii
/sys/bus/mdio_bus/devices/sw.0:02/phy_interface: gmii
/sys/bus/mdio_bus/devices/sw.0:03/phy_interface: gmii
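
For repeated checks, the same per-device listing can be collected with a few lines of Python. This is a minimal sketch that assumes the sysfs layout shown above; the function name `read_phy_interfaces` is hypothetical, and the `root` parameter exists only so the function can be pointed at a copy of the tree.

```python
import glob
import os


def read_phy_interfaces(root="/sys/bus/mdio_bus/devices"):
    """Map each MDIO device name under `root` to its phy_interface string."""
    result = {}
    for path in sorted(glob.glob(os.path.join(root, "*", "phy_interface"))):
        # The device name is the directory that holds the attribute file.
        device = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            result[device] = f.read().strip()
    return result
```

On the target, `read_phy_interfaces()` returns a dictionary keyed by device name (for example `sw.0:00`), which makes it easy to compare the reported modes between kernel builds.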

This is from uvdl/linux-imx@ecc8cc3 which has become the reference for a lot of testing since it correctly triggers TXCTL during the ping test…

On Fri, Apr 5, 2019 at 11:46 AM, Edison Fernandez wrote:

I have some doubts about the switch configuration.
Is a phy_id directly related with a physical port? If that's the case, shouldn't sw.0:02 interface be RGMII?
Also, if the ports and the phy_ids are related, shouldn't the net device be attached to phy_id 2 (it is currently attached to phy_id 0)?

To answer this I have a couple of tests I would like to run.

  1. Register 0x3301 of the switch has the configuration of the MAC port (see page 120 of the datasheet), we can verify in there the actual configuration of the port.
  2. If phy_id and port are related, we can easily test attaching to phy 2 instead of 0, you can just change this line to be "for (phy_id = 2; (phy_id < PHY_MAX_ADDR); phy_id++) {". That way it should attach to phy_id 2 instead of 0 as it always uses the first registered phy it finds.

Results from test 1:
boot-yocto-ecc8cc36c98e97a29dcbfc5ee550e0c04f14c8ed-2019-04-05_1745.log

It did not appear that the dbg_msg calls got emitted after writing 8 to printk. I remember I had to change a #define in the driver source code to get them (from before, I didn’t do it here).

The register 0x3301 was returning 0xcf. There is a discrepancy between the KSZ9893 datasheet p. 121 table 5.2.3.2 and the KSZ9897 Switch Reference Guide pg. 14-15 regarding register 0xN301. From what I could tell, only ‘b11’ in bits 1-0 (RGMII) produced a working RXCLK (as seen on the scope), so while the sysfs ‘sw/reg’ interface as described in the switch reference guide seems to function, the datasheet remains authoritative. I will confirm this with Microchip.

0xcf leads us to think that it's not receiving at 1000 (since the datasheet says that bit 6 is the port speed). Curiously, I have strapped LED2_1 to GND, so that should select ‘0’, yet the register reads back ‘1’. Anyway, it seems like 0x07 is a more reasonable value for this register, so I included a test where this is also done.

@bvacaliuc
Contributor Author

I had set up another DART-MX6 on the DT6C board running tcpdump -i eth0 -v to see if any packets made it to that interface. It turns out that it is quite likely that many of the previous tests in which TXCTL was triggering would have translated into IP traffic (ARP and ICMP).

Using test 2 (uvdl/linux-imx@b96df96), I observed a variety of unusual behavior looking at the scope for TXCTL/TXCLK and RXCTL/RXCLK. What I found was that by changing the value of register 0x3301, I could affect the frequency of the RXCLK. For example:

0x3301 value   RXCTL active   RXCLK
0xcf           N              125MHz
0x00           Y              25MHz
0x57           N              125MHz
0x13           Y              25MHz
0x53           Y              125MHz
0x5b           Y              125MHz
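
The rows of interest are the ones where RXCTL is active and RXCLK runs at 125MHz. A minimal sketch of that selection follows; the data is copied from the table, while the names `OBSERVATIONS` and `working_candidates` are made up for illustration.

```python
# (value written to 0x3301, RXCTL active?, RXCLK in MHz), copied from the table
OBSERVATIONS = [
    (0xCF, False, 125),
    (0x00, True, 25),
    (0x57, False, 125),
    (0x13, True, 25),
    (0x53, True, 125),
    (0x5B, True, 125),
]


def working_candidates(observations):
    """Keep the values where RXCTL is active and RXCLK runs at 125 MHz."""
    return [value for value, rxctl_active, rxclk_mhz in observations
            if rxctl_active and rxclk_mhz == 125]


print([hex(v) for v in working_candidates(OBSERVATIONS)])  # ['0x53', '0x5b']
```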

When I used 0x53, I started getting intermittent responses; when I used 0x5b, the interface stabilized:

R0-1-to-DT6C-ethernet

So I think we are close to zeroing in on the minimal set of configuration needed to make this work.

@bvacaliuc
Contributor Author

bvacaliuc commented Apr 8, 2019

Ok, so as best as I can tell there are two settings that make everything work:

  1. 0x3301=0x5b
  2. 0x3800=0x02

These have to be set when eth0 is down. Here is a clean run from power up that has the trace:

boot-yocto-ecc8cc36c98e97a29dcbfc5ee550e0c04f14c8ed-0x5b-0x02.log
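
The "only while eth0 is down" constraint can be enforced in a small helper. The sketch below is hedged in two ways: the function name `apply_switch_registers` is hypothetical, and it ASSUMES that the switch driver's sysfs `reg` file accepts strings of the form `addr=value`, which this thread does not confirm. The interface state is read from the standard `/sys/class/net/<iface>/operstate` file.

```python
def apply_switch_registers(settings, reg_path, operstate_path):
    """Write each register setting, but only while the interface is not up.

    ASSUMPTION: the driver's `reg` file accepts "addr=value" strings.
    `settings` maps register addresses to the values to write.
    """
    with open(operstate_path) as f:
        state = f.read().strip()
    if state == "up":
        raise RuntimeError("interface is up; bring it down first")
    for addr, value in settings.items():
        # One write per register, reopening the file each time.
        with open(reg_path, "w") as f:
            f.write(f"0x{addr:04x}=0x{value:02x}\n")
```

A call for the two settings above would look like `apply_switch_registers({0x3301: 0x5B, 0x3800: 0x02}, "/sys/bus/spi/devices/spi0.0/sw/reg", "/sys/class/net/eth0/operstate")`, with both paths adjusted to the actual board.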

So next steps are to determine what (if any) modifications are actually required to the fec driver or if simply bringing up the switch driver (which enables setting those registers thru sysfs) would be sufficient.

@bvacaliuc
Contributor Author

With uvdl/linux-imx@40704e0, I am able to use the Generic PHY (in a suboptimal, no flow control way):

boot-yocto-40704e0706645154c33fe67e18c4161714421664-0x5b-0x02.log

@bvacaliuc
Contributor Author

Actually, I no longer think that it is necessary to set the sniffer port (reg 0x3800). Register 0x3301=0x5b remains key.

@bvacaliuc
Contributor Author

Tuesday, April 09, 2019 11:47 AM, Edison Fernandez wrote:

I think all we need to do to get it working is:

  1. Attach to the sw.0 bus (so far this requires modifying the fec driver)
  2. Find a way to properly change registers 0x3301 and 0x3800. In the worst case this will require modifying the ksz driver.

On Tue, Apr 9, 2019 at 11:09 AM Vacaliuc, Bogdan wrote:

Yes. Interestingly, attaching the phy is not needed for traffic, but it will allow someone to use phytool to read back the registers. I suspect it could also allow the interface to configure different speeds and use flow control, which right now we do not have.

So you think it is not possible for the fec driver to attach to the sw.0 bus without modification?

On Wednesday, April 10, 2019 1:06 PM Edison Fernandez wrote:

Yes, I don't think it is possible. We need the fec driver to use the switch bus in order for it to attach to sw.0, and I don't see how we could do that without modifying the fec driver.

I got pretty close with uvdl/linux-imx@87f0822; it may be that we can modify the ksz driver in such a way as to accept the MDIO bus created by the fec. What do you think?

@edisonf

edisonf commented Apr 10, 2019

The problem I see with that is that when the fec driver creates its bus, it also registers its own read/write functions. We need the bus to use the ones implemented by the switch driver, which are the ones that ultimately communicate with the switch via SPI in our case.

@bvacaliuc
Contributor Author

Ok, well then is there any other driver that does this and how is it implemented? What I'd like to do is find a way to modify the fec driver minimally and using just the DT. The plan is to have something that can be pushed upstream.

@bvacaliuc
Contributor Author

Adding the 'go.sh' script that I used above for integration into other distro builds:
go.sh.txt

bvacaliuc pushed a commit to uvdl/linux-imx that referenced this issue May 19, 2019
@bvacaliuc
Contributor Author

A new datasheet is available for the KSZ9893, updated 2019-06-11. Table 3-3, pg. 15 updates the strapping resistor settings and shows that for RGMII operation LED2_0 should be pulled down to disable IBS (Register 0x3301, bit 2).

Now, when the system boots, it reads:

# echo 0x3301 > /sys/bus/spi/devices/spi0.0/sw/reg
4b

(the setting needs to read back 5b)

Well, so it does not appear that any further strapping can help here, as the bit in question above is the RGMII ingress delay for the clock. According to pg. 122 of the KSZ9893R datasheet, this defaults to 0, does not have a strapping setting, and (unfortunately) needs to be 1 in our application.

So it appears that we still cannot achieve a driverless operation of the KSZ9893 as the delay can only be set by programming the 0x3301 register.
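
The gap between the power-up readback (4b) and the required value (5b) can be checked by XOR-ing the two. A minimal sketch, with the helper name `differing_bits` made up for illustration:

```python
def differing_bits(actual, required):
    """Return the bit positions where two register values differ."""
    diff = actual ^ required
    return [bit for bit in range(diff.bit_length()) if diff & (1 << bit)]


print(differing_bits(0x4B, 0x5B))  # prints [4]
```

Only bit 4 differs, which is consistent with a single remaining bit (identified above as the RGMII ingress delay) having to be set by software.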
