Update TI bits in linux-iot2050 #411

Merged
merged 7 commits into from Feb 15, 2023

jan-kiszka
Collaborator

Align the BSP kernel regarding its TI downstream or backport bits with latest ti-linux. Also update the prueth firmware along with that. Should resolve #368 and supersedes #374.

This does not include a stable update of the underlying CIP kernel yet as we are waiting for a recent cip-rt release.

@attila-hannibal

Dear All!

We tried the Siemens reference image (Chao's branch). The eno1 interface was configured to use DHCP. After several power cycles eno1 couldn't get an IP address. Based on the statistics, this seems to be the well-known issue where RX packets get stuck at driver level and never reach the higher layers. Long story short: the DHCP responses (RX packets) are lost in the driver and Linux cannot configure the IP.

The statistics in the ethtool dump show non-zero RX counters. However, ifconfig / netstat show zero RX bytes:
ethtool.txt
ifconfig.txt
journalctl-b.log
netstat.txt
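
The mismatch described above (driver counters moving, stack counters at zero) can be checked with a small script. This is a minimal sketch, not a tool used in this thread: it assumes `ethtool -S` style "name: value" output, and the counter name `rx_good_frames` is a hypothetical placeholder that has to be replaced with whatever RX counter the driver actually reports.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines as printed by `ethtool -S <iface>`."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "NIC statistics:" header and any non-numeric lines.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def rx_stalled(ethtool_text, stack_rx_packets, hw_counter="rx_good_frames"):
    """True if the driver/hardware saw frames but the stack counted none.

    hw_counter is a hypothetical counter name (assumption, see lead-in).
    """
    hw_rx = parse_ethtool_stats(ethtool_text).get(hw_counter, 0)
    return hw_rx > 0 and stack_rx_packets == 0


if __name__ == "__main__":
    sample = "NIC statistics:\n     rx_good_frames: 57\n     tx_good_frames: 2872\n"
    print(rx_stalled(sample, stack_rx_packets=0))
```

On a device, the stack-level value could be read from /sys/class/net/eno1/statistics/rx_packets and the text from the output of `ethtool -S eno1`.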

Will try Jan's branch variant

@attila-hannibal

I tried a clean build from jan/kernel-update branch

The network interfaces don't work at all.
After 30 seconds a kernel dump appears on the debug console:
...
[ 136.998977] ------------[ cut here ]------------
[ 137.003638] NETDEV WATCHDOG: eno2 (icssg-prueth): transmit queue 0 timed out
[ 137.010788] WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:467 dev_watchdog+0x314/0x320
[ 137.019038] Modules linked in: ti_am335x_adc kfifo_buf irq_pruss_intc rfkill icssg_prueth pru_rproc icss_iep ptp cp210x pps_core usbserial ti_k3_r5_remoteproc ti_cal videobuf2_dma_contig ti_am335x_tscadc v4l2_fwnode videobuf2_memops pci_endpoint_test videobuf2_v4l2 videobuf2_common pruss at24 optee_rng rng_core fuse ip_tables x_tables ipv6
[ 137.049140] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.10.145-cip17 #1
[ 137.055739] Hardware name: SIMATIC IOT2050 Advanced PG2 (DT)
[ 137.061388] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[ 137.067382] pc : dev_watchdog+0x314/0x320
...
and then endlessly the following comes:
...
[ 180.006989] icssg-prueth icssg0-eth eno2: xmit timeout
[ 184.870978] icssg-prueth icssg0-eth eno2: xmit timeout
[ 189.990981] icssg-prueth icssg0-eth eno2: xmit timeout
...

This version is even worse, please double check at your side

logs attached:
console.txt

@jan-kiszka
Collaborator Author

Thanks for reporting. I haven't seen this on any boards here so far. Same issue when going two commits backward (70ebcfa)?

@jan-kiszka
Collaborator Author

@attila-hannibal please also check if a specific element of your network infrastructure contributes to this (e.g. a specific switch, compared to cross-links).

@attila-hannibal

Hello @jan-kiszka

I tried the mentioned commit 70ebcfa but the behaviour is the same: the eno* interfaces don't work, and a kernel dump appears after 1-2 minutes.
I guessed that the TI pruss firmware might be too new, so I rolled the version back from "08.06.00.001" to "08.02.00.002", but that did not help.

@jan-kiszka
Collaborator Author

And if you leave out the firmware update completely? In my tests, our current firmware still worked.

BTW, please also explore my other question if your network infrastructure influences this. I have no luck reproducing it, boxes run for hours with all versions.

@attila-hannibal

attila-hannibal commented Jan 20, 2023

By using the binary artefact from: https://github.com/siemens/meta-iot2050/suites/10438058123/artifacts/516591580

The only change I made, besides the root password configuration at first startup, was to the /etc/network/interfaces file:

# interfaces(5) file used by ifup(8) and ifdown(8)
# Include files from /etc/network/interfaces.d:
source /etc/network/interfaces.d/*

auto eno1
iface eno1 inet dhcp

auto eno2
iface eno2 inet static
address 192.168.214.230
netmask 255.255.255.0

The interfaces do not work, and a kernel dump appears after a while.
note: during the ping test I pulled out then plugged in the UTP cable
console.txt

@jan-kiszka
Collaborator Author

Ok, we can rule out build issues on your side.

Some more things to rule out still: please have a look at my other suggestions.

@attila-hannibal

Using Chao's branch as source.
The iot2050 eno2 was directly connected to a workstation PC using static IPs:

  • iot2050 IP: 192.168.214.230
  • workstation IP: 192.168.214.20

This time eno1 was not connected.
test: pinging from iot2050 to workstation
The iot2050 tries to resolve the MAC for the IP, this is answered from the PC, see the attached capture.
arp.zip
Since the ARP responses are lost, the iot2050 continues the resolution.
The ARP table remains empty:
root@iot2050-debian:~# arp
Address HWtype HWaddress Flags Mask Iface
192.168.214.20 (incomplete) eno2

The ifconfig shows no RX data
eno2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.214.230 netmask 255.255.255.0 broadcast 192.168.214.255
inet6 fe80::8ef3:19ff:fe6c:ee42 prefixlen 64 scopeid 0x20
ether 8c:f3:19:6c:ee:42 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2872 bytes 809966 (790.9 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

The dmesg has the "icssg-prueth icssg0-eth eno2: timeout waiting for command done" line

So the error is present with simple point-to-point connection

@jan-kiszka
Collaborator Author

jan-kiszka commented Jan 20, 2023

Please don't change two variables at the same time (here: sources AND network setup).

Would still like to see

  • a test with the firmware from current master and basic driver update (69fae11)
  • a complete tcpdump taken on the device

In addition, please confirm

  • issue is gone when using current master
  • a second device does expose the same problem

@attila-hannibal

After the many clean recompilations I realized that we may have been using different SW than "Chao's branch"; it should be an older variant of your master branch. I guess in the past the image to be written was iot2050-image-example-iot2050-debian-iot2050.wic.img (with .img suffix), and the current build creates another filename without the .img suffix.
When I did the git branching and started the build, the old file remained in the build directory and I burned the old one to the USB stick.
The reason I think so is that the kernel version in the logs is 5.10.104-cip3, which was already replaced in October; both of your dev branches and master use 5.10.145.
Sorry, it was my mistake.

I have the same Ethernet ports non-working issue with the Chao branch, too.

Let me explain our test setups; we have two iot2050 instances:

  • The first device is installed in our test lab. It does not use Chao's source, but an older variant of the master branch compiled on 21st September 2022. This one produces the incoming RX packet loss at the driver level. The issue happens randomly (this is our original problem), but when it occurs, no packet at all is passed to the higher layers. So the RX side stalls right from the beginning, not just after X minutes or so. This setup is normally connected to switches, but last time we used a direct connection to the workstation. From this perspective there was only one change: we changed only the networking to simplify the environment (no DHCP, no switch, just a direct connection to the PC using a static IP).
  • The second device is at my desk, where I connect the eno1 interface to the corporate network that supports DHCP, while eno2 has a static IP and uses a direct connection to my laptop. I use this device when switching branches, and this setup produces the kernel dump; it really uses the latest build compiled from scratch. The non-working Ethernet ports occur with the jan/kernel-update and chao/catch-up-icssg-driver branches. When I use the current master branch state (e2f9367), the network interfaces work in terms of basic RX-TX data, and ping works. I am not sure about the "original problem" described above; we need to perform many restarts.

will do the logs you requested later.

@attila-hannibal

We may try another approach.
When the "stalled Ethernet port" problem happens we have this log in the dmesg: "icssg-prueth icssg0-eth eno2: timeout waiting for command done".
This error is written in the "emac_set_port_state()" function: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/drivers/net/ethernet/ti/icssg_config.c?h=ti-linux-5.10.y#n580
The function sends 4 commands (each command is a uint32_t) to the R30 / R32 register addresses, then polls the registers to see whether the TI firmware on the other end has cleared them. When any of the 4 registers is not cleared, the driver prints this error; this is what we get randomly.

So based on the symptom, it seems the TI firmware running on the PRU got stuck somehow and cannot handle the commands.

Do you know if we can have some debug possibility to verify if the firmware running on the PRU is "healthy" or have some issues?
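
The write-then-poll handshake described above can be illustrated with a small simulation. This is only a sketch of the general pattern, not the driver code: the function name `wait_for_command_done`, the timeout values and the `read_regs` callback are hypothetical, and the idle value 0xFFFF0000 (EMAC_NONE) is taken from the register dump discussed in this thread.

```python
import time

EMAC_NONE = 0xFFFF0000  # idle value of a command register, as named in this thread


def wait_for_command_done(read_regs, timeout_s=0.1, interval_s=0.001,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll until every command register reads EMAC_NONE or the timeout expires.

    read_regs is a hypothetical callback returning the current register values.
    Returns the indices of registers still pending (empty list means success).
    """
    deadline = clock() + timeout_s
    while True:
        values = read_regs()
        pending = [i for i, v in enumerate(values) if v != EMAC_NONE]
        if not pending:
            return []
        if clock() >= deadline:
            return pending  # corresponds to "timeout waiting for command done"
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated firmware that never clears the first register.
    stuck = lambda: [0xFFBB0000, EMAC_NONE, EMAC_NONE, EMAC_NONE]
    print(wait_for_command_done(stuck, timeout_s=0.01))
```

A non-empty result identifies which register the firmware did not acknowledge, which is the piece of information the single dmesg line does not give.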

@attila-hannibal

This is a command register dump taken when the problem occurred: the first value should be 0xffff0000 (EMAC_NONE), too, but it stays at 0xffbb0000.

root@DM8CF3196AE6B5:~# devmem2 0xb0005ac
/dev/mem opened.
Memory mapped at address 0xffffbebe5000.
Read at address  0x0B0005AC (0xffffbebe55ac): 0xFFBB0000
root@DM8CF3196AE6B5:~# ^C
root@DM8CF3196AE6B5:~# devmem2 0xb0005b0
/dev/mem opened.
Memory mapped at address 0xffffb2770000.
Read at address  0x0B0005B0 (0xffffb27705b0): 0xFFFF0000
root@DM8CF3196AE6B5:~# devmem2 0xb0005b4
/dev/mem opened.
Memory mapped at address 0xffff809b4000.
Read at address  0x0B0005B4 (0xffff809b45b4): 0xFFFF0000
root@DM8CF3196AE6B5:~# devmem2 0xb0005b8
/dev/mem opened.
Memory mapped at address 0xffff90003000.
Read at address  0x0B0005B8 (0xffff900035b8): 0xFFFF0000
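
Checking such a dump by eye is error-prone, so the comparison can be scripted. The following is a minimal sketch, assuming the devmem2 output format shown above and the idle value 0xFFFF0000 (EMAC_NONE) mentioned in this comment; the function name `stuck_registers` is made up for illustration.

```python
import re

EMAC_NONE = 0xFFFF0000  # expected idle value, per the comment above

# Matches lines like: "Read at address  0x0B0005AC (0xffffbebe55ac): 0xFFBB0000"
READ_LINE = re.compile(
    r"Read at address\s+0x([0-9A-Fa-f]+)\s+\(0x[0-9A-Fa-f]+\):\s+0x([0-9A-Fa-f]+)"
)


def stuck_registers(devmem2_output, idle=EMAC_NONE):
    """Return {physical_address: value} for registers that are not idle."""
    stuck = {}
    for match in READ_LINE.finditer(devmem2_output):
        address = int(match.group(1), 16)
        value = int(match.group(2), 16)
        if value != idle:
            stuck[address] = value
    return stuck


if __name__ == "__main__":
    dump = (
        "Read at address  0x0B0005AC (0xffffbebe55ac): 0xFFBB0000\n"
        "Read at address  0x0B0005B0 (0xffffb27705b0): 0xFFFF0000\n"
    )
    for address, value in stuck_registers(dump).items():
        print(f"0x{address:08X} = 0x{value:08X}")
```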

@jan-kiszka
Collaborator Author

I'm trying my contacts at TI. Maybe we will get some further hints on how best to analyze this.

@attila-hannibal

attila-hannibal commented Jan 24, 2023

Some further test results:
SW level: jan/kernel-update, this determines the TI pruss firmware: 08.06.00.001

test scenarios:

  1. eno1: dhcp, eno2: static, UTP cables connected at startup -> none of the interfaces are working
  2. eno1: dhcp, eno2: static, UTP cables not connected at startup, after login connect eno1, then eno2 -> eno1 does not work, eno2 works
  3. eno1: dhcp, eno2: static, UTP cables not connected at startup, after login connect eno2, then eno1 -> eno2 works, eno1 does not work
  4. eno1: dhcp, eno2: static, only the eno2 UTP cable is connected at startup -> eno2 works
  5. eno1: static, eno2: dhcp, UTP cables connected at startup -> both interfaces are working
  6. eno1: unconfigured, eno2: static, UTP cables connected at startup, after login configure eno1 (command: "ifconfig eno1 0.0.0.0 0.0.0.0 && dhclient") -> both interfaces are working
  7. eno1: dhcp, eno2: static, UTP cables connected at startup + using TI pruss firmware: 08.02.00.002 -> none of the interfaces are working
  8. eno1: dhcp, eno2: static, UTP cables connected at startup + using TI pruss firmware: 08.00.00.004 -> none of the interfaces are working

I think we can narrow down the issue: during boot, when eno1 is already connected and configured as dhcp, there seems to be some kind of deadlock that causes the issue. It seems only eno1 behaves this way (see scenario 5); also, when the interface is configured "late" (see scenario 6), it works. That may explain why our Linux image shows the intermittent behaviour: something depends on the timing and/or boot step sequence.

The 4 command registers for scenario 1 all seem fine:

root@iot2050-debian:~# memtool md 0xb0005ac
0b0005ac: ffff0000 ffff0000 ffff0000 ffff0000                ................
0b0005bc: 70020000 00000000 00000000 00000000                ...p............
0b0005cc: 00000000 00000000 00000000 00000000                ................

@applea9
Contributor

applea9 commented Jan 28, 2023

@attila-hannibal Have you tested the scenario where both eno1 and eno2 are static? I ran into an issue with this scenario before.

@BaochengSu BaochengSu mentioned this pull request Jan 28, 2023
@BaochengSu
Collaborator

@jan-kiszka I found that commit 4a80b17 really helps a lot with the rebase pain; however, it is irrelevant to this issue.

So I've just extracted it to a separate PR #414.

@BaochengSu
Collaborator

Hi @attila-hannibal,

We've tried these scenarios with the action build https://github.com/siemens/meta-iot2050/actions/runs/3948486240; however, none of them could be reproduced.

So there might be some subtle difference between our setup and yours that keeps us from reproducing the issue. It would be helpful to have the information below (some of it is just a double confirmation to make sure we are on the same page):

  • The reproduction rate for these scenarios. From your description I get the feeling that the issue is very easy to reproduce within these scenarios, i.e. you don't have to perform lots of reboots to trigger it.
  • Which hardware version? We were trying on two PG2 advanced iot2050 devices, with 1YA2 and FS: 04.
  • FW version in your setup. We've tried two different firmware versions: one is the released v01.03.01 firmware, the other is the firmware built from the above action.
  • Are there any additional peripherals such as a PCIe card, DP monitor, etc. in your setup? A picture of your device setup would be helpful. We only have the USB SD card reader and the FTDI UART cable connected in our setup.
  • What are the network segments for eno1 and eno2? We are using two different segments for static (192.168.200.0/24) and DHCP (192.168.1.0/24).
  • Are there any other images on the eMMC or in the SD card slot? This is just to avoid accidentally booting another image. You can check the current boot target via fw_printenv | grep boot_targets, or you can check the image build ID in the /etc/os-release file to make sure the currently booted image has build ID fd691b7.
  • For the static profile that is directly connected to the PC, what kind of Ethernet port are you using on the PC side: the PC's native port or a USB network adapter? We've tried both.
  • How do you define "not working" in this issue? Is it ping failing, the DHCP IP not showing up, or the static IP getting lost? We were using the ping command to determine working/non-working.
  • What is your DHCP environment? We assume it is your company DHCP?
  • Which network tool are you using? We are using nmtui and we think you are using /etc/network?
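
The build ID check from the list above can be scripted to rule out booting a stale image. This is a minimal sketch: it assumes the image stores its build ID in the standard `BUILD_ID` field of /etc/os-release, which has not been confirmed in this thread, and the function names are made up for illustration.

```python
def parse_os_release(text):
    """Parse KEY=VALUE lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_expected_build(os_release_text, expected="fd691b7"):
    """True if the BUILD_ID field (assumed field name) contains the expected ID."""
    return expected in parse_os_release(os_release_text).get("BUILD_ID", "")


if __name__ == "__main__":
    with open("/etc/os-release") as handle:
        print(is_expected_build(handle.read()))
```

A substring match is used because the exact format of the build ID string is not known here.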

@attila-hannibal

Hello @BaochengSu

I used the iot2050-example-image.zip as wic image from the build you linked. I copied it to a 64GB USB3 stick. I also replaced the SPI flash content with the iot2050-pg2-image-boot.bin from the build to have the same FW/SW as you have.
content of the /etc/network/interfaces files is the same as before:
root@iot2050-debian:~# cat /etc/network/interfaces
source /etc/network/interfaces.d/*

auto eno1
iface eno1 inet dhcp

auto eno2
iface eno2 inet static
address 192.168.214.230
netmask 255.255.255.0

(no other change has been made to your reference image)
Answering your questions one by one:

  1. the issue with the networking is persistent now. I tried 10 restarts and all the 10 were faulty. The failure is explained in point 8.

  2. I'm attaching a picture from the sticker on the machine. we use the same HW variant
    sticker

  3. I use now the one from the build artifacts

  4. nothing special, I'm attaching a picture. I use the pendrive as boot source / rootfs. The eMMC is erased. I have the 2 UTP cables connected to the networking ports, the debug UART and the 24V power supply
    20230202_142844

  5. the eno1 (DHCP) connection should obtain an IP address from network range: 10.23.0.*** , but it does not work, see 8. the eno2 (static) uses 192.168.214.*, see above the /etc/network/interfaces file

  6. There is no SD card at all, the eMMC is erased, the system boots from USB stick. I'm attaching the os-release file content:
    release.txt the build is correct!

  7. the static connection use an U-Green USB-Ethernet adapter, that has an "AX88179 Gigabit Ethernet" chip on it.

  8. "The problem" - that happens always when I use your latest image(s) - the eno1 DHCP connection does not get any IP address, nothing at all. Using the eno2 port the iot2050 cannot ping my laptop (laptop ip: 192.168.214.1), I executed ping on both machines, tcpdump on my machine shows no ARP response:
    14:43:48.621038 ARP, Request who-has 192.168.214.230 tell 192.168.214.1, length 28
    14:43:49.643279 ARP, Request who-has 192.168.214.230 tell 192.168.214.1, length 28
    14:43:50.668756 ARP, Request who-has 192.168.214.230 tell 192.168.214.1, length 28
    14:43:51.690955 ARP, Request who-has 192.168.214.230 tell 192.168.214.1, length 28
    14:43:52.715215 ARP, Request who-has 192.168.214.230 tell 192.168.214.1, length 28
    tcpdump on the iot2050 is attached in the console log:
    debug_console.txt
    On the iot2050 it seems the ARP responses are sent, but they are not visible on the other side. After a while a kernel dump happens and I get "icssg-prueth icssg0-eth eno2: xmit timeout" periodically.
    NOTE: as written before, if eno1 <-> eno2 are swapped in the /etc/network/interfaces file (and of course I swap the UTP cables), the problem does not appear. So please reproduce exactly the same network setup as I have.

  9. The DHCP is provided by the company I work for. What information do you need? See the note in point 8: if I swap eno1 and eno2, DHCP works. Also, there are about 40 machines on this network that are working flawlessly, therefore I think nothing is wrong with the DHCP environment.

  10. I use the systemd's built-in. See the timeout in the debug logs ( A start job is running for Raise network interfaces...)
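
The ARP observation in point 8 (requests on the wire, no replies) can be extracted from a tcpdump text capture automatically. This is a minimal sketch, assuming the default tcpdump line format shown in point 8 for requests and the usual "ARP, Reply <ip> is-at <mac>" form for replies; the function name is made up for illustration.

```python
import re
from collections import Counter

REQUEST = re.compile(r"ARP, Request who-has (\S+) tell")
REPLY = re.compile(r"ARP, Reply (\S+) is-at")


def unanswered_arp_targets(capture_text):
    """Return {target_ip: request_count} for IPs that never sent an ARP reply."""
    requested = Counter()
    answered = set()
    for line in capture_text.splitlines():
        request = REQUEST.search(line)
        if request:
            requested[request.group(1)] += 1
            continue
        reply = REPLY.search(line)
        if reply:
            answered.add(reply.group(1))
    return {ip: count for ip, count in requested.items() if ip not in answered}


if __name__ == "__main__":
    capture = (
        "14:43:48.621038 ARP, Request who-has 192.168.214.230 tell 192.168.214.1, length 28\n"
        "14:43:49.643279 ARP, Request who-has 192.168.214.230 tell 192.168.214.1, length 28\n"
    )
    print(unanswered_arp_targets(capture))
```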

Best regards
Attila

@jan-kiszka
Collaborator Author

Wait, I missed that so far:

content of the /etc/network/interfaces files is the same as before:
root@iot2050-debian:~# cat /etc/network/interfaces
source /etc/network/interfaces.d/*

auto eno1
iface eno1 inet dhcp

auto eno2
iface eno2 inet static
address 192.168.214.230
netmask 255.255.255.0

We are using Network Manager in the default image. I'm not sure whether /etc/network/interfaces.d is properly evaluated at all. And even if it is: there is also /etc/NetworkManager/system-connections/eno1-default.
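
A conflict of this kind, where the same interface is configured both in /etc/network/interfaces and in a NetworkManager connection file, can be detected with a short script. This is a minimal sketch: it assumes NetworkManager keyfiles with an `interface-name` entry in the `[connection]` section, and the keyfile content used in the example is hypothetical, not taken from the image.

```python
import configparser


def ifupdown_interfaces(interfaces_text):
    """Names that have an 'iface <name> ...' stanza, ignoring loopback."""
    names = set()
    for line in interfaces_text.splitlines():
        words = line.split()
        if len(words) >= 2 and words[0] == "iface" and words[1] != "lo":
            names.add(words[1])
    return names


def nm_interfaces(keyfile_texts):
    """Interface names bound by NetworkManager keyfiles ([connection] interface-name)."""
    names = set()
    for text in keyfile_texts:
        parser = configparser.ConfigParser(interpolation=None)
        parser.read_string(text)
        name = parser.get("connection", "interface-name", fallback=None)
        if name:
            names.add(name)
    return names


def doubly_configured(interfaces_text, keyfile_texts):
    """Interfaces claimed by both ifupdown and NetworkManager."""
    return sorted(ifupdown_interfaces(interfaces_text) & nm_interfaces(keyfile_texts))


if __name__ == "__main__":
    interfaces = "auto eno1\niface eno1 inet dhcp\n"
    # Hypothetical keyfile content for illustration only.
    keyfile = "[connection]\nid=eno1-default\ntype=ethernet\ninterface-name=eno1\n"
    print(doubly_configured(interfaces, [keyfile]))
```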

@attila-hannibal

Hello @jan-kiszka

Ok, I confirm the "persistent network issue" is solved by using the correct network-manager.
I cleaned the /etc/network/interfaces file and configured the /etc/NetworkManager/system-connections/ connection-files via the nmtui tool, the system works as expected.

Now we have to go back to our original problem when the network loss happened randomly, and when this problem happens we have the "icssg-prueth icssg0-eth eno2: timeout waiting for command done" in the dmesg log

@jan-kiszka
Collaborator Author

Great to hear. Hope the image can now help validate whether that issue is gone as well. Please let us know when there is news or there are further findings/questions.

@lyxsiemens
Contributor

After several power cycles the eno1 couldn't get IP address

@attila-hannibal
Could you tell me how you did the power cycle? By "reboot" command, reset button, or power cut?

@jan-kiszka jan-kiszka force-pushed the jan/kernel-update branch 2 times, most recently from b603787 to dc1cfec Compare February 3, 2023 13:19
These patches have been selected from ti-linux tag 08.02.00.006 which
corresponds to that SDK release. They roughly bring the icssg_prueth
driver to that level but also catch up in other not-yet-upstreamed
areas. This is the full diffstat of the kernel changes:

 arch/arm64/boot/dts/ti/k3-am65-main.dtsi   |   36 +
 drivers/dma/of-dma.c                       |   10 +
 drivers/dma/ti/dma-crossbar.c              |    6 +-
 drivers/dma/ti/k3-udma-glue.c              |  370 ++-
 drivers/dma/ti/k3-udma-private.c           |   39 +
 drivers/dma/ti/k3-udma.c                   | 2538 +++++++++++++++++----
 drivers/dma/ti/k3-udma.h                   |   27 +-
 drivers/firmware/ti_sci.c                  |    1 +
 drivers/firmware/ti_sci.h                  |    2 +
 drivers/irqchip/irq-pruss-intc.c           |   47 +-
 drivers/net/ethernet/ti/Kconfig            |    2 +
 drivers/net/ethernet/ti/Makefile           |    2 +-
 drivers/net/ethernet/ti/icss_iep.c         |    4 +-
 drivers/net/ethernet/ti/icss_iep.h         |    2 +-
 drivers/net/ethernet/ti/icss_mii_rt.h      |   93 +-
 drivers/net/ethernet/ti/icssg_classifier.c |    8 +
 drivers/net/ethernet/ti/icssg_config.c     |  485 +++-
 drivers/net/ethernet/ti/icssg_config.h     |   80 +-
 drivers/net/ethernet/ti/icssg_ethtool.c    |   84 +
 drivers/net/ethernet/ti/icssg_mii_cfg.c    |  105 +
 drivers/net/ethernet/ti/icssg_prueth.c     |  623 ++++-
 drivers/net/ethernet/ti/icssg_prueth.h     |   78 +-
 drivers/net/ethernet/ti/icssg_qos.c        |  476 ++++
 drivers/net/ethernet/ti/icssg_qos.h        |  136 ++
 drivers/net/ethernet/ti/icssg_switch_map.h |   11 +
 drivers/net/ethernet/ti/icssg_switchdev.c  |  494 ++++
 drivers/net/ethernet/ti/icssg_switchdev.h  |   13 +
 drivers/net/phy/dp83867.c                  |   15 +-
 drivers/pci/controller/dwc/pci-keystone.c  |    8 +-
 drivers/soc/ti/k3-ringacc.c                |  325 ++-
 include/linux/dma/k3-event-router.h        |   16 +
 include/linux/dma/k3-psil.h                |   16 +
 include/linux/dma/k3-udma-glue.h           |    8 +
 include/linux/dmaengine.h                  |   16 +
 include/linux/soc/ti/k3-ringacc.h          |   17 +
 35 files changed, 5494 insertions(+), 699 deletions(-)

The SDK came with 2 issues, one missing config SELECT and a regression
of half-duplex support for PG1. Related fixes are at the end of the
series.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

This brings us to the level of SDK 08.02.00.02, aligned with the current
kernel queue. Updating the firmware binaries separately from the kernel
is fine as the old firmware still worked with the newer driver.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

Now usable with the latest icssg_prueth driver.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

…0.001

This comes again primarily with fixes for the icssg-prueth and DMA bits
but also has two SDHCI fixes and a feature enhancement on the timer
side. Diffstat:

 drivers/clocksource/timer-ti-dm.c       |   8 ++
 drivers/dma/ti/k3-udma-private.c        |   6 +-
 drivers/dma/ti/k3-udma.c                | 100 +++++++++++++++--
 drivers/mmc/host/sdhci-cqhci.h          |  24 +++++
 drivers/mmc/host/sdhci_am654.c          | 183 ++++++++++++++++++++++++++++----
 drivers/net/ethernet/ti/icss_mii_rt.h   |   1 +
 drivers/net/ethernet/ti/icssg_ethtool.c |  46 +++++++-
 drivers/net/ethernet/ti/icssg_mii_cfg.c |  16 +++
 drivers/net/ethernet/ti/icssg_prueth.c  | 132 +++++++++++++++--------
 drivers/net/ethernet/ti/icssg_prueth.h  |   5 +
 drivers/soc/ti/k3-ringacc.c             |   2 +-
 11 files changed, 438 insertions(+), 85 deletions(-)

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

This aligns the firmware again to the ti-linux kernel.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

Allows to drop 3 patches that were merged into stable meanwhile.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

…nux version 08.06.00.004

Brings AF_XDP support for icssg-prueth, though without zero-copy so
far.

Rename the ti-pruss-firmware recipe along this. Nothing changed there,
just reflect that firmware is still in sync with the kernel driver.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

@jan-kiszka
Collaborator Author

@BaochengSu, @AsuraZeng, can we proceed with the MR? It addresses what #374 was fixing, avoids related regressions, and aligns with latest ti-linux (now 08.06.004, as recommended by TI). It may just not resolve all issues of the prueth, but that should be shared with ti-linux at this point.

@BaochengSu
Collaborator

Given that TI is about to release the 8.6 SDK soon, I am ok to proceed with 8.6 catchup.

@BaochengSu BaochengSu merged commit a3f036f into master Feb 15, 2023
@jan-kiszka jan-kiszka deleted the jan/kernel-update branch February 15, 2023 10:42
@jan-kiszka jan-kiszka mentioned this pull request Feb 21, 2023
Development

Successfully merging this pull request may close these issues.

Ethernet Port X1P2 RX is 0
6 participants