Orange Pi 5 Plus | Network interface names can swap on reboot #6592

Closed
dirkhh opened this issue Sep 1, 2023 · 22 comments

Comments

@dirkhh
Contributor

dirkhh commented Sep 1, 2023

Creating a bug report/issue

Required Information

  • DietPi version | cat /boot/dietpi/.version
G_DIETPI_VERSION_CORE=8
G_DIETPI_VERSION_SUB=22
G_DIETPI_VERSION_RC=-1
G_GITBRANCH='adsb-feeder'
G_GITOWNER='dirkhh'

this branch is bdde5bd plus one commit from me

  • Distro version | bookworm
  • Kernel version | Linux DietPi 5.10.160-rk35xx #1 SMP Thu Jul 6 13:36:29 UTC 2023 aarch64 GNU/Linux
  • SBC model | Orange Pi 5 Plus (aarch64)
  • Power supply used | YU0505 5A 5V power adapter with barrel->USB-C adapter
  • SD card used | SanDisk Extreme 64G

Additional Information (if applicable)

  • Software title | N/A
  • Can this issue be replicated on a fresh installation of DietPi? | YES

Steps to reproduce

  1. compile from source - one commit past current HEAD of the dev branch; error manifests before my code runs
  2. write to SD card
  3. boot
  4. hit enter, log in and wait for the first run to start

Expected behaviour

  • first run should complete to the point where user input is needed (keyboard layout, etc)

Actual behaviour

  • IP4 connectivity test fails

Extra details

  • simply asking for a subshell and running dhclient eth0 is sufficient to get an IP address
  • exiting the subshell allows the install to complete
@dirkhh
Contributor Author

dirkhh commented Sep 1, 2023

... and after this happened three times in a row I just started yet another install and this time it worked... so I guess I should edit the title and description to include a 'sometimes' or something... 🤷🏼‍♂️

@Joulinar
Collaborator

Joulinar commented Sep 1, 2023

Can you check the following as soon as the issue happens:

journalctl -u ifup@eth0.service

@dirkhh
Contributor Author

dirkhh commented Sep 1, 2023

Interestingly, the journals for both ifup@eth[01].service units are empty.
Looking through the full system journal I notice that dhclient is started, but fails.
And an ip link list eth0 showed that it thought that the port had no carrier.
I checked at the router and curiously, the router claims that this is a 100Mb/s connection, while this is a 2.5Gb/s port.
Next I plugged into eth1 instead and manually ran dhclient eth1. And got an IP address in a few seconds. For this connection, the switch indicated that this was indeed a 1Gb/s connection (I don't have a 2.5G switch).

So this may be a kernel issue where the initialization of the wired ethernet ports isn't reliable? Maybe?
Not sure. Happy to provide any other debugging help you can think of.
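
The carrier and link-speed observations above can also be read directly from sysfs, which makes it easier to compare reboots. A minimal sketch, assuming the interfaces are named eth*; `link_report` is a hypothetical helper name, and the optional argument only exists so the function can be pointed at a copy of the sysfs tree:

```shell
# Print carrier state and negotiated speed for every eth* interface.
# $1: sysfs network directory (defaults to /sys/class/net)
link_report() (
    sysfs_net="${1:-/sys/class/net}"
    for dev in "$sysfs_net"/eth*; do
        [ -e "$dev" ] || continue
        name=${dev##*/}
        # carrier is 1 when a link is detected; it is unreadable while the interface is down
        carrier=$(cat "$dev/carrier" 2>/dev/null || echo unknown)
        # speed is reported in Mbit/s and may be unreadable without a link
        speed=$(cat "$dev/speed" 2>/dev/null || echo unknown)
        printf '%s carrier=%s speed=%s\n' "$name" "$carrier" "$speed"
    done
)
```

Running `link_report` right after a failed boot should show which of the two ports the kernel considers connected, and at which speed.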

@MichaIng
Owner

MichaIng commented Sep 7, 2023

Generally, verify that allow-hotplug eth0 or allow-hotplug eth1 is uncommented in /etc/network/interfaces. On first boot, we loop through all detected network adapters, enable them one by one, and configure the first one with a carrier signal. If no carrier signal is detected, we try to configure the first network adapter, i.e. eth0, but this likely fails as long as the signal does not come up.

Interesting that dhclient is started, as this should only happen if any of the two ifup@ services is started as well (or if an auto eth[01] entry is present, which triggers configuration via networking.service instead).
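
A quick way to run that check is to grep for active (uncommented) allow-hotplug or auto lines. This is only a sketch: `iface_triggers` is a hypothetical helper name, and it does not follow files pulled in via `source`, so drop-ins in /etc/network/interfaces.d/ need to be passed explicitly.

```shell
# Print the uncommented "allow-hotplug" / "auto" lines that mention an interface.
# $1: interface name, $2: config file (defaults to /etc/network/interfaces)
iface_triggers() (
    iface="$1"
    file="${2:-/etc/network/interfaces}"
    # The keyword may be followed by several interface names on one line
    grep -E "^[[:space:]]*(allow-hotplug|auto)[[:space:]]+(.*[[:space:]])?$iface([[:space:]]|\$)" "$file"
)
```

For example, `iface_triggers eth0` prints `allow-hotplug eth0` when that line is active, and nothing when it is commented out.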

I'll try to replicate on an OPi 5 Plus.

@dirkhh
Contributor Author

dirkhh commented Sep 7, 2023

So this happens on about one out of three reboots for me... Otherwise I love my OPi5+

@MichaIng
Owner

MichaIng commented Oct 10, 2023

So I guess the network adapters are detected in an inconsistent order and hence can be assigned different interface names each time. I'll try to replicate it and test some possible fixes.

That is the downside of non-predictable/classic interface names: They are simple and self-explanatory, but assigned just in the order of detection, instead of by some unique identifier which is guaranteed to remain the same.

Not the same, but similar: #6663

@MichaIng MichaIng added this to the v8.23 milestone Oct 10, 2023
@MichaIng MichaIng modified the milestones: v8.23, v8.24 Oct 21, 2023
@ethanbowering24

Experiencing the same issue, documented here.

Inconsistent assignment of eth0 and eth1 to network adapters, observed by comparing MAC addresses with ip addr show.
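
The same comparison can be scripted so that the output of two boots can be diffed. A rough sketch, assuming eth* names and that each interface's sysfs `device` link points at its bus device (a PCI address on this board); `iface_identity` is a hypothetical helper name:

```shell
# Print "<interface> <MAC> <bus address>" for every eth* interface.
# $1: sysfs network directory (defaults to /sys/class/net)
iface_identity() (
    sysfs_net="${1:-/sys/class/net}"
    for dev in "$sysfs_net"/eth*; do
        [ -e "$dev/address" ] || continue
        mac=$(cat "$dev/address")
        # Resolve the "device" symlink and keep only its last path component
        bus=$(basename "$(readlink -f "$dev/device")")
        printf '%s %s %s\n' "${dev##*/}" "$mac" "$bus"
    done
)
```

If the names swap between reboots, the MAC and bus address columns trade places while the interface column stays the same.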

If anyone knows of a workaround I would love to hear it, even if it includes disabling the second network adapter as I do not need it.

@MichaIng MichaIng modified the milestones: v8.24, v8.25 Nov 19, 2023
@MichaIng MichaIng changed the title from "no IP address when first install is run" to "Orange Pi 5 Plus | Network interface names can swap on reboot" Nov 19, 2023
@MichaIng MichaIng modified the milestones: v8.25, v9.0 Dec 20, 2023
@MichaIng MichaIng modified the milestones: v9.0, v9.1 Jan 20, 2024
@slynetG

slynetG commented Feb 3, 2024

Having the same issue. Is there any news ? Thanks 🤗

@Joulinar
Collaborator

Joulinar commented Feb 3, 2024

a possible workaround could be #6663 (comment)

@slynetG

slynetG commented Feb 3, 2024

Tried the workaround you mentioned @Joulinar, but did not help. I'm out of ideas...
The best option would be to disable one of the 2 interfaces, but I don't think that's possible on an OPi5+.

@yandritos

Hi slynetG, I am suffering from exactly the same problem on my Orange Pi 5 Plus (OS installed on the NVMe disk). The only workaround I found is to keep both Ethernet ports connected. That way I can safely reboot the Orange Pi when I am not at home (one of the ports gets the fixed IP address); otherwise I run the risk of losing the connection. In my experience the port does not switch randomly but sequentially: before I connected both ports, a different port was enabled after each reboot.

@MichaIng MichaIng modified the milestones: v9.1, v9.2 Feb 20, 2024
@slynetG

slynetG commented Feb 24, 2024

Thanks Yandritos. With DHCP I managed too, using a reservation, but I do need a fixed IP (and DNS)... and I did not manage that. On 2 boots out of 5, I get the wrong interface assignment and therefore no connection :(

@AVasiljev

As a workaround, you can install ifenslave (# apt-get install ifenslave)

and add a bond interface:

root@Dunav:/etc/network/interfaces.d# cat ifcfg-bond0
auto bond0

iface bond0 inet static
    address 192.168.32.3
    netmask 255.255.255.0
    network 192.168.32.0
    gateway 192.168.32.1
    bond-slaves eth0 eth1
    bond-mode active-backup
    bond-miimon 100
    dns-nameservers 192.168.32.1 192.168.32.3 192.168.42.1

allow-bond0 eth0
iface eth0 inet manual

allow-bond0 eth1
iface eth1 inet manual
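
To see which port the bond is actually using after a reboot, the bonding driver's status file can be read. A small sketch, assuming the bond is named bond0 as in the config above; `bond_active_slave` is a hypothetical helper name:

```shell
# Print the interface currently carrying traffic in an active-backup bond.
# $1: bonding status file (defaults to /proc/net/bonding/bond0)
bond_active_slave() (
    status="${1:-/proc/net/bonding/bond0}"
    # The driver reports the active member on a "Currently Active Slave:" line
    sed -n 's/^Currently Active Slave: //p' "$status"
)
```

With this setup it no longer matters whether the connected port came up as eth0 or eth1, since the bond uses whichever member has a link.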

@MichaIng MichaIng modified the milestones: v9.2, v9.3 Mar 17, 2024
@MichaIng MichaIng modified the milestones: v9.3, v9.4 Apr 16, 2024
@1001Zebras

Awwww, you're a lifesaver with bonding the interfaces. Seems to have done the trick beautifully. Thank you! This has been plaguing me for a year now with this little board (which is otherwise amazing!)

It is curious how the Orange Pi 5 Plus treats the two interfaces. Like, they're two separate interfaces, but they kind of act as one at times, even without the bonding. Not sure I quite understand what's going on with the hardware and the driver interface there.

Does this kind of bonding automatically perform link aggregation? Like, if I were to connect both ports to two ports on the same switch, would I get double the bandwidth and then a failover if one NIC completely loses connection? Or would that be a different sort of config?

@AVasiljev

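It will, if the switch supports it. Note that this needs an aggregating bond mode such as 802.3ad; the active-backup mode in the config above only provides failover.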

@izenn

izenn commented May 6, 2024

i ran into this and had issues with the bonding solution (it would randomly drop for some odd reason). the solution i ended up with was to create udev rules attached to the mac addresses so that the names would be the same every time.
here is my 99-net-names.rules:

SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="c0:74:2b:xx:yy:zz", ATTR{dev_id}=="0x0", ATTR{type}=="1", NAME="net0"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="c0:74:2b:aa:bb:cc", ATTR{dev_id}=="0x0", ATTR{type}=="1", NAME="net1"

this creates a net0 and net1 interface that will be the same every time. you need to avoid ethx and wlanx because a race condition can happen between udev and the kernel
once this was done then i configured my interfaces file like this:

# Location: /etc/network/interfaces
# Please modify network settings via: dietpi-config
# Or create your own drop-ins in: /etc/network/interfaces.d/

# Drop-in configs
source interfaces.d/*

auto net0

iface net0 inet static
    address 192.168.1.x
    netmask 255.255.255.0
    gateway 192.168.1.1
    dns-nameservers 1.1.1.1 1.0.0.1

i've been rock solid ever since
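
For anyone adapting the MAC-based rules in this comment, the rule lines can be generated from the addresses the board reports instead of being typed by hand. A sketch only: `mac_rule` is a hypothetical helper, and the `dev_id`/`type` matches are simply copied from the rules above.

```shell
# Emit one udev rule that pins an interface name to a MAC address.
# $1: MAC address, $2: desired interface name
mac_rule() (
    # sysfs reports MAC addresses in lowercase, so normalise the input
    mac=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    name="$2"
    printf 'SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="%s", ATTR{dev_id}=="0x0", ATTR{type}=="1", NAME="%s"\n' "$mac" "$name"
)
```

Something like `mac_rule "$(cat /sys/class/net/eth0/address)" net0` prints a line that can be appended to 99-net-names.rules, after checking that the address belongs to the port you mean.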

@MichaIng
Owner

MichaIng commented May 8, 2024

Yep, such a udev rules solution is what we are going to do as well. But we'll use eth0/eth1, to remain compatible with dietpi-config and consistent. And ideally we need to find an identifying attribute other than the MAC address, else the rules could not be pre-created on our images.

@MichaIng
Owner

MichaIng commented May 12, 2024

I found a solution via udev rules which does work. It was not so easy, since udev cannot swap device names via NAME. Both names eth0 and eth1 already exist when the rules are executed, and attempting to rename eth0 to eth1 then fails, as the latter already exists, and vice versa. In the past this did work, as udev automatically used intermediate names in these cases, but that functionality has been removed.

The following however works well here:

cat << '_EOF_' > /etc/udev/rules.d/99-dietpi-orangepi5plus.rules
SUBSYSTEM=="net", KERNEL=="to_eth0", RUN:="/bin/true"
SUBSYSTEM=="net", KERNEL=="eth0", KERNELS=="0004:41:00.0", RUN:="/bin/true"
SUBSYSTEM=="net", KERNEL=="eth1", KERNELS=="0003:31:00.0", NAME="to_eth0", RUN="/bin/ip l s dev eth0 name eth1", RUN+="/bin/ip l s dev to_eth0 name eth0", RUN+="/bin/udevadm trigger -c add /sys/class/net/eth0 /sys/class/net/eth1"
_EOF_

What we do is ensure that the Ethernet adapter at PCI bus 3 (the one next to the USB-C power port) is named eth0 and the one at PCI bus 4 is named eth1.

  • Starting with the 3rd rule: If the adapter at PCI bus 3 is named eth1 (hence not what we want), it is renamed to an intermediate to_eth0.
  • Next, eth0 is renamed to eth1, which is now possible, and finally to_eth0 to eth0.
  • Since assigning a NAME triggers a udev ADD event, the first rule ensures that this intermediate interface does not trigger any commands, like ifupdown-hotplug; hence we replace the command queue with a dummy /bin/true.
  • Similarly, we do not want eth0 at PCI bus 4 to trigger any command, as it will be renamed and may not exist when its commands run.
  • Also, eth1 at PCI bus 3 shall run only our commands, hence RUN= at first, and RUN+= to add our own commands to the cleared queue.
  • Instead, after all renaming has finished, we trigger the ADD event for both interfaces manually.

These rules with the KERNELS filter apply only if the adapters were not detected in the "right" order, i.e. did not get the device/interface names we want them to have. So this rename hassle is done only when needed.
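
Stripped of the udev specifics, the swap is the usual three-step exchange through a temporary name. The following dry-run sketch only prints the ip commands involved; `swap_plan` is a hypothetical helper and not part of the rules above, and running the printed commands by hand assumes root and both interfaces being down.

```shell
# Print the ip commands needed to swap two interface names via a temporary name.
# $1, $2: the two interface names, $3: temporary name (defaults to to_<first>)
swap_plan() (
    a="$1"
    b="$2"
    tmp="${3:-to_$a}"
    # 1. Move the second interface out of the way, freeing its name
    printf 'ip link set dev %s name %s\n' "$b" "$tmp"
    # 2. The first interface can now take the freed name
    printf 'ip link set dev %s name %s\n' "$a" "$b"
    # 3. Give the parked interface the name the first one just released
    printf 'ip link set dev %s name %s\n' "$tmp" "$a"
)
```

`swap_plan eth0 eth1` prints the same sequence the rules go through: eth1 to to_eth0, eth0 to eth1, then to_eth0 to eth0.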

Also, it should not break any existing workaround you may have applied via /etc/network/interfaces or similarly in userspace. But it might break your udev rules, depending on their alphanumerical order and filters/method. E.g. @izenn, your rules should remain effective, regardless of whether they are applied before or after ours. When they are applied before ours, our rules won't apply anymore, due to the KERNEL=="eth*" filter, which does not match net* interface names. If they apply afterwards, since they have no interface/device name filter, they will rename to_eth0 and eth0 where applicable, just like they would have renamed eth1 and eth0 otherwise. Since there are no eth0 and eth1 interfaces anymore when the RUN commands are executed (which happens after all rules have otherwise been applied), the ip and udevadm commands from our rules will just fail and do nothing. Also, since NAME="net*" triggers a new udev ADD event, ifupdown-hotplug and possibly other commands will run as intended. Our rule clears/overrides the command queue for the original interface names only.

It would still be great if you could test it. I'll implement them in the DietPi update today, but show a dialogue which informs about this, so one can react when needed.


It is quite a hacky solution, and the fact that udev does not natively support swapping interface names indicates that there may be related issues when one interface matches the original kernel-assigned device name of another interface. However, I do not see another solution until we have reworked our network setup script to support "predictable interface naming", i.e. names like enp3p49s0, or end0 with recent kernel versions if the adapter is declared as an embedded one.

@MichaIng
Owner

A little update:

cat << '_EOF_' > /etc/udev/rules.d/99-dietpi-orangepi5plus.rules
SUBSYSTEM=="net", KERNEL=="eth0", KERNELS=="0004:41:00.0", RUN:="/bin/true"
SUBSYSTEM=="net", KERNEL=="eth1", KERNELS=="0003:31:00.0", NAME="to_eth0", RUN:="/bin/true"
SUBSYSTEM=="net", KERNEL=="to_eth0", RUN="/bin/ip l s dev eth0 name eth1", RUN+="/bin/ip l s dev to_eth0 name eth0", RUN+="/bin/udevadm trigger -c add /sys/class/net/eth0 /sys/class/net/eth1"
_EOF_

While it worked well with the legacy kernel (Linux 5.10.160), with the vendor kernel (Linux 6.1) the PCIe device queue somehow blocked and finally completely prevented the eth1 RUN queue. I am not sure about the logic, but I have now moved the final commands into the to_eth0 udev rule to detach them from the original device, which also looks cleaner. This also ensures that, if udev RUN commands are cancelled for whatever reason, any to_eth0 udev trigger will finish the renaming.

The vendor kernel has another issue, causing an infinite udev event loop for an extcon7 device, which is related to one of the HDMI ports:

root@OrangePiZero3:~# find / -name '*extcon7*'
/sys/class/extcon/extcon7
/sys/devices/platform/fdee0000.hdmirx-controller/extcon/extcon7
root@OrangePiZero3:~# journalctl | tail -10
May 12 21:16:55 OrangePi5Plus (udev-worker)[1553]: extcon7: Device processed (SEQNUM=11457, ACTION=change)
May 12 21:16:55 OrangePi5Plus (udev-worker)[1553]: extcon7: sd-device-monitor(worker): Passed 268 byte to netlink monitor.
May 12 21:16:55 OrangePi5Plus systemd-udevd[417]: No events are queued, removing /run/udev/queue.
May 12 21:16:55 OrangePi5Plus systemd-udevd[417]: extcon7: Device is queued (SEQNUM=11458, ACTION=change)
May 12 21:16:55 OrangePi5Plus systemd-udevd[417]: extcon7: Device ready for processing (SEQNUM=11458, ACTION=change)
May 12 21:16:55 OrangePi5Plus systemd-udevd[417]: extcon7: sd-device-monitor(manager): Passed 268 byte to netlink monitor.
May 12 21:16:55 OrangePi5Plus (udev-worker)[1553]: extcon7: Processing device (SEQNUM=11458, ACTION=change)
May 12 21:16:55 OrangePi5Plus (udev-worker)[1553]: extcon7: Device processed (SEQNUM=11458, ACTION=change)
May 12 21:16:55 OrangePi5Plus (udev-worker)[1553]: extcon7: sd-device-monitor(worker): Passed 268 byte to netlink monitor.
May 12 21:16:55 OrangePi5Plus systemd-udevd[417]: No events are queued, removing /run/udev/queue.

This causes udev to reload all rules every ~3 seconds, and a constantly active (or regenerated several times a second) udev worker process is consuming 5% CPU usage.
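
A loop like this is easier to spot by counting queued events per device than by scrolling through the journal. A rough sketch, assuming journal lines in the format shown above (which requires udev debug logging to be active); `udev_queue_counts` is a hypothetical helper name:

```shell
# Read journal lines on stdin and print "<count> <device>" for every
# device with "Device is queued" messages, busiest device first.
udev_queue_counts() {
    sed -n 's/.*: \([^: ]*\): Device is queued.*/\1/p' |
        sort | uniq -c |
        # uniq -c pads its counts with spaces; normalise to "<count> <device>"
        awk '{ print $1, $2 }' |
        sort -rn
}
```

For example, `journalctl -u systemd-udevd --since "1 min ago" | udev_queue_counts` would be expected to list extcon7 far ahead of everything else on an affected system.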

The linux-image-vendor-rk35xx/linux-dtb-vendor-rk35xx packages are available from our repository, but they are clearly not yet ready to be applied on all RK3588 systems. I'll have a look into things, when I find time, and regularly push updates to our repo, so feel free to test them. But I do not recommend them for a production system yet.

MichaIng added a commit that referenced this issue May 12, 2024
- Orange Pi 5 Plus | Resolved an issue where the Ethernet interface names eth0/eth1 could swap on (re)boot. We add a udev rule which assures they are named persistently based on the PCI bus identifier. In case you created own udev rules to mitigate the issue, please check whether they conflict and decide whether to use ours or your solution. The DietPi update will also inform you about this change with a prompt. Many thanks to @dirkhh and many others for reporting this issue: #6592
@izenn

izenn commented May 13, 2024

looks like the new udev rules happened after my net0/1 rules so the interfaces were recreated back to eth0/1. Not the end of the world, just something to note. i'll be testing more tomorrow

@MichaIng
Owner

MichaIng commented May 13, 2024

"looks like the new udev rules happened after my net0/1 rules so the interfaces were recreated back to eth0/1"

If they happen after yours, they would not apply anymore, as eth0 and eth1 then do not exist anymore. Also if they happen before yours, it should not be an issue: Yours override the names, and the scheduled RUN commands of ours just fail, since the interfaces do not exist anymore.

But I am probably missing something. To check what happens, you could uncomment udev_log=info in /etc/udev/udev.conf and then check the udev logs via journalctl -u systemd-udevd or dmesg (AFAIK). But it is a huge number of logs to go through, so it makes sense to filter:

journalctl -u systemd-udevd --grep 'eth|net'

Or just remove our rules. Just keep in mind that dietpi-config and some dietpi-software options do not recognise these interface names correctly.
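
The logging step mentioned in this comment could be scripted roughly as follows. A sketch under assumptions: `enable_udev_info_log` is a hypothetical helper, it expects the stock commented `#udev_log=...` line to be present, and it relies on GNU sed's `-i` option.

```shell
# Turn a commented "#udev_log=..." line into an active "udev_log=info".
# $1: udev config file (defaults to /etc/udev/udev.conf)
enable_udev_info_log() (
    conf="${1:-/etc/udev/udev.conf}"
    # Only the commented udev_log line is rewritten; other comments are left alone
    sed -i 's/^#[[:space:]]*udev_log=.*/udev_log=info/' "$conf"
)
```

The change most likely only takes effect after systemd-udevd has been restarted or the system rebooted, after which the filtered journalctl command above can be used.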

@MichaIng
Owner

Since it is generally solved, I'll close this issue. For particular issues or questions regarding the udev rules, feel free to keep posting here or open a new issue.
