Raspberry Pi 4: DHCPCD route socket overflowed #4092

Open
jwillmer opened this issue Jan 23, 2021 · 25 comments

@jwillmer

Describe the bug
Every now and then my Pi loses its IPv6 connectivity. I found out that I can fix the issue temporarily via systemctl restart dhcpcd. Today it happened again and I used systemctl status dhcpcd to look at the state. I got the following output:

Warning: The unit file, source configuration file or drop-ins of dhcpcd.service changed on disk. Run 'systemctl daemon-reload' to reload units.
● dhcpcd.service - dhcpcd on all interfaces
   Loaded: loaded (/lib/systemd/system/dhcpcd.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/dhcpcd.service.d
           └─wait.conf
   Active: failed (Result: signal) since Fri 2021-01-22 15:36:35 CET; 1 day 6h ago
  Process: 340 ExecStart=/usr/lib/dhcpcd5/dhcpcd -q -w (code=exited, status=0/SUCCESS)
 Main PID: 484 (code=killed, signal=SEGV)

Jan 22 15:36:35 home-server dhcpcd[484]: veth2c6abe3: waiting for carrier
Jan 22 15:36:35 home-server dhcpcd[484]: vethf713b46: IAID 69:de:ae:f1
Jan 22 15:36:35 home-server dhcpcd[484]: vethf713b46: adding address fe80::e7........7:52f9
Jan 22 15:36:35 home-server dhcpcd[484]: veth88ef6b4: waiting for carrier
Jan 22 15:36:35 home-server dhcpcd[484]: veth514e931: waiting for carrier
Jan 22 15:36:35 home-server dhcpcd[484]: route socket overflowed - learning interface state
Jan 22 15:36:35 home-server dhcpcd[484]: vethf32a0ec: carrier acquired
Jan 22 15:36:35 home-server dhcpcd[484]: vethf32a0ec: IAID bf:44:26:9c
Jan 22 15:36:35 home-server systemd[1]: dhcpcd.service: Main process exited, code=killed, status=11/SEGV
Jan 22 15:36:35 home-server systemd[1]: dhcpcd.service: Failed with result 'signal'.

I don't have enough knowledge about Linux to say that this is the right channel for this issue. Please be kind and redirect me if this issue is completely off topic.

To reproduce
I don't know. I can't find a pattern, it just happens now and then.

System

  • Which model of Raspberry Pi?
    Raspberry Pi 4

  • Which OS and version (cat /etc/rpi-issue)?
    Raspberry Pi reference 2020-08-20
    Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 9a3a10bf1019ebb2d59053564dc6b90068bad27d, stage2

  • Which firmware version (vcgencmd version)?
    Jan  7 2021 18:27:29
    Copyright (c) 2012 Broadcom
    version fb345a0c2d5544957f4ba1a2b9e968970e3312c4 (clean) (release) (start)

  • Which kernel version (uname -a)?
    Linux home-server 5.10.6-v7l+ #1393 SMP Mon Jan 11 15:09:41 GMT 2021 armv7l GNU/Linux

Additional context
I am only running Docker containers on the Pi. I boot the OS from an SSD, but I only started doing that recently and I had the issue before as well.

@alaub81

alaub81 commented Jan 24, 2021

I have a similar problem. It seems that the dhcpcd version shipped with Raspberry Pi OS, which is:

dhcpcd 8.1.2
Copyright (c) 2006-2019 Roy Marples
Compiled in features: INET ARP ARPing IPv4LL INET6 DHCPv6 AUTH

fails if there are too many network interfaces. It seems like you also have Docker running on the Pi.

On my side, the dhcpcd and dhcpcd5 daemons will not come up on a fresh reboot. I have about 10 Docker containers running in several docker-compose projects. If I shut down the compose projects before rebooting the Pi (it's a Pi 4), then everything works fine and the daemon comes up.

I read in another forum that this is a known bug and should be fixed in a newer dhcpcd version. So dhcpcd fails if there are too many network interfaces.

Perhaps somebody has a workaround or even a fix for that.
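
A rough way to check whether a Pi is hitting the condition described above is to count the overflow messages in the dhcpcd log and compare that with the number of veth interfaces. The sketch below is only an illustration: count_overflows is a hypothetical helper name, and the message text is taken from the log in the first post.

```shell
# Hypothetical helper: read log text on stdin and print how many lines
# contain the overflow message quoted in this issue.
count_overflows() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so the exit status is ignored here.
    grep -c 'route socket overflowed' || true
}

# Usage on a real system (assumes systemd's journal and iproute2):
#   journalctl -u dhcpcd --no-pager | count_overflows
#   ip -o link show type veth | wc -l    # number of veth interfaces
```

A non-zero count together with many veth interfaces would match the pattern reported here; it does not prove that this is the cause.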

@jwillmer
Author

@alaub81 do you have a link to that issue so that I can track the progress?

@alaub81

alaub81 commented Jan 25, 2021

@jwillmer I just read it here: forums.gentoo.org

@FP2K-Minske

Fastest workaround:
sudo nano /etc/dhcpcd.conf
Insert the following line at the end:
denyinterfaces veth*

It excludes the virtual container interfaces from dhcpcd.
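
For anyone who prefers to script the edit instead of using nano, here is a minimal sketch. It assumes the config file already exists, and add_deny_veth is a hypothetical helper name. It appends the line only when it is missing, so running it twice is harmless. Note that later comments in this thread report needing the line at the top of the file instead.

```shell
# Hypothetical helper: append the workaround line to a dhcpcd config file
# only if an identical line is not already there.
add_deny_veth() {
    local conf="$1"
    local line='denyinterfaces veth*'
    if grep -qxF -- "$line" "$conf"; then
        return 0
    fi
    # Make sure the file ends with a newline before appending.
    if [ -n "$(tail -c 1 "$conf")" ]; then
        printf '\n' >> "$conf"
    fi
    printf '%s\n' "$line" >> "$conf"
}

# Usage on a real system (as root), followed by a service restart:
#   add_deny_veth /etc/dhcpcd.conf
#   systemctl restart dhcpcd
```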

@alaub81

alaub81 commented Feb 6, 2021

@FP2K-Minske thank you, just tried it right now and it seems to work :-)

@theunreal89

I had the same exact issue with Docker and denying veth interfaces solved the issue for me! Much appreciated ;)

@daniel-asilva

daniel-asilva commented May 9, 2021

Here to confirm that FP2K's workaround also worked for me. I was tracking this problem and I can say this solution also works.

In addition, I observed that this problem occurs every time the DHCP lease expires (4h by default). The dhcpcd service crashes and the Raspberry Pi goes offline but stays powered on. I have the exact same scenario: multiple Docker interfaces. Fortunately, the two workarounds mentioned above fix this.

EDIT: After @cpannwitz's comment below, I have to clarify. I've tested both solutions I mentioned above individually. I didn't apply both simultaneously.

@cpannwitz

cpannwitz commented May 17, 2021

In addition to the fixes by @FP2K-Minske and @daniel-asilva (both fixes applied),
I had to reload the systemd daemon and restart dhcpcd:

sudo systemctl daemon-reload
sudo systemctl restart dhcpcd

afterwards, because there were complaints about changed conf files on disk,
which resulted in the same problem, dhcpcd not working after reboot.

EDIT: In my case, applying BOTH fixes (see above) did NOT work. I had to remove the fix posted by @daniel-asilva and move denyinterfaces veth* to the top of the /etc/dhcpcd.conf file.
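
Moving the line to the top by hand is easy to get wrong if it already exists further down, so here is a minimal sketch of one way to do it. deny_veth_first is a hypothetical helper name, and the sketch assumes the config file already exists. It rewrites the file so the line appears exactly once, as the first line.

```shell
# Hypothetical helper: rewrite a dhcpcd config file so that the
# 'denyinterfaces veth*' line appears exactly once, as the first line.
deny_veth_first() {
    local conf="$1"
    local line='denyinterfaces veth*'
    local tmp
    tmp="$(mktemp)" || return 1
    {
        printf '%s\n' "$line"
        # Copy every other line unchanged; grep exits non-zero if none remain.
        grep -vxF -- "$line" "$conf" || true
    } > "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Usage on a real system (as root), followed by the reload and restart above:
#   deny_veth_first /etc/dhcpcd.conf
#   systemctl daemon-reload
#   systemctl restart dhcpcd
```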

gradientskier pushed a commit to gradientskier/btcpayserver-docker that referenced this issue Dec 12, 2021
dhcpcd and docker: edit solution taken from raspberrypi/linux#4092

@moracabanas

moracabanas commented Mar 19, 2022

This started happening to me when I set a static IP for wlan0 from the GUI.

I see the same thing happening in this thread: https://raspberrypi.stackexchange.com/questions/58809/rpi-loses-its-wlan0-configuration-when-any-docker-container-is-started/117381#117381

It was solved by disabling DHCP for virtual interfaces with the denyinterfaces veth* trick in /etc/dhcpcd.conf. Make sure you add it to the top of the file and reboot. Otherwise it won't work.

But I suspect that when you set only the IP, it looks for the rest of the configuration over all networks, including the new veth ones.

I will try to confirm this by setting all the static config fields the GUI asks for and see what happens.

This is so crazy. I was getting TLS and socket reset errors in my stack, and I was thinking for a month it was my stack issue.

@d-rez

d-rez commented Mar 26, 2022

and I was thinking for a month it was my stack issue.

Just for a month? :P I've been having network crashes (same symptoms, Docker Swarm cluster) for well over a year, going back even to Raspberry Pi 3 kernels, and I could never pinpoint this in any way.
System logs were very ambiguous, and since I run them headless I always assumed they crashed until I realised I could still access them over another address/IP protocol (I run IPv6 and 2x attached VLANs on each raspi, so technically 4x addresses per Pi: 2x IPv4, 1x IPv6 public and 1x IPv6 ULA)

If this actually works I'll be ecstatic :D Thanks for posting this workaround folks!

@seamusdemora

This started happening to me when I set a static IP for wlan0 from the GUI.

I see the same thing happening in this thread: https://raspberrypi.stackexchange.com/questions/58809/rpi-loses-its-wlan0-configuration-when-any-docker-container-is-started/117381#117381

It was solved by disabling DHCP for virtual interfaces with the denyinterfaces veth* trick in /etc/dhcpcd.conf. Make sure you add it to the top of the file and reboot. Otherwise it won't work.

There was another Q posted to RPi recently that involved strange issues with docker services. I don't use docker services, and would have ignored the question except that the title of the Q implied network issues. I eventually gave a rather elaborate and tutorial answer that was primarily to make this point: Do not use dhcpcd's static ip option.

This shouldn't be controversial (or so I thought) as the author of dhcpcd says in man dhcpcd.conf:

For IPv4, you should use the inform ipaddress option instead of setting a static address.

The OP didn't provide any feedback; I don't know if he resolved his issue or not. But I ran across this thread, and wanted to ask a question, hopefully to get some feedback.

In the first line of this quote, it seems that you are using the static ip option, and so my question is this: Instead of static ip, have you tried either the request or inform options? If so, did that have any effect on the docker issues?
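
For readers comparing the two approaches, the difference in /etc/dhcpcd.conf looks roughly like the fragment below. This is only a sketch: the interface name and the address are placeholders, and whether switching options has any effect on the Docker issue is exactly the open question above.

```
# /etc/dhcpcd.conf (fragment; wlan0 and 192.168.0.10/24 are placeholder values)
interface wlan0

# Static configuration, which the man page advises against for IPv4:
#static ip_address=192.168.0.10/24

# Suggested alternative: sends a DHCP INFORM, which notifies the DHCP
# server of the address in use instead of obtaining a lease
inform 192.168.0.10/24
```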

@ykun91

ykun91 commented Jul 25, 2022

Encountered the same problem, and after a week of searching I finally found the answer here. It seems that this issue is still not fixed.

I have two Raspberry Pi 4s with Raspberry Pi OS (64-bit) installed, both running Docker, and both lose their Ethernet connection after 2-3 days of uptime. I have to manually reboot them every time to recover...

pi@rasp-2:log $ uname -a
Linux rasp-2 5.15.32-v8+ #1538 SMP PREEMPT Thu Mar 31 19:40:39 BST 2022 aarch64 GNU/Linux

Jul 23 15:16:28 rasp-2 dhcpcd[721]: veth4bd92c9: soliciting an IPv6 router
Jul 23 15:16:28 rasp-2 dhcpcd[721]: veth73f0901: waiting for carrier
Jul 23 15:16:28 rasp-2 dhcpcd[721]: vethb3ac390: soliciting a DHCP lease
Jul 23 15:16:28 rasp-2 dhcpcd[721]: veth91f9038: soliciting a DHCP lease
Jul 23 15:16:28 rasp-2 dhcpcd[721]: veth576f779: waiting for carrier
Jul 23 15:16:28 rasp-2 dhcpcd[721]: route socket overflowed - learning interface state
Jul 23 15:16:28 rasp-2 dhcpcd[721]: veth7777561: carrier acquired

@eric-pierce

I'd like to note that this is still an active issue, and the fix mentioned in #4092 does resolve it. I wish it hadn't taken me weeks to find this thread, but I'm very happy I did. I'm curious whether this also persists on alternative distros like DietPi.

@bfren

bfren commented Aug 24, 2022

Same here - thanks @FP2K-Minske - doesn't even feel very hacky - simply telling dhcpcd not to do something that it probably sensibly tries to do by default.

@denwald

denwald commented Dec 2, 2022

Been experiencing a similar problem too. Headless Raspberry Pi 4 with 10 Docker containers (and corresponding veth* interfaces). It was regularly losing connectivity on eth0 after a couple of weeks of uptime. The only suspicious thing I could find in the logs is the mentioned "DHCPCD route socket overflowed" message.

I will try the "denyinterfaces" option for dhcpcd 🤞.

@ferrarimarco

ferrarimarco commented Feb 16, 2023

Maybe updating dhcpcd to a version >= 9.2.0 could also help. There are a few interesting notes in the changelog of that version that seem related...

The latest dhcpcd version available on Raspberry Pi OS is 8.1.2.
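
To check an installed version against that 9.2.0 threshold, a small comparison helper can be used. This is a minimal sketch: version_ge is a hypothetical helper name, and it relies on sort -V (version sort), which GNU coreutils provides.

```shell
# Hypothetical helper: succeed when version $1 is >= version $2.
# Relies on 'sort -V' (version sort) from GNU coreutils.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 exactly
    # when that smaller one is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage on a real system; assumes the first line of 'dhcpcd --version'
# looks like "dhcpcd 8.1.2", as in the output quoted earlier in this thread:
#   installed="$(dhcpcd --version | awk 'NR == 1 { print $2 }')"
#   if version_ge "$installed" 9.2.0; then echo "9.2.0 or newer"; else echo "older than 9.2.0"; fi
```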

@dulitz

dulitz commented May 27, 2023

Just wanted to register another vote to bring dhcpcd up to a version more recent than 2019, as there have been a lot of improvements since then.

@bfren

bfren commented May 28, 2023

I ended up using raspi-config to switch to NetworkManager on all my Pis instead of dhcpcd - not had a problem since.

@doodlebro

This bug is especially strange since I don't use IPv6 anywhere, but it seems like 6+ independent containers create the conditions for overflow.

@areksobiczewski

The issue is still persistent. I realized that after enabling IPv6 in my network and running ~10 Docker containers. Initially I thought there was a DHCP issue on my router (too short a DHCP lease time), but it's like in this thread: at some point dhcpcd gives up and stops renewing DHCP leases regardless of how the DHCP server is configured. It took me several hours to debug the matter with dhcpcd and find this thread. It's a very confusing kind of error!

Meanwhile, I'm using what @bfren has proposed above: network-manager instead of dhcpcd.

Either dhcpcd should be updated or network-manager should be the default in the OS. Otherwise a user is going to face strange networking issues that are hard to troubleshoot whenever they want to do some more serious work with the Pi and networking :-(

@seamusdemora

@areksobiczewski , et al

Please note that issues with dhcpcd are likely impacted by the fact that the RPi powers-zat-bee decided some time ago to stick with an old, no-longer-maintained-upstream version of dhcpcd. That left all bug-fixes for dhcpcd as the responsibility of someone in the RPi organization - or maybe a volunteer?? At any rate - in my experience, no one seemed to give a rat's-a$$ if it was maintained or not.

That's not meant as criticism, but only as a plain statement of fact.

@bfren

bfren commented Mar 9, 2024

Indeed. The decision has been made, for whatever reasons (however good - presumably there are consequences to using later versions of dhcpcd?). What I don't understand is why NetworkManager isn't simply made the default - are there consequences to using it that I'm not aware of?

If not, given doing that would easily fix the strange and definitely hard to troubleshoot issues caused by having relatively few Docker containers, I don't see why it hasn't been done.

@ferrarimarco

It seems that the release notes for the latest Raspberry Pi OS version (based on Bookworm), contain this line:

  * NetworkManager used instead of dhcpcd as networking interface; various changes made to networking plugin to support this

@bfren

bfren commented Mar 9, 2024

@ferrarimarco that is curious. When I used the Bookworm image to install a new Pi 5, the default was still dhcpcd and I had to change it using raspi-config.

@JS-E

JS-E commented Jul 13, 2024

Fastest workaround:
sudo nano /etc/dhcpcd.conf
Insert the following line at the end:
denyinterfaces veth*

It excludes the virtual container interfaces from dhcpcd.

Really appreciate this, I've been tearing my hair out trying to figure out why this wasn't working and this fixed it. Thanks again! 👍
