panic: runtime error: invalid nil pointer dereference when point-to-point interface has nil dst address #292

Closed
pdcastro opened this issue Apr 8, 2022 · 5 comments


pdcastro commented Apr 8, 2022

$ journalctl -au balena
...
Apr 06 18:29:07 balenad[508945]: panic: runtime error: invalid memory address or nil pointer dereference
Apr 06 18:29:07 balenad[508945]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1272ec4]

Further details are in the linked support thread.

On investigation, we found that the bug is in the third-party netlink library and has already been fixed upstream by the following pull request:

vishvananda/netlink#665
IFA_ADDRESS is to be used as the peer address if it differs from IFA_LOCAL.
Therefore, include the check for "no IFA_ADDRESS" in the difference check.
Example: ppp interfaces can contain IFA_LOCAL and no IFA_ADDRESS attribute

Related reference: https://stackoverflow.com/questions/4678637/what-is-difference-between-ifa-local-and-ifa-address-in-rtnetlink-linux
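
To illustrate the check described in that pull request, here is a minimal Go sketch. It is a simplified model, not the netlink library's actual code: the `ifaAttrs` type and the `resolve` and `mustCIDR` helpers are hypothetical names invented for this example. It only shows why the "no IFA_ADDRESS" case has to be part of the difference check.

```go
package main

import (
	"fmt"
	"net"
)

// ifaAttrs is a hypothetical stand-in for the address attributes of one
// rtnetlink address message. A nil field means the attribute was absent.
type ifaAttrs struct {
	Local   *net.IPNet // IFA_LOCAL
	Address *net.IPNet // IFA_ADDRESS
}

// resolve returns the interface's own address and its peer address.
// peer is nil unless IFA_ADDRESS is present and differs from IFA_LOCAL.
func resolve(a ifaAttrs) (own, peer *net.IPNet) {
	if a.Local == nil {
		// Without IFA_LOCAL, IFA_ADDRESS is the interface's own address.
		return a.Address, nil
	}
	// The nil check comes first, so a missing IFA_ADDRESS is never dereferenced.
	if a.Address != nil && !a.Local.IP.Equal(a.Address.IP) {
		return a.Local, a.Address
	}
	return a.Local, nil
}

// mustCIDR parses s or panics; it only exists to keep the example short.
func mustCIDR(s string) *net.IPNet {
	ip, n, err := net.ParseCIDR(s)
	if err != nil {
		panic(err)
	}
	n.IP = ip // keep the host address rather than the network address
	return n
}

func main() {
	// The ppp0 case from this issue: IFA_LOCAL present, IFA_ADDRESS absent.
	own, peer := resolve(ifaAttrs{Local: mustCIDR("10.164.233.243/32")})
	fmt.Println("own:", own.String(), "peer is nil:", peer == nil)
}
```

Comparing `a.Local.IP` against `a.Address.IP` without the nil check would dereference a nil pointer whenever the attribute is missing, which is the kind of failure reported in this issue.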

Known workaround

This is not yet confirmed at the time of writing, but based on the linked support thread, I believe the immediate cause of the engine panic is the P-t-P:0.0.0.0 value on the ppp0 interface:

$ ifconfig
ppp0      Link encap:Point-to-Point Protocol  
          inet addr:10.164.233.243  P-t-P:0.0.0.0  Mask:255.255.255.255

I suspect that setting a value other than 0.0.0.0 would avoid the engine panic. In the linked support thread, I understand that interface ppp0 was associated with a cellular (gsm) internet connection, for which NetworkManager reported an IPv4 gateway of 0.0.0.0:

root@d3ba86f:~# nmcli device show
GENERAL.DEVICE:                         ttyS0
GENERAL.TYPE:                           gsm
GENERAL.HWADDR:                         (unknown)
GENERAL.MTU:                            1500
GENERAL.STATE:                          100 (connected)
GENERAL.CONNECTION:                     cellular
GENERAL.CON-PATH:                       /org/freedesktop/NetworkManager/ActiveConnection/1
IP4.ADDRESS[1]:                         10.164.233.243/32
IP4.GATEWAY:                            0.0.0.0
IP4.ROUTE[1]:                           dst = 0.0.0.0/0, nh = 0.0.0.0, mt = 20700
IP4.DNS[1]:                             194.151.228.34
IP4.DNS[2]:                             194.151.228.18

I believe it is unusual for IP4.GATEWAY to be 0.0.0.0, as shown above. I suspect that this IP4.GATEWAY value corresponds to P-t-P:0.0.0.0 in the ifconfig output for the ppp0 interface, and that setting a value other than 0.0.0.0 for IP4.GATEWAY would work around the engine panic.
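
As a rough diagnostic aid, the Go sketch below lists interfaces flagged as point-to-point and checks whether an address string is the all-zero value. The function names `isUnspecified` and `pointToPointNames` are made up for this example. As far as I know, Go's standard `net` package does not expose the peer (P-t-P) address, so that value still has to be read from `ifconfig` or `nmcli` and passed in by hand.

```go
package main

import (
	"fmt"
	"net"
)

// isUnspecified reports whether s parses as an all-zero address such as
// 0.0.0.0, the P-t-P and IP4.GATEWAY value reported in this issue.
func isUnspecified(s string) bool {
	ip := net.ParseIP(s)
	return ip != nil && ip.IsUnspecified()
}

// pointToPointNames returns the names of interfaces flagged point-to-point.
func pointToPointNames() ([]string, error) {
	ifaces, err := net.Interfaces()
	if err != nil {
		return nil, err
	}
	var names []string
	for _, ifc := range ifaces {
		if ifc.Flags&net.FlagPointToPoint != 0 {
			names = append(names, ifc.Name)
		}
	}
	return names, nil
}

func main() {
	names, err := pointToPointNames()
	if err != nil {
		fmt.Println("cannot list interfaces:", err)
		return
	}
	fmt.Println("point-to-point interfaces:", names)
	// Value copied by hand from the ifconfig output for ppp0.
	fmt.Println("P-t-P value is unset:", isUnspecified("0.0.0.0"))
}
```

A device that lists ppp0 here and shows an all-zero P-t-P value would match the conditions I suspect trigger the panic, though that link is still unconfirmed.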

@jellyfish-bot

[pdcastro] This issue has attached support thread https://jel.ly.fish/8e060183-5225-4114-9bfb-43469299a6dd

fossejc commented Apr 8, 2022

Hi,

I kept testing, and reverting to balenaOS 2.85.2+rev3 seems to work.

I have no idea why or how.

I hope it can help solve the issue.

Thanks

pipex commented Apr 8, 2022

You are correct. It seems the bug was introduced when balena-engine was updated to upstream moby v20.10.12, which shipped in balenaOS v2.94.0. So the bug would not be present in balenaOS 2.85.2, which has balena-engine v19.03.30.

lmbarros added a commit that referenced this issue Apr 11, 2022
This new version of netlink includes a number of bugfixes, including a
fix to #292.

Signed-off-by: Leandro Motta Barros <leandro@balena.io>
Change-type: patch
@jellyfish-bot

[majorz] This has attached https://jel.ly.fish/b8cffc99-86da-4961-88af-51ecbb4aa590

lmbarros added a commit that referenced this issue Dec 20, 2022
balenaEngine initialization would fail on devices connected to the
network via PPP (Point-to-Point Protocol) and with a nil destination
address (0.0.0.0):

    panic: runtime error: invalid memory address or nil pointer dereference

This commit updates the netlink dependency to a fork where we
cherry-picked the correction. This correction wasn't available in any
stable release of netlink, so we opted to use this fork instead of
relying on a beta netlink version. We can obsolete our fork once Moby
starts using a netlink version that includes the fix.

Fixes #292

Signed-off-by: Leandro Motta Barros <leandro@balena.io>
Change-type: patch

alexgg commented Mar 27, 2024

A workaround is to enable balena-host from boot, before the network interfaces come up. To do that, modify the balena-host.service unit and add:

[Install]
WantedBy=multi-user.target

And then do:

systemctl enable balena-host
reboot

That should bring the host engine up from boot and allow the hostOS to be updated to a patched version.
