Description
bjonglez:
I've been trying to debug this regression in 18.06.8: in some circumstances, rpcd fails to start.
This is mostly visible because it breaks LuCI; see e.g. openwrt/luci#3773 or https://forum.openwrt.org/t/luci-error-after-upgrade-to-r10949-or-r10951-etc-config-luci-seems-to-be-corrupt/56880
To reproduce on 18.06.8:
- remove the ''rpcd'' section in ''/etc/config/rpcd''
- reboot (this is important)
- result: ''rpcd'' is not started, and the following log message is printed in ''logread'':
Thu Feb 27 21:26:37 2020 daemon.info procd: Not starting instance rpcd::instance1, command not set
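The reproduction steps above can be sketched as a small shell function. This is only a sketch: the function name ''reproduce_rpcd_failure'' is made up for illustration, and it assumes the ''rpcd'' section is the first section of that type in ''/etc/config/rpcd'' (hence the ''@rpcd[0]'' index).

```shell
#!/bin/sh
# Hypothetical helper name; wraps the three reproduction steps.
reproduce_rpcd_failure() {
    # Remove the 'rpcd' section (assumed to be the first of its type).
    uci delete 'rpcd.@rpcd[0]' || return 1
    # Write the change to /etc/config/rpcd.
    uci commit rpcd || return 1
    # The reboot is important: the failure only shows up afterwards.
    reboot
}
```

After the reboot, ''logread | grep "command not set"'' should show the ''procd'' message quoted above if the issue is triggered.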
More details:
When this issue happens, it becomes impossible to start ''rpcd'' with ''procd'', even when adding back the ''rpcd'' section:
root@OpenWrt:~# PROCD_DEBUG=1 /etc/init.d/rpcd start
{ "name": "rpcd", "script": "\/etc\/init.d\/rpcd", "instances": { "instance1": { "command": [ "\/sbin\/rpcd" ] } }, "triggers": [ ], "data": { } }
root@OpenWrt:~# ps | grep rpc
1614 root 1200 S grep rpc
root@OpenWrt:~# uci add rpcd rpcd
cfg027c4e
root@OpenWrt:~# uci set rpcd.@rpcd[-1].timeout=30
root@OpenWrt:~# uci commit
root@OpenWrt:~# PROCD_DEBUG=1 /etc/init.d/rpcd start
{ "name": "rpcd", "script": "/etc/init.d/rpcd", "instances": { "instance1": { "command": [ "/sbin/rpcd", "-t", "30" ] } }, "triggers": [ ], "data": { } }
root@OpenWrt:~# ps | grep rpc
1636 root 1200 S grep rpc
root@OpenWrt:~# uci set rpcd.@rpcd[-1].socket=/var/run/ubus.sock
root@OpenWrt:~# uci commit
root@OpenWrt:~# PROCD_DEBUG=1 /etc/init.d/rpcd start
{ "name": "rpcd", "script": "/etc/init.d/rpcd", "instances": { "instance1": { "command": [ "/sbin/rpcd", "-s", "/var/run/ubus.sock", "-t", "30" ] } }, "triggers": [ ], "data": { } }
root@OpenWrt:~# ps | grep rpc
1680 root 1200 S grep rpc
However, running ''rpcd'' manually works perfectly well (and fixes LuCI):
root@OpenWrt:~# rpcd
Workaround:
To work around the issue, it is necessary to:
- add a ''rpcd'' section with either a ''socket'' or ''timeout'' option
- reboot
At this point, ''rpcd'' is started correctly, and everything works fine. It is even possible to delete the ''rpcd'' section again and restart ''rpcd'': it will still start correctly.
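The workaround can be sketched as follows, reusing the ''uci'' commands from the session above. The function name ''apply_rpcd_workaround'' is hypothetical, and the ''timeout'' value of 30 is simply the one used earlier in this report; a ''socket'' option should work as well.

```shell
#!/bin/sh
# Hypothetical helper name; wraps the workaround steps.
apply_rpcd_workaround() {
    # Add a new 'rpcd' section to /etc/config/rpcd.
    uci add rpcd rpcd || return 1
    # Give it a 'timeout' option (a 'socket' option is the alternative).
    uci set 'rpcd.@rpcd[-1].timeout=30' || return 1
    uci commit rpcd || return 1
    # A reboot is needed before procd starts rpcd again.
    reboot
}
```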
Finding the root cause:
There are very few commits between 18.06.7 and 18.06.8. None of these commits touches ''procd'' or ''rpcd''.
However, there has been a libubox fix in 82fbd85. This is currently the prime suspect: I will try to revert this commit, and also try with the further libubox fixes that have not yet been backported.