Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A runtime error occured: /usr/lib/lua/luci/model/network.lua:187: attempt to index a nil value #2027

Closed
fnarfbargle opened this issue Aug 1, 2018 · 12 comments

Comments

@fnarfbargle
Copy link

fnarfbargle commented Aug 1, 2018

For whatever reason I am unable to re-open : #1649
I am again running a recent git head build :
commit 2a598bb
Author: Rafał Miłecki rafal@milecki.pl
Date: Fri Jul 27 21:54:08 2018 +0200

I have tracked this down to luci.ip returning nil for the the ipc.link(ifname) call.

I inserted debugging code in network.lua prior to the failing line (local phy_only = force_phy_only or (ipc.link(ifname).type ~= 1)) :

                    if ipc.link(ifname) == nil then                                                                
                            utl.perror("ipc.link(ifname) is nil " .. ifname)                                                                                                   
                    end                                                                                            

And got : Tue Jul 31 18:59:46 2018 daemon.err uhttpd[2337]: ipc.link(ifname) is nil wlan0

The subsequent call (ipc.link(ifname).type ~= 1) then succeeded.

So for whatever reason on this device, very intermittently (like once a day) that call will fail and return nil, which network.lua then attempts to index with the .type call and it fails.

I attached the requested network interface dump to that report, but as I've been unable to re-open it I'll reference it here : https://github.com/openwrt/luci/files/1897666/network.interface.dump.txt

I'm now not sure where to go next. Anyone have any ideas?

@jow-
Copy link
Contributor

jow- commented Aug 1, 2018

The weird thing is that ipc.link() actually cannot return nil, except when there's an out of memory issue: https://github.com/openwrt/luci/blob/master/libs/luci-lib-ip/src/ip.c#L1498

The only condition under which link() does not return at least an empty table is if (!msg || !cb)

@fnarfbargle
Copy link
Author

Yep, understand that and have already trawled through that code.
I put some basic instrumentation in and logged several thousand calls per day, with at most one failure. The very peculiar thing is it will fail on the first call (always wlan0) and immediately succeed on the second. So call exactly the same thing twice and the first one fails but the second succeeds.

So by putting that debug in place I never see a failure backtrace. I only see the "is nil wlan0" as it will fail on the debug line and the next test passes.

As I said in the previous issue, I'm fairly sure it's not a memory related issue :

root@WRT32x:~# free
total used free shared buffers cached
Mem: 510936 226216 284720 548 4060 12632
-/+ buffers/cache: 209524 301412

This box literally does nothing other than shunt packets between the wireless and wired LAN. It's a very expensive AP.

It has me stuffed, but as long as I have a web tab with the status overview up it happens pretty much every day.

I'll put some instrumentation in ip.so to find out which allocation is failing and compile up a new image to test.

Debug printing from lua is easy enough. Is there a reference for getting a debug print out of the linked library you could point me towards?

@jow-
Copy link
Contributor

jow- commented Aug 2, 2018

It should be enough to use fprintf(stderr, "msg=%p, cb=%p\n", msg, cb); in the library to make the output appear in logread.

I suspect the problem can be "fixed" with a change like this:

diff --git a/libs/luci-lib-ip/src/ip.c b/libs/luci-lib-ip/src/ip.c
index 854a0c09c..7cf3cd444 100644
--- a/libs/luci-lib-ip/src/ip.c
+++ b/libs/luci-lib-ip/src/ip.c
@@ -1517,8 +1517,10 @@ static int link_get(lua_State *L)
        struct nl_cb *cb = nl_cb_alloc(NL_CB_DEFAULT);
        struct ifinfomsg ifm = { .ifi_index = if_nametoindex(dev) };
 
+       lua_newtable(L);
+
        if (!msg || !cb)
-               return 0;
+               return 1;
 
        nlmsg_append(msg, &ifm, sizeof(ifm), 0);
 
@@ -1526,8 +1528,6 @@ static int link_get(lua_State *L)
        nl_cb_set(cb, NL_CB_FINISH, NL_CB_CUSTOM, cb_done, &st);
        nl_cb_err(cb, NL_CB_CUSTOM, cb_error, &st);
 
-       lua_newtable(L);
-
        nl_send_auto_complete(sock, msg);
 
        while (st.pending > 0)

But Ideally I'd like to understand the underlying problem first. Maybe its not the amount of free memory but memory fragmentation, or some bad caching / allocation behaviour in libnl-tiny.

@fnarfbargle
Copy link
Author

fnarfbargle commented Aug 3, 2018

Ok, failure to allocate isn't the problem.

I put the fprintf in and got the lua error 3 times last night, but never once triggered the print in ip.c.
I re-compiled this morning with a hack to force the fprintf to print just to be sure it would actually leave data in the logs and it did.

So the error is not due to a failure to allocate those structures.

@fnarfbargle
Copy link
Author

Ok, the error is failure in nl_connect(). On failure it is returning -6 :

Fri Aug 3 16:16:33 2018 daemon.err uhttpd[2337]: no nl_connect -6
Fri Aug 3 16:16:33 2018 daemon.err uhttpd[2337]: ipc.link(ifname) is nil wlan0

In libnl, -6 is NLE_EXIST "Object exists".
So the socket is created, nl_connect returns "Object exists", this uses _error() to push 3 elements onto the lua stack with the first element being 'nil', which may explain ipc.link(ifname) returning nil. I assume I should be able to pull the error number and description off the Lua stack also. Is this the right way to return an error to lua, as it doesn't seem to be handled?

Plenty of instances splattered across google of similar errors using libnl. Some bletcherous examples of workarounds using loops and waits. I'll keep poking.

@jow-
Copy link
Contributor

jow- commented Aug 3, 2018

Yeah, one common pattern for returning errors in Lua library function is nil, errno, errstr or nil, errstr.
The fault is in the calling Lua code of network.lua. It simply assumed that the call can never fail.

@fnarfbargle
Copy link
Author

Amusingly I updated the WRT1900ACv2 to OpenWrt 18.06.1 r7258-5eb055306f (from the repositories, not compiled locally) a couple of weeks ago and I'm periodically seeing exactly the same error at the same location in the logs.

That router is actually working as a router and is considerably more busy with a completely different configuration, but precisely the same error. Odd that I'm apparently the only person seeing this.

I suppose I'd better take a look at the lua->C call chain and figure out how to gracefully fail that call.

@dannil
Copy link
Contributor

dannil commented Jan 19, 2019

I just encountered this error about 20 minutes ago, on a R7800 running r9019-0e8d5ff (git-19.011.54860-47d9205)

Failed to execute template dispatcher target for entry '/admin/status/overview'.
The called action terminated with an exception:
/usr/lib/lua/luci/template.lua:97: Failed to execute template 'admin_status/index'.
A runtime error occurred: /usr/lib/lua/luci/model/network.lua:203: attempt to index a nil value
stack traceback:
        /usr/lib/lua/luci/model/network.lua:203: in function '_wifi_iwinfo_by_ifname'
 	/usr/lib/lua/luci/model/network.lua:1664: in function '__init__'
 	/usr/lib/lua/luci/util.lua:65: in function 'wifinet'
 	/usr/lib/lua/luci/model/network.lua:1598: in function 'callback'
 	/usr/lib/lua/luci/model/uci.lua:269: in function 'foreach'
 	/usr/lib/lua/luci/model/network.lua:1595: in function 'get_wifinets'
 	/usr/lib/lua/luci/tools/status.lua:147: in function 'wifi_networks'
 	[string "/usr/lib/lua/luci/view/admin_status/index.h..."]:45: in main chunk
 stack traceback:
 	[C]: in function 'error'
 	/usr/lib/lua/luci/template.lua:97: in function </usr/lib/lua/luci/template.lua:85>
 	(tail call): ?
 	/usr/lib/lua/luci/dispatcher.lua:849: in function </usr/lib/lua/luci/dispatcher.lua:848>

Line 203 is the above mentioned local phy_only = force_phy_only or (ipc.link(ifname).type ~= 1)

Memory is also not an issue on my device, taken shortly after discovery.

root@OpenWrt:~# free
              total        used        free      shared  buff/cache   available
Mem:         477372      121768      341596        1416       14008      325228
Swap:             0           0           0

@r3t3ch
Copy link

r3t3ch commented Mar 14, 2019

Just to add another potential data point in the event it has any relation:

Wed Mar 13 10:35:47 2019 daemon.err uhttpd[1418]: luci: accepted login on /admin/network/wireless for root from 192.168.1.12
Wed Mar 13 10:56:57 2019 daemon.err uhttpd[1418]: Failed to execute call dispatcher target for entry '/admin/network/wireless_status/radio0.network1,radio1.network1'.
Wed Mar 13 10:56:57 2019 daemon.err uhttpd[1418]: The called action terminated with an exception:
Wed Mar 13 10:56:57 2019 daemon.err uhttpd[1418]: /usr/lib/lua/luci/model/network.lua:203: attempt to index a nil value
Wed Mar 13 10:56:57 2019 daemon.err uhttpd[1418]: stack traceback:
Wed Mar 13 10:56:57 2019 daemon.err uhttpd[1418]: 	/usr/lib/lua/luci/model/network.lua:203: in function '_wifi_iwinfo_by_ifname'
Wed Mar 13 10:56:57 2019 daemon.err uhttpd[1418]: 	/usr/lib/lua/luci/model/network.lua:1664: in function '__init__'
Wed Mar 13 10:56:57 2019 daemon.err uhttpd[1418]: 	/usr/lib/lua/luci/util.lua:65: in function </usr/lib/lua/luci/util.lua:61>
Wed Mar 13 10:56:57 2019 daemon.err uhttpd[1418]: 	(tail call): ?
Wed Mar 13 10:56:57 2019 daemon.err uhttpd[1418]: 	/usr/lib/lua/luci/tools/status.lua:185: in function 'wifi_network'
Wed Mar 13 10:56:57 2019 daemon.err uhttpd[1418]: 	/usr/lib/lua/luci/controller/admin/network.lua:277: in function </usr/lib/lua/luci/controller/admin/network.lua:270>
Wed Mar 13 10:56:57 2019 daemon.err uhttpd[1418]: 	(tail call): ?
Wed Mar 13 10:57:17 2019 daemon.err uhttpd[1418]: Failed to execute call dispatcher target for entry '/admin/network/wireless_status/radio0.network1,radio1.network1'.
Wed Mar 13 10:57:17 2019 daemon.err uhttpd[1418]: The called action terminated with an exception:
Wed Mar 13 10:57:17 2019 daemon.err uhttpd[1418]: /usr/lib/lua/luci/model/network.lua:203: attempt to index a nil value
Wed Mar 13 10:57:17 2019 daemon.err uhttpd[1418]: stack traceback:
Wed Mar 13 10:57:17 2019 daemon.err uhttpd[1418]: 	/usr/lib/lua/luci/model/network.lua:203: in function '_wifi_iwinfo_by_ifname'
Wed Mar 13 10:57:17 2019 daemon.err uhttpd[1418]: 	/usr/lib/lua/luci/model/network.lua:1664: in function '__init__'
Wed Mar 13 10:57:17 2019 daemon.err uhttpd[1418]: 	/usr/lib/lua/luci/util.lua:65: in function </usr/lib/lua/luci/util.lua:61>
Wed Mar 13 10:57:17 2019 daemon.err uhttpd[1418]: 	(tail call): ?
Wed Mar 13 10:57:17 2019 daemon.err uhttpd[1418]: 	/usr/lib/lua/luci/tools/status.lua:185: in function 'wifi_network'
Wed Mar 13 10:57:17 2019 daemon.err uhttpd[1418]: 	/usr/lib/lua/luci/controller/admin/network.lua:277: in function </usr/lib/lua/luci/controller/admin/network.lua:270>
Wed Mar 13 10:57:17 2019 daemon.err uhttpd[1418]: 	(tail call): ?

LuCI Master (git-19.058.07653) / OpenWrt SNAPSHOT r9456

@fmikker
Copy link

fmikker commented Mar 20, 2019

This issue has been bugging me for a while since my monitoring system alerts me on errors from the router.

Adding the latest message in my system log as well:

Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: The called action terminated with an exception:
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: /usr/lib/lua/luci/template.lua:97: Failed to execute template 'admin_status/index'.
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: A runtime error occured: /usr/lib/lua/luci/model/network.lua:203: attempt to index a nil value
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: stack traceback:
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	/usr/lib/lua/luci/model/network.lua:203: in function '_wifi_iwinfo_by_ifname'
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	/usr/lib/lua/luci/model/network.lua:1657: in function '__init__'
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	/usr/lib/lua/luci/util.lua:65: in function 'wifinet'
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	/usr/lib/lua/luci/model/network.lua:1591: in function 'callback'
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	/usr/lib/lua/luci/model/uci.lua:269: in function 'foreach'
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	/usr/lib/lua/luci/model/network.lua:1588: in function 'get_wifinets'
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	/usr/lib/lua/luci/tools/status.lua:147: in function 'wifi_networks'
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	[string "/usr/lib/lua/luci/view/admin_status/index.h..."]:50: in main chunk
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: stack traceback:
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	[C]: in function 'error'
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	/usr/lib/lua/luci/template.lua:97: in function </usr/lib/lua/luci/template.lua:85>
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	(tail call): ?
Wed Mar 20 20:12:55 2019 daemon.err uhttpd[1617]: 	/usr/lib/lua/luci/dispatcher.lua:827: in function </usr/lib/lua/luci/dispatcher.lua:826>

Linksys WRT1900AC
OpenWrt 18.06.1 r7258-5eb055306f / LuCI openwrt-18.06 branch (git-19.051.55698-76cf653)

@sgabe
Copy link

sgabe commented May 20, 2019

I just ran into this issue while running OpenWrt SNAPSHOT r9019-0e8d5ff0fc / LuCI Master (git-19.079.18950-11e64f8) on MikroTik RouterBOARD 750Gr3.

Mon May 20 20:21:39 2019 daemon.err uhttpd[2018]: Failed to execute call dispatcher target for entry '/admin/network/iface_status/lan,vpn,wan'.
Mon May 20 20:21:39 2019 daemon.err uhttpd[2018]: The called action terminated with an exception:
Mon May 20 20:21:39 2019 daemon.err uhttpd[2018]: /usr/lib/lua/luci/controller/admin/network.lua:215: attempt to call method 'is_auto' (a nil value)
Mon May 20 20:21:39 2019 daemon.err uhttpd[2018]: stack traceback:
Mon May 20 20:21:39 2019 daemon.err uhttpd[2018]: 	/usr/lib/lua/luci/controller/admin/network.lua:215: in function </usr/lib/lua/luci/controller/admin/network.lua:187>
Mon May 20 20:21:39 2019 daemon.err uhttpd[2018]: 	(tail call): ?

@jow-
Copy link
Contributor

jow- commented Feb 23, 2020

Master / 19.07 do not use this code anymore

@jow- jow- closed this as completed Feb 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants