OOM error on running snabb config get #829

Open
dpino opened this issue May 31, 2017 · 3 comments

@dpino
Member

dpino commented May 31, 2017

I get an OOM error when running a snabb config get command whose output is a very long string, for instance when querying a whole binding table (https://people.igalia.com/dpino/lwaftr/lwaftr-migrated-2.conf).

Term1:

$ sudo ./snabb lwaftr run --reconfigurable --name lwaftr --conf lwaftr-migrated-2.conf \
   --on-a-stick 83:00.0

Term2:

$ sudo ./snabb config get -s ietf-softwire lwaftr \
   /softwire-config/binding/br/br-instances/br-instance[id=1]

This prints "PANIC: unprotected error in call to Lua API (not enough memory)" on Term1. Sometimes it just reports a "not enough memory" error.

I think the problem is in encode_yang_string. While debugging, I saw that it tries to encode a string of length 204231825 (roughly 195 MiB).
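To illustrate why encoding the whole reply as one string is costly, here is a minimal Python sketch of the alternative: encoding entry by entry and flushing to the output in bounded chunks. It is not Snabb's code. The function names, the entry fields, and the output syntax are all hypothetical and only loosely modelled on a binding-table entry.

```python
import io

def encode_entries(entries):
    """Yield one encoded line per entry instead of building one giant string.

    The output syntax is illustrative only, not the real YANG data format.
    """
    for ipv4, psid, b4_ipv6 in entries:
        yield f"softwire {{ ipv4 {ipv4}; psid {psid}; b4-ipv6 {b4_ipv6}; }}\n"

def write_streamed(entries, out, flush_chars=64 * 1024):
    """Write encoded entries to `out`, holding at most about flush_chars at once.

    Returns (total characters written, largest pending buffer seen).
    """
    pending, size, total, peak = [], 0, 0, 0
    for piece in encode_entries(entries):
        pending.append(piece)
        size += len(piece)
        peak = max(peak, size)
        if size >= flush_chars:
            out.write("".join(pending))
            total += size
            pending, size = [], 0
    if pending:  # flush whatever is left after the last entry
        out.write("".join(pending))
        total += size
    return total, peak

# Usage: 1000 made-up entries, flushed roughly every 1024 characters.
sample = [("10.0.0.1", psid, "fc00::1") for psid in range(1000)]
sink = io.StringIO()
total, peak = write_streamed(sample, sink, flush_chars=1024)
```

With this shape, peak memory depends on the flush threshold rather than on the size of the binding table.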

@dpino
Member Author

dpino commented May 21, 2018

I triaged this issue again now that the latest master can handle larger binding tables. It still returns an error, although a different one.

Run lwAFTR:

$ sudo ./snabb lwaftr run --name lwaftr --conf lwaftr.conf --on-a-stick 82:00.0

Run get-config:

$ sudo ./snabb config get lwaftr /
program/config/common.lua:173: unexpected EOF when reading length

Stack Traceback
===============
(1) Lua function 'handler' at file 'core/main.lua:168' (best guess)
    Local variables:
     reason = string: "program/config/common.lua:173: unexpected EOF when reading length"
     (*temporary) = C function: print
(2) global C function 'error'
(3) Lua upvalue 'read_length' at file 'program/config/common.lua:173'
    Local variables:
     socket = table: 0x4174a4d0  {io:table: 0x4174a310, tx:cdata<struct 1327>: 0x4174b5f0, random_access:false (more...)}
     line = nil
(4) Lua global 'recv_message' at file 'program/config/common.lua:184'
    Local variables:
     socket = table: 0x4174a4d0  {io:table: 0x4174a310, tx:cdata<struct 1327>: 0x4174b5f0, random_access:false (more...)}
     (*temporary) = Lua function 'read' (defined at line 179 of chunk program/config/common.lua)
     (*temporary) = table: 0x4174a4d0  {io:table: 0x4174a310, tx:cdata<struct 1327>: 0x4174b5f0, random_access:false (more...)}
(5) Lua field 'call_leader' at file 'program/config/common.lua:193'
    Local variables:
     instance_id = string: "lwaftr"
     method = string: "get-config"
     args = table: 0x41574c60  {path:/, format:yang, schema:ietf-softwire-br, print_default:false}
     caller = table: 0x41574cc8  {parse_output:function: 0x41749c70, print_input:function: 0x40a20658}
     socket = table: 0x4174a4d0  {io:table: 0x4174a310, tx:cdata<struct 1327>: 0x4174b5f0, random_access:false (more...)}
     msg = string: "get-config {\
  format yang;\
  print-default false;\
  schema ietf-softwire-br;\
}\
"
     parse_reply = Lua function 'parse' (defined at line 44 of chunk lib/yang/rpc.lua)
(6) Lua field 'run' at file 'program/config/get/get.lua:9'
    Local variables:
     args = table: 0x40f61818  {path:/, format:yang, print_default:false, schema_name:ietf-softwire-br (more...)}
     opts = table: 0x40f60f68  {is_config:true, command:get, with_path:true}
(7) Lua field 'run' at file 'program/config/config.lua:23'
    Local variables:
     args = table: 0x40ab4408  {1:lwaftr, 2:/}
     command = string: "get"
     modname = string: "program.config.get.get"
(8) Lua function 'main' at file 'core/main.lua:67' (best guess)
    Local variables:
     program = string: "config"
     args = table: 0x412f8c38  {1:get, 2:lwaftr, 3:/}
(9) global C function 'xpcall'
(10) main chunk of file 'core/main.lua' at line 240
(11)  C function 'require'
(12) global C function 'pcall'
(13) main chunk of file 'core/startup.lua' at line 3
(14) global C function 'require'
(15) main chunk of [string "require "core.startup""] at line 1
    nil
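The traceback shows the client failing in read_length with line = nil, which suggests the connection was closed before the leader sent any reply. The Python sketch below shows how that failure mode arises with a length-prefixed message. The wire format used here (a decimal length line followed by that many bytes) is an assumption for illustration and may not match what program/config/common.lua actually does. send_message is a hypothetical helper added so the example is self-contained.

```python
import io

def send_message(stream, text):
    """Write a decimal length line, then the payload bytes (assumed framing)."""
    data = text.encode("utf-8")
    stream.write(b"%d\n" % len(data))
    stream.write(data)

def read_length(stream):
    """Read the length line; an empty read means the peer closed the stream."""
    line = stream.readline()
    if not line:
        raise EOFError("unexpected EOF when reading length")
    return int(line.strip())

def recv_message(stream):
    """Read one length-prefixed message and return it as text."""
    length = read_length(stream)
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("unexpected EOF when reading payload")
    return payload.decode("utf-8")

# Usage: a round trip through an in-memory stream.
buf = io.BytesIO()
send_message(buf, "get-config reply")
buf.seek(0)
reply = recv_message(buf)

# A peer that closes without replying triggers the same kind of error.
try:
    recv_message(io.BytesIO(b""))
    closed_error = None
except EOFError as exc:
    closed_error = str(exc)
```

Under this reading, the client-side message is a symptom: the underlying failure would be on the side that was supposed to produce the reply.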

@dpino
Member Author

dpino commented Jun 19, 2018

This is now fixed in the raptorjit-lwaftr branch, thanks to the refactorings that improved string streaming (for large binding-table compilation) and the switch to the raptorjit branch.

@dpino dpino closed this as completed Jun 19, 2018
@dpino
Member Author

dpino commented Jul 5, 2018

It seems this issue is not completely fixed just by switching to RaptorJIT. With a larger binding table, for instance a 10M binding table, I get the following error:

$ sudo ./snabb config get lwaftr /
lib/stream/mem.lua:45: bad argument #1 to '2' (size of C type is unknown or too large)

A 10M configuration file can be generated with:

$ sudo time ./snabb lwaftr generate-configuration \
   --output lwaftr2.conf 193.5.1.100 158740 fc00::100 fc00:1:2:3:4:5:0:7e 6

I debugged it, and the error appears as soon as the buffer size reaches 2GB.
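As a rough illustration of how a growable buffer can land exactly on that boundary, the Python sketch below counts capacity doublings until 2^31 bytes is reached. Both the doubling policy and the 1 KiB starting size are assumptions made for the example. The source does not say how lib/stream/mem.lua grows its buffer.

```python
LIMIT_BYTES = 2**31  # 2 GiB, the size at which the error was observed

def doublings_until_limit(initial_bytes, limit=LIMIT_BYTES):
    """Return (number of doublings, capacity) when capacity first reaches limit."""
    size, steps = initial_bytes, 0
    while size < limit:
        size *= 2
        steps += 1
    return steps, size

# Usage: starting from a hypothetical 1 KiB buffer.
steps, final_size = doublings_until_limit(1024)
```

If growth works this way, the failing allocation would be the one that requests a power-of-two size at the limit, not one caused by the data itself being exactly 2GB.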

I think the problem is that although RaptorJIT runs on 64-bit, FFI memory is still limited to 2GB. One possible solution would be to back lib/stream/mem.lua with malloc, which does not have the 2GB restriction. The downside is that this memory would have to be deallocated manually.
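The trade-off can be sketched as follows. This is Python with ctypes standing in for the LuaJIT FFI, so it only illustrates the ownership pattern and is not a patch for lib/stream/mem.lua. The malloc_buffer helper is hypothetical, and the sketch assumes a POSIX system where the C library can be loaded through ctypes.

```python
import ctypes
import ctypes.util
from contextlib import contextmanager

# Load the C library (assumes POSIX; find_library may return None, which
# makes CDLL fall back to the symbols already loaded in the process).
_libc = ctypes.CDLL(ctypes.util.find_library("c"))
_libc.malloc.restype = ctypes.c_void_p
_libc.malloc.argtypes = [ctypes.c_size_t]
_libc.free.restype = None
_libc.free.argtypes = [ctypes.c_void_p]

@contextmanager
def malloc_buffer(size):
    """Yield a byte array backed by malloc and free it on exit.

    The memory is not tracked by the garbage collector, so the free call
    is the caller's responsibility. The context manager makes it explicit.
    """
    ptr = _libc.malloc(size)
    if not ptr:
        raise MemoryError(f"malloc({size}) failed")
    try:
        yield (ctypes.c_ubyte * size).from_address(ptr)
    finally:
        _libc.free(ptr)  # the array must not be used after this point

# Usage: a small buffer; the same pattern applies to larger sizes.
with malloc_buffer(16) as buf:
    buf[0] = 7
    buf[15] = 9
    first, last, length = buf[0], buf[15], len(buf)
```

Tying the free to a scope or a finalizer keeps the manual deallocation in one place, at the cost of having to make sure no reference to the buffer outlives it.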
