don't interact with XSB via a pipe #28

Closed
adferguson opened this issue Nov 17, 2013 · 4 comments

@adferguson
Collaborator

we need to replace the current method of interacting with XSB via a Unix pipe.

under the current regime, FlowLog will hang delightfully without warning due to unknown difficulties with XSB. here are the backtraces of the two stuck processes (they had been stuck for several minutes at this point, which caused the switches to give up on the controller, since it was no longer following the OpenFlow keep-alive protocol):

$ sudo gdb ./flowlog.native 29998
GNU gdb (GDB) 7.5-ubuntu
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/flowlog/FlowLog/interpreter/flowlog.native...done.
Attaching to program: /home/flowlog/FlowLog/interpreter/flowlog.native, process 29998
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...(no debugging symbols found)...done.
[New LWP 30051]
[New LWP 30050]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Loaded symbols for /lib/x86_64-linux-gnu/libpthread.so.0
Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libm.so.6
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libdl.so.2
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x00007ff47544bd2d in read () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt
#0  0x00007ff47544bd2d in read () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00000000005a1cd1 in caml_do_read ()
#2  0x00000000005a1d15 in caml_refill ()
#3  0x00000000005a2806 in caml_ml_input_char ()
#4  0x000000000048f4ee in camlXsb_Communication__get_line_gingerly_1201 () at Xsb_Communication.ml:77
#5  0x000000000048fc6b in camlXsb_Communication__send_query_1230 () at Xsb_Communication.ml:201
#6  0x000000000048d9e5 in camlPartial_Eval__change_table_how_1537 () at Partial_Eval.ml:573
#7  0x00000000005167e7 in camlExtList__loop_1100 ()
#8  0x0000000000517e67 in camlExtList__map_1096 ()
#9  0x000000000048def6 in camlPartial_Eval__respond_to_notification_1637 () at Partial_Eval.ml:678
#10 0x000000000048bc4d in camlPartial_Eval__fun_2151 () at Partial_Eval.ml:816
#11 0x00000000004b60a2 in camlNetCore_Action__apply_atom_1250 () at lib/NetCore_Action.ml:274
#12 0x00000000004af561 in camlFrenetic_List__fun_1050 () at lib/Frenetic_List.ml:11
#13 0x00000000004c096f in camlNetCore_Semantics__eval_action_1098 () at lib/NetCore_Semantics.ml:25
#14 0x00000000004af561 in camlFrenetic_List__fun_1050 () at lib/Frenetic_List.ml:11
#15 0x00000000004c0a89 in camlNetCore_Semantics__eval_1108 () at lib/NetCore_Semantics.ml:35
#16 0x00000000004c0aa4 in camlNetCore_Semantics__eval_1108 () at lib/NetCore_Semantics.ml:35
#17 0x00000000004c2cb3 in camlNetCore_Controller__fun_2373 () at lib/NetCore_Controller.ml:335
#18 0x000000000051e0ec in camlLwt__catch_1425 () at src/core/lwt.ml:679
#19 0x00000000004c2f0b in camlNetCore_Controller__fun_2413 () at lib/NetCore_Controller.ml:375
#20 0x000000000051ade6 in camlLwt__fun_2125 () at src/core/lwt.ml:646
#21 0x000000000051c927 in camlLwt__run_waiters_rec_1144 () at src/core/lwt.ml:201
#22 0x000000000051c927 in camlLwt__run_waiters_rec_1144 () at src/core/lwt.ml:201
#23 0x000000000051c927 in camlLwt__run_waiters_rec_1144 () at src/core/lwt.ml:201
#24 0x000000000051cbc6 in camlLwt__leave_wakeup_1181 () at src/core/lwt.ml:289
#25 0x000000000051a13d in camlLwt_sequence__loop_1066 () at src/core/lwt_sequence.ml:149
#26 0x000000000051a13d in camlLwt_sequence__loop_1066 () at src/core/lwt_sequence.ml:149
#27 0x0000000000560d01 in camlList__iter_1061 () at list.ml:75
#28 0x00000000004f9c7e in camlLwt_engine__fun_2245 () at src/unix/lwt_engine.ml:342
#29 0x00000000004fb8b8 in camlLwt_main__run_1012 () at src/unix/lwt_main.ml:41
#30 0x0000000000486e32 in camlFlowlog__main_1272 () at flowlog.ml:110
#31 0x00000000004871f8 in camlFlowlog__entry () at flowlog.ml:127
#32 0x00000000004842e9 in caml_program ()
#33 0x00000000005ab956 in caml_start_program ()
#34 0x000000000059a2e5 in caml_main ()
#35 0x000000000059a320 in main ()

$ sudo gdb /home/flowlog/XSB/config/x86_64-unknown-linux-gnu/bin/xsb 30000
GNU gdb (GDB) 7.5-ubuntu
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/flowlog/XSB/config/x86_64-unknown-linux-gnu/bin/xsb...(no debugging symbols found)...done.
Attaching to program: /home/flowlog/XSB/config/x86_64-unknown-linux-gnu/bin/xsb, process 30000
Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libm.so.6
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libdl.so.2
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Loaded symbols for /lib/x86_64-linux-gnu/libpthread.so.0
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x00007fc179285040 in write () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007fc179285040 in write () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fc179217883 in _IO_file_write () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fc17921774a in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fc179218eb5 in _IO_do_write () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007fc1792191ff in _IO_file_overflow () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007fc179210e29 in putc () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x000000000041e898 in file_function ()
#7  0x0000000000426254 in builtin_call ()
#8  0x00000000004425ba in emuloop ()
#9  0x00000000004686ac in xsb ()
#10 0x0000000000410f6e in main ()
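
Read together, the two backtraces show FlowLog blocked in read() (under get_line_gingerly, presumably waiting for XSB's reply) while XSB is blocked in write() (under putc). A write to a pipe normally blocks only when that pipe's buffer is full, so one possibility, not confirmed anywhere in this thread, is that XSB is writing to a pipe nobody is draining (stderr, for example) while FlowLog waits on a different one. The Python sketch below illustrates that general failure mode and one way around it. It is not FlowLog's code (which is OCaml): the function name `query_child` and the stand-in child command are hypothetical, and it assumes a Unix host. It drains both output pipes with `selectors` and gives up after a timeout instead of hanging.

```python
import os
import selectors
import subprocess
import sys


def query_child(cmd, request, timeout=5.0):
    """Send `request` to a child and collect its stdout and stderr until both close.

    Raises TimeoutError if neither stream produces data for `timeout` seconds.
    Assumes `request` is small enough to fit in the stdin pipe buffer.
    """
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    proc.stdin.write(request)
    proc.stdin.close()  # flushes the request and signals end of input

    sel = selectors.DefaultSelector()
    sel.register(proc.stdout, selectors.EVENT_READ, "out")
    sel.register(proc.stderr, selectors.EVENT_READ, "err")
    chunks = {"out": [], "err": []}
    try:
        # Keep going until both pipes have reported end-of-file.
        while sel.get_map():
            events = sel.select(timeout)
            if not events:
                proc.kill()
                raise TimeoutError("child produced no output within the timeout")
            for key, _ in events:
                data = os.read(key.fileobj.fileno(), 65536)
                if data:
                    chunks[key.data].append(data)
                else:
                    sel.unregister(key.fileobj)  # this pipe is closed
    finally:
        sel.close()
        proc.stdout.close()
        proc.stderr.close()
        proc.wait()
    return b"".join(chunks["out"]), b"".join(chunks["err"])


if __name__ == "__main__":
    # Stand-in for a chatty child: it fills stderr past a typical pipe buffer
    # before answering on stdout. A parent that read only stdout could hang here.
    child = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('w' * 200000); "
        "sys.stdout.write(sys.stdin.read().upper())",
    ]
    out, err = query_child(child, b"ping\n")
    print(out, len(err))
```

Even if the stderr guess is wrong for XSB, the timeout alone would turn a silent hang into an error the controller can log and recover from.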
@tnelson
Owner

tnelson commented Nov 18, 2013

Which program was running at the time? Also, do we still have the logfile?
I want to clean up interaction with XSB but it'd help to be able to
understand why this happened.

(Separate observation: logs are likely to get enormous for long runs of the
controller.)

(Snipped quote to avoid issue link.)

@tnelson
Owner

tnelson commented Nov 18, 2013

Something similar just happened when stress-testing. The packet was nothing
special:

<<< incoming: arp_packet:
[arp_op:1;arp_sha:2;arp_spa:167772162;arp_tha:0;arp_tpa:167772165;dldst:281474976710655;dlsrc:2;dltyp:2054;locpt:2;locsw:1]

emitting: arp_packet:
[arp_op:2;arp_sha:5;arp_spa:167772165;arp_tha:2;arp_tpa:167772162;dldst:2;dlsrc:5;dltyp:2054;locpt:2;locsw:1]

I was running a 1-switch, 8-host mininet with all hosts arpinging someone
else. 13103 packets were handled properly before this freeze occurred.
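
To see why the packet counts as nothing special, it helps to decode the integer fields. The Python sketch below assumes, based only on the values shown above, that MAC addresses are printed as 48-bit integers and IPv4 addresses as 32-bit integers; the helper names are hypothetical and not part of FlowLog.

```python
import ipaddress


def fmt_mac(value):
    """Render a 48-bit integer as a colon-separated MAC address."""
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


def fmt_ip(value):
    """Render a 32-bit integer as a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(value))


def parse_fields(text):
    """Parse '[name:int;name:int;...]' into a dict of integers."""
    body = text.strip().strip("[]")
    return {name: int(num) for name, num in (item.split(":") for item in body.split(";"))}


def describe(text):
    """Decode the ARP-related fields of one logged packet."""
    f = parse_fields(text)
    return {
        "op": {1: "request", 2: "reply"}.get(f["arp_op"], str(f["arp_op"])),
        "sender_ip": fmt_ip(f["arp_spa"]),
        "target_ip": fmt_ip(f["arp_tpa"]),
        "eth_src": fmt_mac(f["dlsrc"]),
        "eth_dst": fmt_mac(f["dldst"]),
        "ethertype": f"{f['dltyp']:#06x}",
    }


if __name__ == "__main__":
    incoming = ("[arp_op:1;arp_sha:2;arp_spa:167772162;arp_tha:0;arp_tpa:167772165;"
                "dldst:281474976710655;dlsrc:2;dltyp:2054;locpt:2;locsw:1]")
    print(describe(incoming))
```

Under those assumptions the incoming packet is a broadcast ARP request (EtherType 0x0806, destination ff:ff:ff:ff:ff:ff) from 10.0.0.2 asking for 10.0.0.5, and the emitted packet is the matching unicast reply.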

TODO: repeat the stress-test with some debugging features re: XSB errors.
It may be that XSB is running out of memory (or something like that).

@adferguson
Collaborator Author

awesome! glad you could reproduce this as well.

fortunately or unfortunately, XSB doesn't seem to use a lot of memory from what I can tell on the testbed.

@tnelson
Owner

tnelson commented Nov 19, 2013

I believe that this was fixed in commit 9829e67.

We should keep an eye on XSB in the meantime, though.

@tnelson tnelson closed this as completed Dec 12, 2013