pua_dialoginfo crashes very quickly in dialog_publish.c at line 77 (1.9 branch).
I think all other branches are affected too, because the code (and the line number) is the same.
#0 0x00007f9b2463f122 in memcpy () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f9b19f089c9 in memcpy (state=0x7f9b19f0d7d3 "terminated",
entity=0x7fffa95f1900, peer=0x7fffa95f1690, callid=0x7f9b11d58b88,
initiator=1, localtag=0x0, remotetag=0x0) at /usr/include/bits/string3.h:52
No locals.
#2 build_dialoginfo (state=0x7f9b19f0d7d3 "terminated",
entity=0x7fffa95f1900, peer=0x7fffa95f1690, callid=0x7f9b11d58b88,
initiator=1, localtag=0x0, remotetag=0x0) at dialog_publish.c:77
doc = 0x0
root_node = 0x0
dialog_node = 0x0
state_node = 0x0
remote_node = 0x0
local_node = 0x0
tag_node = 0x0
id_node = 0x0
body = 0x0
buf = "ʿp$\260\375\377\377\377\377\377\377\377\377\377\377\003\000\000\000\000\000\000\000\350\320%\037\233\177\000\000`\372\236\033\233\177\000\000`\372\236\033\233\177\000\000`\372\236\033\233\177\000\000\230\037_\251\377\177\000\000\234\037_\251\377\177\000\000\340\060A\000\000\000\000\000\200\065\224$\000\000\000\000\300\321%\037\233\177\000\000\230\322%\037\233\177\000\000\000\000\000\000\000\000\000\000`\372\236\033\233\177\000\000\215\064A\000\000\000\000\000\000\000\000\000\260\375\377\377\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\200N\224$\233\177\000\000\000\000\000\000\000\000\000\000\003", '\000' <repeats 15 times>, "\004\000\000\000\000\000\000\000\240\311D\002", '\000' <repeats 12 times>, "P\023_\251\377\177\000\000\247\320g$\233\177\000\000\002\000\000\000\000\000\000\000n^I\000\000\000\000\000\000\000\000\000\377\177\000\000$\000\000\000\000\000\000\000\300\227B\037\233\177\000\000\003", '\000' <repeats 11 times>...
__FUNCTION__ = "build_dialoginfo"
or
#0 0x00007f41cec45122 in memcpy () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f41c450e9c9 in memcpy (state=0x7f41c45137de "early",
entity=0x7fffa2243690, peer=0x7fffa2243420, callid=0x7f41bbf576f8,
initiator=1, localtag=0x0, remotetag=0x0) at /usr/include/bits/string3.h:52
No locals.
#2 build_dialoginfo (state=0x7f41c45137de "early", entity=0x7fffa2243690,
Additional information. To reproduce the problem it is enough to enable the pua and pua_dialoginfo modules and call dialoginfo_set("A") or dialoginfo_set("B") for all of our calls. We can reproduce it only on live traffic.

At a random moment, about once every 1-3 hours, struct dlg_cell loses its pointer to the callid. In other words, callid.s == NULL. All other data is still there. Less often, the same problem occurs with the name.s value in struct dlg_val. It shows up in random places in the code!

I inserted checkpoints in all the other functions to print the callid, and the callid disappears at a random moment within the same dialog. Usually it disappears right after the dialog is created, while processing the initial INVITE or the first or second reply to that INVITE (200, 486, 180, 100). One moment the callid is there and the next it is gone, yet the functions of the current process do not touch the callid. So it is overwritten by another process.

My idea was also to check the timer routines in the pua module. The cleanup function was not called, but the dbupdate function had usually finished about 1 second before the callid was corrupted. After that we get crashes in the dialog or pua_dialoginfo modules.

If we disable the pua and pua_dialoginfo modules, opensips is stable and no variables are lost. The version is 1.10, latest git (a36e379, Jan 15). I cannot find what overwrites the pointers in the dlg_cell and dlg_val structures.
@nikbyte, do you still have the corefile (available for inspection), or can you reproduce the crash and get a new core? I need some more information from the corefile to validate a theory of mine on how this crash happens.
This bug was troubleshot extensively with @nikbyte, but we reached no final conclusion. The "callid" field is overwritten during an overflow in the "dlg_cell" structure, but we did not manage to find the actual source (the code responsible for the overflow).
According to @nikbyte's tests, the problem seems to be fixed in the 1.11 code. Also, following his upgrade (from 1.10 to 1.11), we can no longer troubleshoot this crash. Moreover, 1.10 is reaching the end of its lifetime, so we decided not to pursue this bug any further and to close the ticket.
pua_dialoginfo crashes very quickly in dialog_publish.c at line 77 (1.9 branch).
I think all other branches are affected too, because the code (and the line number) is the same.
#0 0x00007f9b2463f122 in memcpy () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f9b19f089c9 in memcpy (state=0x7f9b19f0d7d3 "terminated",
No locals.
#2 build_dialoginfo (state=0x7f9b19f0d7d3 "terminated",
0\000\000\000\000\000`\372\236\033\233\177\000\000\215\064A\000\000\000\000\000\000\000\000\000\260\375\377\377\000\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000\200N\224$\233\177\000\000\000\000\000\000\000\000\000\000\003", '\000' <repeats 15 times>, "\004\000\000\000\000\000\000\000\240\311D\002", '\000' <repeats 12 times>, "P\023_\251\377\177\000\000\247\320g$\233\177\000\000\002\000\000\000\000\000\000\000n^I\000\000\000\000\000\000\000\000\000\377\177\000\000$\000\000\000\000\000\000\000\300\227B\037\233\177\000\000\003", '\000' <repeats 11 times>...
__FUNCTION__ = "build_dialoginfo"
or
#0 0x00007f41cec45122 in memcpy () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f41c450e9c9 in memcpy (state=0x7f41c45137de "early",
No locals.
#2 build_dialoginfo (state=0x7f41c45137de "early", entity=0x7fffa2243690,
\000\000\000\004\000\000\000\000\000\000\000\200\256\364\316A\177", '\000' <repeats 34 times>"\240, Yt\002", '\000' <repeats 12 times>"\340, \060$\242\377\177\000\000\247\060\310\316A\177\000\000\002\000\000\000\000\000\000\000n^I\000\000\000\000\000a.137810$\000\000\000\000\000\000\000\230n\210\311A\177\000\000\003", '\000' <repeats 11 times>, "\004\000\000\000\000\000\000\000A\177\000\000\350\061$\242\377\177\000\000\000\061$\242\377\177\000\000\000\000\000\000A"...
__FUNCTION__ = "build_dialoginfo"
because
(gdb) p entity->uri
$5 = {s = 0x0, len = 36}
or
(gdb) p entity->uri
$1 = {s = 0x0, len = 39}
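
The gdb output above shows a string whose `s` pointer is NULL while `len` is still non-zero, which is exactly the input that makes memcpy fault. As a rough illustration only (this is not the actual code at dialog_publish.c:77, and it would only mask the underlying overwrite, not fix it), a defensive copy could look like the sketch below. The names `str_t` and `copy_str_checked` are hypothetical; the struct simply mirrors the `{s, len}` layout printed by gdb.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type mirroring the {s, len} layout seen in the gdb output. */
typedef struct {
    char *s;
    int len;
} str_t;

/*
 * Hypothetical helper: copy src into buf (capacity cap, including the
 * terminating NUL) only when src is safe to read.
 * Returns 0 on success, -1 when the source is NULL, has a NULL pointer,
 * has a negative length, or does not fit into buf.
 */
static int copy_str_checked(char *buf, size_t cap, const str_t *src)
{
    if (src == NULL || src->s == NULL || src->len < 0)
        return -1;                      /* e.g. {s = 0x0, len = 36} */
    if ((size_t)src->len >= cap)
        return -1;                      /* would overflow buf */

    memcpy(buf, src->s, (size_t)src->len);
    buf[src->len] = '\0';
    return 0;
}
```

With a value like `{s = 0x0, len = 36}` the helper returns -1 instead of calling memcpy, so the caller can log the corrupted structure and skip the PUBLISH rather than crash.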