mtr 0.85 crash when using tcp mode #28
Comments
Does it also crash for you when you specify a sensible value for the interval? As written, you're trying to send 100 packets per second to about 10 hosts, or about 1000 connection requests per second. I can imagine "unexpected" things happening. My mtr also bombed out, after about ten seconds.

But you're running this "as root" to work around the "lower limit of 1.0 seconds per round". As the root user you can break your system by issuing the wrong commands or giving the wrong arguments.

On your computer things go wrong (a segfault) after about a second. On my computer it exits with "Socket: succes!" after 5 seconds or so. So the limit depends on the computer somehow. I could program in a new limit that works today on your computer and on mine. But after a few years that limit will be too high, as computers and networks will have gotten faster. So I don't like such an arbitrary limit.

So... You are stress-testing your system and network by attempting to start 500 to 1500 connections per second. And lo and behold, something goes wrong. I suspect the problem might be caused by some fundamental IPv4 limit: maybe you're running out of socket numbers or something like that.

If it breaks, you get to keep both pieces. If you can reproduce this with normal parameters, feel free to reopen this issue. It is a "won't fix" for me.
Hi! I have the same issue with mtr 0.85 installed through MacPorts: "bind(): Undefined error: 0".

...
stat64("/usr/lib/libxar.1.dylib\0", 0x7FFF5xx31368, 0x7FFF5xx32200) = 0 0
My traceroute [v0.85]
pc.lan (0.0.0.0) Sat Apr 19 17:02:25 2014
Keys: Help Display mode Restart statistics Order of fields quit
Packets Pings
Host Loss% Snt Last Avg Best Wrst StDev
My traceroute [v0.85]
pc.lan (0.0.0.0) Sat Apr 19 17:02:25 2014
Keys: Help Display mode Restart statistics Order of fields quit
Packets Pings
Host Loss% Snt Last Avg Best Wrst StDev
bind(): Inappropriate ioctl for device
...
bind(0xD, 0x7FFxxF2xx610, 0x80) = -1 Err#22
...
ioctl(0x2, 0x8048xx15, 0x7FE8B84xx1E0) = -1 Err#25
ioctl(0x2, 0x8048xx15, 0x7FFF5xx32500) = -1 Err#25
...

Any thoughts about "bind(): Inappropriate ioctl for device"? Thanks
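One possible reading of the trace above: bind() fails with Err#22 (EINVAL), later ioctl() calls fail with Err#25 (ENOTTY, "Inappropriate ioctl for device"), and the message that is finally printed describes the ioctl() failure rather than the bind() failure. That pattern is what you would see if errno were overwritten between the failing call and the error report. The sketch below shows the usual guard, saving errno immediately. The names cleanup_that_clobbers_errno and fail_with_saved_errno are hypothetical and are not taken from mtr.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a cleanup routine (for example, restoring the
 * terminal) whose internal system calls fail and overwrite errno. */
static void cleanup_that_clobbers_errno(void)
{
    errno = ENOTTY;
}

/* Report a failure using the errno value that was current on entry,
 * even though cleanup runs before the message is printed.
 * Returns the saved errno so the caller can act on it. */
int fail_with_saved_errno(const char *what)
{
    int saved = errno;              /* capture before anything else runs */

    cleanup_that_clobbers_errno();  /* may change errno */
    fprintf(stderr, "%s: %s\n", what, strerror(saved));
    return saved;
}
```

Called right after a failed bind(), fail_with_saved_errno("bind()") would report the bind() error itself, whatever the cleanup step does to errno afterwards.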
If you specify the "-r" option, the "truss" output will not be interspersed with escape sequences that handle the full-screen output. Otherwise, send the truss output somewhere else. On Linux we have "strace", which does the same as truss. It has an -o option to save the output to a file.
It looks like "-r" is not available:

$ dtruss -r
/usr/bin/dtruss: illegal option -- r
USAGE: dtruss [-acdefholLs] [-t syscall] { -p PID | -n name | command }
    -p PID        # examine this PID
    -n name       # examine this process name
    -t syscall    # examine this syscall only
    -a            # print all details
    -c            # print syscall counts
    -d            # print relative times (us)
    -e            # print elapsed times (us)
    -f            # follow children
    -l            # force printing pid/lwpid
    -o            # print on cpu times
    -s            # print stack backtraces
    -L            # don't print pid/lwpid
    -b bufsize    # dynamic variable buf size
  eg,
    dtruss df -h        # run and examine "df -h"
    dtruss -p 1871      # examine PID 1871
    dtruss -n tar       # examine all processes called "tar"
    dtruss -f test.sh   # run test.sh and follow children

Anyway, I have played with Xcode and it looks like something is going wrong around net.c line 340 (http://tinyurl.com/l8b5akn):

    if (bind(s, (struct sockaddr *) &local, sizeof (local))) {
        //here we get the error 22 - ?EINVAL?
        display_clear();
        perror("bind()");
        exit(EXIT_FAILURE);
    }

By modifying this code to:

    if (bind(s, (struct sockaddr *) &local, sizeof (struct sockaddr))) {
        display_clear();
        perror("bind()");
        exit(EXIT_FAILURE);
    }

bind() seems happy, but I don't get any hops printed in the output, just:

matrix:mtr dpm$ sudo ./mtr --tcp --report --port 80 8.8.8.8
Start: Fri Apr 25 12:38:23 2014
HOST: pc.lan    Loss%   Snt   Last   Avg  Best  Wrst StDev
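The 0x80 (128) length in the dtruss trace earlier in the thread suggests that "local" is a struct sockaddr_storage; that is an inference, not something the thread confirms. Some BSD-derived systems return EINVAL from bind() when the length does not match the address family. Passing sizeof (struct sockaddr) happens to match struct sockaddr_in, but it is too short for struct sockaddr_in6. A minimal sketch of choosing the length by family follows. The helper names sockaddr_len_for_family and bind_storage are hypothetical, and this is not necessarily what the mtr patch does.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Return the address length that matches the family stored in ss,
 * or 0 if the family is not one this sketch handles. */
socklen_t sockaddr_len_for_family(const struct sockaddr_storage *ss)
{
    switch (ss->ss_family) {
    case AF_INET:
        return (socklen_t) sizeof(struct sockaddr_in);
    case AF_INET6:
        return (socklen_t) sizeof(struct sockaddr_in6);
    default:
        return 0;
    }
}

/* bind() a socket to an address held in a sockaddr_storage, passing the
 * family-specific length instead of sizeof(struct sockaddr_storage). */
int bind_storage(int fd, const struct sockaddr_storage *local)
{
    socklen_t len = sockaddr_len_for_family(local);

    if (len == 0) {
        errno = EAFNOSUPPORT;   /* unknown family: fail without calling bind() */
        return -1;
    }
    return bind(fd, (const struct sockaddr *) local, len);
}
```

With a helper like this, the call in the snippet above would become bind_storage(s, &local), and the same code path would serve both IPv4 and IPv6.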
Looks like the same issue: traviscross#28
https://bugs.launchpad.net/mtr/+bug/1273486
https://bugs.launchpad.net/mtr/+bug/1327036

% uname -srm; mtr -r --tcp localhost
FreeBSD 10.0-RELEASE-p4 amd64
Start: Fri Jun 6 22:20:00 2014
bind(): Invalid argument
%

% uname -srm; mtr -r --tcp localhost
NetBSD 6.1.3_PATCH i386
Start: Fri Jun 6 22:23:13 2014
bind(): Invalid argument
%

% uname -a; % mtr -r --tcp localhost
SunOS opensolaris 5.11 oi_151a9 i86pc i386 i86pc
Start: Wed Jun 11 19:49:28 2014
bind(): Invalid argument
%

OpenSolaris/OI: compile fails with FIONBIO

Looks like the same issue: traviscross#27 traviscross#35
https://bugs.launchpad.net/mtr/+bug/1273486

% make
make all-recursive
Making all in img
depbase=`echo net.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -DHAVE_CONFIG_H -I. -g -O2 -Wall -MT net.o -MD -MP -MF $depbase.Tpo -c -o net.o net.c &&\
mv -f $depbase.Tpo $depbase.Po
net.c: In function `net_send_tcp':
net.c:360: error: `FIONBIO' undeclared (first use in this function)
net.c:360: error: (Each undeclared identifier is reported only once
net.c:360: error: for each function it appears in.)
*** Error code 1

fix: #define BSD_COMP
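The fix above keeps the FIONBIO ioctl and makes the Solaris headers expose it. A different route, shown here only as an alternative and not as what mtr does, is the POSIX fcntl() interface with O_NONBLOCK, which needs no platform-specific macro. The function name set_nonblocking is hypothetical.

```c
#include <fcntl.h>

/* Put a file descriptor into non-blocking mode using POSIX fcntl().
 * Returns 0 on success, -1 on failure with errno set by fcntl(). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);   /* read the current status flags */

    if (flags == -1)
        return -1;
    if (flags & O_NONBLOCK)
        return 0;                        /* already non-blocking */
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}
```

Reading the flags first and OR-ing in O_NONBLOCK preserves any other status flags already set on the descriptor.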
Looks like the same issue at the bottom of this report: traviscross#28

% uname -a; ./mtr -rT localhost; echo $?
Start: Thu Jun 12 17:09:11 2014
SunOS opensolaris 5.11 oi_151a9 i86pc i386 i86pc
141

% uname -srm; ./mtr -4rT localhost; echo $?
NetBSD 6.1.3_PATCH i386
Start: Thu Jun 12 16:14:47 2014
141

% gdb ./mtr
(gdb) set args -4rT localhost
(gdb) r
Start: Thu Jun 12 16:15:36 2014
Program received signal SIGPIPE, Broken pipe.
[Switching to LWP 1]
0xbb767387 in write () from /usr/lib/libc.so.12
(gdb) bt
#0 0xbb767387 in write () from /usr/lib/libc.so.12
#1 0xbb470263 in write () from /usr/lib/libpthread.so.1
#2 0x0804fea6 in net_process_fds (writefd=0xbfbfeaac) at net.c:1542
#3 0x080544a7 in select_loop () at select.c:264
#4 0x0804d5ee in main (argc=3, argv=0xbfbfec58) at mtr.c:719
(gdb) list net.c:1541,1543
1541 if (fd > 0 && FD_ISSET(fd, writefd)) {
1542 r = write(fd, "G", 1);
1543 /* if write was successful, or connection refused we have
(gdb)

fix: If the socket is connected, getpeername() will return 0. Otherwise getpeername() will fail with ENOTCONN, and read(,,1) will produce the right errno. This is a combination of suggestions from Douglas C. Schmidt and Ken Keys.
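The fix described above can be sketched roughly as follows. This is a minimal illustration of the getpeername()/read() idea, not the code that went into mtr; the function name tcp_connect_status is hypothetical. Unlike write(), neither call raises SIGPIPE when the connection attempt has failed.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Check the outcome of a connection attempt on fd without writing to it.
 * Returns 0 if the socket is connected, otherwise an errno value that
 * describes why it is not (for example ECONNREFUSED or ENOTCONN). */
int tcp_connect_status(int fd)
{
    struct sockaddr_storage peer;
    socklen_t len = sizeof(peer);
    char byte;

    if (getpeername(fd, (struct sockaddr *) &peer, &len) == 0)
        return 0;                 /* a peer address exists: connected */

    if (errno != ENOTCONN)
        return errno;             /* some other failure, e.g. EBADF */

    /* Not connected: a one-byte read fails and leaves the reason in errno. */
    if (read(fd, &byte, 1) < 0)
        return errno;

    return ENOTCONN;              /* read did not fail; still not connected */
}
```

In a select() loop, this would be called once the descriptor is reported writable, in place of the write(fd, "G", 1) probe shown in the gdb listing.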
You might try out this patch from https://bugs.launchpad.net/mtr/+bug/1273486/+attachment/4135564/+files/len-sizeof-patch.diff
Though this issue is closed, it's still happening, so in case it is useful I've created a gist of the dtruss output from running the following in a separate terminal: sudo dtruss -flesal -n mtr during the crash from running:
Running into this with
I have this problem on FreeBSD 11.2, in TCP mode with no other flags specified. "mtr --version" prints "mtr UNKNOWN" O_o; it is 0.92 though.
You can help me reproduce it by restating the command I need to try to get what you see. Maybe a cut-and-paste of both the command and the crashing output.
Steps to reproduce:
It crashed in about a second (SIGABRT).
Backtrace:
OS is Arch Linux x86_64, with glibc 2.17