ZeroTier one may be leaking sockets #597

Closed
alexforencich opened this issue Sep 30, 2017 · 8 comments
Labels
Type: Bug (Bug to be resolved)

Comments

@alexforencich
Contributor

I recently installed ZeroTier One on a VPS running under OpenVZ. Shortly afterwards, the server kept ending up in an unusable state, with fastcgi and sshd complaining that they could not open sockets:

2017/09/30 17:43:09 [alert] 21870#0: *94759 socket() failed (12: Cannot allocate memory) while connecting to upstream, client: xxx.xxx.xxx.xxx, server: xxxxxx.com, request: "GET /xxx/xxx/xxx HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "xxxxxx.com"

and

sshd[459]: error: reexec socketpair: Cannot allocate memory

I ended up leaving a mosh session open with a root shell running (I tried without a root shell, but PAM was failing and preventing console logins, sudo, and su) and finally traced the problem to a resource limit. Apparently the instance has a 'numothersock' limit of 500, and all 500 were in use. After killing zerotier-one, the count dropped to 70. I presume zerotier-one doesn't need 430 sockets open at once on a network with fewer than 10 nodes, so my conclusion is that zerotier-one may be leaking sockets.
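
For anyone wanting to check the same limit: on OpenVZ the per-container counters are typically exposed in /proc/user_beancounters (readable by root inside the container). The sketch below parses that table and reports resources whose held count has reached the limit. It assumes the usual column layout (uid, resource, held, maxheld, barrier, limit, failcnt). The function names and the sample text are hypothetical, and only the 500/500 figures for numothersock come from this report.

```python
FIELDS = ("held", "maxheld", "barrier", "limit", "failcnt")


def parse_beancounters(text):
    """Map each resource name to a dict of its five numeric columns."""
    counters = {}
    for line in text.splitlines():
        tokens = line.split()
        # The first resource row of a container is prefixed with "<uid>:".
        if tokens and tokens[0].endswith(":"):
            tokens = tokens[1:]
        if len(tokens) != 6:
            continue  # version line, header row, or unexpected layout
        name, values = tokens[0], tokens[1:]
        if not all(value.isdigit() for value in values):
            continue
        counters[name] = dict(zip(FIELDS, map(int, values)))
    return counters


def exhausted(counters):
    """Names of resources whose held count has reached the limit."""
    return sorted(
        name for name, c in counters.items()
        if c["limit"] > 0 and c["held"] >= c["limit"]
    )


# Hypothetical sample; only numothersock held/limit = 500 is from the report.
SAMPLE = """\
Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
      101:  kmemsize        2670652    2924544   11055923   11377049          0
            numtcpsock           12         20        360        360          0
            numothersock        500        500        500        500         37
"""

print(exhausted(parse_beancounters(SAMPLE)))  # prints ['numothersock']
```

On a live container, the text would come from reading /proc/user_beancounters instead of the sample string.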

@adamierymenko
Contributor

It really should not use that many sockets. What does lsof show for the ZeroTier process?

@alexforencich
Contributor Author

Let me check. Reinstalling it now, may have to wait a day or two to see what happens.

@alexforencich
Contributor Author

alexforencich commented Oct 2, 2017

After running for a few hours, lsof seems to show a bunch of these stacking up:

zerotier- 16951 root   19u  sock       0,6      0t0 729706119 can't identify protocol 
zerotier- 16951 root   20u  sock       0,6      0t0 729766216 can't identify protocol 
zerotier- 16951 root   21u  sock       0,6      0t0 729826446 can't identify protocol 

Full output (from the VPS):

COMMAND     PID USER   FD   TYPE    DEVICE SIZE/OFF      NODE NAME
zerotier- 16951 root  cwd    DIR      0,34     4096  39845891 /
zerotier- 16951 root  rtd    DIR      0,34     4096  39845891 /
zerotier- 16951 root  txt    REG      0,34  2067072  40509500 /usr/sbin/zerotier-one
zerotier- 16951 root  mem    REG      8,17           40509500 /usr/sbin/zerotier-one (path dev=0,34)
zerotier- 16951 root    0u   CHR       1,3      0t0  39849683 /dev/null
zerotier- 16951 root    1u   CHR     136,3      0t0         6 /dev/pts/3
zerotier- 16951 root    2u   CHR     136,3      0t0         6 /dev/pts/3
zerotier- 16951 root    3r  FIFO       0,8      0t0 729567782 pipe
zerotier- 16951 root    4w  FIFO       0,8      0t0 729567782 pipe
zerotier- 16951 root    5r   CHR       1,9      0t0  39849701 /dev/urandom
zerotier- 16951 root    6u  IPv4 729567790      0t0       TCP localhost.localdomain:9993 (LISTEN)
zerotier- 16951 root    7u  IPv6 729567791      0t0       TCP localhost:9993 (LISTEN)
zerotier- 16951 root    8u   CHR    10,200      0t0  39845990 /dev/net/tun
zerotier- 16951 root    9r  FIFO       0,8      0t0 729567804 pipe
zerotier- 16951 root   10u  sock       0,6      0t0 729567799 can't identify protocol
zerotier- 16951 root   11u  sock       0,6      0t0 729641257 can't identify protocol
zerotier- 16951 root   12w  FIFO       0,8      0t0 729567804 pipe
zerotier- 16951 root   13u  IPv4 729567821      0t0       UDP xxx.xxx:9993 
zerotier- 16951 root   14u  IPv6 729567822      0t0       UDP [xxx:xxx::xxx]:9993 
zerotier- 16951 root   15u  IPv4 729567827      0t0       UDP xxx.xxx:64937 
zerotier- 16951 root   16u  IPv6 729567828      0t0       UDP [xxx:xxx::xxx]:64937 
zerotier- 16951 root   17u  IPv4 729567833      0t0       UDP xxx.xxx:64938 
zerotier- 16951 root   18u  IPv6 729567834      0t0       UDP [xxx:xxx::xxx]:64938 
zerotier- 16951 root   19u  sock       0,6      0t0 729706119 can't identify protocol
zerotier- 16951 root   20u  sock       0,6      0t0 729766216 can't identify protocol
zerotier- 16951 root   21u  sock       0,6      0t0 729826446 can't identify protocol
zerotier- 16951 root   22u  sock       0,6      0t0 729882892 can't identify protocol
zerotier- 16951 root   23u  sock       0,6      0t0 729937746 can't identify protocol
zerotier- 16951 root   24u  sock       0,6      0t0 729987816 can't identify protocol
zerotier- 16951 root   25u  sock       0,6      0t0 730038757 can't identify protocol
zerotier- 16951 root   26u  sock       0,6      0t0 730084218 can't identify protocol
zerotier- 16951 root   27u  sock       0,6      0t0 730130972 can't identify protocol
zerotier- 16951 root   28u  sock       0,6      0t0 730177242 can't identify protocol
zerotier- 16951 root   29u  sock       0,6      0t0 730221712 can't identify protocol
zerotier- 16951 root   30u  sock       0,6      0t0 730264273 can't identify protocol
zerotier- 16951 root   31u  sock       0,6      0t0 730310323 can't identify protocol
zerotier- 16951 root   32u  sock       0,6      0t0 730354919 can't identify protocol
zerotier- 16951 root   33u  sock       0,6      0t0 730402132 can't identify protocol
zerotier- 16951 root   34u  sock       0,6      0t0 730455191 can't identify protocol
zerotier- 16951 root   35u  sock       0,6      0t0 730510973 can't identify protocol
zerotier- 16951 root   36u  sock       0,6      0t0 730561408 can't identify protocol
zerotier- 16951 root   37u  sock       0,6      0t0 730608384 can't identify protocol
zerotier- 16951 root   38u  sock       0,6      0t0 730660413 can't identify protocol
zerotier- 16951 root   39u  sock       0,6      0t0 730708924 can't identify protocol
zerotier- 16951 root   40u  sock       0,6      0t0 730760319 can't identify protocol
zerotier- 16951 root   41u  sock       0,6      0t0 730809615 can't identify protocol
zerotier- 16951 root   42u  sock       0,6      0t0 730858650 can't identify protocol

From a different machine (laptop):

COMMAND    PID USER   FD   TYPE             DEVICE SIZE/OFF     NODE NAME
zerotier- 1557 root  cwd    DIR                8,9     4096        2 /
zerotier- 1557 root  rtd    DIR                8,9     4096        2 /
zerotier- 1557 root  txt    REG                8,9  1378616 12327158 /var/lib/zerotier-one/zerotier-one
zerotier- 1557 root  mem    REG                8,9    46912 18483157 /usr/lib/libnss_files-2.26.so
zerotier- 1557 root  mem    REG                8,9  1358168 18483149 /usr/lib/libm-2.26.so
zerotier- 1557 root  mem    REG                8,9  2065840 18483275 /usr/lib/libc-2.26.so
zerotier- 1557 root  mem    REG                8,9   145336 18483298 /usr/lib/libpthread-2.26.so
zerotier- 1557 root  mem    REG                8,9   751064 18498592 /usr/lib/libgcc_s.so.1
zerotier- 1557 root  mem    REG                8,9 11563944 18487705 /usr/lib/libstdc++.so.6.0.24
zerotier- 1557 root  mem    REG                8,9    10352 18584477 /usr/lib/libnatpmp.so.1
zerotier- 1557 root  mem    REG                8,9   176880 18483276 /usr/lib/ld-2.26.so
zerotier- 1557 root    0r   CHR                1,3      0t0     1028 /dev/null
zerotier- 1557 root    1u  unix 0xffff8a891c3ee400      0t0    15850 type=STREAM
zerotier- 1557 root    2u  unix 0xffff8a891c3ee400      0t0    15850 type=STREAM
zerotier- 1557 root    3r  FIFO               0,11      0t0    18666 pipe
zerotier- 1557 root    4w  FIFO               0,11      0t0    18666 pipe
zerotier- 1557 root    5r   CHR                1,9      0t0     1033 /dev/urandom
zerotier- 1557 root    6u  IPv4              20769      0t0      TCP localhost.localdomain:palace-2 (LISTEN)
zerotier- 1557 root    7u  IPv6              20770      0t0      TCP localhost.localdomain:palace-2 (LISTEN)
zerotier- 1557 root    8u   CHR             10,200    0t121    10519 /dev/net/tun
zerotier- 1557 root    9u  sock                0,9      0t0    15927 protocol: UDP
zerotier- 1557 root   10r  FIFO               0,11      0t0    18343 pipe
zerotier- 1557 root   11w  FIFO               0,11      0t0    18343 pipe
zerotier- 1557 root   12u  IPv4           33387322      0t0      UDP xxx.xxx:palace-2 
zerotier- 1557 root   13u  IPv4           33387329      0t0      UDP xxx.xxx:35007 
zerotier- 1557 root   14u  IPv4           33387336      0t0      UDP xxx.xxx:35008 
zerotier- 1557 root   15u  IPv6           33387323      0t0      UDP xxx:palace-2 
zerotier- 1557 root   16u  IPv6           33387330      0t0      UDP xxx:35007 
zerotier- 1557 root   17u  IPv6           33387337      0t0      UDP xxx:35008 
zerotier- 1557 root   18u  sock                0,9      0t0  3058172 protocol: UDP
zerotier- 1557 root   19u  sock                0,9      0t0  3069642 protocol: UDP
zerotier- 1557 root   20u  sock                0,9      0t0  3757137 protocol: UDP
zerotier- 1557 root   21u  sock                0,9      0t0  4107904 protocol: UDP
zerotier- 1557 root   22u  sock                0,9      0t0  4140195 protocol: UDP
zerotier- 1557 root   23u  sock                0,9      0t0  4604928 protocol: UDP
zerotier- 1557 root   24u  sock                0,9      0t0  4623243 protocol: UDP
zerotier- 1557 root   25u  sock                0,9      0t0  7683969 protocol: UDP
zerotier- 1557 root   26u  sock                0,9      0t0  7700466 protocol: UDP
zerotier- 1557 root   27u  sock                0,9      0t0  7715608 protocol: UDP
zerotier- 1557 root   28u  sock                0,9      0t0  7735932 protocol: UDP
zerotier- 1557 root   29u  sock                0,9      0t0  7752105 protocol: UDP
zerotier- 1557 root   30u  sock                0,9      0t0 12945438 protocol: UDP
zerotier- 1557 root   31u  sock                0,9      0t0 12954456 protocol: UDP
zerotier- 1557 root   32u  sock                0,9      0t0 12967728 protocol: UDP
zerotier- 1557 root   33u  sock                0,9      0t0 13137308 protocol: UDP
zerotier- 1557 root   34u  sock                0,9      0t0 14653402 protocol: UDP
zerotier- 1557 root   35u  sock                0,9      0t0 14672228 protocol: UDP
zerotier- 1557 root   36u  sock                0,9      0t0 14687470 protocol: UDP
zerotier- 1557 root   37u  sock                0,9      0t0 14695253 protocol: UDP
zerotier- 1557 root   38u  sock                0,9      0t0 17921674 protocol: UDP
zerotier- 1557 root   39u  sock                0,9      0t0 17937631 protocol: UDP
zerotier- 1557 root   40u  sock                0,9      0t0 17948570 protocol: UDP
zerotier- 1557 root   41u  sock                0,9      0t0 17964625 protocol: UDP
zerotier- 1557 root   42u  sock                0,9      0t0 17980503 protocol: UDP
zerotier- 1557 root   43u  sock                0,9      0t0 17995282 protocol: UDP
zerotier- 1557 root   44u  sock                0,9      0t0 18009361 protocol: UDP
zerotier- 1557 root   45u  sock                0,9      0t0 18025549 protocol: UDP
zerotier- 1557 root   46u  sock                0,9      0t0 18036484 protocol: UDP
zerotier- 1557 root   47u  sock                0,9      0t0 18052737 protocol: UDP
zerotier- 1557 root   48u  sock                0,9      0t0 18066530 protocol: UDP
zerotier- 1557 root   49u  sock                0,9      0t0 18076259 protocol: UDP
zerotier- 1557 root   50u  sock                0,9      0t0 18264974 protocol: UDP

And from a 3rd machine (home server), which does not appear to be leaking sockets:

COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF      NODE NAME
zerotier- 635 root  cwd    DIR                8,3     4096         2 /
zerotier- 635 root  rtd    DIR                8,3     4096         2 /
zerotier- 635 root  txt    REG                8,3  1169384  18614456 /var/lib/zerotier-one/zerotier-one
zerotier- 635 root  mem    REG                8,3    46912  10225642 /usr/lib/libnss_files-2.26.so
zerotier- 635 root  mem    REG                8,3  1358168  10225636 /usr/lib/libm-2.26.so
zerotier- 635 root  mem    REG                8,3  2065840  10225769 /usr/lib/libc-2.26.so
zerotier- 635 root  mem    REG                8,3   145336  10225794 /usr/lib/libpthread-2.26.so
zerotier- 635 root  mem    REG                8,3   751064  10242191 /usr/lib/libgcc_s.so.1
zerotier- 635 root  mem    REG                8,3 11563944  10230246 /usr/lib/libstdc++.so.6.0.24
zerotier- 635 root  mem    REG                8,3   176880  10225770 /usr/lib/ld-2.26.so
zerotier- 635 root    0r   CHR                1,3      0t0      1028 /dev/null
zerotier- 635 root    1u  unix 0xffff8804b6370c00      0t0 159690787 type=STREAM
zerotier- 635 root    2u  unix 0xffff8804b6370c00      0t0 159690787 type=STREAM
zerotier- 635 root    3r  FIFO               0,11      0t0 159690790 pipe
zerotier- 635 root    4w  FIFO               0,11      0t0 159690790 pipe
zerotier- 635 root    5r   CHR                1,9      0t0      1033 /dev/urandom
zerotier- 635 root    6u  IPv4          159690793      0t0       TCP localhost.localdomain:palace-2 (LISTEN)
zerotier- 635 root    7u  IPv6          159690794      0t0       TCP localhost.localdomain:palace-2 (LISTEN)
zerotier- 635 root    8u  IPv4          195619760      0t0       UDP xxx.xxx:palace-2 
zerotier- 635 root    9u  IPv4          195619765      0t0       UDP xxx.xxx:58345 
zerotier- 635 root   10u  IPv4          195619770      0t0       UDP xxx.xxx:58346 
zerotier- 635 root   11u  IPv6          196006957      0t0       UDP xxx.xxx:palace-2 
zerotier- 635 root   12u  IPv6          196006963      0t0       UDP xxx.xxx:58345 
zerotier- 635 root   13u  IPv6          196006969      0t0       UDP xxx.xxx:58346 
zerotier- 635 root   18u   CHR             10,200    0t170     12657 /dev/net/tun
zerotier- 635 root   19r  FIFO               0,11      0t0 159690905 pipe
zerotier- 635 root   20w  FIFO               0,11      0t0 159690905 pipe

Everything is running 1.2.4, as reported by my.zerotier.com.
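
To turn listings like the ones above into a number that can be tracked over time, a small script can count the rows whose TYPE column is the bare string "sock". This is a minimal sketch, assuming the default lsof column order (COMMAND PID USER FD TYPE ...) and a command name without spaces. The function name is hypothetical, and the sample rows are copied from the VPS listing.

```python
def count_sock_rows(lsof_text):
    """Count lsof rows whose TYPE column is the bare string "sock"."""
    count = 0
    for line in lsof_text.splitlines():
        tokens = line.split()
        # Columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        if len(tokens) >= 5 and tokens[4] == "sock":
            count += 1
    return count


# A few rows copied from the VPS listing in this thread.
SAMPLE = """\
COMMAND     PID USER   FD   TYPE    DEVICE SIZE/OFF      NODE NAME
zerotier- 16951 root    6u  IPv4 729567790      0t0       TCP localhost.localdomain:9993 (LISTEN)
zerotier- 16951 root   10u  sock       0,6      0t0 729567799 can't identify protocol
zerotier- 16951 root   13u  IPv4 729567821      0t0       UDP xxx.xxx:9993
zerotier- 16951 root   19u  sock       0,6      0t0 729706119 can't identify protocol
zerotier- 16951 root   20u  sock       0,6      0t0 729766216 can't identify protocol
"""

print(count_sock_rows(SAMPLE))  # prints 3
```

On a live system the text could come from `lsof -p <pid>`; sampling it periodically would show whether the count keeps climbing.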

@janjaapbos
Contributor

It would be good to see in what state the sockets are. Perhaps they are not cleaned up by the OS.

E.g. netstat -aA inet

@alexforencich
Contributor Author

Today it's up to file descriptor 337. Still hasn't hit the limit yet, but it's getting there.

The 'leaked' sockets do not appear in netstat -aA inet (there are far fewer than 300 entries in that list, anyway).
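
The same cross-check can be done without netstat by comparing the socket inodes a process holds (the `socket:[inode]` link targets under /proc/<pid>/fd) with the inode column of the kernel's /proc/net tables. This is a rough, Linux-only sketch: the helper names are hypothetical, the sample table row is hand-written for illustration (reusing one inode from the lsof output above), and the inode is assumed to be the tenth whitespace-separated field of each data row.

```python
import os
import re

SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inodes_from_links(links):
    """Extract inode numbers from link targets such as 'socket:[123]'."""
    inodes = set()
    for target in links:
        match = SOCKET_LINK.match(target)
        if match:
            inodes.add(int(match.group(1)))
    return inodes


def inodes_from_proc_net(table_text):
    """Extract the inode column from a /proc/net/{tcp,udp,...} style table."""
    inodes = set()
    for line in table_text.splitlines()[1:]:  # skip the header row
        tokens = line.split()
        if len(tokens) > 9 and tokens[9].isdigit():
            inodes.add(int(tokens[9]))
    return inodes


def unlisted_socket_inodes(pid, tables=("tcp", "tcp6", "udp", "udp6")):
    """Socket inodes held by `pid` that are absent from the given tables.

    Unix-domain sockets are not checked here, so they also show up as unlisted.
    """
    fd_dir = f"/proc/{pid}/fd"
    links = []
    for name in os.listdir(fd_dir):
        try:
            links.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            pass  # descriptor was closed while we were iterating
    listed = set()
    for table in tables:
        try:
            with open(f"/proc/net/{table}") as handle:
                listed |= inodes_from_proc_net(handle.read())
        except OSError:
            pass  # table not present (for example, IPv6 disabled)
    return socket_inodes_from_links(links) - listed


# Hand-written illustration of one /proc/net/udp row (port 0x2709 = 9993).
SAMPLE_UDP = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops
   0: 00000000:2709 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 729567821 2 0000000000000000 0
"""

held = socket_inodes_from_links(["socket:[729567821]", "socket:[729706119]", "pipe:[729567782]"])
print(sorted(held - inodes_from_proc_net(SAMPLE_UDP)))  # prints [729706119]
```

Calling `unlisted_socket_inodes(pid)` as root for the zerotier-one PID would list the descriptors that lsof can see but the inet tables cannot.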

@adamierymenko
Contributor

It looks as if it's building up UDP sockets, but the "can't identify protocol" descriptions make me wonder. In any case I think we'll take a look at the possibility of UDP sockets not being closed.

adamierymenko added the Type: Bug label on Oct 16, 2017
joseph-henry added a commit that referenced this issue Nov 27, 2017
@joseph-henry
Contributor

I suspect it might have to do with our usage of libnatpmp. I've committed a patch to make sure the socket that libnatpmp opens upon initialization is closed in the event of failure.

Try pulling latest dev and giving that a shot.
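
For readers unfamiliar with this class of bug, here is a small Python analogy of the pattern described: a socket opened during initialization has to be closed on the failure path too, otherwise each failed attempt leaves a descriptor behind. This is not the actual patch and does not use libnatpmp; `probe_gateway` and the `handshake` callback are hypothetical names used only for illustration.

```python
import socket


def probe_gateway(handshake):
    """Open a UDP socket, run `handshake` on it, and always close the socket.

    Without the `finally` clause, an exception raised by `handshake` would
    leave the socket open; repeated retries would then pile up descriptors.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        return handshake(sock)
    finally:
        sock.close()


def failing_handshake(sock):
    # Stand-in for a gateway that never answers.
    raise OSError("no response from gateway")


try:
    probe_gateway(failing_handshake)
except OSError as error:
    print("probe failed:", error)  # prints "probe failed: no response from gateway"
```

The design point is simply that cleanup is tied to leaving the function, not to the success path.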

@adamierymenko
Contributor

Closing since I think this is fixed in dev.
