socket.io slow/disconnected 2k connections #2664

Closed
ARogovsky opened this issue Sep 1, 2016 · 16 comments

@ARogovsky commented Sep 1, 2016

Hi.
I have a simple web chat based on socket.io. When the chat reaches about 2k connections, socket.io becomes slow or disconnects clients:

root@sr:~# netstat -plant | grep -c :8888
2091
root@sr:~#  date; wget -O /dev/null http://localhost:8888/socket.io/socket.io.js; date
Thu Sep  1 10:37:11 EDT 2016
--2016-09-01 10:37:11--  http://localhost:8888/socket.io/socket.io.js
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:8888... failed: Connection timed out.
Retrying.

--2016-09-01 10:38:15--  (try: 2)  http://localhost:8888/socket.io/socket.io.js
Connecting to localhost (localhost)|127.0.0.1|:8888... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/javascript]
Saving to: `/dev/null'

    [   <=>                                                                                                                                        ] 184,245      124K/s   in 1.5s    

2016-09-01 10:39:03 (124 KB/s) - `/dev/null' saved [184245]

Thu Sep  1 10:39:03 EDT 2016


Any advice?

@MaffooBristol

What's the RAM usage like? You're wget-ing the socket.io.js file, but that file is just served by the web server, for example Express. So I'd imagine the issue isn't to do with socket.io, but more to do with the way your server is set up and/or the resources it has available.

Assuming you get the same issue with wget -O /dev/null http://localhost:8888?
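For example (a rough sketch, assuming the app listens directly on 8888 and serves a socket.io 1.x client, hence EIO=3), timing a plain request against an engine.io polling handshake would separate the two layers:

# Plain HTTP hit vs. an actual engine.io handshake.
# If both time out, the bottleneck is in front of socket.io (listener backlog,
# fd limits, kernel settings), not the websocket layer itself.
time wget -O /dev/null http://localhost:8888/
time curl -s 'http://localhost:8888/socket.io/?EIO=3&transport=polling'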

@ARogovsky
Author

Server resources:

root@sr:~# free -g
             total       used       free     shared    buffers     cached
Mem:           126        101         24          0          0         67
-/+ buffers/cache:         33         92
Swap:           19          0         19

The application runs with these options:
--max_old_space_size=81920 --optimize_for_size --max_executable_size=40960 --stack_size=40960
Yes, connections to port 8888 are dropped, and outbound connections to the New Relic server were dropped as well:

New Relic for Node.js halted startup due to an error:
Error: connect ECONNRESET 50.31.164.148:443
    at Object.exports._errnoException (util.js:870:11)
    at exports._exceptionWithHostPort (util.js:893:20)
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1063:14)

I also have nginx on port 80, and it always works fine, so this is not related to server hardware or a lack of resources.

@MaffooBristol

Hm, yes. But my point was that node's ability to serve the socket.io.js file without timing out has nothing to do with socket.io itself. If you have actual transport issues through the sockets when a large number of clients are connected, then that may be attributable to this project. I just feel that serving that file is so basic that it wouldn't be socket.io's fault if it started timing out, unless of course something was blocking the thread. It could also be an issue with your code: are you sure you haven't got anything working synchronously or doing any heavy lifting that could be a bottleneck?
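One rough way to check for that (app.js below is only a stand-in for your real entry point) is to run the built-in V8 profiler while the 2k-connection load is reproduced and look for long synchronous frames:

# Run under the V8 profiler, reproduce the load, then stop node.
node --prof app.js
# On reasonably recent node versions, turn the isolate log into a readable summary;
# heavy synchronous JS/C++ frames near the top would point at event-loop blocking.
# (The log file name varies, e.g. isolate-0x...-v8.log.)
node --prof-process isolate-*.log > profile.txt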

@ARogovsky
Author

I'm not a developer, just a sysadmin. I think this is something like a memory leak in node, because the connection problems affect all modules: socket.io, newrelic, etc.
The problem appears when the application has 2k connections.

@MaffooBristol

Is the output of your free -g from when it was lying dormant with few connections, or during a 2k connection spike? Node will expand its RAM usage quite a lot as more clients connect, unlike some other services that allocate their memory up front. The New Relic issue sounds like it ran out of memory too, especially with an ECONNRESET error. How is the node app running, through Forever/PM2 or the like? Has it restarted itself when the connection number got high?

@ARogovsky
Author

Yes, the output is from when there were 2k connections. The server has 68G of RAM free.
The socket errors do not depend on memory, because no OOM was registered.
Here is the output from top:
top - 23:08:45 up 500 days, 11:11, 1 user, load average: 0.44, 0.47, 0.48
Tasks: 280 total, 1 running, 279 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.4 us, 0.2 sy, 0.0 ni, 97.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem: 129170 total, 113969 used, 15201 free, 584 buffers
MiB Swap: 19777 total, 0 used, 19777 free, 69747 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND                                                                                                                  
23023 mysql     20   0 25.1g  20g  15m S     4 16.6   2983:04 mysqld                                                                                                                   
 3917 root      20   0 11.2g  10g  10m S     7  8.0 314:26.19 node                                                                                                                     
17717 root      20   0 1236m 189m  16m S     0  0.1 174:52.52 core          

As you can see, a leak is present: node takes 10G of RAM with only 153 connections:

# netstat -plant | grep -c :8888
153

This also happens on any version of node, from 4.2 to 7.

I don't want to restart node whenever RAM gets high, because that is not a solution. I need to resolve the problem or change the technique.
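One way to confirm whether the heap really leaks, sketched here with app.js as a placeholder for the real entry point, is to log garbage-collector activity and take heap snapshots before and after a connection spike:

# Log every GC run; a heap that keeps growing even across full GCs suggests a real
# leak rather than normal V8 laziness about returning memory.
node --trace-gc app.js 2>&1 | tee gc.log

# On node >= 6.3 the inspector can be attached to capture heap snapshots
# (open chrome://inspect, take one snapshot at ~150 connections and one at ~2k):
node --inspect app.js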

@darrachequesne
Member

Hi! Did you try to set a higher ulimit (maximum number of open file descriptors)? ulimit -n 100000
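For example, roughly (the PID 3917 is just the node process from the top output above, and the values are examples only):

# What the running process actually inherited:
cat /proc/3917/limits | grep 'open files'
# Raise it for the shell that will launch node:
ulimit -n 100000
# Persist it, e.g. in /etc/security/limits.conf:
#   root  soft  nofile  100000
#   root  hard  nofile  100000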

@ARogovsky
Author

Yes. It did not help.

@carpii

carpii commented Sep 5, 2016

@ARogovsky Is there anything in your system/dmesg log?

My guess is you're maxing out at around 2048 connections, which is most likely some sort of system limit.
Open file descriptors would have been my guess too, but you say that's not it.

What about iptables conntrack? I've seen similar issues where heavy traffic from localhost PHP -> localhost Memcache was being dropped because the conntrack kernel/iptables module ran out of 'bucket' space.
You can reconfigure this, or add some local->local iptables rules with 'NOTRACK' so these connections don't consume conntrack resources.
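Roughly like this (a sketch only; the table size and rule placement are just examples, and it only applies if the nf_conntrack module is actually loaded):

# Is the conntrack table anywhere near full?
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
dmesg | grep -i conntrack        # "table full, dropping packet" is the telltale line

# Option 1: raise the table size
sysctl -w net.netfilter.nf_conntrack_max=262144

# Option 2: stop tracking loopback traffic altogether (raw table, NOTRACK target)
iptables -t raw -A PREROUTING -i lo -j NOTRACK
iptables -t raw -A OUTPUT -o lo -j NOTRACK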

@ARogovsky
Author

What is "2048 system limit"? I dont understand it. All possible limits is increased.
I dont use iptables, so conntrack module is not load.

@carpii

carpii commented Sep 5, 2016

What is "2048 system limit"?

I don't know, I was just saying it seems likely to be some sort of system limit if it's failing at around 2000 every single time.

The point @MaffooBristol was making is that when you do a wget against http://localhost:8888/socket.io/socket.io.js you're not testing socket.io or your websocket server. You're simply requesting a javascript file via HTTP (from nginx, or node/express, etc).

What is in your system logs and nginx logs when your wget fails?

@ARogovsky
Author

There are no errors in the system, nginx, or other logs. The problem is only related to node.js/socket.io.

@carpii

carpii commented Sep 5, 2016

When connections start timing out, does it ever recover if you leave it long enough? Or it just stays that way indefinitely until node is restarted?

@ARogovsky
Author

Sometimes it recovers, sometimes node needs to be restarted.

@carpii

carpii commented Sep 5, 2016

I think you should just pass it back to your developers and ask them to add some more logging to troubleshoot it.

For all we know, your node server could be hitting an unhandled exception, crashing, and then being restarted by pm2 or something similar.
I don't think you've provided anything that demonstrates it is specifically a socket.io problem yet.
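For example, if it does run under pm2 (the process name "app" below is just a placeholder, and the same idea applies to Forever or a systemd unit), a silent crash-and-restart cycle shows up like this:

# The "restarts" counter increments every time the process dies and is respawned.
pm2 list
# Unhandled exception stack traces end up in the error log.
pm2 logs
# Per-process details, including log file paths and uptime.
pm2 describe app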

@darrachequesne
Member

Closed due to inactivity, please reopen if needed.
