This repository has been archived by the owner on Nov 26, 2020. It is now read-only.

dstat showing incorrect tcp active socket count? #92

Closed
siddharth178 opened this issue Jun 25, 2015 · 9 comments

@siddharth178

Hi,

I am trying to use dstat to measure network numbers (bandwidth usage as well as sockets) while I run some performance tests using wrk, but something doesn't look right.
Here is the output of dstat:

-net/total- ----tcp-sockets---- ------sockets------
 recv  send|lis act syn tim clo|tot tcp udp raw frg
   0     0 | 12   3   0   0   0|195  10   2   0   0
  64B  220B| 12   3   0   0   0|195  10   2   0   0
  64B  204B| 12   3   0   0   0|195  10   2   0   0
  64B  188B| 12   3   0   0   0|195  10   2   0   0
 637B  188B| 12   3   0   0   0|195  10   2   0   0
  64B  188B| 12   3   0   0   0|195  10   2   0   0
  64B  188B| 12   3   0   0   0|195  10   2   0   0
  64B  188B| 12   3   0   0   0|195  10   2   0   0
  64B  188B| 12   3   0   0   0|195  10   2   0   0
  64B  188B| 12   3   0   0   0|195  10   2   0   0
2915k 5462k| 12 468   0   0   0|660  10   2   0   0
4959k 9690k| 12 612   0   0   0|804  10   2   0   0
4829k 9571k| 12 612   0   0   0|804  10   2   0   0
4843k 9455k| 12 804   0   0   0|996  10   2   0   0
4787k 9487k| 12 804   0   0   0|996  10   2   0   0
4803k 9515k| 12 804   0   0   0|996  10   2   0   0
4806k 9523k| 12 804   0   0   0|996  10   2   0   0
4801k 9398k| 12   1   0   0   0|  1  10   2   0   0
4722k 9356k| 12   1   0   0   0|  1  10   2   0   0
4718k 9348k| 12   1   0   0   0|  1  10   2   0   0
4734k 9380k| 12   1   0   0   0|  1  10   2   0   0
4771k 9454k| 12   1   0   0   0|  1  10   2   0   0
4731k 9373k| 12   1   0   0   0|  1  10   2   0   0
4730k 9373k| 12   1   0   0   0|  1  10   2   0   0
4731k 9373k| 12   1   0   0   0|  1  10   2   0   0
4735k 9383k| 12   1   0   0   0|  1  10   2   0   0
4783k 9477k| 12   1   0   0   0|  1  10   2   0   0
4705k 9322k| 12   1   0   0   0|  1  10   2   0   0
4713k 9340k| 12   1   0   0   0|  1  10   2   0   0
4730k 9373k| 12   1   0   0   0|  1  10   2   0   0
4751k 9411k| 12   1   0   0   0|  1  10   2   0   0
4739k 9392k| 12   1   0   0   0|  1  10   2   0   0
4726k 9363k| 12   1   0   0   0|  1  10   2   0   0
4731k 9375k| 12   1   0   0   0|  1  10   2   0   0
4725k 9363k| 12   1   0   0   0|  1  10   2   0   0
4786k 9482k| 12   1   0   0   0|  1  10   2   0   0
4732k 9377k| 12   1   0   0   0|  1  10   2   0   0
4732k 9375k| 12   1   0   0   0|  1  10   2   0   0
4722k 9358k| 12   1   0   0   0|  1  10   2   0   0
4725k 9363k| 12   1   0   0   0|  1  10   2   0   0
2652k 5238k| 12   3   0   0   0|195  10   2   0   0
  64B  204B| 12   3   0   0   0|195  10   2   0   0

See how 'act' connections grow to 804 and then immediately drop to 1, even though my wrk benchmark keeps running for the next 20 seconds with 1000 connections to this server. wrk doesn't report any errors, so the code under test is working fine.

What's happening here?

Btw, I am running dstat on an Ubuntu box with:
Dstat 0.7.2
Linux jupitor 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Thanks,
Siddharth

@siddharth178
Author

Btw, I noticed one thing: the proc file /proc/net/tcp, which dstat reads, doesn't have this information either. That would mean there is no issue in the dstat code and the real issue is with the data available in /proc/net/tcp.

I checked whether other tools have this issue. I tested the same scenario with 'ss', and it shows the correct number of active/connected sockets.

What's missing here? Is /proc/net/tcp the right place to look for this data?
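
One way to check what /proc/net/tcp actually contains is to count its rows by TCP state yourself. The sketch below is a minimal illustration, not dstat's own code: the function name `count_tcp_states` is made up for this example, and it assumes the usual Linux layout in which the fourth whitespace-separated field (`st`) holds the state as a hex code.

```python
from collections import Counter

# Hex state codes as they appear in the `st` column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED",
    "02": "SYN_SENT",
    "03": "SYN_RECV",
    "04": "FIN_WAIT1",
    "05": "FIN_WAIT2",
    "06": "TIME_WAIT",
    "07": "CLOSE",
    "08": "CLOSE_WAIT",
    "09": "LAST_ACK",
    "0A": "LISTEN",
    "0B": "CLOSING",
}


def count_tcp_states(proc_text):
    """Count sockets per TCP state from the text of /proc/net/tcp.

    Hypothetical helper for illustration only. The first line is the column
    header, so it is skipped; rows with too few fields are ignored.
    """
    counts = Counter()
    for line in proc_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        state = TCP_STATES.get(fields[3].upper(), "UNKNOWN")
        counts[state] += 1
    return counts
```

On a Linux box, `count_tcp_states(open("/proc/net/tcp").read())` gives the per-state totals to compare against the dstat columns and against `ss`. Note that /proc/net/tcp lists only IPv4 sockets; IPv6 sockets are in /proc/net/tcp6, which has the same column layout.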

@kkalinin

kkalinin commented Aug 12, 2016

I've noticed the same problem, and changing the tab width for the dstat_socket class helps. Its value is hardcoded as 3, so dstat shows only the first 3 digits of the actual number. Everything is correct while you have <= 999 sockets in any state, but if you have 1234 sockets in some state, dstat will show only 1.
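
To make the effect concrete, here is a small sketch of how a 3-column field can turn 1234 into 1. It is not dstat's actual formatting code: the helper names `scale`, `format_without_unit` and `format_with_unit` are hypothetical, and the assumption is that large values get scaled by 1000 but the unit suffix is dropped because the column has no room for it.

```python
UNITS = ["", "k", "M", "G"]


def scale(value, base=1000):
    """Divide value by base until it is below base; return (scaled, unit)."""
    scaled = float(value)
    idx = 0
    while scaled >= base and idx < len(UNITS) - 1:
        scaled /= base
        idx += 1
    return scaled, UNITS[idx]


def format_without_unit(value, width=3):
    """Assumed buggy behaviour: scale the value but never print the unit."""
    scaled, _unit = scale(value)
    return str(int(scaled)).rjust(width)


def format_with_unit(value, width=4):
    """Possible fix: reserve one more column and keep the unit suffix."""
    scaled, unit = scale(value)
    if not unit:
        return str(int(scaled)).rjust(width)
    text = f"{scaled:.1f}{unit}"
    if len(text) > width:
        # Drop the decimal when it does not fit in the column.
        text = f"{int(scaled)}{unit}"
    return text.rjust(width)


print(repr(format_without_unit(804)))   # '804'
print(repr(format_without_unit(1234)))  # '  1'
print(repr(format_with_unit(1234)))     # '1.2k'
```

Under these assumptions the count looks as if it dropped to 1 as soon as it reaches four digits, which is what the 'act' and 'tot' columns in the output above do.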

@scottchiefbaker
Collaborator

Is the solution just to display those numbers in decimal format? 1.2k connections?

@kkalinin

I thought the best solution would be to add one more launch argument, something like --raw.
By default the view would stay human-readable (for tcp sockets: 1.2k, 1.2m), but in raw mode I would like to see the actual numbers, regardless of their length. Of course, some metrics such as sent/recv bytes should remain human-readable in any mode.
Still, as a quick solution, your idea is good!

Thank you!

@dagwieers
Collaborator

@kkalinin You get the raw numbers in CSV output. We cannot print raw numbers to the screen since it messes up the whole formatting to the point it is rendered useless. (That's the problem with e.g. vmstat)

Currently one can already add --float to get floating point numbers instead of integers (when units are used). But we have to consider column width.

@dagwieers
Collaborator

I see what you mean now. Dstat did not show any units for these numbers, and only reserved 3 columns in total for most connection stats. Should be fixed now!

@dagwieers
Collaborator

BTW Option --raw already exists, but doesn't do what you'd be expecting :-)

@superheizai

I'd like to know how you resolved this problem.
I'm hitting the same issue: I want to measure system performance with a single command, dstat, but it gave me a totally different number from ss.

@dagwieers
Collaborator

@superheizai It is fixed in the master branch. The fix is included in commit 4c47a34
