The file descriptor is unnormal #4375

Closed
yoster0520 opened this issue Nov 28, 2017 · 14 comments

@yoster0520

yoster0520 commented Nov 28, 2017

I have a production environment that has been running for three months. I have now found more than 4000 open file descriptors on the server node. In my test environment, which has been running for two weeks, there are more than 800 open file descriptors on the server node.
I used lsof to check the system's file descriptors. The results are as follows:

runsv   692 root    2w  FIFO                0,8      0t0     13903 pipe
runsv   692 root    3r  FIFO                0,8      0t0     15444 pipe
runsv   692 root    4w  FIFO                0,8      0t0     15444 pipe
runsv   692 root    5r  FIFO                0,8      0t0     15445 pipe
runsv   692 root    6w  FIFO                0,8      0t0     15445 pipe

Most of the results are pipes. Why?

Production Environment

  • Graylog Version: 2.2.0
  • Elasticsearch Version: 2.4.0
  • MongoDB Version:...
  • Operating System: CentOS 6.5
  • Browser version: ...

Test Environment

  • Graylog Version: 2.4.0
  • Elasticsearch Version: 5.5.0
  • MongoDB Version:...
  • Operating System: CentOS 6.5
  • Browser version: ...
@joschi
Contributor

joschi commented Nov 28, 2017

@yoster0520 Which process (PID 692) created these pipes?

@yoster0520
Author

@joschi

[root@elapt02 ~]# ps -ef | grep graylog-server
root       692   671  0 Oct09 ?      00:00:00 runsv graylog-server
root      3964  2631  0 17:07 pts/1    00:00:00 grep --color=auto graylog-server

@joschi
Contributor

joschi commented Nov 28, 2017

@yoster0520 What's the complete output of the following command?

sudo lsof +E -V

@yoster0520
Author

Is +E correct?

lsof: illegal option character: E
lsof 4.87
 latest revision: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/
 latest FAQ: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/FAQ
 latest man page: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/lsof_man
 usage: [-?abhKlnNoOPRtUvVX] [+|-c c] [+|-d s] [+D D] [+|-f[gG]] [+|-e s]
 [-F [f]] [-g [s]] [-i [i]] [+|-L [l]] [+m [m]] [+|-M] [-o [o]] [-p s]
[+|-r [t]] [-s [p:s]] [-S [t]] [-T [t]] [-u s] [+|-w] [-x [fl]] [--] [names]
Use the ``-h'' option to get more help information.

@joschi
Contributor

joschi commented Nov 28, 2017

@yoster0520 Maybe your version of lsof is too old, or it was compiled with the wrong flags?

From the manpage of lsof (http://manpages.ubuntu.com/manpages/xenial/en/man8/lsof.8.html):

+E specifies that Linux pipe and Linux UNIX socket files should be displayed with endpoint information and the files of the endpoints should also be displayed. Note: UNIX socket file endpoint information is available only when the compile flags line of -v output contains HASUXSOCKEPT.

@yoster0520
Author

I checked the lsof website, and I have the latest version of lsof. My system is CentOS 7. Is there another way to get this output? I don't think I can use that parameter.

@yoster0520
Author

yoster0520 commented Nov 28, 2017

@joschi
These?

lr-x------ 1 root root 64 Nov 28 16:28 72 -> pipe:[25990879]
l-wx------ 1 root root 64 Nov 28 16:28 73 -> pipe:[25990879]
lrwx------ 1 root root 64 Nov 28 16:28 74 -> anon_inode:[eventpoll]
lr-x------ 1 root root 64 Nov 28 16:28 75 -> pipe:[25990880]
l-wx------ 1 root root 64 Nov 28 16:28 76 -> pipe:[25990880]
lrwx------ 1 root root 64 Nov 28 16:28 77 -> anon_inode:[eventpoll]
lr-x------ 1 root root 64 Nov 28 16:28 78 -> pipe:[25990881]
l-wx------ 1 root root 64 Nov 28 16:28 79 -> pipe:[25990881]
lr-x------ 1 root root 64 Nov 28 16:28 8 -> /opt/elap/plugin/graylog-plugin-snmp-0.3.1-SNAPSHOT.jar
lrwx------ 1 root root 64 Nov 28 16:28 80 -> anon_inode:[eventpoll]
lr-x------ 1 root root 64 Nov 28 16:28 81 -> pipe:[25990882]
l-wx------ 1 root root 64 Nov 28 16:28 82 -> pipe:[25990882]
lrwx------ 1 root root 64 Nov 28 16:28 83 -> anon_inode:[eventpoll]
lr-x------ 1 root root 64 Nov 28 16:28 84 -> pipe:[25990883]
l-wx------ 1 root root 64 Nov 28 16:28 85 -> pipe:[25990883]
lrwx------ 1 root root 64 Nov 28 16:28 86 -> anon_inode:[eventpoll]
lr-x------ 1 root root 64 Nov 28 16:28 87 -> pipe:[25993289]
l-wx------ 1 root root 64 Nov 28 16:28 88 -> pipe:[25993289]

Then I ran lsof -n -P | grep 25990883. The output is as follows:

java       3079  3364    root   85w     FIFO                0,8       0t0   25990883 pipe
java       3079  3365    root   84r     FIFO                0,8       0t0   25990883 pipe
java       3079  3365    root   85w     FIFO                0,8       0t0   25990883 pipe
java       3079  3366    root   84r     FIFO                0,8       0t0   25990883 pipe
java       3079  3366    root   85w     FIFO                0,8       0t0   25990883 pipe
java       3079  3367    root   84r     FIFO                0,8       0t0   25990883 pipe
java       3079  3367    root   85w     FIFO                0,8       0t0   25990883 pipe
java       3079  3368    root   84r     FIFO                0,8       0t0   25990883 pipe
java       3079  3368    root   85w     FIFO                0,8       0t0   25990883 pipe
java       3079  3369    root   84r     FIFO                0,8       0t0   25990883 pipe
java       3079  3369    root   85w     FIFO                0,8       0t0   25990883 pipe
java       3079  3370    root   84r     FIFO                0,8       0t0   25990883 pipe
java       3079  3370    root   85w     FIFO                0,8       0t0   25990883 pipe
java       3079  3371    root   84r     FIFO                0,8       0t0   25990883 pipe
java       3079  3371    root   85w     FIFO                0,8       0t0   25990883 pipe
java       3079  3372    root   84r     FIFO                0,8       0t0   25990883 pipe
java       3079  3372    root   85w     FIFO                0,8       0t0   25990883 pipe
java       3079  3373    root   84r     FIFO                0,8       0t0   25990883 pipe

Information about PID 3079:

root      3079   692 65 Nov27 ?      16:05:31 /opt/graylog/embedded/jre/bin/java -Xms1g -Xmx1500m -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow -jar -Dlog4j.configurationFile=file:///opt/graylog/conf/log4j2.xml -Dfile.encoding=utf-8 -Djava.library.path=/opt/graylog/server/lib/sigar/ -Dgraylog2.installation_source=unknown /opt/graylog/server/graylog.jar server -f /opt/graylog/conf/graylog.conf
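
In the lsof output above, the column after the PID (3364, 3365, ...) looks like a thread ID: this lsof build appears to print one row per thread, so the same two descriptors of PID 3079 (84r and 85w) repeat many times. Counting rows would then overstate the number of open descriptors. A minimal sketch for collapsing such rows, assuming the ten-column layout shown above; `unique_descriptors` is a hypothetical helper name, not part of lsof or Graylog:

```python
def unique_descriptors(lsof_lines):
    """Collapse per-thread lsof rows into one entry per (PID, FD, NODE)."""
    seen = set()
    for line in lsof_lines:
        fields = line.split()
        # Assumed layout (not verified against every lsof build):
        # COMMAND PID TID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        if len(fields) != 10:
            continue  # skip rows that do not match the assumed layout
        _command, pid, _tid, _user, fd, _ftype, _device, _size, node, _name = fields
        seen.add((int(pid), fd, node))
    return sorted(seen)


# Three rows copied from the output above: two threads, two descriptors.
sample = """\
java       3079  3364    root   85w     FIFO                0,8       0t0   25990883 pipe
java       3079  3365    root   84r     FIFO                0,8       0t0   25990883 pipe
java       3079  3365    root   85w     FIFO                0,8       0t0   25990883 pipe
"""

for pid, fd, node in unique_descriptors(sample.splitlines()):
    print(pid, fd, node)
```

If the rows really are per-thread duplicates, the pipe behind inode 25990883 accounts for just two descriptors in that process, one read end and one write end.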

@joschi
Contributor

joschi commented Nov 28, 2017

@yoster0520 I think you're mixing up things.

In your first post, you wrote that you're using CentOS 6.5. Now you say you're using CentOS 7, but the output you've posted shows that you're using the Graylog omnibus package (e.g. in the OVA or AMI) which only runs on Ubuntu Linux 14.04…

@gruselglatz

gruselglatz commented Nov 28, 2017

Production setup here, with an uptime of 5 days:

lsof -n -P | grep graylog | wc -l == 3385072

ls /proc/17936/fd/ | wc -l == 2687 open files (17936 == graylog-server process)

If the output is needed, I can send it via PM to @joschi.

I have many of these:

java      17936           graylog  txt       REG              253,0      7304  101249582 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.144-0.b01.el7_4.x86_64/jre/bin/java (deleted)
java      17936           graylog  DEL       REG              253,0               278551 /var/lib/graylog-server/journal/messagejournal-0/00000000060383303881.index.deleted
java      17936           graylog  DEL       REG              253,0               278544 /var/lib/graylog-server/journal/messagejournal-0/00000000060383172115.index.deleted
java      17936           graylog  DEL       REG              253,0               278549 /var/lib/graylog-server/journal/messagejournal-0/00000000060383033477.index.deleted
java      17936           graylog  mem       REG              253,0    164184     783834 /var/lib/graylog-server/journal/messagejournal-0/00000000060433169807.index
java      17936           graylog  DEL       REG              253,0               783832 /var/lib/graylog-server/journal/messagejournal-0/00000000060432980611.index.deleted
java      17936           graylog  mem       REG              253,0   1048576     783836 /var/lib/graylog-server/journal/messagejournal-0/00000000060433369336.index
java      17936           graylog  DEL       REG              253,0               409350 /var/lib/graylog-server/journal/messagejournal-0/00000000060432810715.index.deleted
java      17936           graylog  DEL       REG              253,0               778089 /var/lib/graylog-server/journal/messagejournal-0/00000000060432660350.index.deleted
java      17936           graylog  DEL       REG              253,0               121514 /var/lib/graylog-server/journal/messagejournal-0/00000000060432502785.index.deleted
java      17936           graylog  DEL       REG              253,0               779447 /var/lib/graylog-server/journal/messagejournal-0/00000000060432362377.index.deleted
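
The two counts above are probably not measuring the same thing. lsof rows with `mem`, `DEL`, or `txt` in the FD column are memory mappings and program text rather than open file descriptors, and lsof may also print one row per thread, whereas `/proc/<pid>/fd` holds one entry per open descriptor. A minimal sketch that summarizes those entries by kind, assuming a Linux `/proc` layout; `classify` and `fd_summary` are hypothetical helper names:

```python
import os
import re
from collections import Counter

# Symlink targets such as "pipe:[25990879]" or "anon_inode:[eventpoll]".
_KIND = re.compile(r"^(pipe|socket|anon_inode):")


def classify(target):
    """Map a /proc/<pid>/fd symlink target to a coarse kind."""
    match = _KIND.match(target)
    if match:
        return match.group(1)
    if target.endswith(" (deleted)"):
        return "deleted file"
    return "file"


def fd_summary(pid, proc_root="/proc"):
    """Count the open descriptors of `pid`, grouped by kind."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    kinds = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while scanning
        kinds[classify(target)] += 1
    return kinds
```

Calling `fd_summary(17936)` as root or as the process owner would return one count per kind for the graylog-server process mentioned above, which makes it easier to see whether pipes or regular files dominate.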

@joschi
Contributor

joschi commented Nov 28, 2017

@gruselglatz Please create a new issue for that and attach the full output as a text file (it doesn't include sensitive information, so this should be safe).

You're referring to regular (deleted) files, while @yoster0520 referred to pipes.

@gruselglatz

@joschi I also have dozens of pipes... 3,382,385 to be exact.

@joschi
Contributor

joschi commented Nov 29, 2017

@gruselglatz Then please post the full output of the commands I've previously mentioned in this issue.

#4375 (comment)

@gruselglatz

@joschi On CentOS 7 only lsof 4.87 is available, so there is no +E option.

To provide what you need, please give me an equivalent command that I can run.
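
One possible substitute when `lsof +E` is unavailable is to read the `/proc/<pid>/fd` symlinks directly, since their targets carry the pipe inode (for example `pipe:[25990883]` in the listing earlier in this thread). This does not reproduce the `+E` output format; it only lists which processes hold a descriptor on a given pipe. A minimal sketch, assuming a Linux `/proc` layout and enough privileges to read other processes' fd directories; `pipe_endpoints` is a hypothetical helper name:

```python
import os
import re

_PIPE = re.compile(r"^pipe:\[(\d+)\]$")


def pipe_endpoints(inode, proc_root="/proc"):
    """Return sorted (pid, fd) pairs for every descriptor on pipe `inode`."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            names = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for name in names:
            try:
                target = os.readlink(os.path.join(fd_dir, name))
            except OSError:
                continue  # descriptor was closed while scanning
            match = _PIPE.match(target)
            if match and int(match.group(1)) == inode:
                found.append((int(entry), int(name)))
    return sorted(found)
```

`pipe_endpoints(25990883)` would list each (PID, FD) pair on that pipe. In `ls -l /proc/<pid>/fd`, the permission bits then show the direction: `r-x` marks a read end and `-wx` a write end, as in the listing posted earlier.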

@no-response no-response bot closed this as completed Jan 8, 2018
@Graylog2 Graylog2 deleted a comment from no-response bot Jan 8, 2018
@dennisoelkers dennisoelkers reopened this Jan 8, 2018
@no-response no-response bot closed this as completed Jan 8, 2018
@Graylog2 Graylog2 deleted a comment from no-response bot Jan 8, 2018
@joschi joschi reopened this Jan 8, 2018
@no-response no-response bot closed this as completed Jan 17, 2018
@no-response

no-response bot commented Jan 17, 2018

This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.
