
Monitoring for file descriptor exhaustion #24

Closed
xiezhenouc opened this issue Dec 9, 2020 · 5 comments

Comments

@xiezhenouc
Contributor

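Recently, some of our high-traffic modules have occasionally failed with "too many open files" errors because their file descriptors were exhausted. One cause is the file descriptor limit on the physical machine. There is a second cause we have not identified yet: the problem is intermittent, so we have never managed to capture the scene while it is happening. Could this tool help us capture that scene and pinpoint the actual problem?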
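For the first cause (the per-process descriptor limit), a minimal Go sketch for reading the current limit is shown here. It assumes a Unix-like system where `syscall.Getrlimit` and `RLIMIT_NOFILE` are available; the helper name `nofileLimit` is made up for illustration and is not part of this tool.

```go
package main

import (
	"fmt"
	"syscall"
)

// nofileLimit returns the soft and hard limits on open file descriptors
// for the current process (RLIMIT_NOFILE). Hypothetical helper name.
func nofileLimit() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	return uint64(rl.Cur), uint64(rl.Max), nil
}

func main() {
	soft, hard, err := nofileLimit()
	if err != nil {
		fmt.Println("getrlimit:", err)
		return
	}
	fmt.Printf("RLIMIT_NOFILE soft=%d hard=%d\n", soft, hard)
}
```

Comparing the soft limit with the observed descriptor count can help tell a limit that is simply too low apart from a real leak.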

@cch123
Collaborator

cch123 commented Dec 11, 2020

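A simple approach might be to collect the fd count first:
https://stackoverflow.com/questions/21752067/counting-open-files-per-process

Then, when the fd count spikes abnormally, capture the details with lsof -p pid.

That said, collecting the fd count seems fairly expensive.

Once I have split out the collection functions, we should be able to give this a try.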
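The idea above (sample the fd count, then run `lsof -p` when it jumps) could be sketched in Go roughly as follows. This is only an illustration under stated assumptions, not this tool's implementation: it assumes Linux, where `/proc/self/fd` lists the process's open descriptors, and that the `lsof` binary is on `PATH`. The names `countFDs`, `isSpike`, `lsofSelf` and the threshold values are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
)

// countFDs returns the number of entries in dir. On Linux, passing
// "/proc/self/fd" yields the number of descriptors open in this process
// (including the one opened to read the directory itself).
func countFDs(dir string) (int, error) {
	f, err := os.Open(dir)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	names, err := f.Readdirnames(-1)
	if err != nil {
		return 0, err
	}
	return len(names), nil
}

// isSpike reports whether cur is at least floor and exceeds the average
// of history by more than diffPercent percent.
func isSpike(history []int, cur, floor, diffPercent int) bool {
	if len(history) == 0 || cur < floor {
		return false
	}
	sum := 0
	for _, v := range history {
		sum += v
	}
	avg := sum / len(history)
	return cur*100 > avg*(100+diffPercent)
}

// lsofSelf runs `lsof -p <pid>` for the current process and returns its
// combined output. It assumes lsof is installed.
func lsofSelf() ([]byte, error) {
	return exec.Command("lsof", "-p", strconv.Itoa(os.Getpid())).CombinedOutput()
}

func main() {
	history := []int{10, 12, 11} // hypothetical earlier samples
	cur, err := countFDs("/proc/self/fd")
	if err != nil {
		fmt.Println("count fds:", err)
		return
	}
	fmt.Println("open fds:", cur)
	if isSpike(history, cur, 100, 50) { // hypothetical thresholds
		out, err := lsofSelf()
		if err != nil {
			fmt.Println("lsof:", err)
			return
		}
		os.Stdout.Write(out)
	}
}
```

Listing `/proc/self/fd` costs time proportional to the number of open descriptors, which is one reason sampling can be slow on processes holding very many connections; a longer sampling interval is one way to trade freshness for cost.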

@xiezhenouc
Contributor Author

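"That said, collecting the fd count seems fairly expensive."

Right. We run high-concurrency modules in production, deployed on physical machines, and collecting the fd count is quite slow.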

@taoyuanyuan
Contributor

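@xiezhenouc From your description this is a high-traffic scenario. If the exhaustion is caused by socket file descriptors from the number of connections, a goroutine dump can capture the scene.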
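As a rough illustration of what a goroutine dump involves, the standard library's `runtime/pprof` package can write out every goroutine's stack. In a typical Go server each connection is served by its own goroutine, so the stacks usually show which code paths are holding sockets. The helper name `goroutineDump` is hypothetical, and this sketch is not necessarily how this tool produces its dumps.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// goroutineDump returns a text dump of all goroutine stacks.
// debug=2 prints one stack per goroutine, in the same format the runtime
// uses when a program dies from an unrecovered panic.
func goroutineDump() (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	dump, err := goroutineDump()
	if err != nil {
		fmt.Println("goroutine dump:", err)
		return
	}
	fmt.Print(dump)
}
```

A large number of goroutines parked in the same network read or dial call is a common sign that connections are being opened faster than they are closed.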

@xiezhenouc
Contributor Author

Got it, thank you!

@xiezhenouc
Contributor Author

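"@xiezhenouc From your description this is a high-traffic scenario. If the exhaustion is caused by socket file descriptors from the number of connections, a goroutine dump can capture the scene."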


3 participants