Added adapter to monitor server disk, network and process statistics #31

Merged
merged 14 commits into from
Feb 12, 2019

Conversation

timcnicholls
Collaborator

This is an updated PR for the original #26 made by @ajgdls from the dls-controls fork.

Adds macOS compatibility and multi-process reporting.

@coveralls

coveralls commented Jan 28, 2019

Coverage Status

Coverage decreased (-0.2%) to 99.53% when pulling c452957 on cpu-stats into 13dbed7 on master.

@ajgdls
Contributor

ajgdls commented Feb 6, 2019

```
[D 190206 15:25:48 system_status:70] {u'status': {'process': {'stFrameProcessor1.sh': {29514: {'memory_shared': None, 'cpu_percent': 0.0, 'memory_vms': 1300082688, 'cpu_affinity': [0, 1, 2, 3, 4, 5, 6, 7], 'memory_rss': 76406784, 'memory_percent': 0.46089974528704436}}, 'stFrameReceiver1.sh': {16899: {'memory_shared': None, 'cpu_percent': 0.0, 'memory_vms': 512155648, 'cpu_affinity': [0, 1, 2, 3, 4, 5, 6, 7], 'memory_rss': 56610816, 'memory_percent': 0.34148683283007614}}}, 'disk': {'_home_gnx91527': {'total': 3958241886208, 'percent': 74.6, 'free': 1004021710848, 'used': 2954220175360}}, 'network': {'enp0s25': {'packets_sent': 239923067, 'packets_recv': 410145521, 'bytes_recv': 394827267307, 'dropin': 0, 'errin': 0, 'dropout': 0, 'bytes_sent': 64546349961, 'errout': 0}}}}
[D 190206 15:25:48 server:87] 200 GET /api/0.1/stats/status/ (172.23.253.54) 2.17ms
[E 190206 16:22:35 system_status:228] process no longer exists (pid=12585)
Traceback (most recent call last):
  File "/home/gnx91527/work/statistics/odin-control/prefix/lib/python2.7/site-packages/odin_control-0.3.1_28.ge5b172d-py2.7.egg/odin/adapters/system_status.py", line 225, in update_loop
    self.monitor()
  File "/home/gnx91527/work/statistics/odin-control/prefix/lib/python2.7/site-packages/odin_control-0.3.1_28.ge5b172d-py2.7.egg/odin/adapters/system_status.py", line 257, in monitor
    self.monitor_processes()
  File "/home/gnx91527/work/statistics/odin-control/prefix/lib/python2.7/site-packages/odin_control-0.3.1_28.ge5b172d-py2.7.egg/odin/adapters/system_status.py", line 296, in monitor_processes
    self._processes[process_name] = self.find_processes(process_name)
  File "/home/gnx91527/work/statistics/odin-control/prefix/lib/python2.7/site-packages/odin_control-0.3.1_28.ge5b172d-py2.7.egg/odin/adapters/system_status.py", line 341, in find_processes
    parents = self.find_processes_by_name(process_name)
  File "/home/gnx91527/work/statistics/odin-control/prefix/lib/python2.7/site-packages/odin_control-0.3.1_28.ge5b172d-py2.7.egg/odin/adapters/system_status.py", line 364, in find_processes_by_name
    if name in proc.name():
  File "/dls_sw/prod/tools/RHEL6-x86_64/psutil/2-1-3/build_20141201-130925_uxj42447_tools_psutil_2-1-3/psutil/__init__.py", line 490, in name
    name = self._proc.name()
  File "/dls_sw/prod/tools/RHEL6-x86_64/psutil/2-1-3/build_20141201-130925_uxj42447_tools_psutil_2-1-3/psutil/_pslinux.py", line 707, in wrapper
    raise NoSuchProcess(self.pid, self._name)
NoSuchProcess: process no longer exists (pid=12585)
[D 190206 16:35:42 system_status:70] {u'status': {'process': {'stFrameProcessor1.sh': {29514: {'memory_shared': None, 'cpu_percent': 0.0, 'memory_vms': 1300082688, 'cpu_affinity': [0, 1, 2, 3, 4, 5, 6, 7], 'memory_rss': 76406784, 'memory_percent': 0.46089974528704436}}, 'stFrameReceiver1.sh': {16899: {'memory_shared': None, 'cpu_percent': 0.0, 'memory_vms': 512155648, 'cpu_affinity': [0, 1, 2, 3, 4, 5, 6, 7], 'memory_rss': 56610816, 'memory_percent': 0.34148683283007614}}}, 'disk': {'_home_gnx91527': {'total': 3958241886208, 'percent': 74.5, 'free': 1008443719680, 'used': 2949798166528}}, 'network': {'enp0s25': {'packets_sent': 240041886, 'packets_recv': 410490131, 'bytes_recv': 395073455709, 'dropin': 0, 'errin': 0, 'dropout': 0, 'bytes_sent': 64565639557, 'errout': 0}}}}
[D 190206 16:35:42 server:87] 200 GET /api/0.1/stats/status/ (172.23.253.54) 2.10ms
```

@ajgdls
Contributor

ajgdls commented Feb 6, 2019

This error was raised after leaving the system running for about an hour. The server continued to function and I could still query the information correctly, so I'm not sure exactly which process had gone away, but I thought I should raise it.

@timcnicholls
Collaborator Author

Thanks @ajgdls. That's annoying; it's supposed to handle processes going away cleanly 😬 It looks like the process_iter() call returns a process iterator, but there's then a race condition if a process in that list dies while its name is being matched. I'll have to wrap the for proc in ... loop in a try/except block. Will look into it.

@timcnicholls
Collaborator Author

Hi @ajgdls, I managed to spot the same error when testing the proxy adapter with @ANeaves this afternoon. I can't explicitly provoke it, as it relies on a process terminating while the psutil.process_iter() generator is being iterated, but I have put in a trap for psutil.NoSuchProcess in find_processes_by_name() and run it for some time. It would be good if you could run this version and check for stability, thanks.

@ajgdls
Contributor

ajgdls commented Feb 7, 2019

@timcnicholls sure, I'll set them off tomorrow and leave them running for several hours.

@ajgdls
Contributor

ajgdls commented Feb 11, 2019

@timcnicholls I have seen no repeat of the issue.

timcnicholls merged commit dca3719 into master on Feb 12, 2019
ghost removed the in progress label on Feb 12, 2019
timcnicholls deleted the cpu-stats branch on February 12, 2019 08:48
3 participants