Diamond Ceph Stats not received in calamari #384
Note: I've built the latest stable deb packages from git via Vagrant and still see the same issue.
Also, on the client I've matched the Salt versions, as recommended.
Restarting Diamond on the server shows the following:
root@ceph1:~# tail -f /var/log/diamond/diamond.log
[2016-01-24 04:19:21,039] [MainThread] pysnmp.entity.rfc3413.oneliner.cmdgen failed to load
[2016-01-24 04:19:21,043] [MainThread] pysnmp.entity.rfc3413.oneliner.cmdgen failed to load
[2016-01-24 04:19:21,044] [MainThread] pysnmp.entity.rfc3413.oneliner.cmdgen failed to load
[2016-01-24 04:19:21,046] [MainThread] pysnmp.entity.rfc3413.oneliner.cmdgen failed to load
[2016-01-24 04:19:21,056] [MainThread] pysnmp.entity.rfc3413.oneliner.cmdgen failed to load
[2016-01-24 04:19:21,074] [MainThread] pysnmp.entity.rfc3413.oneliner.cmdgen failed to load
[2016-01-24 04:19:22,252] [Thread-1] Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/diamond/collector.py", line 412, in _run
    self.collect()
  File "/usr/share/diamond/collectors/ceph/ceph.py", line 464, in collect
    self._collect_service_stats(path)
  File "/usr/share/diamond/collectors/ceph/ceph.py", line 450, in _collect_service_stats
    self._publish_stats(counter_prefix, stats, schema, GlobalName)
  File "/usr/share/diamond/collectors/ceph/ceph.py", line 305, in _publish_stats
    assert path[-1] == 'type'
AssertionError
^C
root@ceph1:~# md5sum /usr/lib/pymodules/python2.7/diamond/collector.py
08bb05a483fa3d1d64c0ebf690259a05 /usr/lib/pymodules/python2.7/diamond/collector.py
root@ceph1:~# md5sum /usr/share/diamond/collectors/ceph/ceph.py
aeb3915f8ac7fdea61495805d2c99f33 /usr/share/diamond/collectors/ceph/ceph.py
root@ceph1:~#
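The traceback shows the collector asserting that every flattened schema path ends in 'type'. One plausible reading, not confirmed in this thread, is that the perf schema returned by the installed Ceph release carries leaf keys the collector does not expect. A minimal sketch of that failure mode follows; `flatten`, `metric_names`, and both sample schemas are hypothetical stand-ins, not the code in ceph.py:

```python
def flatten(node, path=()):
    """Yield (path, value) for every leaf of a nested dict."""
    for key in sorted(node):
        value = node[key]
        if isinstance(value, dict):
            for item in flatten(value, path + (key,)):
                yield item
        else:
            yield path + (key,), value


def metric_names(schema):
    """Build dotted metric names, insisting that every leaf key is 'type'."""
    names = []
    for path, _ in flatten(schema):
        # Same check as the assertion in the traceback.
        assert path[-1] == 'type'
        names.append('.'.join(path[:-1]))
    return names


# Hypothetical schemas: the second has one extra leaf key per counter.
old_style = {"mon": {"num_sessions": {"type": 2}}}
new_style = {"mon": {"num_sessions": {"type": 2, "description": "Open sessions"}}}

print(metric_names(old_style))  # ['mon.num_sessions']
try:
    metric_names(new_style)
except AssertionError:
    print("AssertionError: found a leaf key other than 'type'")
```

If this is what is happening, the fix belongs in how the collector walks the schema, which would match the observation that a different ceph.py stops the error.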
Looking at calamari.log, I can see it's requesting Graphite metric data that is missing:
root@calamari:/var/log/calamari# tail -f calamari.log
2016-01-23 22:44:54,040 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_space
2016-01-23 22:44:54,041 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_avail
2016-01-23 22:44:58,560 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.pool.0.num_objects
2016-01-23 22:44:58,561 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.pool.0.num_bytes
2016-01-23 22:44:58,835 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_used_bytes
2016-01-23 22:44:58,835 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_used
2016-01-23 22:44:58,836 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_space
2016-01-23 22:44:58,836 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_avail
2016-01-23 22:44:58,893 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.pool.0.num_objects
2016-01-23 22:44:58,894 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.pool.0.num_bytes
2016-01-23 22:45:14,440 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_used_bytes
2016-01-23 22:45:14,441 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_used
2016-01-23 22:45:14,442 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_space
2016-01-23 22:45:14,442 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_avail
2016-01-23 22:45:18,373 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.pool.0.num_objects
2016-01-23 22:45:18,377 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.pool.0.num_bytes
2016-01-23 22:45:18,878 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.pool.0.num_objects
2016-01-23 22:45:18,879 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.pool.0.num_bytes
2016-01-23 22:45:19,269 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_used_bytes
2016-01-23 22:45:19,270 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_used
2016-01-23 22:45:19,275 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_space
2016-01-23 22:45:19,276 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_avail
^C
root@calamari:/var/log/calamari#
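The same handful of metrics repeats throughout that log, so it can help to reduce it to a list of distinct missing names. A small sketch, assuming the "No graphite data for <metric>" message format shown above; the function name `missing_metrics` is made up for illustration:

```python
import re
from collections import Counter

# Matches the message format seen in calamari.log above.
NO_DATA = re.compile(r"No graphite data for (\S+)")


def missing_metrics(lines):
    """Count how often each metric is reported as having no Graphite data."""
    counts = Counter()
    for line in lines:
        match = NO_DATA.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three lines copied from the log excerpt above.
sample = [
    "2016-01-23 22:44:54,040 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_space",
    "2016-01-23 22:44:58,560 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.pool.0.num_objects",
    "2016-01-23 22:44:58,836 - metric_access - django.request No graphite data for ceph.cluster.85895b09-7e2d-4290-b053-e7a71f8b5e08.df.total_space",
]

for name, count in sorted(missing_metrics(sample).items()):
    print(count, name)
```

To run it against the real file, pass `open("/var/log/calamari/calamari.log")` instead of `sample`. In the excerpt above, every missing name is under `ceph.cluster.<fsid>.df` or `ceph.cluster.<fsid>.pool`.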
I can see the asok files are there:
root@ceph1:/var/run/ceph# ls -la
total 0
drwxrwx--- 2 ceph ceph 80 Feb 1 10:51 .
drwxr-xr-x 18 root root 640 Feb 1 10:52 ..
srwxr-xr-x 1 ceph ceph 0 Feb 1 10:51 ceph-mon.ceph1.asok
srwxr-xr-x 1 root root 0 Jan 27 15:08 ceph-osd.0.asok
root@ceph1:/var/run/ceph#
Running Diamond in debug mode shows the following:
[2016-02-01 10:55:23,774] [Thread-1] Collecting data from: NetworkCollector
[2016-02-01 10:56:23,484] [Thread-1] Collecting data from: CPUCollector
[2016-02-01 10:56:23,487] [Thread-6] Collecting data from: MemoryCollector
[2016-02-01 10:56:23,489] [Thread-7] Collecting data from: SockstatCollector
[2016-02-01 10:56:23,768] [Thread-1] Collecting data from: CephCollector
[2016-02-01 10:56:23,768] [Thread-1] gathering service stats for /var/run/ceph/ceph-mon.ceph1.asok
[2016-02-01 10:56:24,094] [Thread-1] Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/diamond/collector.py", line 412, in _run
    self.collect()
  File "/usr/share/diamond/collectors/ceph/ceph.py", line 464, in collect
    self._collect_service_stats(path)
  File "/usr/share/diamond/collectors/ceph/ceph.py", line 450, in _collect_service_stats
    self._publish_stats(counter_prefix, stats, schema, GlobalName)
  File "/usr/share/diamond/collectors/ceph/ceph.py", line 305, in _publish_stats
    assert path[-1] == 'type'
AssertionError
[2016-02-01 10:56:24,096] [Thread-8] Collecting data from: LoadAverageCollector
[2016-02-01 10:56:24,098] [Thread-1] Collecting data from: VMStatCollector
[2016-02-01 10:56:24,099] [Thread-1] Collecting data from: DiskUsageCollector
[2016-02-01 10:56:24,104] [Thread-9] Collecting data from: DiskSpaceCollector
Checking the md5 of the file returns the following:
root@ceph1:/var/run/ceph# md5sum /usr/share/diamond/collectors/ceph/ceph.py
aeb3915f8ac7fdea61495805d2c99f33 /usr/share/diamond/collectors/ceph/ceph.py
root@ceph1:/var/run/ceph#
I've found that replacing the ceph.py file with the one below stops the Diamond error.
Diamond version 3.4.67
https://raw.githubusercontent.com/BrightcoveOS/Diamond/master/src/collectors/ceph/ceph.py
root@ceph1:/usr/share/diamond/collectors/ceph# md5sum ceph.py
13ac74ce0df39a5def879cb5fc530015 ceph.py
[2016-02-01 11:14:33,116] [Thread-42] Collecting data from: MemoryCollector
[2016-02-01 11:14:33,117] [Thread-1] Collecting data from: CPUCollector
[2016-02-01 11:14:33,123] [Thread-43] Collecting data from: SockstatCollector
[2016-02-01 11:14:35,453] [Thread-1] Collecting data from: CephCollector
[2016-02-01 11:14:35,454] [Thread-1] checking /var/run/ceph/ceph-mon.ceph1.asok
[2016-02-01 11:14:35,552] [Thread-1] checking /var/run/ceph/ceph-osd.0.asok
[2016-02-01 11:14:35,685] [Thread-44] Collecting data from: LoadAverageCollector
[2016-02-01 11:14:35,686] [Thread-1] Collecting data from: VMStatCollector
[2016-02-01 11:14:35,687] [Thread-1] Collecting data from: DiskUsageCollector
[2016-02-01 11:14:35,692] [Thread-45] Collecting data from: DiskSpaceCollector
But after all that, it's still not working.
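Since the collector logs that it is reading the admin sockets, a possible next step is to look at what those sockets return. The sketch below assumes the `ceph` CLI is on the PATH and uses `ceph --admin-daemon <socket> perf schema`. The helper names are invented for illustration, and nothing here is taken from either version of ceph.py:

```python
import json
import subprocess


def perf_schema_command(socket_path):
    """Build the CLI call that asks a daemon's admin socket for its perf schema."""
    return ["ceph", "--admin-daemon", socket_path, "perf", "schema"]


def load_perf_schema(socket_path):
    """Run the command and parse its JSON output (needs a live Ceph daemon)."""
    output = subprocess.check_output(perf_schema_command(socket_path))
    return json.loads(output)


def leaf_keys(node):
    """Collect the set of key names that appear at the leaves of a nested dict."""
    keys = set()
    for key, value in node.items():
        if isinstance(value, dict):
            keys |= leaf_keys(value)
        else:
            keys.add(key)
    return keys
```

On the Ceph node, `leaf_keys(load_perf_schema("/var/run/ceph/ceph-mon.ceph1.asok"))` would list the leaf keys the daemon reports. Running `ceph --admin-daemon ... perf dump` by hand in the same way shows whether the counters themselves are readable.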
OK, thanks to a reply on the mailing list, I've downgraded to Hammer and now everything is working.
I've built the latest Calamari server, Diamond, and the new Calamari clients (now called Romana). Feel free to use them on your Trusty deployments:
http://bladeservers.net.au/calamari-server_1.3.1.1-105-g79c8df2-1trusty_amd64.deb
@drolfe
I've been waiting to get this upstream: python-diamond/Diamond#321
Everything is working except the Ceph and pool graph stats in the Calamari GUI; the host stats are working fine.
Let me know what more I should be checking
Regards, Daniel