Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mgr/telegraf: Use Python generator and catch OSError #22418

Merged
merged 2 commits into from Jun 27, 2018

Conversation

wido
Copy link
Member

@wido wido commented Jun 5, 2018

I saw two problems with the Telegraf module for the Mgr.

The OSError not being catched was the main issue as this will be thrown after module startup if the Telegraf socket does not exist (yet).

Performance wise I've also seen that using generators improves the performance as we can streamline how data is send to Telegraf. Instead of collecting everything in lists first we now use a pipeline where we fetch from the Manager and then directly send it to Telegraf.

@wido wido added the mgr label Jun 5, 2018
@wido wido requested review from tchaikov and jcsp June 5, 2018 11:41
@joscollin
Copy link
Member

retest this please

1 similar comment
@wido
Copy link
Member Author

wido commented Jun 6, 2018

retest this please

@wido
Copy link
Member Author

wido commented Jun 6, 2018

Ping @hydro-b :)

Copy link
Contributor

@tchaikov tchaikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could put something like

try:
  next(self.gather_measurements())
except StopIteration:
  raise RuntimeError('No measurements found')

at https://github.com/ceph/ceph/pull/22418/files#diff-2d45266c801d406eac40c4bbce5afe5eR287

otherwise we have http://pulpito.ceph.com/kchai-2018-06-10_02:09:19-rados-wip-kefu-testing-2018-06-09-2340-distro-basic-smithi/2648089/

@wido
Copy link
Member Author

wido commented Jun 11, 2018

Done!

@hydro-b
Copy link
Contributor

hydro-b commented Jun 11, 2018 via email

@hydro-b
Copy link
Contributor

hydro-b commented Jun 12, 2018

It still works @wido @tchaikov

measurements = self.gather_measurements()
if len(measurements) == 0:
raise RuntimeError('No measurements found')
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry: IndentationError: expected an indented block (module.py, line 286)

if len(measurements) == 0:
raise RuntimeError('No measurements found')
try:
next(self.gather_measurements())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling next() here will only result in evaluating the first element (I think?) -- maybe this should be list() instead to ensure hitting all the entries in the iterators is working

Instead of collection all data in a list prior to sending use
generators inside Python to send out data as soon as the Mgr
has supplied us with the data.

On larger systems with thousands of OSDs this can reduce the
amount of time it takes to collect and send the data.

Telegraf also likes this more as we send less data in one go.

Signed-off-by: Wido den Hollander <wido@42on.com>
A OSError will be thrown (and not catched) if the module starts for
the first time and the UNIX socket of the Telegraf Agent does not
exist (yet).

This will render the module useless because it can not accept commands
for configuration at that point.

If we catch an exception, simply log it but do not raise it.

Signed-off-by: Wido den Hollander <wido@42on.com>
@wido
Copy link
Member Author

wido commented Jun 15, 2018

@jcsp I'm casting it to a list now and then check if it's empty. This way all iterators are called to ensure we try to gather all data.

@tchaikov
Copy link
Contributor

@tchaikov tchaikov merged commit cc758a7 into ceph:master Jun 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants