Improve ncd status logging #13

Merged
merged 7 commits into from Sep 4, 2014

Conversation

jouvin
Contributor

@jouvin jouvin commented Sep 3, 2014

Fixes #12.

In addition, restore the original formatting of the ncd status line in /var/log/ncm-cdispd to avoid breaking existing parsers for this file (I know that line is parsed to raise attention when a component has failed).

Still a work in progress: I would like to add the list of failed components as described in #12.

@jouvin jouvin added this to the 14.8 milestone Sep 3, 2014
(modification may break parsing by scripts)
- ncd output streamed if >= verbose level
- Redundant debug level 3 ncd output suppressed
@@ -563,7 +563,7 @@ sub launch_ncd {

my $result = 0; # Assume success

my @cmd = ( '/usr/sbin/ncm-ncd', '--configure' );
my $p = CAF::Process->new(['/usr/sbin/ncm-ncd', '--configure'] , log => $this_app );
Member

maybe make /usr/sbin/ncm-ncd a constant or readonly and use it here and below?

@jouvin
Contributor Author

jouvin commented Sep 3, 2014

As far as I am concerned, the implementation is complete and now ready for merging... after review!

return;
}

my $comp_state_dir = $this_app->option('state');
Member

so i wasn't aware of the state dir, but i just looked at the default on one of our nodes

[root@node2160 run]# ls -lrt /var/run/quattor-components/
total 4
-rw-r--r-- 1 root root 0 Jul 18 20:31 download
-rw-r--r-- 1 root root 1 Aug 27 10:57 spma
[root@node2160 run]# grep '' /var/run/quattor-components/
[root@node2160 run]# grep '' /var/run/quattor-components/*
/var/run/quattor-components/spma:

so this contains 2 files, one empty and one with just a newline (not really useful); but they have very different timestamps. we should make sure that this dir is cleaned up etc etc; the documentation suggests it's only cleaned up when components become inactive, so we'd need some support for "recent" failures.

Contributor Author

Maybe we should fix the documentation, but on my test system a file is left by ncm-ncd only for failed components (components with errors, not warnings), and the file is removed after a successful execution of the component.

Contributor Author

Note that the file is created when ncm-ncd determines that the component should be run, and it stays empty until the component actually runs and encounters an error.
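
The state-file behaviour described in this thread can be inspected with a small shell helper. This is only a sketch: the function name `list_component_states` is hypothetical, the default directory is the one shown in the listing above, and the labels it prints are illustrative rather than anything ncm-ncd or ncm-cdispd logs.

```shell
# Sketch only: report every regular file found in a component state directory.
# Assumption: an empty file means no error text was recorded for the component;
# a non-empty file holds whatever was written there on failure.
list_component_states() {
    state_dir=${1:-/var/run/quattor-components}
    for f in "$state_dir"/*; do
        [ -f "$f" ] || continue      # also skips the literal glob when nothing matches
        name=${f##*/}                # strip the directory part
        if [ -s "$f" ]; then
            echo "$name: non-empty state file"
        else
            echo "$name: empty state file"
        fi
    done
}
```

Run against the directory listed earlier, `list_component_states /var/run/quattor-components` would report `download` as empty and `spma` (one byte, a newline) as non-empty.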

Member

so what i have is some leftovers i guess. well, if it causes issues, i'll open an issue with ncm-ncd.

Contributor Author

The only issue you'll have with leftovers is that they will appear as failed components even if they are not... but only when there are other failed components (ncm-ncd returns a non-zero exit status). When ncm-ncd succeeds, there is no attempt to look at the state directory.
If all your components ran successfully, for the time being I'd suggest cleaning up this directory manually.

Contributor Author

I confirm that ncm-ncd removes state files only for components it ran successfully. If some other file is created in the state directory, or a state file is left from a component that is no longer part of the configuration, ncm-ncd will never remove it. It has to be cleaned up manually.
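
Since leftover state files have to be cleaned up by hand, a dry-run helper can at least show which files are candidates. This is a sketch under assumptions: the function name `find_stale_states` is hypothetical, and the list of active components is supplied by the caller rather than read from the node's configuration. It only prints paths and never deletes anything.

```shell
# Sketch only: print (never delete) state files whose name matches none of
# the component names given as arguments.
# Usage: find_stale_states STATE_DIR [active-component ...]
find_stale_states() {
    state_dir=$1
    shift
    for f in "$state_dir"/*; do
        [ -f "$f" ] || continue      # also skips the literal glob when nothing matches
        name=${f##*/}
        stale=yes
        for active in "$@"; do
            if [ "$name" = "$active" ]; then
                stale=no
            fi
        done
        if [ "$stale" = yes ]; then
            echo "$f"
        fi
    done
}
```

For example, with the two files from the earlier listing, `find_stale_states /var/run/quattor-components spma` would print the path of the `download` file, which can then be reviewed and removed manually.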

@stdweird
Member

stdweird commented Sep 3, 2014

test this please

@stdweird
Member

stdweird commented Sep 3, 2014

ok to test

@stdweird
Member

stdweird commented Sep 3, 2014

test this please

@hpcugentbot

Merged build triggered.

@hpcugentbot

Merged build started.

@hpcugentbot

Merged build finished.

@stdweird
Member

stdweird commented Sep 3, 2014

retest this please

@hpcugentbot

Merged build triggered.

@hpcugentbot

Merged build started.

@hpcugentbot

Merged build finished.

@jouvin
Contributor Author

jouvin commented Sep 4, 2014

The last commit addresses @stdweird's remark about possibly empty configuration module state files. If that happens, the logged message states it explicitly.

jrha added a commit that referenced this pull request Sep 4, 2014
Improve ncd status logging
@jrha jrha merged commit 9e79c0c into quattor:master Sep 4, 2014
Labels
None yet
Development

Successfully merging this pull request may close these issues.

14.8.0-rc3: new cdisdp logging doesn't allow easy identification of failed components
4 participants