Low: Bump some log messages to notice #205

inouekazu · 2013-01-10T06:00:45Z

This request is a proposal patch.

Since the loglevel of some helpful log messages is info or debug, these are not outputted via syslog.
Therefore, bump up loglevel of these helpful log messeges to notice.

beekhof · 2013-01-10T09:39:05Z

I find it interesting that after being crucified for a decade for logging too much, the moment I turn things down people want it all back on again.

The current policy is that messages like those you want turned up are not appropriate for syslog (which is why they are at 'info' or lower). People that want this level of detail are encouraged to specify a logfile in either corosync or /etc/sysconfig/pacemaker

I will take another look, but most of these changes will probably get NACK'd.
Especially in the PE which intentionally only logs the changes that will result from the current transition.

inouekazu · 2013-01-21T09:11:25Z

I expound on necessity anew about each message.

In pacemaker-1.0, I'm monitoring the message of the following events during production run.
So I need to monitor these events (messages) in pacemaker-1.1 similarly. However, because a large quantity of messages are logged when PCMK_debugfile is used, I don't want to use it.

operation start of a resource
The "operation completion" message of a resource is notice. This is subject of surveillance.

crmd[10158]:   notice: process_lrm_event: LRM operation prmDummy_start_0 (call=8, rc=0, cib-update=55, confirmed=true) ok
crmd[10158]:   notice: process_lrm_event: LRM operation prmDummy_start_0 (call=19, rc=1, cib-update=61, confirmed=true) unknown error

However, there is no notice message of an operation start of a resource.
I need the notice message of an "operation start" including a resource name and operation, so I want to change this message into notice.

crmd[10158]:    debug: do_lrm_rsc_op: Performing key=7:1:0:c7dfa3fe-0746-461d-b1b0-f9b6f4cf1d56 op=prmDummy_start_0

inouekazu@a727333#L1L1665

change of node status
Since I need the notice message which shows the status of a node, I want to change these messages into notice.

"start" of service stop

crmd[11520]:     info: handle_shutdown_request: Creating shutdown request for dev1 (state=S_IDLE)

inouekazu@a727333#L2L884

node status was determined

pengine[11519]:     info: determine_online_status: Node dev1 is shutting down
pengine[11519]:     info: determine_online_status: Node dev2 is online

inouekazu@a727333#L7L912

node became offline

crmd[11520]:     info: peer_update_callback: Client dev1/peer now has status [offline] (DC=true)

inouekazu@a727333#L0L137

node failed, ...

crmd[10895]:     info: peer_update_callback: dev1 is now lost (was member)

inouekazu@a727333#L0L124

process creation of pacemakerd
I think that it's necessary to monitor it since process creation of pacemakerd is an event which consumes computer resource.

child process was created

pacemakerd[11223]:     info: start_child: Forked child 11225 for process cib
pacemakerd[11223]:     info: start_child: Forked child 11226 for process stonith-ng

inouekazu@a727333#L8L266

child process was terminated

pacemakerd[11223]:     info: pcmk_child_exit: Child process crmd exited (pid=11230, rc=0)
pacemakerd[11223]:     info: pcmk_child_exit: Child process pengine exited (pid=11229, rc=0)

inouekazu@a727333#L8L172

election of DC
The contents of the message differ by DC node and other nodes. So I need these messages, in order to recognize DC.
```
crmd[11520]:    debug: update_dc: Unset DC. Was dev2
crmd[11520]:     info: update_dc: Set DC to dev2 (3.0.7)
```
inouekazu@a727333#L5L1050

update of an attribute (CIB)
I need to receive the notice message including the update of an attribute (node name, attribute name and value), on DC node. Because it may become the trigger of resource movement depending on location constraints.

crmd[12131]:     info: abort_transition_graph: te_update_diff:176 - Triggered transition abort (complete=0, tag=nvpair, id=status-2506467520-default_ping_set, name=default_ping_set, value=100, magic=NA, cib=0.5.26) : Transient attribute: update
crmd[12131]:     info: abort_transition_graph: te_update_diff:176 - Triggered transition abort (complete=0, tag=nvpair, id=status-2472913088-default_ping_set, name=default_ping_set, value=100, magic=NA, cib=0.5.28) : Transient attribute: update

inouekazu@a727333#L4L329

final status of resource
Messages other than "LogActions: Leave" including Start, Move, etc. are notice. (I'm monitoring the status of a resource from these messages.)
Since I need to monitor the final status of a resource, please change "LogActions: Leave" into notice similarly.

pengine[12130]:   notice: LogActions: Move    prmDummy     (Started dev1 -> dev2)
pengine[12130]:   notice: LogActions: Stop    prmPing:0    (dev1)
pengine[12130]:     info: LogActions: Leave   prmPing:1    (Started dev2)
pengine[12130]:     info: LogActions: Leave   prmDummy     (Started dev2)
pengine[12130]:     info: LogActions: Leave   prmPing:0    (Started dev2)
pengine[12130]:     info: LogActions: Leave   prmPing:1    (Stopped)

inouekazu@a727333#L9L1813
inouekazu@a727333#L6L18

operation of all resources
I want to monitor the operation of all resources of the cluster, so I'm monitoring the following messages on DC node. Please change this message into notice.

crmd[14356]:     info: te_rsc_command: Initiating action 2: stop prmDummy_stop_0 on dev1
crmd[14356]:     info: te_rsc_command: Initiating action 9: start prmDummy_start_0 on dev2 (local)
crmd[14356]:     info: te_rsc_command: Initiating action 10: monitor prmDummy_monitor_10000 on dev2 (local)

inouekazu@a727333#L3L395

beekhof · 2013-02-12T02:41:48Z

1 The logs you are looking for exist, but are from the lrmd and not appropriate for syslog.

why, how long, which arguments - all the kinds of information you need to diagnose problems afterwards belong application logfiles (eg. /var/log/cluster.log) not syslog.

2.1 pacemakerd logs are a better source for this
2.2 pengine tries to log changes only. as something that runs often it would be too verbose to log this every time.
crmd logs would be a better source.
2.3 and 2.4
logs from crm_update_peer_state and crm_update_peer_proc are better suited for this

3 There is a saying in english "No news, is good news".
If something bad/unexpected happened, we'll log it, for example:

        notice: pcmk_child_exit:    Child process lrmd terminated with signal 9 (pid=23119, core=0)
        error: pcmk_child_exit:     Child process crmd exited (pid=23122, rc=2)

but the normal starting/stopping of processes isn't something worthy of showing in syslog.

6 See 2.2 (apart from the bit about crmd logs)

I'll follow up on the others shortly

beekhof · 2013-02-12T03:24:55Z

7 Ok. We can make that change.

beekhof · 2013-02-12T03:25:42Z

4 is very much an internal detail. You can still look for it, just enable logfiles and monitor that instead

beekhof · 2013-02-12T03:28:57Z

5 is again a debugging item. if you want to monitor updates to the cib, use the API - thats what its for.
Parsing logfiles is very fragile and likely to break any time we change the format/contents/function name.

And again, if you really have to, you can still look for it, just enable logfiles and monitor that instead.

inouekazu · 2013-03-01T12:06:04Z

Thank you for a reply and advice. I will watch application logfile.
And so I have a request newly.

I need the INFO message of "operation start".
The "operation completion" message of a resource is output with a priority of NOTICE, but INFO (or higher) message of the "operation start" corresponding to it is not output.
I want to know what is performed from now on. Please merge a change of either.

inouekazu@ea2591c
inouekazu@79761c9
I want the INFO message of "reset DC."
The "set DC" message is output with a priority of INFO, but INFO (or higher) message of the "unset DC" corresponding to it is not output.
The contents of the message differ by DC node and other nodes. So I want this INFO message.

inouekazu@9c8cd5c

beekhof · 2013-03-04T03:29:45Z

Applied:

Kazunori INOUE (3 days ago) 166c670: Low: Bump the log of an operation start of a resource to info (HEAD, master)
Kazunori INOUE (3 days ago) dc5f117: Low: Bump the log of unset DC to info

inouekazu · 2013-03-04T09:12:03Z

I confirmed it. Thank you

Low: Bump some log messages to notice

a727333

beekhof closed this Jan 10, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Low: Bump some log messages to notice #205

Low: Bump some log messages to notice #205

inouekazu commented Jan 10, 2013

beekhof commented Jan 10, 2013

inouekazu commented Jan 21, 2013

beekhof commented Feb 12, 2013

beekhof commented Feb 12, 2013

beekhof commented Feb 12, 2013

beekhof commented Feb 12, 2013

inouekazu commented Mar 1, 2013

beekhof commented Mar 4, 2013

inouekazu commented Mar 4, 2013

Low: Bump some log messages to notice #205

Low: Bump some log messages to notice #205

Conversation

inouekazu commented Jan 10, 2013

beekhof commented Jan 10, 2013

inouekazu commented Jan 21, 2013

beekhof commented Feb 12, 2013

beekhof commented Feb 12, 2013

beekhof commented Feb 12, 2013

beekhof commented Feb 12, 2013

inouekazu commented Mar 1, 2013

beekhof commented Mar 4, 2013

inouekazu commented Mar 4, 2013