Logging-based alarms, part 1 (logback API)
This is part 1 (out of 4) of the third version of the proposed alarms component.  

In the previous version, a Java hierarchy was defined for Alarm types.  This required actual code changes every time a new alarm is defined for use, in particular:

    - an implementation of the new alarm type
    - use of this type at the call sites generating the alarm (plus the requirement that the logging call explicitly add the Marker as the first parameter).

After reconsideration, I deemed this too heavyweight and invasive.  This patch modifies the API/logging core of the alarms component so that it is much more flexible and does not necessarily require changes to the code (provided the necessary logging statements concerning the alarm are already present).  Moreover, there is no longer any difference between adding a logging statement and adding an alarm statement: any logging statement can be redefined as an alarm after the fact.

This is accomplished as follows:

1.  An AlarmDefinition class allows for a JSON description of the alarm to be provided as part of a logging filter.  Refer to the attached image for a capture of the Javadoc for this class.

2.  The AlarmDefinitionFilter is provided for use with the AlarmDefinitionAppender.  This filter intercepts logging events and attempts to match them against the available AlarmDefinitions; when a matching definition is found, the event is accepted; otherwise it is denied.  The match relies on an implicit function (logger,level,regex,thread)->definition.  Hence a given alarm type can be generated by more than one logger, and a logger in turn can send multiple types of alarms if these are mapped to different logging levels (e.g., fatal, error, warn), thread names and/or regex matches on the message string.

3.  Before accepting the event, the filter calls a method on the matching definition type which embeds the type name and a JSON string representing the alarm in the original event's MDC map, for immediate downstream use in the same thread by the appender.

4.  The accepted logging event then arrives at the AlarmDefinitionAppender, which converts the original event into an Alarm event by accessing the two embedded properties and turning them into the event Marker and message, respectively.  The new event is then passed to this appender's child appenders, such as a SocketAppender which can send the event off to a remote Logback server.
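
The four steps above can be sketched in plain Java as follows.  This is only an illustration under stated assumptions, not the code in this patch: Event and Definition are hypothetical stand-ins for the logback event and AlarmDefinition, a plain Map stands in for the MDC, the keys "alarm.type" and "alarm.json" are invented for the example, and matching is assumed to be exact equality on logger, level and thread with a null field meaning "any".

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Illustrative only: Event, Definition and the "alarm.*" keys are hypothetical
// stand-ins, not the classes or MDC keys used by this patch.
final class AlarmMatchSketch {

    /** Minimal stand-in for a logging event. */
    static final class Event {
        final String logger;
        final String level;
        final String thread;
        final String message;

        Event(String logger, String level, String thread, String message) {
            this.logger = logger;
            this.level = level;
            this.thread = thread;
            this.message = message;
        }
    }

    /** One alarm definition; a null field means "do not match on this". */
    static final class Definition {
        final String type;
        final String logger;
        final String level;
        final String thread;
        final Pattern regex;

        Definition(String type, String logger, String level, String thread,
                   String regex) {
            this.type = type;
            this.logger = logger;
            this.level = level;
            this.thread = thread;
            this.regex = regex == null ? null : Pattern.compile(regex);
        }

        boolean matches(Event e) {
            // Assumption: exact equality on logger, level and thread;
            // the regex only needs to be found somewhere in the message.
            return (logger == null || logger.equals(e.logger))
                && (level == null || level.equals(e.level))
                && (thread == null || thread.equals(e.thread))
                && (regex == null || regex.matcher(e.message).find());
        }
    }

    /** (logger, level, regex, thread) -> definition; the first match wins. */
    static Optional<Definition> match(List<Definition> definitions, Event e) {
        return definitions.stream().filter(d -> d.matches(e)).findFirst();
    }

    /**
     * Filter step: on a match, embeds the type name and a JSON string in the
     * map standing in for the MDC and returns true (accept); otherwise
     * returns false (deny) and leaves the map untouched.
     */
    static boolean accept(List<Definition> definitions, Event e,
                          Map<String, String> mdc) {
        Optional<Definition> hit = match(definitions, e);
        if (!hit.isPresent()) {
            return false;
        }
        String type = hit.get().type;
        mdc.put("alarm.type", type);
        mdc.put("alarm.json", "{\"type\":\"" + escape(type)
                + "\",\"message\":\"" + escape(e.message) + "\"}");
        return true;
    }

    /** Escapes backslashes and double quotes only; enough for this sketch. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

In this sketch the appender side would read the two map entries back and use them as the Marker name and the message of the new event before handing it to its child appenders.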

The logback.xml included in skel/etc has been modified so that an alarm appender is added to the root logger.  The alarm appender in turn has a child Socket appender set to send events to localhost on port 60001 (this should be changed to match the location of the remote server).  The second patch in this series includes the shell command script for launching a remote socket server to accept these events.

The only actual alarm currently defined is for checksum errors generated by the logger in org.dcache.pool.classic.ChecksumScanner.  The user/admin, however, may define additional alarms simply by including other <alarmType> elements in the <filter> element (fuller info available in the AlarmDefinition Javadoc):

  <!-- processes events from all loggers into alarms on the basis of the
       alarmType definitions provided; for further information,
       see the javadoc for org.dcache.alarms.logback.AlarmDefinition and
       the dCache Book -->
  <appender name="alarms" class="org.dcache.alarms.logback.AlarmDefinitionAppender">
      <!-- this filter determines which events are to be interpreted as alarms;
          the appender converts these into alarm events and passes them
          to its embedded child appender(s)
      -->
      <filter class="org.dcache.alarms.logback.AlarmDefinitionFilter">
            <alarmType>
                logger:org.dcache.pool.classic.ChecksumScanner,
                regex:"Checksum mismatch",
                type:CHECKSUM,
                level:ERROR,
                severity:MODERATE,
                include-in-key:message.type.host.service.domain
            </alarmType>
      </filter>
      <appender-ref ref="remote"/>
  </appender>
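
To make the shape of the <alarmType> body concrete, the following sketch reads its comma-separated name:value pairs into a map.  It is an assumption-laden illustration, not the parser in this patch: the patch adds an org.json dependency, which suggests AlarmDefinition handles the body as JSON, whereas this hypothetical AlarmTypeBodySketch class is hand-rolled so that it has no dependencies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a hand-rolled reader for the comma-separated name:value
// body shown above; it is not the parser used by AlarmDefinition.
final class AlarmTypeBodySketch {

    /** Splits the body on commas outside double quotes, then on the first ':'. */
    static Map<String, String> parse(String body) {
        Map<String, String> attributes = new LinkedHashMap<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        for (char c : body.toCharArray()) {
            if (c == '"') {
                quoted = !quoted;      // commas inside quotes are literal
                current.append(c);
            } else if (c == ',' && !quoted) {
                addPair(attributes, current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        addPair(attributes, current.toString());
        return attributes;
    }

    private static void addPair(Map<String, String> attributes, String pair) {
        String trimmed = pair.trim();
        if (trimmed.isEmpty()) {
            return;                    // tolerate a trailing comma or blank body
        }
        int colon = trimmed.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("missing ':' in \"" + trimmed + "\"");
        }
        String name = trimmed.substring(0, colon).trim();
        String value = trimmed.substring(colon + 1).trim();
        // Strip one pair of surrounding double quotes, as used for the regex.
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        attributes.put(name, value);
    }
}
```

Applied to the CHECKSUM definition above, parse would yield six entries (logger, regex, type, level, severity, include-in-key) in declaration order, with the quotes removed from the regex value.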

This patch supersedes http://rb.dcache.org/r/4662; some of the code is still the same, but the structure has been largely simplified.  There follow three more patches which provide the storage (DAO), front-end (webadmin) and unit test parts of the full implementation.

Target: trunk
Patch: http://rb.dcache.org/r/4885
Acked-by: Dmitry
Require-notes: yes
Require-book: yes
alrossi committed Oct 4, 2012
1 parent c232c08 commit 1ab7e19
Showing 10 changed files with 1,081 additions and 16 deletions.
5 changes: 5 additions & 0 deletions modules/dcache/pom.xml
@@ -221,6 +221,11 @@
      <artifactId>aspectjweaver</artifactId>
    </dependency>

    <dependency>
      <groupId>org.json</groupId>
      <artifactId>json</artifactId>
    </dependency>

    <dependency>
      <groupId>net.sf.smc</groupId>
      <artifactId>statemap</artifactId>
107 changes: 107 additions & 0 deletions modules/dcache/src/main/java/org/dcache/alarms/IAlarms.java
@@ -0,0 +1,107 @@
/*
COPYRIGHT STATUS:
Dec 1st 2001, Fermi National Accelerator Laboratory (FNAL) documents and
software are sponsored by the U.S. Department of Energy under Contract No.
DE-AC02-76CH03000. Therefore, the U.S. Government retains a world-wide
non-exclusive, royalty-free license to publish or reproduce these documents
and software for U.S. Government purposes. All documents and software
available from this server are protected under the U.S. and Foreign
Copyright Laws, and FNAL reserves all rights.
Distribution of the software available from this server is free of
charge subject to the user following the terms of the Fermitools
Software Legal Information.
Redistribution and/or modification of the software shall be accompanied
by the Fermitools Software Legal Information (including the copyright
notice).
The user is asked to feed back problems, benefits, and/or suggestions
about the software to the Fermilab Software Providers.
Neither the name of Fermilab, the URA, nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
DISCLAIMER OF LIABILITY (BSD):
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL FERMILAB,
OR THE URA, OR THE U.S. DEPARTMENT of ENERGY, OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Liabilities of the Government:
This software is provided by URA, independent from its Prime Contract
with the U.S. Department of Energy. URA is acting independently from
the Government and in its own private capacity and is not acting on
behalf of the U.S. Government, nor as its contractor nor its agent.
Correspondingly, it is understood and agreed that the U.S. Government
has no connection to this software and in no manner whatsoever shall
be liable for nor assume any responsibility or obligation for any claim,
cost, or damages arising out of or resulting from the use of the software
available from this server.
Export Control:
All documents and software available from this server are subject to U.S.
export control laws. Anyone downloading information from this server is
obligated to secure any necessary Government licenses before exporting
documents or software obtained from this server.
*/
package org.dcache.alarms;

/**
 * Convenience interface for properties in common between the wire object and
 * the storage object for Alarm processing.
 *
 * @author arossi
 */
public interface IAlarms {
    /*
     * Shared alarm property/field names
     */
    final String KEY_TAG = "key";
    final String TIMESTAMP_TAG = "timestamp";
    final String TYPE_TAG = "type";
    final String SEVERITY_TAG = "severity";
    final String HOST_TAG = "host";
    final String DOMAIN_TAG = "domain";
    final String SERVICE_TAG = "service";
    final String MESSAGE_TAG = "message";

    /*
     * The base marker; all specific alarm types carry an additional
     * embedded Marker.
     */
    final String ALARM_MARKER = "ALARM";

    /*
     * Placeholder for host name which cannot be resolved.
     */
    final String UNKNOWN_HOST = "<unknown host>";

    /*
     * Placeholder for service name which cannot be resolved.
     */
    final String UNKNOWN_SERVICE = "<unknown service>";

    /*
     * Placeholder for domain name which cannot be resolved.
     */
    final String UNKNOWN_DOMAIN = "<unknown domain>";

    /*
     * These are defined elsewhere for use in the MDC.
     */
    final String CELL = "cells.cell";
    final String DOMAIN = "cells.domain";
}
96 changes: 96 additions & 0 deletions modules/dcache/src/main/java/org/dcache/alarms/Severity.java
@@ -0,0 +1,96 @@
/*
COPYRIGHT STATUS:
Dec 1st 2001, Fermi National Accelerator Laboratory (FNAL) documents and
software are sponsored by the U.S. Department of Energy under Contract No.
DE-AC02-76CH03000. Therefore, the U.S. Government retains a world-wide
non-exclusive, royalty-free license to publish or reproduce these documents
and software for U.S. Government purposes. All documents and software
available from this server are protected under the U.S. and Foreign
Copyright Laws, and FNAL reserves all rights.
Distribution of the software available from this server is free of
charge subject to the user following the terms of the Fermitools
Software Legal Information.
Redistribution and/or modification of the software shall be accompanied
by the Fermitools Software Legal Information (including the copyright
notice).
The user is asked to feed back problems, benefits, and/or suggestions
about the software to the Fermilab Software Providers.
Neither the name of Fermilab, the URA, nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
DISCLAIMER OF LIABILITY (BSD):
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL FERMILAB,
OR THE URA, OR THE U.S. DEPARTMENT of ENERGY, OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Liabilities of the Government:
This software is provided by URA, independent from its Prime Contract
with the U.S. Department of Energy. URA is acting independently from
the Government and in its own private capacity and is not acting on
behalf of the U.S. Government, nor as its contractor nor its agent.
Correspondingly, it is understood and agreed that the U.S. Government
has no connection to this software and in no manner whatsoever shall
be liable for nor assume any responsibility or obligation for any claim,
cost, or damages arising out of or resulting from the use of the software
available from this server.
Export Control:
All documents and software available from this server are subject to U.S.
export control laws. Anyone downloading information from this server is
obligated to secure any necessary Government licenses before exporting
documents or software obtained from this server.
*/
package org.dcache.alarms;

import java.util.List;

import com.google.common.collect.ImmutableList;

/**
 * For marking alarm level.
 *
 * @author arossi
 */
public enum Severity {
    INDETERMINATE, LOW, MODERATE, HIGH, CRITICAL;

    private static final List<String> labels = ImmutableList.of(
                    INDETERMINATE.toString(), LOW.toString(),
                    MODERATE.toString(), HIGH.toString(), CRITICAL.toString());

    public static List<String> asList() {
        return labels;
    }

    /*
     * It is cleaner to persist the enum as a number; this allows restoration of
     * the enum from the underlying store.
     */
    public static Severity fromOrdinal(Integer severity) {
        if (severity != null) {
            for (Severity value : Severity.values()) {
                if (severity == value.ordinal()) {
                    return value;
                }
            }
        }
        return INDETERMINATE;
    }
}
