Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
alarms: decouple remote log processing from local logging context
Testing and debugging has revealed several issues arising when the alarms
server is sharing a domain with other components.  The most crucial is
that when the alarms cell runs, the LogEntryAppender is added to
the root logger tree.  This means that local events are then appended
synchronously by the calling thread, differently from the case where the event
is sent to the service via a socket.  Since the work done on this thread
includes potential regex matching (of an indefinite number of patterns)
and storing through the DataNucleus layer to the database (which also involves
a select for duplicate events), this can have considerable impact on
the performance of the domain.

The most immediate solution might be to thread the handling of events in
the appender.  But an even better solution is not to detach the remote
appender and substitute it with the LogEntryAppender for this domain, but
instead to uncouple altogether the handling of incoming events sent
on the socket from the logging context (required in order to use the
logback SimpleSocketServer).

This decoupling is achieved by modifying the alarms server to do the work
of the SocketServer in logback, but without context-related configuration.
A special SocketNode is used which passes the incoming event directly to
the LogEventAppender, now renamed LogEventHandler, without sending it
through the local logging mechanism.  Concomitantly, the handler is not
attached to any logger and hence does not process local events.

Several other modifications have been made in support of more efficient
handling of remote logging events.

For a detailed list of changes as well as tests done, see
the review board patch: https://rb.dcache.org/r/7590.

Target: trunk
Require-book: no
Require-notes: yes
Acked-by: Gerd

RELEASE NOTES:  Threading issues preventing the alarm service from
running in the same domain as other services (2.11) have been corrected
so that it is now safe to do so.  Two new properties have also been
added:

alarms.limits.workers=1
--  Number of worker threads for processing incoming logging events.

(one-of?true|false)alarms.db.alarms-only=false
--  Only store alarms to the persistent store.
  • Loading branch information
alrossi committed Feb 6, 2015
1 parent 631876d commit f18c832
Show file tree
Hide file tree
Showing 18 changed files with 619 additions and 223 deletions.
2 changes: 1 addition & 1 deletion modules/common/src/main/java/org/dcache/alarms/Alarm.java
Expand Up @@ -66,7 +66,7 @@ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
*/
public interface Alarm {
/**
* Required additional MDC properties when sending alarms.
* MDC properties added during processing of the event by the server.
*/
String HOST_TAG = "host";
String DOMAIN_TAG = "domain";
Expand Down
Expand Up @@ -59,6 +59,8 @@ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
*/
package org.dcache.alarms.admin;

import org.slf4j.MDC;

import com.google.common.base.Strings;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
Expand All @@ -70,6 +72,7 @@ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
import java.util.Properties;
import java.util.concurrent.Callable;

import dmg.cells.nucleus.CDC;
import dmg.cells.nucleus.CellCommandListener;
import dmg.util.command.Argument;
import dmg.util.command.Command;
Expand Down Expand Up @@ -508,11 +511,6 @@ class SendCommand implements Callable<String> {
+ " that, the alarm will be marked 'GENERIC'.")
String type;

@Option(name = "h",
usage = "Optional name of host of origin of the alarm "
+ "(defaults to local host name).")
String host;

@Option(name = "d",
usage = "Optional name of domain of origin of the alarm "
+ "(defaults to '<na>').")
Expand All @@ -535,17 +533,15 @@ public String call() throws Exception {
arglist.add("-" + AlarmArguments.TYPE_OPT + "=\"" + type + "\"");
}

if (Strings.emptyToNull(host) != null) {
arglist.add("-" + AlarmArguments.SRC_HOST_OPT + "=\"" + host + "\"");
}

if (Strings.emptyToNull(domain) != null) {
arglist.add("-" + AlarmArguments.SRC_DOMAIN_OPT + "=\"" + domain + "\"");
if (Strings.emptyToNull(domain) == null) {
domain = MDC.get(CDC.MDC_DOMAIN);
}
arglist.add("-" + AlarmArguments.SRC_DOMAIN_OPT + "=\"" + domain + "\"");

if (Strings.emptyToNull(service) != null) {
arglist.add("-" + AlarmArguments.SRC_SERVICE_OPT + "=\"" + service + "\"");
if (Strings.emptyToNull(service) == null) {
service = MDC.get(CDC.MDC_CELL);
}
arglist.add("-" + AlarmArguments.SRC_SERVICE_OPT + "=\"" + service + "\"");

arglist.add("-" + AlarmArguments.DST_HOST_OPT + "=\"" + serverHost + "\"");
arglist.add("-" + AlarmArguments.DST_PORT_OPT + "=\"" + serverPort + "\"");
Expand Down
Expand Up @@ -221,10 +221,6 @@ public void shutdown() {
if (cleanerThread != null ) {
cleanerThread.interrupt();
}

if (pmf != null) {
pmf.close();
}
}

private boolean isRunning() {
Expand Down
@@ -0,0 +1,91 @@
/*
COPYRIGHT STATUS:
Dec 1st 2001, Fermi National Accelerator Laboratory (FNAL) documents and
software are sponsored by the U.S. Department of Energy under Contract No.
DE-AC02-76CH03000. Therefore, the U.S. Government retains a world-wide
non-exclusive, royalty-free license to publish or reproduce these documents
and software for U.S. Government purposes. All documents and software
available from this server are protected under the U.S. and Foreign
Copyright Laws, and FNAL reserves all rights.
Distribution of the software available from this server is free of
charge subject to the user following the terms of the Fermitools
Software Legal Information.
Redistribution and/or modification of the software shall be accompanied
by the Fermitools Software Legal Information (including the copyright
notice).
The user is asked to feed back problems, benefits, and/or suggestions
about the software to the Fermilab Software Providers.
Neither the name of Fermilab, the URA, nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
DISCLAIMER OF LIABILITY (BSD):
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL FERMILAB,
OR THE URA, OR THE U.S. DEPARTMENT of ENERGY, OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Liabilities of the Government:
This software is provided by URA, independent from its Prime Contract
with the U.S. Department of Energy. URA is acting independently from
the Government and in its own private capacity and is not acting on
behalf of the U.S. Government, nor as its contractor nor its agent.
Correspondingly, it is understood and agreed that the U.S. Government
has no connection to this software and in no manner whatsoever shall
be liable for nor assume any responsibility or obligation for any claim,
cost, or damages arising out of or resulting from the use of the software
available from this server.
Export Control:
All documents and software available from this server are subject to U.S.
export control laws. Anyone downloading information from this server is
obligated to secure any necessary Government licenses before exporting
documents or software obtained from this server.
*/
package org.dcache.alarms.logback;

import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.filter.Filter;
import ch.qos.logback.core.spi.FilterReply;

/**
* While logback provides an MDC Turbofilter, we prefer to hardcode this type so
* as not to expose the special property key (ALARMS_INTERNAL) in the
* logback.xml; otherwise, it would be susceptible to modification, accidental
* or otherwise, which could potentially cause the alarm service logging to go
* into an infinite loop.
*
* @author arossi
*/
public final class AlarmsInternalFilter extends Filter<ILoggingEvent> {
public static final String ALARMS_INTERNAL = "alarms-internal";

@Override
public FilterReply decide(ILoggingEvent event) {
if (event.getMDCPropertyMap().get(ALARMS_INTERNAL) != null) {
return FilterReply.DENY;
}
return FilterReply.NEUTRAL;
}
}

0 comments on commit f18c832

Please sign in to comment.