fty-outage

The fty-outage agent produces pure alerts on the _ALERTS_SYS stream when no data are coming from a device.

How to build

To build the fty-outage project, run:

./autogen.sh
./configure
make
make check # to run self-test

How to run

To run the fty-outage project:

  • from within the source tree, run:
./src/fty-outage

For the other available options, refer to the fty-outage manual page.

  • from an installed base, using systemd, run:
systemctl start fty-outage

Configuration file

The configuration file, fty-outage.cfg, is currently ignored.

The agent reads the environment variable BIOS_LOG_LEVEL, which controls the verbosity level.

The state file for fty-outage is stored in /var/lib/fty/fty-outage.zpl.

Architecture

Overview

fty-outage is composed of 1 actor and 2 timers.

  • fty-outage-server: main actor

The first timer is implemented by checking zclock; it saves the state of the agent every SAVE_INTERVAL_MS milliseconds (default value 45 minutes).

The second timer is implemented via the zpoller timeout; it publishes outage alerts for dead devices every TIMEOUT_MS milliseconds (default value 30 seconds), unless such an alert is already active.
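
The interplay of the two timers can be pictured with a minimal Python sketch. This is not the agent's code: the class OutageLoop and its counters are hypothetical, the poller is replaced by a `message` argument that is None on timeout, and the assumption that the save check runs on every pass of the loop is ours.

```python
SAVE_INTERVAL_MS = 45 * 60 * 1000  # 45 minutes, the default stated above
TIMEOUT_MS = 30 * 1000             # 30 seconds, the default stated above


class OutageLoop:
    """Hypothetical model of the two timers; not the agent's real code."""

    def __init__(self, now_ms):
        self.last_save_ms = now_ms
        self.saves = 0           # how many times the state was "saved"
        self.outage_checks = 0   # how many times dead devices were "checked"

    def tick(self, now_ms, message=None):
        """One loop iteration; `message` is None when the poll timed out."""
        if message is None:
            # Second timer: nothing arrived within TIMEOUT_MS.
            self.outage_checks += 1
        # First timer: compare the clock against the time of the last save.
        if now_ms - self.last_save_ms >= SAVE_INTERVAL_MS:
            self.saves += 1
            self.last_save_ms = now_ms


loop = OutageLoop(now_ms=0)
loop.tick(now_ms=TIMEOUT_MS)                          # timeout: outage check only
loop.tick(now_ms=SAVE_INTERVAL_MS, message="metric")  # message arrives, save is due
print(loop.outage_checks, loop.saves)  # 1 1
```

In this model a busy stream delays the outage check (the poll never times out) but not the save, which depends only on the clock.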

Protocols

Published metrics

The agent doesn't publish any metrics.

Published alerts

The agent publishes alerts on the _ALERTS_SYS stream.

Mailbox requests

It is possible to request the agent fty-outage for:

  • putting devices into or returning devices from maintenance mode: this is used to temporarily ignore outages on assets that are known to not be currently serving data (for example, due to a FW upgrade).

Putting devices into or returning devices from maintenance mode

The USER peer sends the following message using MAILBOX SEND to the FTY-OUTAGE-AGENT ("fty-outage") peer:

  • REQUEST/'correlation_ID'/MAINTENANCE_MODE/'mode'/asset1/.../assetN/expiration_ttl - switch assets 'asset1' through 'assetN' into or out of maintenance mode

where

  • '/' indicates a multipart string message
  • 'correlation_ID' is a zuuid identifier provided by the caller
  • 'mode' MUST be 'enable' or 'disable'
  • 'asset1', ..., 'assetN' MUST be the device(s) asset name
  • 'expiration_ttl' (optional) is the number of seconds after which the asset(s) will be automatically returned from maintenance mode. If 'expiration_ttl' is not provided, the default value ('maintenance_expiration') from the agent configuration file is used
  • subject of the message is discarded
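
As an illustration of the frame layout, the following Python sketch assembles the request as a plain list of strings. It assumes that the frame following MAINTENANCE_MODE carries 'enable' or 'disable'. The helper build_maintenance_request and the asset names are hypothetical, and a 32-character hex UUID stands in for the zuuid; sending the frames over the mailbox is left out.

```python
import uuid


def build_maintenance_request(mode, assets, expiration_ttl=None):
    """Return the request as a list of string frames (hypothetical helper)."""
    if mode not in ("enable", "disable"):
        raise ValueError("mode must be 'enable' or 'disable'")
    if not assets:
        raise ValueError("at least one asset name is required")
    # uuid4().hex is only a stand-in for a zuuid generated by the caller.
    frames = ["REQUEST", uuid.uuid4().hex, "MAINTENANCE_MODE", mode]
    frames.extend(assets)
    if expiration_ttl is not None:
        # The TTL is optional and expressed in seconds.
        frames.append(str(int(expiration_ttl)))
    return frames


# 'ups-1' and 'epdu-2' are made-up asset names.
request = build_maintenance_request("enable", ["ups-1", "epdu-2"], 3600)
print("/".join(request[2:]))  # MAINTENANCE_MODE/enable/ups-1/epdu-2/3600
```

Omitting the last argument leaves the TTL frame out, so the agent-side default would apply.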

The FTY-OUTAGE-AGENT peer MUST respond to the USER peer with one of the following messages using MAILBOX SEND:

  • REPLY/correlation_ID/OK
  • REPLY/correlation_ID/ERROR/reason

where

  • '/' indicates a multipart frame message
  • 'correlation_ID' is a zuuid identifier provided by the caller
  • 'reason' is a string detailing the reason for the error. Possible values are:
    • Invalid command,
    • Invalid message type,
    • Command failed,
    • Missing maintenance mode,
    • Unsupported maintenance mode.
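
On the caller side, the reply can be checked along these lines. This is a sketch only: parse_reply is a hypothetical helper that takes the reply as a list of string frames, and matching the correlation ID against the one sent is an assumption about how a caller would pair replies with requests.

```python
def parse_reply(frames, correlation_id):
    """Return (ok, reason) for a reply given as a list of string frames."""
    if len(frames) < 3 or frames[0] != "REPLY":
        raise ValueError("not a REPLY message")
    if frames[1] != correlation_id:
        raise ValueError("correlation ID mismatch")
    if frames[2] == "OK":
        return True, None
    if frames[2] == "ERROR":
        # The reason frame is one of the strings listed above.
        reason = frames[3] if len(frames) > 3 else ""
        return False, reason
    raise ValueError("unexpected status: " + frames[2])


print(parse_reply(["REPLY", "abc", "OK"], "abc"))
# (True, None)
print(parse_reply(["REPLY", "abc", "ERROR", "Unsupported maintenance mode"], "abc"))
# (False, 'Unsupported maintenance mode')
```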

Stream subscriptions

The agent is subscribed to the streams METRICS, METRICS_UNAVAILABLE, METRICS_SENSOR and ASSETS.

If it gets a METRICS_UNAVAILABLE message, it resolves all the stored alerts for the specified device.

If it gets a METRICS or METRICS_SENSOR message from a device, it resolves all the stored alerts for the specified device and marks the device as active.

If it gets an ASSETS message, it updates the asset cache. If the message is for operation DELETE or RETIRE, it resolves all the alerts for the specified device.
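
The stream handling rules can be summarized in a small Python model. Everything here is hypothetical: the class OutageState, its three sets, and the choice to drop a deleted or retired device from the cache are illustrative assumptions, not the agent's data structures.

```python
class OutageState:
    """Hypothetical in-memory model of the stream handling rules."""

    def __init__(self):
        self.assets = set()   # asset cache (names only, for illustration)
        self.active = set()   # devices that have sent metrics
        self.alerts = set()   # devices with a stored outage alert

    def on_metric(self, device):
        """METRICS or METRICS_SENSOR: resolve alerts, mark device active."""
        self.alerts.discard(device)
        self.active.add(device)

    def on_metric_unavailable(self, device):
        """METRICS_UNAVAILABLE: resolve alerts for the device."""
        self.alerts.discard(device)

    def on_asset(self, device, operation):
        """ASSETS: update the cache; DELETE and RETIRE also resolve alerts."""
        if operation in ("DELETE", "RETIRE"):
            # Assumption: the device is dropped from the cache as well.
            self.assets.discard(device)
            self.active.discard(device)
            self.alerts.discard(device)
        else:
            self.assets.add(device)


state = OutageState()
state.on_asset("ups-1", "CREATE")   # 'ups-1' is a made-up asset name
state.alerts.add("ups-1")           # pretend an outage alert was stored
state.on_metric("ups-1")            # data is flowing again
print(sorted(state.alerts), sorted(state.active))  # [] ['ups-1']
```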
