Expose publish metrics with more attributes
Add another metric, ``napalm_logs_device_published_messages_attrs``,
alongside the existing ``napalm_logs_device_published_messages``, to
publish metrics with more attributes. As this may generate a large
number of metrics, I opted not to replace the existing one but to keep
it separate, with the possibility of disabling it in case it causes
trouble.
mirceaulinic committed Apr 22, 2020
1 parent 6a52625 commit 004af71
Showing 5 changed files with 61 additions and 19 deletions.
39 changes: 24 additions & 15 deletions docs/metrics/index.rst
@@ -29,11 +29,11 @@ Listener Process(es)
--------------------

napalm_logs_listener_logs_ingested
Count of ingested log messages. Labels are used to seperate metrics for each Listener process.
Count of ingested log messages. Labels are used to separate metrics for each Listener process.

napalm_logs_listener_messages_published
Count of published messages. These are messages published to the message queue for processing by the Server Process.
Labels are used to seperate metrics for each Listener process.
Labels are used to separate metrics for each Listener process.

Server Process
--------------
@@ -42,53 +42,62 @@ napalm_logs_server_messages_received
Count of messages received from Listener processes.

napalm_logs_server_messages_with_identified_os
Count of messages with positive OS identification. Labels are used to seperate metrics for each Device OS.
Count of messages with positive OS identification. Labels are used to separate metrics for each Device OS.

napalm_logs_server_messages_without_identified_os
Count of messages which fail OS identification.

napalm_logs_server_messages_failed_device_queuing
Count of messages per device OS that fail to be queued to a proper Device process. Note these are messages that
pass OS identification and we know how to route them but fail to be queued. Labels are used to seperate metrics
pass OS identification and we know how to route them but fail to be queued. Labels are used to separate metrics
for each Device OS.

napalm_logs_server_messages_device_queued
Count of messages sucessfully queued to Device processes. Labels are used to seperate metrics for each Device OS process.
Count of messages successfully queued to Device processes. Labels are used to separate metrics for each Device OS process.

napalm_logs_server_messages_unknown_queued
Count of messages which fail OS indentification and thus we don't know how to route them, but the user has instructed
Count of messages which fail OS identification and thus we don't know how to route them, but the user has instructed
the system to queue them "as-is."

Device Process(es)
------------------

napalm_logs_device_messages_received
Count of messages received from the Server process. Labels are used to seperate metrics for each Device OS process.
Count of messages received from the Server process. Labels are used to separate metrics for each Device OS process.

napalm_logs_device_raw_published_messages
Count of raw type published messages. In this case, the message did not match a configured message type but the
user has instructed the system to publish the message in a raw format. Labels are used to seperate metrics for
user has instructed the system to publish the message in a raw format. Labels are used to separate metrics for
each Device OS process.

napalm_logs_device_published_messages
Count of published messages. These are messages which are sucessfully converted to an OpenConfig format. Labels
are used to seperate metrics for each Device OS process.
Count of published messages. These are messages which are successfully converted to an OpenConfig format. Labels
are used to separate metrics for each Device OS process.

napalm_logs_device_oc_object_failed
napalm_logs_device_oc_object_failed
Counter of failed OpenConfig object generations. These are messages for which the system attempts to map to a
known OpenConfig object model but fails. Labels are used to seperate metrics for each Device OS process.
known OpenConfig object model but fails. Labels are used to separate metrics for each Device OS process.

napalm_logs_device_published_messages_attrs
Count of published messages. This metric complements
``napalm_logs_device_published_messages`` by providing a more granular
selection, using two additional labels (besides ``device_os``): ``error`` and
``host`` for the *napalm-logs* error / message type and the host,
respectively. As this metric has the potential to generate a large number of
metrics, you can disable it by configuring
``metrics_include_attributes: false`` in the napalm-logs configuration file.
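
To make the cardinality concern concrete, here is a rough, stdlib-only sketch that counts the worst-case number of distinct label combinations with and without the extra ``host`` and ``error`` labels. The device OS names, host names, and error tags below are illustrative assumptions, not values taken from this commit, and the helper `series_count` is hypothetical.

```python
from itertools import product


def series_count(device_oses, hosts, errors):
    """Worst-case number of distinct (device_os, host, error) label sets."""
    return len(set(product(device_oses, hosts, errors)))


# Illustrative inputs only (assumptions, not taken from the commit).
device_oses = ["junos", "eos"]
hosts = ["edge%02d" % i for i in range(50)]
errors = ["INTERFACE_DOWN", "BGP_NEIGHBOR_STATE_CHANGED", "USER_LOGIN"]

# The existing metric is labelled by device_os alone.
per_os_only = len(device_oses)
# The new metric is labelled by device_os, host and error.
with_attrs = series_count(device_oses, hosts, errors)

print(per_os_only, with_attrs)
```

In practice a host usually runs a single OS, so the real figure is closer to hosts × errors, but the growth compared with the per-OS metric is still what motivates the opt-out.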

Publisher Process(es)
---------------------

napalm_logs_publisher_received_messages
Count of messages received by the Publisher from Device Process(es). Labels are used to seperate metrics for
Count of messages received by the Publisher from Device Process(es). Labels are used to separate metrics for
each Publisher process.

napalm_logs_publisher_whitelist_blacklist_check_fail
Count of messages which fail the whitelist/blacklist check. Labels are used to seperate metrics for each
Count of messages which fail the whitelist/blacklist check. Labels are used to separate metrics for each
Publisher process.

napalm_logs_publisher_messages_published
Count of published messages. These are messages which are published for clients to receive (i.e. output of the
system). Labels are used to seperate metrics for each Publisher process.
system). Labels are used to separate metrics for each Publisher process.
17 changes: 17 additions & 0 deletions docs/options/index.rst
@@ -171,6 +171,23 @@ Configuration file example:
metrics_dir: /tmp/a_new_dir_for_metrics
.. _configuration-options-metrics-attrs:

``metrics_include_attributes``
------------------------------

.. versionadded:: 0.10.0

Whether to include detailed metrics with attributes per published device OS,
hostname, and napalm-logs error type. Default: ``True`` (the metrics will
include detailed attributes). Set it to ``false`` to disable them.

Configuration file example:

.. code-block:: yaml

    metrics_include_attributes: false

.. _configuration-options-certificate:

``certificate``
5 changes: 3 additions & 2 deletions napalm_logs/base.py
@@ -68,7 +68,8 @@ def __init__(self,
hwm=None,
device_worker_processes=1,
serializer='msgpack',
buffer=None):
buffer=None,
opts=None):
'''
Init the napalm-logs engine.
@@ -107,7 +108,7 @@ def __init__(self,
self.hwm = hwm
self._buffer_cfg = buffer
self._buffer = None
self.opts = {}
self.opts = opts if opts else {}
# Setup the environment
self._setup_log()
self._build_config()
13 changes: 12 additions & 1 deletion napalm_logs/device.py
@@ -243,7 +243,12 @@ def start(self):
"Counter of failed OpenConfig object generations",
['device_os']
)

if self.opts.get('metrics_include_attributes', True):
napalm_logs_device_published_messages_attrs = Counter(
'napalm_logs_device_published_messages_attrs',
"Counter of published messages, with more granular selection",
['device_os', 'host', 'error']
)
self._setup_ipc()
# Start suicide polling thread
thread = threading.Thread(target=self._suicide_when_without_parent, args=(os.getppid(),))
@@ -329,6 +334,12 @@ def start(self):
self.pub.send(umsgpack.packb(to_publish))
# self._publish(to_publish)
napalm_logs_device_published_messages.labels(device_os=self._name).inc()
if self.opts.get('metrics_include_attributes', True):
napalm_logs_device_published_messages_attrs.labels(
device_os=self._name,
error=to_publish['error'],
host=to_publish['host']
).inc()
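
The gating used in both hunks above relies on `opts.get('metrics_include_attributes', True)`, so a missing key means the detailed metric stays enabled. The sketch below illustrates that behaviour with a small stand-in class; `LabeledCounter` and `record_published` are hypothetical names for illustration and are not the `prometheus_client` API or code from this commit.

```python
from collections import Counter as TallyCounter


class LabeledCounter:
    """Stand-in for a labelled counter metric (hypothetical, stdlib only)."""

    def __init__(self, label_names):
        self.label_names = tuple(label_names)
        self.values = TallyCounter()

    def inc(self, **labels):
        # Build the key in a fixed label order so equal label sets collide.
        key = tuple(labels[name] for name in self.label_names)
        self.values[key] += 1


def record_published(opts, counter, device_os, to_publish):
    """Increment the detailed counter unless the option disables it."""
    # Same lookup as in the diff: an absent key defaults to enabled.
    if opts.get('metrics_include_attributes', True):
        counter.inc(
            device_os=device_os,
            error=to_publish['error'],
            host=to_publish['host'],
        )
        return True
    return False


msg = {'host': 'edge01', 'error': 'INTERFACE_DOWN'}  # illustrative message
enabled = LabeledCounter(['device_os', 'host', 'error'])
disabled = LabeledCounter(['device_os', 'host', 'error'])

record_published({}, enabled, 'junos', msg)
record_published({'metrics_include_attributes': False}, disabled, 'junos', msg)
```

With an empty `opts` the counter is incremented once; with the option set to `False` nothing is recorded.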

def stop(self):
'''
6 changes: 5 additions & 1 deletion napalm_logs/scripts/cli.py
@@ -372,8 +372,12 @@ def parse(self, log, screen_handler):
file_cfg.get('device_worker_processes') or 1,
'serializer': self.options.serializer or file_cfg.get('serializer') or
defaults.SERIALIZER,
'buffer': buffer_cfg
'buffer': buffer_cfg,
'opts': {}
}
for opt, val in file_cfg.items():
if opt not in cfg:
cfg['opts'][opt] = val
return cfg
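
The loop above passes any configuration file key that is not already a known top-level setting through to `opts`, which is how `metrics_include_attributes` reaches the Device process. A minimal standalone sketch of that behaviour follows; the function name `collect_extra_opts` and the sample dictionaries are assumptions made for illustration.

```python
def collect_extra_opts(cfg, file_cfg):
    """Copy file options that are not known top-level settings into cfg['opts']."""
    cfg = dict(cfg)  # work on a copy so the caller's dict is untouched
    cfg['opts'] = {}
    for opt, val in file_cfg.items():
        # Keys already present in cfg were resolved earlier (CLI, file, defaults).
        if opt not in cfg:
            cfg['opts'][opt] = val
    return cfg


# Illustrative inputs (assumptions, not taken from the commit).
cfg = {'serializer': 'msgpack', 'buffer': None}
file_cfg = {'serializer': 'json', 'metrics_include_attributes': False}

result = collect_extra_opts(cfg, file_cfg)
print(result['opts'])
```

Note that a known key such as `serializer` keeps the value already resolved in `cfg`; only unrecognised keys land in `opts`.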


