22 changes: 20 additions & 2 deletions CHANGELOG.md
@@ -2,9 +2,27 @@

All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](http://semver.org).

## [v1.2.0](https://github.com/puppetlabs/puppet_operational_dashboards/tree/temp-tag) (2022-06-10)
## [v1.3.0](https://github.com/puppetlabs/puppet_operational_dashboards/tree/temp-tag) (2022-09-19)

[Full Changelog](https://github.com/puppetlabs/puppet_operational_dashboards/compare/v1.1.0...temp-tag)
[Full Changelog](https://github.com/puppetlabs/puppet_operational_dashboards/compare/v1.2.0...temp-tag)

### Added

- \(SUP-3646\) Grafana Bump for security vulnerability [\#79](https://github.com/puppetlabs/puppet_operational_dashboards/pull/79) ([MartyEwings](https://github.com/MartyEwings))
- SUP-3276 Add system metrics from archives [\#71](https://github.com/puppetlabs/puppet_operational_dashboards/pull/71) ([m0dular](https://github.com/m0dular))
- \(SUP-3431\) Add index and toast stats to postgres [\#70](https://github.com/puppetlabs/puppet_operational_dashboards/pull/70) ([m0dular](https://github.com/m0dular))
- \(SUP-3220\) Rewrite Puppet server script [\#68](https://github.com/puppetlabs/puppet_operational_dashboards/pull/68) ([m0dular](https://github.com/m0dular))

### Fixed

- README.md: Cleanup trailing whitespace / Fix typo [\#73](https://github.com/puppetlabs/puppet_operational_dashboards/pull/73) ([bastelfreak](https://github.com/bastelfreak))
- \(SUP-3396\) Remove ha\_last-sync-succeeded mbeans [\#72](https://github.com/puppetlabs/puppet_operational_dashboards/pull/72) ([m0dular](https://github.com/m0dular))
- \(SUP-3388\) Change error handling in PDB script [\#69](https://github.com/puppetlabs/puppet_operational_dashboards/pull/69) ([m0dular](https://github.com/m0dular))
- \(SUP-3403\) Fix labels in compile/borrow panel [\#67](https://github.com/puppetlabs/puppet_operational_dashboards/pull/67) ([m0dular](https://github.com/m0dular))

## [v1.2.0](https://github.com/puppetlabs/puppet_operational_dashboards/tree/v1.2.0) (2022-06-10)

[Full Changelog](https://github.com/puppetlabs/puppet_operational_dashboards/compare/v1.1.0...v1.2.0)

### Added

53 changes: 46 additions & 7 deletions README.md
@@ -17,13 +17,20 @@
- [Importing archive metrics](#importing-archive-metrics)
- [Default Dashboards Available](#default-dashboards-available)
- [Puppetserver Performance](#puppetserver-performance)
- [Puppetserver Workload](#puppetserver-workload)
- [File Sync Metrics](#file-sync-metrics)
- [PuppetDB Performance](#puppetdb-performance)
- [PuppetDB Workload](#puppetdb-workload)
- [Postgres Metrics](#postgres-metrics)
- [Limitations](#limitations)
- [Troubleshooting](#troubleshooting)
- [Postgres Performance](#postgres-performance)
- [Limitations](#limitations)
- [Ubuntu Telegraf Package](#ubuntu-telegraf-package)
- [Upgrading from puppet_metrics_dashboard](#upgrading-from-puppet_metrics_dashboard)
- [Applying classes on PE 2021.5 and 2021.6](#applying-classes-on-pe-20215-and-20216)
- [Installing on openSUSE 15](#installing-on-opensuse-15)
- [Troubleshooting](#troubleshooting)
- [Grafana datasource and time interval](#grafana-datasource-and-time-interval)
- [Telegraf errors](#telegraf-errors)
- [Supporting Content](#supporting-content)
- [Articles](#articles)
- [Videos](#videos)
## Description

This module is a replacement for the [puppet_metrics_dashboard module](https://forge.puppet.com/modules/puppetlabs/puppet_metrics_dashboard). It is used to configure Telegraf, InfluxDB, and Grafana to collect, store, and display metrics collected from Puppet services. By default, those components are installed on a separate Dashboard node by applying the base class of this module to that node. That class will automatically query PuppetDB for Puppet Infrastructure nodes (Primary server, Compilers, PuppetDB hosts, PostgreSQL hosts) or you can specify them via associated class parameters. It is not recommended to apply the base class of this module to one of your Puppet Infrastructure nodes.
@@ -124,15 +131,15 @@ This dashboard is to inspect Puppet server performance and troubleshoot the `pe-
- Average free JRubies
- Average requested JRubies
- Average JRuby borrow time
- Average JRuby wait time
- Average JRuby wait time
- Heap Memory and Uptime
This panel displays the following JVM metrics:
- Heap Committed
- Heap Used
- Uptime
- Average Requested JRubies
- Average Borrow/Compile Time
- Avergae Free JRubies
- Average Free JRubies
- Average Wait Time
- HTTP Client Metrics
This panel displays the various network related metrics performed by Puppet server. Examples include:
@@ -292,3 +299,35 @@ telegraf --test --debug --config /etc/telegraf/telegraf.conf --config /etc/teleg
```

will only collect Puppet server metrics.


---

# Supporting Content

### Articles

The [Support Knowledge base](https://support.puppet.com/hc/en-us) is a searchable repository for technical information and how-to guides for all Puppet products.

This Module has the following specific Article(s) available:

1. [Manage the installation and configuration of metrics dashboards using the puppetlabs-puppet_operational_dashboards module for Puppet Enterprise](https://support.puppet.com/hc/en-us/articles/6374662483735)
2. [Monitor the performance of your PuppetDB](https://support.puppet.com/hc/en-us/articles/5918309176727)
3. [High swap usage on your primary server or replica in Puppet Enterprise](https://support.puppet.com/hc/en-us/articles/8118659796759)

### Videos

The [Support Video Playlist](https://youtube.com/playlist?list=PLV86BgbREluWKzzvVulR74HZzMl6SCh3S) is a resource of content generated by the support team.

This Module has the following specific video content available:


1. [Puppet Metrics Overview](https://youtu.be/LiCDoOUS4hg)
2. [Collecting and Displaying Puppet Metrics](https://youtu.be/13sBMQGDqsA)
3. [Interpreting Puppet Metrics](https://youtu.be/09iDO3DlKMQ)


---



2 changes: 1 addition & 1 deletion REFERENCE.md
@@ -273,7 +273,7 @@ Data type: `String`

Version of the Grafana package to install. Defaults to '8.2.2'

Default value: `'8.2.2'`
Default value: `'8.2.7'`

##### <a name="grafana_datasource"></a>`grafana_datasource`

42 changes: 40 additions & 2 deletions examples/telegraf.conf.d/postgres.conf
@@ -60,22 +60,60 @@ def apply(metric):
for k,v in subdict[db]['database_stats'].items():
if v == None:
continue

field = 'total' if k == 'size_bytes' else k
m.fields[field] = v
m.fields[k] = v
metrics.append(m)

if 'table_stats' in subdict[db].keys():
for table in subdict[db]['table_stats'].keys():
m = Metric("postgresql")
m.tags['db'] = db
m.tags['table'] = table
m.tags['table_name'] = table
m.tags['server'] = server
m.time = date
for k,v in subdict[db]['table_stats'][table].items():
if v == None:
continue
m.fields[k] = v

field = 'table' if k == 'size_bytes' else k
m.fields[field] = v
metrics.append(m)

if 'index_stats' in subdict[db].keys():
for table in subdict[db]['index_stats'].keys():
table_name = table.split('.')[-1]

m = Metric("postgresql")
m.tags['db'] = db
m.tags['table_name'] = table_name
m.tags['server'] = server
m.time = date
for k,v in subdict[db]['index_stats'][table].items():
if v == None:
continue

field = 'index' if k == 'size_bytes' else k
m.fields[field] = v
metrics.append(m)

if 'toast_stats' in subdict[db].keys():
for table in subdict[db]['toast_stats'].keys():
table_name = table.split('.')[-1]

m = Metric("postgresql")
m.tags['db'] = db
m.tags['table_name'] = table.split('.')[-1]
m.tags['server'] = server
m.time = date
for k,v in subdict[db]['toast_stats'][table].items():
if v == None:
continue

field = 'toast' if k == 'size_bytes' else k
m.fields[field] = v
metrics.append(m)
return metrics
'''
[processors.starlark.tagpass]
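
The field renaming in this hunk may be easier to follow outside the Starlark processor. The following is a minimal Python sketch, not the module's code: the helper names `rename_fields` and `table_tag` and the sample stat keys are hypothetical, and it assumes only what the hunk shows, namely that `size_bytes` is stored as `total`, `table`, `index`, or `toast` depending on the section, that null values are skipped, and that index and toast entries keep only the last dot-separated part of their key as the `table_name` tag.

```python
# Hypothetical stand-in for the per-section renaming shown in postgres.conf.
SIZE_FIELD_BY_SECTION = {
    "database_stats": "total",
    "table_stats": "table",
    "index_stats": "index",
    "toast_stats": "toast",
}


def rename_fields(section, stats):
    """Return the non-null stats, with 'size_bytes' renamed for the section."""
    size_field = SIZE_FIELD_BY_SECTION[section]
    fields = {}
    for key, value in stats.items():
        if value is None:
            continue  # mirrors the `if v == None: continue` guard
        fields[size_field if key == "size_bytes" else key] = value
    return fields


def table_tag(section, name):
    """Return the value used for the 'table_name' tag."""
    if section in ("index_stats", "toast_stats"):
        return name.split(".")[-1]
    return name


if __name__ == "__main__":
    # Sample keys are illustrative only.
    print(rename_fields("index_stats", {"size_bytes": 10, "idx_scan": None}))
    print(table_tag("toast_stats", "public.catalogs"))
```

Because each section writes its size to a differently named field while sharing the `table_name` tag, the table, index, and toast sizes for one table can presumably be plotted side by side in a single query.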
70 changes: 51 additions & 19 deletions examples/telegraf.conf.d/puppetdb.conf
@@ -18,12 +18,26 @@ def apply(metric):
date = time.parse_time(d['timestamp'], location="UTC").unix_nano
metrics = []

skip_keys = ['error', 'error_count', 'api-query-start', 'api-query-duration', 'puppetdb-status', 'status-service', 'jetty-queuedthreadpool', 'ha_last-sync-succeeded']
# The format of these mbeans may be different on older versions
mq_keys = ['mq_replace_facts', 'mq_store_report', 'mq_replace_catalog', 'mq_deactivate_node', 'mq_replace_catalog_inputs']

skip_keys = ['error', 'error_count', 'api-query-start', 'api-query-duration', 'puppetdb-status', 'status-service', 'jetty-queuedthreadpool', 'ha_last-sync-succeeded']
skip_fields = ['RateUnit', 'DurationUnit', 'state']

if 'error' in d['servers'][server]['puppetdb'].keys() and len(d['servers'][server]['puppetdb']['error']) > 0:
# Skip this file if the 'puppetdb-status' entry is not present
if 'puppetdb-status' not in d['servers'][server]['puppetdb'].keys():
return

# The format we expect for the mbean entries is a dict under the 'puppetdb' dict of the form:
# {
# "mbean-name": {
# "key_1": "value_1",
# "key_2": "value_2"
# }
# }

# If we iterate two levels deep and find another dict, we skip it

for k,v in d['servers'][server]['puppetdb'].items():
if k not in skip_keys:
if v == None:
@@ -33,19 +47,39 @@ def apply(metric):
m.tags['url'] = server

for i,j in v.items():
if j == None or i in skip_fields:
if j == None or i in skip_fields or type(j) == 'dict':
continue
m.tags['mbean'] = k.replace('ha_', '').replace('storage_', '').replace('_', '.')
m.fields[i] = float(j) if i in ['Min', 'Max'] else j

metrics.append(m)

queue_depth = d['servers'][server]['puppetdb']['puppetdb-status']['status']['queue_depth']
m = Metric("puppetdb")
m.time = date
m.tags['url'] = server
m.fields['queue_depth'] = queue_depth
metrics.append(m)
# These mbeans may be different in different versions, so a special case for them
for mq_key in mq_keys:
if mq_key in d['servers'][server]['puppetdb'].keys() and d['servers'][server]['puppetdb'][mq_key] != None:
for k,v in d['servers'][server]['puppetdb'][mq_key].items():
if k not in skip_keys:
if v == None:
continue
m = Metric("puppetdb")
m.time = date
m.tags['url'] = server

for i,j in v.items():
if j == None or i in skip_fields:
continue
m.tags['mbean'] = k.replace('ha_', '').replace('storage_', '').replace('_', '.')
m.fields[i] = float(j) if i in ['Min', 'Max'] else j

metrics.append(m)

if 'queue_depth' in d['servers'][server]['puppetdb']['puppetdb-status']['status']:
queue_depth = d['servers'][server]['puppetdb']['puppetdb-status']['status']['queue_depth']
m = Metric("puppetdb")
m.time = date
m.tags['url'] = server
m.fields['queue_depth'] = queue_depth
metrics.append(m)

if 'jetty-queuedthreadpool' in d['servers'][server]['puppetdb'].keys():
for k,v in d['servers'][server]['puppetdb']['jetty-queuedthreadpool'].items():
@@ -64,25 +98,23 @@ def apply(metric):

metrics.append(m)

if 'jvm-metrics' in d['servers'][server]['puppetdb']['status-service']['status']['experimental']:
subdict = d['servers'][server]['puppetdb']['status-service']['status']['experimental']['jvm-metrics']

metric = Metric("puppetdb")
metric.time = date
metric.tags['url'] = server

subdict = d['servers'][server]['puppetdb']['status-service']['status']['experimental']['jvm-metrics']

metric = Metric("puppetdb")
metric.time = date
metric.tags['url'] = server

recurse_dict(subdict, None, metric)
metrics.append(metric)
return metrics
recurse_dict(subdict, None, metric)
metrics.append(metric)
return metrics

def recurse_dict(dict, tags, metric):
for k,v in dict.items():
if type(v) == 'dict':
recurse_dict(v, k if tags == None else tags + "_{0}".format(k), metric)
else:
field = tags + "_" + k if tags else k
#tag = tags if tags else 'base'
metric.fields[field.replace(' ', '_')] = v

'''
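
For readers unfamiliar with Starlark, the flattening done by `recurse_dict` can be sketched in plain Python. This is an illustrative equivalent under stated assumptions, not the module's code: the function name `flatten` and the sample input are hypothetical, and it returns a dict instead of writing to `metric.fields`. Note that in Starlark `type(v)` returns a string, which is why the original compares against `'dict'`; Python uses `isinstance` instead.

```python
def flatten(nested, prefix=None, fields=None):
    """Flatten nested dicts into underscore-joined field names."""
    if fields is None:
        fields = {}
    for key, value in nested.items():
        if isinstance(value, dict):
            # Descend, extending the prefix with the current key.
            child = key if prefix is None else "{0}_{1}".format(prefix, key)
            flatten(value, child, fields)
        else:
            name = "{0}_{1}".format(prefix, key) if prefix else key
            # Spaces are replaced so the result is a clean field name.
            fields[name.replace(" ", "_")] = value
    return fields


if __name__ == "__main__":
    # Sample data is illustrative only.
    sample = {"heap-memory": {"used": 1, "committed": 2}, "up-time-ms": 5}
    print(flatten(sample))
```

The hunk wraps the call to `recurse_dict` in a check for `'jvm-metrics'`, so a status payload without that key no longer raises a lookup error before the metrics are returned.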