Showing with 60 additions and 9 deletions.
  1. +12 −0 CHANGELOG.md
  2. +6 −3 README.md
  3. +4 −3 data/static.yaml
  4. +8 −0 lib/facter/agent_status_check.rb
  5. +28 −1 lib/facter/pe_status_check.rb
  6. +1 −1 metadata.json
  7. +1 −1 spec/acceptance/pe_status_check_spec.rb
12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,18 @@

All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](http://semver.org).

## [v2.3.0](https://github.com/puppetlabs/puppetlabs-pe_status_check/tree/v2.3.0) (2022-07-27)

[Full Changelog](https://github.com/puppetlabs/puppetlabs-pe_status_check/compare/v2.2.0...v2.3.0)

### Added

- \(SUP-3362\) add CRL expiration check [\#149](https://github.com/puppetlabs/puppetlabs-pe_status_check/pull/149) ([MartyEwings](https://github.com/MartyEwings))

### Fixed

- \(SUP-3491\) Cast free\_jrubies value to float [\#151](https://github.com/puppetlabs/puppetlabs-pe_status_check/pull/151) ([m0dular](https://github.com/m0dular))

## [v2.2.0](https://github.com/puppetlabs/puppetlabs-pe_status_check/tree/v2.2.0) (2022-07-15)

[Full Changelog](https://github.com/puppetlabs/puppetlabs-pe_status_check/compare/v2.1.1...v2.2.0)
9 changes: 6 additions & 3 deletions README.md
@@ -212,7 +212,7 @@ This fact is used to determine which individual status checks should be run on e
| postgres | The node has just the database role |
| legacy_compiler | The node has the master role but not the puppetdb role |
| legacy_primary | The node is a certificate authority but not a postgres host |
| unknown | Then node type could not be determined |
| unknown | The node type could not be determined |

A failure to determine node type will result in a safe subset of checks being run that will work on all infrastructure node types.

@@ -245,8 +245,10 @@ Refer below for next steps when any indicator reports a `false`.
| S0018 | Determines if there are any `OutOfMemory` errors in the `orchestrator` log. | [Increase the Java heap size for that service.](https://support.puppet.com/hc/en-us/articles/360015511413) | Open a Support ticket referencing S0018 and provide [puppet metrics](https://support.puppet.com/hc/en-us/articles/231751308), `/var/log/puppetlabs/orchestration-services/orchestration-services.log`, and output of `puppet infra tune`.|
|S0019|Determines if there are sufficent jRubies available to serve agents.| Insufficent jRuby availability results in queued puppet agents and overall poor system performance. There can be many causes: [Insufficent server tuning for load](https://support.puppet.com/hc/en-us/articles/360013148854), [a thundering herd](https://support.puppet.com/hc/en-us/articles/215729277), and [insufficient system resources for scale.](https://puppet.com/docs/pe/latest/hardware_requirements.html#hardware_requirements) | If self-sevice fails to resolve the issue, open a ticket referencing S0019 and provide a description of actions so far and the output of the [support script.](https://puppet.com/docs/pe/latest/getting_support_for_pe.html#pe_support_script)
|S0021|Determines if free memory is less than 10%.| Ensure your system hardware availablity matches the [recommended configuration](https://puppet.com/docs/pe/latest/hardware_requirements.html#hardware_requirements), note this assumes no third-party software using significant resources, adapt requirements accordingly for third-party requirements. | If you have issues with memory utilization in Puppet Enterprise that is not expected, open a Support ticket, referencing S0021 and provide the output of the [support script](https://puppet.com/docs/pe/latest/getting_support_for_pe.html#pe_support_script)
| S0022 | Determines if there is a valid Puppet Enterprise license in place at `/etc/puppetlabs/license.key` on the primary server which is not expiring in the next 90 days. | [Get help with Puppet Enterprise license issues](https://support.puppet.com/hc/en-us/articles/360017933313) | Open a Support ticket referencing S0022 and provide the output of the following commands `ls -la /etc/puppetlabs/license.key` and `cat /etc/puppetlabs/license.key`. |
| S0024 | Determines if there are files in the puppetdb discard directory newer than 1 week old | Recent files indicate an issue that causes PuppetDB to reject incoming data. Invesitgate Puppetdb logs at the time the data was rejected to find a cause, | Open a Support ticket referencing S0024 and provide a copy of the PuppetDB log for the time in question, along with a sample of the most recent file in the following directory `/opt/puppetlabs/server/data/puppetdb/stockpile/discard/`
| S0022 | Determines if there is a valid Puppet Enterprise license in place at `/etc/puppetlabs/license.key` on the primary server which is not expiring in the next 90 days. | [Get help with Puppet Enterprise license issues](https://support.puppet.com/hc/en-us/articles/360017933313)| Open a Support ticket referencing S0022 and provide the output of the following commands `ls -la /etc/puppetlabs/license.key` and `cat /etc/puppetlabs/license.key`. |
| S0023 | Determines if certificate authority CRL expires in the next 90 days. | The solution is to reissue a new CRL from the Puppet CA, note this will also remove any revoked certificates. To do this follow the instructions in [this module](https://forge.puppet.com/modules/m0dular/crl_truncate) | Open a Support ticket referencing S0023 and provide [support script](https://puppet.com/docs/pe/latest/getting_support_for_pe.html#pe_support_script) output from the primary server, and errors or output collected from the resolution steps |
| S0024 | Determines if there are files in the puppetdb discard directory newer than 1 week old | Recent files indicate an issue that causes PuppetDB to reject incoming data. Invesitgate Puppetdb logs at the time the data was rejected to find a cause, | Open a Support ticket referencing S0024 and provide a copy of the PuppetDB log for the time in question, along with a sample of the most recent file in the following directory `/opt/puppetlabs/server/data/puppetdb/stockpile/discard/`|
| S0025 | Determines if the host copy of the CRL expires in the next 90 days. | If the Output of S0023 on the primary server is also false use the resolution steps in S0023. If S0023 on the Primary is True, follow [this article](https://support.puppet.com/hc/en-us/articles/7631166251415) | Open a Support ticket referencing S0025 and provide any errors you recieved in following the resolution steps | |
| S0029 | Determines if number of current connections to Postgresql DB is approaching 90% of the `max_connections` defined. | To increase the maximum number of connections in postgres, adjust `puppet_enterprise::profile::database::max_connections`. Consider to increase `shared_buffers` if that is the case as each connection consumes RAM. | Open a Support ticket referencing S0029 and provide the current and future value for `puppet_enterprise::profile::database::max_connections` and we will assist.
| S0030 | Determines when infrastructure components have the setting `use_cached_catalog` set to true. | Don't configure use_cached_catalog on PE infrastructure nodes. It prevents the management of key infrastructure settings. Disable this setting on all infrastructure components. [See our documentation for more information](https://puppet.com/docs/puppet/latest/configuration.html#use-cached-catalog) | If you encounter errors after disabling use_cached_catalog, open a Support ticket referencing S0030 and provide the errors.
| S0031 | Determines if old PE agent packages exist on the primary server. | [Remove the old PE agent packages.](https://support.puppet.com/hc/en-us/articles/4405333422103) |
@@ -271,6 +273,7 @@ Refer below for next steps when any indicator reports a `false`.
| AS001 | Determines if the agent host certificate is expiring in the next 90 days. | Puppet Enterprise has a plan built into extend agent certificates. Use a puppet query to find expiring host certificates and pass the node ID to this plan: `puppet plan run enterprise_tasks::agent_cert_regen agent=$(puppet query 'inventory[certname] { facts.agent_status_check.AS001 = false }' \| jq -r '.[].certname' \| paste -sd, -) master=$(puppet config print certname)` | If the plan fails to run, open a support ticket referencing AS001 and provide the error message received when running the plan. |
| AS002 | Determines if the pxp-agent has an established connection to a pxp broker | Ensure the pxp-agent service is running, if running check `/var/log/puppetlabs/pxp-agent/pxp-agent.log` (on *nix) or `C:/ProgramData/PuppetLabs/pxp-agent/var/log/pxp-agent.log` (on Windows) — Contains the for connection issues, first ensuring the agent is connecting to the proper endpoint, for example, a compiler and not the primary. This fact can also be used as a target filter for running tasks, ensuring time is not wasted sending instructions to agents not connected to a broker| If unable to make a connection to a broker, raise a ticket with the support team quoting AS002 and attaching the file `/var/log/puppetlabs/pxp-agent/pxp-agent.log` (on *nix) or `C:/ProgramData/PuppetLabs/pxp-agent/var/log/pxp-agent.log` (on Windows) along with the conclusions of your investigation so far |
| AS003 | Determines the certname configuration parameter is incorrectly set outside of the [main] section of the puppet.conf file. | The Puppet documentation states clearly certname should always be placed solely in the [main] section to prevent unforseen issues with the operation of the puppet agent https://puppet.com/docs/puppet/7/configuration.html#certname | If unable to determine why the indicator is being raised. Open a ticket with the support team quoting AS003 and attaching the file `puppet.conf` along with the conclusions of your investigation so far . |
| AS004 | Determines if the host copy of the CRL expires in the next 90 days. | If the Output of S0023 on the primary server is also false use the resolution steps in S0023. If S0023 on the Primary is True, follow [this article](https://support.puppet.com/hc/en-us/articles/7631166251415) | Open a Support ticket referencing AS004 and provide any errors you recieved in following the resolution steps |
## How to report an issue or contribute to the module
7 changes: 4 additions & 3 deletions data/static.yaml
@@ -22,9 +22,9 @@ pe_status_check::S0019: "S0019 Determines if there are sufficent jrubies availab
pe_status_check::S0020: "S0020 Determines"
pe_status_check::S0021: "S0021 Determines if there are sufficent jrubies available to serve agents"
pe_status_check::S0022: "S0022 Determines if there is a valid Puppet Enterprise license in place at /etc/puppetlabs/license.key on your primary which is not going to expire in the next 90 days"
pe_status_check::S0023: "S0023 Determines"
pe_status_check::S0023: "S0023 Determines if the CA CRL expires within 90 days"
pe_status_check::S0024: "S0024 Determines if there are files in the puppetdb discard directory newer than 1 week old"
pe_status_check::S0025: "S0025 Determines "
pe_status_check::S0025: "S0025 Determines if the host copy of the CRL expires within 90 days"
pe_status_check::S0026: "S0026 Determines"
pe_status_check::S0027: "S0027 Determines"
pe_status_check::S0028: "S0028 Determines"
@@ -44,4 +44,5 @@ pe_status_check::S0041: "S0041 Determines if the pxp broker has an established
pe_status_check::S0042: "S0042 Determines if the pxp-agent has an established connection to a pxp broker"
pe_status_check::AS001: "AS001 Determines if the agent host certificate is expiring within 90 days"
pe_status_check::AS002: "AS002 Determines if the pxp-agent has an established connection to a pxp broker"
pe_status_check::AS003: "AS003 Determines the certname configuration parameter is incorrectly set outside of the [main] section of the puppet.conf file"
pe_status_check::AS003: "AS003 Determines the certname configuration parameter is incorrectly set outside of the [main] section of the puppet.conf file"
pe_status_check::AS004: "AS004 Determines if the host copy of the CRL expires within 90 days"
8 changes: 8 additions & 0 deletions lib/facter/agent_status_check.rb
@@ -39,4 +39,12 @@
#
{ AS003: !Puppet.settings.set_in_section?(:certname, :agent) && !Puppet.settings.set_in_section?(:certname, :server) && !Puppet.settings.set_in_section?(:certname, :user) }
end
chunk(:AS004) do
# Is the host copy of the crl expiring in the next 90 days
hostcrl = Puppet.settings[:hostcrl]
next unless File.exist?(hostcrl)

x509_cert = OpenSSL::X509::CRL.new(File.read(hostcrl))
{ AS004: (x509_cert.next_update - Time.now) > 7_776_000 }
end
end
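
Reviewer note: the CRL checks in this change (S0023, S0025, AS004) all reduce to one comparison: the CRL's `next_update` must be more than 7,776,000 seconds (90 days) in the future. Below is a minimal standalone sketch of that comparison. The helper names `crl_valid_beyond_90_days?` and `build_demo_crl` and the `Demo CA` issuer are hypothetical and not part of the module; the demo CRL is generated in memory only so the snippet runs without a Puppet installation.

```ruby
require 'openssl'

NINETY_DAYS = 90 * 24 * 60 * 60 # 7_776_000 seconds, the threshold used in the diff

# Hypothetical helper: true when the CRL's next_update is more than 90 days away.
def crl_valid_beyond_90_days?(crl_pem, now = Time.now)
  crl = OpenSSL::X509::CRL.new(crl_pem)
  (crl.next_update - now) > NINETY_DAYS
end

# Hypothetical helper: builds a throwaway CRL signed by a freshly generated key.
# Assumption for illustration only; a real CRL comes from the Puppet CA.
def build_demo_crl(days_until_next_update)
  key = OpenSSL::PKey::RSA.new(2048)
  crl = OpenSSL::X509::CRL.new
  crl.version = 1
  crl.issuer = OpenSSL::X509::Name.parse('/CN=Demo CA')
  crl.last_update = Time.now
  crl.next_update = Time.now + (days_until_next_update * 24 * 60 * 60)
  crl.sign(key, OpenSSL::Digest.new('SHA256'))
  crl.to_pem
end

fresh_crl = build_demo_crl(120)    # next_update roughly 120 days out
expiring_crl = build_demo_crl(30)  # next_update roughly 30 days out
```

In the facts themselves the PEM is read from `Puppet.settings[:hostcrl]` or `Puppet.settings[:cacrl]`, and the chunk is skipped when the file does not exist.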
29 changes: 28 additions & 1 deletion lib/facter/pe_status_check.rb
@@ -199,7 +199,15 @@
response = PEStatusCheck.http_get('/status/v1/services/pe-jruby-metrics?level=debug', 8140)
if response
free_jrubies = response.dig('status', 'experimental', 'metrics', 'average-free-jrubies')
{ S0019: free_jrubies >= 0.9 }
{
S0019: if free_jrubies.nil?
false
elsif free_jrubies.is_a?(String)
false
else
free_jrubies.to_f >= 0.9
end
}
else
{ S0019: false }
end
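
Reviewer note: the new S0019 branch reports `false` when the metric is missing or arrives as a string, and otherwise compares the value against 0.9 after casting it to a float. The sketch below restates that logic as a plain Ruby method so it can be exercised without Facter or a running Puppet server. `sufficient_free_jrubies?` and `sample` are hypothetical names, and the payloads are invented for illustration; only the key path comes from the diff.

```ruby
# Hypothetical helper mirroring the S0019 branch logic; not part of the module.
def sufficient_free_jrubies?(response, threshold = 0.9)
  return false unless response

  free = response.dig('status', 'experimental', 'metrics', 'average-free-jrubies')
  # A missing metric or a string value is treated as a failed check.
  return false if free.nil? || free.is_a?(String)

  free.to_f >= threshold
end

# Invented payload shaped like the key path used in the diff.
def sample(value)
  { 'status' => { 'experimental' => { 'metrics' => { 'average-free-jrubies' => value } } } }
end

healthy   = sufficient_free_jrubies?(sample(2))     # integer metric above the threshold
starved   = sufficient_free_jrubies?(sample(0.5))   # below the threshold
malformed = sufficient_free_jrubies?(sample('NaN')) # string value, check fails
missing   = sufficient_free_jrubies?({})            # key path absent, check fails
```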
@@ -249,6 +257,16 @@
{ S0022: validity }
end

chunk(:S0023) do
# Is the CA_CRL expiring in the next 90 days
next unless ['primary', 'legacy_primary'].include?(Facter.value('pe_status_check_role'))
cacrl = Puppet.settings[:cacrl]
next unless File.exist?(cacrl)

x509_cert = OpenSSL::X509::CRL.new(File.read(cacrl))
{ S0023: (x509_cert.next_update - Time.now) > 7_776_000 }
end

chunk(:S0024) do
next unless ['primary', 'legacy_primary', 'replica', 'pe_compiler'].include?(Facter.value('pe_status_check_role'))

@@ -265,6 +283,15 @@
end
end

chunk(:S0025) do
# Is the host copy of the crl expiring in the next 90 days
hostcrl = Puppet.settings[:hostcrl]
next unless File.exist?(hostcrl)

x509_cert = OpenSSL::X509::CRL.new(File.read(hostcrl))
{ S0025: (x509_cert.next_update - Time.now) > 7_776_000 }
end

chunk(:S0029) do
next unless ['primary', 'replica', 'postgres'].include?(Facter.value('pe_status_check_role'))
# check if concurrnet connections to Postgres approaching 90% defined
2 changes: 1 addition & 1 deletion metadata.json
@@ -1,6 +1,6 @@
{
"name": "puppetlabs-pe_status_check",
"version": "2.2.0",
"version": "2.3.0",
"author": "Marty Ewings",
"summary": "A Puppet Enterprise Module to Promote Preventative Maintenance and Self Service",
"license": "Apache-2.0",
2 changes: 1 addition & 1 deletion spec/acceptance/pe_status_check_spec.rb
@@ -16,7 +16,7 @@
# Test Confirms all facts are false which is another indicator the class is performing correctly
describe 'check no pe_status_check fact is false' do
it 'if idempotent all facts should be true' do
expect(host_inventory['facter']['pe_status_check'].size).to eq(33)
expect(host_inventory['facter']['pe_status_check'].size).to eq(35)
expect(host_inventory['facter']['pe_status_check'].filter { |_k, v| !v }).to be_empty
end
end
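
Reviewer note: the acceptance test asserts two things about the structured fact: it has the expected number of indicators, and filtering for falsy values leaves nothing behind. A small sketch of that filter on an invented fact hash may make the second assertion easier to read. `failing_indicators` is a hypothetical helper and the indicator values are made up.

```ruby
# Hypothetical helper: returns only the indicators whose value is falsy,
# using the same block as the acceptance test.
def failing_indicators(fact)
  fact.filter { |_id, ok| !ok }
end

# Invented fact values for illustration.
mixed     = { 'S0001' => true, 'S0023' => false, 'AS004' => true }
all_clear = { 'S0001' => true, 'S0023' => true }

failing = failing_indicators(mixed)     # only S0023 remains
none    = failing_indicators(all_clear) # empty hash, which is what the spec expects
```

Because the block tests for any falsy value, an indicator that resolved to `nil` would also be caught by this filter.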