Showing with 166 additions and 21 deletions.
  1. +17 −0 CHANGELOG.md
  2. +7 −2 README.md
  3. +4 −3 data/common.yaml
  4. +15 −0 lib/facter/agent_status_check.rb
  5. +74 −11 lib/facter/pe_status_check.rb
  6. +29 −2 lib/shared/pe_status_check.rb
  7. +1 −1 metadata.json
  8. +19 −2 spec/acceptance/pe_status_check_spec.rb
17 changes: 17 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,23 @@

All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](http://semver.org).

## [v1.3.0](https://github.com/puppetlabs/puppetlabs-pe_status_check/tree/v1.3.0) (2022-04-07)

[Full Changelog](https://github.com/puppetlabs/puppetlabs-pe_status_check/compare/v1.2.0...v1.3.0)

### Added

- \(SUP-3150\) Broker TCP Checks for infra components [\#109](https://github.com/puppetlabs/puppetlabs-pe_status_check/pull/109) ([MartyEwings](https://github.com/MartyEwings))
- \(SUP-3121\) Agent connection to pxp broker [\#106](https://github.com/puppetlabs/puppetlabs-pe_status_check/pull/106) ([MartyEwings](https://github.com/MartyEwings))
- \(SUP-2917\) Add indicator S0038 to check number of environments that are present in $codedir/environments [\#105](https://github.com/puppetlabs/puppetlabs-pe_status_check/pull/105) ([taikaa](https://github.com/taikaa))
- \(SUP-2908\) check current connections to Postgres less than 90% defined maximum [\#104](https://github.com/puppetlabs/puppetlabs-pe_status_check/pull/104) ([sandrajiang](https://github.com/sandrajiang))

### Fixed

- \(SUP-3180\) Rescue a loaderror when checking filesystem [\#111](https://github.com/puppetlabs/puppetlabs-pe_status_check/pull/111) ([jarretlavallee](https://github.com/jarretlavallee))
- \(SUP-3101\) Add exception handling and Facter warnings for license\_type and end date that do not exist or are invalid. Fact no longer resolves to true as a catchall. Change license\_type and end\_date variable assignments to first item of array rather than converting entire array to a string. Update spec test. [\#103](https://github.com/puppetlabs/puppetlabs-pe_status_check/pull/103) ([taikaa](https://github.com/taikaa))
- \(SUP-3122\) Fix PSQL node detection in 2021.5 [\#98](https://github.com/puppetlabs/puppetlabs-pe_status_check/pull/98) ([MartyEwings](https://github.com/MartyEwings))

## [v1.2.0](https://github.com/puppetlabs/puppetlabs-pe_status_check/tree/v1.2.0) (2022-03-23)

[Full Changelog](https://github.com/puppetlabs/puppetlabs-pe_status_check/compare/v1.1.0...v1.2.0)
9 changes: 7 additions & 2 deletions README.md
@@ -90,20 +90,25 @@ Refer to this section for next steps when any indicator reports a `false`.
|S0021|Determines if free memory is less than 10%.| Ensure your system hardware availability matches the [recommended configuration](https://puppet.com/docs/pe/latest/hardware_requirements.html#hardware_requirements). Note that this assumes no third-party software is using significant resources; adapt the requirements accordingly if it is. | If you see unexpected memory utilization in Puppet Enterprise, open a Support ticket referencing S0021 and provide the output of the [support script](https://puppet.com/docs/pe/latest/getting_support_for_pe.html#pe_support_script)
| S0022 | Determines if there is a valid Puppet Enterprise license in place at `/etc/puppetlabs/license.key` on the primary server which is not expiring in the next 90 days. | [Get help with Puppet Enterprise license issues](https://support.puppet.com/hc/en-us/articles/360017933313) | Open a Support ticket referencing S0022 and provide the output of the following commands `ls -la /etc/puppetlabs/license.key` and `cat /etc/puppetlabs/license.key`. |
| S0024 | Determines if there are files in the PuppetDB discard directory less than 1 week old | Recent files indicate an issue that causes PuppetDB to reject incoming data. Investigate the PuppetDB logs at the time the data was rejected to find a cause. | Open a Support ticket referencing S0024 and provide a copy of the PuppetDB log for the time in question, along with a sample of the most recent file in the following directory `/opt/puppetlabs/server/data/puppetdb/stockpile/discard/`
| S0030 | Determines when infrastructure components have the setting `use_cached_catalog` set to true. | Don't configure use_cached_catalog on PE infrastructure nodes. It prevents the management of key infrastructure settings. Disable this setting on all infrastructure components. [See our documentation for more information](https://puppet.com/docs/puppet/latest/configuration.html#use-cached-catalog). | If you encounter errors after disabling use_cached_catalog, open a Support ticket referencing S0030 and provide the errors.
| S0029 | Determines if the number of current connections to the PostgreSQL database is approaching 90% of the defined `max_connections`. | To increase the maximum number of connections in PostgreSQL, adjust `puppet_enterprise::profile::database::max_connections`. Consider increasing `shared_buffers` as well, because each connection consumes RAM. | Open a Support ticket referencing S0029 and provide the current and intended values for `puppet_enterprise::profile::database::max_connections` and we will assist.
| S0030 | Determines when infrastructure components have the setting `use_cached_catalog` set to true. | Don't configure use_cached_catalog on PE infrastructure nodes. It prevents the management of key infrastructure settings. Disable this setting on all infrastructure components. [See our documentation for more information](https://puppet.com/docs/puppet/latest/configuration.html#use-cached-catalog) | If you encounter errors after disabling use_cached_catalog, open a Support ticket referencing S0030 and provide the errors.
| S0031 | Determines if old PE agent packages exist on the primary server. | [Remove the old PE agent packages.](https://support.puppet.com/hc/en-us/articles/4405333422103) |
| S0033 | Determines if Hiera 5 is in use. | Upgrading to Hiera 5 [offers some major advantages](https://puppet.com/docs/puppet/latest/hiera_migrate) | If you're having issues upgrading to Hiera 5 or if your global Hiera configuration file was erroneously modified, open a Support ticket referencing S0033. Provide your global Hiera configuration file `puppet config print hiera_config`; the default location is `/etc/puppetlabs/puppet/hiera.yaml`.
| S0034 | Determines if your PE deployment has not been upgraded in the last year. | [Upgrade your PE instance.](https://puppet.com/docs/pe/latest/upgrading_pe.html) | If you need help upgrading PE, open a ticket and provide your current version and the version you would like to upgrade to (this could be the LTS or STS version of PE). |
| S0036 | Determines if `max-queued-requests` is set above 150. | [The maximum value for `jruby_puppet_max_queued_requests` is 150](https://support.puppet.com/hc/en-us/articles/115003769433) | If you are unable to change the value of `jruby_puppet_max_queued_requests` or encounter an error when changing it, open a Support ticket referencing S0036 and provide any errors output when attempting to change the setting.
| S0038 | Determines whether the number of environments within `$codedir/environments` is less than 100 | Having a large number of code environments can negatively affect Puppet Server performance. [See the Configuring Puppet Server documentation for more information.](https://puppet.com/docs/pe/latest/config_puppetserver.html#configuring_and_tuning_puppet_server) | Open a Support ticket referencing S0038 and provide the [support script](https://puppet.com/docs/pe/latest/getting_support_for_pe.html#pe_support_script) output from the primary server.
| S0039 | Determines if Puppet Server has reached its `queue-limit-hit-rate` and is sending messages to agents. | [Check the max-queued-requests article for more information.](https://support.puppet.com/hc/en-us/articles/115003769433) | Open a Support ticket referencing S0039 and provide the [support script](https://puppet.com/docs/pe/latest/getting_support_for_pe.html#pe_support_script) output from the primary server.
| S0040 | Determines if PE is collecting system metrics. | If system metrics are not collected by default, the sysstat package is not installed on the impacted PE infrastructure component. Install the package and set the parameter `puppet_enterprise::enable_system_metrics_collection` to true. [See the documentation.](https://puppet.com/docs/pe/latest/getting_support_for_pe.html#puppet_metrics_collector) | If, after system metrics are configured, you do not see any files in `/var/log/sa` or the `/var/log/sa` directory does not exist, open a Support ticket. |
| S0041 | Determines if the pxp broker on a compiler has an established connection to another pxp broker | To resolve a connection issue from a compiler to a pcp broker, examine the log `/var/log/puppetlabs/puppetserver/pcp-broker.log` for an explanation. Compilers should attempt to connect to port 8143 on the primary server. SSL cannot be terminated on a network appliance and must pass through directly to the primary server. Ensure the connection attempt is not to another compiler in the pool. | If unable to make a connection to a broker, raise a ticket with the support team quoting S0041 and attaching the file `/var/log/puppetlabs/puppetserver/pcp-broker.log` along with the conclusions of your investigation so far |
| S0042 | Determines if the pxp-agent has an established connection to a pxp broker | Ensure the pxp-agent service is running; check S0002 can make that determination. If it is running, check `/var/log/puppetlabs/pxp-agent/pxp-agent.log` for connection issues, first ensuring the agent is connecting to the proper endpoint, for example, a compiler and not the primary. This fact can also be used as a target filter for running tasks, ensuring time is not wasted sending instructions to agents not connected to a broker. | If unable to make a connection to a broker, raise a ticket with the support team quoting S0042 and attaching the file `/var/log/puppetlabs/pxp-agent/pxp-agent.log` along with the conclusions of your investigation so far |

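The S0029 threshold described above can be illustrated with a minimal Ruby sketch. This is not the module's implementation: the helper name `connections_below_threshold?` and its parameters are hypothetical, and it assumes the current connection count and `max_connections` have already been read from PostgreSQL. It also assumes that a count exactly at 90% is reported as unhealthy.

```ruby
# Hypothetical helper (not part of pe_status_check): returns true while the
# current connection count is below the given fraction of max_connections.
def connections_below_threshold?(current, max_connections, ratio = 0.9)
  raise ArgumentError, 'max_connections must be positive' unless max_connections.positive?

  current < max_connections * ratio
end
```

With `max_connections` set to 200, the sketch treats 150 connections as healthy and 185 as unhealthy, because the cut-off is 180.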

## Fact: agent_status_check

| Indicator ID | Description | Self-service steps | What to include in a Support ticket |
|--------------|------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| AS001 | Determines if the agent host certificate is expiring in the next 90 days. | Puppet Enterprise has a plan built in to extend agent certificates.|Uses a puppet query to find expiring host certificates and pass the node ID to this plan: `puppet plan run enterprise_tasks::agent_cert_regen agent=$(puppet query 'inventory[certname] { facts.agent_status_check.AS001 = false }' \| jq -r '.[].certname' \| paste -sd, -) master=$(puppet config print certname)` | If the plan fails to run, open a support ticket referencing AS001 and provide the error message received when running the plan.
| AS001 | Determines if the agent host certificate is expiring in the next 90 days. | Puppet Enterprise has a plan built in to extend agent certificates. Use a puppet query to find expiring host certificates and pass the node ID to this plan: `puppet plan run enterprise_tasks::agent_cert_regen agent=$(puppet query 'inventory[certname] { facts.agent_status_check.AS001 = false }' \| jq -r '.[].certname' \| paste -sd, -) master=$(puppet config print certname)` | If the plan fails to run, open a support ticket referencing AS001 and provide the error message received when running the plan. |
| AS002 | Determines if the pxp-agent has an established connection to a pxp broker | Ensure the pxp-agent service is running. If it is running, check `/var/log/puppetlabs/pxp-agent/pxp-agent.log` for connection issues, first ensuring the agent is connecting to the proper endpoint, for example, a compiler and not the primary. This fact can also be used as a target filter for running tasks, ensuring time is not wasted sending instructions to agents not connected to a broker. | If unable to make a connection to a broker, raise a ticket with the support team quoting AS002 and attaching the file `/var/log/puppetlabs/pxp-agent/pxp-agent.log` along with the conclusions of your investigation so far |

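As a rough illustration of what AS002 looks for on Linux agents, the sketch below scans captured socket-listing text for an established pxp-agent connection on port 8142. The helper name `pxp_agent_connected?` and the sample listing are hypothetical, and matching on `:8142` rather than the bare number is a stricter assumption of this sketch. It does not cover the Windows `netstat` variant, whose output has no process name.

```ruby
# Hypothetical helper (not part of the module): true when any single line of
# the listing shows an established pxp-agent connection to the broker port.
def pxp_agent_connected?(socket_listing, port: 8142)
  socket_listing.each_line.any? do |line|
    line.include?('ESTAB') && line.include?(":#{port}") && line.include?('pxp-agent')
  end
end

# Illustrative text in the style of `ss -tunp` output; addresses are made up.
SAMPLE_LISTING = <<~LISTING
  tcp   ESTAB  0  0  10.0.0.5:51514  10.0.0.2:8142   users:(("pxp-agent",pid=1234,fd=9))
  tcp   ESTAB  0  0  10.0.0.5:22     10.0.0.9:50122  users:(("sshd",pid=900,fd=4))
LISTING
```

Calling `pxp_agent_connected?(SAMPLE_LISTING)` returns `true`, while an empty listing returns `false`, which mirrors how the fact reports `false` when no matching socket is found.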
## How to report an issue or contribute to the module

7 changes: 4 additions & 3 deletions data/common.yaml
@@ -28,7 +28,7 @@ pe_status_check::S0025: ""
pe_status_check::S0026: ""
pe_status_check::S0027: ""
pe_status_check::S0028: ""
pe_status_check::S0029: ""
pe_status_check::S0029: "Determines if number of current connections to Postgresql DB is approaching 90% of the max_connections defined."
pe_status_check::S0030: "Determines when infrastructure components that run with the setting use_cached_catalog are set to true"
pe_status_check::S0031: "Determines if old PE agent packages still exist on the Primary server"
pe_status_check::S0032: ""
@@ -40,6 +40,7 @@ pe_status_check::S0037: ""
pe_status_check::S0038: ""
pe_status_check::S0039: "Determines if Puppetserver has a non zero queue-limit-hit-rate"
pe_status_check::S0040: "Determines if the deployment is collecting system metrics"
pe_status_check::S0041: ""
pe_status_check::S0042: ""
pe_status_check::S0041: "Determines if the pxp broker has an established connection to another pxp broker"
pe_status_check::S0042: "Determines if the pxp-agent has an established connection to a pxp broker"
pe_status_check::AS001: "Determines if the agent host certificate is expiring within 90 days"
pe_status_check::AS002: "Determines if the pxp-agent has an established connection to a pxp broker"
15 changes: 15 additions & 0 deletions lib/facter/agent_status_check.rb
@@ -14,4 +14,19 @@

{ AS001: result > 7_776_000 }
end
chunk(:AS002) do
  # Has the PXP agent established a connection with a remote broker?
  #
  next unless Facter.value(:os)['family'] == 'windows' || Facter.value(:os)['family'] == 'Debian' || Facter.value(:os)['family'] == 'RedHat'
  result = if Facter.value(:os)['family'] == 'windows'
             Facter::Core::Execution.execute('netstat -n | findstr /c:"8142" | findstr /c:"TCP" | findstr /c:"ESTABLISHED"')
           else
             Facter::Core::Execution.execute('ss -tunp | grep ESTAB | grep 8142 | grep pxp-agent')
           end
  { AS002: !result.empty? }
rescue Facter::Core::Execution::ExecutionFailure => e
  Facter.warn('agent_status_check.AS002 failed to get socket status')
  Facter.debug(e)
  { AS002: false }
end
end