From e7cb0249a8d641753322735f92ada901c0c44964 Mon Sep 17 00:00:00 2001
From: Adrian Parreiras Horta
Date: Tue, 7 Nov 2023 15:21:44 -0800
Subject: [PATCH 1/2] (SUP-4625) Add check for excessive JRubies

---
 README.md                               |  7 ++++---
 lib/facter/pe_status_check.rb           | 23 +++++++++++++++++++++++
 spec/acceptance/pe_status_check_spec.rb | 24 ++++++++++++++++++++++++
 3 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 8aabab6..9472eca 100644
--- a/README.md
+++ b/README.md
@@ -279,9 +279,10 @@ Refer below for next steps when any indicator reports a `false`.
 | S0039 | Determines if Puppets Server has reached its `queue-limit-hit-rate`,and is sending messages to agents. | [Check the max-queued-requests article for more information.](https://support.puppet.com/hc/en-us/articles/115003769433) | If the article is unable to solve your issue, open a Support ticket referencing S0039, indicating the investigation so far, and any issues you encountered, then provide the [support script](https://puppet.com/docs/pe/latest/getting_support_for_pe.html#pe_support_script) output from the primary server.
 | S0040 | Determines if PE is collecting system metrics. | If system metrics are not collected by default, the sysstat package is not installed on the impacted PE infrastructure component. Install the package and set the parameter `puppet_enterprise::enable_system_metrics_collection` to true. [See the documentation.](https://puppet.com/docs/pe/latest/getting_support_for_pe.html#puppet_metrics_collector) | After system metrics are configured, you do not see any files in `/var/log/sa` or if the `/var/log/sa` directory does not exist, open a Support ticket. |
 | S0041 | Determines if the pxp broker on a compiler has an established connection to another pxp broker | To resolve a connection issue from a compiler to a pcp broker examine the following log `/var/log/puppetlabs/puppetserver/pcp-broker.log` for an explanation, Compilers should be attempting to make a connection to port 8143 on the primary server, ssl can not be terminated on a network appliance and must passthrough directly to the primary server. Ensure the connnection attempt is not to another compiler in the pool | If unable to make a connection to a broker, raise a ticket with the support team quoting S0041 and attaching the file `/var/log/puppetlabs/puppetserver/pcp-broker.log` along with the conclusions of your investigation so far |
-| S0042 |Determines if the pxp-agent has an established connection to a pxp broker | Ensure the pxp-agent service is running. Check S0002 can make that determination. if running check `/var/log/puppetlabs/pxp-agent/pxp-agent.log` (on *nix) or `C:/ProgramData/PuppetLabs/pxp-agent/var/log/pxp-agent.log` (on Windows), for connection issues, first ensuring the agent is connecting to the proper endpoint, for example, a compiler and not the primary. This fact can also be used as a target filter for running tasks, ensuring time is not wasted sending instructions to agents not connected to a broker | If unable to make a connection to a broker, raise a ticket with the support team quoting S0042 and attaching the file `/var/log/puppetlabs/pxp-agent/pxp-agent.log` (on *nix) or `C:/ProgramData/PuppetLabs/pxp-agent/var/log/pxp-agent.log` (on Windows), along with the conclusions of your investigation so far |
-| S0043 |Determines if there are nodes with Puppet agent versions ahead of the primary server | Agent nodes should not be running Puppet agent versions ahead of infrastructure nodes. Instead consider upgrading PE so that PE package management contains the desired Puppet agent version. See the [upgrading PE](https://puppet.com/docs/pe/latest/upgrading_pe.html) and [upgrading agents](https://puppet.com/docs/latest/upgrading_agents.html) documentation for more information. | If you are unable to determine why the indicator is evaluating to `false` or have questions about Puppet agent versions, open a support ticket and reference S0043. |
-| S0044 |Determines if Puppet Servers are using the the PE classifier for the node data plugin (node terminus) | Due to performance optimizations, it is recommended to use the PE classifier plugin instead of external node classifier (ENC) scripts or applications. See the [node_terminus configuration setting documentation](https://www.puppet.com/docs/puppet/7/configuration.html#node-terminus) for more information. | If you have additional questions about the node_terminus configuration setting, open a support ticket and reference S0044. |
+| S0042 | Determines if the pxp-agent has an established connection to a pxp broker | Ensure the pxp-agent service is running. Check S0002 can make that determination. If it is running, check `/var/log/puppetlabs/pxp-agent/pxp-agent.log` (on *nix) or `C:/ProgramData/PuppetLabs/pxp-agent/var/log/pxp-agent.log` (on Windows), for connection issues, first ensuring the agent is connecting to the proper endpoint, for example, a compiler and not the primary. This fact can also be used as a target filter for running tasks, ensuring time is not wasted sending instructions to agents not connected to a broker | If unable to make a connection to a broker, raise a ticket with the support team quoting S0042 and attaching the file `/var/log/puppetlabs/pxp-agent/pxp-agent.log` (on *nix) or `C:/ProgramData/PuppetLabs/pxp-agent/var/log/pxp-agent.log` (on Windows), along with the conclusions of your investigation so far |
+| S0043 | Determines if there are nodes with Puppet agent versions ahead of the primary server | Agent nodes should not be running Puppet agent versions ahead of infrastructure nodes. Instead consider upgrading PE so that PE package management contains the desired Puppet agent version. See the [upgrading PE](https://puppet.com/docs/pe/latest/upgrading_pe.html) and [upgrading agents](https://puppet.com/docs/latest/upgrading_agents.html) documentation for more information. | If you are unable to determine why the indicator is evaluating to `false` or have questions about Puppet agent versions, open a support ticket and reference S0043. |
+| S0044 | Determines if Puppet Servers are using the PE classifier for the node data plugin (node terminus) | Due to performance optimizations, it is recommended to use the PE classifier plugin instead of external node classifier (ENC) scripts or applications. See the [node_terminus configuration setting documentation](https://www.puppet.com/docs/puppet/7/configuration.html#node-terminus) for more information. | If you have additional questions about the node_terminus configuration setting, open a support ticket and reference S0044. |
+| S0045 | Determines if Puppet Servers are configured with an excessive number of JRubies. | Because each JRuby instance consumes additional memory, having too many can reduce the amount of heap space available to Puppet Server and cause excessive garbage collection. While it is possible to increase the heap along with the number of JRubies, we have observed diminishing returns with more than 12 JRubies and therefore recommend an upper limit of 12. We also recommend allocating between 1 and 2 GB of heap memory for each JRuby. | If you would like to measure the effects of changing JRubies and heap settings, use the [Puppet Operational Dashboards module](https://forge.puppet.com/modules/puppetlabs/puppet_operational_dashboards/readme) to configure a metrics stack and Grafana dashboards for viewing the metrics. If you still have performance issues or further questions, open a support ticket and reference S0045. |
 
 ### Fact: agent_status_check
 
diff --git a/lib/facter/pe_status_check.rb b/lib/facter/pe_status_check.rb
index 25359d9..c789833 100644
--- a/lib/facter/pe_status_check.rb
+++ b/lib/facter/pe_status_check.rb
@@ -563,4 +563,27 @@
       { S0044: false }
     end
   end
+
+  chunk(:S0045) do
+    next unless ['primary', 'legacy_primary', 'replica', 'pe_compiler', 'legacy_compiler'].include?(Facter.value('pe_status_check_role'))
+    begin
+      response = PEStatusCheck.http_get('/status/v1/services/jruby-metrics?level=debug', 8140)
+
+      if response
+        num_jrubies = response.dig('status', 'experimental', 'metrics', 'num-jrubies')
+
+        if num_jrubies.nil?
+          { S0045: false }
+        else
+          { S0045: num_jrubies <= 12 }
+        end
+      else
+        { S0045: false }
+      end
+    rescue StandardError => e
+      Facter.warn("Error in fact 'pe_status_check.S0045': #{e.message}")
+      Facter.debug(e.backtrace)
+      { S0045: false }
+    end
+  end
 end
diff --git a/spec/acceptance/pe_status_check_spec.rb b/spec/acceptance/pe_status_check_spec.rb
index ea4dca4..a87a389 100644
--- a/spec/acceptance/pe_status_check_spec.rb
+++ b/spec/acceptance/pe_status_check_spec.rb
@@ -360,6 +360,30 @@ class {'pe_status_check':
         expect(result.stdout).to match(%r{false})
         run_shell('puppet config set --section master node_terminus classifier')
       end
+      it 'if S0045 conditions for false are met' do
+        manifest = <<-PUPPETCODE
+        pe_hocon_setting { 'jruby-puppet.max-active-instances':
+          ensure  => present,
+          path    => '/etc/puppetlabs/puppetserver/conf.d/pe-puppet-server.conf',
+          setting => 'jruby-puppet.max-active-instances',
+          value   => 13,
+        }
+        PUPPETCODE
+
+        apply_manifest(manifest)
+        result = run_shell('facter -p pe_status_check.S0045')
+        expect(result.stdout).to match(%r{false})
+
+        manifest = <<-PUPPETCODE
+        pe_hocon_setting { 'jruby-puppet.max-active-instances':
+          ensure  => present,
+          path    => '/etc/puppetlabs/puppetserver/conf.d/pe-puppet-server.conf',
+          setting => 'jruby-puppet.max-active-instances',
+          value   => 1,
+        }
+        PUPPETCODE
+        apply_manifest(manifest)
+      end
     end
   end
 end

From 0105c7e34463e774f3fcbea6b35fc5313e0b26e4 Mon Sep 17 00:00:00 2001
From: Adrian Parreiras Horta
Date: Wed, 8 Nov 2023 09:53:01 -0800
Subject: [PATCH 2/2] Fix acceptance tests

---
 spec/acceptance/pe_status_check_spec.rb | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/spec/acceptance/pe_status_check_spec.rb b/spec/acceptance/pe_status_check_spec.rb
index a87a389..49bce05 100644
--- a/spec/acceptance/pe_status_check_spec.rb
+++ b/spec/acceptance/pe_status_check_spec.rb
@@ -18,7 +18,7 @@ class {'pe_status_check':
     # Test Confirms all facts are false which is another indicator the class is performing correctly
     describe 'check no pe_status_check fact is false' do
       it 'if idempotent all facts should be true' do
-        expect(host_inventory['facter']['pe_status_check'].size).to eq(40)
+        expect(host_inventory['facter']['pe_status_check'].size).to eq(41)
         expect(host_inventory['facter']['pe_status_check'].filter { |_k, v| !v }).to be_empty
       end
     end
@@ -371,6 +371,7 @@ class {'pe_status_check':
         PUPPETCODE
 
         apply_manifest(manifest)
+        run_shell('systemctl restart pe-puppetserver')
         result = run_shell('facter -p pe_status_check.S0045')
         expect(result.stdout).to match(%r{false})
 
@@ -383,6 +384,7 @@ class {'pe_status_check':
         }
         PUPPETCODE
         apply_manifest(manifest)
+        run_shell('systemctl restart pe-puppetserver')
       end
     end
   end
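
The S0045 decision can be exercised without a running Puppet Server by isolating it in a plain function. The sketch below is illustrative only and not part of the patch: `s0045_status` and `MAX_RECOMMENDED_JRUBIES` are hypothetical names, and the input is assumed to be the already-parsed jruby-metrics status response (a Hash with string keys), as read by the `dig` call in the fact.

```ruby
# Hypothetical helper, not part of the patch: reproduces the S0045 decision
# for an already-parsed jruby-metrics status response.
MAX_RECOMMENDED_JRUBIES = 12 # upper limit recommended in the README row for S0045

def s0045_status(response)
  # No response at all (or an unexpected type) is reported as a failed check.
  return { S0045: false } unless response.is_a?(Hash)

  # Hash#dig returns nil as soon as an intermediate key is missing.
  num_jrubies = response.dig('status', 'experimental', 'metrics', 'num-jrubies')

  # A missing or non-integer count cannot be compared, so fail the check.
  return { S0045: false } unless num_jrubies.is_a?(Integer)

  { S0045: num_jrubies <= MAX_RECOMMENDED_JRUBIES }
end
```

Returning early on a missing value keeps the comparison from ever running against `nil`, so the missing-metric case does not rely on the `rescue` path.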