sensu::client::config keepalives 'change' every run #336

Closed
poolski opened this issue Mar 25, 2015 · 58 comments

Comments

@poolski
Contributor

poolski commented Mar 25, 2015

Every single time I run Puppet, the module reports changing my keepalive thresholds, flap detection thresholds, etc., as shown below. It's not a big deal, but it adds time to Puppet runs and it's a bit misleading, given that no change actually takes place:

Notice: /Stage[main]/Sensu::Client::Config/Sensu_client_config[hostname.example.com]/custom: custom changed 'handlers => ["default"]' to 'handlers => ["default"], keepalive => {"high_flap_threshold"=>20, "low_flap_threshold"=>5, "refresh"=>14400, "thresholds"=>{"critical"=>120, "warning"=>90}}'
Notice: /Stage[main]/Sensu::Client::Config/Sensu_client_config[hostname.example.com]/keepalive: keepalive changed 'high_flap_threshold => 20, low_flap_threshold => 5, refresh => 14400, thresholds => {"critical"=>120, "warning"=>90}' to ''
@jlambert121
Contributor

What version of the module are you using?

@poolski
Contributor Author

poolski commented Mar 25, 2015

1.5.0

@poolski
Contributor Author

poolski commented Apr 1, 2015

Any joy with this?

Also, when's the next release of the module out? I'm looking forward to being able to incorporate the fixes from #298 in my environments.

@superseb
Contributor

@poolski Can you supply the manifest that is causing these messages? I'll try to reproduce.

@superseb
Contributor

@poolski And please check if the fix in #313 works for you.

@poolski
Contributor Author

poolski commented Apr 14, 2015

I will, @superseb

@poolski
Contributor Author

poolski commented Apr 14, 2015

With the addition of the port check in #343, my Puppet runs now also change the port every time:

Notice: /Stage[main]/Sensu::Client::Config/Sensu_client_config[myhost.mynetwork.net]/port: port changed '' to '3030'
Notice: /Stage[main]/Sensu::Client::Config/Sensu_client_config[myhost.mynetwork.net]/custom: custom changed 'handlers => ["default"]' to 'handlers => ["default"], keepalive => {"high_flap_threshold"=>20, "low_flap_threshold"=>5, "refresh"=>14400, "thresholds"=>{"critical"=>120, "warning"=>90}}'
Notice: /Stage[main]/Sensu::Client::Config/Sensu_client_config[myhost.mynetwork.net]/keepalive: keepalive changed 'high_flap_threshold => 20, low_flap_threshold => 5, refresh => 14400, thresholds => {"critical"=>120, "warning"=>90}' to ''

I'm running v1.5.5 of the module to test if it fixes anything.

@superseb
Contributor

@poolski That's quite weird. You're using v1.5.5 at the moment? Did you restart the puppetmaster (puppetmaster/puppetserver/pe-httpd)?

@poolski
Contributor Author

poolski commented Apr 14, 2015

@superseb yeah I did. It was throwing all sorts of crazy invalid setting errors till I tried that.
We run a puppetmaster with Apache/Passenger.

Also, yes, I'm using v1.5.5.

@superseb
Contributor

Ok, what Puppet version? Are you using multiple environments?

@superseb
Contributor

Oh @poolski, please post your manifest as well so I can try to reproduce.

@poolski
Contributor Author

poolski commented Apr 14, 2015

We are running

  • Puppet version 3.7.5

And yes, we have multiple envs, managed with r10k. I'm testing it on our "develop" env before rolling it out to the wider world.

@zakuni

zakuni commented Apr 15, 2015

Hi,
This happens to me as well.
keepalive and port change on every Puppet run, and redis_reconnect_on_error also changes from 'true' to 'false'.

I'm using sensu-puppet v1.5.5 now, but it occurred before this version too. (Sorry, I can't tell exactly from which version.)

Here is part of my manifest

  class { '::sensu':
    version => '0.17.1-1',
    rabbitmq_ssl_cert_chain => '/etc/sensu/ssl/cert.pem',
    rabbitmq_ssl_private_key => '/etc/sensu/ssl/key.pem',
    rabbitmq_host => $server_host,
    rabbitmq_port => 5671,
    rabbitmq_password => $rabbitmq_pass,
    rabbitmq_vhost => '/sensu',
    purge_config => false,
    server    => $server,
    api       => $server,
    subscriptions => $subscriptions,
    client_custom => {
      'keepalive' => {
        'handlers' => ['default', 'slack']
      }
    },
    use_embedded_ruby => $embedded_ruby,
    sensu_plugin_version => latest,
  }

I've specified 'client_port' and 'redis_reconnect_on_error' explicitly, but that didn't fix it.

@superseb
Contributor

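Okay. So, move the keepalive settings out of client_custom and into the dedicated client_keepalive parameter: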

  class { '::sensu':
    version => '0.17.1-1',
    rabbitmq_ssl_cert_chain => '/etc/sensu/ssl/cert.pem',
    rabbitmq_ssl_private_key => '/etc/sensu/ssl/key.pem',
    rabbitmq_host => $server_host,
    rabbitmq_port => 5671,
    rabbitmq_password => $rabbitmq_pass,
    rabbitmq_vhost => '/sensu',
    purge_config => false,
    server    => $server,
    api       => $server,
    subscriptions => $subscriptions,
    client_keepalive => {
      'handlers' => ['default', 'slack']
    },
    use_embedded_ruby => $embedded_ruby,
    sensu_plugin_version => latest,
  }

Let me know if this solves it for you.
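
For anyone wondering why moving keepalive helps: the notices in this thread are consistent with a provider that treats keepalive as its own property and reports as "custom" only the keys it does not manage itself. The Ruby sketch below is a simplified, hypothetical model of that behavior, not the module's actual provider code. MANAGED_KEYS, observed_state and pending_changes are names invented for illustration.

```ruby
# Simplified, hypothetical model of a client config provider that splits the
# "client" hash on disk into a dedicated keepalive property and a catch-all
# custom property. This is NOT the sensu-puppet module's real code.

# Assumption: keys the provider manages through dedicated properties.
MANAGED_KEYS = %w[name address subscriptions keepalive].freeze

# What the provider would report as the current state of each property.
def observed_state(client_on_disk)
  {
    'keepalive' => client_on_disk.fetch('keepalive', {}),
    'custom'    => client_on_disk.reject { |key, _| MANAGED_KEYS.include?(key) }
  }
end

# Names of the properties whose observed value differs from the desired one.
def pending_changes(client_on_disk, desired)
  observed = observed_state(client_on_disk)
  desired.keys.reject { |prop| observed[prop] == desired[prop] }
end

keepalive = { 'handlers' => %w[default slack] }
on_disk   = { 'name' => 'host', 'keepalive' => keepalive }

# keepalive nested under custom: the custom getter can never report a key
# that belongs to the keepalive property, so both properties look out of sync.
nested = { 'custom' => { 'keepalive' => keepalive }, 'keepalive' => {} }
puts pending_changes(on_disk, nested).inspect

# keepalive passed through its own parameter: nothing is left to change.
split = { 'custom' => {}, 'keepalive' => keepalive }
puts pending_changes(on_disk, split).inspect
```

In this model the nested layout reports both custom and keepalive as changed on every run, which has the same shape as the two notices in the original report, while the split layout converges.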

@zakuni

zakuni commented Apr 16, 2015

@superseb
Thank you, that solved it for me!

@superseb
Contributor

Okay cool, @poolski if you could post your manifest I can take a look.

@poolski
Contributor Author

poolski commented Apr 16, 2015

@superseb, alright, here we go. It's a bit broken up because most of the stuff lives in Hiera and is loaded automatically by class-based params. My r10k Sensu block is applied to all hosts (it's in a 'base' profile) and looks like this:

  # Sensu config
  $subs = hiera_array('sensu::subscriptions')
  $plugs = hiera_array('sensu::plugins')
  $checks = hiera_hash('sensu::checks')
  if $checks { create_resources(sensu::check, $checks) }
  class {'::sensu':
    subscriptions => $subs,
    plugins       => $plugs,
  }

The corresponding data is as follows. Checks have been removed because there are lots of them and they aren't complaining.

{
  "sensu::client": true,
  "sensu::client_custom": {
    "handlers": [
      "default"
    ],
    "keepalive": {
      "high_flap_threshold": "20",
      "low_flap_threshold": "5",
      "refresh": 14400,
      "thresholds": {
        "critical": 120,
        "warning": 90
      }
    }
  },
  "sensu::plugins": [
    "puppet:///modules/sensu_site/plugins/system/check-apt.sh",
    "puppet:///modules/sensu_site/plugins/system/check-cpu.rb",
    "puppet:///modules/sensu_site/plugins/system/check-disk.rb",
    "puppet:///modules/sensu_site/plugins/system/check-load.rb",
    "puppet:///modules/sensu_site/plugins/system/cpu-metrics.rb",
    "puppet:///modules/sensu_site/plugins/processes/check-procs.rb",
    "puppet:///modules/sensu_site/plugins/system/check-ram.rb",
    "puppet:///modules/sensu_site/plugins/sendmail/sendmail-mqueue.rb",
    "puppet:///modules/sensu_site/plugins/enabler/check-pidfile.sh",
    "puppet:///modules/sensu_site/plugins/network/check-ports.sh"
  ],
  "sensu::purge_config": true,
  "sensu::rabbitmq_host": "10.10.10.10",
  "sensu::rabbitmq_password": "passwordz",
  "sensu::rabbitmq_port": 5671,
  "sensu::rabbitmq_ssl": true,
  "sensu::rabbitmq_ssl_cert_chain": "puppet:///modules/sensu_site/ssl/cert.pem",
  "sensu::rabbitmq_ssl_private_key": "puppet:///modules/sensu_site/ssl/key.pem",
  "sensu::rabbitmq_vhost": "/sensu",
  "sensu::rabbitmq_reconnect_on_error": true,
  "sensu::redis_reconnect_on_error": true,
  "sensu::sensu_plugin_version": "present",
  "sensu::subscriptions": [
    "common"
  ],
  "sensu::use_embedded_ruby": true
}

@poolski
Contributor Author

poolski commented Apr 16, 2015

Additionally, having installed master over the v1.5.5 release, I now consistently get:

Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Invalid parameter reconnect_on_error on Sensu_rabbitmq_config[myserver.mynetwork.net] at /etc/puppet/environments/develop/modules/sensu/manifests/rabbitmq/config.pp:120 on node myserver.mynetwork.net
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

@superseb
Contributor

I'll take a look today. Did you restart the puppetmaster after switching to the master branch?

@poolski
Contributor Author

poolski commented Apr 16, 2015

I did indeed. I killed Apache and waited for the passenger processes to stop before restarting, too.

@superseb
Contributor

In a clean Vagrant setup with open source Puppet 3.7.5-1 (master and agent) and the following declaration, the properties are not reset on every run. Could you verify, @poolski? The issue you are experiencing looks like the old bug with custom providers and environments, but that should be fixed in 3.7.5 (the same version I tested on).

  class { '::sensu':
    client => true,
    client_custom => {
      'handlers' => [
        'default'
      ],
    },
    client_keepalive => {
      "high_flap_threshold" => "20",
      "low_flap_threshold" => "5",
      "refresh" => 14400,
      "thresholds" => {
        "critical" => 120,
        "warning" => 90
      }
    },
    purge_config => true,
    rabbitmq_host => '10.10.10.10',
    rabbitmq_password => 'password',
    rabbitmq_port => 5671,
    rabbitmq_ssl => true,
    rabbitmq_vhost => '/sensu',
    rabbitmq_reconnect_on_error => true,
    redis_reconnect_on_error => true,
    subscriptions => [ 'common' ],
    use_embedded_ruby => true,
  }

@superseb
Contributor

@poolski Any update on this?

@jsfrerot

Hi,
I applied the patch from #345 on top of v1.5.5 and Sensu_client_config still changes on every Puppet run.

1st run:
Notice: /Stage[main]/Sensu::Client::Config/Sensu_client_config[my-server]/port: port changed '' to '3030'

2nd run:
Notice: /Stage[main]/Sensu::Client::Config/Sensu_client_config[my-server]/custom: custom changed 'check_load => critical1.66,1.25,1.00warning1.25,1.00,0.80, port => 3030' to 'check_load => critical1.66,1.25,1.00warning1.25,1.00,0.80'

The manifest:

$load_warn=inline_template("<%= sprintf('%.2f',(@processorcount.to_i*1.25)) %>,<%= sprintf('%.2f',(@processorcount.to_i*1)) %>,<%= sprintf('%.2f',(@processorcount.to_i*0.8)) %>")
$load_crit=inline_template("<%= sprintf('%.2f',(@processorcount.to_i*1.66)) %>,<%= sprintf('%.2f',(@processorcount.to_i*1.25)) %>,<%= sprintf('%.2f',(@processorcount.to_i*1)) %>")
class { 'sensu':
    rabbitmq_ssl_private_key => "puppet:///data/sensu/certs/client_key.pem",
    rabbitmq_ssl_cert_chain => "puppet:///data/sensu/certs/client_cert.pem",
    rabbitmq_password => 'xxx',
    rabbitmq_host => 'my-sensu-server',
    rabbitmq_port => 5671,
    rabbitmq_vhost => "/sensu",
    plugins => [
        'puppet:///data/sensu/plugins/system/check-ntp.rb',
        'puppet:///data/sensu/plugins/system/check-disk.rb',
        'puppet:///data/sensu/plugins/system/check-load.rb',
    ],  
    use_embedded_ruby => true,
    sensu_plugin_provider => sensu_gem,
    sensu_plugin_version => 'present',
    install_repo => false,
    client_custom => {
        check_load => {
            warning => $load_warn,
            critical => $load_crit,
        },  
    },  
    subscriptions => ['base']
}   
if $is_virtual == "false" {
    sensu::subscription { 'physical': }
}   
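
For reference, the two inline_template calls at the top of this manifest boil down to the arithmetic below. This is a plain Ruby sketch for checking the expected strings outside Puppet. The method and constant names are hypothetical, and it assumes the processorcount fact arrives as a string.

```ruby
# Sketch of the arithmetic in the two inline_template calls: each threshold
# is the processor count times a fixed factor, formatted to two decimals.
# Names are hypothetical; the factors are copied from the manifest.

WARN_FACTORS = [1.25, 1.0, 0.8].freeze
CRIT_FACTORS = [1.66, 1.25, 1.0].freeze

def load_thresholds(processorcount, factors)
  count = processorcount.to_i # assumption: the fact is passed in as a string
  factors.map { |factor| format('%.2f', count * factor) }.join(',')
end

puts load_thresholds('1', WARN_FACTORS)
puts load_thresholds('1', CRIT_FACTORS)
```

For a single processor this produces the same warning and critical values that appear in the "2nd run" notice.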

@jsfrerot

jsfrerot commented Jun 1, 2015

@superseb, since you were the main contact for this issue, could you look at my last comment and maybe point me to where my problem could be?

Thanks.

@superseb
Contributor

superseb commented Jun 2, 2015

@jsfrerot Did you only apply #345? I think you also need to apply #343. Let me know if this solves it.

@jsfrerot

jsfrerot commented Jun 8, 2015

@superseb I just applied #343 and it works as expected. Thank you.

@poolski
Contributor Author

poolski commented Jun 10, 2015

@superseb I'm now running v1.5.5 which as I understand it has both #343 and #345 rolled in?

Still experiencing the same issue and something new:
Error 400 on SERVER: Invalid parameter reconnect_on_error on Sensu_rabbitmq_config

Restarting the puppetmaster fixes it, but only for one run, after which the error returns.

@superseb
Contributor

@poolski No, v1.5.5 doesn't contain those fixes. Let me ping @jamtur01 or @jlambert121 to release a new version.

@poolski
Contributor Author

poolski commented Jun 10, 2015

Oh, durp!

Thanks @superseb.

@poolski
Contributor Author

poolski commented Jun 11, 2015

@superseb - so another fun fact. I'm testing out using master to see if the bleeding edge code fixes any of my issues. It doesn't seem to - in fact, it introduces another one!

Notice: /Stage[main]/Sensu::Repo::Apt/Apt::Source[sensu]/Apt::Key[Add key: 8911D8FF37778F24B4E726A218609E3D7580C77F from Apt::Source sensu]/Exec[d36677d9164a673e5a4b8cdd005afa63c1c67926]/returns: executed successfully
Notice: /Stage[main]/Sensu::Client::Config/Sensu_client_config[server.network.net]/custom: custom changed 'handlers => ["default"]' to 'handlers => ["default"], keepalive => {"high_flap_threshold"=>20, "low_flap_threshold"=>5, "refresh"=>14400, "thresholds"=>{"critical"=>120, "warning"=>90}}'
Info: Class[Sensu::Client::Config]: Scheduling refresh of Service[sensu-client]

Now it "adds" the repo key every time Puppet runs. Also, although the port changes have been fixed, it's still resetting keepalives and flap thresholds even though nothing has changed.

@superseb
Contributor

@poolski Did you split your data into client_custom and client_keepalive? I'll see if I can reproduce your apt::key issue.

@poolski
Contributor Author

poolski commented Jun 22, 2015

@superseb - here's my data:

  "sensu::client_custom": {
    "handlers": [
      "default"
    ],
    "keepalive": {
      "high_flap_threshold": "20",
      "low_flap_threshold": "5",
      "refresh": 14400,
      "thresholds": {
        "critical": 120,
        "warning": 90
      }
    }
  }

@superseb
Contributor

@poolski Please see the example I posted before, and let me know if it helps.

  class { '::sensu':
    client => true,
    client_custom => {
      'handlers' => [
        'default'
      ],
    },
    client_keepalive => {
      "high_flap_threshold" => "20",
      "low_flap_threshold" => "5",
      "refresh" => 14400,
      "thresholds" => {
        "critical" => 120,
        "warning" => 90
      }
    },
    purge_config => true,
    rabbitmq_host => '10.10.10.10',
    rabbitmq_password => 'password',
    rabbitmq_port => 5671,
    rabbitmq_ssl => true,
    rabbitmq_vhost => '/sensu',
    rabbitmq_reconnect_on_error => true,
    redis_reconnect_on_error => true,
    subscriptions => [ 'common' ],
    use_embedded_ruby => true,
  }

@poolski
Contributor Author

poolski commented Jun 22, 2015

That's just what I was about to do @superseb :D

@superseb
Contributor

@poolski 👍 😄

@poolski
Contributor Author

poolski commented Jun 22, 2015

@superseb when I first started using the module, I don't recall there being a dedicated keepalive field - it had to be shoehorned in with client_custom.

@superseb
Contributor

@poolski Correct, it was added in 30ddb26 (v1.3.0).

@poolski
Contributor Author

poolski commented Jun 22, 2015

Okay, progress - now all I have is it trying to change the port on me...
Notice: /Stage[main]/Sensu::Client::Config/Sensu_client_config[host.network.net]/port: port changed '' to '3030'
or
Notice: /Stage[main]/Sensu::Client::Config/Sensu_client_config[host.network.net]/custom: custom changed 'handlers => ["default"], port => 3030' to 'handlers => ["default"]'

Both of the above happen, with no discernible pattern.
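
The two notices look like what you would get if the installed provider does not yet count "port" among the keys it manages itself: one run adds the port, the next run sees it as a stray custom key and removes it again. The Ruby sketch below is a hypothetical model of that ping-pong, not the module's real code. apply_run, simulate and the key lists are invented for illustration.

```ruby
# Hypothetical model of the alternating port/custom notices. The known_keys
# argument stands in for the list of keys a provider manages through
# dedicated properties. This is NOT the module's real code.

# One simulated Puppet run: read every property first, then sync the ones
# that differ. Returns the names of the properties that were changed.
def apply_run(client, desired_port, desired_custom, known_keys)
  current_port   = client['port']
  current_custom = client.reject { |key, _| known_keys.include?(key) }
  changed = []

  if current_port != desired_port
    client['port'] = desired_port
    changed << 'port'
  end

  if current_custom != desired_custom
    # The custom setter drops every key it does not know, then merges the
    # desired custom hash back in.
    client.delete_if { |key, _| !known_keys.include?(key) }
    client.merge!(desired_custom)
    changed << 'custom'
  end

  changed
end

# Apply the same desired state several times and record what each run changed.
def simulate(known_keys, runs)
  client = { 'name' => 'host', 'handlers' => ['default'] }
  custom = { 'handlers' => ['default'] }
  Array.new(runs) { apply_run(client, 3030, custom, known_keys) }
end

# Assumption: "port" missing from the known keys, as in the affected release.
puts simulate(%w[name address subscriptions], 4).inspect

# Same desired state with "port" treated as a known key.
puts simulate(%w[name address subscriptions port], 4).inspect
```

In this model the first key list alternates between a port change and a custom change forever, while the second settles after the first run.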

@superseb
Contributor

Awesome. This was fixed in #342. What codebase are you running now?

@poolski
Contributor Author

poolski commented Jun 22, 2015

v1.5.5

@superseb
Contributor

Yeah, the fix was merged after the v1.5.5 tag.

@poolski
Contributor Author

poolski commented Jun 22, 2015

Alright, I'll try master and see what happens

@aamerik

aamerik commented Jun 23, 2015

Can someone suggest how to fix "Invalid parameter reconnect_on_error on Sensu_rabbitmq_config"?

I'm using master, client version 3.7.5, puppetserver 1.0.2, and r10k environments. I restarted both puppetservers, which fixes the issue for the first run, and then it's back to invalid parameter errors.
Thanks

@aamerik

aamerik commented Jun 23, 2015

Looks like the invalid parameter issue is related to r10k environments. If you want the change to stick you need to update this module in all environments simultaneously and restart the puppetserver.
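
A rough first check for this is to compare the module version each environment carries. The Ruby sketch below is a hypothetical helper, not part of the module or of r10k. It assumes every copy of the module ships a metadata.json with a "version" field and that environments live under /etc/puppet/environments, as in the error messages earlier in this thread.

```ruby
require 'json'

# Hypothetical helper: report which version of a module each Puppet
# environment carries, so mismatched copies are easy to spot.
# Assumption: each copy ships a metadata.json with a "version" field.
def module_versions(environments_dir, module_name)
  pattern = File.join(environments_dir, '*', 'modules', module_name, 'metadata.json')
  Dir.glob(pattern).sort.each_with_object({}) do |path, versions|
    # Path layout: <environments_dir>/<environment>/modules/<module>/metadata.json
    environment = path.split(File::SEPARATOR)[-4]
    versions[environment] = JSON.parse(File.read(path))['version']
  end
end

# True when every environment reports the same version (or none were found).
def consistent?(versions)
  versions.values.uniq.length <= 1
end

if __FILE__ == $PROGRAM_NAME
  versions = module_versions('/etc/puppet/environments', 'sensu')
  versions.each { |environment, version| puts "#{environment}: #{version}" }
  puts 'WARNING: environments carry different versions' unless consistent?(versions)
end
```

Matching version strings do not prove the copies are identical (two checkouts of master can share a version), so treat a clean result as a hint rather than a guarantee.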

@poolski
Contributor Author

poolski commented Jun 30, 2015

@superseb any idea when the next version will be released?

@superseb
Contributor

@poolski I can't release a new version; let's ping @jamtur01 and @jlambert121 again. Does it work as expected now?

@jlambert121
Contributor

This is one issue that I think needs to get closed out before a release. Is this still an issue with the latest master?

@jcustenborder

I believe master as of a week or so ago fixed the issue for me. It'd be great if someone double-checked.

@deepakhj

deepakhj commented Jul 1, 2015

I had the same issue with the port changing on every run. Switched from 1.5.5 to master and now it's fixed.

@jlambert121
Contributor

I'm going to close this, let us know if this is still an issue.

@poolski
Contributor Author

poolski commented Jul 15, 2015

@jlambert121 I'm getting the following error if I switch to using master:

Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Invalid parameter source on Sensu_check[os-disks] at /etc/puppet/environments/staging/modules/sensu/manifests/check.pp:148 on node mynode.mynetwork.net

What broke between 1.5.5 and master? :(

@superseb
Contributor

Restarted?

#377 seems to contain a fix for source param issues.

@poolski
Contributor Author

poolski commented Jul 15, 2015

Yeah I restarted All The Things.

According to #377, it's been merged into master and fixes that error, but something's still not right.

@poolski
Contributor Author

poolski commented Jul 15, 2015

Ok, so, stopping the puppetserver actually seemed to fix it.

Issuing an /etc/init.d/puppetserver restart doesn't seem to flush whatever it's caching as effectively.

Might be worth noting in docs somewhere?

@devshorts

Just chiming in here, any ideas when this will get rolled into an official release?

@jlambert121
Contributor

@devshorts I keep hoping to get a 2.0 release done real soon. Enterprise support looks like it's pretty much done.

@cintiadr

cintiadr commented Sep 7, 2015

It took me a while to understand that this issue was closed but not yet released.

I will be using commit cf40de6, since I won't be able to get apt 2.0 (a requirement caused by #411) any time soon :/

Looking at the commits, I believe that's the first breaking change after 1.5.5.


10 participants