jenkins service restart on each puppet run under Redhat 7 #807

Closed
westbywest opened this Issue Sep 12, 2017 · 7 comments

westbywest commented Sep 12, 2017

With commit 5ab2c8a of this module, I get this on each puppet agent run, no matter if manual or invoked by the daemon:

puppet-agent[15850]: (/Service[jenkins]/ensure) ensure changed 'running' to 'stopped'
puppet-agent[15850]: (/Stage[main]/Jenkins/Jenkins::Systemd[jenkins]/Transition[stop jenkins service]/enable) transition state {"ensure"=>"stopped"} applied to Service[jenkins]
puppet-agent[15850]: (/Stage[main]/Jenkins::Service/Service[jenkins]/ensure) ensure changed 'stopped' to 'running'
puppet-agent[15850]: (/Stage[main]/Jenkins::Service/Service[jenkins]) Unscheduling refresh on Service[jenkins]

The restarts don't occur using the v1.7.0 version of the module published to forge.

My machine:

  • CentOS 7.3.1611
  • OpenJDK 1.8.0_141
  • Puppet agent 4.10.7

Module invocation:

  class { 'jenkins': }

Hiera:

jenkins::lts: false
jenkins::plugin_hash:
  ace-editor: {}
  ant: {}
  ... and dozens more plugins

nick-george Oct 29, 2017

I am running on the master branch and the same commit as @westbywest. I'm also running RHEL7 (OL7), with Jenkins version as below

Name        : jenkins
Arch        : noarch
Version     : 2.73.2
Release     : 1.1
Size        : 70 M
Repo        : installed
From repo   : jenkins

I get the following on each Puppet run.

Notice: /Stage[main]/Profiles::Jenkins/Jenkins::User[jenkins]/Jenkins::Cli::Exec[create-jenkins-user-jenkins]/Exec[create-jenkins-user-jenkins]/returns: executed successfully
Info: /Stage[main]/Profiles::Jenkins/Jenkins::User[jenkins]/Jenkins::Cli::Exec[create-jenkins-user-jenkins]/Exec[create-jenkins-user-jenkins]: Scheduling refresh of Class[Jenkins::Cli::Reload]
Info: Class[Jenkins::Cli::Reload]: Scheduling refresh of Exec[reload-jenkins]
Notice: /Stage[main]/Profiles::Jenkins/Jenkins::Job[r10k-branch]/Jenkins::Job::Present[r10k-branch]/Exec[jenkins update-job r10k-branch]/returns: executed successfully
Info: /Stage[main]/Profiles::Jenkins/Jenkins::Job[r10k-branch]/Jenkins::Job::Present[r10k-branch]/Exec[jenkins update-job r10k-branch]: Scheduling refresh of Exec[reload-jenkins]
Notice: /Stage[main]/Profiles::Jenkins/Jenkins::Job[sync-elastic-repos]/Jenkins::Job::Present[sync-elastic-repos]/Exec[jenkins update-job sync-elastic-repos]/returns: executed successfully
Info: /Stage[main]/Profiles::Jenkins/Jenkins::Job[sync-elastic-repos]/Jenkins::Job::Present[sync-elastic-repos]/Exec[jenkins update-job sync-elastic-repos]: Scheduling refresh of Exec[reload-jenkins]
Notice: /Stage[main]/Jenkins::Cli::Reload/Exec[reload-jenkins]: Triggered 'refresh' from 3 events

I can see that the jenkins::user define is not idempotent, so that might explain that bit. I've also done a diff of my two job files, and got the following, which might explain why the jobs are continually re-installed. When you install a Jenkins job via the API, does it munge the XML?

 diff jobs/sync-repos/config.xml /tmp/sync-repos-config.xml
1c1,2
< <?xml version="1.0" encoding="UTF-8"?><project>
---
> <?xml version="1.0" encoding="UTF-8"?>
> <project>

jhoblitt Nov 14, 2017

Member

The solution is to migrate to the native types. jenkins::job and jenkins::user are not idempotent.
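
As a rough illustration of that migration, the sketch below declares a job and a user with the module's native types instead of the jenkins::job and jenkins::user defines. The attribute names are as I recall them from the module's native types documentation and should be checked against the installed version. The file() source path, full name, and e-mail address are hypothetical placeholders.

```puppet
# Sketch only: native-type equivalents of jenkins::job and jenkins::user.
# Attribute names should be verified against the module's documentation.

jenkins_job { 'r10k-branch':
  ensure => 'present',
  enable => true,
  # Hypothetical location of the job XML inside a profile module.
  config => file('profiles/jenkins/r10k-branch.xml'),
}

jenkins_user { 'jenkins':
  ensure        => 'present',
  # Placeholder values, not taken from this thread.
  full_name     => 'Jenkins CLI user',
  email_address => 'jenkins@example.com',
}
```

Native types compare the current state with the desired state before acting, so an unchanged job or user should produce no events on later runs.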

nick-george Nov 26, 2017

Thanks Joshua,

Using the native types has fixed the jobs being re-installed on each run. However even using the native types, I have the same issue as OP on each run (a restart of Jenkins).

Cheers,
Nick

Notice: /Service[jenkins]/ensure: ensure changed 'running' to 'stopped'
Notice: /Stage[main]/Jenkins/Jenkins::Systemd[jenkins]/Transition[stop jenkins service]/enable: transition state {"ensure"=>"stopped"} applied to Service[jenkins]
Notice: /Stage[main]/Jenkins::Service/Service[jenkins]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Jenkins::Service/Service[jenkins]: Unscheduling refresh on Service[jenkins]
Notice: /Stage[main]/Profiles::Jenkins/Jenkins_security_realm[hudson.security.HudsonPrivateSecurityRealm]/arguments: arguments changed ['false', 'false', 'undef'] to 'false false '

nick-george Nov 29, 2017

After looking at the code, I believe it's this section in systemd.pp that forces Jenkins to stop on every run, in particular the ensure => stopped line. I think @jhoblitt should be able to give us an indication of why this needs to fire on every run.

transition { "stop ${service} service":
    resource   => Service[$service],
    attributes => {
      # lint:ignore:ensure_first_param
      ensure => stopped,
      # lint:endignore
    },
    prior_to   => [
      File[$sysv_init],
    ],
  }

Having this happen on every puppet run is causing me major problems.

Jenkins doesn't start fast enough on my system, so I get the following error:

Notice: /Stage[main]/Jenkins/Jenkins::Systemd[jenkins]/Transition[stop jenkins service]/enable: transition state {"ensure"=>"stopped"} applied to Service[jenkins]
Notice: /Stage[main]/Jenkins::Service/Service[jenkins]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Jenkins::Service/Service[jenkins]: Unscheduling refresh on Service[jenkins]
Error: Failed to apply catalog: Execution of '/bin/cat /usr/lib/jenkins/puppet_helper.groovy | /bin/java -jar /usr/lib/jenkins/jenkins-cli.jar -s http://localhost:8080 -i /var/lib/jenkins/.ssh/id_rsa -ssh -user jenkins groovy = user_info_all' returned 255: Nov 29, 2017 9:21:49 PM hudson.cli.SSHCLI sshConnection
WARNING: No header 'X-SSH-Endpoint' returned by Jenkins

Furthermore, it appears that any time there is a Jenkins CLI error, it causes the entire catalogue run to abort. I'm going to post this as a separate bug.

Cheers,
Nick

jhoblitt Nov 29, 2017

Member

@nick-george Could you provide a minimal manifest to reproduce this? I don't see this in my own production env and it isn't occurring under the centos-7 acceptance tests.

nick-george Dec 1, 2017

I think I've figured it out. It's the resource defaults for File (in my case). I can reproduce it with:

node default{
  File{
      ensure  => 'present',
      owner   => "jenkins",
      group   => "jenkins",
      mode    => '644',
  }
  class {'::jenkins':
    cli_ssh_keyfile    => "/var/lib/jenkins/.ssh/id_rsa",
    install_java       => false,
    configure_firewall => true,
    manage_user        => false,
    manage_group       => false,
    cli                => true,
    cli_username       => 'jenkins',
    cli_remoting_free  => true, #Force it to use the new CLI
    port               => 8080,  #Used by the Jenkins CLI only. Needs to be an int.
    config_hash        => {'JENKINS_PORT' => { 'value' => "8080"},
                           'HTTP_PORT' => { 'value' => "8080" }}, #This is required for the jenkins CLI which installs users etc.
  }
}

If I don't have the resource default for file, the problem doesn't occur.

Cheers,
Nick

jhoblitt Feb 8, 2018

Member

I'm going to close this out since there's not much we can do about resource defaults, which are pretty much a bag of razor blades.
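
For readers hitting the same problem, one possible workaround is to avoid a node-scope File { ... } default altogether. Resource defaults are dynamically scoped, so a default set at node scope also reaches File resources inside classes declared from that scope, including this module's. A minimal sketch, assuming Puppet 4 or later, uses a per-expression default body that applies only to the resources listed in the same expression. The file path and content are hypothetical.

```puppet
# Sketch only: per-expression defaults (Puppet 4+) instead of a
# node-scope "File { ... }" resource default.
file {
  default:
    ensure => 'file',
    owner  => 'jenkins',
    group  => 'jenkins',
    mode   => '0644',
  ;
  # Hypothetical file that should receive the defaults above.
  '/var/lib/jenkins/example.conf':
    content => "managed by Puppet\n",
  ;
}

# The module's own File resources are not affected by the defaults above.
class { 'jenkins': }
```

The defaults are confined to one resource expression, so they cannot change attributes of File resources declared elsewhere in the catalog.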

@jhoblitt jhoblitt closed this Feb 8, 2018
