CasC is unable to reapply GlobalJobDslSecurityConfiguration after Jenkins restart #253

Closed
hawky-4s- opened this issue Jun 2, 2018 · 20 comments

Comments

@hawky-4s-
Contributor

hawky-4s- commented Jun 2, 2018

I first observed this behavior when our Jenkins pod in K8s was killed and did not come back up.
CasC 0.7-alpha was unable to reapply the following part of the configuration. It also happened with earlier versions.

security:
  GlobalJobDslSecurityConfiguration:
    useScriptSecurity: false

The exception thrown is:

Caused by: org.jenkinsci.plugins.casc.ConfiguratorException: Invalid configuration elements for type class jenkins.model.GlobalConfigurationCategory$Security : GlobalJobDslSecurityConfiguration

I traced it back to the following line, whose check no longer resolves to true for GlobalJobDslSecurityConfiguration after a restart: https://github.com/jenkinsci/configuration-as-code-plugin/blob/master/src/main/java/org/jenkinsci/plugins/casc/GlobalConfigurationCategoryConfigurator.java#L61

The culprit is

descriptor.getCategory() == category

resolving to false after restart.

See PR #255 for failing testcase.

@ndeloof
Contributor

ndeloof commented Aug 10, 2018

I don't understand why descriptor.getCategory() == category no longer holds after restart.
Sounds like some weird classloader side effect.

@ndeloof
Contributor

ndeloof commented Aug 10, 2018

GlobalConfigurationCategory.all() = {ExtensionList@9902} size = 5
0 = {GlobalCredentialsConfiguration$Category@9905}
1 = {AwsGlobalConfigurationCategory@9906}
2 = {GlobalConfigurationCategory$Security@9813}
3 = {GlobalConfigurationCategory$Unclassified@9907}
4 = {ToolConfigurationCategory@9908}

But the JobDsl descriptor's category references another instance, {GlobalConfigurationCategory$Security@9523}, even though it is set by GlobalConfigurationCategory.get(GlobalConfigurationCategory.Security)

@ndeloof
Contributor

ndeloof commented Aug 10, 2018

A possible fix is to compare class names, but I'd like to understand why we get two instances of GlobalConfigurationCategory.Security within the same JVM, as it sounds like this could break many other use cases.
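
To make the comparison question concrete, here is a minimal, self-contained Java sketch. It is not the plugin's code: `Category`, `Security`, and both helper methods are hypothetical stand-ins. It only shows why an identity check (`==`) fails as soon as a second instance of the same class exists, while a comparison by class name still matches.

```java
public class CategoryMatchDemo {
    // Hypothetical stand-ins for GlobalConfigurationCategory and its Security subclass.
    static class Category {}
    static final class Security extends Category {}

    // Identity check, the same kind of comparison as descriptor.getCategory() == category.
    static boolean sameInstance(Category a, Category b) {
        return a == b;
    }

    // Looser check: two instances match when their classes have the same name.
    static boolean sameClassName(Category a, Category b) {
        return a.getClass().getName().equals(b.getClass().getName());
    }

    public static void main(String[] args) {
        Category registered = new Security(); // instance held by the extension list
        Category stray = new Security();      // second instance of the same class

        System.out.println(sameInstance(registered, stray));  // false
        System.out.println(sameClassName(registered, stray)); // true
    }
}
```

The class-name check would hide the symptom rather than explain it, which is why the origin of the second instance still matters.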

ndeloof added a commit that referenced this issue Aug 10, 2018
(contributed by @hawky-4s-)

Signed-off-by: Nicolas De Loof <nicolas.deloof@gmail.com>
@ndeloof ndeloof closed this as completed Aug 31, 2018
@calebdelnay

calebdelnay commented Sep 14, 2018

I think this issue is still present somehow. I am experimenting with version 1.0 of the plugin. Jenkins starts fine the first time, but restarting Jenkins by going to /restart leads to failures to start with the same error as OP. See below. My Jenkins version is v2.138.1 and I'm using the jenkins/jenkins:lts image running inside GKE. Jenkins is deployed using the Jenkins Helm chart.

SEVERE: Failed ConfigurationAsCode.init
java.lang.Error: java.lang.reflect.InvocationTargetException
        at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:110)
        at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:175)
        at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:296)
        at jenkins.model.Jenkins$5.runTask(Jenkins.java:1066)
        at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:214)
        at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:104)
        ... 8 more
Caused by: io.jenkins.plugins.casc.ConfiguratorException: security: error configuring 'security' with class io.jenkins.plugins.casc.impl.configurators.GlobalConfigurationCategoryConfigurator configurator
        at io.jenkins.plugins.casc.ConfigurationAsCode.invokeWith(ConfigurationAsCode.java:612)
        at io.jenkins.plugins.casc.ConfigurationAsCode.checkWith(ConfigurationAsCode.java:641)
        at io.jenkins.plugins.casc.ConfigurationAsCode.configureWith(ConfigurationAsCode.java:628)
        at io.jenkins.plugins.casc.ConfigurationAsCode.configureWith(ConfigurationAsCode.java:540)
        at io.jenkins.plugins.casc.ConfigurationAsCode.configure(ConfigurationAsCode.java:270)
        at io.jenkins.plugins.casc.ConfigurationAsCode.init(ConfigurationAsCode.java:262)
        ... 13 more
Caused by: io.jenkins.plugins.casc.ConfiguratorException: Invalid configuration elements for type class jenkins.model.GlobalConfigurationCategory$Security : globaljobdslsecurityconfiguration
        at io.jenkins.plugins.casc.BaseConfigurator.handleUnknown(BaseConfigurator.java:355)
        at io.jenkins.plugins.casc.BaseConfigurator.configure(BaseConfigurator.java:345)
        at io.jenkins.plugins.casc.BaseConfigurator.check(BaseConfigurator.java:265)
        at io.jenkins.plugins.casc.ConfigurationAsCode.lambda$checkWith$8(ConfigurationAsCode.java:641)
        at io.jenkins.plugins.casc.ConfigurationAsCode.invokeWith(ConfigurationAsCode.java:606)
        ... 18 more

Here is my configuration:

jenkins:
  systemMessage: |
    Jenkins configured automatically by Jenkins Configuration as Code Plugin
  numExecutors: 0
  scmCheckoutRetryCount: 2
security:
  globaljobdslsecurityconfiguration:
    useScriptSecurity: true
  remotingCLI:
    enabled: false

Let me know if there is anything else I can provide.

@ndeloof
Contributor

ndeloof commented Sep 14, 2018

Can you please try using uppercase GlobalJobDslSecurityConfiguration?

@ndeloof ndeloof reopened this Sep 14, 2018
@calebdelnay

Looks to be the same result when using CamelCase GlobalJobDslSecurityConfiguration:

SEVERE: Failed ConfigurationAsCode.init
java.lang.Error: java.lang.reflect.InvocationTargetException
        at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:110)
        at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:175)
        at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:296)
        at jenkins.model.Jenkins$5.runTask(Jenkins.java:1066)
        at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:214)
        at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:104)
        ... 8 more
Caused by: io.jenkins.plugins.casc.ConfiguratorException: security: error configuring 'security' with class io.jenkins.plugins.casc.impl.configurators.GlobalConfigurationCategoryConfigurator configurator
        at io.jenkins.plugins.casc.ConfigurationAsCode.invokeWith(ConfigurationAsCode.java:612)
        at io.jenkins.plugins.casc.ConfigurationAsCode.checkWith(ConfigurationAsCode.java:641)
        at io.jenkins.plugins.casc.ConfigurationAsCode.configureWith(ConfigurationAsCode.java:628)
        at io.jenkins.plugins.casc.ConfigurationAsCode.configureWith(ConfigurationAsCode.java:540)
        at io.jenkins.plugins.casc.ConfigurationAsCode.configure(ConfigurationAsCode.java:270)
        at io.jenkins.plugins.casc.ConfigurationAsCode.init(ConfigurationAsCode.java:262)
        ... 13 more
Caused by: io.jenkins.plugins.casc.ConfiguratorException: Invalid configuration elements for type class jenkins.model.GlobalConfigurationCategory$Security : GlobalJobDslSecurityConfiguration
        at io.jenkins.plugins.casc.BaseConfigurator.handleUnknown(BaseConfigurator.java:355)
        at io.jenkins.plugins.casc.BaseConfigurator.configure(BaseConfigurator.java:345)
        at io.jenkins.plugins.casc.BaseConfigurator.check(BaseConfigurator.java:265)
        at io.jenkins.plugins.casc.ConfigurationAsCode.lambda$checkWith$8(ConfigurationAsCode.java:641)
        at io.jenkins.plugins.casc.ConfigurationAsCode.invokeWith(ConfigurationAsCode.java:606)
        ... 18 more

@calebdelnay

Circled back around to this. At first I thought the issue only occurred when doing a Jenkins restart via /restart and that maybe the restart doesn't fully release process memory which leads to the multiple instances of GlobalConfigurationCategory.Security within the JVM. However, I completely deleted our Jenkins pod and am observing the same issue in a new pod that is using the same data volume. That makes me believe the source of the bug is some combination of state and/or code that reads the state which is persisted as part of Jenkins configuration. Very peculiar! Still digging to see if I can learn anything else.

@noony

noony commented Sep 27, 2018

I got this issue too. For me, a temporary solution was to delete the javaposse.jobdsl.plugin.GlobalJobDslSecurityConfiguration.xml file in jenkins_home. But this is not sustainable: Jenkins will be broken again at the next restart.

@ndeloof
Contributor

ndeloof commented Sep 27, 2018

some groovy weirdness seems to take place here, which doesn't make any sense to me.

@jhoblitt
Member

jhoblitt commented Sep 27, 2018

The example/test from integrations/src/test/resources/io/jenkins/plugins/casc/JobDslGlobalSecurityConfigurationTest.yml seems to be mysteriously working for me all of a sudden, even though the core and plugin versions have not changed, only the CASC config. However, with warnings enabled, I'm now seeing this message on the /manage page:

Configuration as Code obsolete file format:

/etc/jenkins/casc/01_config.yaml (line 330): Invalid configuration elements for type class jenkins.model.GlobalConfigurationCategory$Security : GlobalJobDslSecurityConfiguration

The GlobalJobDslSecurityConfiguration key also seems to be absent from a config export.

Update:

  • core 2.141
  • casc 1.0
  • casc-support 1.0

@ndeloof
Contributor

ndeloof commented Sep 28, 2018

got it:

serialized xml for GlobalJobDslSecurityConfiguration includes an unexpected line:

<category class="jenkins.model.GlobalConfigurationCategory$Security"/>

This is caused by the Groovy-style property used to override getCategory:

final GlobalConfigurationCategory category = GlobalConfigurationCategory.get(GlobalConfigurationCategory.Security)

Because of this, XStream sees a field to be serialized, not just a method override.
As a result, on load, XStream creates a fresh jenkins.model.GlobalConfigurationCategory$Security instance, which is != to the singleton it is supposed to be.
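
The effect can be reproduced without Jenkins. The sketch below is an analogy only: it uses built-in Java serialization instead of XStream, and `Category`, `FieldDescriptor`, and `MethodDescriptor` are hypothetical stand-ins, not Job DSL classes. A value stored in a field is written out and comes back as a new object, while a getter with no backing field keeps returning the shared instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SingletonFieldDemo {
    // Hypothetical stand-in for the Security category: one shared instance per JVM.
    static final class Category implements Serializable {
        static final Category SECURITY = new Category();
        private Category() {}
    }

    // Stores the category in a field, which is what a Groovy property declaration produces.
    static final class FieldDescriptor implements Serializable {
        final Category category = Category.SECURITY;
        Category getCategory() { return category; }
    }

    // No field: nothing is persisted, the shared instance is looked up on every call.
    static final class MethodDescriptor implements Serializable {
        Category getCategory() { return Category.SECURITY; }
    }

    // Serializes a value to bytes and reads it back, like a save followed by a restart.
    static <T> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FieldDescriptor reloadedField = roundTrip(new FieldDescriptor());
        MethodDescriptor reloadedMethod = roundTrip(new MethodDescriptor());

        System.out.println(reloadedField.getCategory() == Category.SECURITY);  // false
        System.out.println(reloadedMethod.getCategory() == Category.SECURITY); // true
    }
}
```

Under these assumptions, the identity check only survives a reload when the category is not part of the persisted state.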

@ndeloof
Contributor

ndeloof commented Sep 28, 2018

@calebdelnay

I tested a build of job-dsl-plugin with jenkinsci/job-dsl-plugin#1143 applied and that change fixed this issue. I am able to restart Jenkins repeatedly without any problems.

@noony

noony commented Nov 6, 2018

Hi, Jenkins Configuration as Code is not usable until this pull request is merged, because every restart needs a manual action to recover the service. @ndeloof, can you please help land the fix in a job-dsl-plugin release?

@ndeloof
Contributor

ndeloof commented Nov 6, 2018

There is nothing I can do to make the job-dsl maintainer accept this PR. Better to report your success with it on jenkinsci/job-dsl-plugin#1143, so they gain confidence and eventually merge it.

@yogeek
Contributor

yogeek commented Feb 6, 2019

@ndeloof @ewelinawilkosz I think this issue can be closed:

  • The job-dsl PR has finally been merged and is included in the job-dsl-1.71 release from January.
  • I just applied the following configuration and the error no longer occurs:
security:
  globaljobdslsecurityconfiguration:
    useScriptSecurity: true

@noony

noony commented Feb 8, 2019

Thanks !

@jetersen
Member

jetersen commented Mar 21, 2019

I have enabled the test for the useScriptSecurity Jenkins restart issue by removing the ignore flag; see #780, after bumping the integration tests to use job-dsl-1.72.

The same PR also adds a test for the Job DSL script security race condition reported in #619.
💪 😎

@jetersen
Member

#1394 should fix the ordering of JobDSL and security section.
