[CI] SSLReloadIntegTests testThatSSLConfigurationReloadsOnModification failing #92365

Closed
joegallo opened this issue Dec 14, 2022 · 5 comments
Labels
:Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. low-risk An open issue or test failure that is a low risk to future releases Team:Distributed Meta label for distributed team >test-failure Triaged test failures from CI

Comments

@joegallo
Contributor

Build scan:
https://gradle-enterprise.elastic.co/s/z2pvgeyxxqfoo/tests/:x-pack:plugin:security:internalClusterTest/org.elasticsearch.xpack.ssl.SSLReloadIntegTests/testThatSSLConfigurationReloadsOnModification

Reproduction line:

gradlew ':x-pack:plugin:security:internalClusterTest' --tests "org.elasticsearch.xpack.ssl.SSLReloadIntegTests.testThatSSLConfigurationReloadsOnModification" -Dtests.seed=2CD592B1AC5189E4 -Dtests.locale=el -Dtests.timezone=America/Denver -Druntime.java=19

Applicable branches:
8.6

Reproduces locally?:
Didn't try

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.xpack.ssl.SSLReloadIntegTests&tests.test=testThatSSLConfigurationReloadsOnModification

Failure excerpt:

com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=3286, name=elasticsearch[node_s0][generic][T#1], state=RUNNABLE, group=TGRP-SSLReloadIntegTests]


  Caused by: java.lang.AssertionError: initial cluster state not set yet

    at __randomizedtesting.SeedInfo.seed([2CD592B1AC5189E4]:0)
    at org.elasticsearch.cluster.service.ClusterApplierService.state(ClusterApplierService.java:181)
    at org.elasticsearch.cluster.service.ClusterService.state(ClusterService.java:141)
    at org.elasticsearch.xpack.monitoring.exporter.local.LocalExporter.onCleanUpIndices(LocalExporter.java:598)
    at org.elasticsearch.xpack.monitoring.cleaner.CleanerService$IndicesCleaner.doRunInLifecycle(CleanerService.java:164)
    at org.elasticsearch.common.util.concurrent.AbstractLifecycleRunnable.doRun(AbstractLifecycleRunnable.java:56)
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:917)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.lang.Thread.run(Thread.java:1589)

@joegallo joegallo added :Security/TLS SSL/TLS, Certificates >test-failure Triaged test failures from CI labels Dec 14, 2022
@elasticsearchmachine
Collaborator

Pinging @elastic/es-security (Team:Security)

@elasticsearchmachine elasticsearchmachine added the Team:Security Meta label for security team label Dec 14, 2022
@joegallo
Contributor Author

joegallo commented Dec 14, 2022

Note: The failure above occurred on Windows, which may be relevant. There has been only 1 failure in the last week.

@slobodanadamovic slobodanadamovic self-assigned this Dec 15, 2022
@slobodanadamovic
Contributor

Looking at the test, I don't think this is a security issue, since the test itself executed successfully:

[2022-12-14T01:00:02,734][INFO ][o.e.x.s.SSLReloadIntegTests] [[HandshakeCompletedNotify-Thread]] ssl handshake completed on port [64842]

It seems there is a race condition when Node stops its LifecycleComponents, or the order in which they are stopped is simply causing the issue. Since CleanerService (in the monitoring plugin) depends on ClusterService, maybe we should stop the cluster service last? This is just a hypothesis. I would appreciate it if someone from @elastic/es-distributed could have a look.
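
To make the suspected failure mode concrete, here is a minimal sketch of a background task that reads state from a service whose state is not available yet. Everything in it is hypothetical: `StateHolder` and `Cleaner` are stand-ins invented for illustration, not the real ClusterService or CleanerService, and the guard shown is only one possible mitigation, not necessarily what the eventual fix does.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-ins for illustration only; not the Elasticsearch classes. */
public class LifecycleOrderSketch {

    /** Mimics a service whose state() fails until an initial state has been set. */
    public static final class StateHolder {
        private final AtomicReference<String> state = new AtomicReference<>();
        private final AtomicBoolean started = new AtomicBoolean(false);

        /** Publishes the initial state first, then marks the service as started. */
        public void start(String initialState) {
            state.set(initialState);
            started.set(true);
        }

        public void stop() {
            started.set(false);
        }

        public boolean isStarted() {
            return started.get();
        }

        public String state() {
            String current = state.get();
            if (current == null) {
                // Same message as the assertion in the stack trace above.
                throw new IllegalStateException("initial cluster state not set yet");
            }
            return current;
        }
    }

    /** Mimics a periodic task that depends on the state holder. */
    public static final class Cleaner {
        private final StateHolder holder;

        public Cleaner(StateHolder holder) {
            this.holder = holder;
        }

        /** Reads the state unconditionally; fails if it runs at the wrong time. */
        public String runUnguarded() {
            return "cleaned using " + holder.state();
        }

        /** Skips the run when the dependency is not in the started state. */
        public String runGuarded() {
            if (!holder.isStarted()) {
                return "skipped";
            }
            return "cleaned using " + holder.state();
        }
    }

    public static void main(String[] args) {
        StateHolder holder = new StateHolder();
        Cleaner cleaner = new Cleaner(holder);

        // The task fires while the dependency has no state.
        System.out.println(cleaner.runGuarded());
        try {
            cleaner.runUnguarded();
        } catch (IllegalStateException e) {
            System.out.println("unguarded run failed: " + e.getMessage());
        }

        holder.start("state-v1");
        System.out.println(cleaner.runGuarded());
    }
}
```

The unguarded variant fails whenever the task is scheduled outside the window in which its dependency is usable, which is the kind of ordering problem described above. Checking the dependency's lifecycle (or stopping the dependent component first) avoids reading state at the wrong time.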

@gwbrown gwbrown added the low-risk An open issue or test failure that is a low risk to future releases label Oct 13, 2023
@slobodanadamovic slobodanadamovic added the Team:Distributed Meta label for distributed team label Dec 21, 2023
@elasticsearchmachine elasticsearchmachine removed the Team:Distributed Meta label for distributed team label Dec 21, 2023
@slobodanadamovic slobodanadamovic added :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. Team:Distributed Meta label for distributed team and removed :Security/TLS SSL/TLS, Certificates Team:Security Meta label for security team labels Dec 21, 2023
@slobodanadamovic slobodanadamovic removed their assignment Dec 21, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@slobodanadamovic slobodanadamovic self-assigned this Dec 21, 2023
@slobodanadamovic slobodanadamovic removed their assignment Feb 13, 2024
@idegtiarenko
Contributor

Fixed by #100565
