Jenkins slaves connection fails randomly #299

Open
mikiedelstein opened this issue Jan 4, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@mikiedelstein

Jenkins and plugins versions report

Environment

Jenkins: 2.327
OS: Linux - 5.4.0-1058-gcp

CustomHistory:1.6
ace-editor:1.1
ansicolor:1.0.1
ant:1.13
antisamy-markup-formatter:2.6
apache-httpcomponents-client-4-api:4.5.13-1.0
artifactory:3.14.2
authentication-tokens:1.4
authorize-project:1.4.0
aws-credentials:1.33
aws-java-sdk-ec2:1.12.131-302.vbef9650c6521
aws-java-sdk-minimal:1.12.131-302.vbef9650c6521
badge:1.9
bitbucket:214.v2fd4234d0554
bitbucket-push-and-pull-request:2.8.1
bootstrap4-api:4.6.0-3
bootstrap5-api:5.1.3-4
bouncycastle-api:2.25
branch-api:2.7.0
build-environment:1.7
build-monitor-plugin:1.13+build.202112271752
build-name-setter:2.2.0
build-pipeline-plugin:1.5.8
build-timeout:1.20
build-timestamp:1.0.3
build-user-vars-plugin:1.8
caffeine-api:2.9.2-29.v717aac953ff3
checks-api:1.7.2
chucknorris:1.4
cloudbees-folder:6.17
command-launcher:1.6
conditional-buildstep:1.4.1
config-file-provider:3.8.2
copyartifact:1.46.2
credentials:1055.v1346ba467ba1
credentials-binding:1.27
dashboard-view:2.18
display-url-api:2.3.5
docker-build-step:2.8
docker-commons:1.17
docker-java-api:3.1.5.2
docker-plugin:1.2.6
docker-workflow:1.26
durable-task:493.v195aefbb0ff2
echarts-api:5.2.2-2
email-ext:2.86
embeddable-build-status:2.0.3
extended-choice-parameter:0.82
extensible-choice-parameter:1.8.0
external-monitor-job:1.7
extra-columns:1.25
font-awesome-api:5.15.4-5
gcloud-sdk:0.0.3
generic-webhook-trigger:1.79
git:4.10.1
git-client:3.11.0
git-parameter:0.9.14
git-server:1.10
git-tag-message:1.7.1
github:1.34.1
github-api:1.301-378.v9807bd746da5
github-branch-source:2.11.4
google-compute-engine:4.3.8
google-container-registry-auth:0.3
google-kubernetes-engine:0.8.6
google-login:1.6
google-oauth-plugin:1.0.6
gradle:1.37.1
groovy-postbuild:2.5
handlebars:3.0.8
hidden-parameter:0.0.4
htmlpublisher:1.28
ivy:2.1
jackson2-api:2.13.1-244.v773c36c5b330
javadoc:1.6
jaxb:2.3.0
jdk-tool:1.5
jjwt-api:0.11.2-9.c8b45b8bb173
job-dsl:1.78.3
jobConfigHistory:2.31-rc1098.b666422863b2
jquery:1.12.4-1
jquery-detached:1.2.1
jquery-ui:1.0.2
jquery3-api:3.6.0-2
jsch:0.1.55.2
junit:1.53
kubernetes:1.31.1
kubernetes-client-api:5.10.1-171.vaa0774fb8c20
kubernetes-credentials:0.9.0
ldap:2.7
lockable-resources:2.13
mailer:1.34
matrix-auth:3.0
matrix-project:1.19
maven-plugin:3.16
mercurial:2.16
metrics:4.0.2.8
momentjs:1.1.1
nodejs:1.4.3
oauth-credentials:0.5
okhttp-api:4.9.3-105.vb96869f8ac3a
pam-auth:1.6.1
parameter-separator:1.3
parameterized-trigger:2.43
periodicbackup:1.7
pipeline-build-step:2.15
pipeline-github-lib:1.0
pipeline-graph-analysis:188.v3a01e7973f2c
pipeline-input-step:427.va6441fa17010
pipeline-milestone-step:1.3.2
pipeline-model-api:1.9.3
pipeline-model-declarative-agent:1.1.1
pipeline-model-definition:1.9.3
pipeline-model-extensions:1.9.3
pipeline-rest-api:2.20
pipeline-stage-step:291.vf0a8a7aeeb50
pipeline-stage-tags-metadata:1.9.3
pipeline-stage-view:2.20
pipeline-utility-steps:2.11.0
plain-credentials:1.7
plugin-util-api:2.10.0
popper-api:1.16.1-2
popper2-api:2.11.0-1
publish-over:0.22
publish-over-ssh:1.22
purge-job-history:1.6
pwauth:0.4
readonly-parameters:1.0.0
rebuild:1.32
resource-disposer:0.17
role-strategy:3.2.0
run-condition:1.5
saml:2.0.9
scm-api:2.6.5
script-security:1118.vba21ca2e3286
slack:2.49
snakeyaml-api:1.29.1
ssh:2.6.1
ssh-agent:1.23
ssh-credentials:1.19
ssh-slaves:1.33.0
sshd:3.1.0
stashNotifier:1.24
structs:308.v852b473a2b8c
summary_report:1.15
throttle-concurrents:2.6
timestamper:1.15
token-macro:267.vcdaea6462991
trilead-api:1.0.13
uno-choice:2.5.7
variant:1.4
view-job-filters:2.3
windows-slaves:1.8
workflow-aggregator:2.6
workflow-api:1108.v57edf648f5d4
workflow-basic-steps:2.24
workflow-cps:2648.va9433432b33c
workflow-cps-global-lib:552.vd9cc05b8a2e1
workflow-durable-task-step:1112.vda00e6febcc1
workflow-job:1145.v7f2433caa07f
workflow-multibranch:696.v52535c46f4c9
workflow-scm-step:2.13
workflow-step-api:615.vb09dac339255
workflow-support:804.vba10a18a1476
ws-cleanup:0.40

What Operating System are you using (both controller, and any agents involved in the problem)?

Jenkins randomly deletes slaves before their job runs complete.
We could not establish any pattern for when it happens. As far as I can tell, it is not time dependent.

We get the following errors:

Caused: java.io.IOException: Unexpected termination of the channel

Caused: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@60d32f79:jenkins-jgqlcq": Remote call on jenkins-jgqlcq failed. The channel is closing down or has closed down

I can see in GCP Log Explorer that Jenkins sends a v1.compute.instances.delete to the node when this happens. However, I cannot find any definition for it.

I set the retention time to 500 just to rule it out as the cause. The launch timeout is 300.

Reproduction steps

  1. GCP spins up a slave instance to run a job.
  2. At random, the instance may or may not receive a v1.compute.instances.delete from Jenkins.

Expected Results

All slaves should finish their runs without disconnecting

Actual Results

Slaves randomly disconnect without any visible pattern

Anything else?

No response

@mikiedelstein added the bug label on Jan 4, 2022

@BrianRossmajer

I recently ran into this same issue and am just starting to investigate. It coincided with starting to use Jenkins' Configuration as Code plugin. Are you using that too?

@BrianRossmajer

I'll just comment here in case it helps someone in the future: it did indeed seem related to using Jenkins' Configuration as Code. I had exported the configuration, edited it, and then duplicated it for something else, not noticing that the export kept the instanceId for the cloud configuration. As a result, two different cloud configurations were using the same instanceId. Note that the example YAML does not include the instanceId. Once I removed it, the random shutdowns stopped.
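
For anyone who wants to check their own exported configuration for the same mistake, here is a minimal sketch that scans YAML text for repeated instanceId values. It assumes the key appears on its own line as `instanceId: <value>`, and it uses a plain regular expression rather than a YAML parser, so it knows nothing about the plugin's actual schema. The function name and the input layout are hypothetical, made up for illustration only.

```python
import re
from collections import defaultdict

# Matches a line such as `instanceId: "abc-123"`, optionally preceded by a YAML list dash.
# Assumption: the key is written exactly as `instanceId` and sits on its own line.
INSTANCE_ID_RE = re.compile(r'^\s*(?:-\s*)?instanceId:\s*["\']?([^"\'\s#]+)')


def find_duplicate_instance_ids(yaml_texts):
    """Return every instanceId that occurs more than once.

    yaml_texts: dict mapping a source name (e.g. a file name) to its YAML text.
    Returns a dict mapping each repeated instanceId to a list of
    (source_name, line_number) tuples showing where it was found.
    """
    seen = defaultdict(list)
    for name, text in yaml_texts.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = INSTANCE_ID_RE.match(line)
            if match:
                seen[match.group(1)].append((name, lineno))
    # Keep only the ids that show up in more than one place.
    return {iid: places for iid, places in seen.items() if len(places) > 1}
```

Pass it the contents of each exported configuration file keyed by file name. An empty result means no instanceId was repeated; any entry it returns points at the cloud configurations that would need their instanceId removed or changed.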
