Jenkins stops trying to restart the slave (ssh and I think jnlp?) when slave down too long #718

Closed
rustycar54 opened this issue Oct 17, 2017 · 2 comments

Comments

@rustycar54

I've got a complete showstopper.

I'm trying to get Jenkins to control our testing hosts, which work as follows:

1 - when idle, the test host is running Linux (with the SSH-controlled agent running).
2 - when a test is started and the agent has fired off all the required scripts, the host boots into FreeDOS and runs things for a while (up to a few days or more). FreeDOS has NO networking installed, so even if there were a Jenkins hook for DOS we'd be out of luck. Anyway, Jenkins obviously decides the agent is offline, but ...
3 - when the FreeDOS part of the test is finished and the test host reboots into Linux, Jenkins appears to never try to relaunch the agent. So far, the ONLY way to get Jenkins talking to it again is to manually click 'Launch agent' on the 'Jenkins -> Nodes -> ' page. Since you must be logged in for the 'Launch agent' button to be visible, I'm not expecting to have much luck writing a script to force Jenkins to launch the agent.

The agents are all configured as follows:

Executors: 1

Launch Method: Launch slave agents via ssh
Maximum number of retries: 0
Seconds to wait between retries: 30 (have tried 0 also)
Availability: Keep this agent online as much as possible.

I've searched all over the place and even filed a Jira ticket. The only thing that has happened there is that someone removed the component I had guessed at, without giving me any hint as to what I should put on it instead (JENKINS-47327, if you want to go look; the systeminfo.pdf and the pipeline script I'm running are attached there).

I need to either figure out how to get Jenkins to start the agent like I thought it was supposed to, OR find a hook or something I can use to relaunch the agent from a script. Or maybe someone can fix it, but right now, I'll just settle for a hack to make it work.
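One piece of such a script would be noticing that Jenkins still considers the node offline after the host is back in Linux. Below is a minimal sketch of that check, assuming the node's remote API at `/computer/<name>/api/json` reports a boolean `offline` field. The server URL and node name are placeholders, authentication is left out, and the helper names are made up for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def node_status_url(server, node_name):
    """Build the JSON status URL for a node (server and node_name are placeholders)."""
    return f"{server.rstrip('/')}/computer/{quote(node_name)}/api/json"


def is_offline(payload):
    """Return True if the JSON payload marks the node offline.

    Assumption: the payload carries a boolean "offline" field.
    A missing field is treated as offline, to be on the safe side.
    """
    data = json.loads(payload)
    return bool(data.get("offline", True))


def fetch_is_offline(server, node_name, timeout=10):
    """Fetch the node status over HTTP and report whether it is offline.

    No credentials are sent here; a secured Jenkins will need them added.
    """
    with urlopen(node_status_url(server, node_name), timeout=timeout) as resp:
        return is_offline(resp.read().decode("utf-8"))
```

A boot-time script on the test host could call `fetch_is_offline("http://yourserver:port", "slave-name")` and only take action when it returns `True`.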

@rustycar54
Author

Ah-HA! Found a hack, as follows:

(I used https://wiki.jenkins.io/display/JENKINS/Distributed+builds to find the secret sauce).

Configure "Launch slave agent headlessly" (which requires a visit to global security),

Then, when you need the agent: java -jar slave.jar -jnlpUrl http://yourserver:port/computer/slave-name/slave-agent.jnlp

(being sure to change what I hope are the obvious places above to what makes sense in your network)
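To run that from the test host whenever it boots back into Linux, the command can be wrapped in a small restart loop. This is only a sketch of one way to do it: the function names, the 30-second delay, and the `max_restarts` parameter are illustrative choices, and `yourserver:port` and `slave-name` are the same placeholders as in the command above.

```python
import subprocess
import time
from urllib.parse import quote


def build_agent_command(server, node_name, jar_path="slave.jar"):
    """Assemble the java command line from the hack above.

    server, node_name and jar_path are placeholders to adapt to your network.
    """
    jnlp_url = f"{server.rstrip('/')}/computer/{quote(node_name)}/slave-agent.jnlp"
    return ["java", "-jar", jar_path, "-jnlpUrl", jnlp_url]


def run_agent(command, retry_delay=30, max_restarts=None,
              run=subprocess.call, sleep=time.sleep):
    """Start the agent and start it again each time the process exits.

    max_restarts=None loops forever; run and sleep are injectable so the
    loop can be exercised without launching java.
    Returns the number of times the agent was started.
    """
    starts = 0
    while max_restarts is None or starts < max_restarts:
        run(command)        # blocks until the agent process exits
        starts += 1
        sleep(retry_delay)  # brief pause before the next attempt
    return starts


# Example (runs until killed):
# run_agent(build_agent_command("http://yourserver:8080", "slave-name"))
```

Hooking a script like this into the host's Linux startup means the agent reconnects on its own once the FreeDOS phase is over, with no click on 'Launch agent' needed.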

Rusty

@rustycar54
Author

This is a hack, and I'd really like it if someone could actually fix it, but for now, I'm done and will close this.

@lemeurherve lemeurherve transferred this issue from jenkinsci/docker-inbound-agent Jan 16, 2024