Containers are spawned but appear offline and build doesn't start. #740
Comments
The docker plugin hasn't been updated "for a long time" either; I don't think you can lay any blame on that changing 😁
It's possible that your networking/security has changed - could you be using SELinux or similar, where Jenkins isn't permitted to SSH to the containers?
You should also check the slave log on the "offline" slaves to see what Jenkins is reporting. That can be highly instructive.
If you have shell access to your Jenkins host, take a look at the last-modified date of the config.xml file in the Jenkins home (and if you don't, use the Groovy console to do the same) - if that file changed when it stopped working (or afterwards) then that's a clue.
TL;DR: The docker-plugin itself is only a part of this; 95% of what's required for things to work is outside its control, so you have to look all over the place to find the cause of this sort of problem.
...and if all else fails, use the "attach" method - that only requires the ability to issue a "docker run" command to the docker host (as all subsequent comms are over that channel with the docker host) and is therefore much simpler (i.e. less vulnerable to "other things going wrong"). Or use JNLP (IME it's less of a problem than SSH, although #739 can be an inconvenience for some). |
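For anyone with shell access who wants to script the config.xml check suggested above, here is a minimal sketch in Python. It is not part of Jenkins or the docker-plugin; `config_last_modified` is a name made up for this example, and the Jenkins home path in the usage comment is only an assumption (use whatever your installation actually has).

```python
from datetime import datetime, timezone
from pathlib import Path


def config_last_modified(jenkins_home: str) -> datetime:
    """Return the last-modified time (UTC) of config.xml in the given Jenkins home.

    Raises FileNotFoundError if there is no config.xml in that directory.
    """
    config = Path(jenkins_home) / "config.xml"
    return datetime.fromtimestamp(config.stat().st_mtime, tz=timezone.utc)


# Example usage (the path is an assumption, adjust it to your own JENKINS_HOME):
#   print(config_last_modified("/var/lib/jenkins").isoformat())
```

If the timestamp lines up with the day the agents started showing as offline, that points at a configuration change rather than at the plugin.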
Here you can see the container output. They are terminated after 30 minutes. Thanks for linking the issue. It seems to be quite similar to what is described in #739 |
Not sure where issues should be reported, I reported this here[1], this is incompatibility with |
Hey I just encountered a similar issue and the problem seems to be the "Inject SSH Method". If you use "User Generated SSH Credentials" option instead, it should work as expected :) |
It would be really useful if someone could post a stacktrace of an exception. Maybe the exception isn't being logged directly from the docker code, but somewhere there's sure to be an exception being thrown. TL;DR: If you're reading this, check your Jenkins logs (not just the docker plugin logging) for exceptions mentioning SSH or docker and post them here. |
I've upgraded the plugin again. The exception I can see in the Jenkins log is:
|
Hmm, unfortunately these are consequences of a slave disconnecting rather than the cause of a slave failing to connect - they are not the key to enlightenment here :-/
FYI while much of the docker-cloud-plugin process is the same for the JNLP, SSH and Attach connection methods, the symptoms (how things look, especially when things don't work) vary a lot between them; it's helpful to specify which kind of connection is/isn't working when picking through logs.
PS. Single back-quotes only work for text that doesn't split over multiple lines. When posting a multi-line log, use triple back-quotes before and after, e.g.
|
Should I enable any other logs? Just let me know... I am using the SSH connection method. So my docker-in-docker-images is based on |
Hello
In the Environment field of the Docker Template (advanced section), just add: JENKINS_SLAVE_SSH_PUBKEY= (source: https://hub.docker.com/r/jenkins/ssh-slave/). As of the time of writing, my problems were fixed by simply changing to the "Attach container mode", which I was hesitant to use because it's listed as experimental in the official wiki documentation. |
Thanks Pietro. I've tried that as a "-D" parameter and in the "Environment" section, without success. Maybe I did it wrong, because I am not sure about the "Environment" field in the "Advanced" section. Are you able to post a screenshot? |
@phreakadelle Sadly no, as I just switched to the 'Attach Container' mode and discarded that previous configuration. Sorry about that |
Hey, as I mentioned before, the inject SSH method doesn't work. For SSH to work you need User Configured SSH credentials:
This should work. |
On Tue, Jul 9, 2019 at 11:01 AM Asad Syed ***@***.***> wrote:
This should work.
But is insecure. Issue should be resolved properly.
|
@alonbl While I agree that the issue "should be resolved properly", I don't see how one method is deemed insecure where the other is acceptable. @phreakadelle Re: Should i enable any other logs? Given that there's a fair amount of interest from multiple people here, can I ask you each to confirm...
My guess is that the recent "breaking-changes" to the ssh-slaves plugin in 1.30 have caused this, as that's the only area I know of where "brave" changes have happened recently, but without a nice stacktrace pointing the finger-of-blame, it's difficult to be sure. |
I've enabled logging on the com.* package and this is the result |
Awesome; thanks for those logs. OK, so what we're seeing here...
When we're using the
i.e. This gets an ID based on the Jenkins server itself, turns that into a private key in PEM form, and passes it to
However... I've just looked at the
So my guess is that y'all have a new(ish) version of ssh-slaves, 1.30.0 (github says 1.30.0 was released on June 9th 2019) or later, and that's what broke it. If that's the case then one (temporary) workaround would be to downgrade to the previous version of the ssh-slaves plugin.
Can everyone confirm that:
|
So I am using 1.30.0 and it breaks. With 1.29.4 it works. The workaround of manually providing a key does not work for me. For the moment I have reverted the plugin to 1.29.4. Thanks for the brilliant support. Who is going to fix this now? The Jenkins core team? Is there a ticket? |
Well, it's not going to be me any time soon ... my "day job" isn't this; I'm only answering questions because (a) I use this plugin too, (b) I seem to be the "last man standing" with commit-rights to this repo and (c) this provides a welcome break from (and is more interesting than) the work that I should be doing instead ;-) ... but I've got other pressures on my time right now such that I can't go diving into this code at present; I'm happy to "advise", but not to "do".
If you (or anyone else reading this!) can code up a pull-request that fixes this then that'll be widely welcomed ... and, because PRs are auto-built by the jenkinsci CI system, any PRs will also result in a downloadable .hpi/.jpi plugin file that folks can try out even before it gets merged into the master code and released. As for "a ticket", I suggest that you check to see if there's one on the Jenkins JIRA site and, if not, create one and link back here. This plugin uses github issues to drive code changes, but it helps to also have JIRA issues cross-linking back here, as lots of folks log bugs, and search for bugs, purely on JIRA. TL;DR: We need a volunteer. |
Right. Yeah. So maybe someone else can fix that. I've just linked our discussion in the official JIRA. I have my own plugins to care about, and if I find some spare time I will investigate this issue. For the moment I can live with the 1.29.4 version. |
@phreakadelle FYI there's no point pinging Nick - he stopped his involvement in this plugin some time ago, just as I'm trying to do. This plugin is "up for adoption" - I'll help mentor anyone wanting to get involved, but (like Nick!) I have other commitments. |
The issue happens at https://github.com/jenkinsci/ssh-slaves-plugin/blob/master/src/main/java/hudson/plugins/sshslaves/SSHLauncher.java#L865: when the ssh-slaves-plugin asks for the descriptor of the DockerComputerSSHConnector, it throws a runtime exception. I'll make a patch with some kind of hack to allow the docker-plugin to bypass the checkconfig. Method threw 'java.lang.AssertionError' exception. |
@kuisathaverat If you can submit a PR that adds a Descriptor to DockerSSHLauncher (and which extends the SSHLauncher's descriptor but stubs out the validation code being called) then please do so - I'm sure that everyone who's involved in this thread will be very grateful. Note: I'm out of the office for the next 3 weeks so I'll take a look when I get back. |
Finally, I resolved the issue in the ssh-slaves-plugin itself, which is the better place for it: jenkinsci/ssh-agents-plugin#136. I'll release a new version this weekend. |
Closing issue as it's fixed in ssh-slaves plugin version 1.30.1 |
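To summarise the version reports in this thread (1.29.4 works, 1.30.0 breaks, fixed in 1.30.1), here is a rough Python sketch of a check for an affected ssh-slaves version. `parse_version` and `is_affected` are hypothetical helper names, not part of Jenkins or either plugin, and the sketch assumes plain dotted numeric version strings; anything else raises `ValueError`.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.30.0' into (1, 30, 0), padding short versions such as '1.30' with zeros."""
    parts = [int(part) for part in version.strip().split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(ssh_slaves_version: str) -> bool:
    """True for versions from 1.30.0 up to, but not including, the fixed 1.30.1.

    Assumption: only the versions named in this thread are considered.
    """
    return (1, 30, 0) <= parse_version(ssh_slaves_version) < (1, 30, 1)


for candidate in ("1.29.4", "1.30.0", "1.30.1"):
    print(candidate, "affected" if is_affected(candidate) else "ok")
```

The installed version is listed under Manage Jenkins > Manage Plugins; downgrading to 1.29.4 or upgrading to 1.30.1 are the two ways out reported above.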
For a few days I have been facing an issue with my Jenkins. We have used the DockerCloud plugin with the same containers for a long time, but suddenly it stopped working. I am unable to find out why it stopped, so I am kindly asking for support.
Whenever I trigger the build, a new container is started. I can see the Docker container is launched on the DockerHost and I can see the node pop up in the Jenkins master. But the node is shown as offline.
I am able to do a manual SSH login from the Jenkins master into the container on the DockerHost with the private key.
The DockerCloud plugin keeps launching containers until the maximum of 5 is reached.
Logfile.txt
As said, it worked for a long time with the same containers. The container images have not been updated.
I am using Jenkins 2.176.1 and Docker Plugin 1.1.6
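The manual SSH login described above can be partly scripted as a quick reachability check from the Jenkins master. The sketch below only confirms that something speaking SSH answers on the given host and port; it does not test key authentication, and it says nothing about what the plugin itself does. `read_ssh_banner` is a hypothetical helper name, and the host and port in the usage comment are placeholders.

```python
import socket
from typing import Optional


def read_ssh_banner(host: str, port: int = 22, timeout: float = 5.0) -> Optional[str]:
    """Return the SSH identification line sent by host:port, or None if unreachable.

    Also returns None when the peer answers but does not start with 'SSH-'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            data = sock.recv(256)
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return None
    lines = data.decode("ascii", errors="replace").splitlines()
    if not lines:
        return None
    banner = lines[0].strip()
    return banner if banner.startswith("SSH-") else None


# Example usage (placeholder host and port, substitute the container's mapped SSH port):
#   print(read_ssh_banner("dockerhost.example", 32768))
```

If this returns a banner while the node still shows as offline, plain network reachability is unlikely to be the cause.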