NPE in Agent Launcher #2789

Closed
aschrijver opened this issue Oct 17, 2016 · 18 comments
@aschrijver

Issue Type
  • Bug Report
Summary

Agent cannot start due to NullPointerException

Environment
Basic environment details
  • Go 16.10.0 (4131-730ff1867576754414cc632957f344d0263bc06d)
  • Ubuntu 15.10 (4.2.0-41-generic)
  • JDK 1.8.0_101-b13

Standard configuration (no special changes) with single agent running.

Steps to Reproduce
  1. Start a build. Any build fails, because agent does not launch
  2. Started build hangs indefinitely at the first job
Expected Results

Agent should start. Build should work.

Actual Results

NullPointerException occurs in Agent Launcher

Log snippets
2016-10-17 15:02:27,661 [main     ] ERROR cruise.agent.launcher.AgentLauncherImpl:98 - Launch encountered an unknown exception
java.lang.NullPointerException
        at com.thoughtworks.cruise.agent.launcher.AgentLauncherImpl.getPort(AgentLauncherImpl.java:136)
        at com.thoughtworks.cruise.agent.launcher.AgentLauncherImpl.getUrlGenerator(AgentLauncherImpl.java:127)
        at com.thoughtworks.cruise.agent.launcher.AgentLauncherImpl.launch(AgentLauncherImpl.java:75)
        at com.thoughtworks.go.agent.bootstrapper.AgentBootstrapper.go(AgentBootstrapper.java:72)
        at com.thoughtworks.go.agent.bootstrapper.AgentBootstrapper.main(AgentBootstrapper.java:54)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.simontuffs.onejar.Boot.run(Boot.java:306)
        at com.simontuffs.onejar.Boot.main(Boot.java:159)

Any other info

I was not aware that Go.CD auto-upgrades when doing apt-get on Ubuntu. I had a working installation before the upgrade :(
Also it seems I'll have to upgrade plugins individually by downloading jar-files (did that for gradle plugin)

@ketan
Member

ketan commented Oct 17, 2016

The stacktrace seems to indicate that the agent is running on an older version of GoCD. Please ensure that the agent is running the same version as the Go server.

If the problem persists, or if you see an issue with the agent being unable to connect to the server, remove agent-launcher.jar and agent.jar from /var/lib/go-agent and restart the agent service.

I'm marking this issue as closed, but if you continue to see this problem, please specify the version of the GoCD agent (the output of dpkg -l | grep go-agent) and attach any GoCD jars you find under /var/lib/go-agent.
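
The cleanup step of the workaround above can be sketched as a small script. This is a minimal sketch, assuming the jar names and the directory mentioned in the comment (agent-launcher.jar and agent.jar under /var/lib/go-agent). The helper names are hypothetical and not part of GoCD, and restarting the agent service is left to the operator.

```python
from pathlib import Path

# Jar names taken from the comment above; the helper names are illustrative only.
STALE_JAR_NAMES = ("agent-launcher.jar", "agent.jar")


def find_stale_jars(agent_dir):
    """Return the cached jars that exist directly under agent_dir."""
    base = Path(agent_dir)
    return [base / name for name in STALE_JAR_NAMES if (base / name).is_file()]


def remove_stale_jars(agent_dir, dry_run=True):
    """List the cached jars and delete them unless dry_run is set."""
    jars = find_stale_jars(agent_dir)
    if not dry_run:
        for jar in jars:
            jar.unlink()
    return jars
```

By default remove_stale_jars("/var/lib/go-agent") only reports what it would delete; pass dry_run=False to actually remove the files, then restart the agent service.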

@ketan ketan closed this as completed Oct 17, 2016
@aschrijver
Author

Thanks for your quick response! I'll reinstall the agent.
Strange that the server auto-upgrades and the agent doesn't follow, though..

@aschrijver
Author

Oh BTW the agent version is 16.10.0-4131
Isn't that the newest already?

@ketan
Member

ketan commented Oct 17, 2016

There are a few edge cases that we've been unable to reproduce and that have caused a similar stacktrace. If you could send all logs under /var/log/go-agent to support[at]thoughtworks.com, someone could take a deeper look and hopefully isolate the problem.

Please also keep a backup of the logs under /var/log/go-server in case we need more information.

When sending the email, please add a note asking for the logs to be forwarded to me (Ketan).

Thanks!

@aschrijver
Author

Thanks for your help 👍

@ketan
Member

ketan commented Oct 17, 2016

Or perhaps the agent process is somehow still loading the old agent-bootstrapper.jar; the stacktrace leads me to believe so.

Could you quickly checksum /usr/share/go-agent/agent-bootstrapper.jar and check whether the md5sum is 8a05f5593b693e0dc4149b35a98cca39?
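
Where md5sum is not at hand, the same check can be done with a few lines of Python. This is a minimal sketch using the standard hashlib module; the expected digest is the value quoted in the comment above, and the function names are hypothetical.

```python
import hashlib

# Checksum quoted in the comment above for agent-bootstrapper.jar.
EXPECTED_MD5 = "8a05f5593b693e0dc4149b35a98cca39"


def md5_of_file(path, chunk_size=65536):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected.lower()
```

For example, matches_expected("/usr/share/go-agent/agent-bootstrapper.jar") reports whether the installed bootstrapper has the digest quoted above.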

@aschrijver
Author

Yes, it has the correct checksum.. I'll collect the logs

jyotisingh added a commit to jyotisingh/gocd that referenced this issue Oct 27, 2016
@jyotisingh jyotisingh reopened this Oct 27, 2016
@jyotisingh jyotisingh added this to the Release 16.12 milestone Oct 27, 2016
@jyotisingh jyotisingh self-assigned this Oct 27, 2016
ketan added a commit that referenced this issue Oct 27, 2016
…ies_on_agents_upon_upgrade

#2789 cleaning up agent binaries from previous version upon upgrade
@jyotisingh
Contributor

The scenario is: both server and agent are on some old build, say 16.1. You then upgrade both agent and server to 16.7 or a later build. At this point, when the bootstrapper comes up, it tries to run the launcher using the locally available launcher jar (a stale copy from 16.1). This launcher expects the port number and hostname to be made available, but the new bootstrapper does not pass them along. The installer should have cleared out the older launcher and agent jars. The fix has been checked in as part of 16.12 (the upcoming release), and the workaround is what @ketan suggested, i.e. delete the local agent-launcher.jar and restart the agent-bootstrapper.

@aschrijver
Author

Thanks a lot @jyotisingh 👍

@lenucksi

@jyotisingh Can a recent agent launcher operate with an older Go server?
Consider, for example, having the 16.11 launcher on the agents and a 16.3 server to connect to.
As far as I can see, the fix is embedded in the 16.12 agent?

@jyotisingh
Contributor

@lenucksi When upgrading, you either upgrade just your server, or you upgrade your server as well as the agents. Newer agents with an older server will not work and were never supported. Older agents with a newer version of the server should work as expected.
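
The rule in this comment can be written down as a one-line check. This is a minimal sketch, assuming versions are compared on their first two components (for example 16.11 or 16.10.0); the function names are hypothetical, and this is not how GoCD itself validates versions.

```python
def parse_version(text):
    """Turn a string such as '16.7' or '16.10.0' into a (major, minor) tuple."""
    return tuple(int(part) for part in text.split(".")[:2])


def is_supported_pairing(agent_version, server_version):
    """Agents may match or be older than the server, but never newer."""
    return parse_version(agent_version) <= parse_version(server_version)
```

With the versions from the question, is_supported_pairing("16.11", "16.3") is False, while an older agent against a newer server, such as is_supported_pairing("16.3", "16.11"), is True.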

@arvindsv
Member

Also, this might be relevant.

@lenucksi

@jyotisingh Was this issue not due to the older agent having an incompatible agent-launcher.jar? Would using an older agent with a server newer than 16.7 then not cause exactly the issues mentioned? Or does 16.12 introduce some compatibility features to upgrade older agents?

@jyotisingh
Contributor

jyotisingh commented Nov 18, 2016

@lenucksi Well, the issue occurred because of a mix of a few things, let me try and explain.

In general an agent-bootstrapper is capable of upgrading an agent-launcher which in turn would upgrade and start an agent. So, normally you would just upgrade your server and let agent-bootstrapper take care of upgrading your agents automatically. Or one may choose to upgrade both server and agents by themselves.

Until version 16.6, the bootstrapper expected specific arguments to be passed to it, i.e. server-host-name and server-port. As part of 16.7, the bootstrapper was updated to accept a different set of arguments: a serverUrl, i.e. the SSL URL (instead of the hostname and port number as separate arguments), along with several other arguments for certificate verification. @arvindsv's comment here provides a little more detail about this. The same set of arguments is passed along to the launcher, which then uses them to upgrade an agent.

Coming to the issue: it was a case of a new bootstrapper running with an old version of the launcher, which should never happen. The upgrade process should have cleared out the older launcher, but it did not [bug]. As mentioned earlier, the older launcher expected the hostname and port to be provided, but the new bootstrapper passed along the serverUrl instead, and hence this code from the older launcher would throw an exception. The fix was to clean up the older launcher during the upgrade process.
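
The mismatch described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not GoCD's actual code: the dictionary keys ("serverUrl", "port"), the function names, and the example URL are all hypothetical, and Python's TypeError stands in for the Java NullPointerException seen in the stack trace.

```python
from urllib.parse import urlsplit


def new_bootstrapper_context(server_url):
    """Arguments in the 16.7+ style: a single serverUrl entry (assumed key name)."""
    return {"serverUrl": server_url}


def old_launcher_get_port(context):
    """Pre-16.7 style lookup: expects a separate 'port' entry (assumed key name)."""
    # context.get returns None for a missing key, and int(None) raises TypeError,
    # which plays the role of the NullPointerException in the stack trace.
    return int(context.get("port"))


def port_from_server_url(context):
    """What a launcher that understands serverUrl could do instead."""
    return urlsplit(context["serverUrl"]).port
```

Passing new_bootstrapper_context("https://go-server.example:8154/go") to old_launcher_get_port fails because no "port" entry exists, even though the port is still recoverable from the URL via port_from_server_url.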

So, in terms of who the fix applies to:

  • If you upgrade your server from 16.6 (or an earlier version) to 16.7 (or a later version), but do not upgrade your agents, things will continue to work. Of course, you will miss out on end-to-end security of agent-server communication (read more), but your agents will remain active and run your builds.
  • If you upgrade your server from 16.6 (or an earlier version) to 16.12 and upgrade your agents too, things will work and you will be able to enable end-to-end transport security for agent-server communication.
  • If you upgrade your server from 16.6 (or an earlier version) to 16.7 (or a later version), and upgrade your agent to any version from 16.7 to 16.11, you will be hit by this bug. The workaround is to delete the locally available launcher jar and restart the bootstrapper.

I hope it makes things clear now.
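
The three cases above can be condensed into a small decision function. This is a minimal sketch, assuming the server has already moved from 16.6 or earlier to 16.7 or later and that the agent started on 16.6 or earlier; the function names and result strings are hypothetical, and combinations not described in the list are reported as "not covered".

```python
def parse_version(text):
    """Turn a string such as '16.7' or '16.10.0' into a (major, minor) tuple."""
    return tuple(int(part) for part in text.split(".")[:2])


def upgrade_outcome(agent_before, agent_after):
    """Classify an agent upgrade according to the three cases listed above."""
    before = parse_version(agent_before)
    after = parse_version(agent_after)
    if before > (16, 6):
        return "not covered"  # the list only describes agents starting at 16.6 or earlier
    if after == before:
        return "works, without end-to-end security"
    if (16, 7) <= after <= (16, 11):
        return "affected: delete the launcher jar and restart the bootstrapper"
    if after >= (16, 12):
        return "works, end-to-end security available"
    return "not covered"
```

For instance, upgrade_outcome("16.1", "16.10.0") falls into the affected range, which matches the scenario reported in this issue if the agent had previously been on a pre-16.7 build.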

@arvindsv
Member

I hope it makes things clear now.

@jyotisingh: I am so tempted to say, "Not really". :P

@jyotisingh
Contributor

Then I would say, @arvindsv is amazing at explaining things and hand it over to him :)

@rajiesh
Contributor

rajiesh commented Dec 6, 2016

Verified on GoCD version 16.12.0 (4331-7cd56f28bf55417221a608140e72532de5501f5a), working as expected

@rajiesh rajiesh closed this as completed Dec 6, 2016
@lenucksi

lenucksi commented Dec 7, 2016

@arvindsv @jyotisingh The explanation actually was pretty good and helpful; I just did not get around to posting a reaction to it.

@jyotisingh jyotisingh removed their assignment Jun 12, 2019