Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Agent lost contact after upgrade to 19.8.0-9915 #6997

Open
mjt5055 opened this issue Sep 24, 2019 · 8 comments
Open

Agent lost contact after upgrade to 19.8.0-9915 #6997

mjt5055 opened this issue Sep 24, 2019 · 8 comments

Comments

@mjt5055
Copy link

@mjt5055 mjt5055 commented Sep 24, 2019

Issue Type

Bug Report

Summary

After upgrading go-server to 19.8.0 all agents are showing a LostContact status. I tried upgrading the agent to the latest version on one server and reinstalling the agent on a different server and neither fixed the connection error.

Environment

Go Server Version:
19.8.0 (9915-9ea99a72c338a132ae1ca83f363e16b2c95d920b)
JVM version:
12.0.1
OS Information:
Linux 4.4.0-164-generic
Database schema version:
1903003

Log snippets
javax.servlet.ServletException: org.springframework.web.util.NestedServletException: Request processing failed; nested exception is java.lang.NullPointerException
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.server.Server.handle(Server.java:502)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
        at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:411)
        at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:305)
        at org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:159)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
        at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.springframework.web.util.NestedServletException: Request processing failed; nested exception is java.lang.NullPointerException
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:982)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:861)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:867)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
        at org.tuckey.web.filters.urlrewrite.RuleChain.handleRewrite(RuleChain.java:176)
        at org.tuckey.web.filters.urlrewrite.RuleChain.doRules(RuleChain.java:145)
        at org.tuckey.web.filters.urlrewrite.UrlRewriter.processRequest(UrlRewriter.java:92)
        at org.tuckey.web.filters.urlrewrite.UrlRewriteFilter.doFilter(UrlRewriteFilter.java:381)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317)
        at com.thoughtworks.go.server.web.FlashLoadingFilter.doFilterInternal(FlashLoadingFilter.java:38)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
        at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:208)
        at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:185)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317)
        at com.thoughtworks.go.server.newsecurity.filters.DenyIfRefererIsNotFilesFilter.doFilterInternal(DenyIfRefererIsNotFilesFilter.java:52)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
        at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:214)
        at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:185)

@arvindsv

This comment has been minimized.

Copy link
Member

@arvindsv arvindsv commented Sep 24, 2019

Hello @mjt5055:

  1. What is the output of this?
curl -k -v https://your-go-server:8154/go/admin/latest-agent.status

... and also from within the node:

curl -k -v https://localhost:8154/go/admin/latest-agent.status
  1. What do the agent logs show?

  2. Can you upload the server logs file itself? This snippet might not be enough.

Thanks.

@mjt5055

This comment has been minimized.

Copy link
Author

@mjt5055 mjt5055 commented Sep 24, 2019

Hello @mjt5055:

  1. What is the output of this?
curl -k -v https://your-go-server:8154/go/admin/latest-agent.status
  Trying go-server...
Connected to go-server (go-server) port 8154 (#0)
found 148 certificates in /etc/ssl/certs/ca-certificates.crt
found 594 certificates in /etc/ssl/certs
ALPN, offering http/1.1
SSL connection using TLS1.2 / ECDHE_RSA_AES_256_GCM_SHA384
       server certificate verification SKIPPED
      server certificate status verification SKIPPED
       common name: SH-GOCD (does not match 'go-server')
       server certificate expiration date OK
       server certificate activation date OK
       certificate public key: RSA
       certificate version: #1
       subject: CN=SH-GOCD,OU=Cruise server webserver certificate
       start date: Thu, 01 Jan 1970 00:00:00 GMT
       expire date: Sun, 28 Mar 2027 16:40:08 GMT
      issuer: CN=SH-GOCD,OU=Cruise server webserver certificate
       compression: NULL
ALPN, server did not agree to a protocol
> GET /go/admin/latest-agent.status HTTP/1.1
> Host: go-server:8154
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Tue, 24 Sep 2019 19:55:05 GMT
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< X-Frame-Options: SAMEORIGIN
< X-UA-Compatible: chrome=1
< Set-Cookie: JSESSIONID=node0i5wy0i22vq8g18nutxbeni79k113.node0; Path=/go; Expires=Tue, 08-Oct-2019 19:55:05 GMT; Max-Age=1209600; Secure; HttpOnly
< Expires: Thu, 01 Jan 1970 00:00:00 GMT
< Agent-Content-MD5: 719e377e29d813420722282789c12dc9
< Agent-Launcher-Content-MD5: 59699f411657bd44394eeb5c80819be8
< Agent-Plugins-Content-MD5: db10c2c2bb60f91d4b762a903c745a9e
< TFS-SDK-Content-MD5: a101373f5e2ef88f52b45d836589d94a
< GoCD-Agent-Extra-Properties:
< Cruise-Server-Ssl-Port: 8154
< Vary: Accept-Encoding, User-Agent
< Content-Length: 0

... and also from within the node:

curl -k -v https://localhost:8154/go/admin/latest-agent.status
> curl -k -v https://localhost:8154/go/admin/latest-agent.status
   Trying ::1...
 Connected to localhost (::1) port 8154 (#0)
 found 148 certificates in /etc/ssl/certs/ca-certificates.crt
 found 592 certificates in /etc/ssl/certs
 ALPN, offering http/1.1
 SSL connection using TLS1.2 / ECDHE_RSA_AES_256_GCM_SHA384
        server certificate verification SKIPPED
        server certificate status verification SKIPPED
        common name: SH-GOCD (does not match 'localhost')
        server certificate expiration date OK
        server certificate activation date OK
        certificate public key: RSA
        certificate version: #1
        subject: CN=SH-GOCD,OU=Cruise server webserver certificate
        start date: Thu, 01 Jan 1970 00:00:00 GMT
        expire date: Sun, 28 Mar 2027 16:40:08 GMT
        issuer: CN=SH-GOCD,OU=Cruise server webserver certificate
        compression: NULL
 ALPN, server did not agree to a protocol
> GET /go/admin/latest-agent.status HTTP/1.1
> Host: localhost:8154
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Tue, 24 Sep 2019 20:01:55 GMT
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< X-Frame-Options: SAMEORIGIN
< X-UA-Compatible: chrome=1
< Set-Cookie: JSESSIONID=node01u424dil4ulhzrv6bapj7tyr7114.node0; Path=/go; Expires=Tue, 08-Oct-2019 20:01:55 GMT; Max-Age=1209600; Secure; HttpOnly
< Expires: Thu, 01 Jan 1970 00:00:00 GMT
< Agent-Content-MD5: 719e377e29d813420722282789c12dc9
< Agent-Launcher-Content-MD5: 59699f411657bd44394eeb5c80819be8
< Agent-Plugins-Content-MD5: db10c2c2bb60f91d4b762a903c745a9e
< TFS-SDK-Content-MD5: a101373f5e2ef88f52b45d836589d94a
< GoCD-Agent-Extra-Properties:
< Cruise-Server-Ssl-Port: 8154
< Vary: Accept-Encoding, User-Agent
< Content-Length: 0
<
* Connection #0 to host localhost left intact
  1. What do the agent logs show?
org.springframework.remoting.RemoteAccessException: Could not access HTTP invoker remote service at [https://go-server:8154/go/remoting/remoteBuildRepository]; nested exception is java.lang.RuntimeException$
        at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.convertHttpInvokerAccessException(HttpInvokerClientInterceptor.java:226)
        at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.invoke(HttpInvokerClientInterceptor.java:153)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
        at com.sun.proxy.$Proxy8.ping(Unknown Source)
        at com.thoughtworks.go.agent.AgentHTTPClientController.ping(AgentHTTPClientController.java:98)
        at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65)
        at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Couldn't load token from filesystem
        at com.thoughtworks.go.util.ExceptionUtils.bomb(ExceptionUtils.java:41)
        at com.thoughtworks.go.config.IdFileService.load(IdFileService.java:47)
        at com.thoughtworks.go.config.TokenService.load(TokenService.java:22)
        at com.thoughtworks.go.config.DefaultAgentRegistry.token(DefaultAgentRegistry.java:39)
        at com.thoughtworks.go.agent.GoHttpClientHttpInvokerRequestExecutor.doExecuteRequest(GoHttpClientHttpInvokerRequestExecutor.java:70)
        at org.springframework.remoting.httpinvoker.AbstractHttpInvokerRequestExecutor.executeRequest(AbstractHttpInvokerRequestExecutor.java:137)
        at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.executeRequest(HttpInvokerClientInterceptor.java:202)
        at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.executeRequest(HttpInvokerClientInterceptor.java:184)
        at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.invoke(HttpInvokerClientInterceptor.java:150)
        ... 16 common frames omitted
Caused by: java.io.FileNotFoundException: File 'config/token' does not exist
        at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:297)
        at org.apache.commons.io.FileUtils.readFileToString(FileUtils.java:1805)
        at com.thoughtworks.go.config.IdFileService.load(IdFileService.java:45)
        ... 23 common frames omitted
  1. Can you upload the server logs file itself? This snippet might not be enough.
    go-server.log.gz

Thanks.

@arvindsv

This comment has been minimized.

Copy link
Member

@arvindsv arvindsv commented Sep 24, 2019

Well, that is the real problem: java.io.FileNotFoundException: File 'config/token' does not exist

If you do this, it should bring the agent back:

  1. Stop the agent.
  2. Move the config/ directory to config.bak or something.
  3. Start the agent.

One side-effect will be that the agent will show up as a new agent and you will need to bring back the resources / environments.

Which version did you upgrade from? It looks like it was an old version which didn't have a token file.

@phazon90

This comment has been minimized.

Copy link

@phazon90 phazon90 commented Oct 7, 2019

I am having the same problem, we upgraded from version 18.6 to 19.8.

@arvindsv

This comment has been minimized.

Copy link
Member

@arvindsv arvindsv commented Oct 8, 2019

@ankitsri11 and I tried to reproduce these a few days ago: We found that if you upgrade from 17.11 or before, then the token file won't be present and all GoCD servers up to 19.5 ignore that.

Since 19.6, they mandate that the token file is present and the agent will fail to connect, with this message.

If you install GoCD agent 18.6 and then upgrade to 19.8, it should connect. If this message showed up, then the installed GoCD agent at some point was 17.11 or earlier.

At this point, you will need to follow the workaround in the comment above. That will cause it to get a token file and it will connect again.

@arvindsv

This comment has been minimized.

Copy link
Member

@arvindsv arvindsv commented Oct 8, 2019

Here is a summary of what we found:

2019_10_08_16-17-36

@FredrikWendt

This comment has been minimized.

Copy link

@FredrikWendt FredrikWendt commented Nov 15, 2019

The workaround instructions here are unclear: what config directory are we talking about? I can't find one under /usr/share/go-agent/

@FredrikWendt

This comment has been minimized.

Copy link

@FredrikWendt FredrikWendt commented Nov 15, 2019

... and the directory where the config is stored is (of course!) /var/lib/go-agent/.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.