node-build-monitor stops working after error and did not recover #100

Closed
ghost opened this issue Jan 10, 2018 · 1 comment

ghost commented Jan 10, 2018

Our setup:

  • node-build-monitor with GitLab 10.3.2-ee on Kubernetes 1.8

What I saw:
After 5 days of normal operation, the node-build-monitor HTTP endpoint returned a 504 Gateway Timeout. According to the application's log, an undefined error occurred (see below) and the Docker container stopped serving traffic.

Log:

7:26:32 AM | Check for builds...
7:29:42 AM | 10 builds found....
7:30:13 AM | Check for builds...
7:33:07 AM | 10 builds found....
7:33:43 AM | Check for builds...
**********************************************************************
An error occured when fetching builds for the following configuration:
----------------------------------------------------------------------
undefined
----------------------------------------------------------------------

{ Error: socket hang up
    at TLSSocket.onHangUp (_tls_wrap.js:1116:19)
    at Object.onceWrapper (events.js:293:19)
    at emitNone (events.js:91:20)
    at TLSSocket.emit (events.js:188:7)
    at endReadableNT (_stream_readable.js:974:12)
    at _combinedTickCallback (internal/process/next_tick.js:80:11)
    at process._tickCallback (internal/process/next_tick.js:104:9) code: 'ECONNRESET' }
**********************************************************************


7:37:51 AM | 0 builds found....
7:37:54 AM | builds changed
7:39:59 AM | Check for builds...

What I expected:
The application would go back to normal operation.

What I did and could do to solve the problem temporarily:
After a restart of the node-build-monitor pod/container, the application resumed normal operation.
To keep this from happening again, I could simply configure a health check on the service's HTTP endpoint, so that Kubernetes restarts the application automatically when it stops responding.
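
For reference, such a health check could look roughly like the fragment below. This is only a sketch of a standard Kubernetes `livenessProbe` on a container spec; the path, the port and all timing values are assumptions and would need to match the actual deployment.

```yaml
# Fragment of a container spec inside a Deployment (illustrative values only).
livenessProbe:
  httpGet:
    path: /              # assumption: the monitor page is served at the root path
    port: 3000           # assumption: container port, adjust to your setup
  initialDelaySeconds: 30  # give the app time to start before the first probe
  periodSeconds: 30        # probe every 30 seconds
  timeoutSeconds: 5        # a probe taking longer than this counts as failed
  failureThreshold: 3      # restart the container after 3 consecutive failures
```

With settings like these, a container that stops answering HTTP requests would be restarted after about 90 seconds of failed probes.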

Now, this may solve the problem for me on Kubernetes, but it would be more convenient if the application recovered by itself. Is this a known issue, and is a fix planned? =)
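
To illustrate what I mean by recovering by itself, here is a minimal sketch of a polling loop that survives a failed fetch such as the `ECONNRESET` above. All names (`nextDelay`, `pollOnce`, `startPolling`, `fetchBuilds`) and the interval values are hypothetical and not taken from node-build-monitor's source. I also don't know the actual root cause, so this only shows the general pattern.

```javascript
// Hypothetical sketch: none of these names come from node-build-monitor.

// Delay before the next poll. It doubles with each consecutive failure
// and is capped at maxMs.
function nextDelay(failures, baseMs, maxMs) {
  return Math.min(baseMs * 2 ** failures, maxMs);
}

// Runs one poll cycle. A failed fetch (e.g. ECONNRESET) is logged and
// counted instead of propagating, so the caller can always schedule
// the next cycle.
async function pollOnce(fetchBuilds, state) {
  try {
    state.builds = await fetchBuilds();
    state.failures = 0;
  } catch (err) {
    // Design choice: keep the last known builds instead of clearing them.
    state.failures += 1;
    console.error(`fetch failed (${err.code || err.message}), failure #${state.failures}`);
  }
  return state;
}

// Schedules poll cycles until stop() is called.
function startPolling(fetchBuilds, { baseMs = 30000, maxMs = 300000 } = {}) {
  const state = { builds: [], failures: 0 };
  let timer = null;
  let stopped = false;

  const tick = async () => {
    await pollOnce(fetchBuilds, state);
    if (!stopped) {
      timer = setTimeout(tick, nextDelay(state.failures, baseMs, maxMs));
    }
  };

  tick();
  return {
    state,
    stop() {
      stopped = true;
      clearTimeout(timer);
    },
  };
}
```

Calling `startPolling(fetchBuilds)` with a function that returns a promise of builds would keep polling through errors. A request that never settles at all would additionally need a timeout, which this sketch does not cover.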

@marcells marcells added the bug label Jan 10, 2018
marcells (Owner) commented

Hm, looks like it is a bit difficult to reproduce. I also have no running GitLab instance. I'll try it with other services.
