Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(kubernetes): add livenessProbe #1253

Merged
merged 1 commit into from
Mar 21, 2019

Conversation

maggieneterval
Copy link
Contributor

Per discussion on spinnaker/spinnaker#4023, reuses readinessProbe as a livenessProbe so that containers that fail health checks are not only removed from the load balancer, but also restarted automatically.

@maggieneterval
Copy link
Contributor Author

Just discussed with @ezimanyi / @dibyom and one potential issue here is that a service's readinessProbe may depend on downstream services being ready. This means we would want a different livenessProbe. Wondering if a simple TCP health check on the container's port would work for this?

cc @davidxia, let us know if this kind of livenessProbe would have prevented the issue you described!

@dibyom
Copy link
Member

dibyom commented Mar 20, 2019

I like the idea of adding a TCP health check for the liveness probe!

@davidxia
Copy link
Contributor

@maggieneterval This PR's current changes looks like it'll make every spinnaker component's readinessProbe a livenessProbe as well?

The clouddriver readinessProbe is this right now.

    readinessProbe:
      exec:
        command:
        - wget
        - --no-check-certificate
        - --spider
        - -q
        - http://localhost:7002/health

I think a wget for a livenessProbe would've restarted clouddriver in my case of spinnaker/spinnaker#4023. But I don't know if a TCP probe would've restarted it.

@dibyom
Copy link
Member

dibyom commented Mar 21, 2019

I think a TCP health check should have worked in the spinnaker/spinnaker#4023 case as well since wget was failing to establish a connection to clouddriver:
Connecting to localhost|::1|:7002...

@maggieneterval
Copy link
Contributor Author

Thanks for looking into that @dibyom! It sounds like at least starting with a TCP health check for the livenessProbe is a good idea; I will update this PR later today.

@maggieneterval
Copy link
Contributor Author

@dibyom @ezimanyi should be ready for another look!

Copy link
Contributor

@ezimanyi ezimanyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants