Adds rationale for healthcheck config #2812

Merged: 1 commit, Sep 21, 2019
25 changes: 25 additions & 0 deletions docker/Dockerfile
@@ -38,7 +38,32 @@ USER zipkin

EXPOSE 9410 9411

# This health check was added for the Docker Hub automated test service. Its
# parameters were changed in order to mark success faster. You may want to
# change these further in production.
#
# By default, the Docker health check runs after 30s, and if a failure occurs,
# it waits 30s to try again. This implies a minimum of 30s before the server is
# marked healthy.
#
# https://docs.docker.com/engine/reference/builder/#healthcheck
#
# We expect the server startup to take less than 10 seconds, even in a fresh
# start. Some health checks will trigger a slow "first request" due to schema
# setup (e.g. this is the case with Elasticsearch and Cassandra). However, we
# don't want to force an initial delay of 30s as the defaults would.
#
# Instead, we lower the interval and timeout from 30s to 5s. If a server starts
# in 7s and takes another 2s to install schema, it can still pass in 10s vs 30s.
#
# We keep a 30s start period even though that would be an excessively long
# startup. This is to accommodate test containers, which can boot slower than
# production sites, and any knock-on effects of that, like slow dependent
# storage containers which are simultaneously bootstrapping. If you see a 30s
# startup in production, please report it to https://gitter.im/openzipkin/zipkin
# including the values of the /health and /info endpoints, as this would be
# unexpected.
#
HEALTHCHECK --interval=5s --start-period=30s --timeout=5s CMD wget --quiet http://localhost:9411/health || exit 1

@anuraaga FWIW with the above comments, I'm fine with the parameters. This is mainly defense, as mentioned a few different ways. Java raises antibodies for a lot of people who have a different favorite language.

I think you'll recall that, over the years, people who don't like Java have obsessed over image size, memory (with sometimes dodgy accounting of it), and startup time. It has been my experience that anything that can be used against us will be. With the above comments, I think someone would have to be quite unprofessional to leap to the conclusion that we expect our server to start up in 30s. This was my goal here: to reduce the amount of time lost, and potential sites lost, over the FUD spread by folks who hate Java.


ENTRYPOINT ["/busybox/sh", "run.sh"]
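
The Dockerfile comment notes that these parameters may need further changes in production. The following is only a sketch of one way to do that, not part of this change: a derived image that replaces the health check. The base image tag and the specific values (30s interval, 10s timeout, 3 retries) are assumptions chosen for illustration; the probe command is copied from the diff above.

```dockerfile
# Hypothetical derived image; the tag and the timing values are illustrative.
FROM openzipkin/zipkin

# A later HEALTHCHECK instruction replaces the one inherited from the base
# image. Here the check runs less often than the 5s used above, and a
# container is only marked unhealthy after 3 consecutive failures.
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
  CMD wget --quiet http://localhost:9411/health || exit 1
```

Longer intervals reduce probe traffic at the cost of slower detection, so values like these would need tuning per site.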