Ensure container/model is ready to be queried #447

Merged: 67 commits merged into ucbrise:develop from docker_healthcheck on Apr 8, 2018

chester-leung
Member

Implemented a Docker healthcheck and a Kubernetes readiness probe to ensure that a container/model is ready to be queried. This change blocks deploy_model() until we get explicit notification that the container/model is running.
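
As a rough illustration of the blocking behavior described above, here is a minimal polling sketch. It is not the code from this PR: `wait_until_ready`, its parameters, and the `is_ready` callback are hypothetical names, and the real readiness signal (Docker health status or Kubernetes readiness) is assumed to be wrapped inside that callback.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_ready() until it returns True or the timeout elapses.

    Hypothetical helper: is_ready is assumed to wrap whatever check the
    container manager exposes (for example, a container health status).
    Returns True if the container became ready, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

A caller in the spirit of deploy_model() could raise an error when this returns False instead of returning control to the user with a container that cannot yet serve queries.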

Also changes pull policy in query frontend and management frontend such that new Docker images are always pulled.
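
For the Kubernetes side, a container spec that combines a readiness probe with an always-pull policy might look roughly like the sketch below. `imagePullPolicy` and `readinessProbe` are standard Kubernetes container fields; the function name, the timing values, and the probe command in the usage note are assumptions for illustration, not the values used in this PR.

```python
from typing import Any, Dict, List


def build_container_spec(name: str, image: str, probe_command: List[str]) -> Dict[str, Any]:
    """Build a Kubernetes container spec as a plain dict.

    Hypothetical helper. The probe timings below are illustrative
    assumptions, not values taken from this PR.
    """
    return {
        "name": name,
        "image": image,
        # Always pull, so a rebuilt image replaces any cached copy on the node.
        "imagePullPolicy": "Always",
        # The pod is only marked ready once this command exits with status 0.
        "readinessProbe": {
            "exec": {"command": probe_command},
            "initialDelaySeconds": 3,
            "periodSeconds": 3,
        },
    }
```

For example, `build_container_spec("model", "example/model:latest", ["test", "-f", "/tmp/ready"])` yields a spec whose probe passes once the (hypothetical) marker file `/tmp/ready` exists.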

Fixes #436, Fixes #425, Fixes #152

chester-leung and others added 30 commits January 21, 2018 18:53
Added docker/k8s container checking
@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/1190/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/1192/

@dcrankshaw
Contributor

Jenkins test this please

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/1232/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/1275/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/1276/

@dcrankshaw
Contributor

@chester-leung what's the status of this? Are you waiting on a review?

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/1285/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/1286/

@chester-leung
Member Author

@dcrankshaw I'm trying to figure out why the build is failing - the tests are all passing when I run them locally. I encountered a problem locally similar to the current error when I hadn't rebuilt the py-rpc docker image to have the healthcheck. I think somehow the healthcheck isn't being set for the containers when Jenkins runs the tests.
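
One way to confirm whether a container actually has a healthcheck configured is to look at its `docker inspect` JSON. The sketch below assumes the caller has already parsed one entry of that output into a dict; the helper names are hypothetical and are not part of Clipper. It relies on the `Config.Healthcheck.Test` and `State.Health.Status` fields of the inspect output.

```python
from typing import Any, Dict, Optional


def has_healthcheck(inspect_entry: Dict[str, Any]) -> bool:
    """Return True if the inspected container defines an active healthcheck.

    inspect_entry is assumed to be one element of the JSON list printed by
    `docker inspect <container>`. A Test value of ["NONE"] means the
    healthcheck was explicitly disabled.
    """
    healthcheck = (inspect_entry.get("Config") or {}).get("Healthcheck") or {}
    test_cmd = healthcheck.get("Test") or []
    return bool(test_cmd) and test_cmd[0] != "NONE"


def health_status(inspect_entry: Dict[str, Any]) -> Optional[str]:
    """Return the reported health status, or None if no health state exists.

    Containers without a healthcheck have no State.Health section at all,
    which is the situation suspected in the comment above.
    """
    health = (inspect_entry.get("State") or {}).get("Health") or {}
    return health.get("Status")
```

If `has_healthcheck` returns False for a container started from a supposedly rebuilt image, the test environment is probably still using a stale image without the healthcheck.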

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/1302/

@chester-leung
Member Author

Jenkins test this please

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/1305/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/1311/

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Clipper-PRB/1316/

Contributor

@dcrankshaw left a comment

lgtm

@dcrankshaw merged commit 0d89cb2 into ucbrise:develop Apr 8, 2018
@chester-leung deleted the docker_healthcheck branch April 8, 2018 20:32