Docker image building workflows are failing #1135
See #1130

It looks like the latest runs of most of the release workflows for our images failed, e.g. pytorch-operator, tf-serving-release, and notebook-release.

Comments
It looks to me like the dind sidecar got a SIGTERM.
main container logs
sidecar logs
It looks like the main container was in the middle of a build at 03:37 when the dind container received a SIGTERM.
Looking at the events for the cluster, I see recent events indicating disk pressure, but these don't match the timestamp of the pytorch-operator failure.
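If node disk pressure is what triggered the SIGTERM (the kubelet's pressure eviction terminates pods), one mitigation is to reserve and cap the build's scratch space explicitly. A minimal sketch of what that could look like, assuming these builds run as Kubernetes pods with a dind sidecar; the container name, image, volume, and sizes below are illustrative and not taken from the actual workflow:

```yaml
# Hypothetical snippet of a build pod spec; names, image, and sizes are illustrative.
spec:
  containers:
    - name: dind
      image: docker:dind
      securityContext:
        privileged: true              # dind needs a privileged container
      resources:
        requests:
          ephemeral-storage: "50Gi"   # ask the scheduler for a node with enough free disk
        limits:
          ephemeral-storage: "100Gi"  # an oversized build evicts only this pod, not random ones
      volumeMounts:
        - name: docker-graph
          mountPath: /var/lib/docker
  volumes:
    - name: docker-graph
      emptyDir:
        sizeLimit: "100Gi"            # cap the scratch space backing the image build
```

Requesting ephemeral-storage lets the scheduler avoid nodes that are already short on disk, and the limit plus emptyDir sizeLimit keeps a runaway build from pushing the whole node into disk pressure.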
jlewi added a commit to jlewi/kubeflow that referenced this issue on Jul 6, 2018:
* See if this makes the build more reliable and faster.
* We should really be setting disk space because it looks like (see kubeflow#1135) that is one resource that is under pressure. Need to figure out if that is easy to control with K8s.

Related to: kubeflow#1135 Docker image building workflows are failing; kubeflow#1132 Building Jupyter images took over 4 hours.
k8s-ci-robot pushed a commit that referenced this issue on Jul 6, 2018:
(#1136)
* See if this makes the build more reliable and faster.
* We should really be setting disk space because it looks like (see #1135) that is one resource that is under pressure. Need to figure out if that is easy to control with K8s.

Related to: #1135 Docker image building workflows are failing; #1132 Building Jupyter images took over 4 hours.
saffaalvi pushed a commit to StatCan/kubeflow that referenced this issue on Feb 11, 2021:
(kubeflow#1136)
* See if this makes the build more reliable and faster.
* We should really be setting disk space because it looks like (see kubeflow#1135) that is one resource that is under pressure. Need to figure out if that is easy to control with K8s.

Related to: kubeflow#1135 Docker image building workflows are failing; kubeflow#1132 Building Jupyter images took over 4 hours.
surajkota pushed a commit to surajkota/kubeflow that referenced this issue on Jun 13, 2022:
* Fix CNRM cluster package.
* namespace should be set in kustomization.yaml, not in the actual resource. If we set it in the resource, then kustomize subpackages that try to patch the resource won't be able to find the original resource.
* Change the name of the setter from "cluster-name" to "name".
* Divide up the GCP resources into various subpackages.
* Change cluster-name to name in asm.
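As context for the namespace point in the commit above, a minimal sketch of setting the namespace in kustomization.yaml instead of on the resource itself; the file and resource names here are hypothetical:

```yaml
# kustomization.yaml (hypothetical base package)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: kubeflow      # kustomize stamps this onto every resource at build time
resources:
  - cluster.yaml         # e.g. the CNRM cluster resource, with no namespace field of its own
```

Because the namespace is applied by kustomize rather than hard-coded in the resource manifest, an overlay or subpackage can still patch the resource by its original name without the match failing.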