Seldon deployment success/failure condition #255
Comments
You could use the "status" of the SeldonDeployment, which is updated as the SeldonDeployment is created.
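A minimal sketch of tracking that status from a script, in case it helps while the workflow condition is being debugged. It assumes the status block exposes a `state` field with the values `Available` and `Failed`; neither value is confirmed in this thread, so adjust them to what your cluster reports. The `fetch` callable is hypothetical and stands for whatever returns the current SeldonDeployment as a dict.

```python
import time
from typing import Callable, Optional

# Assumed values of status.state; neither is confirmed in this thread.
READY_STATE = "Available"
FAILED_STATE = "Failed"


def read_state(resource: dict) -> Optional[str]:
    """Return status.state, or None when the status block is missing."""
    status = resource.get("status") or {}
    return status.get("state")


def wait_for_deployment(
    fetch: Callable[[], dict],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the deployment is ready (True) or failed (False).

    Raises TimeoutError if neither state is seen within timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = read_state(fetch())
        if state == READY_STATE:
            return True
        if state == FAILED_STATE:
            return False
        if time.monotonic() >= deadline:
            raise TimeoutError("SeldonDeployment status never reached a final state")
        sleep(interval_s)
```

Note that a resource with no status block at all ends in a timeout here, not in a failure, which matches a condition that hangs until it times out.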
@cliveseldon I tried to add it, but the workflow does not seem to pick it up. It just hangs until it times out. Not sure why, as this command returns
Yes, that is strange. Is there anything in the Argo logs? You might need to dig into the Argo project to find out why this is not working.
Not much in the logs, looks like it is not getting back any data:
@cliveseldon any suggestions?
Can you see the "status" field in the SeldonDeployment? Does it match your condition?
Can you try with the latest 0.2.4-SNAPSHOT images?
@cliveseldon I updated to 0.2.4-SNAPSHOT, but it seems that the deployment is still not sending the status bit:
Also, the following returns nothing:
Which version of k8s are you running?
Can you run this once it is finally ready and running, to confirm that no status resource is there?
I am running Kubernetes 1.10.6. The model is live (I can use it to predict). Output of the command:
{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "annotations": {
            "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"machinelearning.seldon.io/v1alpha2\",\"kind\":\"SeldonDeployment\",\"metadata\":{\"annotations\":{},\"labels\":{\"app\":\"seldon\"},\"name\":\"dep1\",\"namespace\":\"seldon\"},\"spec\":{\"annotations\":{\"deployment_version\":\"v1\",\"project_name\":\"Prediction\"},\"name\":\"dep1\",\"oauth_key\":\"dep1-key1\",\"oauth_secret\":\"dep1-secret1\",\"predictors\":[{\"annotations\":{\"predictor_version\":\"v1\"},\"componentSpecs\":[{\"spec\":{\"containers\":[{\"image\":\"repo-name/dep1-lookup:1.7\",\"imagePullPolicy\":\"IfNotPresent\",\"name\":\"lookup\",\"resources\":{\"requests\":{\"memory\":\"1Mi\"}}}]}},{\"spec\":{\"containers\":[{\"image\":\"repo-name/dep1-prediction:2018-10-29-18-45-44\",\"imagePullPolicy\":\"IfNotPresent\",\"name\":\"predictor\",\"resources\":{\"requests\":{\"memory\":\"1Mi\"}}}],\"terminationGracePeriodSeconds\":20}},{\"spec\":{\"containers\":[{\"image\":\"repo-name/dep1-mapping:2018-10-29-18-45-44\",\"imagePullPolicy\":\"IfNotPresent\",\"name\":\"external-to-internal-mapping\",\"resources\":{\"requests\":{\"memory\":\"1Mi\"}}}],\"terminationGracePeriodSeconds\":20}}],\"graph\":{\"children\":[{\"children\":[{\"endpoint\":{\"type\":\"REST\"},\"name\":\"predictor\",\"type\":\"MODEL\"}],\"endpoint\":{\"type\":\"REST\"},\"name\":\"external-to-internal-mapping\",\"parameters\":[{\"name\":\"external_to_internal\",\"type\":\"BOOL\",\"value\":true}],\"type\":\"TRANSFORMER\"}],\"endpoint\":{\"type\":\"REST\"},\"name\":\"lookup\",\"type\":\"TRANSFORMER\"},\"name\":\"single-model\",\"replicas\":1}]}}\n"
        },
        "clusterName": "",
        "creationTimestamp": "2018-10-29T18:49:16Z",
        "generation": 1,
        "labels": {
            "app": "seldon"
        },
        "name": "dep1",
        "namespace": "seldon",
        "resourceVersion": "10677",
        "selfLink": "/apis/machinelearning.seldon.io/v1alpha2/namespaces/seldon/seldondeployments/dep1",
        "uid": "561fda07-dbab-11e8-b216-0216d208bed4"
    },
    "spec": {
        "annotations": {
            "deployment_version": "v1",
            "project_name": "Prediction"
        },
        "name": "dep1",
        "oauth_key": "dep1-key1",
        "oauth_secret": "dep1-secret1",
        "predictors": [
            {
                "annotations": {
                    "predictor_version": "v1"
                },
                "componentSpecs": [
                    {
                        "spec": {
                            "containers": [
                                {
                                    "image": "repo-name/dep1-lookup:1.7",
                                    "imagePullPolicy": "IfNotPresent",
                                    "name": "lookup",
                                    "resources": {
                                        "requests": {
                                            "memory": "1Mi"
                                        }
                                    }
                                }
                            ]
                        }
                    },
                    {
                        "spec": {
                            "containers": [
                                {
                                    "image": "repo-name/dep1-prediction:2018-10-29-18-45-44",
                                    "imagePullPolicy": "IfNotPresent",
                                    "name": "predictor",
                                    "resources": {
                                        "requests": {
                                            "memory": "1Mi"
                                        }
                                    }
                                }
                            ],
                            "terminationGracePeriodSeconds": 20
                        }
                    },
                    {
                        "spec": {
                            "containers": [
                                {
                                    "image": "repo-name/dep1-mapping:2018-10-29-18-45-44",
                                    "imagePullPolicy": "IfNotPresent",
                                    "name": "external-to-internal-mapping",
                                    "resources": {
                                        "requests": {
                                            "memory": "1Mi"
                                        }
                                    }
                                }
                            ],
                            "terminationGracePeriodSeconds": 20
                        }
                    }
                ],
                "graph": {
                    "children": [
                        {
                            "children": [
                                {
                                    "endpoint": {
                                        "type": "REST"
                                    },
                                    "name": "predictor",
                                    "type": "MODEL"
                                }
                            ],
                            "endpoint": {
                                "type": "REST"
                            },
                            "name": "external-to-internal-mapping",
                            "parameters": [
                                {
                                    "name": "external_to_internal",
                                    "type": "BOOL",
                                    "value": true
                                }
                            ],
                            "type": "TRANSFORMER"
                        }
                    ],
                    "endpoint": {
                        "type": "REST"
                    },
                    "name": "lookup",
                    "type": "TRANSFORMER"
                },
                "name": "single-model",
                "replicas": 1
            }
        ]
    }
}
Thanks. Are you able to confirm that the status is never set when you create the SeldonDeployment again? We need to understand whether the status is never set, or is initially set and then disappears.
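One way to tell those two cases apart is to save the resource on each poll and classify the sequence afterwards. The sketch below is only illustrative: `classify_status_history` and the labels it returns are hypothetical names, and collecting the snapshots (for example by dumping the resource every few seconds) is left out.

```python
from typing import Iterable


def classify_status_history(snapshots: Iterable[dict]) -> str:
    """Classify a time-ordered series of SeldonDeployment dicts.

    Returns "never-set" if no snapshot has a status block,
    "set-then-disappeared" if a status block was seen and later missing,
    and "present" if it appeared and was still there on every later poll.
    """
    seen = False
    for snapshot in snapshots:
        has_status = bool(snapshot.get("status"))
        if has_status:
            seen = True
        elif seen:
            # The status existed on an earlier poll and is gone now.
            return "set-then-disappeared"
    return "present" if seen else "never-set"
```

An empty or missing `status` key is treated as "no status", so a resource like the one dumped above counts toward "never-set".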
@cliveseldon I investigated this and these are the steps to reproduce:
Pull request #273 has changed when the status is set. Can you test with 0.2.4-SNAPSHOT?
@cliveseldon updated to 2.4.0. It works for the first deployment, but fails after re-deploying the same model a few times (the status disappears from the SeldonDeployment object).
Can you try with the 0.2.5-SNAPSHOT images? There was a recent fix that should resolve this.
@cliveseldon 0.2.5-SNAPSHOT is not getting published to Helm (using charts from here: https://storage.googleapis.com/seldon-charts)
I've pushed the 0.2.5-SNAPSHOT chart to Helm, if you can try it.
Assuming fixed. Please reopen if still an issue.
Any recommendation on how to set a success/failure condition in the deployment specs? Ideally, I want to track the deployment until it is all green. In the examples here there is a question mark.