Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding canary, shadow or explainer shouldn't affect main predictor #1110

Closed
ryandawsonuk opened this issue Nov 15, 2019 · 9 comments · Fixed by #1183
Closed

adding canary, shadow or explainer shouldn't affect main predictor #1110

ryandawsonuk opened this issue Nov 15, 2019 · 9 comments · Fixed by #1183
Assignees
Milestone

Comments

@ryandawsonuk
Copy link
Contributor

Check that main predictor doesn't get recreated. Even a rolling update to the main predictor is not ideal.

@ryandawsonuk ryandawsonuk added this to the 1.0 milestone Nov 15, 2019
@ryandawsonuk
Copy link
Contributor Author

Adding or removing a canary does not affect the main predictor so that part is good.

But adding or removing an explainer does lead to a rolling update of the main predictor.

@ryandawsonuk
Copy link
Contributor Author

The operator does check whether the deployment has changed. It logs Updating Deployment and the logs reveal that there's no difference at the container level. The difference actually comes out in an ENGINE_PREDICTOR environment variable:
image

@ryandawsonuk
Copy link
Contributor Author

ryandawsonuk commented Dec 2, 2019

The left is

eyJuYW1lIjoiZGVmYXVsdCIsImdyYXBoIjp7Im5hbWUiOiJjbGFzc2lmaWVyIiwidHlwZSI6Ik1PREVMIiwiaW1wbGVtZW50YXRpb24iOiJTS0xFQVJOX1NFUlZFUiIsImVuZHBvaW50Ijp7InNlcnZpY2VfaG9zdCI6ImxvY2FsaG9zdCIsInNlcnZpY2VfcG9ydCI6OTAwMCwidHlwZSI6IlJFU1QifSwibW9kZWxVcmkiOiJnczovL3NlbGRvbi1tb2RlbHMvc2tsZWFybi9pbmNvbWUvbW9kZWwifSwiY29tcG9uZW50U3BlY3MiOlt7Im1ldGFkYXRhIjp7ImNyZWF0aW9uVGltZXN0YW1wIjoiMjAxOS0xMi0wMlQxNDo1NTo1M1oifSwic3BlYyI6eyJjb250YWluZXJzIjpbeyJuYW1lIjoiY2xhc3NpZmllciIsImltYWdlIjoic2VsZG9uaW8vc2tsZWFybnNlcnZlcl9yZXN0OjAuMiIsInBvcnRzIjpbeyJuYW1lIjoiaHR0cCIsImNvbnRhaW5lclBvcnQiOjkwMDAsInByb3RvY29sIjoiVENQIn1dLCJlbnYiOlt7Im5hbWUiOiJQUkVESUNUSVZFX1VOSVRfU0VSVklDRV9QT1JUIiwidmFsdWUiOiI5MDAwIn0seyJuYW1lIjoiUFJFRElDVElWRV9VTklUX0lEIiwidmFsdWUiOiJjbGFzc2lmaWVyIn0seyJuYW1lIjoiUFJFRElDVE9SX0lEIiwidmFsdWUiOiJkZWZhdWx0In0seyJuYW1lIjoiU0VMRE9OX0RFUExPWU1FTlRfSUQiLCJ2YWx1ZSI6ImluY29tZSJ9LHsibmFtZSI6IlBSRURJQ1RJVkVfVU5JVF9QQVJBTUVURVJTIiwidmFsdWUiOiJbe1wibmFtZVwiOlwibW9kZWxfdXJpXCIsXCJ2YWx1ZVwiOlwiL21udC9tb2RlbHNcIixcInR5cGVcIjpcIlNUUklOR1wifV0ifV0sInJlc291cmNlcyI6e30sInZvbHVtZU1vdW50cyI6W3sibmFtZSI6InBvZGluZm8iLCJtb3VudFBhdGgiOiIvZXRjL3BvZGluZm8ifSx7Im5hbWUiOiJjbGFzc2lmaWVyLXByb3Zpc2lvbi1sb2NhdGlvbiIsInJlYWRPbmx5Ijp0cnVlLCJtb3VudFBhdGgiOiIvbW50L21vZGVscyJ9XSwibGl2ZW5lc3NQcm9iZSI6eyJ0Y3BTb2NrZXQiOnsicG9ydCI6Imh0dHAifSwiaW5pdGlhbERlbGF5U2Vjb25kcyI6NjAsInRpbWVvdXRTZWNvbmRzIjoxLCJwZXJpb2RTZWNvbmRzIjo1LCJzdWNjZXNzVGhyZXNob2xkIjoxLCJmYWlsdXJlVGhyZXNob2xkIjozfSwicmVhZGluZXNzUHJvYmUiOnsidGNwU29ja2V0Ijp7InBvcnQiOiJodHRwIn0sImluaXRpYWxEZWxheVNlY29uZHMiOjIwLCJ0aW1lb3V0U2Vjb25kcyI6MSwicGVyaW9kU2Vjb25kcyI6NSwic3VjY2Vzc1RocmVzaG9sZCI6MSwiZmFpbHVyZVRocmVzaG9sZCI6M30sImxpZmVjeWNsZSI6eyJwcmVTdG9wIjp7ImV4ZWMiOnsiY29tbWFuZCI6WyIvYmluL3NoIiwiLWMiLCIvYmluL3NsZWVwIDEwIl19fX0sInRlcm1pbmF0aW9uTWVzc2FnZVBhdGgiOiIvZGV2L3Rlcm1pbmF0aW9uLWxvZyIsInRlcm1pbmF0aW9uTWVzc2FnZVBvbGljeSI6IkZpbGUiLCJpbWFnZVB1bGxQb2xpY3kiOiJJZk5vdFByZXNlbnQifV19fV0sInJlcGxpY2FzIjoxLCJlbmdpbmVSZXNvdXJjZXMiOnt9LCJsYWJlbHMiOnsidmVyc2lvbiI6ImRlZmF1bHQifSwic3ZjT3JjaFNwZWMiOnt9LCJleHBsYWluZXIiOnsidHlwZSI6IkFuY2hvclRhYnVsYXIiLCJtb2RlbFVyaSI6ImdzOi8vc2VsZG9uLW1vZGVscy9za2xlYXJuL2luY29tZS9leHBsYWluZXIiLCJjb250YWluZXJTcGVjIjp7Im5hbWUiOiIiLCJyZXNvdXJjZXMiOnt9fX19

And the right is:

eyJuYW1lIjoiZGVmYXVsdCIsImdyYXBoIjp7Im5hbWUiOiJjbGFzc2lmaWVyIiwidHlwZSI6Ik1PREVMIiwiaW1wbGVtZW50YXRpb24iOiJTS0xFQVJOX1NFUlZFUiIsImVuZHBvaW50Ijp7InNlcnZpY2VfaG9zdCI6ImxvY2FsaG9zdCIsInNlcnZpY2VfcG9ydCI6OTAwMCwidHlwZSI6IlJFU1QifSwibW9kZWxVcmkiOiJnczovL3NlbGRvbi1tb2RlbHMvc2tsZWFybi9pbmNvbWUvbW9kZWwifSwiY29tcG9uZW50U3BlY3MiOlt7Im1ldGFkYXRhIjp7ImNyZWF0aW9uVGltZXN0YW1wIjoiMjAxOS0xMi0wMlQxNDo1NDo1OFoifSwic3BlYyI6eyJjb250YWluZXJzIjpbeyJuYW1lIjoiY2xhc3NpZmllciIsImltYWdlIjoic2VsZG9uaW8vc2tsZWFybnNlcnZlcl9yZXN0OjAuMiIsInBvcnRzIjpbeyJuYW1lIjoiaHR0cCIsImNvbnRhaW5lclBvcnQiOjkwMDAsInByb3RvY29sIjoiVENQIn1dLCJlbnYiOlt7Im5hbWUiOiJQUkVESUNUSVZFX1VOSVRfU0VSVklDRV9QT1JUIiwidmFsdWUiOiI5MDAwIn0seyJuYW1lIjoiUFJFRElDVElWRV9VTklUX0lEIiwidmFsdWUiOiJjbGFzc2lmaWVyIn0seyJuYW1lIjoiUFJFRElDVE9SX0lEIiwidmFsdWUiOiJkZWZhdWx0In0seyJuYW1lIjoiU0VMRE9OX0RFUExPWU1FTlRfSUQiLCJ2YWx1ZSI6ImluY29tZSJ9LHsibmFtZSI6IlBSRURJQ1RJVkVfVU5JVF9QQVJBTUVURVJTIiwidmFsdWUiOiJbe1wibmFtZVwiOlwibW9kZWxfdXJpXCIsXCJ2YWx1ZVwiOlwiL21udC9tb2RlbHNcIixcInR5cGVcIjpcIlNUUklOR1wifV0ifV0sInJlc291cmNlcyI6e30sInZvbHVtZU1vdW50cyI6W3sibmFtZSI6InBvZGluZm8iLCJtb3VudFBhdGgiOiIvZXRjL3BvZGluZm8ifSx7Im5hbWUiOiJjbGFzc2lmaWVyLXByb3Zpc2lvbi1sb2NhdGlvbiIsInJlYWRPbmx5Ijp0cnVlLCJtb3VudFBhdGgiOiIvbW50L21vZGVscyJ9XSwibGl2ZW5lc3NQcm9iZSI6eyJ0Y3BTb2NrZXQiOnsicG9ydCI6Imh0dHAifSwiaW5pdGlhbERlbGF5U2Vjb25kcyI6NjAsInRpbWVvdXRTZWNvbmRzIjoxLCJwZXJpb2RTZWNvbmRzIjo1LCJzdWNjZXNzVGhyZXNob2xkIjoxLCJmYWlsdXJlVGhyZXNob2xkIjozfSwicmVhZGluZXNzUHJvYmUiOnsidGNwU29ja2V0Ijp7InBvcnQiOiJodHRwIn0sImluaXRpYWxEZWxheVNlY29uZHMiOjIwLCJ0aW1lb3V0U2Vjb25kcyI6MSwicGVyaW9kU2Vjb25kcyI6NSwic3VjY2Vzc1RocmVzaG9sZCI6MSwiZmFpbHVyZVRocmVzaG9sZCI6M30sImxpZmVjeWNsZSI6eyJwcmVTdG9wIjp7ImV4ZWMiOnsiY29tbWFuZCI6WyIvYmluL3NoIiwiLWMiLCIvYmluL3NsZWVwIDEwIl19fX0sInRlcm1pbmF0aW9uTWVzc2FnZVBhdGgiOiIvZGV2L3Rlcm1pbmF0aW9uLWxvZyIsInRlcm1pbmF0aW9uTWVzc2FnZVBvbGljeSI6IkZpbGUiLCJpbWFnZVB1bGxQb2xpY3kiOiJJZk5vdFByZXNlbnQifV19fV0sInJlcGxpY2FzIjoxLCJlbmdpbmVSZXNvdXJjZXMiOnt9LCJsYWJlbHMiOnsidmVyc2lvbiI6ImRlZmF1bHQifSwic3ZjT3JjaFNwZWMiOnt9LCJleHBsYWluZXIiOnsiY29udGFpbmVyU3BlYyI6eyJuYW1lIjoiIiwicmVzb3VyY2VzIjp7fX19fQ==

@ryandawsonuk
Copy link
Contributor Author

base64 decoded these are
image

@ryandawsonuk
Copy link
Contributor Author

This actually makes sense for an explainer as the explainer section is part of the predictor in the SeldonDeployment. That SeldonDeployment is fed into the engine through the env var as the engine reads it and uses it to orchestrate the graph. So we are updating the Deployment in order to feed in the new env var value containing an encoded version of the latest SeldonDeployment spec. We could leave the Deployment with the old version of the env var (because the explainer isn't on the main flow of the graph) but it feels appropriate to use the fresh version.

@ryandawsonuk
Copy link
Contributor Author

ryandawsonuk commented Dec 2, 2019

Actually we do get a rolling update and again a change to ENGINE_PREDICTOR for an istio shadow.

But strangely not with an istio canary. I took the ENGINE_PREDICTOR from the main deployment with the canary in play and found in that case the traffic field doesn't get passed through, which explains why it doesn't see a change.

Not sure yet why the shadow seems to cause an update.

@ryandawsonuk
Copy link
Contributor Author

Hm, for the shadow the only difference is again through the ENGINE_PREDICTOR and in the base64-encoded content it's just the timestamps that differ:

image

@ryandawsonuk
Copy link
Contributor Author

Here I've not set the timestamp in the original yaml so it's getting added by the webhook.

I think this could affect canary too. I suspect I just didn't hit this problem with the canary earlier because the timestamp then was being set on the original yaml.

@ryandawsonuk
Copy link
Contributor Author

I'm thinking a path here might be to copy the resource and remove parts (e.g. the meta section, the explainer section) from the copy before serializing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant