Service will break for timed blue/green deployment by kubernetes #25

Closed
wangyoucao577 opened this issue Jun 21, 2019 · 5 comments

wangyoucao577 commented Jun 21, 2019

Following the draft design, OSRM with Telenav Traffic Design (Draft), we hope to use timed blue/green deployment on Kubernetes to support live traffic updates without any service interruption.
Several issues came up during testing:

  1. The startup process (pull the latest traffic and dump it to the traffic.csv file, osrm-customize, osrm-routed) takes more than 15 minutes in total for North America.
    • It's not critical, but it is too long a delay before live traffic takes effect. We have to figure out a way to shorten it.
  2. The blue/green deployment terminates the old container as soon as the new container is running. But as noted above, it takes about 15 minutes from the new container starting until its service is ready, so the service is unavailable for those 15 minutes.
    • It would be good if Kubernetes had a setting that holds off switching the service to the new container until the new container's service is ready, e.g. something like a heartbeat check. I'm not sure whether Kubernetes has such a setting; this needs some research.
wangyoucao577 changed the title from "Timed blue/green deployment by kubernetes can not work" to "Timed blue/green deployment by kubernetes works not good" Jun 21, 2019
wangyoucao577 added Design Documentation Prototype Proof of concept labels Jun 21, 2019
wangyoucao577 changed the title from "Timed blue/green deployment by kubernetes works not good" to "Service will break for timed blue/green deployment by kubernetes" Jun 21, 2019
@wangyoucao577

The Kubernetes readiness probe looks like a fit for this requirement. See details at: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
I will give it a try.


wangyoucao577 commented Jun 24, 2019

It works well in my local test: the pod is not marked ready until the probe succeeds. Ready to test it in the real environment.

          readinessProbe:
            tcpSocket:
              port: 5000
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 1000

wangyoucao577 self-assigned this Jun 24, 2019

wangyoucao577 commented Jun 24, 2019

After learning more about Kubernetes deployments, I think the rolling update strategy will be better than blue/green for our OSRM deployment with traffic updates. I will give it a try.

References:

We'll also need a timed mechanism to trigger the rolling update every few tens of minutes, since we update traffic at container startup.
For triggering the rolling update, this workaround looks good: kubernetes/kubernetes#13488 (comment)
I think we can have a crontab entry or a Jenkins job run a command like the one below:

kubectl set env  --env="LAST_MANUAL_RESTART=$(date -u +%Y%m%dT%H%M%S)" deploy/osrm

Also, we can switch to the better approach, kubectl rollout restart, once we can upgrade Kubernetes to 1.15 (we currently use 1.14). See kubernetes/kubernetes#13488 (comment) for details.


CodeBear801 commented Jun 26, 2019

Agreed. There is also a related conversation here.

  • Blue/green is mainly for running a pair of production environments side by side to roll out a new feature. It needs two identical sets of hardware, and production requests can be used to test the new feature; if everything meets expectations, Kubernetes switches all load to the new deployment.

  • Rolling update is the common way to update an already-tested app, and it costs less than blue/green.

  • Say we have 4 pods for routing: blue/green needs 4 additional containers for routing, while a rolling update works by preparing 1 new pod and stopping 1 old pod, with Kubernetes balancing load across the 4 pods. Step by step, it retires the old pods and starts new ones.

  • For traffic updates we don't have a new feature to roll out; we just need to restart the app internally, so I think rolling update should be the strategy.
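
For illustration, the "prepare 1 new pod, stop 1 old pod" behaviour described above could be expressed roughly as the Deployment below. This is only a sketch under assumptions: the name osrm matches the deploy/osrm target used earlier in the thread, but the app label, the container name, and the image are hypothetical placeholders, and the readiness probe simply reuses the values tested earlier.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: osrm                 # matches deploy/osrm from the kubectl command in this thread
spec:
  replicas: 4                # the "4 pods for routing" example
  selector:
    matchLabels:
      app: osrm              # hypothetical label
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1            # start at most 1 extra pod at a time
      maxUnavailable: 0      # keep every old pod serving until its replacement is Ready
  template:
    metadata:
      labels:
        app: osrm            # hypothetical label, must match the selector
    spec:
      containers:
        - name: osrm         # hypothetical container name
          image: example/osrm-backend:latest   # hypothetical image
          ports:
            - containerPort: 5000
          readinessProbe:    # same probe settings as tested earlier in this thread
            tcpSocket:
              port: 5000
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 1000
```

With maxUnavailable set to 0, an old pod is only removed once a new pod has passed its readiness probe, so the roughly 15 minute startup should lengthen the rollout without reducing serving capacity. The trade-off is that the cluster needs room for one extra pod during the update.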
