Being able to do A/B testing - Canary #1164

Closed
guilhem opened this issue Feb 15, 2017 · 23 comments

@guilhem
Contributor

@guilhem guilhem commented Feb 15, 2017

When using an orchestrator (like Marathon), we often want to "bulletproof" software releases by deploying a new version on a small set of instances.

The current workaround is to set traefik.backend to the same value in the A and B versions.
It works, but it prevents creating a frontend that targets a specific version.

What is possible:

  1. Frontend with multiple backends
    [diagram: solution 1]
  2. Virtual backends (which only redirect to other backends)
    [diagram: solution 2]
  3. Backends with nested backends
    [diagram: solution 3]

What do you think about this?

@SantoDE

Contributor

@SantoDE SantoDE commented Feb 20, 2017

Actually, I'm not sure but I'm thinking about that issue as well :) I feel like option 1 is my favorite but I'm really not sure.

@guilhem

Contributor Author

@guilhem guilhem commented Feb 21, 2017

Solution 1 is really attractive because it's the most logical, but it has many problems.
The most important is the duplication of configuration between frontend and backend (wrr, ...).

After more reflection, I think solution 3 covers every case without extra configuration. A nested backend would simply behave like any other server, and solution 3 can do everything solution 2 can.
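
To make the nested-backend idea (solution 3) a bit more concrete, here is a minimal Python sketch of how nested weights could flatten into per-server traffic shares. This is only an illustration of the proposal, not Traefik code or configuration: the `BACKENDS` layout, the `flatten` helper, the backend names, and the addresses are all hypothetical.

```python
from fractions import Fraction

# Hypothetical model: each backend maps member names to weights.
# A member is either a plain server or the name of another backend.
BACKENDS = {
    "app": {"app-v1": 9, "app-v2": 1},  # 90% / 10% split between versions
    "app-v1": {"10.0.0.1:80": 1, "10.0.0.2:80": 1},
    "app-v2": {"10.0.0.3:80": 1},
}


def flatten(name, backends, share=Fraction(1)):
    """Return {server: fraction of traffic} for the backend called `name`.

    Assumes the nesting has no cycles; a cycle would recurse forever.
    """
    members = backends[name]
    total = sum(members.values())
    result = {}
    for member, weight in members.items():
        member_share = share * Fraction(weight, total)
        if member in backends:
            # Nested backend: spread this member's share over its own servers.
            for server, part in flatten(member, backends, member_share).items():
                result[server] = result.get(server, 0) + part
        else:
            result[member] = result.get(member, 0) + member_share
    return result


if __name__ == "__main__":
    for server, part in flatten("app", BACKENDS).items():
        print(server, part)
```

In this model a frontend could still point at `app-v2` directly to target one version, while `app` mixes both, which is the property the shared traefik.backend workaround lacks.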

@ldez ldez added the kind/question label Apr 23, 2017
@gsemet

Contributor

@gsemet gsemet commented May 11, 2017

Is it possible to document this pattern? I wonder how it can work and what needs to be done on Marathon to do it? Apparently, Kubernetes does it already.

@timoreimann

Member

@timoreimann timoreimann commented May 11, 2017

@stibbons curious, how does Kubernetes do this?

The one approach I'm aware of is to spawn more Pods and adjust service labels accordingly, which all seems to be below the proxy/Ingress level that Traefik operates on.

@gsemet

Contributor

@gsemet gsemet commented May 11, 2017

Sorry, I should not have said "Kubernetes does it already"; let me correct that statement. It should read: Kubernetes allows rolling updates with a time delay/period (the --update-period argument of kubectl rolling-update). This is not strictly a controlled A/B testing deployment. To my understanding, Marathon does not provide it, and the K8s Replication Controller is probably easier to drive to perform an A/B deployment than the Marathon API.

In any case, if Traefik could document in the official docs how to configure both K8s and Marathon to get an A/B deployment with Traefik, that would be simply extraordinaire!

@ldez ldez added the priority/P2 label Jun 9, 2017
@bitsofinfo

Contributor

@bitsofinfo bitsofinfo commented Apr 17, 2018

I also really have a need for this. Right now I have multiple swarm services (A/B) both declaring the same frontend Host label to attempt this kind of thing, but it doesn't work: only the first frontend declared with a Host label gets the traffic.

@bitsofinfo

Contributor

@bitsofinfo bitsofinfo commented Apr 17, 2018

@guilhem in your diagram above, where you say "what is possible": are any of those options currently possible? For example, "frontend with multiple backends"... I don't think this is currently possible, correct?

@doreplado


@doreplado doreplado commented Apr 17, 2018

@bitsofinfo correct, those are proposals, they are not currently supported. I validated the same thing in the slack channel the other day. :) I also tried the suggestion of overriding the traefik.backend to have both stacks use the same backend but Rancher backend doesn't support it. So I'm watching for updates here.

@bitsofinfo

Contributor

@bitsofinfo bitsofinfo commented Apr 17, 2018

So if I want to accomplish this today... if I have 2 separate docker services (A/B) and they both share the same -traefik.frontend.rule=Host:myhosts.test.com and -traefik.backend=backend-one, this will create ONE frontend (or two frontends?) sharing the same Host:myhosts.test.com binding, but only one of them will actually take the requests. But since it goes to this shared backend (backend-one), and replicas from BOTH A/B docker services exist in it, this effectively gives me pseudo 50% A and 50% B containers servicing the requests... correct?

@doreplado


@doreplado doreplado commented Apr 17, 2018

@bitsofinfo I'm not on the Traefik team but AFAIK, neither of those configs is supported. It MAY work as indicated by @guilhem since the marathon backend seems to work, but your mileage may vary. So to clarify, today there are no officially supported or functional canary - A/B config scenarios. Anyone else, feel free to correct me if I've misspoken.

@bitsofinfo

Contributor

@bitsofinfo bitsofinfo commented Apr 17, 2018

For anyone interested, this is the only way I can get it to "work" (traefik latest)

Launch 2 nginx services

docker service create --name nginx-1-0 --network omg-app-dev-int \
--mount type=bind,source=/tmp/nginx-1-0.html,target=/usr/share/nginx/html/index.html \
--label traefik.enable=true \
--label traefik.protocol=http \
--label traefik.port=80 \
--label traefik.frontend.rule=Host:nginx-current.mydomain.com \
--label traefik.docker.network=omg-app-dev-int \
--label traefik.backend=nginx-prod \
nginx

docker service create --name nginx-2-0 --network omg-app-dev-int \
--mount type=bind,source=/tmp/nginx-2-0.html,target=/usr/share/nginx/html/index.html \
--label traefik.enable=true \
--label traefik.protocol=http \
--label traefik.port=80 \
--label traefik.frontend.rule=Host:nginx-current.mydomain.com \
--label traefik.docker.network=omg-app-dev-int \
--label traefik.backend=nginx-prod \
nginx

Traefik dashboard results in:

[screenshot: 2018-04-17 at 1:04:30 pm]

Repeated requests for the fqdn, bounce between both A/B versions

[screenshot: 2018-04-17 at 1:05:33 pm]

[screenshot: 2018-04-17 at 1:05:25 pm]

Not sure exactly which frontend is actually picking up the requests, but sort of don't care. One of the front-ends seems useless I guess. Would definitely be nice to just have Host matches ACROSS front-ends, round-robin appropriately to the different backends, without having to do the traefik.backend=[name] thing
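
As a rough way to reason about the split this workaround produces, the Python sketch below simulates plain round-robin over one pooled backend. It assumes every server entry in the shared backend gets an equal turn; whether an entry corresponds to a whole service or to a single replica depends on how the provider registers servers, which is not confirmed here. The function name `simulate_round_robin` is made up for illustration.

```python
from collections import Counter
from itertools import cycle, islice


def simulate_round_robin(entries_per_service, requests):
    """Count hits per service when all server entries share one backend.

    entries_per_service: {service_name: number of server entries in the pool}
    requests: how many requests to simulate
    """
    # Build the pooled backend: one slot per server entry, tagged by service.
    pool = [svc for svc, n in entries_per_service.items() for _ in range(n)]
    # Plain round-robin: walk the pool in order, wrapping around.
    return Counter(islice(cycle(pool), requests))


if __name__ == "__main__":
    print(simulate_round_robin({"nginx-1-0": 1, "nginx-2-0": 1}, 100))
    print(simulate_round_robin({"nginx-1-0": 3, "nginx-2-0": 1}, 100))
```

Under that assumption the split simply follows the number of entries each version contributes to the pool, so the only knob for changing the ratio is how many entries each version has.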

@bitsofinfo

Contributor

@bitsofinfo bitsofinfo commented Apr 17, 2018

The other scenario is if you want to have a Host that serves up a mixed A/B backend, but still have a way to access the B version separately. Unfortunately you have to do it like the below, and can't just point to the existing nginx-2-0 service:

Launch 3 nginx services

docker service create --name nginx-1-0 --network omg-app-dev-int \
--mount type=bind,source=/tmp/nginx-1-0.html,target=/usr/share/nginx/html/index.html \
--label traefik.enable=true \
--label traefik.protocol=http \
--label traefik.port=80 \
--label traefik.frontend.rule=Host:nginx-current.mydomain.com \
--label traefik.docker.network=omg-app-dev-int \
--label traefik.backend=nginx-prod \
nginx

docker service create --name nginx-2-0 --network omg-app-dev-int \
--mount type=bind,source=/tmp/nginx-2-0.html,target=/usr/share/nginx/html/index.html \
--label traefik.enable=true \
--label traefik.protocol=http \
--label traefik.port=80 \
--label traefik.frontend.rule=Host:nginx-current.mydomain.com \
--label traefik.docker.network=omg-app-dev-int \
--label traefik.backend=nginx-prod \
nginx

docker service create --name nginx-2-0-nv --network omg-app-dev-int \
--mount type=bind,source=/tmp/nginx-2-0.html,target=/usr/share/nginx/html/index.html \
--label traefik.enable=true \
--label traefik.protocol=http \
--label traefik.port=80 \
--label traefik.frontend.rule=Host:nginx-nv.mydomain.com \
--label traefik.docker.network=omg-app-dev-int \
--label traefik.backend=nginx-prod-nv \
nginx

Traefik dashboard results in:

[screenshot: _test]

So here HTTP calls to nginx-current.mydomain.com result in 50/50 to both 1-0/2-0, but to hit 2-0 specifically via nginx-nv.mydomain.com, the request needs to hit a second service instance, which is essentially just a "re-deployment" with a different service name (nginx-2-0-nv), to get traefik to let us hit it specifically (rather than just having a 2nd host specified when deploying the nginx-2-0 instance and letting it be shared across 2 front-ends).

@nfedyk


@nfedyk nfedyk commented Apr 18, 2018

@bitsofinfo thanks for the workaround.
I have encountered the same issue trying to split traffic between multiple versions. Does the Traefik team plan to implement this?

@bmudda


@bmudda bmudda commented Apr 20, 2018

Yes. We currently have this same situation as well, where we want to distribute load between two backend services in round-robin or random fashion. We would love to see this implemented in the next release. Cheers!

@peloncano


@peloncano peloncano commented May 4, 2018

Yes! It would be nice if traefik would just add support for this. The workaround mentioned by @bitsofinfo works but it just adds a lot more complexity and limitations to a service deployment specifically whenever you have to re-deploy a second version of the service just to create a frontend to target that specific version.

@boarder981


@boarder981 boarder981 commented May 7, 2018

I currently have a need for this as well! A straightforward way to split traffic 50/50 between current version and "new" version of a service would be amazing.

@milewski


@milewski milewski commented Aug 26, 2018

@bitsofinfo does this workaround work with sticky sessions? That is, if you fire an HTTP request and traefik picks backend A or B at 50%/50%, does the subsequent request from the same user go to EXACTLY the same backend it hit on first entry?

@bitsofinfo

Contributor

@bitsofinfo bitsofinfo commented Aug 27, 2018

@milewski I don't know, you'll have to test that: https://docs.traefik.io/basics/#sticky-sessions

@JPM84


@JPM84 JPM84 commented Jan 28, 2019

Any update on this issue?

Nginx has nginx.ingress.kubernetes.io/canary-by-header and nginx.ingress.kubernetes.io/canary-by-cookie (see https://github.com/Shopify/ingress/blob/master/docs/user-guide/nginx-configuration/annotations.md#canary).

Is there anything like this for traefik?

@timoreimann

Member

@timoreimann timoreimann commented Jan 28, 2019

@JPM84 we shipped support for traffic splitting (which should allow A/B testing) at least for the Kubernetes provider with 1.7: https://docs.traefik.io/user-guide/kubernetes/#traffic-splitting
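
For readers unfamiliar with what a weighted split does per request, here is a small Python sketch of one common scheme, smooth weighted round-robin. It is offered only as an illustration of weighted traffic splitting in general; it is not claimed to be how Traefik implements the feature linked above, and the service names and weights are hypothetical.

```python
from collections import Counter


def smooth_weighted_round_robin(weights, requests):
    """Return the sequence of services picked for `requests` requests.

    weights: {service_name: positive integer weight}
    """
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(requests):
        # Every service earns credit proportional to its weight...
        for name, weight in weights.items():
            current[name] += weight
        # ...the service with the most credit serves the request...
        chosen = max(current, key=current.get)
        # ...and pays back one full round of credit.
        current[chosen] -= total
        picks.append(chosen)
    return picks


if __name__ == "__main__":
    picks = smooth_weighted_round_robin({"stable": 9, "canary": 1}, 10)
    print(picks)
    print(Counter(picks))
```

Over every full round of `sum(weights)` requests, each service receives exactly its weight in requests, and the canary's turns are interleaved rather than bunched together.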

@JPM84


@JPM84 JPM84 commented Jan 28, 2019

@timoreimann Yeah, I saw this feature, it's very cool! But what I am looking for is traffic splitting for testing a new version/canary build internally, based on some check (e.g. header, cookie, ...).
As far as I can see, traffic splitting currently allows slow public roll-outs of new versions/canary builds, which is great, but not quite my use case.
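
To spell out the difference from a percentage split, the Python sketch below shows the kind of per-request decision being asked for: route to the canary only when the request opts in through a header or cookie. Everything here is hypothetical and for illustration only: the `pick_backend` helper, the `X-Canary` header, the `canary` cookie, the opt-in value `always`, and the backend names are not Traefik options.

```python
def pick_backend(headers, cookies,
                 header_name="X-Canary", cookie_name="canary",
                 stable="app-stable", canary="app-canary"):
    """Return the backend name for one request (illustrative names only)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    if normalized.get(header_name.lower()) == "always":
        return canary
    if cookies.get(cookie_name) == "always":
        return canary
    # Everyone who did not opt in stays on the stable version.
    return stable


if __name__ == "__main__":
    print(pick_backend({"x-canary": "always"}, {}))
    print(pick_backend({}, {"canary": "always"}))
    print(pick_backend({"User-Agent": "curl"}, {}))
```

Unlike a weighted split, public traffic never reaches the canary in this model; only testers who set the header or cookie do.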

@timoreimann

Member

@timoreimann timoreimann commented Jan 28, 2019

@JPM84 agreed, your use case would likely require some workarounds for now.

@ldez ldez added this to To do in v2 via automation Mar 27, 2019
@ldez ldez moved this from To do to In progress in v2 Aug 21, 2019
@ldez

Member

@ldez ldez commented Aug 26, 2019

Closed by #5237

@ldez ldez closed this Aug 26, 2019
v2 automation moved this from In progress to Done Aug 26, 2019
@ldez ldez added this to the 2.0 milestone Aug 26, 2019
@containous containous locked and limited conversation to collaborators Sep 26, 2019