Skip to content
This repository has been archived by the owner on May 6, 2020. It is now read-only.

Validate routeability through deis-router upon application creation. #551

Closed
slack opened this issue Mar 22, 2016 · 15 comments
Closed

Validate routeability through deis-router upon application creation. #551

slack opened this issue Mar 22, 2016 · 15 comments
Assignees
Milestone

Comments

@slack
Copy link
Member

slack commented Mar 22, 2016

Controller should only return from create operations until the application has been "discovered" by the router.

Controller should use the router svc address with app hostname to verify non-404 response.

When no HEALTCHECK_URL has been provided, controller should accept any non-404 response (including 502) as routeable.

Refs deis/workflow-e2e#119

@helgi
Copy link
Contributor

helgi commented Mar 22, 2016

Looking at the requirements here, it means we need to only check this during the first deploy, which can be more than one release (technically) of the application.

Note tho that now that controller will need to know about the Router, even if it just a simple service looking via k8s API. That means the controller can only be used with deis-router in the deis namespace. Also means renaming deis-router to router needs to happen sooner than later

@krancour
Copy link
Contributor

I'm not so crazy about this idea, for a couple reasons. The first is that router has the potential to be scaled out. The fact that whatever router instance the controller might speak to has noticed and reacted to the presence of a new routable service isn't any guarantee that all the router instances have. So the benefit of this check is quite limited, unless we're willing to go into really complex territory to somehow assert that all router instances have reacted to the new service. I wouldn't favor that.

The second issue with this:

Controller should only return from create operations until the application has been "discovered" by the router.

After a deis create no one should actually be expecting the application to be accessible anyway, because nothing would have been deployed yet. Whether the router has reacted to the presence of a new routable service or not is more or less irrelevant since there won't be anything upstream from it. By the time an application has been deployed, it's guaranteed that the router (all all instances thereof) has reacted. The reconciliation loop is tighter than the time required to deploy an app.

@helgi
Copy link
Contributor

helgi commented Mar 22, 2016

@krancour We can do that check on the first deploy instead, so that wouldn't return back that the deploy is ready until we're getting a response from the router as well

@krancour
Copy link
Contributor

I might be confused about one thing... @helgi at what point does controller create the routable service? Is it one time, when the app is created? Or is it replaced/updated after each deployment?

@helgi
Copy link
Contributor

helgi commented Mar 22, 2016

On deploy right now, so that it doesn't try to route until there is something to route to

@helgi
Copy link
Contributor

helgi commented Mar 22, 2016

@krancour To expand on that, it happens during deploy because we have no clue what the scale type is until you have a build artefact to derive that information from, since you can have an application that has no routable information

@helgi
Copy link
Contributor

helgi commented Mar 22, 2016

One idea here to find the router is to look for a service with a particular label, but it would have to be within the deis namespace.

@krancour Your concern around the routers scaling out and some individual pods within that would be lagging behind is valid but I would think that if one of the pods is "green" (in terms of what we are talking about) then the rest will follow shortly. That's a big assumption but takes away a certain need to wait until they are all in quorum

@arschles
Copy link
Member

if one of the pods is "green" (in terms of what we are talking about) then the rest will follow shortly

I'm very wary of relying on this concern for our 2.0.0 release. This reduces to relying on no failure scenarios manifesting within the window to be able to deploy an application correctly. Eventually, some cluster is gonna have a {long GC pause, network "blip", insert-other-failure-scenario-here} which will cause one of the router pods to not pick up the change for a long while (or possibly not at all), but in that case the app will look deployed and every Nth request will fail. It's a failure for which there is possibly no corrective action, and that concerns me.

I'm not saying that, if we go down this road, we need to solve this pre-beta. I am, however, saying that we shouldn't rely on the aforementioned assumption in the final implementation.

Possible alternative: have the router (or some other in-cluster entity) elect a leader to track which router pods have acknowledged the new routable service. Only when all pods have checked in can the deploy be allowed to complete.

@helgi
Copy link
Contributor

helgi commented Mar 23, 2016

@arschles That's fine and can also be done in addition to my work.

You may want to propose that as a solution to keep routers in sync across many pods and not apply the configuration across the cluster until that point. I do not see it as a concern for the controller beyond "I can get a route from whatever LB service host/IP I have access to", which may or may not be 20 pods or just 1. Otherwise we end up with managing more state across things.

The leader could apply annotation or a label to the application service that could tell the controller a deploy/scale can start checking a HTTP ping.

I believe for beta we are still going for replica=1 as per deis/deis#4809 so we don't really need leader election or any other complicated logic until replica > 1

@arschles
Copy link
Member

You may want to propose that as a solution to keep routers in sync across many pods and not apply the configuration across the cluster until that point

Will do

I believe for beta we are still going for replica=1 as per deis/deis#4809 so we don't really need leader election or any other complicated logic until replica > 1

+1. I'm fine with this approach for now, and I'll put an issue in the router repo to revisit when we decide that we're gonna go replicas > 1

@arschles
Copy link
Member

See deis/router#152 for continuation of this discussion when replicas > 1

@helgi
Copy link
Contributor

helgi commented Mar 23, 2016

@slack I want to clarify the non-404 part of this. What's the distinction you are wanting to make between a health URL set vs not? Based on what you put in there it sounds like identical behaviour with and without it and I'd favour that personally since we are anyway verifying that router has records in place.

@slack
Copy link
Member Author

slack commented Mar 23, 2016

Ah, I see the ambiguity.

With a known HEALTHCHECK_URL we can make routeability depend on seeing a 200 at a known good URL. We know with certainty that we are making it all the way down to a Pod. So the first line should have read "Controller should use the router svc address with app hostname and healthcheck url to verify routing."

Without it, we can only hit /. If we get a 404, we probably saw the router, if we get a 200 we're probably ok and if we get a 5xx we might be hitting the router or the app. But we can't tell, so a 5xx should mean that we are routing?

@helgi
Copy link
Contributor

helgi commented Mar 23, 2016

Alright, that's a bit clearer. Have to fetch the health check url differently now but that probably isn't a big deal

@helgi
Copy link
Contributor

helgi commented Mar 23, 2016

@slack one thing I'm worrying about is if 200 is the only good code for when Health URL is around then we may have to account for a longer timeout than what I've done so far (3 seconds) in case the application is slow.

I could take the health check timeout number for that purpose but given out default timeout is 50 seconds there it may be excessive

helgi added a commit to helgi/controller that referenced this issue Mar 24, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
helgi added a commit to helgi/controller that referenced this issue Mar 24, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
helgi added a commit to helgi/controller that referenced this issue Mar 24, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
helgi added a commit to helgi/controller that referenced this issue Mar 24, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
helgi added a commit to helgi/controller that referenced this issue Mar 24, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
@helgi helgi modified the milestones: v2.0-rc1, v2.0-beta1 Mar 24, 2016
helgi added a commit to helgi/controller that referenced this issue Mar 26, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
helgi added a commit to helgi/controller that referenced this issue Mar 26, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
helgi added a commit to helgi/controller that referenced this issue Mar 26, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
helgi added a commit to helgi/controller that referenced this issue Mar 28, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
helgi added a commit to helgi/controller that referenced this issue Mar 28, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
helgi added a commit to helgi/controller that referenced this issue Mar 28, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
helgi added a commit to helgi/controller that referenced this issue Mar 29, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
helgi added a commit to helgi/controller that referenced this issue Mar 29, 2016
…n creation

Hits the router IP via python-requests with a Host header (uses xip.io) to emulate a request to the application. Sends a Deis Controller UA to make debugging easier
HTTP mocking is required for this to work in tests and all tests now get peppered with mock_requests object, which they can use for other purposes as well if required

Depends on a change to the requests_mock python package for testing only, pulling that from git currently

Tests take 2x the time because of added timing jitter to emulate router being slow to respond at times

Fixes deis#551
@helgi helgi removed the in progress label Mar 29, 2016
duanhongyi pushed a commit to duanhongyi/workflow that referenced this issue Dec 4, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants