Configurable deterministic routing #1378

CarsonCook · 2021-04-19T21:02:15Z

Is your feature request related to a problem? Please describe.
Services want to handle user requests from the same instance of that service, but the gateway only balances workloads in a round robin fashion.

Describe the solution you'd like
Deterministic routing for user requests, where each request with a user ID goes to the same service instance it originally was routed to. Additionally, a configurable limit for the number of users each instance can handle.

MVP
See the outline in the discussion. The following list is outdated.

The user ID for the request is routed to the same server as its original request
Configurable limits for the max number of users that each service instance can handle
When all instances are at or over their user limit and a new user makes a request, fall back to round robin selection and log a warning
If the instance for a user request goes down, then route to a new instance and count this user against the new instance's limit
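
To make the four MVP points concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the described behaviour, not the Gateway implementation; the class name `StickyRouter`, the `healthy` parameter, and a single shared `max_users` limit are assumptions made for the example.

```python
import itertools
import logging

log = logging.getLogger("sticky_router")


class StickyRouter:
    """Hypothetical sketch: sticky routing with a per-instance user limit."""

    def __init__(self, instances, max_users):
        self.instances = list(instances)
        self.max_users = max_users      # assumed: same limit for every instance
        self.assignment = {}            # user_id -> instance
        self._rr = itertools.cycle(self.instances)

    def _count(self, instance):
        # Number of users currently counted against this instance.
        return sum(1 for i in self.assignment.values() if i == instance)

    def route(self, user_id, healthy=None):
        healthy = set(self.instances if healthy is None else healthy)
        current = self.assignment.get(user_id)
        if current in healthy:
            return current              # same instance as the original request

        # New user, or the user's instance is down: take the next healthy
        # instance that is still under its limit.
        for _ in range(len(self.instances)):
            candidate = next(self._rr)
            if candidate in healthy and self._count(candidate) < self.max_users:
                self.assignment[user_id] = candidate
                return candidate

        # Every healthy instance is at or over its limit: plain round robin.
        log.warning("all instances at user limit, round robin for %s", user_id)
        for _ in range(len(self.instances)):
            candidate = next(self._rr)
            if candidate in healthy:
                self.assignment[user_id] = candidate
                return candidate
        raise RuntimeError("no healthy instance available")
```

Reassigning a user after a failure moves that user's count to the new instance, which matches the last MVP point.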

Extensions

Make the behaviour when all instances are at or over their user limit configurable, with options of a) logging a warning; or b) returning a 500 error
Make the load balancing behaviour configurable, with options of a) go round robin until an instance limit is hit; or b) send all requests to the first instance until its limit is hit, then send requests to the 2nd instance until its limit is hit, etc.

Additional context
@DMcKnight

anton-brezina · 2021-04-28T15:43:01Z

Keep in mind we need to support the HA setup here.

Does DVIPA support this?
Can we guarantee this behaviour in HA setup?
Shall we implement this behaviour in the caching service?
Will there be any performance impact (i.e. go to cache every time before routing)?

anton-brezina · 2021-04-28T15:46:07Z

Depends on #1359

jandadav · 2021-05-21T12:19:18Z

Goals:

We heard from multiple extenders and from core services that they have a case:
I want to talk to one particular instance
There are different motivations for this, we heard:
Because the instance is the one I need to talk to (a specific system, specific console ...)
Because there might be state that does not get distributed around to other instances (session)

Joe’s
Tomcat server with 100 Java threads. When a user logs in, the server keeps a thread in anticipation of the user coming back. One user can end up holding many threads across multiple instances. When the user logs off, the thread can possibly be freed.
zOSMF actually has the same issue. The TSO session is long-lived and uses the same optimization.
Mainframe workloads related to development or CICD tend to be isolated to particular LPARs

Problems:

Load balancing (LB/DLB)
Rate Limiting (RL/DRL)
User Limiting - we need more information

I would argue that Rate Limiting, while useful for obvious reasons, should not be part of our MVP.
Both have value on their own.
One does not require the other.
It seems that we are in agreement that Rate Limiting can be sacrificed.

Session based solution:

Based on a token that represents client (apiml auth cookie), we can recognize a client and provide deterministic routing.
That means we would store where the client has been routed in the past and distribute this knowledge between Gateways.
Upon the next request from the client, we recognize the client by the token and route the request to the same server as before.
Positives:
Without client’s interaction
Negatives:
We have to store, resolve and lifecycle the session
Client must be identifiable

Client based solution:

When any client gets routed, we would return information (in a cookie or header) about where the request was routed.
On next call, the client can provide a token (cookie, header) and request the same instance as last time.
Gateway will see this request and route as requested.

Benefits:
Less code to break
Does not suffer from synchronization issues across Gateways
Works also for unauthenticated requests
Client can choose which instance it wants

Negatives:
Client has to take action (could be alleviated by using cookies)
Does not carry the rate limiting capabilities
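
A rough sketch of the client based flow, in Python for brevity. The header name `X-Instance-Id` and the function `make_router` are made up for illustration and are not an existing Gateway API; falling back to round robin for an unknown instance id is also an assumption.

```python
import itertools

INSTANCE_HEADER = "X-Instance-Id"  # hypothetical header (a cookie would work too)


def make_router(instances):
    """Return a route(request_headers) function for the given instance ids."""
    known = set(instances)
    rr = itertools.cycle(instances)

    def route(request_headers):
        requested = request_headers.get(INSTANCE_HEADER)
        # Honour the client's choice only if it names a known instance;
        # otherwise fall back to round robin.
        instance = requested if requested in known else next(rr)
        # Tell the client where the request went so it can ask again.
        return instance, {INSTANCE_HEADER: instance}

    return route
```

The Gateway keeps no per-client state here: the only lookup is whether the requested id is a currently known instance, which also gives a minimal guard against arbitrary header values.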

Hybrid

. . .

Deterministic route based on token / Sharding

When a user authenticates, for instance, the token will dictate where the user gets routed in a predictable fashion

Positives:
Client does not need to take action

Negatives:
How to manage changing services?
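
One known technique that touches the "changing services" question is rendezvous (highest random weight) hashing. It is not proposed anywhere in this thread; the sketch below is only an illustration, and `pick_instance` is a hypothetical name. Each user goes to the instance with the highest hash score, so removing some other instance does not move that user.

```python
import hashlib


def pick_instance(user_id, instances):
    """Pick the instance with the highest hash score for this user."""

    def score(instance):
        digest = hashlib.sha256(f"{user_id}|{instance}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # Raises ValueError if `instances` is empty.
    return max(instances, key=score)
```

Only users whose chosen instance disappears are re-routed; everyone else keeps the same target without any stored state.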

Considerations:

Identifying the user/session:
Token
IP

Transferring the session
Cookie
Header

Configuration
Default (off)
Service can say what it wants
How deep do we want to load balance? (ServiceId <-> path, Composite APIs like zosmf)

Transferability of solution to SC Gateway

Model rejection strategy
Reject

Security of headers
Header spoofing

CarsonCook · 2021-05-25T14:01:37Z

@jandadav the extender has confirmed the client based solution will work for them.

jandadav · 2021-06-01T13:58:10Z

Proposal for follow-up stories to finish the load balancing implementation

Configurable load balancer setup for individual services

As a
Zowe conformant application developer
I can
Configure the load balancer for my service with predefined load balancing schemas
So that I can
Achieve the load balancing scheme that is desirable for my application

This will require implementing:
PredicateFactory that is aware of the service's registration metadata
Enhances the context's Environment with the metadata
Constructs the load balancing beans conditionally

Authentication based server side load balancing

As a
Zowe conformant application developer
I can
Call my application's API with Zowe authentication through single instance of API Gateway and always get to the same instance of my service for a given period of time.
So that I can
Protect against additional user-related address spaces spawned by my application without changing its code.

This will require implementing:
A balancing bean that:
Recognizes requests by Zowe authentication - User. A user can have multiple JWTs, so we have to understand who is calling.
Unauthenticated requests? - not sure if it's universal, Carson will check with the extender (pervasive or restrictive)
If there is no preference, routes the request using round robin and stores the preference.
If there is a preference, routes the request to the same instanceId as the preference
Lifecycle: The preference expires once a configurable time period has elapsed since the last request
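
The balancing logic described above could look roughly like this. It is a sketch in Python rather than the actual load balancing bean; `PreferenceBalancer`, `ttl_seconds`, and the injectable `clock` are assumptions for the example. Preferences are keyed by user rather than by token, since one user can hold several JWTs.

```python
import itertools
import time


class PreferenceBalancer:
    """Hypothetical sketch: per-user instance preference with expiry."""

    def __init__(self, instances, ttl_seconds, clock=time.monotonic):
        self.instances = list(instances)
        self.ttl = ttl_seconds
        self.clock = clock              # injectable so expiry can be tested
        self._rr = itertools.cycle(self.instances)
        self._prefs = {}                # user -> (instance_id, last_request_time)

    def choose(self, user):
        now = self.clock()
        pref = self._prefs.get(user)
        if (
            pref is not None
            and now - pref[1] <= self.ttl
            and pref[0] in self.instances
        ):
            instance = pref[0]          # valid preference: same instance
        else:
            instance = next(self._rr)   # none or expired: round robin
        # Every request refreshes the "last request" timestamp.
        self._prefs[user] = (instance, now)
        return instance
```

Because the timestamp is refreshed on each request, the preference only expires after the user has been idle for longer than the configured period.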

Authentication based distributed server side load balancing

As a
Zowe conformant application developer
I can
Call my application's API with Zowe authentication through any instance of API Gateway and consistently get to the same instance of my service for a given period of time.
So that I can
Protect against additional user-related address spaces spawned by my application without changing its code. And I can do that against any Gateway instance and get consistent behavior.

This will require implementing:
Whatever was developed for the previous story will have to be stored in the Caching Service

Spike: Investigate and document performance of deterministic routing in HA setup

#1413

jandadav · 2021-06-28T08:21:25Z

The current state of implemented infrastructure looks like this:

CarsonCook added enhancement New feature or request new New issue that has not been worked on yet labels Apr 19, 2021

anton-brezina added 21PI2 Objective Goal for the PI as announced at PI Planning Priority: High and removed new New issue that has not been worked on yet labels Apr 21, 2021

CarsonCook mentioned this issue Apr 28, 2021

Authentication based distributed server side load balancing #1412

Closed

balhar-jakub added the Epic label Apr 29, 2021

jalel01 closed this as completed Dec 1, 2021
