Configurable deterministic routing #1378

Closed
6 tasks
CarsonCook opened this issue Apr 19, 2021 · 6 comments
Labels
enhancement New feature or request Objective Goal for the PI as announced at PI Planning Priority: High

Comments

@CarsonCook
Contributor

CarsonCook commented Apr 19, 2021

Is your feature request related to a problem? Please describe.
Some services need every request from a given user to be handled by the same service instance, but the Gateway currently balances requests only in round robin fashion.

Describe the solution you'd like
Deterministic routing for user requests, where each request with a user ID goes to the same service instance it originally was routed to. Additionally, a configurable limit for the number of users each instance can handle.

MVP
See the outline in the discussion; the following list is outdated.

  • The user ID for the request is routed to the same server as its original request
  • Configurable limits for the max number of users that each service instance can handle
  • When all instances are at or over their user limit and a new user makes a request, fall back to round robin selection and log a warning
  • If the instance for a user request goes down, then route to a new instance and count this user against the new instance's limit

Extensions

  • Make the behaviour when all instances are at or over their user limit configurable, with options of a) logging a warning; or b) returning a 500 error
  • Make the load balancing behaviour configurable, with options of a) go round robin until an instance limit is hit; or b) send all requests to the first instance until its limit is hit, then send requests to the second instance until its limit is hit, etc.
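As a rough illustration of the MVP and extension bullets above (which the issue itself marks as outdated), sticky routing with a per-instance user limit could be sketched as follows. Every name here (`StickyLimitedRouter`, `OverLimitPolicy`, `NoCapacityException`) is hypothetical and not taken from the API ML code base. The sketch keeps state in memory for a single Gateway and does not cover instances going down.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and structure are illustrative only.
class StickyLimitedRouter {
    // Extension bullet 1: what to do when every instance is at its limit.
    enum OverLimitPolicy { WARN_AND_ROUND_ROBIN, REJECT }

    static class NoCapacityException extends RuntimeException {
        NoCapacityException(String message) { super(message); }
    }

    private final List<String> instances;
    private final int maxUsersPerInstance;
    private final OverLimitPolicy policy;
    private final Map<String, String> userToInstance = new HashMap<>();
    private final Map<String, Integer> usersPerInstance = new HashMap<>();
    private int next = 0; // round robin cursor

    StickyLimitedRouter(List<String> instances, int maxUsersPerInstance, OverLimitPolicy policy) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("at least one instance is required");
        }
        this.instances = new ArrayList<>(instances);
        this.maxUsersPerInstance = maxUsersPerInstance;
        this.policy = policy;
    }

    synchronized String route(String userId) {
        // A known user always goes back to the instance chosen earlier.
        String known = userToInstance.get(userId);
        if (known != null) {
            return known;
        }
        // New user: round robin over the instances that still have capacity.
        for (int i = 0; i < instances.size(); i++) {
            int index = (next + i) % instances.size();
            String candidate = instances.get(index);
            if (usersPerInstance.getOrDefault(candidate, 0) < maxUsersPerInstance) {
                next = (index + 1) % instances.size();
                return assign(userId, candidate);
            }
        }
        // Every instance is at or over its limit.
        if (policy == OverLimitPolicy.REJECT) {
            throw new NoCapacityException("all instances are at their user limit");
        }
        System.err.println("WARN: all instances at user limit, using round robin for " + userId);
        String fallback = instances.get(next);
        next = (next + 1) % instances.size();
        return assign(userId, fallback);
    }

    private String assign(String userId, String instance) {
        userToInstance.put(userId, instance);
        usersPerInstance.merge(instance, 1, Integer::sum);
        return instance;
    }
}
```

With the `REJECT` policy the caller would translate `NoCapacityException` into an error response; which status code to use is a design decision the issue leaves open.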

Additional context
@DMcKnight

@CarsonCook CarsonCook added enhancement New feature or request new New issue that has not been worked on yet labels Apr 19, 2021
@anton-brezina anton-brezina added 21PI2 Objective Goal for the PI as announced at PI Planning Priority: High and removed new New issue that has not been worked on yet labels Apr 21, 2021
@anton-brezina
Contributor

Keep in mind we need to support the HA setup here.

  • Does DVIPA support this?
  • Can we guarantee this behaviour in HA setup?
  • Shall we implement this behaviour in the caching service?
  • Will there be any performance impact (i.e. go to cache every time before routing)?

@anton-brezina
Contributor

Depends on #1359

@jandadav
Contributor

jandadav commented May 21, 2021

Goals:

We heard from multiple extenders and from core services that they share one use case: "I want to talk to one particular instance."
There are different motivations for this. We heard:
  • Because the instance is the one I need to talk to (a specific system, a specific console, ...)
  • Because there might be state that is not distributed to other instances (session)

Joe’s
Tomcat server with 100 Java threads. When a user logs in, the server keeps a thread in anticipation of the user coming back. One user can hold a lot of threads across multiple instances. When the user logs off, the thread can possibly be freed.
zOSMF has the same issue. The TSO session is long-lived and uses the same optimization.
Mainframe workloads related to development or CI/CD tend to be isolated to particular LPARs.

Problems:

Load balancing (LB/DLB)
Rate Limiting (RL/DRL)
User Limiting - we need more information

I would argue that Rate Limiting, while useful for obvious reasons, should not be part of our MVP.
Both have value on their own.
One does not require the other.
It seems that we are in agreement that Rate Limiting can be sacrificed.

Session based solution:

Based on a token that represents the client (the apiml auth cookie), we can recognize a client and provide deterministic routing.
That means we would store where the client has been routed in the past and distribute this knowledge between Gateways.
On the next request from the client, we recognize it by the token and route the request to the previously selected server.
Positives:
No client interaction required
Negatives:
We have to store, resolve and lifecycle the session
Client must be identifiable

Client based solution:

When a client's request is routed, we would return information (in a cookie or header) about which instance the request was routed to.
On the next call, the client can provide that token (cookie or header) and request the same instance as last time.
The Gateway will see this request and route it as requested.

Benefits:
Less code to break
Does not suffer from synchronization issues across Gateways
Works also for unauthenticated requests
Client can choose which instance it wants

Negatives:
Client has to take action (could be alleviated by using cookies)
Does not carry the rate limiting capabilities
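A minimal sketch of the client based flow described above, assuming a header is used to carry the instance ID. The header name `X-Instance-Id` and the class `ClientPreferenceRouter` are placeholders chosen for illustration, not names confirmed anywhere in this thread. Plain maps stand in for real HTTP headers, so the lookup here is case-sensitive, unlike HTTP itself.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: header name and class are placeholders.
class ClientPreferenceRouter {
    static final String INSTANCE_HEADER = "X-Instance-Id";

    private final AtomicInteger cursor = new AtomicInteger();

    // Picks the instance for one request.
    String choose(Map<String, String> requestHeaders, List<String> liveInstances) {
        if (liveInstances.isEmpty()) {
            throw new IllegalStateException("no live instances");
        }
        String requested = requestHeaders.get(INSTANCE_HEADER);
        // Honour the client's choice only if it names a currently live instance,
        // so an unknown or spoofed value can never select an arbitrary target.
        if (requested != null && liveInstances.contains(requested)) {
            return requested;
        }
        // No usable preference: plain round robin.
        int index = Math.floorMod(cursor.getAndIncrement(), liveInstances.size());
        return liveInstances.get(index);
    }

    // Header to attach to the response so the client can ask for the same instance again.
    Map<String, String> responseHeaders(String chosenInstance) {
        return Map.of(INSTANCE_HEADER, chosenInstance);
    }
}
```

The Gateway keeps no per-client state in this variant, which is why it avoids synchronization between Gateways. Validating the requested value against the live instance list is one simple guard related to the header spoofing concern listed under Considerations.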

Hybrid

. . .

Deterministic route based on token / Sharding

When a user authenticates, for instance, the token dictates where the user's requests get routed, in a predictable fashion.

Positives:
Client does not need to take action

Negatives:
How to manage changing services?
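One common technique that addresses the "changing services" question is rendezvous (highest random weight) hashing. The thread does not propose it; it is shown here only as an example of a token based mapping that stays stable when instances come and go. The class name `RendezvousRouter` is hypothetical. The key should be a stable user identifier rather than the raw token, because one user can hold several tokens.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical sketch of rendezvous hashing; not part of the proposal in this issue.
class RendezvousRouter {

    // Returns the instance with the highest score for this user, or null if there are none.
    static String pick(String userKey, List<String> instances) {
        String best = null;
        long bestScore = Long.MIN_VALUE;
        for (String instance : instances) {
            long score = score(userKey, instance);
            if (best == null || score > bestScore) {
                best = instance;
                bestScore = score;
            }
        }
        return best;
    }

    // Deterministic pseudo-random score for a (user, instance) pair:
    // the first 8 bytes of SHA-256 over "user|instance", read as a long.
    static long score(String userKey, String instance) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha.digest((userKey + "|" + instance).getBytes(StandardCharsets.UTF_8));
            long value = 0;
            for (int i = 0; i < 8; i++) {
                value = (value << 8) | (digest[i] & 0xFF);
            }
            return value;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required by the Java platform", e);
        }
    }
}
```

Because each user's choice depends only on per-instance scores, removing an instance remaps only the users that were assigned to it, and every Gateway computes the same answer without shared state. It does not enforce per-instance user limits.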

Considerations:

Identifying the user/session:
Token
IP

Transferring the session
Cookie
Header

Configuration
Default (off)
Service can say what it wants
How deep do we want to load balance? (ServiceId <-> path, composite APIs like zosmf)

Transferability of solution to SC Gateway

Model rejection strategy
Reject

Security of headers
Header spoofing

@CarsonCook
Contributor Author

@jandadav the extender has confirmed the client based solution will work for them.

@jandadav
Contributor

jandadav commented Jun 1, 2021

Proposal for follow-up stories to finish the load balancing implementation

Configurable load balancer setup for individual services

As a
Zowe conformant application developer
I can
Configure the load balancer for my service with predefined load balancing schemas
So that I can
Achieve the load balancing scheme that is desirable for my application

This means implementing:
PredicateFactory that is aware of the service's registration metadata
Enhances the context's Environment with the metadata
Constructs the load balancing beans conditionally

Authentication based server side load balancing

As a
Zowe conformant application developer
I can
Call my application's API with Zowe authentication through a single instance of the API Gateway and always get to the same instance of my service for a given period of time.
So that I can
Protect against additional user-related address spaces spawned by my application without changing its code.

This means implementing:
A balancing bean that:
Recognizes requests by Zowe authentication, that is, by user. A user can have multiple JWTs, so we have to understand who is calling.
Unauthenticated requests? - not sure if it's universal, Carson will check with the extender (pervasive or restrictive)
If there is no preference, routes the request by round robin and stores the preference.
If there is a preference, routes the request to the instanceId stored in the preference.
Lifecycle: the preference expires once a configurable time period has passed since the last request
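The behaviour listed above could be sketched roughly as follows for a single Gateway. This is not the actual balancing bean: `ExpiringPreferenceBalancer` and its methods are hypothetical, the user ID is assumed to have been resolved from the Zowe authentication already, and the clock is injected only to make the expiry easy to exercise.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical sketch: in-memory, single Gateway only.
class ExpiringPreferenceBalancer {
    private static final class Preference {
        final String instanceId;
        final long lastRequestMillis;

        Preference(String instanceId, long lastRequestMillis) {
            this.instanceId = instanceId;
            this.lastRequestMillis = lastRequestMillis;
        }
    }

    private final Map<String, Preference> preferences = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // e.g. System::currentTimeMillis
    private int cursor = 0;           // round robin cursor

    ExpiringPreferenceBalancer(Duration ttl, LongSupplier clock) {
        this.ttlMillis = ttl.toMillis();
        this.clock = clock;
    }

    synchronized String choose(String userId, List<String> liveInstances) {
        if (liveInstances.isEmpty()) {
            throw new IllegalStateException("no live instances");
        }
        long now = clock.getAsLong();
        Preference stored = preferences.get(userId);
        // A preference is usable if it has not expired and its instance is still live.
        boolean usable = stored != null
                && now - stored.lastRequestMillis <= ttlMillis
                && liveInstances.contains(stored.instanceId);
        String target;
        if (usable) {
            target = stored.instanceId;
        } else {
            target = liveInstances.get(cursor % liveInstances.size());
            cursor = (cursor + 1) % liveInstances.size();
        }
        // Every request refreshes the timestamp, so expiry counts from the last request.
        preferences.put(userId, new Preference(target, now));
        return target;
    }
}
```

Expired entries are only overwritten on the user's next request in this sketch; a real implementation would likely also evict them in the background.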

Authentication based distributed server side load balancing

As a
Zowe conformant application developer
I can
Call my application's API with Zowe authentication through any instance of the API Gateway and consistently get to the same instance of my service for a given period of time.
So that I can
Protect against additional user-related address spaces spawned by my application without changing its code. And I can do that against any Gateway instance and get consistent behavior.

This means implementing:
The routing preferences introduced in the previous story will have to be stored in the Caching Service

Spike: Investigate and document performance of deterministic routing in HA setup

#1413

@jandadav
Contributor

The current state of implemented infrastructure looks like this:
IMG_20210628_101104.jpg

@jalel01 jalel01 closed this as completed Dec 1, 2021