Required - Mediator Service deployed in an HA fashion, running multiple replicas (3+)
This includes the Agent (auto-scaled), the wallet (HA instance of Postgres), the Proxy (auto-scaled), the External Queue (HA instance of Redis or Kafka), and the Message Workers (auto-scaled).
Note regarding the Proxy: it would be nice to eliminate this layer, though it is useful for traffic control and rate limiting. However, relying on it to route messages to the separate http and ws ports of the aca-py agents is undesirable.
With the use of web sockets, auto-scaling is best performed on the basis of the web socket connections themselves. This helps ensure that pods with active web socket connections are not terminated prematurely when the system scales down. There are several ways to accomplish this; one option is sketched below.
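One way to scale on the connections themselves is to have each agent/worker expose its count of open web socket connections as a custom metric that an HPA (through an adapter such as prometheus-adapter) or a KEDA scaler can target. The sketch below is illustrative only and assumes an aiohttp-based worker; the metric name, port, and paths are assumptions, not something the current mediator exposes.

```python
# Minimal sketch: expose the number of open web socket connections as a
# Prometheus gauge so a custom-metrics HPA or KEDA scaler can act on it.
# Metric name, port, and paths are illustrative assumptions.
from aiohttp import web, WSMsgType
from prometheus_client import CONTENT_TYPE_LATEST, Gauge, generate_latest

ACTIVE_WS = Gauge("mediator_active_ws_connections",
                  "Number of currently open web socket connections")

async def ws_handler(request: web.Request) -> web.WebSocketResponse:
    ws = web.WebSocketResponse()
    await ws.prepare(request)
    ACTIVE_WS.inc()                          # socket established
    try:
        async for msg in ws:
            if msg.type == WSMsgType.TEXT:
                await ws.send_str(msg.data)  # placeholder for real handling
    finally:
        ACTIVE_WS.dec()                      # socket closed, normally or on error
    return ws

async def metrics_handler(request: web.Request) -> web.Response:
    # Scraped by Prometheus; the autoscaler targets an average
    # connections-per-pod value derived from this gauge.
    return web.Response(body=generate_latest(),
                        headers={"Content-Type": CONTENT_TYPE_LATEST})

app = web.Application()
app.add_routes([web.get("/ws", ws_handler),
                web.get("/metrics", metrics_handler)])

if __name__ == "__main__":
    web.run_app(app, port=8080)
```

Scaling up on such a metric is only half of the story: scale-down still needs a long enough termination grace period (or a preStop drain) so that pods with open sockets are not killed mid-conversation.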
Highly Desired - No session/cookie affinity: an agent may seamlessly connect and be served by any of the replicas
The use of web sockets is what currently drives the need for session affinity. The socket is opened between a client and an agent/worker instance; once established, all traffic must be routed between that same client and agent/worker instance. This is a challenge in K8S/OCP, especially when it comes to HPAs.
Are there any alternatives to using web sockets?
Required - Uptime/performance monitoring at the Aries protocol level (not just http/s)
For example, the k8s-compatible status endpoints on the agents do not provide sufficient information regarding the state and health of the websocket connections, nor do they provide any metrics on the durability and longevity of those connections.
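As one illustration of what protocol-level monitoring could look like, a probe could periodically send a trust ping over a long-lived mediation connection through the ACA-Py admin API and fail if the ping is rejected or the connection is no longer active. The admin URL, API key, and connection id below are placeholders for whatever the deployment actually uses; a fuller check would also listen on the agent's webhooks (e.g. with --monitor-ping) for the ping response rather than only inspecting the connection record.

```python
# Minimal sketch of an Aries-protocol-level probe: send a trust ping over a
# known mediation connection via the ACA-Py admin API and verify that the
# connection is still reported as active. All values below are placeholders.
import sys
import requests

ADMIN_URL = "http://mediator-admin:8051"          # assumed admin endpoint
HEADERS = {"x-api-key": "changeme"}               # assumed admin API key
CONN_ID = "00000000-0000-0000-0000-000000000000"  # long-lived probe connection

def probe() -> bool:
    # Ask the agent to send a trust ping on the probe connection.
    ping = requests.post(
        f"{ADMIN_URL}/connections/{CONN_ID}/send-ping",
        json={"comment": "uptime probe"},
        headers=HEADERS,
        timeout=5,
    )
    if ping.status_code != 200:
        return False
    # Confirm the connection record still looks healthy.
    conn = requests.get(f"{ADMIN_URL}/connections/{CONN_ID}",
                        headers=HEADERS, timeout=5)
    return conn.status_code == 200 and conn.json().get("state") == "active"

if __name__ == "__main__":
    sys.exit(0 if probe() else 1)
```

Run on a schedule (or as a monitoring sidecar), something along these lines exercises the DIDComm layer itself instead of only the HTTP status endpoints.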
This PR addresses vertical scalability on a single-instance mediator; it does not address any of the issues encountered in an HA or horizontally scalable environment.
I did a bit of catching up and, while there are still some items that will require review and planning, I think we have a couple of options to focus on for the short/medium term.
The PR linked in the issue description will allow the mediator agent to scale vertically and manage throughput of 2400+ connections: this should be enough to handle the user volumes we expect in the immediate future. This does NOT help with scenarios involving pod rollout, as the in-progress queue would be lost.
This PR (https://github.com/bcgov/openshift-aries-mediator-service/pull/18/files) includes changes that, in theory, should help prevent websockets from being dropped due to scaling, by using sticky sessions and affinity. The changes are already deployed in our dev environment; however, testing appears to have been interrupted, due to a shift in priorities, before it could confirm whether the change was helpful. It would be a good idea to wrap up the testing and confirm whether this approach resolves, or at least mitigates, the horizontal scaling issues.
Both of the above approaches should be accompanied by a persistent queue for handling messages, so that they are not lost in the case of rollouts, re-deployments, or failures.
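To illustrate the persistent-queue pattern (this is not the ACA-Py Redis plugin configuration, just a sketch of the idea, with the queue name, Redis location, and message shape assumed): inbound messages are pushed onto a durable list and workers block-pop from it, so the in-flight backlog survives a pod restart.

```python
# Minimal sketch of a persistent message queue backed by Redis: producers push
# inbound messages onto a durable list, worker pods block-pop and process them.
# Queue name, Redis host, and message shape are illustrative assumptions.
import json
import redis

r = redis.Redis(host="redis", port=6379, db=0)
QUEUE = "mediator:inbound"

def enqueue(message: dict) -> None:
    # Called when a message arrives; the backlog lives in Redis (with AOF/RDB
    # persistence enabled), not in pod memory, so rollouts do not lose it.
    r.rpush(QUEUE, json.dumps(message))

def process(message: dict) -> None:
    # Placeholder for the real message handling/forwarding logic.
    print("processing", message.get("@id"))

def worker_loop() -> None:
    while True:
        # BLPOP blocks until a message is available; multiple worker pods can
        # share the queue and each message is delivered to exactly one of them.
        _, raw = r.blpop(QUEUE)
        process(json.loads(raw))

if __name__ == "__main__":
    worker_loop()
```

A production setup would likely also move each popped message onto a per-worker processing list (e.g. with BLMOVE) and acknowledge it only after successful delivery, so that a crash mid-processing does not lose the message either.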
I would suggest we focus on these three items in the short term, and in the meantime complete the investigation of potential next steps/long term strategies to manage mediation.
Tasks
Acceptance Criteria
Blocked By:
Additional Resources: