Users supervision tree #175

NelsonVides · 2023-12-16T14:13:40Z

Dedup controller and users sup #172

In the original implementation, each user process is monitored by two processes: the original amoc_users_sup and amoc_controller. This means that bursts of users going up and down can overflow the mailboxes of both, and one of them is the very critical controller.

With this reimplementation, only the amoc_users_sup tracks the processes, and all requests from the controller are asynchronous. This ensures the controller does not get blocked and can remain responsive to control requests.
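The idea can be sketched roughly as follows. This is a minimal illustration, not the actual amoc code: the module name `async_users_sup_sketch` and its functions are hypothetical. A single process owns the monitors, and the caller (the controller, in this PR) only sends casts, so it never waits on the tracker.

```erlang
-module(async_users_sup_sketch).
-behaviour(gen_server).

%% Hypothetical API, for illustration only.
-export([start_link/0, start_user/2, count/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Asynchronous request: returns ok immediately, the caller is never blocked.
start_user(Sup, Fun) when is_function(Fun, 0) ->
    gen_server:cast(Sup, {start_user, Fun}).

%% Synchronous query, used here only to inspect the state.
count(Sup) ->
    gen_server:call(Sup, count).

init([]) ->
    {ok, #{}}.

handle_cast({start_user, Fun}, Users) ->
    %% Only this process monitors the user, nobody else gets 'DOWN' messages.
    {Pid, Ref} = spawn_monitor(Fun),
    {noreply, Users#{Ref => Pid}}.

handle_call(count, _From, Users) ->
    {reply, map_size(Users), Users}.

handle_info({'DOWN', Ref, process, _Pid, _Reason}, Users) ->
    {noreply, maps:remove(Ref, Users)};
handle_info(_Other, Users) ->
    {noreply, Users}.
```

With this shape, a burst of users going down only fills the mailbox of the tracking process; the caller's mailbox stays free for control requests.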

Interarrival of zero #173

Useful when, for example, we want to spawn a big pool of users all at once. This case is optimised by not setting any timers and simply looping over the new user requests.
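A minimal sketch of that special case, assuming a hypothetical helper `start_users/3` that receives the user ids, the interarrival in milliseconds, and a fun that starts one user (none of these names come from amoc). The non-zero branch uses a plain sleep as a simplified stand-in for timer-based pacing.

```erlang
-module(interarrival_sketch).
-export([start_users/3]).

%% Hypothetical helper, not the amoc API: StartFun is called once per user id.
-spec start_users([pos_integer()], non_neg_integer(),
                  fun((pos_integer()) -> term())) -> ok.
start_users(Ids, 0, StartFun) ->
    %% Interarrival of zero: no timers at all, just loop over the requests.
    lists:foreach(StartFun, Ids);
start_users([], _Interarrival, _StartFun) ->
    ok;
start_users([Id | Rest], Interarrival, StartFun) when Interarrival > 0 ->
    StartFun(Id),
    %% Simplified pacing for the sketch; a controller would rather schedule
    %% a timer message than block in a sleep.
    timer:sleep(Interarrival),
    start_users(Rest, Interarrival, StartFun).
```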

Pool user supervision trees

If users go up and down too fast, amoc_users_sup becomes a bottleneck, as it
can’t keep up with the requests. Improve the performance by creating a pool of
supervisors whose size is proportional to the number of schedulers, and
distribute users among them, for example by hashing on the user_id.

We do this by creating a pool of amoc_users_worker_sup supervisors and routing the whole API through amoc_users_sup, which chooses a supervisor for each request.
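As an illustration of the hashing option mentioned above, here is a small sketch. The module and function names are hypothetical, and sizing the pool at exactly one supervisor per online scheduler is an assumption; the PR only says the size is proportional to the number of schedulers.

```erlang
-module(users_pool_sketch).
-export([pool_size/0, pick_sup/2]).

%% Assumed sizing rule: one worker supervisor per online scheduler.
-spec pool_size() -> pos_integer().
pool_size() ->
    erlang:system_info(schedulers_online).

%% Maps a user id to a 1-based index into the pool.
%% erlang:phash2/2 returns a value in 0..PoolSize-1, so we add 1.
-spec pick_sup(term(), pos_integer()) -> pos_integer().
pick_sup(UserId, PoolSize) when is_integer(PoolSize), PoolSize > 0 ->
    erlang:phash2(UserId, PoolSize) + 1.
```

The same user_id always maps to the same index, so later requests about a given user can be routed to the supervisor that started it without any shared lookup table.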

Have the user supervisors, instead of the controller, loop over the users to start.

codecov-commenter · 2024-01-15T12:24:04Z

Codecov Report

Attention: 20 lines in your changes are missing coverage. Please review.

Comparison is base (9870612) 73.44% compared to head (623a89f) 75.00%.
Report is 14 commits behind head on master.

Files	Patch %	Lines
src/users/amoc_users_worker_sup.erl	77.19%	13 Missing ⚠️
src/users/amoc_users_sup.erl	94.73%	4 Missing ⚠️
src/amoc_controller.erl	90.32%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #175      +/-   ##
==========================================
+ Coverage   73.44%   75.00%   +1.55%     
==========================================
  Files          29       31       +2     
  Lines        1043     1160     +117     
==========================================
+ Hits          766      870     +104     
- Misses        277      290      +13

DenysGonchar

This new implementation suffers from the same issue as the existing one. The removal operation is asynchronous, so if we call amoc_controller:remove_users/2 twice in a row, it may select the same users for removal both times. In that case it won't remove the expected number of users (the number returned by amoc_controller:remove_users/2). I don't think we need to worry about it, but we must add a note in the documentation about this behaviour.
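The race described in this comment can be reproduced with a toy model. Everything here is hypothetical (`removal_race_sketch`, `select_for_removal/2`, the atoms standing in for user pids); it only shows why two selections made against the same not-yet-updated view overlap.

```erlang
-module(removal_race_sketch).
-export([select_for_removal/2, demo/0]).

%% Picks up to Count users from a snapshot of the currently known users.
select_for_removal(Count, Snapshot) ->
    lists:sublist(Snapshot, Count).

demo() ->
    Snapshot = [u1, u2, u3, u4, u5],
    First = select_for_removal(2, Snapshot),
    %% The removal is asynchronous, so the second call still sees the same
    %% snapshot and picks the same users again.
    Second = select_for_removal(2, Snapshot),
    Reported = length(First) + length(Second),
    Actual = length(lists:usort(First ++ Second)),
    %% Reported is what the two calls claim to remove, Actual is what
    %% really goes away.
    {Reported, Actual}.
```

Calling `removal_race_sketch:demo()` returns `{4, 2}`: the two calls report four removals in total, but only two distinct users were selected.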

DenysGonchar · 2024-02-17T21:08:16Z

src/users/amoc_user.erl

+stop(Pid, false) when is_pid(Pid), Pid =/= self() ->
+    proc_lib:stop(Pid, shutdown, ?SHUTDOWN_TIMEOUT);
+stop(Pid, true) when is_pid(Pid), Pid =/= self() ->
+    proc_lib:stop(Pid, kill, ?SHUTDOWN_TIMEOUT).


This implementation is not acceptable: we should never kill a user without giving it a chance to finalize its jobs. This interface must be based on some amoc_users_worker_sup API. The code must be unified, and SHUTDOWN_TIMEOUT must be defined in amoc_users_worker_sup only.
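One way to read the request above is the usual two-step shutdown: ask the process to stop, and only kill it if it has not finished within the timeout. The sketch below is an assumption about what that could look like, not the code that was merged; `graceful_stop_sketch` and the `Timeout` argument (standing in for SHUTDOWN_TIMEOUT) are hypothetical.

```erlang
-module(graceful_stop_sketch).
-export([stop/2]).

%% Timeout is a hypothetical parameter standing in for SHUTDOWN_TIMEOUT.
-spec stop(pid(), timeout()) -> ok.
stop(Pid, Timeout) when is_pid(Pid), Pid =/= self() ->
    Ref = erlang:monitor(process, Pid),
    %% First a polite request: a process trapping exits can finalize its jobs.
    exit(Pid, shutdown),
    receive
        {'DOWN', Ref, process, Pid, _Reason} ->
            ok
    after Timeout ->
        %% Only after the grace period has expired is the process killed.
        exit(Pid, kill),
        receive
            {'DOWN', Ref, process, Pid, _} -> ok
        end
    end.
```

A process that does not trap exits terminates as soon as the `shutdown` signal arrives, so the kill branch only runs for processes that trap exits and fail to finish in time.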

DenysGonchar

Thanks for the huge amount of work, it looks really great :)

NelsonVides force-pushed the users/pool_supervision_tree branch from d2554fa to bc98fd8 on December 17, 2023 14:49

NelsonVides force-pushed the controller/interarrival_of_zero branch from 433ef34 to bd0690a on December 19, 2023 09:28

NelsonVides force-pushed the users/pool_supervision_tree branch from bc98fd8 to 9c4e2ce on December 19, 2023 09:28

NelsonVides force-pushed the controller/interarrival_of_zero branch from bd0690a to 56da56b on January 15, 2024 08:10

NelsonVides force-pushed the users/pool_supervision_tree branch from 1aa771e to 9df2ea3 on January 15, 2024 08:22

NelsonVides changed the base branch from controller/interarrival_of_zero to master January 15, 2024 09:12

NelsonVides force-pushed the users/pool_supervision_tree branch from b663b82 to 58e461c on January 15, 2024 09:12

NelsonVides added 10 commits January 15, 2024 12:38

Support interarrival of zero

4348817

Useful if for example we want to spawn a big pool of users all at once. Optimised by not setting any timers, but simply looping new user requests.

Have amoc_user stop itself encapsulated in proc_lib as well

487217e

Optimise starting many users immediately

d5770ae

Have the user supervisors loop over the users start instead of the controller.

Rename top supervisors and worker supervisors consistently

77f92d0

Ensure proper order of init and terminate sup trees

ed06eaf

Remove support for OTP24 as we use new funs from maps and rand

6b03cd4

Explain and add specs to amoc_users_sup

c942862

Fix issue with stop_children in worker_sup

6d6a7f1

NelsonVides force-pushed the users/pool_supervision_tree branch from 9578920 to d775663 on January 15, 2024 12:06

Keep track of worker_sup counts in the atomics array

921c03e

NelsonVides force-pushed the users/pool_supervision_tree branch from d775663 to 30e7646 on January 15, 2024 12:15

Ensure positive assignments to user sups

037dc05

NelsonVides force-pushed the users/pool_supervision_tree branch from 30e7646 to 037dc05 on January 15, 2024 12:16

Adapt integration test to the new internal representation

392f452

NelsonVides marked this pull request as ready for review January 15, 2024 15:56

NelsonVides changed the title ~~Users/pool supervision tree~~ Users supervision tree Jan 15, 2024

NelsonVides requested a review from DenysGonchar January 15, 2024 15:57

Update copyrights

326fd4b

DenysGonchar requested changes Feb 18, 2024

Document that user removal is asynchronous

a1bf272

NelsonVides force-pushed the users/pool_supervision_tree branch 5 times, most recently from 7effd57 to 29d9ea2 on February 21, 2024 11:10

NelsonVides added 4 commits February 21, 2024 12:20

Ensure better indexes for amoc_users_sup

865cbf7

Improve encapsulation of amoc_users_worker_sup

6e0c5de

Simplify amoc_users_sup init

cf64003

Reenable OTP24

1bf7b3f

NelsonVides force-pushed the users/pool_supervision_tree branch 2 times, most recently from b55455c to 8ad85c5 on February 21, 2024 11:36

NelsonVides added 4 commits February 21, 2024 16:57

Revert amoc_user stop

8d53e23

Store lists, counts and indexes in the amoc_users_sup pt state

1487478

Rework removal assignments more efficiently

cf9a6e5

Add more tests to the number of users being removed

11ffbba

NelsonVides force-pushed the users/pool_supervision_tree branch from c74f00f to 11ffbba on February 21, 2024 15:58