
dispatcher: fix weighted sort for n<100 calls #3531

Closed
wants to merge 3 commits

Conversation

@devopsec (Contributor) commented Aug 7, 2023

  • update weighted/relative weighted algos to use probability based sort
  • calls from 0-99 now follow the weighted distributions
  • removed the now unused shuffle_uint100array() function

Pre-Submission Checklist

  • Commit message has the format required by CONTRIBUTING guide
  • Commits are split per component (core, individual modules, libs, utils, ...)
  • Each component has a single commit (if not, squash them into one commit)
  • No commits to README files for modules (changes must be done to docbook files
    in doc/ subfolder, the README file is autogenerated)

Type Of Change

  • Small bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would change existing functionality)

Checklist:

  • PR should be backported to stable branches
  • Tested changes locally
  • Related to issue #XXXX (replace XXXX with an open issue number)

Description

The previous random sort for weights in a dispatcher set distributed correctly only after a minimum of n=100 calls.
This PR fixes the sort algorithm to distribute calls based on those weights even for n<100 calls.
The real benefit is that customers can now see the weighted distribution take effect sooner when testing.

@miconda (Member) commented Aug 7, 2023

Thanks for this PR! However, it brings significant changes to the existing behaviour/design of the weight distribution, which was done on purpose to be randomised across the 100 calls.

I think your PR should be changed to add a new dispatching algorithm, leaving the existing one as it is.

A few more comments on coding style, to keep it as coherent as possible: declare variables at the beginning of the function or block; one-line comments should be on a single line as /* ... */, not spread over three lines.

@devopsec (Contributor, Author) commented Aug 7, 2023

@miconda thanks for reviewing this.
I chose to go with replacement since I believe this better represents the original intention of the weighted / relative weighted algorithms.

Now, to be honest, I could be completely wrong in thinking that.
At n>100 calls, or simply when scaling, the new algorithm distributes the same percentage of calls to each destination.
The only difference is the order in which the destinations are selected.

If you think that is incorrect then I can just move it to a new dispatching algorithm.

And I'll update the formatting on those comments as well, thanks!

@henningw (Contributor) commented Aug 7, 2023

If I understand it correctly, then going for a new algorithm only makes sense if we consider the fact that the algorithm seems to be broken for fewer than 100 calls a feature of this algorithm. ;-) It's probably a better idea to fix the bug instead of offering our users two algorithms, a broken one and a fixed one.

- update weighted/relative weighted algos to use probability based sort
- calls from 0-99 now follow the weighted distributions
- removed the now unused shuffle_uint100array() function
@devopsec (Contributor, Author) commented Aug 7, 2023

@miconda updated formatting per #3531 (comment)

@henningw I agree. I think it is a bug that calls are not distributed per the weights until a minimum of 100 calls are made.

@miconda (Member) commented Aug 7, 2023

@henningw: the weight being expressed as a percentage, you can't ensure proper accuracy for fewer than 100 calls; maybe in some particular cases, but not for most of the possible variants. Take a simple case as an example: 100 destinations, each with weight 1. If you have 2 calls, then two destinations get 50% of the traffic and the rest get 0%. So try to fix that, I would love to see a solution.

@devopsec: I haven't checked the code, but I guess it is more like getting to a round robin with some destinations appearing multiple times in the group. For example, when there are two addresses, the first with weight 75 and the second with weight 25: the first is added 3 times, then the second (... and repeat). Is it randomised within these 4? Or does it just go to the next one for each new call? If my guess is confirmed, then it is a new algorithm and the previous behaviour is lost: it is no longer a randomised distribution across 100 calls, but an ordered distribution (or one randomised in a group smaller than 100).
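A minimal sketch of the grouped round robin miconda is guessing at here; this is illustrative only, not code from the PR, and the group size of 4 and the array names are made up:

#include <stdio.h>

int main(void)
{
    int weights[2] = {75, 25};
    int seq[4]; /* smallest group matching the 75:25 ratio */
    int pos = 0;
    int d, s, i;

    /* repeat each destination in proportion to its weight:
     * 75:25 over a group of 4 gives 3 slots for dst 0, 1 slot for dst 1 */
    for(d = 0; d < 2; d++)
        for(s = 0; s < weights[d] * 4 / 100; s++)
            seq[pos++] = d;

    for(i = 0; i < 4; i++)
        printf("%d ", seq[i]); /* prints: 0 0 0 1 */
    printf("\n");
    return 0;
}

Selection would then be either an ordered walk over seq (one step per new call) or a random pick within it; which of the two the PR does is exactly what miconda asks above.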

@devopsec: looking at the updated commit, some fields of pointers are used before checking that the pointer is NULL: the arrays are declared using a field of a structure pointer. We tried to avoid variable-sized arrays in the past for compatibility with old compilers/C standards. I am not really against them (IIRC they are allowed by C99), but they have to be relocated after the NULL check. The old code does pkg_malloc()/pkg_free() for ds_dests_flags for that reason.

@miconda (Member) commented Aug 7, 2023

For the record, variable-length arrays seem to no longer be mandatory to support from C11 on, though modern compilers likely still support them. Here is a good summary comparing them with malloc:
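A small illustration of the trade-off being discussed, assuming nothing about the module's code; in Kamailio the heap variant would use pkg_malloc()/pkg_free() rather than plain malloc()/free():

#include <stdlib.h>

/* VLA: allowed in C99, made optional in C11 (a compiler that lacks them
 * defines __STDC_NO_VLA__); an allocation failure cannot be detected */
void with_vla(int n)
{
    int weights[n];
    (void)weights;
}

/* heap allocation: portable to any C standard and able to report failure */
int with_malloc(int n)
{
    int *weights = malloc(n * sizeof(int));
    if(weights == NULL)
        return -1;
    /* ... use weights ... */
    free(weights);
    return 0;
}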

@devopsec (Contributor, Author) commented Aug 8, 2023

@miconda yes, the new code is not using randomised selection for the destination. It selects based on probability throughout (recalculating the percentage of calls sent to each destination on every iteration, out of the total calls made). Think of the sorting loop as if each iteration of i were "processing a new call". Maybe an explanation based on your example above, weight1=75 and weight2=25, will clarify.

On the first iteration of i no calls have been processed, so the call goes to dst1. On the second iteration dst1 has 1/1 calls; 100% is greater than the intended weight, so we distribute the call to dst2. For i=3 both dsts have 50% of the calls, but dst1 is intended to have 75% of the calls, so the call goes to dst1. This repeats until 100 "calls" have been processed.

The effect is exactly the same in terms of the distribution of calls as a percentage, i.e. the intention of the algorithm, but the "bug fix" is that the distribution is now equivalent at high call volumes and at low call volumes. So it is true that the new algorithm does not use random sorting under 100 calls like the old one, but I believe that is a bug fix, not a new feature. I think that was a decent explanation, but if not please ask for more clarification.
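For illustration, here is a minimal standalone sketch of the loop devopsec describes. It is a reconstruction, not the PR's code, and the integer deficit arithmetic is just one way to express "recalculate the percentage each iteration":

#include <stdio.h>

#define NDST 2
#define SLOTS 100

int main(void)
{
    int weights[NDST] = {75, 25}; /* target percentages, summing to 100 */
    int served[NDST] = {0};       /* "calls" assigned so far */
    int order[SLOTS];
    int i, d;

    for(i = 0; i < SLOTS; i++) {
        int best = 0;
        for(d = 1; d < NDST; d++) {
            /* deficit: how far below its target share a destination is,
             * kept in integer math by scaling both terms by 100*i */
            int def_best = weights[best] * i - 100 * served[best];
            int def_d = weights[d] * i - 100 * served[d];
            if(def_d > def_best
                    || (def_d == def_best && weights[d] > weights[best]))
                best = d;
        }
        order[i] = best;
        served[best]++;
    }

    for(i = 0; i < 8; i++)
        printf("%d ", order[i]); /* prints: 0 1 0 0 0 1 0 0 */
    printf("\n");
    return 0;
}

At every prefix of order the served counts stay as close to 75/25 as integer counts allow, which is the claimed fix for n<100.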

On the topic of memory allocation, thanks for the reference, I am always happy to learn! In my live tests the compiler happily accepted my code and gdb gave me good output back... I don't claim to be a master of memory management, so I will defer to you on whether pkg_malloc()/pkg_free() is required here to be compliant with all the intended C specs.

@miconda (Member) commented Aug 8, 2023

@devopsec: in the end you get more or less a fixed order of selecting the addresses to route to. For example, with two addresses, one with weight 99 and the other with weight 1, the one with weight 1 is going to be placed 2nd (or 1st, depending on the initial order). If the weights are 90 and 10, it should end up like 1st, 2nd, 1st, 1st, 1st, 1st, 1st, 1st, 1st, 1st, 1st, 2nd, 1st, 1st... For calls of the same type (e.g., codecs, audio/video) and duration this could bring some benefits, but for calls of random type and duration the random ordering proved to give a pretty good average load on resources over a long time.
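For what it's worth, the deficit-based sketch after devopsec's walkthrough above reproduces exactly this sequence when run with weights {90, 10}: the 1st destination is picked, then the 2nd, then the 1st nine times in a row before the 2nd comes up again.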

With the above remarks, and considering that the random ordering was done on purpose at the time of implementation, I don't consider this PR a bug fix, but a different distribution algorithm (for example, it could be called percentage-ordered distribution).

@devopsec (Contributor, Author) commented Aug 8, 2023

@miconda Hmmm, I see your point there. No worries, I'll resubmit as a new algorithm.

if(weights_total < 100) {
    LM_INFO("extra weight %d for last destination in group %d\n",
            (100 - weights_total), dset->id);
    weights[last_valid_dst] = weights[last_valid_dst] + (100 - weights_total);

Check failure: Code scanning / CodeQL
Potentially uninitialized local variable (High): the variable last_valid_dst may not be initialized at this access.
if(rw_slices_total < 100) {
    LM_INFO("extra rweight percentage %d for last destination in group %d\n",
            (100 - rw_slices_total), dset->id);
    rw_slices[last_valid_dst] = rw_slices[last_valid_dst] + (100 - rw_slices_total);

Check failure: Code scanning / CodeQL
Potentially uninitialized local variable (High): the variable last_valid_dst may not be initialized at this access.
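A sketch of the kind of fix this warning usually calls for. The field names dset->nr and dset->dlist and the loop shape are assumed from the dispatcher module's structures, and this is not the PR's actual code; the idea is to initialize last_valid_dst and reject the group if no valid destination was seen:

int last_valid_dst = -1; /* initialized so the later access cannot read garbage */
int d;

for(d = 0; d < dset->nr; d++) {
    /* assuming the module's existing ds_skip_dst() active/disabled check */
    if(ds_skip_dst(dset->dlist[d].flags))
        continue;
    last_valid_dst = d;
    /* ... accumulate weights_total for this destination ... */
}

if(last_valid_dst < 0) {
    LM_ERR("no valid destination in group %d\n", dset->id);
    return -1;
}
if(weights_total < 100)
    weights[last_valid_dst] += (100 - weights_total);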
- add support for changing the column delimiter in a hash table
- add support for changing the column null character
@miconda (Member) commented Oct 5, 2023

Is any work still planned on refactoring to a new algorithm? Maybe this can be closed and a new PR created when it is done.

@devopsec (Contributor, Author) commented Oct 5, 2023

@miconda I'll open this as a new PR when I can get back around to it, thanks!

devopsec closed this Oct 5, 2023