
rtpengine: hashing algorithm not offering a good enough distribution #1911

Open
ionutionita92 opened this issue Mar 27, 2019 · 10 comments

@ionutionita92 (Contributor) commented Mar 27, 2019

We are having problems in our environment with the way rtpengine distributes RTP sessions. It doesn't provide a good enough distribution among RTP nodes, mostly because the current algorithm is quite simple: it sums the characters of the Call-ID and then applies a 0xFF mask to the result (which degrades the distribution even further). This may affect our systems under heavy load and we've been looking for an alternative.
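
For reference, this is roughly what the scheme described above boils down to; the function name is made up for this sketch and is not the actual rtpengine module code:

```c
/* Illustration only: summing the Call-ID bytes and masking with 0xFF
 * leaves only 256 possible hash values, which is why the spread over
 * nodes is poor. */
static unsigned int naive_callid_hash(const char *callid, int len)
{
    unsigned int sum = 0;
    int i;

    for (i = 0; i < len; i++)
        sum += (unsigned char)callid[i];

    return sum & 0xFF;  /* only 256 distinct buckets */
}
```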

To have something to compare against, I have designed a testing application in which I generate Call-IDs, apply a hashing algorithm and assign the result to a set/node. For example, with 10 nodes each of weight 10 and a hash value of 125, I compute 125 mod (10 * 10) = 25, so the hash goes to node 3 (see the sketch after this paragraph).
I've been running tests over various hashing algorithms: SHA-256, Jenkins (a simple hashing algorithm found on Wikipedia), MD5, SHA-1, RIPEMD, CRC32 and the one currently used in the rtpengine module. For each algorithm I test from 1,000 randomly generated Call-IDs up to 10 million, increasing by a factor of 10 (5 tests per algorithm). The Call-IDs have 16 randomly generated bytes and 16 fixed bytes which are always the same; this way I'm trying to reproduce how Call-IDs are typically generated, with most of them containing a random part and a fixed part.
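
A minimal sketch of the node-selection step in the test application, with hypothetical names for the constants:

```c
/* Node selection used by the test application: 10 nodes, each of
 * weight 10, so the hash is reduced modulo 100 and each block of 10
 * consecutive buckets maps to one node. */
#define NUM_NODES   10
#define NODE_WEIGHT 10

static int pick_node(unsigned int hash)
{
    unsigned int bucket = hash % (NUM_NODES * NODE_WEIGHT);
    /* e.g. hash 125 -> bucket 25 -> index 2, i.e. the third node */
    return bucket / NODE_WEIGHT;
}
```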

From the results I can tell for sure that the multi-round hashing algorithms from libssl (MD5, SHA-1, RIPEMD, SHA-2) offer a much better distribution than a trivial hashing function, so we're looking to replace the current algorithm with SHA-1, because SHA-2 takes more time without significant improvement.

It would be nice to push this upstream, but there are some issues. First, using these functions, the rtpengine module will depend on libssl. What is your opinion on this? I've looked into whether we can copy any of the algorithms from libssl into the code, but this is not as easy as it seems. Secondly, if you agree to replacing the algorithm, how would you approach the issue: only replace the current one, or add the possibility for the user to select between multiple algorithms?

Any suggestions of testing scenarios or hashing algorithms that we have not considered are very welcome.

The results of the tests are attached to this post.
h_funcs_compare.txt

@miconda (Member) commented Mar 27, 2019

If you want to use crypto hashing from libcrypto, the best way to do it is to add what you need to the crypto module and then have an option via a modparam in the rtpengine module to bind to the crypto module and use that function. Otherwise, it should use the internal hashing function (which can be changed from what it does now to another hash function from those available in the core).

Not exactly the same, but a similar solution was made for having a better "random" Call-ID for requests generated by Kamailio. That is also done via the "crypto" module, but by setting a core callback instead of exporting an inter-module API. See this parameter:

@ionutionita92 (Contributor, Author) commented Mar 28, 2019

Great, exactly what I need. The only question I have is: can I break the tm module if I use generate_callid()? I see a static counter in crypto_generate_callid(). At first glance it looks safe, even though using tm + rtpengine won't generate the same results as tm only.

@miconda (Member) commented Mar 28, 2019

You do not have to use what exists now, like generate_callid(); that can stay as it is and be used only by tm.

You can add something in a similar fashion, so that the code depending on libcrypto/libssl lives in the crypto module and you call it from rtpengine. If you don't know how that can be done, add the function you want to the crypto module and I will export it to be used by rtpengine.
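
A rough sketch of what such a helper in the crypto module could look like, assuming libcrypto's SHA1(); the function name and signature are only illustrative, not an actual export:

```c
/* Hash an arbitrary string (e.g. a Call-ID) down to an unsigned int
 * by taking the first 4 bytes of its SHA-1 digest. Sketch only. */
#include <openssl/sha.h>
#include <string.h>

unsigned int crypto_str_hash(const char *s, size_t len)
{
    unsigned char digest[SHA_DIGEST_LENGTH];
    unsigned int h;

    SHA1((const unsigned char *)s, len, digest);
    memcpy(&h, digest, sizeof(h));  /* truncate the 20-byte digest */
    return h;
}
```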

@ionutionita92 (Contributor, Author) commented Mar 28, 2019

I see. I'll write a function in the crypto module that generates a hash for a given value and export it through srapi. Is that OK?

@miconda (Member) commented Mar 28, 2019

I think that option is OK, too. Maybe other parts of the code will want to use such a hashing function in the future. That's the reason I added it this way for the Call-ID, thinking that maybe even requests generated in other parts to be sent stateless will use it.

Otherwise, there is a way to export a function using the module interface. You can look at the rr module: it exports load_rr(...) via the module interface, and then in rr/api.h there is an inline function load_rr_api(...) that makes it easy for other modules to load the exported API. Then you can look at the path or uac modules to see how they load the rr API and use it.
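
A sketch of that binding pattern, modeled on rr/api.h, but with hypothetical names for a crypto hashing API; apart from find_export(), which comes from the core module interface, all identifiers here are made up for illustration:

```c
/* Hypothetical crypto/api.h fragment following the load_rr_api() pattern. */
#include "../../core/sr_module.h"   /* find_export(); path as in the 5.x tree */

typedef unsigned int (*crypto_hash_f)(const char *s, size_t len);

typedef struct crypto_binds {
    crypto_hash_f hash;             /* e.g. the SHA-1 based helper above */
} crypto_binds_t;

typedef int (*load_crypto_f)(crypto_binds_t *cb);

/* Other modules (e.g. rtpengine) call this to bind the exported API. */
static inline int load_crypto_api(crypto_binds_t *cb)
{
    load_crypto_f load_crypto;

    load_crypto = (load_crypto_f)find_export("load_crypto", 0, 0);
    if (load_crypto == NULL)
        return -1;

    return load_crypto(cb);
}
```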

@ionutionita92 (Contributor, Author) commented Mar 28, 2019

Great. Thank you for the tips!

@henningw (Contributor) commented Mar 29, 2019

Just to add another comment here:
Using a "real" cryptographic hash function is of course better from a mathematical point of view. But it might be good enough to just use the CRC32 function that, e.g., carrierroute also uses. If I remember the 1&1 internal test suite well enough, it is quite OK for a larger number of requests (+/- 1% accuracy). Have a look at the carrierroute module and at core/hash_func.[c,h]. This could save you a bit of work and would be usable without the openssl dependency.
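
For illustration, a standalone bitwise CRC-32 (standard reflected polynomial 0xEDB88320), not the actual code from core/hash_func.[c,h]; the result would then be reduced modulo the total node weight as in the test application above:

```c
#include <stdint.h>
#include <stddef.h>

/* Plain bitwise CRC-32 (IEEE, reflected). Illustration only. */
static uint32_t crc32_hash(const char *s, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int b;

    for (i = 0; i < len; i++) {
        crc ^= (unsigned char)s[i];
        for (b = 0; b < 8; b++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Example: crc32_hash(callid, len) % (NUM_NODES * NODE_WEIGHT) picks the bucket. */
```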

@miconda added the enhancement label Apr 1, 2019

@henningw (Contributor) commented Apr 25, 2019

@ionutionita92 any news on this enhancement idea from your side?

@ionutionita92 (Contributor, Author) commented Apr 30, 2019

@ionutionita92 any news on this enhancement idea from your side?

The patch is ready; I have to test it in our systems to see how it behaves, but lately I've been busy with an internal project and didn't have much time for this. I will try to come up with an update as soon as possible.

@henningw (Contributor) commented Apr 30, 2019

Thank you for the feedback, no hurry necessary.
