Possible locking in SCTP/IP4 rcv thread #633
There is, for example, a global table for looking up the association. In the kernel we use a lock to protect this data structure and differentiate between 'locking for reading' and 'locking for writing'. Multiple readers are allowed to use the data structure at a time, but only a single writer. This is all managed by using macros. The code is written in a way that you can have multiple threads getting packets and calling the input processing function, so you might want to replace the single receive thread by multiple threads.
@tuexen
Thanks for the reply!
We did exactly this. Our hash function, which selects the worker thread for an incoming packet, is based on the verification tag:

```c
/*
 * All packets for one association are handled by a single worker thread.
 * The target thread choice is based on the verification tag from the
 * SCTP common header. For raw IPv6 sockets the kernel does not deliver
 * the IP header, so the mbuf starts directly at the SCTP header; for
 * raw IPv4 sockets the IP header has to be skipped first.
 */
int get_target_thread(struct mbuf **recvmbuf, bool ipv6)
{
    struct sctphdr *sh;

    if (ipv6) {
        sh = mtod(recvmbuf[0], struct sctphdr *);
    } else {
        struct ip *iphdr = mtod(recvmbuf[0], struct ip *);
        sh = (struct sctphdr *)((caddr_t)iphdr + sizeof(struct ip));
    }
    /* v_tag is in network byte order, but for distributing the load
     * the byte order does not matter. */
    uint32_t v_tag = sh->v_tag;
    return v_tag % worker_threads_number;
}
```
We also thought about this and already changed it. The commit is mentioned by @cukiernick above. Do you think there is any other possible area where performance can be improved?
That looks exactly like what I had in mind, except for one replacement. Can I take (part of) the patch and integrate it?
I think that is it from the perspective of distributing the load on multiple cores. Beyond that, you can try to reduce the CPU load itself.
For more optimisations I guess you need to run the stack with the workload you are expecting and do some profiling. I would expect that you can gain some performance by organising the data in a way that reduces the number of cache misses, for example.
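As one illustration of organising data to reduce cache misses, per-thread hot data can be padded to cache-line size so that counters updated by different worker threads never share a line (avoiding false sharing). This is a generic sketch; the 64-byte line size, the stats structure, and the field names are assumptions, not part of the stack:

```c
/* Hypothetical sketch: per-worker statistics padded and aligned to a
 * cache line so that counters belonging to different worker threads do
 * not ping-pong between cores. 64-byte lines are assumed. */
#include <stdint.h>

#define CACHE_LINE  64
#define MAX_WORKERS 8

/* Each worker's hot counters occupy their own cache line. */
struct worker_stats {
    uint64_t packets_in;
    uint64_t bytes_in;
    char pad[CACHE_LINE - 2 * sizeof(uint64_t)];
} __attribute__((aligned(CACHE_LINE)));

static struct worker_stats stats[MAX_WORKERS];

/* Update the calling worker's own line only; no sharing with others. */
void count_packet(int worker, uint64_t len)
{
    stats[worker].packets_in++;
    stats[worker].bytes_in += len;
}
```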
Sure, feel free to use the code.
Great, thanks!
We are running our application in a container environment on AMD64.
We already did this using some help from @weinrank's code - kanakagrawal@e46c0f4
This sounds interesting. @protodef you had some similar thoughts. How about sharing them here?
Thanks! We'll think on this.
Just to add to this, we also tried pinning the worker threads to CPU cores. Any help in this regard is appreciated. Thanks!
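On Linux, pinning a thread to a core can be done with `pthread_setaffinity_np()` (a non-portable GNU extension that requires `_GNU_SOURCE`). A minimal sketch, assuming the caller decides the thread-to-core mapping:

```c
/* Hypothetical sketch of pinning the calling worker thread to a single
 * CPU core. The mapping of workers to cores is left to the caller. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to the given core; returns 0 on success. */
int pin_self_to_core(int core)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```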
Any reason you can't use the kernel SCTP of the host? I think for TCP or UDP based applications in container environments this is done... |
Not off the top of my head. I guess using some profiler should give you some hints. I know that @stewrtrs did some rounds of optimisations for the kernel stack years ago. But this hasn't been done for a while, and especially not for the userland stack. So I guess there are some low-hanging fruits...
Our application is a load balancer for a Kubernetes environment, and we needed the ability to bind an address to a socket even when that address is not configured on the machine. So we modified some parts of this userland stack to make that happen. This was not possible with kernel SCTP, as we couldn't modify anything in the kernel for the container environment.
Ahh, I see. This also explains why you are optimising for performance. Thanks for the explanation. |
I have a question, since I'm not familiar with the environment you are using: it seems that the load balancer terminates the incoming SCTP associations. Is that needed? Do the nodes behind the load balancer appear under the same IP addresses as the load balancer? I was wondering if you could have a load balancer with its own IP address. Peers send the packet containing the INIT chunk to the load balancer, the load balancer decides which node should handle it, and forwards the packet containing the INIT chunk to that node. That node would send back an INIT-ACK, using its own addresses, and a new, yet to be defined parameter saying "you sent your packet containing the INIT chunk to the address contained in this parameter", so the peer can match it. Would this work? Does it make sense?
The load balancer currently terminates the incoming SCTP associations. The peer sending traffic doesn't know the IP addresses of the nodes behind the load balancer. Let me describe our use case in detail: we are building a load balancer for the communication between the RAN and the AMF in 5G telecom, which imposes some specific requirements.
I couldn't fully understand what you mean, but I think you are assuming the client and server can talk directly after the INIT packet is handled, using the information in the INIT-ACK. This won't work for us, because the servers' IP addresses are behind the load balancer and not exposed to the client.
OK. (although I don't like the load balancer spoofing the source addresses)
Why hide this failure? I don't think you can guarantee the sequencing or reliability requirements when this happens. Wouldn't it be much simpler if the client recovered the messages it wants and reestablished a new association with the load balancer?
The client sends an INIT chunk to the address of the load balancer. It must know this address in advance. When the load balancer receives that packet, it decides which server will handle this association and forwards the packet containing the INIT chunk to the server. The server receives this packet (it still has the source address of the client, since the load balancer did not modify the packet) and responds with an INIT-ACK, using its own addresses. To allow the client to match the INIT-ACK with the INIT it has sent earlier, the server adds a new parameter to the INIT-ACK, let's call it INIT-FORWARDED, which contains the IP address of the load balancer. So this extension would need to be supported by the client and the server, and would allow the load balancer to deal only with packets containing INIT chunks. The scalability should therefore be much better, and you don't have a single point of failure.
Why can't you expose the addresses of the servers to the client? The above solution would do this, but I don't see why that is not possible...
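For concreteness, SCTP parameters follow a type/length/value format (RFC 4960, Section 3.2.1), so the proposed parameter could be laid out as below. This is purely hypothetical: the INIT-FORWARDED parameter does not exist, the struct is invented, and the type value is made up (a real extension would need an IANA-assigned value).

```c
/* Hypothetical wire layout for the proposed INIT-FORWARDED parameter,
 * following the standard SCTP TLV parameter format (RFC 4960,
 * Section 3.2.1). The type value 0x8123 is made up. Only the IPv4
 * case is sketched here. */
#include <stdint.h>
#include <arpa/inet.h>

#define INIT_FORWARDED_PARAM_TYPE 0x8123  /* hypothetical, not assigned */

struct init_forwarded_param {
    uint16_t type;     /* INIT_FORWARDED_PARAM_TYPE, network byte order */
    uint16_t length;   /* total length incl. header, network byte order */
    uint32_t lb_addr;  /* IPv4 address of the load balancer the packet
                        * containing the INIT chunk was sent to */
};

/* Fill in the parameter for a given load balancer IPv4 address
 * (lb_addr already in network byte order). */
void build_init_forwarded(struct init_forwarded_param *p, uint32_t lb_addr)
{
    p->type = htons(INIT_FORWARDED_PARAM_TYPE);
    p->length = htons(sizeof(*p));
    p->lb_addr = lb_addr;
}
```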
Hi,
We are using this userland SCTP stack with a raw socket for our application. There was a performance issue: we could not get beyond ~40k TPS. We found that the bottleneck was the SCTP/IP4 rcv thread and tried to parallelize it. To do that, we created n worker threads to do the actual work of the stack. The SCTP/IP4 rcv thread only reads packets from the raw socket and passes them to the worker threads. Using a hash function, it is ensured that all packets of the same association go to the same worker thread. This increased the performance a lot (to around 260k TPS). I think more TPS can be supported, but there is still some locking between the worker threads. The SCTP/IP4 rcv thread was waiting for worker threads to be ready, and this caused a performance drop, even though the CPU usage of the worker threads was only around 50%.
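The dispatch described above can be sketched as a per-worker packet queue fed by the receive thread. This is a minimal illustration with invented names, using a mutex/condvar queue; the push side returns an error instead of blocking, since a blocking hand-off is exactly where the receive thread would stall on busy workers:

```c
/* Hypothetical sketch of the per-worker hand-off: the receive thread
 * hashes each packet to a worker and pushes it onto that worker's
 * queue, so all packets of one association stay on one thread. */
#include <pthread.h>
#include <string.h>

#define QUEUE_DEPTH 1024

struct pkt_queue {
    void *slots[QUEUE_DEPTH];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
};

void queue_init(struct pkt_queue *q)
{
    memset(q, 0, sizeof(*q));
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

/* Called by the receive thread. Returns 0, or -1 if the queue is full
 * (instead of blocking, the caller could batch or drop). */
int queue_push(struct pkt_queue *q, void *pkt)
{
    int rc = -1;

    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_DEPTH) {
        q->slots[q->tail] = pkt;
        q->tail = (q->tail + 1) % QUEUE_DEPTH;
        q->count++;
        rc = 0;
        pthread_cond_signal(&q->nonempty);
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Called by a worker thread; blocks until a packet is available. */
void *queue_pop(struct pkt_queue *q)
{
    void *pkt;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->nonempty, &q->lock);
    pkt = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return pkt;
}
```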
Can you tell us if there is some global data being accessed across associations for which there might be locking? Our assumption was that all locking would be within an association, so ensuring that a single worker thread processes all packets of an association would reduce the performance impact of locking.