Publish rate limit on broker to avoid OOM #5513
There's already a per-connection limit on the max outstanding entries between the broker and bookies. Do you have a precise scenario where this happens?
We have users running load tests against the Pulsar broker. The behavior we have observed is as follows: the disks are overwhelmed, entries queue up on the broker side, and as the load test continues the broker goes OOM. The tests were done on 2.4.2; they are upgrading the cluster to 2.5.0 and will run the test again. Regarding the feature you introduced in #3985, we have asked them to try it out. However, they can't really predict the traffic and set a good publish rate in advance. Hence we are looking for a better solution at the broker level. We are thinking of reusing the mechanism introduced in #3985: disable autoRead when the number of pending requests exceeds a threshold, or when direct memory usage exceeds a threshold.
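The idea above can be sketched as a small, self-contained model. All class, method, and threshold names here are illustrative, not Pulsar's actual code; in the real broker the flag flip would translate to Netty's `channel.config().setAutoRead(false)`:

```java
// Hypothetical sketch of the proposed throttle: pause reads when pending
// publishes or pending bytes cross a threshold, and resume once the
// backlog drains below half of it. Names and thresholds are illustrative.
final class PublishThrottle {
    private final long maxPendingBytes;
    private final int maxPendingRequests;
    private long pendingBytes;
    private int pendingRequests;
    private boolean autoRead = true;

    PublishThrottle(long maxPendingBytes, int maxPendingRequests) {
        this.maxPendingBytes = maxPendingBytes;
        this.maxPendingRequests = maxPendingRequests;
    }

    /** Called when a publish request arrives from a client. */
    void onPublish(long msgBytes) {
        pendingBytes += msgBytes;
        pendingRequests++;
        if (autoRead && (pendingBytes > maxPendingBytes || pendingRequests > maxPendingRequests)) {
            autoRead = false; // stop reading from the socket; TCP back pressure reaches the client
        }
    }

    /** Called when BookKeeper acknowledges the entry. */
    void onPersisted(long msgBytes) {
        pendingBytes -= msgBytes;
        pendingRequests--;
        if (!autoRead && pendingBytes <= maxPendingBytes / 2 && pendingRequests <= maxPendingRequests / 2) {
            autoRead = true; // backlog drained below the low watermark; resume reads
        }
    }

    boolean isAutoRead() {
        return autoRead;
    }
}
```

Resuming only below half the limit (a low watermark) avoids flapping the channel on and off at the boundary.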
Hi @rdhabalia @merlimat. Thanks for your concern. And thanks @rdhabalia for your great feature in #3985.
Is it possible for you to reproduce the OOM with the master code-base? Just make sure the broker has change #3985; you don't have to enable publish-rate limiting. I know disabling auto-read was not working all the time with ByteToMessageDecoder, but I think that's taken care of in #3985. So, if you are still able to reproduce the OOM, can you please provide the steps to reproduce and the error log?
We are able to reproduce the OOM problem with the master code-base (including your code change, and without enabling publish-rate limiting). The steps to reproduce:
The heap dump of the broker shows a lot of pending entries, because the Flink job is producing more than the BookKeeper cluster can accept. All the entries accumulate in the broker and crash it with an OOM. The expectation is that the broker should degrade when reaching its capacity limit and apply back pressure to the clients; the broker shouldn't crash with an OOM. If we don't provide the capability Jia proposed in #5710, Pulsar can't be used for high-volume ingestion workloads.
I gave it a try with a similar setup and somehow couldn't reproduce it. I think the reason the broker doesn't go OOM is what @merlimat mentioned: the broker restricts the max pending publish requests per connection. Also, introducing a single counter across all topics for throttling can become a publish bottleneck for all topics, so this feature might not be recommended for most users. I would recommend depending on maxPendingRequestPerConnection rather than adding more complexity, and if that's not working, then it's worth investigating why disabling the channel still causes OOM. I have also created #5742, which allows users to configure the max pending requests per connection if needed.
I am not sure how you set things up. Are you running the Flink connector, or did you simulate it? Did you run with enough parallelism to stress-test the cluster? OOM is a common occurrence during our tests.
But the broker still faces OOM when the number of connections increases, no?
The feature we are adding is controlled by a flag. People need to pre-configure the rate a broker can accept. That rate is typically aligned with your NIC configuration, e.g. 80% of your NIC bandwidth. We might make it smarter by automatically adjusting the rate based on NIC bandwidth and memory usage, but we will always make it easy to turn on/off with a flag. What is the side effect of adding this rate limiter at the broker level?
We can't really depend on maxPendingRequestPerConnection. You mentioned: "I know disabling auto-read was not working all the time with ByteToMessageDecoder because of this reason and I think it's taken care in #3985." But the implementation Jia proposed reuses your implementation from #3985. If the problem is taken care of by #3985, then there shouldn't be a problem in #5710; if it is not taken care of by #3985, then it is a problem for both the namespace-level rate limiter and a broker-level rate limiter, so I am not sure why it would be a problem for adding a broker-level rate limiter but not for the namespace-level one.
No, I have a perf broker setup and am trying to publish messages with multiple perf-producer processes. My only point is to figure out the root cause of the OOM and address it. If the broker is going OOM with 20 topics and 20-400 producers, then do we think auto-read is not working as expected, and the broker is still accepting messages after auto-read is disabled, keeping them in memory and causing the OOM?
Yes, we think auto-read is not working as expected. The requirement from us is dead simple: I have a broker, and no matter how many clients send requests or how clients batch them, the broker should keep working normally and apply back pressure to the clients when they exceed the capacity the broker offers. The ideal solution is to disable auto-read automatically when resource usage (memory, CPU, or network) exceeds a threshold. However, achieving a perfect solution like that takes time and is usually complicated. The closest solution we can provide is a broker-level rate limiter: 1) we can limit the traffic based on an aggregate throughput (bytes/second); 2) the mechanism is already available from #3985, so we just piggyback on it.
@rdhabalia putting the OOM question aside, what are the concerns with adding a broker-level rate limiter as a feature? Looking at the features Pulsar already has, it usually offers both broker-level settings and namespace-level settings. As a feature, doesn't it make sense to also have a broker-level rate limiter in addition to the namespace-level one?
I wanted to figure out the root cause of the OOM, as that is the issue we are mainly trying to target. I had verified the auto-read behavior while implementing #3985, so I was curious about the actual issue. Also, all threads would do rate limiting against one counter, which could itself be a bottleneck, so one would want to avoid that as well.
What else can we provide to help you figure out the root cause of the OOM?
We also verified that #3985 works as expected. As I explained, the auto-read mechanism is not the problem. The problem is that we need a broker-level metric/mechanism to decide when to disable auto-read, not a namespace-based or connection-based one. That's why we need a broker-level rate limiter.
The question here is really: do we need a broker-level mechanism to meet the requirements of running Pulsar at high load? If we do, why not adopt the rate limiter approach? If we don't, what are the other approaches? (I have tried to explain above why the existing approaches don't meet our requirements.)
The current implementation of RateLimiter uses LongAdder for counting bytes and msgs. A LongAdder maintains multiple cells for its counter; they are only aggregated when the value is read, so concurrent increments rarely contend. LongAdder is also used for other metrics — shouldn't those be a concern as well?
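To illustrate the point about LongAdder: increments land in per-thread cells and are only summed on read, so the hot publish path stays cheap. The class below is illustrative, not Pulsar's actual RateLimiter:

```java
import java.util.concurrent.atomic.LongAdder;

// Illustrative counter in the style discussed above: LongAdder spreads
// increments across internal cells to avoid CAS contention; the cells
// are aggregated only when sum() is called.
final class PublishCounter {
    private final LongAdder bytes = new LongAdder();
    private final LongAdder msgs = new LongAdder();

    void record(long payloadBytes) {
        bytes.add(payloadBytes); // touches a per-thread cell, no shared lock
        msgs.increment();
    }

    long totalBytes() { return bytes.sum(); } // aggregation happens here
    long totalMsgs()  { return msgs.sum(); }
}
```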
Sure. If you feel this will be useful for your use case and your users, then we can add it. 👍
@rdhabalia @sijie Thanks for your comments. And many thanks for @rdhabalia's PR #3985; it worked well in our test.
Fixes #5513

### Motivation
Through #3985, users can set the publish rate for each topic, but the number of topics per broker is not limited, so there are cases where many topics are served by the same broker. If each topic sends too many messages, the messages cannot be sent to BookKeeper in time; they are held in the broker's direct memory, and the broker runs out of direct memory.

### Modifications
- Add a broker-level publish rate limit based on #3985.
- Add a unit test.

### Verifying this change
Unit tests passed.
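The broker-wide limit described above can be sketched as a shared byte budget per interval, in the spirit of #3985. This is a hedged sketch, not the PR's actual code; class and method names are hypothetical, and in the real broker exceeding the budget would disable auto-read on producer connections until the next interval reset:

```java
import java.util.concurrent.atomic.LongAdder;

// Illustrative broker-wide publish rate limiter: one shared byte budget
// per interval across all topics. A scheduled task resets the budget at
// each interval boundary. Names are hypothetical.
final class BrokerPublishRateLimiter {
    private final long maxBytesPerInterval;
    private final LongAdder bytesInInterval = new LongAdder();

    BrokerPublishRateLimiter(long maxBytesPerInterval) {
        this.maxBytesPerInterval = maxBytesPerInterval;
    }

    /** Records a publish; returns true if publishing should now be throttled. */
    boolean incrementAndCheckExceeded(long msgBytes) {
        bytesInInterval.add(msgBytes);
        return bytesInInterval.sum() > maxBytesPerInterval;
    }

    /** Invoked by a scheduled task at each interval boundary. */
    void resetInterval() {
        bytesInInterval.reset();
    }
}
```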
Out-of-memory is seen frequently when clients produce messages too quickly and use up all the direct memory.
This happens like this:
If in the 2nd step the direct memory does not get released, then the broker will hit an OOM.
It would be great to have a way to detect memory pressure and apply a publish rate limit when memory is low; once memory recovers, cancel the rate limit.
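One way to observe the direct-memory pressure mentioned above is the JVM's standard BufferPoolMXBean. The probe below is a sketch (the class name and threshold are illustrative); note it only sees NIO direct buffers, so a real broker using Netty's pooled arenas may need Netty-specific accounting as well:

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

// Illustrative memory-pressure probe: reads the JVM's "direct" buffer
// pool and compares usage against a configurable threshold.
final class DirectMemoryPressure {
    private final long thresholdBytes;

    DirectMemoryPressure(long thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    /** Bytes currently held by NIO direct buffers. */
    long directBytesUsed() {
        return ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class).stream()
                .filter(pool -> "direct".equals(pool.getName()))
                .mapToLong(BufferPoolMXBean::getMemoryUsed)
                .sum();
    }

    boolean underPressure() {
        return directBytesUsed() > thresholdBytes;
    }
}
```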