-
Notifications
You must be signed in to change notification settings - Fork 903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add memory limiter for the add entry request to avoid OOM #3139
base: master
Are you sure you want to change the base?
Conversation
The test come with next commit |
@zymap Have you tried configuring backpressure for the BK client and server? I think it covers your case. |
@dlg99 Thanks for your information. But looks like the maxAddsInProgressLimit only applied on the v3 request? |
I tried the backpressure with V2 protocol. It doesn't work. Because it is a server limitation, when there has a server slow to respond, the other server will respond successfully, but the client will hold the entry buffer until the slowest response is returned. Same issue: https://lists.apache.org/thread/r2yh6gxgnhzzz35o63db040wt2t6o108 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -103,6 +106,7 @@ public class BookieClientImpl implements BookieClient, PerChannelBookieClientFac | |||
private final BookieAddressResolver bookieAddressResolver; | |||
|
|||
private final long bookieErrorThresholdPerInterval; | |||
private Optional<MemoryLimitController> memoryLimitController; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final?
@@ -378,6 +414,31 @@ public void safeRun() { | |||
} | |||
} | |||
|
|||
private static class WriteAndFlushCallbackImpl implements WriteAndFlushCallback { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be probably recycled
private Optional<WriteAndFlushCallback> setMemoryLimit(final long entryId, final long entrySize) throws InterruptedException { | ||
if (getMemoryLimitController().isPresent()) { | ||
MemoryLimitController mlc = getMemoryLimitController().get(); | ||
mlc.reserveMemory(entrySize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should handle this with a timeout otherwise we may block forever?
@@ -200,6 +200,10 @@ public class ClientConfiguration extends AbstractConfiguration<ClientConfigurati | |||
protected static final String CLIENT_CONNECT_BOOKIE_UNAVAILABLE_LOG_THROTTLING = | |||
"clientConnectBookieUnavailableLogThrottling"; | |||
|
|||
// client memory limit options | |||
protected static final String CLIENT_MEMORY_LIMIT_ENABLED = "clientMemoryLimitEnabled"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is only for writes.
We should reflect this in the name
bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieClientImpl.java
Outdated
Show resolved
Hide resolved
@hangc0276 I don't think that the issue is resolved. @zymap what do you think ? |
--- *Motivation* When there has a bookie is slowly to respond the add entry request, and the AQ is smaller than WQ, the client will hole the entry buffer and that will cause the memory won't be released. More context: apache/pulsar#14861 *Modifications* Add the memory limit for the client to send add request.
fa285bf
to
a55b91b
Compare
The test output looks like works as expected, I will try to add more tests and clean the code. logs
|
} | ||
|
||
public long getWriteMemoryLowWaterMark() { | ||
return getInt(WRITE_MEMORY_LOW_WATER_MARK, 64 * 1024 * 1024); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getLong?
} | ||
|
||
public long getWriteMemoryHighWaterMark() { | ||
return getInt(WRITE_MEMORY_HIGH_WATER_MARK, 256 * 1024 * 1024); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getLong?
*/ | ||
public class WriteWaterMark { | ||
private static final long DEFAULT_LOW_WATER_MARK = 1610612736L; // 1.5GB | ||
private static final long DEFAULT_HIGH_WATER_MARK = 2147483648L; // 2GB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the default lowWaterMark=64MB and HighWaterMark=256MB
in ClientConfiguration, but default lowWaterMark=1.5GB and HighWaterMark=2GB
in WriteWaterMark?
@Slf4j | ||
public class WriteMemoryCounter { | ||
private final WriteWaterMark writeWaterMark; | ||
private AtomicLong sizeCounter = new AtomicLong(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final type?
|
||
public void decrementPendingWriteBytes(long size) { | ||
long usage = sizeCounter.addAndGet(-size); | ||
log.info("decrement the size to {}", usage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change to debug level?
Master issue: #3231
Motivation
When there has a bookie is slow to respond the add entry request,
and the AQ is smaller than WQ, the client will hole the entry buffer
and that will cause the memory won't be released.
More context:
apache/pulsar#14861
Modifications
Add the memory limit for the client to the send add request.