Skip to content

Conversation

@Eistern
Copy link

@Eistern Eistern commented Sep 26, 2024

What are we trying to achieve with this PR:

Fixing NoSuchElementException when using the DirectExecutorService as compressionExecutor.

Initial exception

java.util.concurrent.CompletionException: java.util.NoSuchElementException
	at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315)
	at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1807)
	at com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute(MoreExecutors.java:327)
	at java.base/java.util.concurrent.CompletableFuture.asyncRunStage(CompletableFuture.java:1818)
	at java.base/java.util.concurrent.CompletableFuture.runAsync(CompletableFuture.java:2033)
	at tech.ydb.topic.write.impl.WriterImpl.acceptMessageIntoSendingQueue(WriterImpl.java:156)
	at tech.ydb.topic.write.impl.WriterImpl.free(WriterImpl.java:293)
	at tech.ydb.topic.write.impl.WriterImpl.access$1400(WriterImpl.java:37)
	at tech.ydb.topic.write.impl.WriterImpl$WriteSessionImpl.onWriteResponse(WriterImpl.java:454)
	at tech.ydb.topic.write.impl.WriterImpl$WriteSessionImpl.processMessage(WriterImpl.java:471)
	at tech.ydb.topic.impl.SessionBase.lambda$start$0(SessionBase.java:45)
	at tech.ydb.core.impl.call.ReadWriteStreamCall.onMessage(ReadWriteStreamCall.java:127)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:657)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:644)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.util.NoSuchElementException
	at java.base/java.util.LinkedList.removeFirst(LinkedList.java:281)
	at java.base/java.util.LinkedList.remove(LinkedList.java:696)
	at tech.ydb.topic.write.impl.WriterImpl.free(WriterImpl.java:295)
	at tech.ydb.topic.write.impl.WriterImpl.moveEncodedMessagesToSendingQueue(WriterImpl.java:206)
	at tech.ydb.topic.write.impl.WriterImpl.lambda$acceptMessageIntoSendingQueue$0(WriterImpl.java:158)
	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
	... 17 more

Issue analysis

Currently, we are using SyncWriter in order to write high amount of data into the YDB topic
We've noticed that during peak load per thread (when the availableSizeBytes starts getting exceeded), we get NoSuchElementException which cause the write session to reboot and our calls to time out

Seems like it is happening due to the chain of calls

onWriteResponse -> free -> acceptMessageIntoSendingQueue -> moveEncodedMessagesToSendingQueue -> free

All elements in this chain are executed in the same thread (with the acquired lock on incomingQueue) due to the usage of the Guava's DirectExecutorService

Proposal

Pop the message from the incomingQueue only once during the completion stage of the incomingMessage.future (as incomingQueue.remove() can only target this exact incomingMessage)

@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 47.41%. Comparing base (22ed47b) to head (2f92d0a).
Report is 12 commits behind head on master.

Files with missing lines Patch % Lines
...ain/java/tech/ydb/topic/write/impl/WriterImpl.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #321      +/-   ##
============================================
+ Coverage     47.37%   47.41%   +0.03%     
- Complexity     1727     1729       +2     
============================================
  Files           304      305       +1     
  Lines         12350    12363      +13     
  Branches       1229     1233       +4     
============================================
+ Hits           5851     5862      +11     
+ Misses         6044     6043       -1     
- Partials        455      458       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pnv1 pnv1 merged commit 14c04e1 into ydb-platform:master Sep 27, 2024
8 checks passed
@Eistern Eistern deleted the bugfix/update-writer-for-direct-executor branch September 27, 2024 12:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants