-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Improvement] Support skip memory data when use multiple replicas #400
Conversation
Maybe we should remove BLOCK_RANGE filter first. |
# Conflicts: # client/src/main/java/org/apache/uniffle/client/impl/ShuffleReadClientImpl.java # client/src/main/java/org/apache/uniffle/client/request/CreateShuffleReadClientRequest.java # common/src/main/java/org/apache/uniffle/common/util/RssUtils.java # integration-test/common/src/test/java/org/apache/uniffle/test/ShuffleServerWithMemoryTest.java # storage/src/main/java/org/apache/uniffle/storage/factory/ShuffleHandlerFactory.java # storage/src/main/java/org/apache/uniffle/storage/handler/impl/MemoryClientReadHandler.java
TASK_BITMAP
support multiple replicas
Could we add a ut for multi replica filter? |
@@ -86,7 +108,7 @@ public ShuffleDataResult readShuffleData() { | |||
partitionId, | |||
lastBlockId, | |||
readBufferSize, | |||
expectedTaskIdsBitmapFilterEnable ? expectTaskIds : null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we remove expectedTaskIdsBitmapFilterEnable
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If expectedTaskIdsBitmapFilterEnable=false, expectTaskIds will be null
ShuffleServerClient shuffleServerClient) { | ||
ShuffleServerClient shuffleServerClient, | ||
Roaring64NavigableMap expectBlockIds, | ||
Roaring64NavigableMap processBlockIds) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we lazily create memory handler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like
Line 39 in 4b3d37d
private Callable<ClientReadHandler> hotHandlerCreator; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can create the handler when we need to use them and the memory handler don't need to care how to construct taskbitmap filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we lazily create memory handler?
Yes. Other handlers can be lazily created too. Should we raise a new pr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this pr need the lazy creation of memory handler. It will help us simplify the logic of this pr. You can implement it in another pr. And we merge that pr first . It's also ok to implement them in this pr at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, i will do it in this pr
Codecov Report
@@ Coverage Diff @@
## master #400 +/- ##
============================================
- Coverage 58.54% 57.60% -0.94%
+ Complexity 1611 1393 -218
============================================
Files 195 173 -22
Lines 11035 9223 -1812
Branches 966 816 -150
============================================
- Hits 6460 5313 -1147
+ Misses 4195 3573 -622
+ Partials 380 337 -43
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
this.appId = appId; | ||
this.shuffleId = shuffleId; | ||
this.partitionId = partitionId; | ||
this.readBufferSize = readBufferSize; | ||
this.shuffleServerClient = shuffleServerClient; | ||
this.expectTaskIds = expectTaskIds; | ||
this.expectedTaskIdsBitmapFilterEnable = expectedTaskIdsBitmapFilterEnable; | ||
if (expectedTaskIdsBitmapFilterEnable) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we calculate taskBitmap in the upper layer? Could we remove expectBlockIds
and processBlockIds
? It's enough to pass the paramter expectTaskIds
?
realExceptBlockIds.xor(request.getProcessBlockIds()); | ||
expectTaskIds = RssUtils.generateTaskIdBitMap(realExceptBlockIds, request.getIdHelper()); | ||
} else { | ||
expectTaskIds = request.getExpectTaskIds(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should epectTaskIds
be request#getExpectTaskIds()
instead of null
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My fault
@@ -116,14 +119,21 @@ public ClientReadHandler createSingleReplicaClientReadHandler(CreateShuffleReadH | |||
private ClientReadHandler getMemoryClientReadHandler(CreateShuffleReadHandlerRequest request, ShuffleServerInfo ssi) { | |||
ShuffleServerClient shuffleServerClient = ShuffleServerClientFactory.getInstance().getShuffleServerClient( | |||
ClientType.GRPC.name(), ssi); | |||
Roaring64NavigableMap expectTaskIds; | |||
if (request.isExpectedTaskIdsBitmapFilterEnable()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In current implement, we only enable taskbitmapFilter
when the partition is skewed. Could we turn on the filter condition for only multi replicas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Roaring64NavigableMap expectTaskIds; | ||
if (request.isExpectedTaskIdsBitmapFilterEnable()) { | ||
Roaring64NavigableMap expectTaskIds = null; | ||
if (request.isExpectedTaskIdsBitmapFilterEnable() || request.getShuffleServerInfoList().size() > 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we put the similar logic in the same place? It's more easy to manage.
Line 125 in 66c752f
this.expectedTaskIdsBitmapFilterEnable = !(mapStartIndex == 0 && mapEndIndex == Integer.MAX_VALUE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, let @zuston take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
What changes were proposed in this pull request?
Support filter data when use multiple replica
Why are the changes needed?
Now, filter the data that the client had read from the first replica is not supported when we use multiple replica. This will cause duplicate data to be read.
Does this PR introduce any user-facing change?
No
How was this patch tested?
UT