[Pubsub] Pubsub module command batch part 1 #16167
object_status_subscriber_->Subscribe(
    rpc::ChannelType::WORKER_REF_REMOVED_CHANNEL, addr.ToProto(),
    object_id.Binary(), message_published_callback, publisher_failed_callback);
    std::move(sub_message), rpc::ChannelType::WORKER_REF_REMOVED_CHANNEL,
This currently does nothing. This PR only implements the command batch in the C++ layer; the real code doesn't use it yet.
/// Subscribe
///
message Command {
  ChannelType channel_type = 1;
Probably unnecessary, as Yi pointed out before. I might remove it in a future PR.
Thanks!
src/ray/pubsub/subscriber.cc
Outdated
// Update the command in the FIFO order.
int64_t updated_commands = 0;
while (!commands_.empty() && updated_commands < command_max_batch_size_) {
  auto &command = commands_.front();
  auto *new_command = long_polling_request.add_commands();
  new_command->Swap(command.get());
  commands_.pop();
  updated_commands += 1;
}
I have some questions here. What do we do for the rest of the commands_? Do you mean that we'll keep calling this function until there is nothing left?
They will be delivered in the next long polling request. But actually, that's a good point. We should reply right away if there are commands left, so that they are consumed without waiting. I will fix that.
To be clear, here's the workflow:
The subscriber batches commands in a long polling request -> the publisher replies right away even if there are no pub messages -> the subscriber keeps sending long polling requests until all commands are served.
I can probably add this as a comment here.
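The workflow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `Command`, `LongPollingRequest`, `CommandBatcher`, and `DrainAll` are hypothetical stand-ins (the real code uses protobuf messages and an RPC client), and the "publisher replies right away" step is simulated by simply looping.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the protobuf Command message.
struct Command {
  std::string key_id;
  bool is_unsubscribe;
};

// Hypothetical stand-in for a long polling request that carries batched commands.
struct LongPollingRequest {
  std::vector<Command> commands;
};

class CommandBatcher {
 public:
  explicit CommandBatcher(std::size_t max_batch_size)
      : max_batch_size_(max_batch_size) {
    assert(max_batch_size_ > 0);
  }

  void Enqueue(Command command) { commands_.push(std::move(command)); }

  // Moves at most max_batch_size_ queued commands into one request, in FIFO order.
  LongPollingRequest MakeRequest() {
    LongPollingRequest request;
    while (!commands_.empty() && request.commands.size() < max_batch_size_) {
      request.commands.push_back(std::move(commands_.front()));
      commands_.pop();
    }
    return request;
  }

  bool HasPendingCommands() const { return !commands_.empty(); }

 private:
  std::size_t max_batch_size_;
  std::queue<Command> commands_;
};

// Simulates the subscriber side: each iteration stands for one long polling
// request that the publisher answers immediately because commands are left.
// Returns the number of commands carried by each request.
std::vector<std::size_t> DrainAll(CommandBatcher &batcher) {
  std::vector<std::size_t> batch_sizes;
  do {
    LongPollingRequest request = batcher.MakeRequest();
    batch_sizes.push_back(request.commands.size());
  } while (batcher.HasPendingCommands());
  return batch_sizes;
}
```

With a maximum batch size of 2 and five queued commands, this sketch would send three requests, so no command waits for a pub message before being delivered.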
Actually, this will be handled in the next PR, where we actually migrate all subscriptions to command-based subscription.
Could you add some comment about what's the definition of
@iycheng Sure. I will update the comment. Here, command comes from Redis pubsub terminology.
Got it. Sorry, I'm not very familiar with Redis pubsub, so I think I'm lost here.
What are the terms from other systems? Is it just sub/unsub messages?
LGTM, but I'd wait for a final pass by @iycheng before merging to confirm that his feedback was fully addressed.
// ObjectID that contains object_id. This is used when an ObjectID is stored
// inside another object ID that we do not own. Then, we must notify the
// outer ID's owner that the ID contains object_id.
Nit:
- // ObjectID that contains object_id. This is used when an ObjectID is stored
- // inside another object ID that we do not own. Then, we must notify the
- // outer ID's owner that the ID contains object_id.
+ // ID of object that contains `reference`. This is used when an ObjectID is stored
+ // inside another object that we do not own. Then, we must notify the
+ // outer object's owner that the object contains `reference`.
// No more request after that.
ASSERT_EQ(owner_client->PopRequest(), nullptr);
}
Nice tests!
Feel good about this one. But please update the shared ptr to unique ptr (or you can try &command as well).
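As an illustration of the unique ptr suggestion, here is a minimal sketch of a command queue that holds `std::unique_ptr` elements. The `Command` struct and the `PopFront` helper are hypothetical and only show the ownership pattern; they are not the PR's actual types.

```cpp
#include <cassert>
#include <memory>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for the protobuf Command message.
struct Command {
  std::string key_id;
};

// Removes the oldest command from the queue and hands sole ownership to the
// caller. std::queue::front() returns a non-const reference, so the
// unique_ptr can be moved out before pop() destroys the emptied slot.
std::unique_ptr<Command> PopFront(std::queue<std::unique_ptr<Command>> &commands) {
  assert(!commands.empty());
  std::unique_ptr<Command> command = std::move(commands.front());
  commands.pop();
  return command;
}
```

Because each command has exactly one owner (the queue, then the request being built), shared ownership is not needed here, which is presumably the motivation for the suggestion.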
local_object_manager_test is flaky on master, and the first mac build occasionally fails in Travis. This PR doesn't effectively change any external behavior, so it should be safe. Merging it.
Why are these changes needed?
Please see issue #16048 for more details. This implements the pubsub command batch to improve throughput and solve ordering issues. The proposal is at https://docs.google.com/document/d/1OnbmqiXIuyOfJwv_uEShTKrVT9t_vuxNXjX1dc02GW4/edit#heading=h.gi8hbyr0xqg7.
Related issue number
#16048
Checks
scripts/format.sh to lint the changes in this PR.