Skip to content
This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

Investigate moving replication serialization to Messenger thread #1582

Open
jkosh44 opened this issue May 19, 2021 · 2 comments
Open

Investigate moving replication serialization to Messenger thread #1582

jkosh44 opened this issue May 19, 2021 · 2 comments
Labels
performance Performance related issues or changes.

Comments

@jkosh44
Copy link
Contributor

jkosh44 commented May 19, 2021

Feature Request

Summary

Currently all replication messages are serialized in the LogSerializationTask thread. This can potential slow down the LogSerializationTask. Additionally under the sync durability and async replication configuration, transactions will have to unnecessarily wait for replication messages to be serialized before committing.

The replication messages are serialized in the following places in the primary node:
Log Record Batches:

messenger::callback_id_t destination_cb =
messenger::Messenger::GetBuiltinCallback(messenger::Messenger::BuiltinCallback::NOOP);
const msg_id_t msg_id = msg.GetMessageId();
const std::string msg_string = msg.Serialize();
for (const auto &replica : replicas_) {
Send(replica.first, msg_id, msg_string, messenger::CallbackFns::Noop, destination_cb);
}

Notify OAT:

const msg_id_t msg_id = msg.GetMessageId();
const std::string msg_string = msg.Serialize();
for (const auto &replica : replicas_) {
Send(replica.first, msg_id, msg_string, messenger::CallbackFns::Noop, destination_cb);
}

Solution

It might be beneficial to move the message serialization to the Messenger thread itself. That way the LogSerializerTask won't be slowed down by message serialization.

Currently the Messenger thread maintains a map of pending messages

std::map<message_id_t, PendingMessage> pending_messages_;

struct PendingMessage {
common::ManagedPointer<zmq::socket_t> zmq_socket_;
std::string destination_id_;
ZmqMessage msg_;
bool is_router_socket_;
std::time_t last_send_time_;
};

The pending message list contains many ZmqMessages which hold the messages as strings.

/** An abstraction around ZeroMQ messages which explicitly have the sender specified. */
class ZmqMessage {
public:
/** @return The ID of this message. */
message_id_t GetMessageId() const { return message_id_; }
/** @return The callback to invoke on the source. */
callback_id_t GetSourceCallbackId() const { return source_cb_id_; }
/** @return The callback to invoke on the destination. */
callback_id_t GetDestinationCallbackId() const { return dest_cb_id_; }
/** @return The routing ID of this message. */
std::string_view GetRoutingId() const { return std::string_view(routing_id_); }
/** @return The message itself. */
std::string_view GetMessage() const { return message_; }
/** @return The raw payload of the message. */
std::string_view GetRawPayload() const { return std::string_view(payload_); }
private:
friend Messenger;
friend ZmqUtil;
/**
* Build a new ZmqMessage from the supplied information.
* @param message_id The ID of this message.
* @param source_cb_id The callback ID of the message on the source.
* @param dest_cb_id The callback ID of the message on the destination.
* @param routing_id The routing ID of the message sender. Roughly speaking, "who sent this message".
* @param message The contents of the message.
* @return A ZmqMessage encapsulating the given message.
*/
static ZmqMessage Build(message_id_t message_id, callback_id_t source_cb_id, callback_id_t dest_cb_id,
const std::string &routing_id, std::string_view message);
/**
* Parse the given payload into a ZmqMessage.
* @param routing_id The message's routing ID.
* @param message The message for the destination.
* @return A ZmqMessage encapsulating the given message.
*/
static ZmqMessage Parse(const std::string &routing_id, const std::string &message);
/** Construct a new ZmqMessage with the given routing ID and payload. Payload of form ID-MESSAGE. */
ZmqMessage(std::string routing_id, std::string payload);
/** The routing ID of the message. */
std::string routing_id_;
/** The payload in the message, of form ID-MESSAGE. */
std::string payload_;
/** The cached id of the message. */
message_id_t message_id_;
/** The cached callback id of the message (source). */
callback_id_t source_cb_id_;
/** The cached callback id of the message (destination). */
callback_id_t dest_cb_id_;
/** The cached actual message. */
std::string_view message_;
};

Since we have multiple different types of messages I see two possible solutions

  1. Create a pending message list for each message type.
  2. Use some form of inheritance or templating so that the pending message list stores pointers to serializable objects.

Personally I think 2 might be cleaner.

@jkosh44 jkosh44 added the performance Performance related issues or changes. label May 19, 2021
@lmwnshn
Copy link
Contributor

lmwnshn commented May 19, 2021

I agree that 2 might be cleaner.

@lmwnshn
Copy link
Contributor

lmwnshn commented May 19, 2021

ah, this finally reminded me about our "async durability callbacks get swapped out".

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
performance Performance related issues or changes.
Projects
None yet
Development

No branches or pull requests

2 participants