Skip to content

HDDS-9369. Avoid deserializing OMRequest multiple times in OzoneManagerStateMachine.#5376

Merged
szetszwo merged 3 commits intoapache:masterfrom
szetszwo:HDDS-9369
Oct 4, 2023
Merged

HDDS-9369. Avoid deserializing OMRequest multiple times in OzoneManagerStateMachine.#5376
szetszwo merged 3 commits intoapache:masterfrom
szetszwo:HDDS-9369

Conversation

@szetszwo
Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

In OzoneManagerStateMachine, the same log data is deserialized to the same OMRequest multiple times. It can be avoided by setting stateMachineContext.

What is the link to the Apache JIRA

HDDS-9369

How was this patch tested?

By existing tests.

@kerneltime
Copy link
Copy Markdown
Contributor

cc @tanvipenumudy

try {
OMRequest request = OMRatisHelper.convertByteStringToOMRequest(
trx.getStateMachineLogEntry().getLogData());
final OMRequest request = (OMRequest) trx.getStateMachineContext();
Copy link
Copy Markdown
Contributor

@kerneltime kerneltime Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a line that this case depends on the deserialization done in startTransaction.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Thanks.

Copy link
Copy Markdown
Contributor

@kerneltime kerneltime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, pending CI and minor nit.

Copy link
Copy Markdown
Contributor

@duongkame duongkame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change @szetszwo. LGTM, but I think we should move the validations in preAppendTransaction to startTransaction as well.

trx.getStateMachineLogEntry().getLogData());
final OMRequest request = (OMRequest) trx.getStateMachineContext();
OzoneManagerProtocolProtos.Type cmdType = request.getCmdType();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these validations should be on startTransaction as well. The creation of UGI is quite expensive as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UGI is only for upgrade Prepare. Let's don't optimize it here.

@sodonnel
Copy link
Copy Markdown
Contributor

sodonnel commented Oct 2, 2023

In looking at this, I checked:

  public static OMRequest convertByteStringToOMRequest(ByteString byteString)
      throws InvalidProtocolBufferException {
    byte[] bytes = byteString.toByteArray();
    return OMRequest.parseFrom(bytes);
  }

And I noticed it takes the passed byteString and converts it to Bytes and then passes the bytes to OMRequest.parseFrom(). OMRequest.parseFrom also accepts a ByteString as a parameter, avoiding this copy - however, the parameter to this method is org.apache.ratis.thirdparty.com.google.protobuf.ByteString; and the parameter in the method is com.google.protobuf.ByteString.

I don't know much about how this Ratis stuff is put together, but it seems that we are paying the price for an extra copy on every message received due to "different" but likely the same classes being used behind the scenes.

@jojochuang
Copy link
Copy Markdown
Contributor

jojochuang commented Oct 2, 2023

We had the similar problem in HBase where the protobuf it used collides with the one shaded by Hadoop 3.3.
IIRC we solved it using reflection. Not the best solution IMO.

HBASE-23833

@szetszwo
Copy link
Copy Markdown
Contributor Author

szetszwo commented Oct 2, 2023

Proto 3 support wrapping a ByteBuffer to a ByteString but proto 2 does not. We may be able to copy the proto 3 wrapper code to proto 2. Then, the conversion can be done by

ByteString proto2 = wrap(proto3byteString.asReadOnlyByteBuffer());

Let me try it in a separated JIRA.

Copy link
Copy Markdown
Contributor

@duongkame duongkame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the patch @szetszwo .

// For the Leader, the OMRequest is set in trx in startTransaction.
// For Followers, the OMRequest hast to be converted from the log entry.
final Object context = trx.getStateMachineContext();
final OMRequest request = context != null ? (OMRequest) context
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the followers need to deserialize. That's OK if applyTransaction on the followers does not slow the commit phase down. Is that the case?
I think applyTransaction is done after the majority commit, just wanna double check.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... applyTransaction is done after the majority commit, ...

That is correct.

@szetszwo szetszwo merged commit 5df34e5 into apache:master Oct 4, 2023
@szetszwo
Copy link
Copy Markdown
Contributor Author

szetszwo commented Oct 4, 2023

@kerneltime , @duongkame , @sodonnel , @jojochuang , thanks a lot for the review and ideas!

@szetszwo
Copy link
Copy Markdown
Contributor Author

szetszwo commented Oct 4, 2023

... We may be able to copy the proto 3 wrapper code to proto 2. ...

Tried it a little bit but not succeeded. Many ByteString subclasses in proto 3 were missing in proto 2 and the wrap method needed them.

@jojochuang jojochuang added the hbase HBase on Ozone support label Jan 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hbase HBase on Ozone support

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants