
HDDS-10874. Freon tool DNRPCLoadGenerator should not cache client objects. #6705

Open
wants to merge 2 commits into master
Conversation

jojochuang
Contributor

What changes were proposed in this pull request?

HDDS-10874. Freon tool DNRPCLoadGenerator should not cache client objects.

The original XceiverClientSpi API always caches client objects for a specific pipeline, but in this Freon tool we want multiple clients to parallelize throughput.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10874

How was this patch tested?

Ran it on a real cluster; other than that, it leverages the existing tests.

After this change:

ozone freon dne --clients=32 --container-id=1 -t 32 -n 1000000 --ratis --read-only
rpc-payload
mean rate = 38794.07 calls/second

ozone freon dne --clients=1 --container-id=1 -t 32 -n 1000000 --ratis --read-only
count = 1000000
mean rate = 9987.42 calls/second

However, the number of clients has no impact for Ratis in write mode, or for gRPC.

HDDS-10874. Freon tool DNRPCLoadGenerator should not cache client objects.

Change-Id: Icc1adf11db4d07e970414e7d1cafcd22ad4ba40d
Change-Id: Ia22492a6633c65e00ef5f23d84e38db1a364c61f
@kerneltime
Contributor

@jojochuang why not make this the default behavior for reads on the client side? Maybe we can change the logic on the client side to have a pool of clients per container. I guess the benefit of the client cache is to avoid the setup cost when there is sequential access to the same container, but when there is parallel access to the same container we see a performance hit. Having a pool of clients per container and no upper bound on the number of clients across containers might be the right way forward. We might still need a TTL or LRU for the clients.
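
A rough, hypothetical sketch of that per-container pool idea, using made-up types rather than Ozone's actual client classes (no upper bound across containers; TTL/LRU eviction would still need to be layered on top):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch only -- not Ozone's API. Each container keeps its own
// queue of idle clients; acquire reuses one if available, otherwise creates a
// new client, so parallel readers of one container no longer share a client.
class PerContainerClientPool {
  private final ConcurrentHashMap<Long, ConcurrentLinkedQueue<Client>> pools =
      new ConcurrentHashMap<>();

  Client acquire(long containerId) {
    Client idle = pools
        .computeIfAbsent(containerId, id -> new ConcurrentLinkedQueue<>())
        .poll();
    return idle != null ? idle : new Client(containerId);   // grow on demand
  }

  void release(Client client) {
    pools.computeIfAbsent(client.containerId, id -> new ConcurrentLinkedQueue<>())
        .offer(client);                                      // return for reuse
  }

  static class Client {
    final long containerId;
    Client(long containerId) { this.containerId = containerId; }
  }
}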

@adoroszlai
Contributor

The original XceiverClientSpi API always caches client objects to a specific pipeline. But in this freon tool we wanted to have multiple clients to parallelize throughput.

Distinct clients can be acquired simply by providing a pipeline with a unique ID.
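
For example, a rough sketch of that (builder call names assumed from Ozone's client code, with pipeline and clientFactory assumed to be in scope): copying an existing pipeline but giving it a fresh random ID makes the cache treat it as a new entry, so a distinct client is created.

// Sketch only; verify the builder calls against the actual Pipeline API.
Pipeline uncachedPipeline = Pipeline.newBuilder(pipeline)
    .setId(PipelineID.randomId())
    .build();
XceiverClientSpi client = clientFactory.acquireClient(uncachedPipeline);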

@smengcl
Contributor

smengcl commented May 21, 2024

@jojochuang why not make this the default behavior for reads on client side?

@kerneltime I want to take a different angle: I wonder where the overhead is coming from in the first place.

In the case of Ratis, XceiverClientRatis itself does not appear to have much synchronization overhead, so I did a bit of digging. I think the limiting factor may be this config in each RaftClient:

raft.client.async.outstanding-requests.max

By default each RaftClient allows 100 outstanding requests, so at most 100 parallel RaftClient#send calls can be in flight at any point:

https://github.com/apache/ratis/blob/release-3.0.1/ratis-client/src/main/java/org/apache/ratis/client/impl/OrderedAsync.java#L160-L183

We should try setting raft.client.async.outstanding-requests.max to 100000 and see what happens. cc @jojochuang @szetszwo
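
For reference, a minimal sketch of one way to try that from an Ozone client, assuming the hdds.ratis.-prefixed pass-through key that is referenced later in this thread:

import org.apache.hadoop.hdds.conf.OzoneConfiguration;

// Sketch only: raise the Ratis client's outstanding-request limit through the
// Ozone configuration; the hdds.ratis. prefix is assumed to pass the value
// through to raft.client.async.outstanding-requests.max.
OzoneConfiguration conf = new OzoneConfiguration();
conf.setInt("hdds.ratis.raft.client.async.outstanding-requests.max", 100000);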

@smengcl
Contributor

smengcl commented May 21, 2024

And by the way, for gRPC we also have a similar semaphore config in XceiverClientGrpc that limits parallelization for each XceiverClient instance:

this.semaphore =
    new Semaphore(HddsClientUtils.getMaxOutstandingRequests(config));

and it is set to a default of 32:

description =
    "Controls the maximum number of outstanding async requests that can"
        + " be handled by the Standalone as well as Ratis client.")
private int maxOutstandingRequests = 32;
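
For illustration, a minimal self-contained sketch (not the actual XceiverClientGrpc code) of how such a per-client semaphore caps the number of outstanding async requests:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;

// Illustrative only: once 32 requests are in flight, the 33rd caller blocks in
// acquire() until an earlier request completes and releases a permit.
class BoundedAsyncClient {
  private final Semaphore semaphore = new Semaphore(32);

  CompletableFuture<String> sendAsync(String request) throws InterruptedException {
    semaphore.acquire();
    return CompletableFuture.supplyAsync(() -> "reply to " + request)
        .whenComplete((reply, error) -> semaphore.release());
  }
}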

@jojochuang
Contributor Author

jojochuang commented May 22, 2024

Worth pointing out that the Echo request is implemented as a blocking call, so the number of threads limits the number of concurrent requests. That is consistent across all ContainerProtocolCalls APIs.

For the gRPC client, increasing hdds.ratis.raft.client.async.outstanding-requests.max made no difference; the threads were all blocked waiting for responses from the DN.

I haven't compared the RPC client. But given that 32 threads imply 32 in-flight requests, I don't expect tuning raft.client.async.outstanding-requests.max would make a difference.
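
To make the argument concrete, a small self-contained sketch (blockingEcho is a hypothetical stand-in for the blocking Echo RPC): with a fixed pool of 32 threads, each performing one blocking call at a time, no more than 32 requests can ever be in flight, so an async outstanding-request limit above 32 is never reached.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch only: the thread count, not the client's async limit, bounds concurrency.
class BlockingLoad {
  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(32);
    for (int i = 0; i < 1_000_000; i++) {
      pool.submit(BlockingLoad::blockingEcho);   // at most 32 run concurrently
    }
    pool.shutdown();
  }

  // hypothetical stand-in for the blocking Echo call issued per request
  static void blockingEcho() { }
}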

Comment on lines +45 to +46
XceiverClientSpi acquireClientUncached(Pipeline pipeline, boolean topologyAware)
throws Exception;
Contributor


Instead of adding a new method to the XceiverClientFactory interface, how about creating a new, non-caching implementation as a parent class of XceiverClientManager?
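
A hypothetical sketch of that shape, with made-up types rather than Ozone's actual classes (not the actual implementation in the commit referenced below): the parent factory always builds a fresh client and closes it on release, and the caching subclass layers the cache on top.

import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only -- not Ozone's real factory classes.
class UncachedClientFactory {
  Client acquireClient(String pipelineId) {
    return new Client(pipelineId);            // fresh client, no cache lookup
  }

  void releaseClient(Client client) {
    client.close();                           // nothing else shares it
  }

  static class Client {
    final String pipelineId;
    Client(String pipelineId) { this.pipelineId = pipelineId; }
    void close() { }
  }
}

// Caching behavior added on top by a subclass, mirroring what
// XceiverClientManager could keep doing for regular callers.
class CachingClientFactory extends UncachedClientFactory {
  private final ConcurrentHashMap<String, Client> cache = new ConcurrentHashMap<>();

  @Override
  Client acquireClient(String pipelineId) {
    return cache.computeIfAbsent(pipelineId, super::acquireClient);
  }

  @Override
  void releaseClient(Client client) {
    // cached clients stay open for reuse; eviction would close them
  }
}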

Contributor


see ad95eb8
