[#584] feat(netty): Add transport client pool for netty #771

Merged
merged 12 commits into from
Apr 3, 2023

Conversation

xumanbu
Contributor

@xumanbu xumanbu commented Mar 27, 2023

What changes were proposed in this pull request?

  1. Add TransportClient, an RPC client for Netty
  2. Add TransportClientFactory, which provides a connection pool
  3. Add TransportContext, which holds the context needed to create a TransportClientFactory and sets up Netty Channel pipelines with a TransportResponseHandler
  4. Add TransportConf, the Netty transport config created from RssConf

Why are the changes needed?

Fix: #584

Does this PR introduce any user-facing change?

add client configurations and add the ability to reuse netty clients.
Todo: update the user documentation after the netty feature is completed @xumanbu

How was this patch tested?

local test

@smallzhongfeng smallzhongfeng changed the title [#584] transport client pool for netty [#584] feat(netty): transport client pool for netty Mar 27, 2023
@codecov-commenter

codecov-commenter commented Mar 27, 2023

Codecov Report

Merging #771 (bd9c2a7) into master (3f9ba81) will increase coverage by 0.16%.
The diff coverage is 16.32%.

@@             Coverage Diff              @@
##             master     #771      +/-   ##
============================================
+ Coverage     60.95%   61.11%   +0.16%     
+ Complexity     1956     1851     -105     
============================================
  Files           244      226      -18     
  Lines         13308    10967    -2341     
  Branches       1119     1089      -30     
============================================
- Hits           8112     6703    -1409     
+ Misses         4740     3889     -851     
+ Partials        456      375      -81     
| Impacted Files | Coverage Δ |
|---|---|
| ...e/uniffle/common/netty/client/TransportClient.java | 0.00% <0.00%> (ø) |
| ...le/common/netty/client/TransportClientFactory.java | 0.00% <0.00%> (ø) |
| ...che/uniffle/common/netty/client/TransportConf.java | 0.00% <0.00%> (ø) |
| .../uniffle/common/netty/client/TransportContext.java | 0.00% <0.00%> (ø) |
| .../common/netty/handle/TransportResponseHandler.java | 0.00% <0.00%> (ø) |
| .../uniffle/common/netty/protocol/MessageDecoder.java | 0.00% <0.00%> (ø) |
| .../uniffle/common/netty/protocol/MessageEncoder.java | 0.00% <0.00%> (ø) |
| ...rg/apache/uniffle/common/config/RssClientConf.java | 98.21% <100.00%> (+4.46%) ⬆️ |

... and 26 files with indirect coverage changes

@jerqi jerqi requested a review from leixm March 28, 2023 01:58
@jerqi jerqi changed the title [#584] feat(netty): transport client pool for netty [#584] feat(netty): Add transport client pool for netty Mar 28, 2023
@leixm
Contributor

leixm commented Mar 28, 2023

@xumanbu Encoder and Decoder already added, see #742 .

jerqi
jerqi previously approved these changes Mar 28, 2023
Contributor

@jerqi jerqi left a comment

LGTM

@advancedxy
Contributor

@xumanbu would you mind updating the PR description to reflect the current status?

@xumanbu
Contributor Author

xumanbu commented Mar 28, 2023

@xumanbu Encoder and Decoder already added, see #742 .

I'll rebase with #742

@xumanbu
Contributor Author

xumanbu commented Mar 28, 2023

@xumanbu would you mind updating the PR description to reflect the current status?

done.

@advancedxy
Contributor

@xumanbu would you mind updating the PR description to reflect the current status?

done.

I think the content of "Does this PR introduce any user-facing change?" should go under "What changes were proposed in this pull request?".

As for the user-facing changes, you could describe them as "add client configurations and add the ability to reuse netty clients". You may then update the user documentation after the netty feature is completed.

@xumanbu
Contributor Author

xumanbu commented Mar 28, 2023

@xumanbu would you mind updating the PR description to reflect the current status?

done.

I think the content of "Does this PR introduce any user-facing change?" should go under "What changes were proposed in this pull request?".

As for the user-facing changes, you could describe them as "add client configurations and add the ability to reuse netty clients". You may then update the user documentation after the netty feature is completed.

Thanks for your guidance. Edited.

ClientPool clientPool = connectionPool.get(unresolvedAddress);
if (clientPool == null) {
connectionPool.putIfAbsent(unresolvedAddress, new ClientPool(numConnectionsPerPeer));
clientPool = connectionPool.get(unresolvedAddress);
Contributor

Could we use computeIfAbsent to merge these two lines?
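
A minimal sketch of what this suggestion could look like, assuming the map is a `ConcurrentMap` keyed by address. `PoolLookupSketch`, `PoolSketch`, and `poolFor` are hypothetical stand-ins for illustration, not names from this PR, and a `String` key replaces the real address type.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: names below are illustrative and not taken from the PR.
class PoolLookupSketch {
  // Stand-in for the PR's ClientPool; only the constructor argument is modeled.
  static class PoolSketch {
    final int size;

    PoolSketch(int size) {
      this.size = size;
    }
  }

  private final ConcurrentMap<String, PoolSketch> connectionPool = new ConcurrentHashMap<>();
  private final int numConnectionsPerPeer;

  PoolLookupSketch(int numConnectionsPerPeer) {
    this.numConnectionsPerPeer = numConnectionsPerPeer;
  }

  // One atomic call replaces get / putIfAbsent / get: the pool is created only
  // when the key is absent, and the mapped pool is returned either way.
  PoolSketch poolFor(String address) {
    return connectionPool.computeIfAbsent(address, key -> new PoolSketch(numConnectionsPerPeer));
  }

  public static void main(String[] args) {
    PoolLookupSketch lookup = new PoolLookupSketch(2);
    PoolSketch first = lookup.poolFor("host-a:19999");
    PoolSketch second = lookup.poolFor("host-a:19999");
    System.out.println(first == second); // prints true: the same pool is reused
  }
}
```

Besides being shorter, this form avoids constructing a throwaway pool when another thread has already registered one for the same address.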

public TransportClientFactory(TransportContext context) {
this.context = Preconditions.checkNotNull(context);
this.conf = context.getConf();
this.connectionPool = new ConcurrentHashMap<>();
Contributor

Maybe we can use JavaUtils.newConcurrentMap()

Contributor Author

OK.

public static final ConfigOption<Integer> NETTY_CLIENT_NUM_CONNECTIONS_PER_PEER = ConfigOptions
.key("rss.client.netty.client.connections.per.peer")
.intType()
.defaultValue(2)
Contributor

is this too small?

Contributor Author

I think a small default value is safer. In a Uniffle cluster with few servers, a large value will cause a lot of connections to each server.

Contributor

Makes sense.

.withDescription("netty connect to server time out mills");

public static final ConfigOption<IOMode> NETTY_IO_MODE = ConfigOptions
.key("rss.client.netty.io.mode")
Contributor

@leixm added rss.server.netty.epoll.enable for the Netty server. Do you think it would be better to rename that configuration to rss.server.netty.io.mode instead, so that the two are more consistent?

Contributor

That's fine with me; I will fix it in a later PR.


public TransportClient(Channel channel, TransportResponseHandler handler) {
this.channel = Preconditions.checkNotNull(channel);
this.handler = Preconditions.checkNotNull(handler);
Contributor

Use java.util.Objects#requireNonNull instead?

Contributor Author

Preconditions#checkNotNull will throw an NPE when the object is null, and Preconditions may be easier to use than Objects#requireNonNull.

Contributor

Preconditions#checkNotNull will throw NPE when object is null,

Objects#requireNonNull does exactly the same. The point is that we should prefer the standard library over Guava: Guava has a reputation for breaking things between versions, so depending on it can cause class conflicts in other systems that ship a different Guava version.
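
As a small illustration of the standard-library alternative discussed here, the sketch below uses `java.util.Objects.requireNonNull` in a constructor. `NullCheckSketch` is a hypothetical class, and plain `Object` fields stand in for the real channel and handler types.

```java
import java.util.Objects;

// Hypothetical sketch: Object fields stand in for the real Channel and handler types.
class NullCheckSketch {
  private final Object channel;
  private final Object handler;

  NullCheckSketch(Object channel, Object handler) {
    // requireNonNull returns its argument, so it can be used inline in an assignment,
    // and throws NullPointerException with the given message when the argument is null.
    this.channel = Objects.requireNonNull(channel, "channel must not be null");
    this.handler = Objects.requireNonNull(handler, "handler must not be null");
  }

  public static void main(String[] args) {
    try {
      new NullCheckSketch(null, new Object());
    } catch (NullPointerException e) {
      System.out.println(e.getMessage()); // prints "channel must not be null"
    }
  }
}
```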

Contributor Author

Makes sense. I'll use it instead.

@Override
protected void handleFailure(String errorMsg, Throwable cause) {
handler.removeRpcRequest(rpcRequestId);
callback.onFailure(new IOException(errorMsg, cause));
Contributor

@advancedxy advancedxy Mar 28, 2023

Should we add logging here?

Contributor Author

Makes sense. I'll add a log.

Contributor Author

The same log is already emitted in the parent method, so I think that is enough for now.

return counter.getAndIncrement();
}

public class StdChannelListener implements GenericFutureListener<Future<? super Void>> {
Contributor

It seems better to declare it as a public static class?

Contributor Author

The listener is a non-static inner class so that its methods can access the fields of the outer class, such as channel and handler.
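
The trade-off in this exchange can be sketched as follows. All names are hypothetical, and a `String` field stands in for the outer class's channel. A non-static inner class reads the enclosing instance's fields directly, while a static nested class has to be handed that state explicitly.

```java
// Hypothetical sketch: names are illustrative and not taken from the PR.
class OuterClientSketch {
  private final String channelName;

  OuterClientSketch(String channelName) {
    this.channelName = channelName;
  }

  // Non-static inner class: each instance carries an implicit reference to its
  // enclosing OuterClientSketch, so it can read channelName directly.
  class InnerListener {
    String describe() {
      return "listener for " + channelName;
    }
  }

  // Static nested class: no implicit outer reference, so the state it needs
  // must be passed in through the constructor.
  static class StaticListener {
    private final String channelName;

    StaticListener(String channelName) {
      this.channelName = channelName;
    }

    String describe() {
      return "listener for " + channelName;
    }
  }

  public static void main(String[] args) {
    OuterClientSketch client = new OuterClientSketch("channel-1");
    InnerListener inner = client.new InnerListener();
    StaticListener nested = new StaticListener(client.channelName);
    System.out.println(inner.describe().equals(nested.describe())); // prints true
  }
}
```

Either form works; the inner class is more concise when the listener needs several outer fields, while the static form makes the dependency explicit and avoids keeping the outer instance reachable through the listener.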

public TransportClientFactory(TransportContext context) {
this.context = Preconditions.checkNotNull(context);
this.conf = context.getConf();
this.connectionPool = new ConcurrentHashMap<>();
Contributor

Use JavaUtils.newConcurrentMap?

Comment on lines +148 to +159
synchronized (clientPool.locks[clientIndex]) {
cachedClient = clientPool.clients[clientIndex];

if (cachedClient != null) {
if (cachedClient.isActive()) {
logger.trace("Returning cached connection to {}: {}", resolvedAddress, cachedClient);
return cachedClient;
} else {
logger.info("Found inactive connection to {}, creating a new one.", resolvedAddress);
}
}
clientPool.clients[clientIndex] = internalCreateClient(resolvedAddress, decoder);
Contributor

Could you wrap this block of code into the ClientPool structure?

For example, clientPool.createClientIfAbsent, which would behave similarly to ConcurrentMap's computeIfAbsent?

Contributor Author

I tried adding a createClientIfAbsent function to ClientPool, like this:

TransportClient createClientIfAbsent(
    int clientIndex,
    InetSocketAddress resolvedAddress,
    ChannelInboundHandlerAdapter decoder,
    BiFunction<InetSocketAddress, ChannelInboundHandlerAdapter, TransportClient> createClientFunction) {
  clients[clientIndex] = createClientFunction.apply(resolvedAddress, decoder);
  return clients[clientIndex];
}

But it does not seem very graceful, because createClientFunction takes two unrelated arguments.
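
One possible way around the two-argument concern, offered only as a sketch and not as what the PR ended up doing, is to accept a `Supplier` so that the caller's lambda captures the address and decoder itself. `IndexedClientPoolSketch` and its nested `Client` interface are hypothetical stand-ins for ClientPool and TransportClient.

```java
import java.util.function.Supplier;

// Hypothetical sketch: a simplified pool with one lock per slot.
class IndexedClientPoolSketch {
  // Stand-in for TransportClient; only the liveness check is modeled.
  interface Client {
    boolean isActive();
  }

  private final Client[] clients;
  private final Object[] locks;

  IndexedClientPoolSketch(int size) {
    clients = new Client[size];
    locks = new Object[size];
    for (int i = 0; i < size; i++) {
      locks[i] = new Object();
    }
  }

  // Returns the cached client for the slot if it is still active; otherwise asks
  // the factory for a new one and caches it. The factory takes no arguments, so
  // the address and decoder can be captured by the caller's lambda.
  Client createClientIfAbsent(int clientIndex, Supplier<? extends Client> factory) {
    synchronized (locks[clientIndex]) {
      Client cached = clients[clientIndex];
      if (cached != null && cached.isActive()) {
        return cached;
      }
      Client created = factory.get();
      clients[clientIndex] = created;
      return created;
    }
  }

  public static void main(String[] args) {
    IndexedClientPoolSketch pool = new IndexedClientPoolSketch(2);
    int[] created = {0};
    Supplier<Client> factory = () -> {
      created[0]++;
      return () -> true; // a client that always reports itself as active
    };
    Client a = pool.createClientIfAbsent(0, factory);
    Client b = pool.createClientIfAbsent(0, factory);
    System.out.println(a == b);     // prints true: the cached client is reused
    System.out.println(created[0]); // prints 1: the factory ran only once
  }
}
```

In the factory class, a call might then look like `pool.createClientIfAbsent(index, () -> internalCreateClient(resolvedAddress, decoder))`, assuming `internalCreateClient` keeps the signature shown in the snippet under review.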

…ansportClientFactory.java


make sense

Co-authored-by: advancedxy <xianjin@apache.org>

* <p>After `onSuccess` returns, `response` will be recycled and its content will become invalid.
* Please copy the content of `response` if you want to use it after `onSuccess` returns.
*/
void onSuccess(RpcResponse rpcResponse);
Contributor

In fact, not all response types are RPCResponse, such as getInMemoryShuffleData.

Contributor Author

Will the getInMemoryShuffleData result extend RPCResponse in your design?

Contributor

GetMemoryShuffleDataResponse will extend Message.

Contributor

Why does GetMemoryShuffleDataResponse need to extend Message?

Contributor

It seems every response type will have status and retMsg attributes, so it can extend RPCResponse. My mistake.

@leixm
Contributor

leixm commented Mar 29, 2023

We better add some UTs.

@jerqi
Contributor

jerqi commented Mar 31, 2023

We better add some UTs.

Spark doesn't have UTs either. Maybe we can use integration tests to guarantee correctness once the other PRs are merged.

@leixm
Contributor

leixm commented Mar 31, 2023

We better add some UTs.

Spark doesn't have UTs either. Maybe we can use integration tests to guarantee correctness once the other PRs are merged.

Ok.

@jerqi
Contributor

jerqi commented Apr 1, 2023

@xumanbu There is one comment left. #771 (comment) Could you address it?

@xumanbu
Contributor Author

xumanbu commented Apr 3, 2023

@xumanbu There is one comment left. #771 (comment) Could you address it?

My mistake. Done.

Contributor

@jerqi jerqi left a comment

LGTM, merged. thanks all

@jerqi jerqi merged commit bc9aaaa into apache:master Apr 3, 2023
xianjingfeng pushed a commit to xianjingfeng/incubator-uniffle that referenced this pull request Apr 5, 2023
…#771)

### What changes were proposed in this pull request?
  1. Add TransportClient, an RPC client for Netty
  2. Add TransportClientFactory, which provides a connection pool
  3. Add TransportContext, which holds the context needed to create a TransportClientFactory and sets up Netty Channel pipelines with a TransportResponseHandler
  4. Add TransportConf, the Netty transport config created from RssConf

### Why are the changes needed?
Fix: apache#584

### Does this PR introduce _any_ user-facing change?
add client configurations and add the ability to reuse netty clients.
Todo: update the user documentation after the netty feature is completed @xumanbu

### How was this patch tested?
local test

Co-authored-by: jam.xu <jam.xu@vipshop.com>
Successfully merging this pull request may close these issues.

[Subtask] [Netty] Implementation of client connection management
6 participants