HDFS-16283. RBF: reducing the load of renewLease() RPC #4524
💔 -1 overall
This message was automatically generated. |
Makes sense to me to have calls only to specified namespaces. Had a quick look, have dropped some comments.
public void renewLease(String clientName) throws IOException {
public void renewLease(String clientName, String nsIdentifies)
throws IOException {
checkNNStartup();
// just ignore nsIdentifies
Better to have a check that it is null, to avoid accidentally letting a user pass some value to the Namenode and believe it is being honoured.
@@ -579,6 +581,28 @@ void updateLastLeaseRenewal() {
}
}

/**
* Get all nsIdentifies of DFSOutputStreams.
"Identifies" in the method names and arguments doesn't make sense. Can we change it to NsIdentifiers? I am also good with just namespaces/namespace.
if (nsIdentify != null && !nsIdentify.isEmpty()) {
allNSIdentifies.add(nsIdentify);
In which case can it be null or empty?
One case I can think of is when the Router is at an older version than the client, i.e. the Router doesn't have this change and the client has been upgraded.
I think that scenario should be handled: if any of the identifiers is null or empty, pass null or similar to the Router and make sure the old behaviour of sending the RPC to all namespaces stays intact.
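A minimal sketch of the fallback described in the comment above, assuming the client collects one namespace string per open output stream. `LeaseNamespaceCollector` and `collect` are hypothetical names and are not part of the patch; the assumption is that an empty result tells the Router to renew on all namespaces, as before.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, for illustration only. */
final class LeaseNamespaceCollector {
  /**
   * Returns the distinct namespaces of all open streams, or an empty list
   * if any stream has no namespace (for example, it was opened through an
   * older Router). An empty list is assumed to mean "all namespaces".
   */
  static List<String> collect(List<String> perStreamNamespaces) {
    Set<String> distinct = new LinkedHashSet<>();
    for (String ns : perStreamNamespaces) {
      if (ns == null || ns.isEmpty()) {
        // One unknown namespace is enough to fall back to the old behaviour.
        return Collections.emptyList();
      }
      distinct.add(ns);
    }
    return new ArrayList<>(distinct);
  }
}
```

With this shape, a single stream without a namespace disables the optimisation for that renewLease call rather than risking a missed renewal.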
@@ -763,7 +763,7 @@ SnapshotStatus[] getSnapshotListing(String snapshotRoot)
* @throws IOException If an I/O error occurred
*/
@Idempotent
void renewLease(String clientName) throws IOException;
void renewLease(String clientName, String allNSIdentifies) throws IOException;
Add detail about the new argument in the javadoc as well
if (nsIdentifies == null || nsIdentifies.isEmpty()) {
return new ArrayList<>(namenodeResolver.getNamespaces());
}
String[] nsIdList = nsIdentifies.split(",");
First we do a String join at the client, then we split here. Can't we pass an array/set/list, whichever is possible, and get rid of this join & split overhead during the call?
namespaceInfo = new FederationNamespaceInfo("", "", nsId);
nsNameSpaceInfoCache.put(nsId, namespaceInfo);
I didn't follow the logic of creating a new FederationNamespaceInfo here. You have a cached Map, which is empty. You do a get, it returns null, you enter the if block and create this explicitly. Why aren't we initialising the cached map from namenodeResolver.getNamespaces()? Or, in case we don't find it in the cached map, why don't we go ahead and try to find it from namenodeResolver.getNamespaces()?
if (nss.size() == 1) {
rpcClient.invokeSingle(nss.get(0).getNameserviceId(), method);
nsId is passed from the client. If we get an array or similar, you can figure out at the start whether you have only one entry or not, so can you get rid of getRewLeaseNSs(nsIdentifies); completely in that case?
fsDataOutputStream0.close();
fsDataOutputStream1.close();
Either use finally or try-with-resources for close.
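A small sketch of the try-with-resources suggestion, using plain `java.io` streams instead of `FSDataOutputStream` so that it runs on its own. `CloseDemo`, `TrackingStream` and `writeBoth` are illustrative names only; the point is that both streams are closed even when the body throws.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Illustration only: stands in for the two FSDataOutputStreams in the test. */
final class CloseDemo {
  /** A stream that records whether close() was called. */
  static final class TrackingStream extends ByteArrayOutputStream {
    boolean closed;

    @Override
    public void close() throws IOException {
      closed = true;
      super.close();
    }
  }

  /** Writes to both streams; both are closed on success and on failure. */
  static void writeBoth(TrackingStream s0, TrackingStream s1, boolean fail)
      throws IOException {
    try (OutputStream out0 = s0; OutputStream out1 = s1) {
      out0.write("a".getBytes(StandardCharsets.UTF_8));
      if (fail) {
        throw new IOException("simulated failure between the two writes");
      }
      out1.write("b".getBytes(StandardCharsets.UTF_8));
    }
  }
}
```

Resources declared in the `try` header are closed in reverse order, so no explicit `close()` calls are needed at the end of the test.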
FSDataOutputStream fsDataOutputStream0 = routerFS.create(newTestPath0);
FSDataOutputStream fsDataOutputStream1 = routerFS.create(newTestPath1);
Does this code affect the Append flow as well?
dfsRouterFS.getClient().getLeaseRenewer().interruptAndJoin();

Path testPath = new Path("/testRenewLease0/test.txt");
FSDataOutputStream fsDataOutputStream = routerFS.create(testPath);
Test both replicated and Erasure Coded files.
Thanks @ayushtkn for your review, I learned a lot from it. Thank you again. Because the
Thanx @ZanderXu for the update. I dropped some comments, please check them; there may be some checkstyle warnings from Jenkins as well.
The last build shows some test failures too, and I think they look related, so please check those as well.
Other than that, things look good...
@@ -759,11 +759,19 @@ SnapshotStatus[] getSnapshotListing(String snapshotRoot)
* the last call to renewLease(), the NameNode assumes the
* client has died.
*
* @param namespaces The full Namespace list that the release rpc
Seems to be a typo: release -> renewLease
+1
throws IOException {
if (namespaces != null && namespaces.size() > 0) {
LOG.warn("namespaces({}) should be null or empty "
+ "on NameNode side, please check it.", namespaces);
Throw an exception here. We don't expect namespaces here, and we don't want to silently ignore such an occurrence.
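A sketch of what the comment above asks for: fail loudly instead of logging a warning. `NamespacesGuard` and `checkNamespacesUnset` are hypothetical names, and the exception type and message are assumptions, not the wording used in the merged patch.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical helper, for illustration only. */
final class NamespacesGuard {
  /**
   * The NameNode does not interpret the namespaces argument, so a
   * non-empty value indicates a caller error and is rejected.
   */
  static void checkNamespacesUnset(List<String> namespaces) throws IOException {
    if (namespaces != null && !namespaces.isEmpty()) {
      throw new IOException("namespaces(" + namespaces
          + ") must be null or empty on the NameNode side");
    }
  }
}
```

Rejecting the call makes a misrouted request visible to the client immediately, whereas a warning in the NameNode log is easy to miss.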
@@ -1450,6 +1452,95 @@ public void testProxyRestoreFailedStorage() throws Exception {
assertEquals(nnSuccess, routerSuccess);
}

@Test
public void testRewnewLease() throws Exception {
This test has become a little big. Can we split create & append into different tests? The common parts can be extracted into a util method and reused.
if (ret instanceof LastBlockWithStatus) {
((LastBlockWithStatus) ret).getFileStatus().setNamespace(ns);
}
Is this for append? If so, I don't think we should do this for every other API; we should restrict our changes to the Append code only.
Check if changing the Append code in RouterClientProtocol helps:
@Override
public LastBlockWithStatus append(String src, final String clientName,
    final EnumSetWritable<CreateFlag> flag) throws IOException {
  rpcServer.checkOperation(NameNode.OperationCategory.WRITE);
  List<RemoteLocation> locations = rpcServer.getLocationsForPath(src, true);
  RemoteMethod method = new RemoteMethod("append",
      new Class<?>[] {String.class, String.class, EnumSetWritable.class},
      new RemoteParam(), clientName, flag);
  RemoteResult result = rpcClient
      .invokeSequential(method, locations, LastBlockWithStatus.class, null);
  LastBlockWithStatus lbws = (LastBlockWithStatus) result.getResult();
  lbws.getFileStatus().setNamespace(result.getLocation().getNameserviceId());
  return lbws;
}
Map<String, FederationNamespaceInfo> allAvailableNamespaces =
getAvailableNamespaces();
We should have some caching here, like:
Initialise availableNamespace up front and check it on every call. If some entry isn't found in the stored/cached availableNamespace, call getAvailableNamespaces() and update the value of availableNamespace.
If we still don't find the entry after that, we can return all the namespaces, which is what we do now.
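One way the caching idea above could look, as a rough sketch only. It uses plain namespace strings instead of `FederationNamespaceInfo`, and `NamespaceCache`, `targets` and the `Supplier` standing in for `namenodeResolver.getNamespaces()` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical cache, for illustration only. */
final class NamespaceCache {
  // Stands in for namenodeResolver.getNamespaces().
  private final Supplier<Set<String>> resolver;
  // Snapshot of the known namespaces, replaced as a whole on refresh.
  private volatile Set<String> available;

  NamespaceCache(Supplier<Set<String>> resolver) {
    this.resolver = resolver;
    this.available = new HashSet<>(resolver.get());
  }

  private Set<String> refresh() {
    Set<String> fresh = new HashSet<>(resolver.get());
    available = fresh;
    return fresh;
  }

  /** Resolves the namespaces that a renewLease call should be sent to. */
  List<String> targets(List<String> requested) {
    if (requested == null || requested.isEmpty()) {
      return new ArrayList<>(refresh());
    }
    if (!available.containsAll(requested)) {
      // Cache miss: ask the resolver once, then re-check.
      Set<String> fresh = refresh();
      if (!fresh.containsAll(requested)) {
        // Still unknown: fall back to all namespaces, as today.
        return new ArrayList<>(fresh);
      }
    }
    return new ArrayList<>(requested);
  }
}
```

In the common case, where every requested namespace is already known, the resolver is not consulted at all.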
💔 -1 overall
This message was automatically generated. |
Thanks @ayushtkn for the good idea; I have updated the patch. For the caching I would appreciate your help, thanks.
Hmm, the caching can maybe be a follow-up after this; it might be tricky but doable. You missed a couple of comments; the rest looks almost good to me. @goiri / @Hexiaoqiao, mind giving an additional check?
Thanks @ayushtkn for your review and ideas.
💔 -1 overall
This message was automatically generated. |
@ZanderXu @ayushtkn, thanks for your great work here. After a quick glance, this seems to be one solution to improve renewLease for RBF.
Thanks @Hexiaoqiao for your solution. In the beginning, we tried carrying the paths being written to RBF to fix this issue. After running it for a while, I found some cases that also need to be fixed:
Also, the number of renewLease requests between the client and RBF will increase, depending on the number of files being written at the same time.
Thanks for the quick response.
In my experience, the cost of splitting by path for renewLease stays under control even for long-running applications such as Flink applications. (I have not observed that many files being written concurrently; it would be helpful if you could share any such cases.)
For both create and renewLease (with the file path), I think they will apply the same MountTableResolver for the same file, so this does not seem to be an issue for renewLease. Maybe there is some corner case I did not catch. Please correct me if I missed something.
Yes, that is true. I totally agree. Based on my internal production cluster, the increase will be less than 5%.
Although they all use MountTableResolver, the create and append RPCs can get the NS to which the file belongs, whereas renewLease can only obtain the full list of NSs on which the file's path is mounted. So in this case, the renewLease RPC is always forwarded to some unnecessary nameservices.
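To make the difference concrete, here is a toy model of a multi-destination mount point. It is not the real `MountTableResolver`; `RenewLeaseTargets`, `byPath` and `byNamespace` are hypothetical names, and the mount table contents are made up for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a toy mount table, not the real MountTableResolver. */
final class RenewLeaseTargets {
  // Hypothetical mount table: mount point -> every destination nameservice.
  private final Map<String, List<String>> mountTable = new HashMap<>();

  void addMount(String mountPoint, String... nameservices) {
    mountTable.put(mountPoint, Arrays.asList(nameservices));
  }

  /** Path-based lookup: all destinations of the longest matching mount point. */
  List<String> byPath(String path) {
    String best = null;
    for (String mountPoint : mountTable.keySet()) {
      boolean matches = path.equals(mountPoint)
          || path.startsWith(mountPoint + "/");
      if (matches && (best == null || mountPoint.length() > best.length())) {
        best = mountPoint;
      }
    }
    return best == null ? Collections.<String>emptyList() : mountTable.get(best);
  }

  /** Namespace-based lookup: the client reports the NS that create/append returned. */
  static List<String> byNamespace(String namespace) {
    return Collections.singletonList(namespace);
  }
}
```

For a mount point such as `/data` with three destinations, `byPath("/data/a.txt")` yields three targets, while `byNamespace("ns1")` yields one, which is the saving discussed in this thread.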
🎊 +1 overall
This message was automatically generated. |
Exactly true. For MultipleDestinationMount, a request with only the file path could be forwarded to a different NS, especially for DestinationOrder.RANDOM and related orders. cc @ayushtkn @goiri Any more feedback here? Thanks.
@Hexiaoqiao I am OK with using the path, but if I understand correctly, the only gain from using the path is that we won't be exposing the namespaces to the end client? In exchange, the namespace approach saves, I think, a bunch of RPCs, especially in the case of multi-destination mount points. So from the performance point of view, namespaces (the present approach) may be better. But I don't have any strong objections if you feel we shouldn't expose the namespaces to the end client. If there is a particular use case where we shouldn't expose namespaces to the end client, we could hide this change behind a config, and this optimisation won't work in that case. In general, though, ViewFs also knows about all namespaces, and a lot of clients usually have these namespaces defined in their configs too, so it is not a big secret. This namespace info will also be there in the back-end, and I don't think exposing it via this route can cause any security issue. But I am OK with whichever approach you folks feel is better.
@ayushtkn My proposal to use the path as a parameter of renewLease is not related to any security issue. In my opinion, having both namespaces and the router name at the client side is confusing and hurts readability, absent other strong supporting arguments.
As mentioned above, for MultipleDestinationMount it will be difficult to reduce requests to the NameNode at the Router side. (I was limited by my internal use case, where no MultipleDestination mount with DestinationOrder.RANDOM has been configured.)
Thanx @Hexiaoqiao for the details. Makes sense :-)
Changes LGTM.
Will hold for @Hexiaoqiao to have a final look before we conclude this.
checkNNStartup();
// just ignore nsIdentifies
remove this line or change it to // Ignore the namespaces.
copy, I will fix it.
@ZanderXu It almost looks good to me. I just left some nit comments, FYI. Will give my +1 once they are fixed. Thanks again.
/**
* Try to get a list of FederationNamespaceInfo for renewLease RPC.
*/
private List<FederationNamespaceInfo> getRewLeaseNSs(List<String> namespaces)
Should this method name be getRenewLeaseNSs?
getAvailableNamespaces();
for (String namespace : namespaces) {
if (!allAvailableNamespaces.containsKey(namespace)) {
return new ArrayList<>(namenodeResolver.getNamespaces());
Should we use the result directly rather than creating another ArrayList here?
namenodeResolver.getNamespaces() is a HashSet. I want a List so that we can use the invokeSingle method to forward this RPC when there is only one namespace.
List<FederationNamespaceInfo> nss = getRenewLeaseNSs(namespaces);
if (nss.size() == 1) {
  rpcClient.invokeSingle(nss.get(0).getNameserviceId(), method);
} else {
  rpcClient.invokeConcurrent(nss, method, false, false);
}
Of course, a Set can also achieve this goal.
Got it. Makes sense to me.
} | ||
} | ||
|
||
|
duplicate blank line.
@@ -759,11 +759,19 @@ SnapshotStatus[] getSnapshotListing(String snapshotRoot)
* the last call to renewLease(), the NameNode assumes the
* client has died.
*
* @param namespaces The full Namespace list that the release rpc
+1
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@ayushtkn @Hexiaoqiao Thanks for your discussion and review. I will continue to work hard to submit more patches to the community.
… Contributed by ZanderXu. Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org> Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
…3.2-bzl-hdfs-merge' HDFS-16283. RBF: reducing the load of renewLease() RPC (apache#4524). See merge request dap/hadoop!79
… Contributed by ZanderXu. Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org> Signed-off-by: Ayush Saxena <ayushsaxena@apache.org> With part of HDFS-15535. ACLOVERRIDE
Description of PR
HDFS-16283: RBF: improve renewLease() to call only a specific NameNode rather than make fan-out calls
Currently RBF forwards the renewLease() RPC to all the available nameservices, so forwarding efficiency is affected by any unhealthy downstream nameservice. As more NSs are monitored by RBF, this problem becomes more and more serious.
In our prod cluster, there are 70+ nameservices, and the renewLease() RPC is often blocked.
This patch fixes this problem and works well on our cluster. The main ideas are: