
HDFS-17218. NameNode should process time out excess redundancy blocks #6176

Merged (7 commits, Dec 4, 2023)

Conversation

@haiyang1987 (Contributor) commented Oct 12, 2023

Description of PR

https://issues.apache.org/jira/browse/HDFS-17218

Currently we found that a DN will lose all of its pending DNA_INVALIDATE blocks if it restarts.
The DN deletes blocks asynchronously, so it may have many pending deletion blocks in memory.
When the DN restarts, these cached blocks may be lost. This causes some blocks in the NameNode's excess map to be leaked, and as a result many blocks end up with more replicas than expected.

Root cause
1. block1 of dn1 is chosen as excess, added to excessRedundancyMap, and added to invalidates.
2. dn1's heartbeat picks up the invalidate command.
3. dn1 executes the deletion asynchronously when it receives the command, but the service stops before the block is actually deleted, so block1 still exists on dn1.
4. At this point the NN's excessRedundancyMap still contains the block for dn1.
5. The dn is restarted; at this point the NN has not yet determined that the dn is dead.
6. When the dn restarts, an FBR is executed (processFirstBlockReport is not executed here; processReport is executed instead). Since block1 is not a new block, the processExtraRedundancy logic is not executed.

In HeartbeatManager#register(final DatanodeDescriptor d)
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java#L230-L238

// Here the current dn is still alive (its heartbeat expiry time has not been exceeded), so dn registration will not call d.updateHeartbeatState, and storageInfo.hasReceivedBlockReport() is still true.
synchronized void register(final DatanodeDescriptor d) {
  if (!d.isAlive()) {
    addDatanode(d);
    //update its timestamp
    d.updateHeartbeatState(StorageReport.EMPTY_ARRAY, 0L, 0L, 0, 0, null);
    stats.add(d);
  }
}

In BlockManager#processReport, when the restarted dn runs the FBR, the current dn is still alive and storageInfo.hasReceivedBlockReport() is true, so the processReport method will be called:
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2916-L2946

if (!storageInfo.hasReceivedBlockReport()) {
        // The first block report can be processed a lot more efficiently than
        // ordinary block reports.  This shortens restart times.
        blockLog.info("BLOCK* processReport 0x{} with lease ID 0x{}: Processing first "
            + "storage report for {} from datanode {}",
            strBlockReportId, fullBrLeaseId,
            storageInfo.getStorageID(),
            nodeID);
        processFirstBlockReport(storageInfo, newReport);
      } else {
        // Block reports for provided storage are not
        // maintained by DN heartbeats
        if (!StorageType.PROVIDED.equals(storageInfo.getStorageType())) {
          invalidatedBlocks = processReport(storageInfo, newReport);
        }
      }
      storageInfo.receivedBlockReport();
    } finally {
      endTime = Time.monotonicNow();
      namesystem.writeUnlock("processReport");
    }

    if (blockLog.isDebugEnabled()) {
      for (Block b : invalidatedBlocks) {
        blockLog.debug("BLOCK* processReport 0x{} with lease ID 0x{}: {} on node {} size {} " +
                "does not belong to any file.", strBlockReportId, fullBrLeaseId, b,
            node, b.getNumBytes());
      }
    }

When BlockManager#processReport handles this FBR, the current DatanodeStorageInfo already exists in the triplets of the BlockInfo corresponding to the reported block, so the block is not added to the toAdd list, and the addStoredBlock / processExtraRedundancy logic is not executed.
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L3044-L3085

Collection<Block> processReport(
      final DatanodeStorageInfo storageInfo,
      final BlockListAsLongs report) throws IOException {
    // Normal case:
    // Modify the (block-->datanode) map, according to the difference
    // between the old and new block report.
    //
    Collection<BlockInfoToAdd> toAdd = new ArrayList<>();
    Collection<BlockInfo> toRemove = new HashSet<>();
    Collection<Block> toInvalidate = new ArrayList<>();
    Collection<BlockToMarkCorrupt> toCorrupt = new ArrayList<>();
    Collection<StatefulBlockInfo> toUC = new ArrayList<>();
    reportDiff(storageInfo, report,
                 toAdd, toRemove, toInvalidate, toCorrupt, toUC);

    DatanodeDescriptor node = storageInfo.getDatanodeDescriptor();
    // Process the blocks on each queue
    for (StatefulBlockInfo b : toUC) {
      addStoredBlockUnderConstruction(b, storageInfo);
    }
    for (BlockInfo b : toRemove) {
      removeStoredBlock(b, node);
    }
    int numBlocksLogged = 0;
    for (BlockInfoToAdd b : toAdd) {
      addStoredBlock(b.stored, b.reported, storageInfo, null,
          numBlocksLogged < maxNumBlocksToLog);
      numBlocksLogged++;
    }
    if (numBlocksLogged > maxNumBlocksToLog) {
      blockLog.info("BLOCK* processReport: logged info for {} of {} " +
          "reported.", maxNumBlocksToLog, numBlocksLogged);
    }
    for (Block b : toInvalidate) {
      addToInvalidates(b, node);
    }
    for (BlockToMarkCorrupt b : toCorrupt) {
      markBlockAsCorrupt(b, storageInfo, node);
    }

    return toInvalidate;
  }
So the block of dn1 will exist in excessRedundancyMap forever (until an HA switchover is performed).

BlockManager#processChosenExcessRedundancy adds the redundancy of the given block stored on the given datanode to the excess map:
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L4267-L4285

private void processChosenExcessRedundancy(
      final Collection<DatanodeStorageInfo> nonExcess,
      final DatanodeStorageInfo chosen, BlockInfo storedBlock) {
    nonExcess.remove(chosen);
    excessRedundancyMap.add(chosen.getDatanodeDescriptor(), storedBlock);
    //
    // The 'excessblocks' tracks blocks until we get confirmation
    // that the datanode has deleted them; the only way we remove them
    // is when we get a "removeBlock" message.
    //
    // The 'invalidate' list is used to inform the datanode the block
    // should be deleted.  Items are removed from the invalidate list
    // upon giving instructions to the datanodes.
    //
    final Block blockToInvalidate = getBlockOnStorage(storedBlock, chosen);
    addToInvalidates(blockToInvalidate, chosen.getDatanodeDescriptor());
    blockLog.debug("BLOCK* chooseExcessRedundancies: ({}, {}) is added to invalidated blocks set",
        chosen, storedBlock);
  }

But because the dn side has not actually deleted the block, processIncrementalBlockReport will never be called for it, so the block of dn1 cannot be removed from excessRedundancyMap.

Solution
Add logic to the NameNode to handle excess redundancy block timeouts and resolve the current issue:
if the NN determines that an excess redundancy block on a DN has timed out, it re-adds the block to invalidates.
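
A minimal sketch of that idea, reusing names that appear later in the review thread (ExcessBlockInfo, getExcessRedundancyMap, addToInvalidates); the method name, the timestamp accessor, and the datanode lookup below are illustrative assumptions, not necessarily the exact patch code:

// Sketch only: runs inside BlockManager while holding the namesystem write lock.
// ExcessBlockInfo is assumed to record when the replica became excess.
void checkExcessRedundancyTimeout(long timeoutMs) {
  final long now = Time.monotonicNow();
  for (Map.Entry<String, LightWeightHashSet<ExcessBlockInfo>> entry :
      excessRedundancyMap.getExcessRedundancyMap().entrySet()) {
    // Resolve the DataNode that owns this set of excess replicas.
    final DatanodeDescriptor dn = datanodeManager.getDatanode(entry.getKey());
    if (dn == null) {
      continue; // the node is gone; dead-node handling cleans its entries up
    }
    for (ExcessBlockInfo excess : entry.getValue()) {
      if (now - excess.getTimeStamp() > timeoutMs) {
        // The DataNode never confirmed the deletion, so ask it to delete again.
        addToInvalidates(excess.getBlockInfo(), dn);
      }
    }
  }
}

The block stays in excessRedundancyMap until the DN finally confirms the deletion; the timeout only re-issues the invalidate command.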

@@ -1007,6 +1013,7 @@ public void updateRegInfo(DatanodeID nodeReg) {
for(DatanodeStorageInfo storage : getStorageInfos()) {
if (storage.getStorageType() != StorageType.PROVIDED) {
storage.setBlockReportCount(0);
storage.setBlockContentsStale(true);
Contributor

Why set content stale here?

Contributor Author

Thanks @zhangshuyan0 for your comment.

The modification here may not be directly related to the current problem.
The reason for making it is that if the current dn re-registers, block deletions or other exceptions may have occurred on the dn during that period; because the FBR has not yet completed, the NN's in-memory records may differ from the blocks actually on the dn.

If processExtraRedundancyBlock is executed at this time, block loss may occur.
For example, a file has 2 replicas but only 1 replica is expected: when processExtraRedundancyBlock is executed, a live dn may be chosen for deletion, which would leave the block missing. So marking the re-registering dn's storages as blockContentsStale avoids this case.

Contributor

Thanks @haiyang1987 for your reply. I understand what you mean. This patch removes the corresponding excess replicas from ExcessRedundancyMap when a DN re-registers, so the NameNode no longer knows whether those replicas are still on the re-registered DN.

Contributor

Thanks @haiyang1987 for your report. And thanks @zhangshuyan0 for your review.

I think this modification addresses a new bug, not related to this case; we should fix that bug in a new issue.

We can reproduce this bug by the following steps:

  • Assume block1 has replicas on dn1 and dn2
  • DN1 is shut down for maintenance because of a corrupt disk
  • The admin removes the corrupted disk and restarts the datanode
  • DN1 registers itself with the NameNode through the registerDatanode RPC
  • An end-user decreases the replication of the block from 2 to 1 through the setReplication RPC
  • Block1 still has both replicas in the namenode's records, but the dn1 replica no longer exists because it was stored on the corrupt disk
  • The NameNode selects dn2's replica of this block as the redundant one to delete
  • DN1 reports all of its stored blocks to the namenode through the blockreport RPC
  • The NameNode removes the dn1 replica of block1 because the block report from DN1 doesn't contain block1

After these two operations (setReplication from the end-user and the restart from the admin), block1 may lose all of its replicas.
So I think we should mark all storages as stale while the namenode processes the registerDatanode RPC, so that this case can be fixed.

@zhangshuyan0 @haiyang1987 I'm looking forward to your ideas, thanks.

Contributor

@ZanderXu Thanks for your reply. I don't think this modification addresses a new bug. Before this patch, the NameNode knew all excess replicas even when a DataNode re-registered, so it would not delete more replicas than expected.
As for the situation you just described, we can discuss it from two aspects:

  1. If the NameNode knows which replicas correspond to the corrupt disk, it will not delete the only healthy replica.
  2. If the NameNode knows nothing about the corrupt disk, the essence of the problem is that the admin manually removed some replicas without notifying the NameNode. Then, in the window between "the replica has been removed" and "the NN learns that the replica has been removed", there is always a chance that the only healthy replica will be deleted. So I think the key to solving this problem is to notify the NN immediately which disk is corrupt; restarting and re-registering may not be necessary.

Contributor

Actually, this situation can happen at any time, not just between "registerDataNode" and "blockReport". Why do you think that after the DN is re-registered, the probability of the above situation happening will increase, and it needs to be dealt with specifically?

Yes, there are some other situations that can cause this case, and I don't think the probability of the above situation happening will increase.

I just think we have a chance to reduce this case, so I think we should do it.

Contributor

About the relationship between stale storage and ExcessRedundancyMap.

  • StaleStorage is used to prevent the namenode from deleting replicas of blocks whose replicas are indeterminate.
  • ExcessRedundancyMap is used to mark the replicas of blocks that the namenode is deleting.

Contributor

@ZanderXu I think there is a misunderstanding between us. I totally agree with this change. The difference between us may be that I think it is more appropriate to merge this change with this PR instead of opening a new issue.

Contributor

StaleStorage is used to prevent the namenode from deleting replicas of blocks whose replicas are indeterminate.

Regarding StaleStorage, we can look at the comment in the code:

* At startup or at failover, the storages in the cluster may have pending
* block deletions from a previous incarnation of the NameNode. The block
* contents are considered as stale until a block report is received. When a
* storage is considered as stale, the replicas on it are also considered as
* stale. If any block has at least one stale replica, then no invalidations
* will be processed for this block. See HDFS-1972.
*/
private boolean blockContentsStale = true;

At startup or at failover, the storages in the cluster may have pending block deletions from a previous incarnation of the NameNode.

From this it can be seen that the design of the "stale contents" flag is to address the indeterminacy caused by pending deletions. By the way, if the information provided by ExcessRedundancyMap is accurate, there will be no indeterminacy caused by pending deletions.
See also: https://issues.apache.org/jira/browse/HDFS-1972

Contributor Author

Sorry for the late reply.
Thanks @ZanderXu @zhangshuyan0 for your detailed replies. They are very meaningful to me and I learned a lot.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 49s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 1s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 48m 46s trunk passed
+1 💚 compile 1m 28s trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 compile 1m 16s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 13s trunk passed
+1 💚 mvnsite 1m 25s trunk passed
+1 💚 javadoc 1m 9s trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 38s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 25s trunk passed
+1 💚 shadedclient 40m 34s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 13s the patch passed
+1 💚 compile 1m 16s the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 javac 1m 16s the patch passed
+1 💚 compile 1m 7s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 1m 7s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 2s the patch passed
+1 💚 mvnsite 1m 14s the patch passed
+1 💚 javadoc 0m 56s the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 31s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 22s the patch passed
+1 💚 shadedclient 40m 34s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 235m 26s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 42s The patch does not generate ASF License warnings.
391m 1s
Reason Tests
Failed junit tests hadoop.hdfs.server.blockmanagement.TestDatanodeManager
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/1/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 34195dc4fdd0 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / cddd018
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/1/testReport/
Max. process+thread count 2254 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 52s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 48m 13s trunk passed
+1 💚 compile 1m 23s trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 compile 1m 14s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 11s trunk passed
+1 💚 mvnsite 1m 24s trunk passed
+1 💚 javadoc 1m 8s trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 35s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 24s trunk passed
+1 💚 shadedclient 40m 6s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 12s the patch passed
+1 💚 compile 1m 15s the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 javac 1m 15s the patch passed
+1 💚 compile 1m 7s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 1m 7s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 3s the patch passed
+1 💚 mvnsite 1m 15s the patch passed
+1 💚 javadoc 0m 57s the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 29s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 28s the patch passed
+1 💚 shadedclient 40m 23s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 120m 5s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+0 🆗 asflicense 0m 44s ASF License check generated no output?
274m 1s
Reason Tests
Failed junit tests hadoop.hdfs.TestFileChecksum
hadoop.hdfs.TestMaintenanceWithStriped
hadoop.hdfs.TestErasureCodingExerciseAPIs
hadoop.hdfs.TestFileAppend2
hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy
hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy
hadoop.hdfs.TestWriteReadStripedFile
hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/2/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 50977c3b1dae 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 6024caf
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/2/testReport/
Max. process+thread count 2213 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 23m 18s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
-1 ❌ mvninstall 0m 26s /branch-mvninstall-root.txt root in trunk failed.
-1 ❌ compile 0m 35s /branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.txt hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.
-1 ❌ compile 0m 25s /branch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_382-8u382-ga-1~20.04.1-b05.txt hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05.
+1 💚 checkstyle 5m 27s trunk passed
-1 ❌ mvnsite 0m 17s /branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in trunk failed.
-1 ❌ javadoc 0m 38s /branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.txt hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.
-1 ❌ javadoc 0m 24s /branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_382-8u382-ga-1~20.04.1-b05.txt hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05.
-1 ❌ spotbugs 0m 17s /branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in trunk failed.
-1 ❌ shadedclient 10m 14s branch has errors when building and testing our client artifacts.
_ Patch Compile Tests _
-1 ❌ mvninstall 0m 25s /patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
-1 ❌ compile 0m 21s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.txt hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.
-1 ❌ javac 0m 21s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.txt hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.
-1 ❌ compile 0m 25s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_382-8u382-ga-1~20.04.1-b05.txt hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05.
-1 ❌ javac 0m 25s /patch-compile-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_382-8u382-ga-1~20.04.1-b05.txt hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05.
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 23s /buildtool-patch-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt The patch fails to run checkstyle in hadoop-hdfs
-1 ❌ mvnsite 0m 23s /patch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
-1 ❌ javadoc 1m 20s /results-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.txt hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 generated 99 new + 0 unchanged - 0 fixed = 99 total (was 0)
-1 ❌ javadoc 0m 35s /patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_382-8u382-ga-1~20.04.1-b05.txt hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05.
-1 ❌ spotbugs 0m 26s /patch-spotbugs-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
+1 💚 shadedclient 10m 41s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 25s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch failed.
+1 💚 asflicense 0m 42s The patch does not generate ASF License warnings.
48m 56s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/3/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 5a634e698c7e 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / a63b1a0
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/3/testReport/
Max. process+thread count 80 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/3/console
versions git=2.25.1 maven=3.6.3
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 51s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 48m 44s trunk passed
+1 💚 compile 1m 23s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 1m 13s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 12s trunk passed
+1 💚 mvnsite 1m 25s trunk passed
+1 💚 javadoc 1m 9s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 36s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 23s trunk passed
+1 💚 shadedclient 41m 35s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 11s the patch passed
+1 💚 compile 1m 17s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 1m 17s the patch passed
+1 💚 compile 1m 9s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 1m 9s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 3s the patch passed
+1 💚 mvnsite 1m 15s the patch passed
+1 💚 javadoc 0m 56s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 29s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 20s the patch passed
+1 💚 shadedclient 40m 27s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 242m 5s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 42s The patch does not generate ASF License warnings.
398m 3s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.TestDirectoryScanner
hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/4/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 36a7fabcf599 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 9bcc4d8
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/4/testReport/
Max. process+thread count 2292 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@haiyang1987
Contributor Author

The failed unit test seems unrelated to the change.

@haiyang1987
Contributor Author

Hi @Hexiaoqiao @zhangshuyan0 do you have any comments or suggestions about this PR? Thanks.

@@ -1819,6 +1820,19 @@ void removeBlocksAssociatedTo(final DatanodeStorageInfo storageInfo) {
storageInfo, node);
}

/** Remove the blocks to the given DatanodeDescriptor from InvalidateBlocks. */
void removeBlocksFromInvalidateBlocks(final DatanodeDescriptor node) {
Contributor

Suggest modifying the method name:
removeBlocksFromInvalidateBlocks -> removeNodeFromInvalidateBlocks

}

/** Remove the blocks to the given DatanodeDescriptor from excessRedundancyMap. */
LightWeightHashSet<BlockInfo> removeBlocksFromExcessRedundancyMap(
Contributor

removeBlocksFromExcessRedundancyMap -> removeNodeFromExcessRedundancyMap

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 53s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 48m 58s trunk passed
+1 💚 compile 1m 25s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 1m 14s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 11s trunk passed
+1 💚 mvnsite 1m 23s trunk passed
+1 💚 javadoc 1m 8s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 40s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 22s trunk passed
+1 💚 shadedclient 40m 38s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 13s the patch passed
+1 💚 compile 1m 18s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 1m 18s the patch passed
+1 💚 compile 1m 9s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 1m 9s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 2s the patch passed
+1 💚 mvnsite 1m 13s the patch passed
+1 💚 javadoc 0m 56s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 29s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 27s the patch passed
+1 💚 shadedclient 40m 59s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 244m 17s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 1m 0s The patch does not generate ASF License warnings.
400m 26s
Reason Tests
Failed junit tests hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/5/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux a943d1e8742e 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / c6cea42
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/5/testReport/
Max. process+thread count 2423 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 26s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 33m 21s trunk passed
+1 💚 compile 0m 53s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 47s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 0m 46s trunk passed
+1 💚 mvnsite 0m 56s trunk passed
+1 💚 javadoc 0m 50s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 12s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 1m 55s trunk passed
+1 💚 shadedclient 21m 22s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 40s the patch passed
+1 💚 compile 0m 47s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 47s the patch passed
+1 💚 compile 0m 41s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 0m 41s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 38s the patch passed
+1 💚 mvnsite 0m 45s the patch passed
+1 💚 javadoc 0m 37s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 59s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 1m 57s the patch passed
+1 💚 shadedclient 21m 35s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 138m 35s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+0 🆗 asflicense 0m 29s ASF License check generated no output?
231m 20s
Reason Tests
Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion
hadoop.hdfs.server.namenode.TestMetaSave
hadoop.hdfs.server.namenode.snapshot.TestRandomOpsWithSnapshots
hadoop.hdfs.server.namenode.TestCacheDirectivesWithViewDFS
hadoop.hdfs.server.namenode.TestListOpenFiles
hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength
hadoop.hdfs.server.namenode.TestNameNodeReconfigure
hadoop.hdfs.server.namenode.TestHDFSConcat
hadoop.hdfs.server.namenode.TestNamenodeRetryCache
hadoop.hdfs.server.namenode.snapshot.TestFileContextSnapshot
hadoop.hdfs.server.namenode.snapshot.TestINodeFileUnderConstructionWithSnapshot
hadoop.hdfs.server.namenode.snapshot.TestSnapshotNameWithInvalidCharacters
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/7/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 6c018a447c49 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 64dc319
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/7/testReport/
Max. process+thread count 3160 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 9s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 51m 6s trunk passed
+1 💚 compile 1m 27s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 1m 26s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 17s trunk passed
+1 💚 mvnsite 1m 25s trunk passed
+1 💚 javadoc 1m 16s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 44s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 46s trunk passed
+1 💚 shadedclient 44m 28s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 20s the patch passed
+1 💚 compile 1m 30s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 1m 30s the patch passed
+1 💚 compile 1m 14s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 1m 14s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 8s the patch passed
+1 💚 mvnsite 1m 23s the patch passed
+1 💚 javadoc 1m 4s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 34s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 42s the patch passed
+1 💚 shadedclient 43m 54s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 239m 25s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 42s The patch does not generate ASF License warnings.
407m 9s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.TestDirectoryScanner
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/6/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux ff37c399625c 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2cd0509
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/6/testReport/
Max. process+thread count 2300 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@Hexiaoqiao
Contributor

Back to this PR.

IMO, adding a timeout mechanism may not add much pressure on NameNode. However, it seems that the implementation of that solution is more complex than the current patch and requires more comprehensive design and consideration. The good aspect is that the timeout mechanism can completely solve the problem of excess replica leakage, after all, the situation where datanodes fail to successfully delete replicas according to commands may not be limited to the scenario described in this JIRA.

I totally support the solution @zhangshuyan0 mentioned here. https://issues.apache.org/jira/browse/HDFS-17218?focusedCommentId=17774766&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17774766

@haiyang1987
Contributor Author

Back to this PR.

IMO, adding a timeout mechanism may not add much pressure on NameNode. However, it seems that the implementation of that solution is more complex than the current patch and requires more comprehensive design and consideration. The good aspect is that the timeout mechanism can completely solve the problem of excess replica leakage, after all, the situation where datanodes fail to successfully delete replicas according to commands may not be limited to the scenario described in this JIRA.

I totally support the solution @zhangshuyan0 mentioned here. https://issues.apache.org/jira/browse/HDFS-17218?focusedCommentId=17774766&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17774766

Thanks @Hexiaoqiao for your comment.
Indeed, the timeout mechanism can completely solve the problem of excess replica leakage. However, the implementation cost of that solution may be relatively high: in a large cluster where balancing, replication reduction, etc. occur frequently, ExcessRedundancyMap stores a lot of information.
During each iteration, DataNodes and their corresponding blocks are fully traversed for processing. This approach might increase the time the NameNode write lock is held; of course, we can design it carefully to keep the write-lock hold time as short as possible.

The current PR solves the ExcessRedundancyMap leakage on a case-by-case basis, and its implementation cost is relatively low.
I think if similar ExcessRedundancyMap leakage problems appear again in the future, we should probably solve them case by case and find the root cause of each leak.

Of course, if we decide to adopt the timeout mechanism solution, I will submit a new PR. Looking forward to your feedback. Thanks.

@haiyang1987
Contributor Author

I tried to implement this based on the timeout mechanism solution. However, there is a case about which I have some questions, for example:

  • t1: Block1 on DN1 is chosen as excess and added to ExcessRedundancyMap.
  • t2: DN1's heartbeat picks up the invalidate command.
  • t3: Due to serious accumulation in DN1's async deletion queue, the replica might not be deleted for a prolonged period.

The question here is how the NN can define a reasonable timeframe to determine whether Block1 of DN1 in ExcessRedundancyMap has timed out.
Currently, I haven't thought of a particularly good way to define this.

Hi @Hexiaoqiao @ZanderXu @zhangshuyan0, excuse me, do you have any suggestions for this case?
Looking forward to your feedback, thanks~

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 28s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 3s codespell was not available.
+0 🆗 detsecrets 0m 3s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 53m 8s trunk passed
+1 💚 compile 1m 31s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 1m 14s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 13s trunk passed
+1 💚 mvnsite 1m 26s trunk passed
-1 ❌ javadoc 1m 11s /branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.txt hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.
+1 💚 javadoc 1m 41s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 30s trunk passed
+1 💚 shadedclient 41m 55s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 16s the patch passed
+1 💚 compile 1m 18s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 1m 18s the patch passed
+1 💚 compile 1m 19s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 1m 19s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 4s the patch passed
+1 💚 mvnsite 1m 21s the patch passed
-1 ❌ javadoc 0m 58s /patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.txt hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.
+1 💚 javadoc 2m 0s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 4m 19s the patch passed
+1 💚 shadedclient 42m 57s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 255m 25s hadoop-hdfs in the patch passed.
+1 💚 asflicense 1m 16s The patch does not generate ASF License warnings.
422m 1s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6176/1/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux a97889964d53 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 64dc319
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6176/1/testReport/
Max. process+thread count 2223 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6176/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@zhangshuyan0
Contributor

I think we can determine whether a replica in ExcessRedundancyMap has timed out based on a configured timeout parameter. As for the scenario you mentioned, I think this can be done directly: the NN determines that DN1 has timed out and sends it another delete command. Will this have any adverse effects?

@haiyang1987
Contributor Author

I think we can determine whether a replica in ExcessRedundancyMap has timed out based on a configured timeout parameter. As for the scenario you mentioned, I think this can be done directly: the NN determines that DN1 has timed out and sends it another delete command. Will this have any adverse effects?

Thanks @zhangshuyan0 for your detailed suggestions.
I think this should work. I will update the PR as soon as possible, thanks again.

@haiyang1987 changed the title from "HDFS-17218. NameNode should remove its excess blocks from the ExcessRedundancyMap When a DN registers" to "HDFS-17218. NameNode should process time out excess redundancy blocks" on Oct 30, 2023
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 53s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 49m 28s trunk passed
+1 💚 compile 1m 23s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 1m 14s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 11s trunk passed
+1 💚 mvnsite 1m 22s trunk passed
+1 💚 javadoc 1m 10s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 37s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 20s trunk passed
+1 💚 shadedclient 40m 44s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 10s the patch passed
+1 💚 compile 1m 17s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 1m 17s the patch passed
+1 💚 compile 1m 8s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 1m 8s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 3s the patch passed
+1 💚 mvnsite 1m 16s the patch passed
+1 💚 javadoc 0m 57s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 33s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 23s the patch passed
+1 💚 shadedclient 42m 23s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 242m 34s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 50s The patch does not generate ASF License warnings.
400m 55s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.TestDirectoryScanner
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/8/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 5600f898960d 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 7fa80cc
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/8/testReport/
Max. process+thread count 2521 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/8/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@haiyang1987
Contributor Author

Hi @Hexiaoqiao @ayushtkn @ZanderXu @zhangshuyan0
Could you please help review this PR again when you have free time? Thanks~

@haiyang1987
Contributor Author

The test failures seem unrelated.

try {
Iterator<Map.Entry<String, LightWeightHashSet<ExcessBlockInfo>>> iter =
excessRedundancyMap.getExcessRedundancyMap().entrySet().iterator();
while (iter.hasNext() && processed < excessRedundancyTimeoutCheckLimit) {
Contributor

If the size of excessRedundancyMap is large and there are few items that have timed out, the lock holding time of this method may be very long. It is recommended to try to avoid this situation, such as increasing the value of variable processed for every block processed, rather than just for blocks that have timed out.
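
A rough illustration of this suggestion against the loop quoted above (now, timeout, and the getTimeStamp() accessor are assumed context from the patch, not exact code):

// Count every examined block against the per-round limit, not only the
// timed-out ones, so the namesystem write lock is released after at most
// excessRedundancyTimeoutCheckLimit blocks even when few of them have timed out.
while (iter.hasNext() && processed < excessRedundancyTimeoutCheckLimit) {
  Map.Entry<String, LightWeightHashSet<ExcessBlockInfo>> entry = iter.next();
  for (ExcessBlockInfo excess : entry.getValue()) {
    processed++;                      // counted whether or not it timed out
    if (now - excess.getTimeStamp() > timeout) {
      // re-queue the timed-out replica for invalidation
    }
    if (processed >= excessRedundancyTimeoutCheckLimit) {
      break;                          // stop early; resume in the next round
    }
  }
}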

Contributor Author

Got it, I will update it later.

Thanks @zhangshuyan0 for your comment.

Contributor Author

Updated the PR.
Hi @zhangshuyan0, please help review it again when you have free time. Thanks~

Contributor

@Hexiaoqiao Hexiaoqiao left a comment

Great progress here. Leave some comments inline, PFYI.

Map.Entry<String, LightWeightHashSet<ExcessBlockInfo>> entry = iter.next();
String datanodeUuid = entry.getKey();
LightWeightHashSet<ExcessBlockInfo> blocks = entry.getValue();
List<ExcessRedundancyMap.ExcessBlockInfo> sortedBlocks = new ArrayList<>(blocks);
Contributor

ExcessRedundancyMap is redundant here.

Contributor Author

Got it, will fix.

DatanodeStorageInfo datanodeStorageInfo = iterator.next();
DatanodeDescriptor datanodeDescriptor = datanodeStorageInfo.getDatanodeDescriptor();
if (datanodeDescriptor.getDatanodeUuid().equals(datanodeUuid)) {
if (datanodeStorageInfo.getState().equals(State.NORMAL)) {
Contributor

How about combining these two conditions into one, as if (a && b) { do something; }?

Contributor Author

Got it, will fix.

final LightWeightHashSet<BlockInfo> set = map.get(dn.getDatanodeUuid());
return set != null && set.contains(blk);
final LightWeightHashSet<ExcessBlockInfo> set = map.get(dn.getDatanodeUuid());
return set != null && set.contains(new ExcessBlockInfo(blk));
Contributor

I am concerned that this will add heap footprint if we call new frequently. Is it necessary here?

if (set == null) {
return false;
}

final boolean removed = set.remove(blk);
final boolean removed = set.remove(new ExcessBlockInfo(blk));
Contributor

Same as the previous comment.

return false;
}
ExcessBlockInfo other = (ExcessBlockInfo) obj;
return (this.blockInfo.equals(other.blockInfo));
Contributor

Is it enough to compare blockInfo only? If so, we don't need to create a new instance for contains or remove, which avoids the extra heap footprint. Right?

Contributor Author

Your suggestion is reasonable. Here it is sufficient to compare blockInfo only; I will fix it to avoid the extra heap footprint.
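A minimal sketch of what comparing only blockInfo could look like (illustrative only; the hashCode shown is an assumption, and as the later SpotBugs report in this thread shows, any cross-type comparison against a raw BlockInfo needs extra care):

@Override
public boolean equals(Object obj) {
  if (this == obj) {
    return true;
  }
  if (!(obj instanceof ExcessBlockInfo)) {
    return false;
  }
  ExcessBlockInfo other = (ExcessBlockInfo) obj;
  // Equality is based on the wrapped block only, ignoring the timestamp.
  return this.blockInfo.equals(other.blockInfo);
}

@Override
public int hashCode() {
  // Must stay consistent with equals: hash on the wrapped block.
  return blockInfo.hashCode();
}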

assertEquals(0, blockManager.getPendingDeletionBlocksCount());
assertNotNull(excessDn);

// Name node will ask datanode to delete replicas in heartbeat response.
Contributor

I prefer NameNode to Name node.

Contributor Author

Got it, will fix.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 57s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 50m 4s trunk passed
+1 💚 compile 1m 25s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 1m 19s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 14s trunk passed
+1 💚 mvnsite 1m 26s trunk passed
+1 💚 javadoc 1m 11s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 38s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 30s trunk passed
+1 💚 shadedclient 41m 56s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 21s the patch passed
+1 💚 compile 1m 21s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 1m 21s the patch passed
+1 💚 compile 1m 10s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 1m 10s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 5s the patch passed
+1 💚 mvnsite 1m 19s the patch passed
+1 💚 javadoc 1m 0s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 33s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 35s the patch passed
+1 💚 shadedclient 41m 33s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 246m 10s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 50s The patch does not generate ASF License warnings.
406m 37s
Reason Tests
Failed junit tests hadoop.hdfs.TestDFSUtil
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/9/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 93d1c35b9681 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / cafe3c6
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/9/testReport/
Max. process+thread count 1978 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/9/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@haiyang1987
Contributor Author

Thanks @Hexiaoqiao for your comment, I will fix it later.

@haiyang1987
Contributor Author

Updated the PR.
Hi @Hexiaoqiao @zhangshuyan0, please help review it again when you have free time. Thanks~

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 1s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 48m 6s trunk passed
+1 💚 compile 1m 23s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 1m 14s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 15s trunk passed
+1 💚 mvnsite 1m 24s trunk passed
+1 💚 javadoc 1m 9s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 37s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 19s trunk passed
+1 💚 shadedclient 40m 27s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 13s the patch passed
+1 💚 compile 1m 16s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 1m 16s the patch passed
+1 💚 compile 1m 7s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 1m 7s the patch passed
+1 💚 blanks 0m 1s The patch has no blanks issues.
-0 ⚠️ checkstyle 1m 4s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 454 unchanged - 0 fixed = 455 total (was 454)
+1 💚 mvnsite 1m 16s the patch passed
+1 💚 javadoc 0m 56s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 30s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
-1 ❌ spotbugs 3m 25s /new-spotbugs-hadoop-hdfs-project_hadoop-hdfs.html hadoop-hdfs-project/hadoop-hdfs generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0)
+1 💚 shadedclient 40m 32s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 238m 5s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 42s The patch does not generate ASF License warnings.
393m 2s
Reason Tests
SpotBugs module:hadoop-hdfs-project/hadoop-hdfs
BlockInfo is incompatible with expected argument type ExcessRedundancyMap$ExcessBlockInfo in org.apache.hadoop.hdfs.server.blockmanagement.ExcessRedundancyMap.contains(DatanodeDescriptor, BlockInfo) At ExcessRedundancyMap.java:argument type ExcessRedundancyMap$ExcessBlockInfo in org.apache.hadoop.hdfs.server.blockmanagement.ExcessRedundancyMap.contains(DatanodeDescriptor, BlockInfo) At ExcessRedundancyMap.java:[line 70]
BlockInfo is incompatible with expected argument type ExcessRedundancyMap$ExcessBlockInfo in org.apache.hadoop.hdfs.server.blockmanagement.ExcessRedundancyMap.remove(DatanodeDescriptor, BlockInfo) At ExcessRedundancyMap.java:argument type ExcessRedundancyMap$ExcessBlockInfo in org.apache.hadoop.hdfs.server.blockmanagement.ExcessRedundancyMap.remove(DatanodeDescriptor, BlockInfo) At ExcessRedundancyMap.java:[line 105]
org.apache.hadoop.hdfs.server.blockmanagement.ExcessRedundancyMap$ExcessBlockInfo.equals(Object) checks for operand being a BlockInfo At ExcessRedundancyMap.java:a BlockInfo At ExcessRedundancyMap.java:[line 162]
Failed junit tests hadoop.hdfs.server.datanode.TestDirectoryScanner
hadoop.hdfs.TestDFSUtil
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/10/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux b9af0ea38fc7 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 77c1342
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/10/testReport/
Max. process+thread count 2218 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/10/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@haiyang1987
Contributor Author

Test failures seem unrelated.

However, the changes that make the contains and remove methods take a BlockInfo argument, together with the equals method in ExcessBlockInfo, cause some SpotBugs warnings.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 7m 19s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 21s trunk passed
+1 💚 compile 0m 44s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 44s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 0m 40s trunk passed
+1 💚 mvnsite 0m 49s trunk passed
+1 💚 javadoc 0m 42s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 2s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 1m 51s trunk passed
+1 💚 shadedclient 21m 26s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 42s the patch passed
+1 💚 compile 0m 43s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 43s the patch passed
+1 💚 compile 0m 40s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 0m 40s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 34s the patch passed
+1 💚 mvnsite 0m 44s the patch passed
+1 💚 javadoc 0m 33s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 2s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 1m 53s the patch passed
+1 💚 shadedclient 21m 21s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 185m 50s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 29s The patch does not generate ASF License warnings.
282m 34s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/11/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux e449eb1c7786 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 59e36cc
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/11/testReport/
Max. process+thread count 4426 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/11/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@haiyang1987
Contributor Author

haiyang1987 commented Nov 15, 2023

Hi @Hexiaoqiao @ayushtkn @zhangshuyan0 @tomscut @xinglin, would you mind taking a look at this PR when you have free time? Thank you very much~

Contributor

@Hexiaoqiao Hexiaoqiao left a comment

LGTM. Left one nit comment inline. Let's wait for @zhangshuyan0 to confirm. Thanks.

* less than or equal to 0, the default value is used (converted to milliseconds).
* @param timeOut The time (in seconds) to set as the excess redundancy block timeout.
*/
public void setExcessRedundancyTimeout(long timeOut) {
Contributor

timeOut -> timeout

Contributor Author

Thanks @Hexiaoqiao for your comment.
Got it, I will fix it later.
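Based on the javadoc fragment above, the setter behaves roughly as follows (a sketch only; the default constant name is an assumption):

public void setExcessRedundancyTimeout(long timeout) {
  // A non-positive value falls back to the default; the stored value is in milliseconds.
  if (timeout <= 0) {
    this.excessRedundancyTimeout =
        DFSConfigKeys.DFS_NAMENODE_EXCESS_REDUNDANCY_TIMEOUT_SEC_DEFAULT * 1000L;
  } else {
    this.excessRedundancyTimeout = timeout * 1000L;
  }
}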

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 17m 45s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 48m 14s trunk passed
+1 💚 compile 1m 23s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 1m 14s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 14s trunk passed
+1 💚 mvnsite 1m 24s trunk passed
+1 💚 javadoc 1m 8s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 38s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 19s trunk passed
+1 💚 shadedclient 40m 36s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 11s the patch passed
+1 💚 compile 1m 15s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 1m 15s the patch passed
+1 💚 compile 1m 7s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 1m 7s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 5s the patch passed
+1 💚 mvnsite 1m 15s the patch passed
+1 💚 javadoc 0m 55s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 29s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 21s the patch passed
+1 💚 shadedclient 40m 0s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 253m 43s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 46s The patch does not generate ASF License warnings.
423m 55s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.TestDirectoryScanner
hadoop.hdfs.TestRollingUpgrade
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/12/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux f4711f33a855 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 09eed20
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/12/testReport/
Max. process+thread count 2835 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/12/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@haiyang1987
Contributor Author

The failed unit test seems unrelated to the change; it passes in a local run.

Contributor

@zhangshuyan0 zhangshuyan0 left a comment

Just one small suggestion. Others LGTM.

@@ -315,6 +315,13 @@ public class DFSConfigKeys extends CommonConfigurationKeys {
public static final int
DFS_NAMENODE_RECONSTRUCTION_PENDING_TIMEOUT_SEC_DEFAULT = 300;

public static final String DFS_NAMENODE_EXCESS_REDUNDANCY_TIMEOUT_SEC_KEY =
"dfs.namenode.excess.redundancy.timeout-sec";
public static final long DFS_NAMENODE_EXCESS_REDUNDANCY_TIMEOUT_SEC = 3600;
Contributor

DFS_NAMENODE_EXCESS_REDUNDANCY_TIMEOUT_SEC -> DFS_NAMENODE_EXCESS_REDUNDANCY_TIMEOUT_SEC_DEFAULT

Contributor Author

Thanks @zhangshuyan0 for your comment.
Already fixed!
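For context, a hedged sketch of how the key might be read from configuration and applied (the call site and surrounding code are illustrative, not the exact patch):

long excessRedundancyTimeoutSec = conf.getLong(
    DFSConfigKeys.DFS_NAMENODE_EXCESS_REDUNDANCY_TIMEOUT_SEC_KEY,
    DFSConfigKeys.DFS_NAMENODE_EXCESS_REDUNDANCY_TIMEOUT_SEC_DEFAULT);
// Hand the configured value (in seconds) to the excess-redundancy handling.
setExcessRedundancyTimeout(excessRedundancyTimeoutSec);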

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 51s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 47m 53s trunk passed
+1 💚 compile 1m 24s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 1m 15s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 1m 15s trunk passed
+1 💚 mvnsite 1m 24s trunk passed
+1 💚 javadoc 1m 8s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 36s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 22s trunk passed
+1 💚 shadedclient 39m 54s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 10s the patch passed
+1 💚 compile 1m 15s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 1m 15s the patch passed
+1 💚 compile 1m 7s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 1m 7s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 4s the patch passed
+1 💚 mvnsite 1m 15s the patch passed
+1 💚 javadoc 0m 55s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 32s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 3m 20s the patch passed
+1 💚 shadedclient 39m 29s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 253m 16s /patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 41s The patch does not generate ASF License warnings.
404m 59s
Reason Tests
Failed junit tests hadoop.hdfs.server.datanode.TestDirectoryScanner
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/13/artifact/out/Dockerfile
GITHUB PR #6176
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 67b6b0c7dd15 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / b7f9b23
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/13/testReport/
Max. process+thread count 2471 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6176/13/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@Hexiaoqiao Hexiaoqiao left a comment

LGTM. +1. Thanks @haiyang1987

@Hexiaoqiao
Contributor

Let's try to check it in after waiting two workdays if there are no more comments. Thanks.

@haiyang1987
Contributor Author

Thanks @Hexiaoqiao for your review!

@Hexiaoqiao Hexiaoqiao merged commit 9a6d00a into apache:trunk Dec 4, 2023
1 of 4 checks passed
@Hexiaoqiao
Contributor

Committed to trunk. Thanks @haiyang1987 for your work, and @zhangshuyan0 @ZanderXu for your reviews.

@haiyang1987
Contributor Author

Thanks @Hexiaoqiao @zhangshuyan0 @ZanderXu for your review and merge.

jiajunmao pushed a commit to jiajunmao/hadoop-MLEC that referenced this pull request Feb 6, 2024
…apache#6176). Contributed by Haiyang Hu.

Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>