Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDFS-16261 Configurable grace period around deletion of invalidated blocks #3594

Open
wants to merge 1 commit into
base: trunk
Choose a base branch
from

Conversation

bbeaudreault
Copy link
Contributor

Description of PR

See https://issues.apache.org/jira/browse/HDFS-16261 for more details.

The most straightforward way to introduce a grace period for replaced blocks is in the InvalidateBlocks datastructure. Work is periodically pulled from this class to be divvied up to DataNodes. This class is also reported on with JMX metrics, so one can easily see how many blocks are currently pending deletion.

I achieved the grace period by adding a new pollNWithFilter method to LightWeightHashSet. InvalidateBlocks are added to the LightWeightHashSet with a calculated readyForDeleteAt time. When getBlocksToInvalidateByLimit is called, a filter is passed which only includes those blocks whose readyForDeleteAt is expired.

The default grace period for blocks added to InvalidateBlocks is 0, which effectively makes all blocks immediately ready for deletion per existing behavior. When a grace period is configured, it applies only to blocks deleted through the delHintNode passed in RECEIVED_BLOCK messages. This minimizes the impact of this feature on blocks deleted for other reasons (i.e. if a file is deleted or through other ongoing namenode auditing).

How was this patch tested?

I've added tests for the relevant pieces, and have also been running this on 2 clusters internally.

For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 2s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 40m 6s trunk passed
+1 💚 compile 1m 41s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 1m 25s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 checkstyle 1m 13s trunk passed
+1 💚 mvnsite 1m 37s trunk passed
+1 💚 javadoc 1m 18s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 37s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 4m 8s trunk passed
+1 💚 shadedclient 30m 15s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 31s the patch passed
+1 💚 compile 1m 35s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 1m 35s the patch passed
+1 💚 compile 1m 24s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 javac 1m 24s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 1m 1s /results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt hadoop-hdfs-project/hadoop-hdfs: The patch generated 19 new + 397 unchanged - 1 fixed = 416 total (was 398)
+1 💚 mvnsite 1m 31s the patch passed
+1 💚 xml 0m 1s The patch has no ill-formed XML file.
+1 💚 javadoc 0m 55s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 31s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 59s the patch passed
+1 💚 shadedclient 28m 42s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 332m 45s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 38s The patch does not generate ASF License warnings.
456m 7s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3594/1/artifact/out/Dockerfile
GITHUB PR #3594
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell xml
uname Linux 202e847ebbae 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2b9883b
Default Java Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3594/1/testReport/
Max. process+thread count 2392 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3594/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants