HDFS-16426. Fix nextBlockReportTime when trigger full block report force #3887

liubingxing · 2022-01-13T13:25:46Z

When a full block report is forced from the command line, the next block report time is set like this:

nextBlockReportTime.getAndAdd(blockReportIntervalMs);

nextBlockReportTime can then end up more than blockReportIntervalMs in the future.

If we trigger a full block report twice, nextBlockReportTime can end up more than 2 * blockReportIntervalMs in the future. This is obviously not what we want.

This change sets nextBlockReportTime = now + blockReportIntervalMs after a full block report is triggered from the command line.
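
The arithmetic described above can be illustrated with a small standalone sketch. This is a toy model, not the real BPServiceActor code: the class name, the method names, and the numeric values are hypothetical, and only `nextBlockReportTime`, `blockReportIntervalMs`, and the `getAndAdd` call come from the description.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of the scheduling arithmetic; not the real BPServiceActor code. */
public class NextReportTimeDemo {

  /** Behaviour quoted in the description: every call adds one more interval. */
  public static long scheduleByAdding(AtomicLong nextBlockReportTime,
      long blockReportIntervalMs) {
    nextBlockReportTime.getAndAdd(blockReportIntervalMs);
    return nextBlockReportTime.get();
  }

  /** Behaviour proposed in the description: restart the interval from "now". */
  public static long scheduleFromNow(AtomicLong nextBlockReportTime, long now,
      long blockReportIntervalMs) {
    nextBlockReportTime.set(now + blockReportIntervalMs);
    return nextBlockReportTime.get();
  }

  public static void main(String[] args) {
    long intervalMs = 21_600_000L; // hypothetical interval (6 hours)
    long now = 1_000L;             // hypothetical monotonic clock reading

    // Assume a periodic report was already due half an interval from now.
    AtomicLong next = new AtomicLong(now + intervalMs / 2);

    // Two forced reports in a row push the deadline out by two intervals.
    scheduleByAdding(next, intervalMs);
    scheduleByAdding(next, intervalMs);
    System.out.println("getAndAdd twice, wait in ms: " + (next.get() - now));

    // Resetting from "now" keeps the wait at exactly one interval.
    scheduleFromNow(next, now, intervalMs);
    System.out.println("now + interval, wait in ms: " + (next.get() - now));
  }
}
```

In this model the wait after two forced reports is 2.5 intervals, while resetting from the current time always leaves exactly one interval.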

tomscut · 2022-01-13T13:50:18Z

...project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java

      // If we have sent the first set of block reports, then wait a random
      // time before we start the periodic block reports.
      if (resetBlockReportTime) {
        nextBlockReportTime.getAndSet(monotonicNow() +
            ThreadLocalRandom.current().nextInt((int) (blockReportIntervalMs)));
        resetBlockReportTime = false;
+      } else if (forceFullBr) {
+        nextBlockReportTime.getAndSet(monotonicNow() + blockReportIntervalMs);


If many datanodes in a large cluster are triggered in batches, their next FBR times will be concentrated at the same point in the future, which may put great pressure on the NN. Maybe we should also add a random value here.

I think that in most cases we want to know the exact nextBlockReportTime when we force a block report.
As for triggering full block reports in batches, I think it is dangerous behavior because it may cause some datanode heartbeats to time out.

I mean, if we trigger the FBR on some nodes within a short period of time, it will disturb the original randomness of their report times.

@liubingxing @tomscut
Thanks for the discussion. IMHO, it is better to use a random time so that the randomness is kept.
I think HDFS-15167 caused this problem. Before HDFS-15167, resetBlockReportTime was true when a full block report was triggered by force, so a random time was used.
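
The trade-off discussed in this thread can be sketched as follows. This is a minimal illustration under stated assumptions, not the merged patch: the class and method names are hypothetical, and `now` stands in for the `monotonicNow()` value used in the quoted diff. The random offset mirrors the `ThreadLocalRandom.current().nextInt((int) (blockReportIntervalMs))` expression from the existing `resetBlockReportTime` branch.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Toy comparison of fixed and randomized next-report deadlines. */
public class RandomizedReportSchedule {

  /** Fixed offset: every node triggered at the same "now" shares one deadline. */
  public static long fixedDeadline(long now, long blockReportIntervalMs) {
    return now + blockReportIntervalMs;
  }

  /**
   * Random offset in [0, blockReportIntervalMs), as in the existing
   * resetBlockReportTime branch. The interval must be positive and fit in an int.
   */
  public static long randomDeadline(long now, long blockReportIntervalMs) {
    return now + ThreadLocalRandom.current().nextInt((int) blockReportIntervalMs);
  }

  public static void main(String[] args) {
    long now = 0L;                 // hypothetical shared trigger time
    long intervalMs = 21_600_000L; // hypothetical interval (6 hours)

    // Five hypothetical datanodes triggered in the same batch.
    for (int node = 0; node < 5; node++) {
      System.out.println("node " + node
          + " fixed=" + fixedDeadline(now, intervalMs)
          + " random=" + randomDeadline(now, intervalMs));
    }
  }
}
```

With the fixed offset, all nodes in a batch report again at the same instant; with the random offset, their next reports are spread across one interval, at the cost of not knowing the exact next report time in advance.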

hadoop-yetus · 2022-01-13T21:29:39Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 55s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	32m 29s		trunk passed
+1 💚	compile	1m 28s		trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚	compile	1m 22s		trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	checkstyle	1m 0s		trunk passed
+1 💚	mvnsite	1m 27s		trunk passed
+1 💚	javadoc	1m 3s		trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚	javadoc	1m 35s		trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	spotbugs	3m 13s		trunk passed
+1 💚	shadedclient	22m 36s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 19s		the patch passed
+1 💚	compile	1m 18s		the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚	javac	1m 18s		the patch passed
+1 💚	compile	1m 13s		the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	javac	1m 13s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 50s		the patch passed
+1 💚	mvnsite	1m 17s		the patch passed
+1 💚	javadoc	0m 51s		the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚	javadoc	1m 24s		the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	spotbugs	3m 15s		the patch passed
+1 💚	shadedclient	22m 14s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
+1 💚	unit	383m 14s		hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 48s		The patch does not generate ASF License warnings.
		482m 34s

Subsystem	Report/Notes
Docker	ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3887/1/artifact/out/Dockerfile
GITHUB PR	#3887
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname	Linux 261e0b815b37 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / 1a64e33fcfd9c02ca9ee49faef520b63cc65ee39
Default Java	Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3887/1/testReport/
Max. process+thread count	2776 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3887/1/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

liubingxing · 2022-01-17T10:03:14Z

@tasanuma @Hexiaoqiao Please take a look.

liubingxing · 2022-01-18T11:30:06Z

@tasanuma Thanks for your advice. I updated the PR to set resetBlockReportTime=true when a full block report is triggered by force. Please take a look.

hadoop-yetus · 2022-01-18T13:46:46Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 50s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 1s		codespell was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s		The patch appears to include 1 new or modified test files.
			_ trunk Compile Tests _
+1 💚	mvninstall	33m 38s		trunk passed
+1 💚	compile	1m 25s		trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚	compile	1m 17s		trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	checkstyle	0m 59s		trunk passed
+1 💚	mvnsite	1m 24s		trunk passed
+1 💚	javadoc	1m 2s		trunk passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚	javadoc	1m 32s		trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	spotbugs	3m 9s		trunk passed
+1 💚	shadedclient	25m 21s		branch has no errors when building and testing our client artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	1m 23s		the patch passed
+1 💚	compile	1m 26s		the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚	javac	1m 26s		the patch passed
+1 💚	compile	1m 16s		the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	javac	1m 16s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 55s		the patch passed
+1 💚	mvnsite	1m 25s		the patch passed
+1 💚	javadoc	0m 58s		the patch passed with JDK Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04
+1 💚	javadoc	1m 33s		the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚	spotbugs	3m 35s		the patch passed
+1 💚	shadedclient	24m 49s		patch has no errors when building and testing our client artifacts.
			_ Other Tests _
-1 ❌	unit	450m 24s	/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt	hadoop-hdfs in the patch passed.
+1 💚	asflicense	0m 48s		The patch does not generate ASF License warnings.
		556m 34s

Reason	Tests
Failed junit tests	hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand
	hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks
	hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes

Subsystem	Report/Notes
Docker	ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3887/2/artifact/out/Dockerfile
GITHUB PR	#3887
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname	Linux 700022c21a6a 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `6965a47`
Default Java	Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.13+8-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3887/2/testReport/
Max. process+thread count	3074 (vs. ulimit of 5500)
modules	C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3887/2/console
versions	git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by	Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

tasanuma

LGTM.

tasanuma · 2022-01-19T04:42:56Z

Thanks for your contribution, @liubingxing. Thanks for your review, @tomscut.

liubingxing · 2022-01-19T09:33:28Z

Thanks @tasanuma @tomscut

tomscut reviewed Jan 13, 2022

HDFS-16426. Fix nextBlockReportTime when trigger full block report force

6965a47

liubingxing force-pushed the HDFS-16426 branch from 1a64e33 to 6965a47 Compare January 18, 2022 04:28

tasanuma approved these changes Jan 19, 2022

tasanuma merged commit fcb1076 into apache:trunk Jan 19, 2022

tasanuma pushed a commit that referenced this pull request Jan 19, 2022

HDFS-16426. Fix nextBlockReportTime when trigger full block report fo…

1c71d6e

…rce (#3887) (cherry picked from commit fcb1076)

tasanuma pushed a commit that referenced this pull request Jan 19, 2022

HDFS-16426. Fix nextBlockReportTime when trigger full block report fo…

96dd426

…rce (#3887) (cherry picked from commit fcb1076)

tasanuma added a commit that referenced this pull request Jan 20, 2022

Revert "HDFS-16426. Fix nextBlockReportTime when trigger full block r…

53c4d89

…eport force (#3887)" This reverts commit 96dd426.

bogthe pushed a commit to bogthe/hadoop that referenced this pull request Feb 2, 2022

HDFS-16426. Fix nextBlockReportTime when trigger full block report fo…

9ec25f9

…rce (apache#3887) (cherry picked from commit fcb1076)

HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022

HDFS-16426. Fix nextBlockReportTime when trigger full block report fo…

923f4cb

…rce (apache#3887)
