HDFS-16967. RBF: File based state stores should allow concurrent access to the records #5523
Conversation
Based on one testing data point (HDFS as the state store driver), for the same number of mount table records to be loaded into the cache, the average time taken by the default serial mode is ~1500 ms, whereas with concurrent mode it goes down to ~130 ms.
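The ~10x speedup comes from reading the per-record files in parallel instead of one at a time. As a rough illustration (not the actual Hadoop implementation; `readRecordFile` and the class name are hypothetical stand-ins), the pattern is to submit one read task per file to a fixed-size pool and collect the results with `invokeAll`:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentRecordLoad {

  // Hypothetical stand-in for deserializing one record file from the store.
  static String readRecordFile(int fileId) {
    return "record-" + fileId;
  }

  // Read all record files with a fixed-size pool instead of a serial loop.
  public static List<String> loadAll(int numFiles, int numThreads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Callable<String>> tasks = new ArrayList<>();
      for (int i = 0; i < numFiles; i++) {
        final int id = i;
        tasks.add(() -> readRecordFile(id));
      }
      // invokeAll blocks until every file has been read.
      List<String> records = new ArrayList<>();
      for (Future<String> f : pool.invokeAll(tasks)) {
        records.add(f.get());
      }
      return records;
    } finally {
      pool.shutdown();
    }
  }
}
```

With a remote store such as HDFS, each read is dominated by RPC latency, so overlapping the reads is where the wall-clock win comes from.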
🎊 +1 overall
This message was automatically generated.
@@ -168,9 +182,30 @@ public boolean initDriver() {
      return false;
    }
    setInitialized(true);
    int threads = getConcurrentFilesAccessNumThreads();
    if (threads > 0) {
Should it be >1?
Technically 1 thread would be serial.
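The reviewer's point is that a pool of one thread still processes the files serially, so `> 1` would be the tighter guard. A minimal sketch of that decision (helper names are hypothetical, not from the patch):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncThreadsGuard {

  // A pool only pays off with at least two threads: a single-thread
  // pool still executes the file reads one after another.
  public static boolean useConcurrentMode(int configuredThreads) {
    return configuredThreads > 1;
  }

  public static ExecutorService createPoolOrNull(int configuredThreads) {
    return useConcurrentMode(configuredThreads)
        ? Executors.newFixedThreadPool(configuredThreads)
        : null; // null signals the caller to fall back to the serial path
  }
}
```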
.../java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreFileBaseImpl.java
    Configuration conf =
        FederationStateStoreTestUtils.getStateStoreConfiguration(StateStoreFileSystemImpl.class);
    conf.set(StateStoreFileSystemImpl.FEDERATION_STORE_FS_PATH, "/hdfs-federation/");
    conf.set(FEDERATION_STORE_FS_ASYNC_THREADS, numFsAsyncThreads);
Could we make it setInt() and pass the number as an int? It would be cleaner.
I'm not sure how the Parameterized runner handles that, though.
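For context, Hadoop's Configuration stores values as strings but exposes typed accessors such as setInt/getInt, which is why the typed setter reads more cleanly than passing a pre-stringified number. A minimal stand-in class (hypothetical, JDK-only, just to show the two styles side by side):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for a string-backed configuration object with
// typed accessors, in the spirit of Hadoop's Configuration.
public class MiniConf {
  private final Map<String, String> props = new HashMap<>();

  public void set(String key, String value) {
    props.put(key, value);
  }

  // Typed setter: callers pass an int directly, no manual
  // String.valueOf at every call site.
  public void setInt(String key, int value) {
    props.put(key, Integer.toString(value));
  }

  public int getInt(String key, int defaultValue) {
    String v = props.get(key);
    return v == null ? defaultValue : Integer.parseInt(v);
  }
}
```

The wrinkle the reviewer mentions is that a JUnit Parameterized test often receives its parameters as strings, in which case `set(key, param)` avoids a parse-then-reformat round trip.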
Changes generally look good to me. Just a few small comments to make the code more readable.
    return new QueryResult<>(result, getTime());
  }

  private <T extends BaseRecord> Void getRecordsFromFileAndRemoveOldTmpRecords(Class<T> clazz,
Can you add documentation to this function indicating that the results list is being modified to collect the results?
Changing the function name would make that clearer too.
Done, thanks.
Since the method name already seems long enough, I directly added it to the Javadoc. Sounds good?
Looks good.
    return success.get();
  }

  private <T extends BaseRecord> Void writeRecordToFile(AtomicBoolean success,
Similar comment as above: can we add documentation indicating that success is being modified?
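The pattern under discussion is an out-parameter: the method returns `Void` and instead reports failure by clearing a shared `AtomicBoolean`, so that many concurrent writer tasks can aggregate into one flag. A hedged sketch of how such a side effect can be documented (the stubbed write and class name are illustrative, not the patch's code):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class OutParamDoc {

  /**
   * Writes one record to its file (stubbed here).
   *
   * Note: this method does not return a status. Instead it clears
   * {@code success} when the write fails, so a caller sharing one
   * flag across many concurrent writer tasks can check a single
   * aggregated outcome after all tasks complete.
   *
   * @param success shared flag, set to false on failure (side effect)
   * @param record the record to persist
   */
  static Void writeRecordToFile(AtomicBoolean success, String record) {
    boolean ok = record != null && !record.isEmpty(); // stubbed write
    if (!ok) {
      success.set(false);
    }
    return null;
  }
}
```

One design note: the flag only ever transitions from true to false, so no compare-and-set loop is needed; any single failure permanently marks the batch as failed.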
Done
@@ -137,6 +149,8 @@ public abstract <T extends BaseRecord> BufferedWriter getWriter(
   */
  protected abstract String getRootDir();

  protected abstract int getConcurrentFilesAccessNumThreads();
Can we provide an implementation here and then have just one config instead of the following two?
FEDERATION_STORE_PREFIX + "driver.file.async.threads";
FEDERATION_STORE_PREFIX + "driver.fs.async.threads";
I'm okay with keeping them separate, though, if you prefer that.
I believe separate configs would be better since we anyway have different implementations for many of the other methods too. Sounds good to keep it as is?
Yeah, good to keep it as it is.
Thanks @simbadzina, addressed your comments.
+1.
Thanks for your contribution, @virajjasani.
…ss to the records (apache#5523) Reviewed-by: Inigo Goiri <inigoiri@apache.org> Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com> Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
…ss to the records (apache#5523) Reviewed-by: Inigo Goiri <inigoiri@apache.org> Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com> Signed-off-by: Takanobu Asanuma <tasanuma@apache.org> (cherry-picked from 937caf7)
File based state store implementations (StateStoreFileImpl and StateStoreFileSystemImpl) should allow updating as well as reading of the state store records concurrently rather than serially. Concurrent access to the record files on the HDFS based store improves state store cache loading performance by more than 10x in testing.
For instance, to maintain data integrity, the cache is reloaded whenever any mount table record is updated. This reload operation gains a significant performance improvement from concurrent access to the mount table records.