[SPARK-35988][SS] The implementation for RocksDBStateStoreProvider#33187
[SPARK-35988][SS] The implementation for RocksDBStateStoreProvider#33187xuanyuanking wants to merge 7 commits intoapache:masterfrom
Conversation
|
Test build #140577 has finished for PR 33187 at commit
|
|
Kubernetes integration test unable to build dist. exiting with code: 1 |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #140587 has finished for PR 33187 at commit
|
HeartSaVioR
left a comment
There was a problem hiding this comment.
Looks OK, but I'd like to make sure @viirya is also OK with the change, as I'm familiar with this change and might be biased.
There was a problem hiding this comment.
This can be referenced by version so no need to be val.
There was a problem hiding this comment.
I thought the characters (=,()) being used in storeIdStr are odd, but this works in MacOS and I think it would also work in Linux file systems, so OK.
There was a problem hiding this comment.
Yes, the dir contains =() works in the Linux file system.
There was a problem hiding this comment.
This simply works as there's only one version (version 0) - we'll need to check the version when we add a new state encoding version.
There was a problem hiding this comment.
Yea, agree. For the new encoding version, we should have branches here for different versions.
There was a problem hiding this comment.
We can deduplicate here; I can deal with it in prefix scan for RocksDB state store as I'll bring broader change for state encoding.
There was a problem hiding this comment.
nit: remove one of two empty lines
|
NOTE: This might have some post-review comments from #32934 . |
|
Finally we only have this one for RocksDB state store provider! Please rebase this so that we can continue. |
|
Yes, finally! |
|
Kubernetes integration test unable to build dist. exiting with code: 1 |
|
Kubernetes integration test starting |
|
Test build #140685 has finished for PR 33187 at commit
|
|
Kubernetes integration test status success |
|
Test build #140697 has finished for PR 33187 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #140707 has finished for PR 33187 at commit
|
HeartSaVioR
left a comment
There was a problem hiding this comment.
The code change looks good to me. Great!
One thing we haven't been addressed yet is documentation. End users have no idea how to use this provider. Probably we should introduce RocksDB state store provider in SS guide doc.
It's OK if you prefer to file a "blocker" JIRA issue to address documentation separately. I can approve this one then.
| assert(store.hasCommitted) | ||
| val storeMetrics = store.metrics | ||
| assert(storeMetrics.numKeys === 1) | ||
| assert(getCustomMetric(storeMetrics, CUSTOM_METRIC_FILES_COPIED) == 1L) |
There was a problem hiding this comment.
nit: I guess we can loose the condition via > 0L. "The number of files is 1" doesn't seem to something we need to check.
There was a problem hiding this comment.
Agree, will update in the next commit.
| assert(getSizeOfStateForCurrentVersion(store.metrics) > noDataMemoryUsed) | ||
| } | ||
|
|
||
| test("maintenance") { |
There was a problem hiding this comment.
Just a head-up: "maintenance" and "snapshotting" are moved from StateStoreSuiteBase to StateStoreSuite as these tests cannot be applied to RocksDB state store provider.
There was a problem hiding this comment.
Yes, since these tests have the assumption that the provider have a baseDir
Sure, create the blocker issue SPARK-36041 for tracking the documentation. |
HeartSaVioR
left a comment
There was a problem hiding this comment.
+1
I'll merge once the test passes.
My apologize on not waiting for reviewing on others, we are far behind on progress (compared to the planned date on RC) and this PR is somehow a blocker for session window stuff. The code has been running on production for years so it won't make some problems, but we are open to deal with post-reviews during QA phase.
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
|
||
| override def id: StateStoreId = RocksDBStateStoreProvider.this.stateStoreId | ||
|
|
||
| override def version: Long = lastVersion |
There was a problem hiding this comment.
I recommend to be consistent with HDFSBackedStateStore where it uses version and newVersion.
There was a problem hiding this comment.
Actually for RocksDB, we keep the name newVersion. The major difference here is we have a loadedVersion concept, so that's why for the provider side, we have a latestVersion.
| verify(state == UPDATING, "Cannot put after already committed or aborted") | ||
| verify(key != null, "Key cannot be null") | ||
| require(value != null, "Cannot put a null value") | ||
| logDebug(s"Storing $key => $value") |
There was a problem hiding this comment.
Not sure if this log debug is necessary. Especially key/value are unsafe rows.
There was a problem hiding this comment.
Make sense, let me delete this debug log.
|
|
||
| /** | ||
| * Encodes/decodes UnsafeRows to versioned byte arrays. | ||
| * It uses the first byte of the generated byte array to store the version the describes how the |
There was a problem hiding this comment.
I guess this is a typo. "the" -> "that".
| Platform.BYTE_ARRAY_OFFSET + STATE_ENCODING_NUM_VERSION_BYTES, | ||
| keyBytes.length - STATE_ENCODING_NUM_VERSION_BYTES) | ||
| keyRow | ||
| } else null |
There was a problem hiding this comment.
nit: I rarely see style like this in Spark.
There was a problem hiding this comment.
Make sense, done in the next commit.
| val CUSTOM_METRIC_CHECKPOINT_TIME = StateStoreCustomTimingMetric( | ||
| "rocksdbCommitCheckpointLatency", "RocksDB: commit - checkpoint time") | ||
| val CUSTOM_METRIC_FILESYNC_TIME = StateStoreCustomTimingMetric( | ||
| "rocksdbCommitPauseBgTime", "RocksDB: commit - file sync time") |
There was a problem hiding this comment.
rocksdbCommitPauseBgTime? Or FileSyncTime?
There was a problem hiding this comment.
Yea, changed to rocksdbFileSyncTime
| val rocksDBConfInTask: RocksDBConf = testRDD.mapPartitionsWithStateStore[RocksDBConf]( | ||
| spark.sqlContext, testStateInfo, testSchema, testSchema, None) { | ||
| (store: StateStore, _: Iterator[String]) => | ||
| // Use reflection to get RockDB instance |
There was a problem hiding this comment.
Thanks, done in the next commit.
viirya
left a comment
There was a problem hiding this comment.
Looks good. Some minor comments. Feel free to address them in follow-up if you prefer to get this in first.
|
Kubernetes integration test unable to build dist. exiting with code: 1 |
|
Test build #140779 has finished for PR 33187 at commit
|
HeartSaVioR
left a comment
There was a problem hiding this comment.
+1
Thanks! Merging to master/3.2!
### What changes were proposed in this pull request? Add the implementation for the RocksDBStateStoreProvider. It's the subclass of StateStoreProvider that leverages all the functionalities implemented in the RocksDB instance. ### Why are the changes needed? The interface for the end-user to use the RocksDB state store. ### Does this PR introduce _any_ user-facing change? Yes. New RocksDBStateStore can be used in their applications. ### How was this patch tested? New UT added. Closes #33187 from xuanyuanking/SPARK-35988. Authored-by: Yuanjian Li <yuanjian.li@databricks.com> Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com> (cherry picked from commit 0621e78) Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
|
Thanks @xuanyuanking for contribution! Merged to master/3.2 branches. |
|
Thanks @HeartSaVioR and @viirya for all the help and review! |
|
Test build #140790 has finished for PR 33187 at commit
|
What changes were proposed in this pull request?
Add the implementation for the RocksDBStateStoreProvider. It's the subclass of StateStoreProvider that leverages all the functionalities implemented in the RocksDB instance.
Why are the changes needed?
The interface for the end-user to use the RocksDB state store.
Does this PR introduce any user-facing change?
Yes. New RocksDBStateStore can be used in their applications.
How was this patch tested?
New UT added.