-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-53794][SS] Add option to limit deletions per maintenance operation associated with rocksdb state provider #52511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… associated with rocksdb state provider
cc - @HeartSaVioR - PTAL, thx ! |
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala
Outdated
Show resolved
Hide resolved
Hi, @anishshri-db , just a question, do you have any reason not to wait for an approval from the Apache Spark committers? ![]() For some urgent cases, Apache Spark release managers can do the merging their PRs to unblock the release (and, of course, expects a late post-merge approval), but I'm just curious if you have those kind of exceptional cases in your BTW, according to the PR code, I can give my +1 to meet the Apache Spark community requirements (if you missed the approval mistakenly). So, this is just a question and there is no big deal here. |
Hi @dongjoon-hyun - thanks for the pointer. Sorry not intentional, was not sure whether we always need committer stamp for committer authored PRs as well. Could you please stamp this one for me ? Will make sure to get committer stamp henceforth before merging. Thanks ! |
+1, LGTM. No problem at all. I just wanted to remove any such an ASF issue from this PR . Thank you, @anishshri-db . |
What changes were proposed in this pull request?
Add option to limit deletions per maintenance operation associated with rocksdb state provider
Why are the changes needed?
We see some instances where the changelog deletion can take a really long time. This means that for that partition, we also cannot upload full snapshots which affects recovery/replay scenarios. This problem is much more apparent on resource constrained clusters. So, we add an option to allow for incremental cleanup per maintenance operation invocation.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Added unit tests
Was this patch authored or co-authored using generative AI tooling?
No