[SPARK-46979][SS] Add support for specifying key and value encoder separately and also for each col family in RocksDB state store provider #45038
(A microbenchmark might show that this regresses when only the default column family is used, since every op now does a map lookup with a carefully crafted lock operation. That said, I'd rather not worry about it before we see an actual regression.)
Yea, I didn't worry about it too much, given that the provider init likely happens once for long-lived queries, and we can keep using the same provider on the same executor across micro-batch executions.
No, what I meant is that existing stateful operators now look up a concurrent map on every op to figure out the encoder, where previously it was just a reference to a field. But ops are relatively very cheap compared to commit as of now, so let's see.
Ah ok. Yea, I mainly didn't want to maintain two data structures for this. But if we find that it's more expensive, then we can just split out some of the logic for the default col family case.
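
For reference, a rough sketch of what such a split could look like, in case a regression does show up. This is not the code in this PR: `EncoderSketch`, `EncoderRegistrySketch`, `register`, `lookup`, and the `"default"` name are all hypothetical stand-ins. The idea is to keep the concurrent map as the source of truth for every column family, and additionally cache the default column family's encoder in a field so the common path avoids the map lookup.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a per-column-family encoder; not Spark's actual class. */
final class EncoderSketch {
    final String colFamilyName;

    EncoderSketch(String colFamilyName) {
        this.colFamilyName = colFamilyName;
    }
}

/** Hypothetical registry: one map for all column families, plus a cached field for the default one. */
final class EncoderRegistrySketch {
    // Assumed name for the default column family, for illustration only.
    static final String DEFAULT_COL_FAMILY = "default";

    private final ConcurrentHashMap<String, EncoderSketch> encoders = new ConcurrentHashMap<>();

    // Cached reference so the default-column-family path can skip the map lookup.
    private volatile EncoderSketch defaultEncoder;

    void register(String colFamilyName, EncoderSketch encoder) {
        encoders.put(colFamilyName, encoder);
        if (DEFAULT_COL_FAMILY.equals(colFamilyName)) {
            defaultEncoder = encoder;
        }
    }

    EncoderSketch lookup(String colFamilyName) {
        if (DEFAULT_COL_FAMILY.equals(colFamilyName)) {
            EncoderSketch cached = defaultEncoder;
            if (cached != null) {
                return cached; // fast path: field read, no map access
            }
        }
        EncoderSketch encoder = encoders.get(colFamilyName);
        if (encoder == null) {
            throw new IllegalArgumentException("No encoder registered for column family: " + colFamilyName);
        }
        return encoder;
    }
}
```

The fast path still pays for a string comparison and a volatile read, so whether it beats a plain `ConcurrentHashMap.get` would need to be measured; it also reintroduces the second piece of state mentioned above.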