CASSANDRA-18345 Stream all components registered by an SSTable #2420
nit: There are some unused import errors on J11 checkstyle
Indexes may register custom components. We need to include them in streaming to avoid costly building of indexes on the receiving side. In order to make this work, zero copy streaming had to be modified to index its writers by component name instead of component type because component types are not unique - e.g. many index components are of type CUSTOM.
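To illustrate why keying by type breaks down, here is a minimal sketch. It does not use Cassandra's real `Component` or `SSTableZeroCopyWriter` classes; `SketchComponent`, `WriterIndexSketch`, and the file names are hypothetical stand-ins chosen only for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a component: a (non-unique) type plus a unique name.
final class SketchComponent
{
    final String type;
    final String name;

    SketchComponent(String type, String name)
    {
        this.type = type;
        this.name = name;
    }
}

final class WriterIndexSketch
{
    // Keyed by type: a later component of the same type replaces the earlier one.
    static Map<String, SketchComponent> byType(List<SketchComponent> components)
    {
        Map<String, SketchComponent> map = new LinkedHashMap<>();
        for (SketchComponent c : components)
            map.put(c.type, c);
        return map;
    }

    // Keyed by name: every component keeps its own entry.
    static Map<String, SketchComponent> byName(List<SketchComponent> components)
    {
        Map<String, SketchComponent> map = new LinkedHashMap<>();
        for (SketchComponent c : components)
            map.put(c.name, c);
        return map;
    }

    // Made-up sample: one data component and two custom index components.
    static List<SketchComponent> sample()
    {
        List<SketchComponent> components = new ArrayList<>();
        components.add(new SketchComponent("DATA", "Data.db"));
        components.add(new SketchComponent("CUSTOM", "idx_a.db"));
        components.add(new SketchComponent("CUSTOM", "idx_b.db"));
        return components;
    }
}
```

With this sample, the type-keyed map ends up with two entries because the second `CUSTOM` component overwrites the first, while the name-keyed map keeps all three.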
test/distributed/org/apache/cassandra/distributed/test/sai/IndexStreamingTest.java
"CREATE TABLE %s.test (pk int PRIMARY KEY, v text, b blob) WITH compression = { 'enabled' : false };"
));
cluster.schemaChange(withKeyspace(
"CREATE CUSTOM INDEX ON %s.test(v) USING 'StorageAttachedIndex';"
Once numerics support merges, I guess we'll want to parameterize this test further to use those. The number of expected index components might not actually change though. CC @mike-tr-adamson
I'm not sure if that means we should merge this first and adjust inside a rebased CASSANDRA-18067 or the other way around.
src/java/org/apache/cassandra/db/streaming/ComponentManifest.java
The approach looks fine here. There are just a few things we should at least double check in terms of how we generate and validate the components we want to stream (...and we have to decide whether to merge this now and make the numerics patch add some testing or wait for numerics and test in this patch).
I addressed all review comments, except the part about introducing the SAI type for SAI components, because IMHO that needs a wider discussion.
Instead of hardcoding a collection of streamable components, a flag `streamable` is added to the component type. This allows custom components (e.g. SAI components) to decide whether they should be streamed.
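As a rough sketch of the idea, assuming nothing about the actual `Component.Type` API: each type carries its own `streamable` flag, and the set to stream is derived by filtering on that flag rather than by consulting a fixed list. `SketchType`, `StreamableFilterSketch`, and the type names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical component type that declares for itself whether it is streamed.
final class SketchType
{
    final String name;
    final boolean streamable;

    SketchType(String name, boolean streamable)
    {
        this.name = name;
        this.streamable = streamable;
    }
}

final class StreamableFilterSketch
{
    // Derive the names to stream from the flag on each type, in input order.
    static List<String> streamableNames(List<SketchType> types)
    {
        List<String> names = new ArrayList<>();
        for (SketchType t : types)
            if (t.streamable)
                names.add(t.name);
        return names;
    }

    // Made-up sample: a custom index part opts in, a local-only file opts out.
    static List<SketchType> sample()
    {
        List<SketchType> types = new ArrayList<>();
        types.add(new SketchType("PAYLOAD", true));
        types.add(new SketchType("LOCAL_ONLY", false));
        types.add(new SketchType("CUSTOM_INDEX_PART", true));
        return types;
    }
}
```

A new custom component type only has to set its own flag; no central collection needs to be edited for it to be included in or excluded from streaming.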
src/java/org/apache/cassandra/io/sstable/format/SSTableFormat.java
We don't need it because we can check streamability by inspecting the component's `streamable` field directly.
test/distributed/org/apache/cassandra/distributed/test/sai/IndexStreamingTest.java
Don't use the CUSTOM component type. Instead, create a separate component type for each SAI index component type.
src/java/org/apache/cassandra/io/sstable/SSTableZeroCopyWriter.java
src/java/org/apache/cassandra/index/sai/disk/format/IndexComponent.java
Made a few more nits, but this looks pretty nice now.
Again, we'll have to modify/add some things here to deal w/ the numerics patch when it's ready (or in that patch if this merges), but that shouldn't be hard.
Co-authored-by: Andrés de la Peña <adelapena@users.noreply.github.com>
…exStreamingTest.java Co-authored-by: Andrés de la Peña <adelapena@users.noreply.github.com>
Co-authored-by: Andrés de la Peña <adelapena@users.noreply.github.com>
Co-authored-by: Caleb Rackliffe <maedhroz@users.noreply.github.com>
{
logger.info("Writing component {} to {} length {}", type, componentWriters.get(type).getPath(), prettyPrintMemory(size));
@SuppressWarnings({"resource", "RedundantSuppression"}) // all writers are closed in close()
Curious...can individual writers fail and cause some writers not to be closed? (above, on 197)
I guess if that's the case, we could use FBUtilities#closeAll at SSTableZeroCopyWriter#close. But we are not modifying that here, so I think it can be done on a separate ticket.
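For reference, the general pattern behind such a helper can be sketched as below. This is not the `FBUtilities` implementation; `CloseAllSketch` and its `demo` method are hypothetical and only show the idea of attempting every close and rethrowing the first failure with the rest attached as suppressed exceptions.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseAllSketch
{
    // Try to close every resource; remember the first failure and attach later ones to it.
    static void closeAll(Iterable<? extends AutoCloseable> resources) throws Exception
    {
        Exception first = null;
        for (AutoCloseable resource : resources)
        {
            try
            {
                resource.close();
            }
            catch (Exception e)
            {
                if (first == null)
                    first = e;
                else
                    first.addSuppressed(e);
            }
        }
        if (first != null)
            throw first;
    }

    // Returns how many resources were closed successfully when the first one fails.
    static int demo()
    {
        int[] closed = { 0 };
        List<AutoCloseable> resources = new ArrayList<>();
        resources.add(() -> { throw new IllegalStateException("boom"); });
        resources.add(() -> closed[0]++);
        resources.add(() -> closed[0]++);
        try
        {
            closeAll(resources);
        }
        catch (Exception expected)
        {
            // The first failure is rethrown only after every close was attempted.
        }
        return closed[0];
    }
}
```

In the demo, the failing first resource does not prevent the remaining two from being closed, which is the property the question above is concerned with.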
Committed as 5d3f257
Custom indexes may register custom components.
We need to include them in streaming as well.