STAR-1693: Changes from OSS - DO NOT MERGE #552

Closed
jacek-lewandowski wants to merge 372 commits into ds-trunk from STAR-1693-merge-ds-trunk
Conversation

@jacek-lewandowski

No description provided.

Piotr Kołaczkowski and others added 30 commits May 27, 2022 11:53
(cherry picked from commit d834519)
(cherry picked from commit 2e3828b)
(cherry picked from commit ad9a9e6)
…g in query path (#222)

(cherry picked from commit d40f9e3)
(cherry picked from commit 41b5b8f)
(cherry picked from commit d83a0ae)
…227)

(cherry picked from commit 0969332)
(cherry picked from commit 91a753d)
(cherry picked from commit 56a9b56)
Add page size in bytes flag to protocol
Introduce PageSize object
Protocol version changes
No support for describe statement yet
Simplify SecondaryIndexManager page calculation
Add page size in bytes to DataLimits
Refactor pagers
Add / pull some tests
Add some toString implementations
Add PageSize to expected classes in DatabaseDescriptorRefTest

Fix AggregationPartitionIterator
So far we were passing the main page size to the AggregationPartitionIterator, which:
- was pointless, because there is no paging when we aggregate everything
- was actually harmful: AggregationPartitionIterator is a subclass of GroupByPartitionIterator, and the latter updates the subPager's limits to the minimum of the main page size and the number of remaining rows. That is correct with group-aware limits, where the count applies to whole groups. But when we aggregate everything, plain CQL limits are used and the count limit applies to rows. Without this fix we would therefore limit the number of aggregated rows to the main page size, which is not what we want.

(cherry picked from commit e11d716)
(cherry picked from commit 13d4569)
(cherry picked from commit 4f65564)
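The limit interaction described above can be sketched as follows. The names and signatures here are hypothetical simplifications, not the actual DataLimits API:

```java
// Sketch of the sub-pager limit update described above (hypothetical, simplified API).
public class SubPagerLimits {
    // With group-aware limits, the count applies to whole groups, so capping the
    // sub-pager's count at min(mainPageSize, remaining) is correct.
    static int groupAwareLimit(int mainPageSize, int remainingGroups) {
        return Math.min(mainPageSize, remainingGroups);
    }

    // With plain CQL limits (aggregate-everything), the count applies to rows.
    // Reusing the same min() would cap the number of aggregated rows at the main
    // page size -- the bug this commit fixes. The fix: don't apply the main page
    // size to the sub-pager at all when aggregating everything.
    static int cqlRowLimit(int remainingRows) {
        return remainingRows; // no page-size cap for full aggregation
    }

    public static void main(String[] args) {
        // Buggy behavior: aggregating 10,000 rows with a page size of 100
        // would have stopped after only 100 rows.
        System.out.println(groupAwareLimit(100, 10_000)); // 100
        System.out.println(cqlRowLimit(10_000));          // 10000
    }
}
```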
This was failing because off-heap native clustering keys were
used in stats metadata without being copied, referencing memory
that could be overwritten.

Also fixes a problem creating retainable/minimized versions of
clustering bounds and boundaries.

(cherry picked from commit f00e340)
(cherry picked from commit f0904a3)
(cherry picked from commit 80a0383)
STAR-823: Refactor background compactions

CompactionManager.BackgroundCompactionCandidate task is scheduled on
compaction executor and when it detects there are compaction tasks
to run, it starts each compaction task as a separate job on the same
compaction executor and blocks until all tasks are finished.

When the pool size of the executor is n, and n background tasks are
submitted in parallel, and all of them find compactions to run, they
schedule those compactions and block until they finish. However, the
compaction tasks cannot start because the pool is full: all n threads
are occupied by background tasks waiting for compaction tasks that can
never begin.

Another issue (perhaps minor) is that we use getActiveCount() on
the executor to check how many tasks it is currently running, and
based on that information decide whether to schedule new tasks.
The problem with this method is that it returns an approximate
result and should not be used for making such decisions.

To address those problems, the running of background compactions was
refactored. The whole logic for background compactions was extracted
into a distinct class, BackgroundCompactionsRunner. It allows flagging
CFSs for compaction and schedules scans through the flagged CFSs
on a dedicated executor, so that scans and compaction tasks no
longer share the same executor.

Co-authored-by: Branimir Lambov <branimir.lambov@datastax.com>
(cherry picked from commit 96bf61c)
(cherry picked from commit 804b885)
(cherry picked from commit 2c0dcd7)
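The starvation scenario and the fix can be sketched with two executors. The class and method names below are illustrative, not the actual BackgroundCompactionsRunner API:

```java
import java.util.concurrent.*;

// Sketch of the fix described above: scans that schedule and wait for compaction
// tasks run on a dedicated executor, so they can never occupy all the threads
// that the compaction tasks themselves need.
public class CompactionSchedulingSketch {
    // If scans and compactions shared ONE executor, n blocked scans could fill
    // the pool and the compaction tasks would never start (starvation deadlock).
    static final ExecutorService compactionExecutor = Executors.newFixedThreadPool(2);
    static final ExecutorService scanExecutor = Executors.newFixedThreadPool(2);

    static int runScan() throws Exception {
        // The scan finds work, submits it to the *other* executor, and blocks.
        Future<Integer> compaction = compactionExecutor.submit(() -> 42);
        return compaction.get(); // safe: the scan consumes no compaction thread
    }

    public static void main(String[] args) throws Exception {
        Future<Integer> a = scanExecutor.submit(CompactionSchedulingSketch::runScan);
        Future<Integer> b = scanExecutor.submit(CompactionSchedulingSketch::runScan);
        System.out.println(a.get() + b.get()); // 84
        scanExecutor.shutdown();
        compactionExecutor.shutdown();
    }
}
```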
added more language tests

added brazilian

cql test passes

added support for setting a lucene analyzer

cql json test passes

fixed up some things

cleanup

added query analyzer

cleanup; added constants

added exception handling in unit test

added bad options unit tests

added char filter

removed comments and extra code

added illegal arg ex to LuceneAnalyzer#hasNext

added stop word support; prior to removal

reworked, no more stop words

added lowercase filter test

added ngram filter test

added simplepattern test; snowball off

added czech and porter

fixed alloc

removed commented out code

removed extra code

fixed minor issues

maybe fixed setMinMax

cleanup

reverted

reverted to a new byte[] per tokenized term

cleanup

cleanup

fixed sasi test

fixed unit test bug

cleanup

refactored for npe

addressed review comments

fixed npe bug

fixed a couple of bugs

removed json_ from options names; applied sonar comments

fixed sonar comments

fixed unit test bug

changed exception thrown

get -> create

fixed minor issue

(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
The failure detector is now configurable via cassandra.custom_failure_detector_class

(cherry picked from commit 8127d43)

# The commit message #2 will be skipped:

# STAR-842 - fix up
(cherry picked from commit d9625b6)
… port (#245)

(cherry picked from commit adf34e3)
(cherry picked from commit f2a9b43)
…oken metadata) (#242)

Introduced TokenMetadataProvider to abstract access to TokenMetadata and make it pluggable.

(cherry picked from commit 8c0a970)
(cherry picked from commit 58b15c2)
Partition key ByteBuffer and columns btree were not taken
into account and some ByteBuffers were not measured correctly.

Also fixes flakes in MemtableSizeTest caused by including
allocator pool in measurements and updates it to test all
memtable allocation types.

(cherry picked from commit d8d3e8b)
(cherry picked from commit f8963ca)
* STAR-865: Porting metrics from cndb-884, riptano/bdp@03b23db6a5697baaf71d46d661c0ac1c908bc33e
and riptano/bdp/#19515

Co-authored-by: Zhao Yang <jasonstack.zhao@gmail.com>
Co-authored-by: Jake Luciani <tjake@users.noreply.github.com>

* STAR-865: Porting MicrometerChunkCacheMetrics from:
CNDB-161 Add MicrometerMetrics class
CNDB-780 Add Micrometer metrics for the chunk cache

Co-authored-by: Stefania Alborghetti <stefania.alborghetti@datastax.com>
(cherry picked from commit 5e0d889)

fix ConcurrencyFactorTest

the metrics require reset if we want to measure, especially the max values

(cherry picked from commit 3f89514)
This is a port of
- https://github.com/riptano/bdp/commit/b6f0a18cb832c62f05cdcbd9cdcc2923f2fa727f
- https://github.com/riptano/bdp/pull/19468

The first change set introduces QueryInfoTracker (QIT) interface and
hooks it to StorageProxy. The second adds ClientState to the interface.

The original QIT utilizes ReadReconciliationObserver in the ReadTracker
paths. Only onRow, onPartition and queried callbacks are utilized by
CNDB and thus only these methods are ported to Converged Cassandra
(CC). The callbacks are a bit different though:
- The callback methods are added directly to ReadTracker as CC doesn't
have ReadReconciliationObserver. The class was added as a part of
NodeSync effort and it is rather superfluous. Porting the whole class would
add unnecessary complexity. Adding the required methods directly to the
ReadTracker makes the interface cleaner and easier to understand.
- CC operates on ReplicaPlans instead of plain host lists, which is why queried
was changed to onReplicaPlan.

(cherry picked from commit c32e91f)
(cherry picked from commit 093bc63)
Add support for:

- Registering new verbs at runtime
- Decorating existing verb handlers with another method
- Running a callback after the MessagingService sends a message

(cherry picked from commit c7d0ba0)
(cherry picked from commit e8e13ed)
(cherry picked from commit 7373d2f)
(cherry picked from commit 4134210)
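The three pluggability points listed above can be sketched with a toy registry. The class and method names are hypothetical stand-ins, not the actual MessagingService API:

```java
import java.util.*;
import java.util.function.Consumer;

// Sketch of the pluggability described above (hypothetical registry): register
// verbs at runtime, wrap an existing handler with a decorator, and run a
// callback after a message is "sent".
public class VerbRegistrySketch {
    final Map<String, Consumer<String>> handlers = new HashMap<>();
    final List<Runnable> postSendCallbacks = new ArrayList<>();

    // Register a new verb and its handler at runtime.
    void register(String verb, Consumer<String> handler) { handlers.put(verb, handler); }

    // Decorate an existing verb handler with extra behavior that runs first.
    void decorate(String verb, Consumer<String> before) {
        Consumer<String> existing = handlers.get(verb);
        handlers.put(verb, before.andThen(existing));
    }

    // Dispatch a message, then fire the post-send callbacks.
    void send(String verb, String payload) {
        handlers.get(verb).accept(payload);
        postSendCallbacks.forEach(Runnable::run);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        VerbRegistrySketch ms = new VerbRegistrySketch();
        ms.register("PING", p -> log.add("handled " + p));
        ms.decorate("PING", p -> log.add("decorated " + p));
        ms.postSendCallbacks.add(() -> log.add("after send"));
        ms.send("PING", "x");
        System.out.println(log); // [decorated x, handled x, after send]
    }
}
```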
FailingRepairTest uses serialization to pass Verbs back and forth
between the nodes during the test. Unfortunately, Verbs aren't
serializable anymore because they're no longer enums and this broke
the test.

Instead of passing a verb around, pass the verb id and look up the
verb inside the test method.

(cherry picked from commit 47d0719)
(cherry picked from commit 6ed9a94)
…A-16663) (#262)

Ported from OSS commit d220d24.

(cherry picked from commit b762e3c)
(cherry picked from commit 34374c1)
The main objective of this refactoring is to enable compaction strategies
to operate on a lean abstraction of an sstable and the compaction space
instead of the full-blown open SSTableReader and ColumnFamilyStore. The
compaction process itself must still operate on SSTableReaders which
provide the mechanisms for reading the data; switching between the two
representations is done when compaction signals it is ready to start
compaction on a set of sstables via the realm's tryModify method.

Most files in the compaction package have been changed to rely solely on
CompactionSSTable and CompactionRealm, with the exception of
CompactionManager and BackgroundCompactionsRunner, which are part of the
CFS implementation.

Also does some small fixes and simplifications identified during the
refactoring:

- Fixes bloom filter size in Upgrader calculated for splitting to
  compaction strategy's sstable size limit while files weren't actually
  split.
- Stops checking an sstable's bloom filter if its minTimestamp is already
  above the current min for purge functions.
- Some collection construction/processing simplifications.
- Breaks up compaction -> CFS -> compaction reference cycles.
- Refactors some methods to lower their complexity as requested by sonarcloud.
- Changes some remaining ...LatencyPerKb names to ...TimePerKb.

(cherry picked from commit 943ae99)
(cherry picked from commit e0fd645)
This replaces the Node-based walks and transformations. The result is
drastically less intermediate object creation, improved performance
and somewhat simpler code at the expense of the concept being a little
harder to understand initially.

Adds further documentation and expands tests for sliced tries.

(cherry picked from commit 2b3c4c5)
(cherry picked from commit 43c5206)
(cherry picked from commit e0982c6)
(cherry picked from commit aff4ab0)
(cherry picked from commit d3f271c)
* STAR-894 Port [CASSANDRA-16926] CEP-10 Phase 1: Mockable Filesystem

Co-authored-by: Benedict Elliott Smith <benedict@apache.org>
Co-authored-by: Aleksey Yeschenko  <aleksey@apache.org>
(cherry picked from commit 477fda8)
(cherry picked from commit c459c65)
Makes Snapshot class of DecayingEstimatedHistogramReservoir public, so
its API is accessible by external components such as CNDB. Adds public
getter to retrieve the array of bucket offsets.

(cherry picked from commit e57753c)
(cherry picked from commit 572fa86)
…oks (#271)

(cherry picked from commit c6cd4ee)
(cherry picked from commit fb5c241)
* STAR-909 Port DynamicSnitchSeverityProvider from bdp/cndb

Co-authored-by: Zhao Yang <jasonstack.zhao@gmail.com>
(cherry picked from commit 150e432)
(cherry picked from commit 21a4dd3)
* LogTransaction: add ILogTransactionsFactory to provide custom log
transaction

* UCS: Port CNDB-2134 to disable shards on UCS L0

* UCS: add CompactionAggregatePrioritizer to prioritize sstables based on remote file cache

* NativeLibrary: Add INativeLibrary interface to provide custom implementation

* SSTableWatcher: to discover custom component before opening sstables

* StorageProvider: support custom file system and change Descriptor to use URI

* StorageFeatureFlags: disable features that are not supported by custom file system

* StorageHandler: to reload sstable from custom file system

(cherry picked from commit e27ee69)
(cherry picked from commit e98d05a)
(cherry picked from commit 7d5184d)
(cherry picked from commit 54627bd)
(cherry picked from commit 41cb66e)
(cherry picked from commit 8b71940)
Add methods to PathUtils:
* deleteContent method that recursively deletes the contents of a directory, leaving the directory empty;
* listPaths methods to list all the paths in a directory, optionally using a provided filter.
Add method to Descriptor:
* validFilenameWithComponent to return the Component from an sstable file name

(cherry picked from commit 4f1c86b)
(cherry picked from commit ef840e5)
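A minimal sketch of what such helpers might look like using java.nio.file. These are hypothetical standalone versions, not the actual PathUtils code:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;
import java.util.function.Predicate;
import java.util.stream.*;

public class PathUtilsSketch {
    // Recursively delete the contents of a directory, leaving the directory empty.
    static void deleteContent(Path dir) throws IOException {
        try (Stream<Path> walk = Files.walk(dir)) {
            // Deepest paths first, so files are deleted before their parent directories.
            List<Path> paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            for (Path p : paths)
                if (!p.equals(dir))
                    Files.delete(p);
        }
    }

    // List the direct children of a directory, filtered by the provided predicate.
    static List<Path> listPaths(Path dir, Predicate<Path> filter) throws IOException {
        try (Stream<Path> list = Files.list(dir)) {
            return list.filter(filter).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sketch");
        Files.createFile(dir.resolve("a-Data.db"));
        Files.createDirectories(dir.resolve("sub"));
        Files.createFile(dir.resolve("sub").resolve("b.txt"));
        System.out.println(listPaths(dir, p -> p.toString().endsWith(".db")).size()); // 1
        deleteContent(dir);
        try (Stream<Path> rest = Files.list(dir)) {
            System.out.println(rest.count()); // 0: the directory survives, empty
        }
        Files.delete(dir);
    }
}
```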
Extends the API of NativeLibrary to create a directory given the
path as a string, so specialized file system implementations don't
need to do an additional conversion into a Cassandra File, which is
then converted back into a string representation.

(cherry picked from commit 3ba1a16)
(cherry picked from commit aff175b)
JeremiahDJordan and others added 4 commits September 29, 2022 13:42
In STAR-1335 we ported most of CNDB-4090, but we missed a call to
StorageProvider.invalidateFileSystemCache(), which is required to
invalidate the remote storage cache in CNDB whenever we encounter
corruption.

This was discovered because the RemoteFileCacheCorruptedPageTest unit
test is failing.
Co-authored-by: Stefania Alborghetti <stef1927@users.noreply.github.com>
@jacek-lewandowski jacek-lewandowski marked this pull request as ready for review October 11, 2022 06:53
@jacek-lewandowski jacek-lewandowski force-pushed the STAR-1693-merge-ds-trunk branch 2 times, most recently from e8758e5 to 9b7dc0d Compare October 11, 2022 12:36
mfleming and others added 10 commits October 12, 2022 17:03
#555)

* STAR-1697: Port keyspace renaming (KeyspaceMetadata::rename) from DB-3896

Required by CNDB-3170 and CNDB-4909.
This patch adds a way to customize the compaction overhead, i.e. the transient
amount of space required by a compaction whilst both input and output sstables
are present. In BDP this is just estimated to be the size of the input sstables.

It's unclear if we can improve this in CNDB, but I kept the refactoring because
initially I got confused thinking that in CNDB we could just waive this requirement
since the input sstables are in the file cache. So I think it's good to spell
out why we use the input sstable sizes by encapsulating the calculation in a
method with javadoc.

The patch also adds a warning to the logs: if a compaction cannot be performed
because the space overhead is larger than the space available, then the logs now
contain this information. Without this, troubleshooting why compaction tasks are
skipped is quite hard. This warning was already present in BDP but was missing
for CNDB.

Port CNDB-4385
…fication

This commit changes the API of UCS as follows:

- The Bucket inner class is now public
- The method for extracting shards with buckets is now public, and it
  accepts a custom list of sstables

These changes are required for CNDB, so that we can classify all live sstables
and visualize their corresponding shards and buckets in a diagnostic tool such as
Autobot.

The comments for warnIfSizeAbove have been clarified and moved to the method Javadoc.

Port of CNDB-4385
Port of CNDB-5113
Fixed DroppedColumn#toCQLString by using the CQL String version of the
column name, which also double quotes the name if it's in mixed case.

Co-authored-by: Massimiliano Tomassi <max.tomassi@datastax.com>
Co-authored-by: Stefania Alborghetti <stef1927@users.noreply.github.com>
…arn about it (#558)


Co-authored-by: Matt Fleming <mfleming@users.noreply.github.com>
…le txn bug

Port BDP part of CNDB-4035: restore sstables if they cannot be dropped and fix a lifecycle txn bug so that SSTables are added back to the live set if we fail to drop them.
Port CNDB-4855
Fixed streaming to connect back using peer preferred address instead of
Channel#remoteAddress
There are a couple of things here:
- The `unsafeFree` method in `BufferPool` did not do what it was probably expected to do: the direct buffer was not released properly, because `allocateDirectAligned` actually returns a slice of the original buffer, and the only reference to the original buffer is in the `attachment` field of the returned slice. This is mitigated by a new cleaning method, which can release the parent buffer by recursively walking the attachment hierarchy.
- For in-jvm dtests, releasing all buffers in buffer pools was added as the very last step of instance shutdown; this fixes memory leaking between subsequent instance restarts. In a production run we just stop the JVM and the buffers go with it, but for these dtests we need to handle it explicitly.

@blambov blambov left a comment


The Allocator/Cloner changes look good to me.

@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

Bugs: C (2 bugs)
Vulnerabilities: A (0 vulnerabilities)
Security Hotspots: A (0 security hotspots)
Code Smells: A (83 code smells)

Coverage: 85.6%
Duplication: 0.0%

michaeljmarshall added a commit that referenced this pull request Feb 20, 2026
…2042)

### What is the issue

Fixes: https://github.com/riptano/cndb/issues/15527
CNDB test PR: https://github.com/riptano/cndb/pull/16797

### What does this PR fix and why was it fixed

This PR upgrades jvector, which brings several improvements. Here are
the git commits brought in:

```
8b3e93cf (tag: 4.0.0-rc.8) chore: update changelog for 4.0.0-rc.8 (#627)
9d0488e5 release 4.0.0-rc.8 (#626)
570bd118 Refactor parallel writer (#608)
20c348ec Move buffer position in ByteBufferIndexWriter#writeFloats (#607)
d9ddce51 Ensure extractTrainingVectors return a list of at most MAX_PQ_TRAINING_SET_SIZE (#610)
d663b4f7 add config options for regression testing (#609)
7e493eee On-disk index cache for the Grid benchmark harness (#612)
e263cc80 Improved dataset loading; fixes, safeties, diagnostics, and better feedback (#613)
6b235ce7 bump to next SNAPSHOT (#605)
84bf5708 (tag: 4.0.0-rc.7) chore: update changelog for 4.0.0-rc.7 (#604)
fceeb885 release 4.0.0-rc.7 (#603)
51807cba add protection against bad ordinal mappings (#602)
6ca3b5e2 adding memory and disk usage stats to bench tests (#591)
a66fd914 Fix OnDiskGraphIndex#ramBytesUsed NPE (#588)
0ca5a392 Move float bulk-write into IndexWriter to enforce endianness (#577)
a6c6c09b Add diversityScoreFunctionFor to avoid creation of wrapper object (#592)
977c21d4 Relax the threshold of a flaky test related to an experimental feature (#598)
fa808d69 adding average nodes visited to benchmark tests (#552)
3bd15e70 Virtualize and Modularize DataSetLoader logic (#593)
42259e9f Speed up ivec reads by buffering (#584)
f967f1c9 virtualize DataSet (#589)
55f902f4 turn off parallel writes in grid (#582)
019a241d Parallelize graph writes (#542)
02fea879 Save allocation of a large array in PQVectors.encodeAndBuild (#574)
32a51821 javadoc for base [graph] (#548)
4eb607f8 javadoc for base [disk,exceptions] (#547)
30e8932c Enable the fused graph index  (#561)
d8848fc6 Start development on 4.0.0-rc.7-SNAPSHOT (#573)
c57f3a62 (tag: 4.0.0-rc.6) chore: update changelog for 4.0.0-rc.6 (#572)
214b7c20 release 4.0.0-rc.6 (#571)
e3686999 fix javadoc error (#570)
88669887 Ignoring testIncrementalInsertionFromOnDiskIndex_withNonIdentityOrdinalMapping and adding a TODO in buildAndMergeNewNodes (#569)
29a943e1 Computation of reconstruction errors for vector compressors (#567)
d8e9cb16 Add NVQ paper in README (#560)
d5cbe658 Add ImmutableGraphIndex.isHierarchical (#563)
b484dae2 Harden tests for heap graph reconstruction (#543)
9471c57d Make the thresholds in TestLowCardinalityFiltering tighter (#559)
21e4a226 Begin development on 4.0.0-rc.6 (#558)
4f661d99 Revert "Start development on 4.0.0-rc.6-SNAPSHOT"
fdee5779 Start development on 4.0.0-rc.6-SNAPSHOT
```

### SAI Version Bump

Adds a new SAI on-disk version: `fa`

### Fused PQ

With this version, we add a new, experimental feature that writes the PQ
vectors fused into the graph. This lets us skip writing the PQ vectors
to the PQ file, which results in significant memory savings, since the
PQ vectors in the `CassandraDiskAnn` graph searcher consume `O(n)`
memory based on the number of vectors and their quantized size. The
fused PQ vectors mostly fit within the page cache as we read a node and
its neighbors from disk, so we see minimal latency reduction from this
change, though further testing is required to see the real impact.

In order to enable fused PQ, the runtime needs
`cassandra.sai.latest.version=fa` or greater and
`cassandra.sai.vector.enable_fused=true`. Note that because this feature
is still experimental, `cassandra.sai.vector.enable_fused` defaults to
`false`.

Another experimental feature introduced in this commit via the jvector
upgrade is parallel graph encoding and writing to disk. Writing the
fused graph requires increased CPU time to encode the graph node and we
write more bytes to disk, so this parallelism is likely necessary to
keep vector index creation/compaction times down. The key configurations
available with their associated defaults:

```java
    // When building a compaction graph, encode layer 0 nodes in parallel and subsequently use async io for writes.
    // This feature is experimental, so defaults to false.
    SAI_ENCODE_AND_WRITE_VECTOR_GRAPH_IN_PARALLEL_ENABLED("cassandra.sai.vector.encode_and_write_graph_in_parallel.enabled", "false"),
    // When parallel graph encoding is enabled, the number of threads to use for encoding. Defaults to 0, meaning
    // use all available processors as reported by the JVM.
    SAI_ENCODE_AND_WRITE_VECTOR_GRAPH_IN_PARALLEL_NUM_THREADS("cassandra.sai.vector.encode_and_write_graph_in_parallel.num_threads", "0"),
    // When parallel graph encoding is enabled, whether to use direct buffers. Defaults to false, meaning heap
    // buffers are used. A buffer will be allocated per encoding thread. The size of each buffer is the size
    // of the encoded graph node at layer 0, which varies based on graph feature settings.
    SAI_ENCODE_AND_WRITE_VECTOR_GRAPH_IN_PARALLEL_USE_DIRECT_BUFFERS("cassandra.sai.vector.encode_and_write_graph_in_parallel.use_direct_buffers", "false"),
```

### OnDiskVectorValues and OnDiskVectorValuesWriter

`OnDiskVectorValues` is now in its own file and is now thread safe in
order to account for some necessary implementation details within
jvector. Added `OnDiskVectorValuesWriter` to improve test coverage and
to abstract away the flush issues associated with
`BufferedRandomAccessWriter` as described in
datastax/jvector#562.

### Verification

This PR also introduces new benchmarks as well as improved unit testing.
The new benchmarks verify the performance of the `OnDiskVectorValues`
and `OnDiskVectorValuesWriter` to confirm (at least directionally) the
time associated with read and write operations.

New tests have been added to verify that when we iterate over an
sstable's rows, we are able to assert that the sstable's vector value's
similarity to the one stored in the vector graph is ~1. This testing is
valuable in that it confirms the row id to ordinal mapping is correct at
every node. Previously, we relied on recall results to verify this for
us. This new pattern allows us to confirm _every_ node, which is more
thorough and removes most edge cases that might have led to partially
correct graphs that may have achieved acceptable recall.
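The verification idea above — compare each row's stored vector against the vector fetched from the graph through the row-id-to-ordinal mapping, and assert near-perfect similarity at every row — can be sketched as follows. The data layout and arrays here are hypothetical stand-ins, not the actual test code:

```java
public class SimilarityCheckSketch {
    // Cosine similarity between two vectors.
    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins: vectors as stored in the sstable's rows, and the
        // same vectors as recovered from the graph via the rowId -> ordinal mapping.
        float[][] rowVectors = {{1f, 0f, 2f}, {0.5f, 0.5f, 0f}};
        int[] rowIdToOrdinal = {1, 0};                    // a non-identity mapping
        float[][] graphVectors = {{0.5f, 0.5f, 0f}, {1f, 0f, 2f}};

        for (int rowId = 0; rowId < rowVectors.length; rowId++) {
            double sim = cosine(rowVectors[rowId], graphVectors[rowIdToOrdinal[rowId]]);
            // If the mapping is correct, similarity is ~1 at EVERY row, not just on
            // average -- stronger than the recall-based checks used before.
            if (sim < 0.999) throw new AssertionError("bad mapping at row " + rowId);
        }
        System.out.println("mapping verified");
    }
}
```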
driftx pushed a commit that referenced this pull request Apr 27, 2026
driftx pushed a commit that referenced this pull request Apr 28, 2026