Cassandra 15752 trunk #606
Conversation
Force-pushed from 8f0f541 to f272a35
@@ -2100,22 +2112,33 @@ public RowIterator computeNext()
        }
    }

    private void updateConcurrencyFactor()
    @VisibleForTesting
    public void handleBatchCompleted()
+1 to placing the update of liveReturned here. AFAIK this method doesn't do anything other than update the concurrency factor, and we don't have any alternative implementations, so I think I'd prefer the old name for this method, updateConcurrencyFactor. As for the visibility change, I don't see where it is used in testing.
renamed back to updateConcurrencyFactor
I mulled over the idea of having a class whose only responsibility is tracking the command's completion state. In that case, you could imagine a method updating it in a general sense, taking the number of live rows returned and ranges queried in a round and automatically updating the concurrency factor (which you would just access to make the next batch). It would be pretty easy to test and would make RangeCommandIterator a little more focused, but the static computeConcurrencyFactor() basically gets us the same thing.

tl;dr I have no problems with the current structure.
// no live row returned, fetch all remaining ranges but hit the max instead
int cf = StorageProxy.RangeCommandIterator.computeConcurrencyFactor(100, 30, maxConccurrentRangeRequest, 500, 0);
assertEquals(maxConccurrentRangeRequest, cf); // because 100 - 30 = 70 > maxConccurrentRangeRequest
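The logic the test exercises can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual Cassandra implementation: the parameter order (total ranges, ranges already queried, max concurrency factor, row limit, live rows returned) and the extrapolation formula are inferred from the test call and its comments.

```java
// Hypothetical sketch of a concurrency-factor computation matching the
// behavior asserted above; names and formula are assumptions.
public class ConcurrencyFactorSketch
{
    static int computeConcurrencyFactor(int totalRangeCount, int rangesQueried,
                                        int maxConcurrencyFactor, int limit, int liveReturned)
    {
        // never ask for more than the ranges that remain, and never less than 1
        maxConcurrencyFactor = Math.max(1, Math.min(maxConcurrencyFactor, totalRangeCount - rangesQueried));
        if (liveReturned == 0)
            return maxConcurrencyFactor; // no rows yet: fetch all remaining ranges, capped at the max

        // otherwise extrapolate rows-per-range from what we've seen so far
        int remainingRows = limit - liveReturned;
        float rowsPerRange = (float) liveReturned / rangesQueried;
        return Math.max(1, Math.min(maxConcurrencyFactor, Math.round(remainingRows / rowsPerRange)));
    }

    public static void main(String[] args)
    {
        // mirrors the test above: no live rows, 100 - 30 = 70 ranges remain, capped at max = 32
        System.out.println(computeConcurrencyFactor(100, 30, 32, 500, 0));
    }
}
```

With no live rows returned, the remaining 70 ranges exceed the cap, so the factor comes out as the max, which is what the assertion checks.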
I really like the comments here 👍
public ForRangeRead(Keyspace keyspace, ConsistencyLevel consistencyLevel, AbstractBounds<PartitionPosition> range, EndpointsForRange candidates, EndpointsForRange contact)
public ForRangeRead(Keyspace keyspace, ConsistencyLevel consistencyLevel, AbstractBounds<PartitionPosition> range, EndpointsForRange candidates, EndpointsForRange contact, int rangeCount)
Nit: We could break this line
public ForRangeRead(Keyspace keyspace,
                    ConsistencyLevel consistencyLevel,
                    AbstractBounds<PartitionPosition> range,
                    EndpointsForRange candidates,
                    EndpointsForRange contact,
                    int rangeCount)
+1
}

public AbstractBounds<PartitionPosition> range() { return range; }

/**
 * @return number of vnode ranges
Perhaps we could extend this to "number of vnode ranges intersected by the range", or rename the method to vnodesCount/subrangeCount, or something like that. I think that having a singular range and a range count at the same time might be a bit confusing to unaware readers.
good idea. updated the javadoc and method name to vnodeCount
private int concurrencyFactor;
// The two following "metric" are maintained to improve the concurrencyFactor
// when it was not good enough initially.
private int liveReturned;
private int rangesQueried;

public RangeCommandIterator(RangeIterator ranges, PartitionRangeReadCommand command, int concurrencyFactor, Keyspace keyspace, ConsistencyLevel consistency, long queryStartNanoTime)
public RangeCommandIterator(RangeIterator ranges, PartitionRangeReadCommand command, int concurrencyFactor, int maxConcurrencyFactor, Keyspace keyspace, ConsistencyLevel consistency, long queryStartNanoTime)
Nit: we could break this line
public RangeCommandIterator(RangeIterator ranges,
                            PartitionRangeReadCommand command,
                            int concurrencyFactor,
                            int maxConcurrencyFactor,
                            Keyspace keyspace,
                            ConsistencyLevel consistency,
                            long queryStartNanoTime)
+1
@@ -294,7 +294,7 @@ public void setUp()
static ReplicaPlan.ForRangeRead replicaPlan(Keyspace keyspace, ConsistencyLevel consistencyLevel, EndpointsForRange replicas, EndpointsForRange targets)
{
    return new ReplicaPlan.ForRangeRead(keyspace, consistencyLevel,
                                        ReplicaUtils.FULL_BOUNDS, replicas, targets);
                                        ReplicaUtils.FULL_BOUNDS, replicas, targets, 1);
Nit: I think we don't need to break this line
+1
Force-pushed from f272a35 to a8d104c
@Test
public void testComputeConcurrencyFactor()
{
    int maxConccurrentRangeRequest = 32;
int maxConccurrentRangeRequest = 32;
int maxConcurrentRangeRequest = 32;
+1
}

@Test
public void testRangeCountWithRangeMerge()
nit: This could probably be the first resident of a new RangeMergerTest.
this method needs setTokens()... we can probably move it when refactoring RangeReadExecutor out of StorageProxy
assertEquals(tokens.size() + 1, data.rangesQueried());
}

private List<Token> updateTokens(List<Integer> values)
nit: Maybe setTokens(), given it clears the existing stuff?
+1
int num = Util.size(data);
assertEquals(rows, num);
assertEquals(tokens.size() + 1, data.rangesQueried());
It might be helpful to remind the future reader why there's a +1 here ;)
+1
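For future readers, the "+1" comes from the fact that N tokens split a full, non-wrapping bounds into N + 1 subranges. A toy illustration, with hypothetical token values rather than the test's actual tokens:

```java
import java.util.ArrayList;
import java.util.List;

public class SplitCountSketch
{
    public static void main(String[] args)
    {
        long min = Long.MIN_VALUE, max = Long.MAX_VALUE;
        List<Long> tokens = List.of(-100L, 0L, 100L); // 3 tokens inside the full bounds

        // split (min, max] at each token: every token closes one subrange and opens the next
        List<long[]> subranges = new ArrayList<>();
        long left = min;
        for (long token : tokens)
        {
            subranges.add(new long[]{ left, token });
            left = token;
        }
        subranges.add(new long[]{ left, max }); // the trailing subrange is the "+1"

        System.out.println(subranges.size()); // tokens.size() + 1
    }
}
```

Three tokens produce four subranges, which is why the test expects `tokens.size() + 1` ranges queried.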
concurrentQueries.add(response);
readRepairs.add(response.readRepair);
++rangesQueried;
rangesQueried += range.vnodeCount();
i += range.vnodeCount();
@jasonstack @adelapena Say we have a concurrency factor of 2, and the next range actually is a merged range representing 3 vnodes. We'll actually exceed the concurrency factor by 1, but is the idea that we would otherwise not be able to make progress? It doesn't feel like it matters much either way, given the point of this whole mechanism is to limit queries to replicas, which it still does, but I wanted to make sure we're on the same page about the intent...
> Say we have a concurrency factor of 2, and the next range actually is a merged range representing 3 vnodes. We'll actually exceed the concurrency factor by 1

this is inevitable, unless the coordinator defers the range merging until it knows how many ranges it needs for the next batch. I think fetching more ranges in one replica read command isn't going to be very costly, but the cost of under-fetching is significantly higher.
Doesn't seem that over-fetching is going to be a problem, but we might add a comment about it.
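The overshoot being discussed can be sketched minimally. This is a hypothetical loop mirroring the `i += range.vnodeCount()` accounting in the diff above, not the actual sendNextRequests() code:

```java
public class OvershootSketch
{
    public static void main(String[] args)
    {
        int concurrencyFactor = 2;
        int[] vnodeCounts = { 1, 3, 1 }; // the second range is a merge of 3 vnodes

        // ranges are sent until the factor is reached; a merged range counting
        // several vnodes can therefore push the batch past the factor
        int i = 0, rangesQueried = 0;
        for (int vnodes : vnodeCounts)
        {
            if (i >= concurrencyFactor)
                break; // stop once the factor is met, possibly after overshooting
            rangesQueried += vnodes; // count every vnode the merged range covers
            i += vnodes;
        }
        System.out.println(rangesQueried); // 1 + 3 = 4, past the factor of 2
    }
}
```

The batch still terminates as soon as the factor is met; the only effect is that one merged range can carry the count a few vnodes over, which is the over-fetching the thread agrees is acceptable.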
@@ -2041,16 +2051,24 @@ public void close()
private DataLimits.Counter counter;
private PartitionIterator sentQueryIterator;

private int maxConcurrencyFactor;
Should be final?
+1
@jasonstack Left a few minor comments, but without making more invasive changes to StorageProxy, this looks pretty good. A bit more testing around RangeCommandIterator might be in order though.
PartitionRangeReadCommand command = (PartitionRangeReadCommand) Util.cmd(cfs).build();
// avoid merging ranges, so that it queries in multiple batches and check if liveReturned is updated correctly.
StorageProxy.RangeIterator rangeIterator = new StorageProxy.RangeIterator(command, keyspace, ConsistencyLevel.ONE);
StorageProxy.RangeCommandIterator data = new StorageProxy.RangeCommandIterator(rangeIterator, command, 1, 1000, keyspace, ConsistencyLevel.ONE, System.nanoTime());
@jasonstack I traced through this test and it actually looks like we are merging all 5 ranges into one ForRangeRead, and the reason we still get 5 from data.rangesQueried() is just that we allow overflow from the last ForRangeRead in sendNextRequests(). It feels like perhaps a couple more tests around sendNextRequests() would be helpful? If we parameterize RangeCommandIterator to optionally merge ranges, this would be easier. (It could even make the signature of the RangeCommandIterator constructor less busy, given keyspace and consistency are actually only used by the RangeMerger created inside it.)
> the reason we still get 5 from data.rangesQueried() is just that we allow overflow from the last ForRangeRead in sendNextRequests()

correct.. the coordinator needs to track the number of vnodes queried. The test will merge ranges. I will add some more tests and fix the test comments.
cfs.forceBlockingFlush();

PartitionRangeReadCommand command = (PartitionRangeReadCommand) Util.cmd(cfs).build();
// avoid merging ranges, so that it queries in multiple batches and check if liveReturned is updated correctly.
liveReturned appears to be zero if you assert on it here, which makes sense, given we don't get past the first batch.
right.. it won't update the concurrency factor if the iteration ends.
Force-pushed from a8d104c to 9f62012
rebased with latest trunk
@@ -1321,9 +1322,10 @@ private void assertRepairMetadata(Mutation mutation)
assertEquals(update.metadata().name, cfm.name);
}

@VisibleForTesting
Do we need this annotation?
removed..
…nd cap max concurrency factor by core * 10 patch by Zhao Yang; reviewed by Andres de la Peña, Caleb Rackliffe for CASSANDRA-15752
Force-pushed from 596d5d2 to a5d7eea
… property name (apache#606)
* added 'Void' type in Timer.java
* Either "static_scaling_parameters" xor "static_scaling_factors" is accepted; changed default readMultiplier to 1.0 instead of 0.5 for adaptive compaction cost calculations
* added coverage for 'static_scaling_factors' option
* renamed 'static_scaling_parameters' to 'scaling_parameters'
* fixed failing tests
CircleCI: https://circleci.com/workflow-run/440126c4-058d-4511-8301-16df2bddf5db