
ExecuteBatchStreamingTest fails sporadically for the C* 4.0 #1679

Closed
ivansenic opened this issue Mar 7, 2022 · 10 comments · Fixed by #1688
Labels
bug Something isn't working

Comments

@ivansenic
Contributor

Unfortunately, the gRPC test ExecuteBatchStreamingTest.manyStreamingBatch still fails sporadically, even after merging #1661. It seems that two retries are not enough, and we can hit a situation where both retries also fail.
I don't see any solution other than increasing the number of retries. We might keep the count limited, but maybe 2 is too few.

@ivansenic ivansenic added the bug Something isn't working label Mar 7, 2022
@ivansenic
Contributor Author

@mpenick what do you think?

@tatu-at-datastax
Collaborator

tatu-at-datastax commented Mar 7, 2022

Is there any delay/wait between retries? 2 may be too low, but just as importantly, retries usually should not be back-to-back. Especially since in this case the failure is immediate, I think (no external calls, and access to the cache is very fast).

@ivansenic
Contributor Author

There is no delay, as there was no option to add one in the existing gRPC retry policy and the implementation around it.

@tatu-at-datastax
Collaborator

tatu-at-datastax commented Mar 7, 2022

@ivansenic that makes sense, I was guessing that would be the case. I'm not sure how, but it seems important here that we avoid a busy-loop attempt, because that could require a very high retry count without any guarantees.

I guess we could still start with a higher retry count, but without some sort of "Thread.yield()" equivalent (that is, something similar that would work in this context), I'm not sure how much a simple increase of back-to-back retries would help.
Although maybe gRPC is smart enough to do something like this by default (and not just assume there must be a slower external call or I/O).
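As a thought experiment for the "delay between retries without blocking" idea above, here is a minimal, hypothetical sketch of a non-blocking retry: instead of sleeping on the calling thread (which would not be viable on a gRPC direct executor), each failed attempt reschedules itself on a `ScheduledExecutorService` with a growing delay. All names here are illustrative, not the project's actual retry hook.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: async retry with exponential backoff that never
// blocks the calling thread. Failed attempts are rescheduled off-thread.
final class AsyncRetry {
    static <T> CompletableFuture<T> retry(Supplier<CompletableFuture<T>> op,
                                          int attemptsLeft, long delayMs,
                                          ScheduledExecutorService scheduler) {
        CompletableFuture<T> result = new CompletableFuture<>();
        op.get().whenComplete((value, err) -> {
            if (err == null) {
                result.complete(value);
            } else if (attemptsLeft <= 1) {
                result.completeExceptionally(err);
            } else {
                // Reschedule with doubled delay: no busy loop, no blocking sleep.
                scheduler.schedule(
                    () -> retry(op, attemptsLeft - 1, delayMs * 2, scheduler)
                              .whenComplete((v, e) -> {
                                  if (e == null) result.complete(v);
                                  else result.completeExceptionally(e);
                              }),
                    delayMs, TimeUnit.MILLISECONDS);
            }
        });
        return result;
    }
}
```

Whether something like this can be wired into the existing gRPC retry path is exactly the open question in this thread; the sketch only shows that backoff does not have to mean blocking.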

@ivansenic
Contributor Author

@tatu-at-datastax I think that since we are using the direct executor in gRPC, there must not be any thread blocking. So no, this is not an option.

@mpenick
Contributor

mpenick commented Mar 8, 2022

I guess I need to understand why it's being invalidated so much that it's returning unprepared so often. Is that something that can be fixed?

@mpenick
Contributor

mpenick commented Mar 8, 2022

I see: https://github.com/apache/cassandra/blob/cassandra-4.0.3/src/java/org/apache/cassandra/cql3/QueryProcessor.java#L580-L583

So if the same statement is prepared at the same time, either one of those checks could fail because of a simultaneous eviction. This is not ideal, but I think the way to fix it is to synchronize prepares, at least in gRPC, to reduce the chance of eviction. However, this is likely to affect all APIs that eagerly prepare, so maybe all prepares could be synchronized for the C* 4.x persistence?
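To illustrate the "synchronize prepares" idea: if two concurrent prepares of the same statement are serialized, the race with a simultaneous eviction closes. The following is a minimal, hypothetical sketch (the names and cache shape are illustrative, not the actual Stargate/C* persistence API), using one global lock as the simplest possible form:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: serialize all prepares behind a single lock so two
// identical statements are never prepared concurrently, shrinking the
// window in which an eviction can invalidate an in-flight prepare.
final class SynchronizedPreparer {
    private final Object prepareLock = new Object();
    private final Map<String, String> preparedCache = new ConcurrentHashMap<>();

    String prepare(String cql) {
        synchronized (prepareLock) {
            // Only one thread at a time reaches the cache-or-prepare step.
            return preparedCache.computeIfAbsent(cql, q -> "prepared:" + q);
        }
    }
}
```

A single global lock is the blunt version of this; the follow-up comments below discuss splitting it for performance.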

@tatu-at-datastax
Collaborator

@ivansenic yes, understood; I was expecting that sleep/yield won't work. But effectively we need something like that (just not those :) ).

@tatu-at-datastax
Collaborator

@mpenick Wow. That cache-handling logic is seriously convoluted, with fixes wrt keyspace/no-keyspace cases and all. It is difficult to trust it to work for all cases...

But if that is to be retained (which I assume it needs to be, for performance), then instead of a global sync lock it is probably possible to split the lock into N different ones, selected by hash code modulo N.
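The lock-splitting idea above is classic lock striping: N locks, with each key mapped to a stripe by its hash code modulo N, so unrelated statements do not contend on one global lock. A minimal, hypothetical sketch (names are illustrative):

```java
// Hypothetical sketch of lock striping: N lock objects, one chosen per key
// by hash code modulo N. Prepares of different statements usually land on
// different stripes and proceed in parallel; identical statements always
// land on the same stripe and are serialized.
final class StripedLocks {
    private final Object[] stripes;

    StripedLocks(int n) {
        stripes = new Object[n];
        for (int i = 0; i < n; i++) stripes[i] = new Object();
    }

    Object stripeFor(String key) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return stripes[Math.floorMod(key.hashCode(), stripes.length)];
    }

    <T> T withLock(String key, java.util.function.Supplier<T> body) {
        synchronized (stripeFor(key)) {
            return body.get();
        }
    }
}
```

Guava ships a production version of this pattern as `com.google.common.util.concurrent.Striped`, which could be an option if the project already depends on Guava.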

@mpenick
Contributor

mpenick commented Mar 9, 2022

But if that is to be retained (which I assume it needs to be, for performance), then instead of a global sync lock it is probably possible to split the lock into N different ones, selected by hash code modulo N.

Sounds good.
