readers: evictable_reader: skip progress guarantee when next pos is partition start #13563

Conversation

denesb (Contributor) commented Apr 18, 2023

The evictable reader must ensure that each buffer fill makes forward progress, i.e. the last fragment in the buffer has a position larger than the last fragment from the previous buffer fill. Otherwise, the reader could get stuck in an infinite loop between buffer fills if it is evicted in between.
The code guaranteeing this forward progress has a bug: when the next expected position is a partition start (i.e. the start of another partition), the code loops forever, effectively reading everything the underlying reader has.
To avoid this, add a special case that skips the progress-guarantee loop altogether when the next expected position is a partition start. In this case, progress is guaranteed anyway, because there is exactly one partition-start fragment in each partition.

Fixes: #13491
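
To make the failure mode concrete, below is a minimal, self-contained sketch of the loop in question. All names (position, tri_cmp, fill_buffer) are simplified stand-ins, not the actual Scylla types; the only assumption carried over is that a partition-start position compares before (or equal to) every position inside a partition, which is what keeps the loop's exit condition permanently true.

#include <cassert>
#include <optional>
#include <vector>

// Simplified stand-ins (not the real Scylla types) for a fragment position and
// the three-way comparator. The one property that matters here: a partition-start
// sentinel compares before (or equal to) every position inside the partition.
struct position {
    bool is_partition_start = false;
    int clustering = 0; // ignored when is_partition_start is true
};

int tri_cmp(const position& a, const position& b) {
    if (a.is_partition_start || b.is_partition_start) {
        return int(b.is_partition_start) - int(a.is_partition_start);
    }
    return a.clustering - b.clustering;
}

// Model of the buggy fill loop: when next_pos is a partition start,
// tri_cmp(next_pos, buffer.back()) <= 0 holds for every fragment pushed, so the
// loop only stops once the underlying stream is exhausted.
std::vector<position> fill_buffer(const std::vector<position>& stream, position next_pos) {
    std::vector<position> buffer;
    size_t i = 0;
    auto next_mf = [&]() -> std::optional<position> {
        return i < stream.size() ? std::optional<position>(stream[i]) : std::nullopt;
    };
    buffer.push_back(stream[i++]); // first fragment of this fill
    // The fix adds a "next pos is not a partition start" guard in front of this
    // condition, skipping the progress-guarantee loop entirely in that case.
    while (next_mf() && tri_cmp(next_pos, buffer.back()) <= 0) {
        buffer.push_back(stream[i++]);
    }
    return buffer;
}

int main() {
    std::vector<position> stream(1000);
    for (int k = 0; k < 1000; ++k) stream[k].clustering = k; // rows of one large partition
    // Next expected position is a partition start: the whole stream ends up buffered.
    auto buf = fill_buffer(stream, position{true, 0});
    assert(buf.size() == stream.size());
    return 0;
}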

readers: evictable_reader: skip progress guarantee when next pos is partition start

The evictable reader must ensure that each buffer fill makes forward
progress, i.e. the last fragment in the buffer has a position larger
than the last fragment from the previous buffer fill. Otherwise, the
reader could get stuck in an infinite loop between buffer fills if it
is evicted in between.
The code guaranteeing this forward progress has a bug: when the next
expected position is a partition start (another partition), the code
loops forever, effectively reading everything the underlying reader
has.
To avoid this, add a special case that skips the progress-guarantee
loop altogether when the next expected position is a partition start.
In this case, progress is guaranteed anyway, because there is exactly
one partition-start fragment in each partition.

Fixes: scylladb#13491
denesb commented Apr 19, 2023

CI state FAILURE - https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/705/

20:30:21  hudson.remoting.ProxyException: groovy.lang.MissingPropertyException: No such property: results for class: WorkflowScript
20:30:21  	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:66)
20:30:21  	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.getProperty(ScriptBytecodeAdapter.java:471)
20:30:21  	at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:377)
20:30:21  	at org.kohsuke.groovy.sandbox.GroovyInterceptor.onGetProperty(GroovyInterceptor.java:68)
20:30:21  	at org.jenkinsci.plugins.scriptsecurity.sandbox.groovy.SandboxInterceptor.onGetProperty(SandboxInterceptor.java:347)
20:30:21  	at org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:375)
20:30:21  	at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:379)
20:30:21  	at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:355)
20:30:21  	at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:355)
20:30:21  	at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:355)
20:30:21  	at org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:355)
20:30:21  	at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.getProperty(SandboxInvoker.java:29)
20:30:21  	at com.cloudbees.groovy.cps.impl.PropertyAccessBlock.rawGet(PropertyAccessBlock.java:20)
20:30:21  	at WorkflowScript.run(WorkflowScript:198)
20:30:21  	at com.cloudbees.groovy.cps.CpsDefaultGroovyMethods.each(CpsDefaultGroovyMethods:2125)
20:30:21  	at com.cloudbees.groovy.cps.CpsDefaultGroovyMethods.each(CpsDefaultGroovyMethods:2110)
20:30:21  	at com.cloudbees.groovy.cps.CpsDefaultGroovyMethods.each(CpsDefaultGroovyMethods:2151)
20:30:21  	at WorkflowScript.run(WorkflowScript:194)
20:30:21  	at ___cps.transform___(Native Method)
20:30:21  	at com.cloudbees.groovy.cps.impl.PropertyishBlock$ContinuationImpl.get(PropertyishBlock.java:73)
20:30:21  	at com.cloudbees.groovy.cps.LValueBlock$GetAdapter.receive(LValueBlock.java:30)
20:30:21  	at com.cloudbees.groovy.cps.impl.PropertyishBlock$ContinuationImpl.fixName(PropertyishBlock.java:65)
20:30:21  	at jdk.internal.reflect.GeneratedMethodAccessor571.invoke(Unknown Source)
20:30:21  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
20:30:21  	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
20:30:21  	at com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
20:30:21  	at com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
20:30:21  	at com.cloudbees.groovy.cps.Next.step(Next.java:83)
20:30:21  	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:152)
20:30:21  	at com.cloudbees.groovy.cps.Continuable$1.call(Continuable.java:146)
20:30:21  	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:136)
20:30:21  	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:275)
20:30:21  	at com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:146)
20:30:21  	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:18)
20:30:21  	at org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:51)
20:30:21  	at org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:187)
20:30:21  	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:420)
20:30:21  	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:330)
20:30:21  	at org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:294)
20:30:21  	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$2.call(CpsVmExecutorService.java:67)
20:30:21  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
20:30:21  	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
20:30:21  	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
20:30:21  	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
20:30:21  	at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
20:30:21  	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
20:30:21  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
20:30:21  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
20:30:21  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
20:30:21  	at java.base/java.lang.Thread.run(Thread.java:829)
20:30:21  Finished: FAILURE

@benipeled


while (next_mf && _tri_cmp(_next_position_in_partition, buffer().back().position()) <= 0) {
// This loop becomes infinite when next pos is a partition start.
// In that case progress is guaranteed anyway, so skip this loop entirely.
while (!_next_position_in_partition.is_partition_start() && next_mf && _tri_cmp(_next_position_in_partition, buffer().back().position()) <= 0) {
bhalevy (Member) commented Apr 25, 2023

By the way, we access buffer().back().position() below, right after this loop.
How do we know that the buffer isn't empty at that point?

Member:

shouldn't that be done under while (next_mf)?

denesb (Author):

By the way, we access buffer().back().position() below, right after this loop. How do we know that the buffer isn't empty at that point?

See !is_buffer_empty() in the enclosing if.

Member:

sorry, I got confused and thought we're consuming from the buffer, but we're actually filling it.

while (next_mf && _tri_cmp(_next_position_in_partition, buffer().back().position()) <= 0) {
// This loop becomes infinite when next pos is a partition start.
// In that case progress is guaranteed anyway, so skip this loop entirely.
while (!_next_position_in_partition.is_partition_start() && next_mf && _tri_cmp(_next_position_in_partition, buffer().back().position()) <= 0) {
Member:

I don't understand the comment. Why is it guaranteed we make progress on partition_start?

denesb (Author):

Making progress is only a concern due to range tombstone changes, which can have non-monotonically increasing positions. The evictable reader needs to ensure that each buffer fill ends with a position strictly larger than that of the previous buffer fill. When the next expected position is a partition start, this is guaranteed and need not be checked (partition start means a new partition is started, so we make partition-level progress).

Member:

I see. Though I don't see why the current code is broken. You read the partition start and emit it. Then you read the partition end, which should have its position_in_partition greater than the partition start, which should stop the loop.

Member:

It's absolutely broken to compare position_in_partition without noticing that we changed partitions, as position_in_partition can't be compared across partitions. So I agree with the fix, just wondering why partition_end didn't save us.
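
A toy illustration of that point, with an assumed simplified position type (not the Scylla one): an in-partition position carries no partition identity, so comparing positions from different partitions can claim two unrelated rows are equal.

#include <cassert>

// Assumed simplified type: a clustering position knows nothing about which
// partition it belongs to.
struct clustering_pos { int ck; };

int tri_cmp(clustering_pos a, clustering_pos b) { return a.ck - b.ck; }

int main() {
    clustering_pos row_in_p1{5};
    clustering_pos row_in_p2{5}; // a different partition, same clustering key
    assert(tri_cmp(row_in_p1, row_in_p2) == 0); // "equal", yet unrelated rows
    return 0;
}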

denesb (Author):

Reading an entire partition into memory is an OOM sentence if the partition is large.

Member:

But I don't see it. push_mutation_fragment() updates buffer().back().

denesb (Author):

But doesn't buffer.back() change during the loop? Ah, I guess it doesn't.

It does. And we keep changing it (by pushing new fragments) until the condition becomes false and we exit the loop. The problem is that if _next_position_in_partition is partition_start, no matter what we push to the buffer, _tri_cmp(_next_position_in_partition, buffer().back().position()) <= 0 will always hold.

Member:

Ok - I get it now. I thought that we check that the next unconsumed position is after the last consumed position, but we check against some position that isn't advanced. Thanks for bearing with me.

michoecho (Contributor) commented Jun 23, 2023

I don't get it. The code is while (_next_position_in_partition <= back) advance_back...

But if this condition is supposed to check for progress against _next_position_in_partition, shouldn't it be while (back <= (<?) _next_position_in_partition) advance_back...?

Contributor:

Yup. If you modify the test below to call fill_buffer() a second time, this time it will read till the end of partition instead of reading another small batch of fragments - because the comparison is done in the wrong direction.


rd.fill_buffer().get();
auto buf1 = rd.detach_buffer();
BOOST_REQUIRE_EQUAL(buf1.size(), 3);
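
For reference, a hedged sketch of the extension suggested above: detach the first buffer, then fill again and check that the second batch is also small rather than the remainder of the partition. It reuses only the calls already visible in the snippet plus a standard Boost.Test macro; the bound of 10 is illustrative, not the actual test code.

// Continuation of the test above (hypothetical extension, not the merged test).
rd.fill_buffer().get();
auto buf2 = rd.detach_buffer();
// With the comparison done in the wrong direction, this second fill would read
// till the end of the partition; a correct reader stops after another small batch.
BOOST_REQUIRE_LT(buf2.size(), 10);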
Member:

Does this test fail before the patch with an infinite loop (or by consuming all the reader)? If not, it's just testing some detail of the implementation, not the bug.

denesb (Author):

Yes, the test fails before the patch, by finding that the reader read more than expected into the buffer. Unfortunately, there is no way to write a test for this without involving specific details about how readers work.

Member:

More than expected != the failure condition.

If, before the fix, it would stop at 4 or 2 fragments, then the test doesn't reproduce the infinite loop.

If it's really unbounded, then you can have a reader with 10 fragments and assert that not all fragments were consumed, rather than that some specific number was consumed (but another number would have been just as fine, as long as it's not "consume the entire stream while some data-dependent condition holds").

denesb (Author):

Before the fix, the reader would read all 1003 fragments of test data. After the fix, it stops after 3 fragments, which is where it should stop, according to the precalculated max buffer size.

Member:

Ok. I suggest changing the condition to < 10. Instead of enshrining some detail, let's check the actual failure (we can't check that the number of fragments is infinite, but 10 is a close approximation).

denesb (Author):

That the reader does not read more than its max buffer size is actually part of the reader contract. But I can increase this number so that it allows for the reader deciding to read a few more fragments than expected.

Member:

I'll just queue it.

avikivity pushed a commit that referenced this pull request May 2, 2023
readers: evictable_reader: skip progress guarantee when next pos is partition start

The evictable reader must ensure that each buffer fill makes forward
progress, i.e. the last fragment in the buffer has a position larger
than the last fragment from the previous buffer fill. Otherwise, the
reader could get stuck in an infinite loop between buffer fills if it
is evicted in between.
The code guaranteeing this forward progress has a bug: when the next
expected position is a partition start (another partition), the code
loops forever, effectively reading everything the underlying reader
has.
To avoid this, add a special case that skips the progress-guarantee
loop altogether when the next expected position is a partition start.
In this case, progress is guaranteed anyway, because there is exactly
one partition-start fragment in each partition.

Fixes: #13491

Closes #13563

(cherry picked from commit 72003dc)
kbr-scylla added a commit to kbr-scylla/scylladb that referenced this pull request Jun 23, 2023
readers: evictable_reader: don't accidentally consume the entire partition

The evictable reader must ensure that each buffer fill makes forward
progress, i.e. the last fragment in the buffer has a position larger
than the last fragment from the previous buffer-fill. Otherwise, the
reader could get stuck in an infinite loop between buffer fills, if the
reader is evicted in-between.

The code guaranteeing this forward progress had a bug: the comparison
between the position after the last buffer-fill and the current
last fragment position was done in the wrong direction.

So if the condition that we wanted to achieve was already true, we would
continue filling the buffer until partition end which may lead to OOMs
such as in scylladb#13491.

There was already a fix in this area to handle `partition_start`
fragments correctly - scylladb#13563 - but it missed that the position
comparison was done in the wrong order.

Fix the comparison and adjust one of the tests (added in scylladb#13563) to
detect this case.

Fixes scylladb#13491
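
A minimal sketch of that difference in direction, with plain integers standing in for positions within one partition (assumed names, not the Scylla code): next_pos is where the previous buffer fill stopped, and the loop decides whether to keep pushing fragments.

#include <cassert>

// next_pos: position reached by the previous buffer fill; the stream resumes there.
// Returns how many fragments a single fill pushes into the buffer.
int fill(int next_pos, int partition_size, bool buggy_direction) {
    int back = next_pos; // first fragment read on resume
    int filled = 1;
    auto keep_filling = [&] {
        return buggy_direction ? (next_pos <= back)   // stays true once reached, forever
                               : (back <= next_pos);  // false as soon as we pass next_pos
    };
    while (back + 1 < partition_size && keep_filling()) {
        ++back;
        ++filled;
    }
    return filled;
}

int main() {
    // Wrong direction: the fill only stops at the end of the partition.
    assert(fill(3, 1000, true) == 997);
    // Intended direction: stop right after advancing strictly past next_pos.
    assert(fill(3, 1000, false) == 2);
    return 0;
}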
denesb added a commit that referenced this pull request Jun 28, 2023
Merge 'readers: evictable_reader: don't accidentally consume the entire partition' from Kamil Braun

The evictable reader must ensure that each buffer fill makes forward progress, i.e. the last fragment in the buffer has a position larger than the last fragment from the previous buffer-fill. Otherwise, the reader could get stuck in an infinite loop between buffer fills, if the reader is evicted in-between.

The code guaranteeing this forward progress had a bug: the comparison between the position after the last buffer-fill and the current last fragment position was done in the wrong direction.

So if the condition that we wanted to achieve was already true, we would continue filling the buffer until partition end which may lead to OOMs such as in #13491.

There was already a fix in this area to handle `partition_start` fragments correctly - #13563 - but it missed that the position comparison was done in the wrong order.

Fix the comparison and adjust one of the tests (added in #13563) to detect this case.

After the fix, the evictable reader starts generating some redundant (but expected) range tombstone change fragments since it's now being paused and resumed. For this we need to adjust mutation source tests which were a bit too specific. We modify `flat_mutation_reader_assertions` to squash the redundant `r_t_c`s.

Fixes #13491

Closes #14375

* github.com:scylladb/scylladb:
  readers: evictable_reader: don't accidentally consume the entire partition
  test: flat_mutation_reader_assertions: squash `r_t_c`s with the same position
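
A hedged sketch of the kind of squashing the second listed commit describes, with assumed types (not the actual flat_mutation_reader_assertions helper): consecutive range-tombstone-change entries at the same position are collapsed, with the later one winning, so a paused-and-resumed stream compares equal to an uninterrupted one.

#include <cassert>
#include <vector>

// Assumed, simplified stand-in for a range_tombstone_change fragment.
struct rtc {
    int position;             // position of the change within the partition
    long tombstone_timestamp; // tombstone in effect after this position
};

// Collapse consecutive entries at the same position, keeping the last one,
// since only the final change at a given position affects what follows.
std::vector<rtc> squash_rtcs(const std::vector<rtc>& in) {
    std::vector<rtc> out;
    for (const auto& e : in) {
        if (!out.empty() && out.back().position == e.position) {
            out.back() = e; // later entry at the same position wins
        } else {
            out.push_back(e);
        }
    }
    return out;
}

int main() {
    std::vector<rtc> in{{5, 10}, {5, 20}, {7, 0}};
    auto out = squash_rtcs(in);
    assert(out.size() == 2 && out[0].tombstone_timestamp == 20);
    return 0;
}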

Successfully merging this pull request may close these issues.

sstableloader/nodetool refresh: bad_alloc (seastar - Failed to allocate 536870912 bytes)