Skip to content

Fix bug of scan aggregate index returning empty non-end continuation #3397

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

pengpeng-lu
Copy link
Contributor

@pengpeng-lu pengpeng-lu commented Jun 13, 2025

Fixes issue: #3206

Instead of returning the innerContinuation byte array, serializes into a protobuf, with an isEnd flag.

There will be 2 steps:

  1. still serializes to the old format, being able to deserialize from both old and new format;
  2. serializes to the new format.

Add an ExecuteProperty for testing purpose, can remove it after we completely switch to TO_NEW mode.

@pengpeng-lu pengpeng-lu added the bug fix Change that fixes a bug label Jun 17, 2025
@pengpeng-lu pengpeng-lu changed the title Keyvalue cursor Fix bug of scan aggregate index returning empty non-end continuation Jun 17, 2025
@@ -108,8 +106,6 @@ test_block:
-
- query: select count(col1) from t2
- explain: "ISCAN(MV5 <,>) | MAP (_ AS _0) | AGG (count(_._0.COL1) AS _0) | ON EMPTY NULL | MAP (coalesce_long(_._0._0, promote(0l AS LONG)) AS _0)"
# Cannot run with FORCE_CONTINUATIONS due to: https://github.com/FoundationDB/fdb-record-layer/issues/3206
- maxRows: 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 2 were also mis-categorized to issue 3206.

@pengpeng-lu pengpeng-lu marked this pull request as ready for review June 17, 2025 05:29
@pengpeng-lu pengpeng-lu requested a review from alecgrieser June 17, 2025 05:29
@@ -72,12 +72,13 @@ public class ExecuteProperties {
// how record scan limit reached is handled -- false: return early with continuation, true: throw exception
private final boolean failOnScanLimitReached;
private final boolean isDryRun;
private final boolean kvCursorContSerializeToNew;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will add comments in the next commit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. I'm still not sure this really belongs in the ExecuteProperties. The point of the ExecuteProperties configuration is that it's supposed to contain stuff that a user might want to configure differently for other identical invocations of some operation. For example, various limits are in here, which could be applied to different plans. This is different, because for a given plan, it's actually very important that the user correctly serialize/deserialize KV cursor continuations. Which doesn't seem to fit the pattern

Copy link
Collaborator

@alecgrieser alecgrieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in getting to reviewing this!

@@ -72,12 +72,13 @@ public class ExecuteProperties {
// how record scan limit reached is handled -- false: return early with continuation, true: throw exception
private final boolean failOnScanLimitReached;
private final boolean isDryRun;
private final boolean kvCursorContSerializeToNew;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. I'm still not sure this really belongs in the ExecuteProperties. The point of the ExecuteProperties configuration is that it's supposed to contain stuff that a user might want to configure differently for other identical invocations of some operation. For example, various limits are in here, which could be applied to different plans. This is different, because for a given plan, it's actually very important that the user correctly serialize/deserialize KV cursor continuations. Which doesn't seem to fit the pattern

@@ -156,21 +164,79 @@ public boolean isEnd() {
@Nonnull
@Override
public ByteString toByteString() {
if (lastKey == null) {
return ByteString.EMPTY;
System.out.println("Keyvaluecursor toByteString serialization mode:" + serializationMode.name());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
System.out.println("Keyvaluecursor toByteString serialization mode:" + serializationMode.name());

Comment on lines +182 to +191
if (serializationMode == SerializationMode.TO_OLD) {
if (lastKey == null) {
return null;
}
return Arrays.copyOfRange(lastKey, prefixLength, lastKey.length);
} else {
byte[] result = toProto().toByteArray();
System.out.println("result:" + result);
return toProto().toByteArray();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose there are some performance implications, but it would be nice if this were able to be based on the toByteString() implementation

}

private static class Continuation implements RecordCursorContinuation {
public static class Continuation implements RecordCursorContinuation {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be public just so that the RecordQueryIndexPlan can update its continuation appropriately?


message KeyValueCursorContinuation {
optional bytes continuation = 1;
optional bool isEnd = 2;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really shouldn't be returning any "end" continuations that are of type KeyValueCursorContinuation, so this seems a little suspect. To support something #1076, it could be useful to have the prefix length, though that actually has some portability problems, though that's not necessarily required. For the purposes of just solving this problem, I think the issue was just that we would sometimes get the empty string if the prefix size was the same as the whole key size. If we can confirm that wrapping it in a continuation and setting the continuation field to the empty string results in a non-empty continuation, that might actually be enough. If we did go with just a wrapped bytes, we should probably note why we went with that in a comment lest someone be confused why it's not just returning the underlying bytes.

Comment on lines +775 to +776
if (continuation instanceof KeyValueCursorBase.Continuation) {
continuationBytes = ((KeyValueCursorBase.Continuation) continuation).getInnerContinuationInBytes();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there cases where the continuation is not of this type? Should assert on the continuation type here instead of attempting to work around it?

@@ -44,6 +44,8 @@
import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.RegisterExtension;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.EnumSource;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be good to add a test here where there is a scan over a single key range. The expectation would be that the continuation from the (only) key should be empty in the old serialization mode, but it should be non-empty in the new serialization mode

@@ -496,7 +496,7 @@ void basicReadWithNullsTest(final boolean useAsync, @Nonnull final String storag
}

void indexReadTest(final boolean useAsync, final long seed, final int numRecords, @Nonnull final String storage,
final boolean storeHilbertValues, final boolean useNodeSlotIndex) throws Exception {
final boolean storeHilbertValues, final boolean useNodeSlotIndex) throws Exception {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor:

Suggested change
final boolean storeHilbertValues, final boolean useNodeSlotIndex) throws Exception {
final boolean storeHilbertValues, final boolean useNodeSlotIndex) throws Exception {

I think. The main goal here would be to align the finals from the two lines

Comment on lines -251 to -252
assertEquals(indexResult.getContinuation().toByteString(), overscanResult.getContinuation().toByteString(), "Continuation byte strings should match");
assertArrayEquals(indexResult.getContinuation().toBytes(), overscanResult.getContinuation().toBytes(), "Continuation byte arrays should match");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this assertion actually need to change? I would have thought that we'd expect the continuations to stay the same, even if both are wrapped in protobuf

@@ -83,6 +83,7 @@ public boolean hasNext() {

private void fetchNextResult() {
if (result != null) {
continuation = ContinuationImpl.fromRecordCursorContinuation(result.getContinuation());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I'm not sure this is quite right. It seems like there is perhaps a deficiency in the previous code, but perhaps it should set the continuation on below near where it sets the continuation if the noNextReason is SOURCE_EXHAUSTED. That is, instead of just setting it to END if the no next reason is SOURCE_EXHAUSTED, it should also set it to the copied over continuation if the no-next-reason is not SOURCE_EXHAUSTED

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug fix Change that fixes a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants