Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Paging on Secondary Index does not end (NodeJS: ReadableStream never ends) #4569

Closed
thales-gaddini opened this issue Jun 17, 2019 · 16 comments

Comments

Projects
None yet
6 participants
@thales-gaddini
Copy link

commented Jun 17, 2019

Problem

I'm using the npm cassandra client to make a select query using the stream method but the end event is never being triggered. This only happens in large collections where pagination is actually necessary. You won't see this behaviour for small sets.

Installation details

I'm running scylla locally in docker with

docker run --name scylla -d scylladb/scylla

Scylla version (or git commit hash): 3.0.5-0.20190414.403f66eca

Example:

DDL

CREATE KEYSPACE my_keyspace WITH replication = { 'class': 'SimpleStrategy', 'replication_factor' : 1 };

USE my_keyspace;

CREATE TABLE events
(
    eventId UUID,
    userId INT,
    payload TEXT,
    ackeddevices SET<TEXT>,
    PRIMARY KEY(eventId)
);

CREATE INDEX ON events(userId);

Code

const stream = scylladb.stream('SELECT payload, ackeddevices FROM events WHERE userId=?', [userId], { prepare: true });

stream
  .on('readable', function () {
    let data;
    while (data = this.read()) {
      // Doing something with data here
    }
  })
  .on('end', () => console.log('Ended'))
  .on('error', (err) => console.log(`Some error: ${err}`));

I've confirmed:

  • The 'Ended' is never logged when using Scylla (neither is 'Some error', which isn't an issue, just making sure some error wasn't being triggered).
  • the same code works using a Cassandra image (docker run cassandra:latest)
  • the stream.readable property is never set to false with Scylla, even seconds after the query is finished
  • the stream paging is working with both Cassandra and Scylla, it is only that the end event is not being triggered with the Scylla image

I can provide a dml.cql with a lot of inserts if it helps.

@glommer

This comment has been minimized.

Copy link
Contributor

commented Jun 17, 2019

@denesb please comment

@slivne

This comment has been minimized.

Copy link
Contributor

commented Jun 18, 2019

@thales-gaddini please do provide the dml.cql

It will also help if you can provide a tcp dump for the interaction with cassandra and the interaction with scylla capturing port 9042

(the info for no next page should be visible on the last response - and it will help to validate that on the wire its marked correctly flag: Has_more_pages https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v4.spec#L552)

@slivne slivne added the User Request label Jun 18, 2019

@thales-gaddini

This comment has been minimized.

Copy link
Author

commented Jun 18, 2019

Hi @slivne , the dml is attached as a zip (just a really big file with a lot of inserts).

I've also implemented the paging with the eachRow method as describe here and noticed that the nextPage and pageState from the result are never null.

This is an example of the ResultSet returned in the eachRow callback with Cassandra when there are still pages to be retrieved:

{
  result: ResultSet {
    info: {
      queriedHost: '172.17.0.2:9042',
      triedHosts: [Object],
      speculativeExecutions: 0,
      achievedConsistency: 10,
      traceId: undefined,
      warnings: undefined,
      customPayload: undefined,
      isSchemaInAgreement: true
    },
    rows: undefined,
    rowLength: 20000,
    columns: [ [Object], [Object] ],
    pageState: '10ee2f4e7f73e043bea633a78769f5dfe300f07fffb1df00',
    nextPage: [Function: nextPage]
  }
}

This is an example of the ResultSet returned in the eachRow callback result with Cassandra when there are no more pages to be retrieved:

{
  result: ResultSet {
    info: {
      queriedHost: '172.17.0.2:9042',
      triedHosts: [Object],
      speculativeExecutions: 0,
      achievedConsistency: 10,
      traceId: undefined,
      warnings: undefined,
      customPayload: undefined,
      isSchemaInAgreement: true
    },
    rows: [],
    rowLength: 20000,
    columns: [ [Object], [Object] ],
    pageState: null,
    nextPage: undefined
  }
}

The ResultSet with Scylla never reaches the pageState: null and nextPage: undefined status, making infinite requests. It makes sense to me that the stream uses the same ResultSet under the hood and that's why it never ends. Two consecutive ResultSets when there where no more rows to be retrieved from Scylla:

{
  result: ResultSet {
    info: {
      queriedHost: '172.17.0.2:9042',
      triedHosts: [Object],
      speculativeExecutions: 0,
      achievedConsistency: 10,
      traceId: undefined,
      warnings: undefined,
      customPayload: undefined,
      isSchemaInAgreement: true
    },
    rows: [],
    rowLength: 20000,
    columns: [ [Object], [Object] ],
    pageState: '00000000a40000001000000001000000040000000000271001280000000200000008000000fffc0480a1e1a69410000000c50f3598f6304f7baa73526537dc724cdfb1ff7fee47a20ac473782b0f0c912bde39858e010000003900000001190000001400000001000000080000003aaa8898163098470101190000001400000001000000080000003aaa8898163098470101010000006d41db007affc3b5d4340de9fa2b2a940100',
    nextPage: [Function: nextPage]
  }
}
{
  result: ResultSet {
    info: {
      queriedHost: '172.17.0.2:9042',
      triedHosts: [Object],
      speculativeExecutions: 0,
      achievedConsistency: 10,
      traceId: undefined,
      warnings: undefined,
      customPayload: undefined,
      isSchemaInAgreement: true
    },
    rows: [],
    rowLength: 20000,
    columns: [ [Object], [Object] ],
    pageState: '00000000a40000001000000001000000040000000000271001280000000200000008000000fffc0480a1e1a69410000000c50f3598f6304f7baa73526537dc724cdfb1ff7fee47a20ac473782b0f0c912bde39858e010000003900000001190000001400000001000000080000003aaa8898163098470101190000001400000001000000080000003aaa8898163098470101010000006d41db007affc3b5d4340de9fa2b2a940100',
    nextPage: [Function: nextPage]
  }
}

(they are the same)

About the tcp dump, I wasn't able to do it, could you help me with that?

I've tried:

sudo tcpdump port 9042 -w scylla.pcap
sudo tcpdump port 10000 -w scylla.pcap
sudo tcpdump host 172.17.0.2 and port 9042 -w scylla.pcap
sudo tcpdump host 172.17.0.2 and port 10000 -w scylla.pcap

But none of them had any output. I've also tried to just tcpdump everything (without specifying a host or port), but nothing related to Scylla was logged. Should I tcpdump inside the Docker container?

@thales-gaddini

This comment has been minimized.

Copy link
Author

commented Jun 19, 2019

tcpdumping from docker worked. Here is a zip with the pcap files from both Scylla and Cassandra.

As you said, the Has_more_pages flag is true in Scylla even though there are no more records (with a lot of the same results), as can be seen below (one of the many results )

results

scylla

Has_more_pages is false in the final result in the Cassandra dump.

cassandra

@slivne

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

@thales-gaddini thanks (and extra points for doing the initial analysis - nice going)

There is one difference between cassandra and scylla when it comes to paging.

  • Cassandra will not return empty pages and will attempt to fill up pages to the number of rows requested
  • Scylla will cap page size by the number of rows requested or 1 MB which ever is reached first and may also return empty pages in some cases. In those cases the paging state returned should change between the pages

I am not sure the paging state is changing correctly between pages - yet will dig further after trying to reproduce this using the sample you shared.

@slivne

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

So trying your schema + dml.cql on a clean 3.0.5 docker container with python I can't reproduce

from cassandra.cluster import Cluster

cluster = Cluster(["172.17.0.3"])
session = cluster.connect()
p = session.prepare('SELECT payload, ackeddevices from my_keyspace.events where userid=?')
rows = session.execute(p,[10000])
for user_row in rows:
    print ".",

next the dump :)

@slivne

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

Yet a closer look it does reproduce the issue

last_page

the last page has the flag Has_next_page set to True

issue_4569.cap.gz

@slivne slivne added the sec-index label Jun 19, 2019

@slivne slivne changed the title ReadableStream never ends Paging on Secondary Index does not end (NodeJS: ReadableStream never ends) Jun 19, 2019

@slivne slivne added the bug label Jun 19, 2019

@avikivity

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

Is the paging state changing between different iterations?

@avikivity

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

The driver should have continued fetching. I don't understand @slivne's capture.

@slivne

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

The paging state does not change

frame 17623 result paging state
frame_17623.paging_state.txt

frame 17624 next page request (passes the paging info)
frame_17624.e_paging_state.txt

frame 17625 result paging state - the same as 17623
frame_17625.paging_state.txt

@slivne

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

@avikivity I don;t understand the driver as well not fetching more pages - the program is the one I provided above

@slivne

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

so the query returns a total of 20000 rows split into 5000 per page

I changed the pagesize from being 5000 to being 4999

So scylla can return 4999 and the last page not a full page (4 results)

in this case the last page Has_more_pages is marked as FALSE

issue_4569_2_page_4999.cap.gz

issue_4569_page_4999

So it seems the issue is with the last page and the next query that returns no results returns back the paging information while it shouldn't

@slivne

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

another oddity using the 5000 row page size , when I change the code and not use prepared statements it does not reproduce as well

on docker 3.0.7

this has the problem e.g. the last page is an empty page and has Has_more_pages : True

from cassandra.cluster import Cluster

cluster = Cluster(["172.17.0.4"])
session = cluster.connect()
p = session.prepare('SELECT payload, ackeddevices from my_keyspace.events where userid=?')
rows = session.execute(p,[10000])
for user_row in rows:
    print ".",

this does not e.g. the last page is an empty page with Has_more_pages set to False

from cassandra.cluster import Cluster

cluster = Cluster(["172.17.0.4"])
session = cluster.connect()
rows = session.execute("SELECT payload, ackeddevices from my_keyspace.events where userid=10000")
for user_row in rows:
    print ".",

psarna added a commit to psarna/scylla that referenced this issue Jun 19, 2019

cql3: fix infinite paging for indexed queries
Indexed queries need to translate between view table paging state
and base table paging state, in order to be able to page the results
correctly. One of the stages of this translation is overwriting
the paging state obtained from the base query, in order to return
view paging state to the user, so it can be used for fetching next
pages. Unfortunately, in the original implementation the paging
state was overwritten only if more pages were available,
while if 'remaining' pages were equal to 0, nothing was done.
This is not enough, because the paging state of the base query
needs to be overwritten unconditionally - otherwise a guard paging state
value of 'remaining == 0' is returned back to the client along with
'has_more_pages = true', which will result in an infinite loop.
This patch correctly overwrites the base paging state unconditionally.

Fixes scylladb#4569

psarna added a commit to psarna/scylla that referenced this issue Jun 19, 2019

tests: add test case for finishing index paging
The test case makes sure that paging indexes does not result
in an infinite loop.

Refs scylladb#4569

psarna added a commit to psarna/scylla that referenced this issue Jun 19, 2019

tests: add test case for finishing index paging
The test case makes sure that paging indexes does not result
in an infinite loop.

Refs scylladb#4569

avikivity added a commit that referenced this issue Jun 19, 2019

Merge "Fix infinite paging for indexed queries" from Piotr
"
Fixes #4569

This series fixes the infinite paging for indexed queries issue.
Before this fix, paging indexes tended to end up in an infinite loop
of returning pages with 0 results, but has_more_pages flag set to true,
which confused the drivers.

Tests: unit(dev)
Branches: 3.0, 3.1
"

* 'fix_infinite_paging_for_indexed_queries' of https://github.com/psarna/scylla:
  tests: add test case for finishing index paging
  cql3: fix infinite paging for indexed queries
@thales-gaddini

This comment has been minimized.

Copy link
Author

commented Jun 19, 2019

Hey guys, thanks for the fix. When does it go from next to master?

@psarna

This comment has been minimized.

Copy link
Member

commented Jun 19, 2019

@thales-gaddini before being merged into master, it needs to pass the gating tests. If everything goes smoothly, then it may be a couple of hours, if not, then until I fix the bugs and resend :)

@nyh nyh referenced this issue Jun 19, 2019

Open

Infinite paging while querying secondary index #4127

1 of 1 task complete

avikivity added a commit that referenced this issue Jun 20, 2019

Merge "Fix infinite paging for indexed queries" from Piotr
"
Fixes #4569

This series fixes the infinite paging for indexed queries issue.
Before this fix, paging indexes tended to end up in an infinite loop
of returning pages with 0 results, but has_more_pages flag set to true,
which confused the drivers.

Tests: unit(dev)
Branches: 3.0, 3.1
"

* 'fix_infinite_paging_for_indexed_queries' of https://github.com/psarna/scylla:
  tests: add test case for finishing index paging
  cql3: fix infinite paging for indexed queries

avikivity added a commit that referenced this issue Jun 23, 2019

Merge "Fix infinite paging for indexed queries" from Piotr
"
Fixes #4569

This series fixes the infinite paging for indexed queries issue.
Before this fix, paging indexes tended to end up in an infinite loop
of returning pages with 0 results, but has_more_pages flag set to true,
which confused the drivers.

Tests: unit(dev)
Branches: 3.0, 3.1
"

* 'fix_infinite_paging_for_indexed_queries' of https://github.com/psarna/scylla:
  tests: add test case for finishing index paging
  cql3: fix infinite paging for indexed queries

avikivity added a commit that referenced this issue Jun 23, 2019

Merge "Fix infinite paging for indexed queries" from Piotr
"
Fixes #4569

This series fixes the infinite paging for indexed queries issue.
Before this fix, paging indexes tended to end up in an infinite loop
of returning pages with 0 results, but has_more_pages flag set to true,
which confused the drivers.

Tests: unit(dev)
Branches: 3.0, 3.1
"

* 'fix_infinite_paging_for_indexed_queries' of https://github.com/psarna/scylla:
  tests: add test case for finishing index paging
  cql3: fix infinite paging for indexed queries

(cherry picked from commit 9229afe)

psarna added a commit to psarna/scylla that referenced this issue Jun 24, 2019

cql3: fix infinite paging for indexed queries
Indexed queries need to translate between view table paging state
and base table paging state, in order to be able to page the results
correctly. One of the stages of this translation is overwriting
the paging state obtained from the base query, in order to return
view paging state to the user, so it can be used for fetching next
pages. Unfortunately, in the original implementation the paging
state was overwritten only if more pages were available,
while if 'remaining' pages were equal to 0, nothing was done.
This is not enough, because the paging state of the base query
needs to be overwritten unconditionally - otherwise a guard paging state
value of 'remaining == 0' is returned back to the client along with
'has_more_pages = true', which will result in an infinite loop.
This patch correctly overwrites the base paging state unconditionally.

Fixes scylladb#4569

(cherry picked from commit 88f3ade)

psarna added a commit to psarna/scylla that referenced this issue Jun 24, 2019

tests: add test case for finishing index paging
The test case makes sure that paging indexes does not result
in an infinite loop.

Refs scylladb#4569

(cherry picked from commit b8cadc9)

avikivity added a commit that referenced this issue Jun 24, 2019

Merge "Backport fixing infinite paging for indexed queries" from Piotr
"
This series backports fixing infinite paging for indexed queries
to branch-3.0.

Tests: unit(dev)
"

Fixes #4569

* 'fix_infinite_paging_for_indexed_queries_for_3.0' of https://github.com/psarna/scylla:
  tests: add test case for finishing index paging
  cql3: fix infinite paging for indexed queries
@avikivity

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2019

Backported to 3.0 and 3.1, removing from backport candidates.

avikivity added a commit that referenced this issue Jun 25, 2019

Merge "Backport fixing infinite paging for indexed queries" from Piotr
"
This series backports fixing infinite paging for indexed queries
to branch-3.0.

Tests: unit(dev)
"

Fixes #4569

* 'fix_infinite_paging_for_indexed_queries_for_3.0' of https://github.com/psarna/scylla:
  tests: add test case for finishing index paging
  cql3: fix infinite paging for indexed queries
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.