Fix CQLPagingIterator for Amazon Keyspace #3390

li-boxuan · 2022-12-15T02:23:23Z

Unfortunately, Amazon Keyspace diverges from Apache Cassandra when it comes to paging.

Due to the way Amazon Keyspace works, even if a page iterator returns empty results, there still can be a valid next page with data. This happens when ALLOW FILTERING is used. In JanusGraph, this means when a full scan is triggered. We should fix CQLPagingIterator in CQLKeyColumnValueStore.java.

At the moment, I am not 100% sure whether this would also require a fix from the DataStax java driver which implements paging internally as well.

C.C. @porunov since you did quite a bit of the CQLPagingIterator implementation.

References:
https://stackoverflow.com/questions/69940435/amazon-keyspaces-allow-filtering-without-all-keys-in-node-js
https://docs.aws.amazon.com/keyspaces/latest/devguide/working-with-queries.html#paginating-results
#3378 (reply in thread)

The text was updated successfully, but these errors were encountered:

porunov · 2023-02-08T11:41:50Z

In this case I don't know how to determine that we finished reading data from Amazon Keyspace (at least I'm not sure how to do that with Datastax Java Driver).
I would imagine that we can improve CQLPagingIterator with providing a function which builds a new statement with the last page state (i.e. currentResultSet.getExecutionInfo().getPagingState()). That said, if all those statements return nothing then it's not clear that we finished or not. Thus, it will lead to infinite loop.
I guess as a workaround we could store the last used page state and the current page state. If they are equal then it means we reached the end (probably).
That means we will always have a single redundant call at the end to verify that we really reached the end.
Not sure how good this solution is. Also, I guess it could really be implemented on DataStax driver side.

hexa00 · 2023-03-30T21:33:44Z

@li-boxuan will that be a problem also for a SparkComputer iterating over all edges for example ?
I'm not sure if it's doing iteration differently?

li-boxuan · 2023-04-11T03:40:58Z

@hexa00

That won't affect SparkGraphComputer. Spark traversal uses CqlRecordReader to load the data.

xentripetal · 2023-05-22T18:47:26Z

Some context on the datastax side, the issue is here

https://github.com/datastax/java-driver/blob/39bf5e363e79d0184b5897d65197202391967bae/core/src/main/java/com/datastax/oss/driver/internal/core/cql/MultiPageResultSet.java#L92

    @Override
    protected Row computeNext() {
      maybeMoveToNextPage();
      return currentRows.hasNext() ? currentRows.next() : endOfData();
    }

where it advances the page then ends the iterator if the next page has no data, regardless of if theres another page.

I'm surprised no one at AWS ran into this issue as this is their officially recommended java driver.

porunov · 2023-05-22T18:56:11Z

The issue could potentially be fixed in #3760 because the pagination approach is changed in that PR and we don’t use sync pagination anymore. I assume after that PR is merged the issue could be potentially fixed, but I personally didn’t use Amazon Keyspaces, so can’t tell for sure. Anyone who is interested in Amazon Keyspaces is free to try it afterwards.

li-boxuan added kind/bug/confirmed backend/awskeyspace labels Dec 15, 2022

li-boxuan mentioned this issue Dec 15, 2022

Add a warning of the full-scan usage on AWS Keyspace #3391

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix CQLPagingIterator for Amazon Keyspace #3390

Fix CQLPagingIterator for Amazon Keyspace #3390

li-boxuan commented Dec 15, 2022 •

edited

Loading

porunov commented Feb 8, 2023

hexa00 commented Mar 30, 2023

li-boxuan commented Apr 11, 2023

xentripetal commented May 22, 2023

porunov commented May 22, 2023

Fix CQLPagingIterator for Amazon Keyspace #3390

Fix CQLPagingIterator for Amazon Keyspace #3390

Comments

li-boxuan commented Dec 15, 2022 • edited Loading

porunov commented Feb 8, 2023

hexa00 commented Mar 30, 2023

li-boxuan commented Apr 11, 2023

xentripetal commented May 22, 2023

porunov commented May 22, 2023

li-boxuan commented Dec 15, 2022 •

edited

Loading