Allow filtering doesn't page correctly #4156

avnerbarr · 2019-01-28T12:25:27Z

```
scylla --version
3.0.1-0.20190120.3c4f8cf6e
```

Testing the filtering feature in cqlsh: it returns one result but doesn't "page" (or continue, whatever the term is) to the subsequent results. It just shows the "---MORE---" footnote.

```
cqlsh> SELECT * FROM data_feeds.data_feeds_log where section_id=8767466 allow filtering;

 section_id | feed_id | start_time                      | deletions | end_time                        | insertions | message | modifications | version
------------+---------+---------------------------------+-----------+---------------------------------+------------+---------+---------------+---------------
    8767466 |    4142 | 2019-01-27 15:53:48.928000+0000 |         0 | 2019-01-27 15:53:49.443000+0000 |        150 |    null |             0 | 1548604428498

---MORE---
---MORE---
---MORE---
---MORE---
```

```
cqlsh> DESCRIBE data_feeds.data_feeds_log;

CREATE TABLE data_feeds.data_feeds_log (
    section_id int,
    feed_id int,
    start_time timestamp,
    deletions int,
    end_time timestamp,
    insertions int,
    message text,
    modifications int,
    version text,
    PRIMARY KEY ((section_id, feed_id), start_time)
) WITH CLUSTERING ORDER BY (start_time ASC)
```


psarna · 2019-01-28T12:33:39Z

What ---MORE--- represents here is empty pages: all the results in them were filtered out. It was a design decision whether the coordinator should (1) return empty pages to the client, which the protocol allows, or (2) skip empty pages until it finds one that has at least one row. The downside of option 2 is that it could easily lead to client-side timeouts if the number of rows to be filtered is large, so we went with option 1.
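
As a rough illustration of option 1, here is a toy Python model of a coordinator that scans a fixed number of rows per page and filters afterwards. It is only a sketch: `fetch_page`, the integer paging state and the two-column rows are made up for the example and are not Scylla's actual code or data model.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, int]  # (section_id, feed_id): simplified stand-in for a table row


def fetch_page(rows: List[Row], paging_state: int, page_size: int,
               section_id: int) -> Tuple[List[Row], Optional[int]]:
    """Scan up to page_size rows starting at paging_state, then filter them.

    Returns (matching_rows, next_paging_state). next_paging_state is None
    once the scan has reached the end of the data. (Hypothetical helper.)
    """
    scanned = rows[paging_state:paging_state + page_size]
    matching = [r for r in scanned if r[0] == section_id]
    next_state = paging_state + page_size
    return matching, (next_state if next_state < len(rows) else None)


# 500 rows, only one of which passes the filter.
table = [(i, 0) for i in range(500)]
state: Optional[int] = 0
page_sizes = []
while state is not None:
    page, state = fetch_page(table, state, page_size=100, section_id=42)
    page_sizes.append(len(page))

print(page_sizes)  # [1, 0, 0, 0, 0]
```

The first page carries the single matching row and the remaining four come back empty, which mirrors the one row followed by four `---MORE---` prompts in the report above.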

slivne · 2019-01-28T12:42:40Z

Please note:

The cqlsh default page size is very small (100).
You can increase the page size, which forces each page to fetch more rows to filter through.
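
To get a feel for the effect of the page size, here is a small toy calculation in Python. It assumes, as a simplification, that the page size bounds the number of rows scanned per page; `page_stats` and the row positions are invented for the illustration.

```python
import math
from typing import Set, Tuple


def page_stats(total_rows: int, matching: Set[int], page_size: int) -> Tuple[int, int]:
    """Return (pages_returned, empty_pages) for a full filtered scan.

    `matching` holds the positions of the rows that pass the filter.
    """
    pages = math.ceil(total_rows / page_size)
    empty = 0
    for p in range(pages):
        start = p * page_size
        end = min(start + page_size, total_rows)
        if not any(start <= m < end for m in matching):
            empty += 1
    return pages, empty


matching_rows = {42, 7_300}  # hypothetical positions of the two matching rows
small = page_stats(10_000, matching_rows, page_size=100)
large = page_stats(10_000, matching_rows, page_size=1_000)
print(small, large)  # (100, 98) (10, 8)
```

A larger page size does not remove empty pages, but in this model it cuts the number of round trips (and `---MORE---` prompts) by roughly the same factor.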

duarten · 2019-01-28T12:44:12Z

Indeed, we could enhance our cqlsh to automatically page empty result sets. I opened scylladb/scylla-tools-java#81.

slivne · 2019-01-28T12:54:33Z

I think the bigger question is how to better handle the case of empty pages in ALLOW FILTERING. We did have some other alternatives, such as fetching additional results until we get closer to the query's timeout.
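
A minimal Python sketch of that alternative, under the assumption that the coordinator keeps scanning chunks until it has at least one matching row or its time budget is spent. `fetch_page_with_budget`, `FakeClock` and the budget values are hypothetical; the fake clock just makes the example deterministic.

```python
from typing import Callable, List, Optional, Tuple


class FakeClock:
    """Deterministic clock for the example: advances one 'second' per call."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        self.now += 1.0
        return self.now


def fetch_page_with_budget(rows: List[int], state: int, chunk_size: int,
                           predicate: Callable[[int], bool],
                           clock: Callable[[], float],
                           budget_s: float) -> Tuple[List[int], Optional[int]]:
    """Scan chunk after chunk until a row matches, the data ends,
    or the time budget runs out. Returns (matching_rows, next_state)."""
    deadline = clock() + budget_s
    matching: List[int] = []
    while state < len(rows):
        chunk = rows[state:state + chunk_size]
        state += len(chunk)
        matching.extend(r for r in chunk if predicate(r))
        if matching or clock() >= deadline:
            break
    return matching, (state if state < len(rows) else None)


def is_match(r: int) -> bool:
    return r == 450


data = list(range(1000))

# Generous budget: the empty chunks before row 450 are skipped server-side.
page, state = fetch_page_with_budget(data, 0, 100, is_match, FakeClock(), budget_s=10)
print(page, state)  # [450] 500

# Tight budget: the page is still empty once the budget is spent.
page2, state2 = fetch_page_with_budget(data, 0, 100, is_match, FakeClock(), budget_s=2)
print(page2, state2)  # [] 200
```

The second call shows the limit of the approach: when the budget runs out before a match is found, an empty page still goes back to the client.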

duarten · 2019-01-28T12:58:24Z

That's not a complete solution, since we can still send empty pages to the client. I still think there's value in modifying cqlsh.

Indeed we can try and fetch more pages if we have budget for it. But would we apply this solution if we got a single row back? I don't think we need to avoid these intermediate user requests for this case, since allow filtering is not supposed to be an efficient operation - there's even a warning about that.

slivne · 2019-01-28T13:01:18Z

> On Mon, Jan 28, 2019 at 2:58 PM Duarte Nunes wrote:
> That's not a complete solution, since we can still send empty pages to the client. I still think there's value in modifying cqlsh. Indeed we can try and fetch more pages if we have budget for it. But would we apply this solution if we got a single row back?

No - I guess.

> I don't think we need to avoid these intermediate user requests for this case, since allow filtering is not supposed to be an efficient operation - there's even a warning about that.

It's more efficient than having the client fetch all the data and do the filtering on its own. Fair enough, though: the client can do it to some extent on its end by increasing the page size.


avikivity · 2019-01-28T13:09:50Z

I don't think we should play games with trying to fetch more rows. There will always be a case where it doesn't work (consider a query that scans the entire table and returns no rows).

The inefficiency here is in the query; adding an extra hop to the client makes it less efficient, but not by much. Filtering should be done when the query returns >1% of the data; below that it is inefficient.

avnerbarr · 2019-01-28T20:16:45Z

There are many cases where "allow filtering" is used for experimenting with and examining the data directly in cqlsh, not for efficiency reasons. This is required when the key is "synthetic" or "bucketed" (e.g., by hour) because of partitioning constraints. In this case the feature is meant as a convenience and should be easy to use; it is not there for performance optimization.

duarten · 2019-01-28T20:17:40Z

Agreed, but it's still a tooling issue, not a DB issue.

nyh · 2019-01-29T08:36:56Z

I think this is a cqlsh bug. cqlsh documentation states that "PAGING ON displays query results in 100-line chunks followed by the more prompt". But this is not really what it does. It asks Cassandra or Scylla for up to 100 rows, and outputs the page it got. But that page may be smaller than 100 rows; it may even be (as we noticed) empty. cqlsh needs to fetch the next page in this case, again and again, until it has 100 rows to output, if it really wants to obey its documentation. Alternatively, cqlsh's documentation can be modified.
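
The client-side loop described above could look roughly like this in Python. This is a sketch only, not cqlsh's actual code: `screenfuls`, `fake_server` and the integer paging state are invented for the illustration.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[int], Optional[int]]  # (rows, next paging state or None)


def screenfuls(fetch_page: Callable[[int], Page],
               screen_rows: int = 100) -> Iterator[List[int]]:
    """Yield chunks of exactly screen_rows rows (fewer only for the last one),
    fetching as many server pages as needed to fill each chunk."""
    buffer: List[int] = []
    state: Optional[int] = 0
    while state is not None:
        rows, state = fetch_page(state)
        buffer.extend(rows)  # an empty page simply adds nothing
        while len(buffer) >= screen_rows:
            yield buffer[:screen_rows]
            buffer = buffer[screen_rows:]
    if buffer:
        yield buffer


def fake_server(state: int) -> Page:
    """Toy server: scans 100 of 1000 rows per page, keeps multiples of 250."""
    scanned = range(state, min(state + 100, 1000))
    rows = [r for r in scanned if r % 250 == 0]
    nxt = state + 100
    return rows, (nxt if nxt < 1000 else None)


print(list(screenfuls(fake_server)))                 # [[0, 250, 500, 750]]
print(list(screenfuls(fake_server, screen_rows=2)))  # [[0, 250], [500, 750]]
```

Six of the ten server pages in this toy run are empty, yet the user only ever sees full chunks (or the final partial one), so a "more" prompt would appear only when there is really something more to show.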

This is similar to piping to "more" in Linux: "something | more" is supposed to show a full screen (e.g., 25 lines) of output at a time. If "something" outputs just 2 lines (or zero lines) and pauses, more waits longer until 25 lines are available and only then says "--More--". It doesn't just show a "--More--" after two lines of output because that's what it got quickly.

But I do agree with @slivne that changes to Scylla (to try to produce more results for a longer time instead of quickly returning a page of zero results) would have made this issue less important. But I wonder if it doesn't make even more sense to just fix cqlsh, and optimize Scylla for this case (to return fewer silly empty pages) later.

avikivity · 2019-01-29T09:10:21Z

It's not possible, in the general case, to fix this in the database. Suppose you filter all the data away; scylla will have to scan the entire data set before returning the final page (which would also be empty).

tgrabiec · 2019-01-29T11:01:46Z

It's more efficient to scan the table on the scylla side without involving the client, so that network latency to the client doesn't limit throughput.

The client shouldn't set a timeout for queries whose completion time is unbounded and for which it is willing to wait as long as it takes. Filtering queries and count(*) are such operations.

For canceling queries on the server side for which no one is waiting we should have a different mechanism - connection drop should cancel them. The driver should also expose a way to cancel.

duarten · 2019-01-29T11:08:37Z

I don't understand why we should be interested in optimizing allow filtering queries, or queries that are performed without paging. I think aggregation is a different topic, and something we should support efficiently.

I also don't think it's wise for clients to perform operations without setting a timeout, or they can easily block the application for an arbitrary amount of time.

tgrabiec · 2019-01-29T12:18:00Z

> On Tue, Jan 29, 2019 at 12:08 PM Duarte Nunes wrote:
> I don't understand why we should be interested in optimizing allow filtering queries, or queries that are performed without paging.

We can certainly wait until a user with a weird use case comes along and complains.

> I think aggregation is a different topic, and something we should support efficiently.

Both aggregate queries and filtered queries have the same issue with timeouts, so they could share the solution.

> I also don't think it's wise for clients to perform operations without setting a timeout, or they can easily block the application for an arbitrary amount of time.

If there is an upper bound which stems from the business logic, then it should be imposed by the application, but I think that's for the application to decide, not the driver. When I run a filtering query via cqlsh, I don't want it to time out, unless I'm no longer interested in the results. ^C should cancel the query. Detecting failures or deadlocks within the system is a different matter.

duarten · 2019-01-29T12:50:43Z

> > On Tue, Jan 29, 2019 at 12:08 PM Duarte Nunes wrote: I don't understand why we should be interested in optimizing allow filtering queries, or queries that are performed without paging.
>
> We can certainly wait until a user with a weird use case comes along and complains.

:D

But on the serious side, I don't think users expect allow filtering queries to perform optimally, and I also don't think the interstitial round trips to and from the client will make a huge difference for these types of queries.

> > I think aggregation is a different topic, and something we should support efficiently.
>
> Both aggregate queries and filtered queries have the same issue with timeouts, so they could share the solution.

A filtering query is more or less sequential, and the result set is bounded by the page size. An aggregation is fully concurrent, so there are other vectors for optimization (e.g., map-reduce).

A potential solution that we discussed would be to send keep-alives as empty pages, but then we'd see the same behavior.

> > I also don't think it's wise for clients to perform operations without setting a timeout, or they can easily block the application for an arbitrary amount of time.
>
> If there is an upper bound which stems from the business logic, then it should be imposed by the application, but I think that's for the application to decide, not the driver. When I run a filtering query via cqlsh, I don't want it to time out, unless I'm no longer interested in the results. ^C should cancel the query. Detecting failures or deadlocks within the system is a different matter.

Agreed, my point was simply that we can't recommend that users avoid timeouts, because I don't think that's the general case. If they are able, and aware of the risks, then querying without a timeout is legitimate.

slivne · 2019-02-03T11:57:53Z

We tested this on Cassandra and it does not return an empty page. So although the protocol supports it, it's clear we do not behave in the same manner.

One solution that tzach suggested is to make the drivers "smarter" and fetch the next page if the returned page was empty.

duarten · 2019-02-03T12:02:49Z

> On Sun, Feb 3, 2019 at 12:57 Shlomi Livne wrote:
> One solution that tzach suggested is to make the drivers "smarter" and fetch the next page if the returned page was empty.

The drivers naturally do this, it's just cqlsh that doesn't. See my first comment here - I already opened an issue in scylla-tools-java.


avnerbarr changed the title ~~Allow filtering doesn't~~ Allow filtering doesn't page correctly Jan 28, 2019

duarten added the status/wontfix label Jan 28, 2019

slivne closed this as completed Mar 14, 2019

glommer mentioned this issue Jun 11, 2019

allow filtering queries with aggregation that page returns intermediate aggregated results, not the full one #4540

Closed

1 task
