Abort too-large non-paged queries #5870

dyasny · 2020-02-21T18:28:27Z

This is a request similar to #5804 but wrt non-paged queries.

We keep receiving reports about memory allocation issues on nodes, when in fact the client was running a non-paged query.

I'd like to request special treatment for non-paged queries: when they are too large, they should be aborted and an error sent to the client, instead of crashing nodes with bad_allocs.

slivne · 2020-02-23T11:12:29Z

For paged queries we have already changed the code so that we accumulate up to 1MB in a result page and then return it, making sure the query, at least on the coordinator, will not consume too much memory.

We should abort queries that consume too much memory on the coordinator side and have a counter attached to that.
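A minimal C++ sketch of the "abort and count" idea, assuming a per-query memory limit is already known. The names `query_stats`, `aborted_oversized_queries` and `check_memory` are hypothetical and are not Scylla's actual metrics or API.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical counter holder; a real implementation would export this as a metric.
struct query_stats {
    std::uint64_t aborted_oversized_queries = 0;
};

// Returns an error message if the query must be aborted, std::nullopt otherwise.
// Every abort bumps the counter so operators can see how often it happens.
std::optional<std::string> check_memory(std::size_t consumed_bytes,
                                        std::size_t limit_bytes,
                                        query_stats& stats) {
    if (consumed_bytes <= limit_bytes) {
        return std::nullopt;
    }
    ++stats.aborted_oversized_queries;
    return "query aborted: consumed " + std::to_string(consumed_bytes) +
           " bytes, limit is " + std::to_string(limit_bytes);
}
```

The error string is what would be sent back to the client in place of a bad_alloc on the node.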

denesb · 2020-02-24T12:37:26Z

@slivne we have to do this on the replicas, as by the time results get to the coordinator it might be too late: a single replica can have more data than memory.

denesb · 2020-03-10T09:41:24Z

Maybe we do have to do #5919 before we can do this. Materialized views also issue unpaged queries when generating view updates, and we definitely don't want to abort those. See #5983.

We'll have to carefully tag all internal queries to avoid surprises.

dyasny · 2020-03-10T13:06:12Z

@denesb according to @nyh MVs do not issue queries as such, and this feature should not break that functionality. See https://github.com/scylladb/scylla-enterprise/issues/1279#issuecomment-596234707

denesb · 2020-03-10T13:11:03Z

@dyasny rows can be big, even a single cell can be bigger than the limit (of 1MB). Since we know MVs read single rows, we can tag their queries as system queries, which are not subject to the limit.

gleb-cloudius · 2020-03-10T13:13:00Z

On Tue, Mar 10, 2020 at 06:11:20AM -0700, Botond Dénes wrote:

> @dyasny rows can be big, even a single cell can be bigger than the limit (of 1MB). Since we know MVs read single rows, we can tag its queries as system and as such not subject to the limit.

I do not think we enforce a limit for a single row; otherwise it would be impossible to read a large row.

…

-- Gleb.

denesb · 2020-03-10T13:14:10Z

@gleb-cloudius we don't. For paged queries, we just close the page off whenever we go above the limit (by however much); for unpaged queries there is no limit.

The plan is to introduce a limit for unpaged queries as well, and be more strict about it: fail any query that goes above it.
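A minimal C++ sketch of the difference between the two behaviours: a soft limit that closes the page for paged queries, and a hard limit that fails unpaged ones. All names here (`result_size_accounter`, `paging`, `query_too_large`) are hypothetical and not taken from the Scylla code base.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

enum class paging { paged, unpaged };

// Hypothetical error type reported to the client when an unpaged query is too large.
struct query_too_large : std::runtime_error {
    using std::runtime_error::runtime_error;
};

class result_size_accounter {
    std::size_t _limit;
    paging _mode;
    std::size_t _used = 0;
public:
    result_size_accounter(std::size_t limit, paging mode)
        : _limit(limit), _mode(mode) {
        assert(limit > 0);
    }

    // Accounts for one row. Returns true if more rows may be added, false if
    // the current page should be closed after this row. Throws
    // query_too_large when an unpaged query goes above the limit.
    bool add_row(std::size_t row_bytes) {
        _used += row_bytes;
        if (_used <= _limit) {
            return true;
        }
        if (_mode == paging::paged) {
            // Soft limit: the row that crossed the limit is still part of the
            // page, so a single oversized row can always be read.
            return false;
        }
        // Hard limit: fail instead of growing without bound.
        throw query_too_large("unpaged query result exceeds the memory limit");
    }

    std::size_t used() const { return _used; }
};
```

In this sketch a paged read of one 4KB row against a 1KB limit still succeeds and simply ends the page, while the same read in an unpaged query fails.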

gleb-cloudius · 2020-03-10T13:17:47Z

On Tue, Mar 10, 2020 at 06:14:11AM -0700, Botond Dénes wrote:

> The plan is to introduce a limit for unpaged queries as well, and be more strict about it. Fail any queries that go above it.

We can punish unpaged queries all we want, but if we apply a limit (for paged queries) before there is a row ready to be returned, it means there will be no way to read the row. Of course, there is already such a limit: shard memory.

…

-- Gleb.

dyasny · 2020-03-10T13:20:06Z

So maybe:

1. We should calculate the limit from the amount of shard memory instead of hardcoding it to 1MB?
2. Make the limit configurable (in scylla.yaml)

denesb · 2020-03-10T13:24:36Z

> On Tue, Mar 10, 2020 at 06:14:11AM -0700, Botond Dénes wrote:
>
> > The plan is to introduce a limit for unpaged queries as well, and be more strict about it. Fail any queries that go above it.
>
> We can punish unpaged queries all we want, but if we apply a limit (for paged queries) before there is a row ready to be returned, it means there will be no way to read the row. Of course, there is already such a limit: shard memory.

I don't plan to make any changes for paged queries. For those, the limit is already the shard's memory.
The plan is simply to not allow any unpaged user query to read more than the limit (1MB by default). We of course want to exclude system queries from this, and an MV update is technically a system query, just wrongly tagged.
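A minimal C++ sketch of how the exemption could look once queries are tagged by initiator. The `query_origin` tag and `unpaged_limit_for` helper are hypothetical names for illustration, not the actual classification from #5919.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical tag describing who issued the query.
enum class query_origin { user, system };

// Sentinel meaning "not subject to the unpaged limit".
constexpr std::size_t no_limit = std::numeric_limits<std::size_t>::max();

// User queries get the configured limit; system queries (for example the
// reads issued while generating view updates) are exempt.
constexpr std::size_t unpaged_limit_for(query_origin origin,
                                        std::size_t configured_limit) {
    return origin == query_origin::system ? no_limit : configured_limit;
}
```

With this shape, a mis-tagged internal read is the only way for an internal query to hit the limit, which is why the tagging has to be done carefully.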

denesb · 2020-03-10T13:27:00Z

> So maybe
>
> 1. We should calculate the limit from the amount of shard memory instead of hardcoding it to 1MB?

What if there is another unpaged query executing already? In general, determining how much memory is safely consumable is very hard. Users just shouldn't do unpaged queries.

> 2. Make the limit configurable (in scylla.yaml)

I planned to do this anyway; in fact, such a limit was already introduced in 75efa70 in the scope of #5804. The limit is just not used for unpaged queries yet.
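To illustrate the concern about concurrent unpaged queries, here is a small C++ sketch of the worst-case arithmetic. The helper `worst_case_bytes` and the 1 GiB shard size are assumptions made up for the example.

```cpp
#include <cstddef>

// Hypothetical helper: upper bound on memory held by unpaged reads if every
// concurrent query consumes its full per-query limit.
constexpr std::size_t worst_case_bytes(std::size_t per_query_limit,
                                       std::size_t concurrent_queries) {
    return per_query_limit * concurrent_queries;
}

constexpr std::size_t MiB = 1024 * 1024;
constexpr std::size_t shard_memory = 1024 * MiB; // assumed 1 GiB shard

// A limit derived as half of the shard's memory is exhausted by two queries.
static_assert(worst_case_bytes(shard_memory / 2, 2) == shard_memory);
// A fixed 1MB limit keeps two concurrent queries at 2MB in total.
static_assert(worst_case_bytes(1 * MiB, 2) == 2 * MiB);
```

Any limit expressed as a fraction of shard memory has this problem unless the number of concurrent unpaged queries is also bounded.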

slivne added enhancement area/stability labels Feb 23, 2020

slivne added this to the 3.x milestone Feb 23, 2020

slivne added the Eng-3 label Feb 23, 2020

slivne assigned bhalevy Feb 23, 2020

denesb mentioned this issue Feb 24, 2020

Introduce hard memory limit for non-paged queries #5889

Closed

denesb mentioned this issue Feb 28, 2020

Classify queries based on initiator, instead of target table #5919

Closed

bhalevy assigned denesb Mar 3, 2020

scylladb-promoter closed this as completed in fea5067 Jul 30, 2020
