Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abort too-large non-paged queries #5870

Closed
dyasny opened this issue Feb 21, 2020 · 11 comments
Closed

Abort too-large non-paged queries #5870

dyasny opened this issue Feb 21, 2020 · 11 comments

Comments

@dyasny
Copy link
Contributor

dyasny commented Feb 21, 2020

This is a request similar to #5804 but wrt non-paged queries.

We keep receiving reports about memory allocation issues on nodes, when in fact the client was running a non-paged query.

I'd like to request to add a special treatment for non-paged queries, so when they are too large they will get aborted and an error will be sent to the client, instead of crashing nodes on bad_allocs

@slivne
Copy link
Contributor

slivne commented Feb 23, 2020

For paged queries we have already changed the code so we will accumulate up to 1M in a result page and then return it, making sure the query at least on the coordinator will not consume to much memory.

We should abort queries that consume to much memory on the coordinator side and have a counter attached to that.

@denesb
Copy link
Contributor

denesb commented Feb 24, 2020

@slivne we have to do this on the replicas, as by the time results get to the coordinator it might be too late, a single replica can have more data then memory.

@denesb
Copy link
Contributor

denesb commented Mar 10, 2020

Maybe we do have to do #5919 before we can do this. Materialized views also issue unpaged queries when generating view updates, and we definitely don't want to abort those. See #5983.

We'll have to carefully tag all internal queries to avoid suprizes.

@dyasny
Copy link
Contributor Author

dyasny commented Mar 10, 2020

@denesb according to @nyh MVs do not issue queries as such, and this feature should not break that functionality. See https://github.com/scylladb/scylla-enterprise/issues/1279#issuecomment-596234707

@denesb
Copy link
Contributor

denesb commented Mar 10, 2020

@dyasny rows can be big, even a single cell can be bigger than the limit (of 1MB). Since we know MVs read single rows, we can tag its queries as system and as such not subject to the limit.

@gleb-cloudius
Copy link
Contributor

gleb-cloudius commented Mar 10, 2020 via email

@denesb
Copy link
Contributor

denesb commented Mar 10, 2020

@gleb-cloudius we don't. For paged queries, we just close the page off whenever we go above the limit (by however much), for unpaged queries there is no limit.

The plan is to introduce a limit for unpaged queries as well, and be mores strict about it. Fail any queries that go above it.

@gleb-cloudius
Copy link
Contributor

gleb-cloudius commented Mar 10, 2020 via email

@dyasny
Copy link
Contributor Author

dyasny commented Mar 10, 2020

So maybe

  1. We should calculate the imit from the amount of shard memory instead of hardcoding it to 1Mb?
  2. Make the limit configurable (in scylla.yaml)

@denesb
Copy link
Contributor

denesb commented Mar 10, 2020

On Tue, Mar 10, 2020 at 06:14:11AM -0700, Botond Dénes wrote: The plan is to introduce a limit for unpaged queries as well, and be mores strict about it. Fail any queries that go above it.
We can punish unpaged queries all we want, but if we apply a limit (for paged queries) before there is a row ready to be returned it means there will be no way to read the row. Of course there is already such limit: shard memory.

I don't plan to do any changes for paged queries. For those the sky is already the shard's memory.
The plan is to simply don't allow any unpaged user query to read more than the limit (1MB) by default. We of course want to exclude system queries from this, and MV update is technically a system query, just wrongly tagged.

@denesb
Copy link
Contributor

denesb commented Mar 10, 2020

So maybe

1. We should calculate the imit from the amount of shard memory instead of hardcoding it to 1Mb?

What if there is another unpaged query executing already? In general determining how much memory is safely consumable is very hard. Users just shouldn't do unpaged queries.

2. Make the limit configurable (in scylla.yaml)

This I planned to do anyway, in fact there is already such a limit introduced in 75efa70 in scope of #5804. The limit is just not used for unpaged queries yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants