Delete documents via delete_by_query if index.blocks.read_only_allow_delete is set to true #36114

Closed
makubi opened this issue Nov 30, 2018 · 7 comments · Fixed by #57184
Labels
:Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. >docs General docs changes help wanted adoptme Team:Distributed Meta label for distributed team Team:Docs Meta label for docs team

Comments

@makubi

makubi commented Nov 30, 2018

Elasticsearch version (bin/elasticsearch --version): 6.5.1

Plugins installed: []

JVM version (java -version): 1.8.0u191

OS version (uname -a if on a Unix-like system): 4.4.0-128-generic

Description of the problem including expected versus actual behavior:

Elasticsearch does not allow deleting documents via the delete_by_query API if index.blocks.read_only_allow_delete is set to true.
The error is:

{"took":6,"timed_out":false,"total":1,"deleted":0,"batches":1,"version_conflicts":0,"noops":0,"retries":{"bulk":0,"search":0},"throttled_millis":0,"requests_per_second":-1.0,"throttled_until_millis":0,"failures":[{"index":"my-index","type":"doc","id":"1234","cause":{"type":"cluster_block_exception","reason":"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"},"status":403}]}
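For anyone scripting around this response, here is a minimal Python sketch that picks out the failures caused by a cluster block. The helper name `blocked_failures` and the trimmed `SAMPLE` string are illustrative assumptions; the field names come from the response quoted above.

```python
import json


def blocked_failures(response_body: str) -> list:
    """Return the failures in a delete_by_query response caused by a cluster block."""
    response = json.loads(response_body)
    return [
        failure
        for failure in response.get("failures", [])
        if failure.get("cause", {}).get("type") == "cluster_block_exception"
    ]


# Trimmed copy of the response quoted above (only the fields used here).
SAMPLE = (
    '{"total":1,"deleted":0,"failures":[{"index":"my-index","type":"doc",'
    '"id":"1234","cause":{"type":"cluster_block_exception","reason":'
    '"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"},'
    '"status":403}]}'
)

for failure in blocked_failures(SAMPLE):
    # Each entry names the document that could not be deleted and the HTTP status.
    print(failure["index"], failure["id"], failure["status"])
```

Note that the overall request can still return a body with `"deleted":0` and a non-empty `failures` array, so checking `failures` is more reliable than checking `deleted` alone.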

I would expect that deletion via the delete_by_query api is allowed if index.blocks.read_only_allow_delete is set to true.

Steps to reproduce:

  1. Create an index
  2. Insert data
  3. Set index.blocks.read_only_allow_delete to true
  4. Try to delete data using the delete_by_query api
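The four steps might map to REST calls roughly as follows. This is a sketch that only builds and prints the requests, without sending them. The index name, document type, and id are taken from the error response above; the document body and the `match_all` query are assumptions, and `reproduction_requests` is a hypothetical helper name.

```python
import json

INDEX = "my-index"  # index name taken from the error response above


def reproduction_requests(index: str = INDEX) -> list:
    """Build (method, path, body) tuples for the four reproduction steps."""
    return [
        # 1. Create an index.
        ("PUT", f"/{index}", None),
        # 2. Insert data (type "doc" and id "1234" as in the error; body is hypothetical).
        ("PUT", f"/{index}/doc/1234", {"field": "value"}),
        # 3. Set the block on the index.
        ("PUT", f"/{index}/_settings", {"index.blocks.read_only_allow_delete": True}),
        # 4. Try to delete documents (match_all is an assumed query).
        ("POST", f"/{index}/_delete_by_query", {"query": {"match_all": {}}}),
    ]


for method, path, body in reproduction_requests():
    print(method, path, json.dumps(body) if body is not None else "")
```

Sending these in order against a 6.5.1 node should, per the report, produce the 403 failure on the last request.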

Provide logs (if relevant):

@jasontedor jasontedor added >bug :Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. labels Nov 30, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@nik9000
Member

nik9000 commented Nov 30, 2018

It looks like read_only_allow_delete is documented as "Identical to index.blocks.read_only but allows deleting the index to free up resources." Deleting documents from an index actually makes the index bigger in the short run. In the long run we merge the deleted documents out of the index and the index shrinks on disk. Those merges have to write a second copy of the live data, and I'm pretty sure we won't start a merge that we don't appear to have room to finish.

@ywelsch
Contributor

ywelsch commented Dec 12, 2018

As @nik9000 pointed out, allowing document deletion will not immediately make the index smaller, but can even temporarily make it larger. We should improve the error message and documentation here, however. I'm marking the issue as adoptme (for the docs/error message change).

@makubi
Author

makubi commented Dec 14, 2018

Hi @nik9000 and @ywelsch, thank you for the information.

You're right that the documentation for index.blocks.read_only_allow_delete is correct, because it does not mention that deleting documents is allowed. Maybe the information you provided here can be helpful for others, nevertheless.

Thanks.

@wchrisdean
Contributor

[doc issue triage]

@rjernst rjernst added Team:Distributed Meta label for distributed team Team:Docs Meta label for docs team labels May 4, 2020
@lockewritesdocs
Contributor

Under Dynamic index settings, the description for index.blocks.read_only_allow_delete reads:

Similar to index.blocks.read_only but also allows deleting the index to free up resources.

After a bit of research, I propose the following changes:

  1. Consider removing "to free up resources" from the existing description, since that seems to be a misnomer based on the conversations above.

  2. Add the default, which appears to be false (based on this line).

  3. Add a clarifying note to the effect of:

NOTE: Deleting documents from an index to release resources can increase the index size over time. Instead, consider this other thing.

Admittedly, I don't know what this other thing is. Should we invite users to review the General recommendations and possibly Tune for disk usage? @nik9000 and @ywelsch, I defer to your wisdom here.

@DaveCTurner
Contributor

Thanks for picking this up @lockewritesdocs.

There are two very distinct concepts here: deleting documents from an index, and deleting the index itself. Deleting documents from an index doesn't necessarily free up resources and may in fact consume additional resources, and is therefore not permitted when the read-only-allow-delete block is in place. Deleting the index itself does free up resources pretty much straight away, and is a sensible way to free up disk space and therefore release a read-only-allow-delete block.

Delete-by-query is one way to delete documents from an index, and doesn't delete the index itself, so it's a special case of the above.
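The distinction can be written down as a toy model. This is only an illustration of the rule described above, not how Elasticsearch implements the block; the operation names are labels invented for this sketch.

```python
# Labels for this sketch only; they are not Elasticsearch identifiers.
INDEX_LEVEL_DELETES = {"delete_index"}
DOCUMENT_LEVEL_WRITES = {"index_document", "delete_document", "delete_by_query"}


def allowed_under_block(operation: str) -> bool:
    """Say whether an operation is permitted under read-only-allow-delete."""
    if operation in INDEX_LEVEL_DELETES:
        # Deleting the whole index frees disk space almost immediately.
        return True
    if operation in DOCUMENT_LEVEL_WRITES:
        # Document-level writes, including deletes, may consume more space first.
        return False
    raise ValueError(f"operation not covered by this sketch: {operation}")


for op in ("delete_index", "delete_document", "delete_by_query"):
    print(op, allowed_under_block(op))
```

Delete-by-query sits in the document-level group, which is why it is rejected while the block is in place.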

Therefore:

  1. I think the "to free up resources" bit is correct, but perhaps we can clarify the distinction between the two concepts mentioned above.

  2. Elasticsearch adds and removes the read-only-allow-delete block automatically, so I think it might be misleading to document that the default is false. In fact I would like there to be stronger wording to discourage users from setting this block themselves: the intention is that it's only applied automatically.

  3. Adding a clarifying note seems reasonable, and this other thing is deleting indices as opposed to deleting documents from indices.

lockewritesdocs pushed a commit to lockewritesdocs/elasticsearch that referenced this issue May 26, 2020
lockewritesdocs pushed a commit that referenced this issue May 28, 2020
* Changes for issue #36114.

* Adding stronger wording to the new note.

* Removing statement about typically not needing to set the read-only allow delete block.

* Replacing Elasticsearch with {es} variable.