
Allow retries for statuses other than 429 in streaming_bulk #1004

Open · david-a opened this issue Aug 28, 2019 · 2 comments · May be fixed by #2071
Comments

@david-a commented Aug 28, 2019

Please allow retries on other statuses as well, not just 429. For example, streaming_bulk could accept an argument that defaults to [429], or a callback that tests the status or the error type.
Use case: sometimes the Elasticsearch cluster returns 403 - cluster_block_exception (e.g. during maintenance), and we want to retry only the failed items.

Currently, with raise_on_error=False the errors are aggregated but without their data (because _process_bulk_chunk only adds the data when raise_on_error=True or in the case of a TransportError), so we can't tell which items failed.
With raise_on_error=True, the bulk stops whenever it encounters an error, and you can't tell which chunk the error occurred in or which items should be retried.

https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/helpers/actions.py
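A sketch of what the requested API could look like. This is hypothetical: streaming_bulk currently retries only 429s (via its existing max_retries/initial_backoff parameters), and the retry_on_status argument below is the proposal, not an existing parameter.

```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import streaming_bulk

client = Elasticsearch()

def actions():
    # Illustrative action stream.
    for i in range(100):
        yield {"_index": "my-index", "_id": i, "field": i}

for ok, item in streaming_bulk(
    client,
    actions(),
    max_retries=5,               # existing parameter: retries 429s with backoff
    retry_on_status=(429, 403),  # PROPOSED (does not exist yet): also retry 403s
):
    if not ok:
        print("failed permanently:", item)
```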

@lucasrcezimbra commented Jun 23, 2020

@david-a

I was having the same problem, but with ConnectionTimeout. Reading the codebase, I found that the client accepts the arguments retry_on_timeout and retry_on_status, which are passed on to the Transport.

I set retry_on_timeout to True, which fixed my problem:

```python
client = Elasticsearch(hosts, retry_on_timeout=True)
```

Maybe passing retry_on_status will work for you.
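For transport-level retries (which happen immediately, with no backoff), something like the following should work with the 7.x client; the status tuple and retry count here are illustrative:

```python
from elasticsearch import Elasticsearch

hosts = ["http://localhost:9200"]  # illustrative

client = Elasticsearch(
    hosts,
    retry_on_timeout=True,       # retry ConnectionTimeout
    retry_on_status=(403, 429),  # also retry these HTTP statuses (no sleep in between)
    max_retries=5,
)
```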

@huntekah commented
I have to say, adding an option to bulk() for handling other status codes with exponential backoff would save me from duplicating code just to apply exponential backoff to 403 throttling exceptions such as:

AuthorizationException(403, '403 Request throttled due to too many requests /my-index_write/_bulk')

Exponential backoff works for both of those errors, but elasticsearch-py detects only one of them :(.

retry_on_status is nice, but it retries immediately, without any sleep in between.
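In the meantime, a minimal workaround sketch is to wrap bulk() in a manual exponential-backoff loop. bulk_with_backoff and its parameters below are hypothetical names, and note that actions must be re-iterable (e.g. a list), since a generator would be exhausted after the first attempt:

```python
import time

from elasticsearch.exceptions import AuthorizationException
from elasticsearch.helpers import bulk


def bulk_with_backoff(client, actions, max_retries=5, initial_backoff=2):
    """Retry the whole bulk request with exponential backoff on 403s."""
    for attempt in range(max_retries + 1):
        try:
            # Raises AuthorizationException when the _bulk request itself
            # comes back as a 403 (e.g. "Request throttled ...").
            return bulk(client, actions)
        except AuthorizationException:
            if attempt == max_retries:
                raise
            time.sleep(initial_backoff * 2 ** attempt)  # 2s, 4s, 8s, ...
```

This retries the entire request rather than only the failed items, which is exactly the duplication the requested bulk() option would avoid.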

ayayron added a commit to ayayron/elasticsearch-py that referenced this issue Sep 22, 2022
Closes elastic#1004. This updates elastic#1005 to work for both the async and sync clients, and adds tests.