There is no easy way to retry on failures when indexing in bulk #130

@vytas-dauksa

The Search::Elasticsearch client tracks two error types internally:

  • "NoNodes" cxn errors, which don't clear the buffer and are retried immediately at least once (the Async::Static::NoPing cxn_pool accepts max_retries, which sets the maximum number of retries). Other cxn errors, however, clear the buffer, which makes it hard to retry on failure.
  • Individual action failures, such as EsRejectedExecutionException[rejected execution (queue capacity 50) on org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction..], which clear the buffer and are not retried.

In either error case it is difficult to retry on failure.

For individual bulk action failures there are handy on_error and on_conflict handlers, but they only provide the action, the response and the index in the callback. I would argue the latter should not be included, as it only exposes internal workings. I think it would be much more useful to provide the original request, so that the client user could requeue it or otherwise handle it.
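To make the request concrete, here is a minimal sketch of the behaviour I have in mind. It is written in Python for brevity and is not the Search::Elasticsearch API: `BulkBuffer`, `send` and the callback signature are hypothetical names, and the per-item result format is an assumption. The point is only that the error callback receives the original item, so the caller can put it back in the buffer.

```python
class BulkBuffer:
    """Toy stand-in for a bulk helper (hypothetical, not the real client)."""

    def __init__(self, send, on_error):
        self.send = send          # callable: list of items -> list of per-item results
        self.on_error = on_error  # callable: (original_item, result) -> None
        self.items = []

    def add(self, item):
        self.items.append(item)

    def flush(self):
        # Swap the buffer out first, so items requeued by on_error
        # land in the fresh buffer instead of being lost.
        batch, self.items = self.items, []
        if not batch:
            return
        results = self.send(batch)
        for item, result in zip(batch, results):
            if result.get("error"):
                # Hand back the original request, not an internal index.
                self.on_error(item, result)
```

With this shape, requeueing a rejected action is a one-liner such as `on_error=lambda item, result: buf.add(item)`, and the next flush sends it again.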

For cxn errors, it is unclear which errors other than "NoNodes" exist and why they clear the buffer. With Async, it is difficult to handle even "NoNodes" errors; for instance, the user may want a delay between retries. I haven't checked, but wouldn't the ES client in bulk mode auto-flush and keep dispatching after each action if there is a "NoNodes" cxn error and the queue is larger than max_count or max_size?
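As an illustration of the delay between retries mentioned above, here is a rough sketch, again in Python and not the client's actual API. `NoNodesError`, `flush_with_retry` and its parameters are all hypothetical; the doubling backoff is just one possible policy. The batch is never modified, so it is still available to the caller if every attempt fails.

```python
import time


class NoNodesError(Exception):
    """Hypothetical stand-in for the client's "NoNodes" cxn error."""


def flush_with_retry(send, batch, max_retries=3, delay=0.5, sleep=time.sleep):
    """Send batch, retrying on NoNodesError with a growing delay.

    The batch is left untouched; after max_retries failed retries
    the last error is re-raised so the caller can decide what to do.
    """
    attempt = 0
    while True:
        try:
            return send(batch)
        except NoNodesError:
            if attempt >= max_retries:
                raise
            # Wait delay, 2*delay, 4*delay, ... before the next attempt.
            sleep(delay * (2 ** attempt))
            attempt += 1
```

Passing `sleep` as a parameter is only there so the waiting can be replaced, for example by a timer in an async event loop.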
