Migrations should dynamically adjust batch size to prevent failing on 413 errors from Elasticsearch #107641
Comments
Pinging @elastic/kibana-core (Team:Core)
yeah, I suggested the same approach in CRA, but IMO it's valuable to analyze why we have such a large SO. Is it even a valid case to store several MB in a single SO? It might lead to …
While we haven't been able to reproduce this exactly, we have seen cases where batches approach 85 MB, and we suspect this happens on large batches of index-patterns. @mattkime Do we still need to store the field list in the index-pattern saved object? My understanding is that we're now fetching the field list on the fly. Is it possible to add a migration that strips these fields from the SOs?
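If the cached field list really is no longer needed, a migration could drop it. A minimal sketch, assuming the list lives under a `fields` attribute of the index-pattern saved object (the attribute name and the registration wiring are assumptions, not confirmed by this thread):

```typescript
// Hypothetical sketch: a saved-object migration that drops the cached
// field list from index-pattern documents. The `fields` attribute name
// is an assumption for illustration.
import type { SavedObjectMigrationFn } from 'src/core/server';

const stripCachedFields: SavedObjectMigrationFn<any, any> = (doc) => {
  // Remove the (potentially multi-MB) cached field list; the list is
  // now fetched on the fly from Elasticsearch instead.
  const { fields, ...attributes } = doc.attributes;
  return { ...doc, attributes };
};
```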
In one case this was caused by …
Only saw this after creating my own issue, which I've now closed as a duplicate. From #108708:

Thoughts on implementation... For simplicity we should continue to read batches of `batchSize` documents. So the dynamic batch size would only be relevant when indexing documents (the `TRANSFORMED_DOCUMENTS_BULK_INDEX` step).
Unless (1) adds significant CPU overhead and slows down migrations, it feels like the preferred approach.
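From the surrounding discussion, option (1) appears to be pre-calculating batch boundaries from the serialized size of each document. A minimal sketch of that idea, assuming a byte budget derived from `http.max_content_length` (the constant and function names are illustrative, and a production version would also account for the per-document bulk action metadata):

```typescript
import type { SavedObjectsRawDoc } from 'src/core/server';

const MAX_BATCH_SIZE_BYTES = 100 * 1024 * 1024; // mirrors the http.max_content_length default

function createDocBatches(docs: SavedObjectsRawDoc[]): SavedObjectsRawDoc[][] {
  const batches: SavedObjectsRawDoc[][] = [];
  let current: SavedObjectsRawDoc[] = [];
  let currentBytes = 0;
  for (const doc of docs) {
    // JS-side estimate of the bytes this doc adds to the bulk body; this
    // is only an approximation of what Elasticsearch will receive.
    const docBytes = Buffer.byteLength(JSON.stringify(doc), 'utf8');
    if (current.length > 0 && currentBytes + docBytes > MAX_BATCH_SIZE_BYTES) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    // Note: a single doc above the budget still gets its own batch here
    // and would need separate handling (e.g. failing with a clear error).
    current.push(doc);
    currentBytes += docBytes;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```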
What if we talk to the Elasticsearch team about including the actual request size and the max allowed request size in the error response? Then we could use these limits to lower the number of objects sent. This approach is quite similar to option (2), but it might be a bit faster than halving the payload.
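A hypothetical sketch of what that could enable, assuming a future 413 response carried both sizes (it doesn't today, hence the Elasticsearch issue opened later in this thread); both response fields here are invented for illustration:

```typescript
// Shrink the batch proportionally using limits that a future
// Elasticsearch 413 response might carry.
interface PayloadTooLargeInfo {
  requestBytes: number; // hypothetical: size of the rejected request body
  maxContentLengthBytes: number; // hypothetical: the cluster's http.max_content_length
}

function nextBatchLength(currentLength: number, info: PayloadTooLargeInfo): number {
  // Scale the document count by allowed/actual, with a 10% safety margin,
  // so the next retry is likely to fit on the first attempt instead of
  // being halved repeatedly.
  const ratio = info.maxContentLengthBytes / info.requestBytes;
  return Math.max(1, Math.floor(currentLength * ratio * 0.9));
}
```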
I agree it's rather low risk, but uncontrolled growth of SO size might sooner or later lead to this problem. Shouldn't we have at least an approximate understanding of what the max heap size used by a batch might be? We can use …
Yeah, I agree: having no upper bound on saved object size makes it really hard to create a robust system.
Are you suggesting that, instead of calculating each batch size like in (1), we only recalculate the batch size using the limits returned by Elasticsearch? I like this approach; that way we're not artificially limiting the batch sizes when an ES cluster might already have a larger `http.max_content_length` configured.
Should this be added to #104083?
I've opened elastic/elasticsearch#76705, but for now will just continue assuming the limit is set to 100mb.
Discussed offline: yes, I suggested getting this information from an Elasticsearch response because data structure size might be different in JS (Kibana) and Java (Elasticsearch), so libs like … wouldn't necessarily measure the size Elasticsearch actually enforces.
We can start by providing a warning about excessive size in dev mode to prevent such cases: #109815
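A minimal sketch of what such a dev-mode check could look like (the threshold, logger, and wiring are all assumptions; see #109815 for the actual proposal):

```typescript
const WARN_SIZE_BYTES = 512 * 1024; // illustrative threshold: warn above 512 KB

function warnOnLargeSavedObject(type: string, id: string, doc: unknown, isDev: boolean) {
  if (!isDev) return;
  // Approximate the serialized size of the saved object on the JS side.
  const bytes = Buffer.byteLength(JSON.stringify(doc), 'utf8');
  if (bytes > WARN_SIZE_BYTES) {
    console.warn(
      `Saved object ${type}:${id} is ${bytes} bytes; very large objects can make ` +
        `migration batches exceed http.max_content_length.`
    );
  }
}
```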
Today, it's possible for a batch of objects being indexed during the `TRANSFORMED_DOCUMENTS_BULK_INDEX` step to fail with a 413 error from Elasticsearch, indicating that the uncompressed request body is larger than Elasticsearch's `http.max_content_length` config (which defaults to 100mb).

We should be able to better handle this scenario by either providing a log message to adjust setting(s) or automatically shrinking the size of the bulk index requests.

For instance, the `TRANSFORMED_DOCUMENTS_BULK_INDEX` step could automatically break the batch of documents in half to be indexed in two separate requests rather than one. I'd imagine this would work by:

1. Making `TransformedDocumentsBulkIndex.transformedDocs` an array of arrays: `Array<Array<SavedObjectsRawDoc>>`
2. When executing the `TRANSFORMED_DOCUMENTS_BULK_INDEX` action, only attempting to index the first batch in the `transformedDocs` array.
3. If the request fails with a 413, splitting `transformedDocs` into two batches of equal sizes and running the `TRANSFORMED_DOCUMENTS_BULK_INDEX` step again.
4. On success, running `TRANSFORMED_DOCUMENTS_BULK_INDEX` on the next batch (if any remain) or proceeding to `OUTDATED_DOCUMENTS_SEARCH_READ` (if no batches remain).
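A minimal sketch of that flow as a plain loop (the real migration is a state machine, so an actual implementation would express these as state transitions; `bulkIndex` and the error shape here are assumptions):

```typescript
import type { SavedObjectsRawDoc } from 'src/core/server';

type BulkIndexFn = (docs: SavedObjectsRawDoc[]) => Promise<void>;

async function indexWithSplitting(
  transformedDocs: SavedObjectsRawDoc[][],
  bulkIndex: BulkIndexFn
): Promise<void> {
  const batches = [...transformedDocs];
  while (batches.length > 0) {
    const batch = batches[0];
    try {
      // Only attempt to index the first batch in the array.
      await bulkIndex(batch);
      // Success: drop it and move on to the next batch (if any remain).
      batches.shift();
    } catch (e: any) {
      if (e?.statusCode === 413 && batch.length > 1) {
        // Payload too large: split the batch into two halves and retry.
        const mid = Math.ceil(batch.length / 2);
        batches.splice(0, 1, batch.slice(0, mid), batch.slice(mid));
      } else {
        // A single doc is still too large, or an unrelated error occurred.
        throw e;
      }
    }
  }
  // With no batches remaining, the state machine would proceed to
  // OUTDATED_DOCUMENTS_SEARCH_READ.
}
```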