
Snapshot repository registration API call causing "out of memory" errors #10344

Closed
nicktgr15 opened this issue Mar 31, 2015 · 10 comments

@nicktgr15

We are running an elasticsearch cluster with 27 nodes and we create about 200 new indices per day.
We keep data for the last 5 days, so in total we have about 1000 indices. Every day we take a snapshot of yesterday's 200 indices in S3 (so, no incremental backups).
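
For context, the daily snapshot is taken with a call along these lines (a minimal sketch; the index naming pattern and snapshot name here are illustrative, not our exact configuration):

curl -XPUT 'http://localhost:9200/_snapshot/s3_repository/snapshot-2015.03.30?wait_for_completion=false' -d '{
    "indices": "logs-2015.03.30-*",
    "include_global_state": false
}'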

After updating to 1.4.2 we've noticed that when we try to register a snapshots repository we end up with the master node running out of heap space and the cluster going into an unresponsive state. I'm attaching a screenshot where you can see the heap space usage on the master node after making a PUT request to register a snapshots repository.

[screenshot: jmx-master]

After inspecting a heap dump taken from the master node, we realised that it's trying to list the contents of our S3 repository. At the moment we keep all previous snapshots in our S3 repository, which means that it's impossible (in terms of time and resources) to list everything. I'm attaching a screenshot from Memory Analyser where you can see that 51% of the heap space (800 MB) is occupied by a Map storing 5,000,000 entries with PlainBlobMetadata objects as values and S3 locations as keys.

[screenshot: memoryanalyser_tree]

We recently updated to 1.4.2 (from 1.1.2), and we don't think we saw similar behaviour (i.e. listing the S3 repository contents) in 1.1.2. Could this be an issue that needs further investigation on your side, or could the way we are currently using the snapshots service be causing the problem?

Any suggestions/feedback would be welcome.

Thank you

@imotov
Contributor

imotov commented Mar 31, 2015

@nicktgr15 could you post the stack trace for the OOM error?

@tlrx
Member

tlrx commented Mar 31, 2015

we create about 200 new indices per day

How many shards do those 200 indices represent? What's the total size?

@nicktgr15
Author

Thanks for the responses!

@imotov I'm currently reproducing the issue in order to get a full stack trace, which I will attach to the ticket as soon as I have it.

@tlrx There are 7 primary shards and 7 replicas per index, so 200 indices are represented by 2800 shards. The total number of shards (for the 5 days retention policy) is 14000. The total size in bytes for the 200 indices created per day is close to 400GB (including replicas).

@tlrx
Member

tlrx commented Mar 31, 2015

@nicktgr15 thanks for your quick response

We suspect that the verification process failed in your case. Can you please try to register the repository without verification? You have to set verify: false when registering the repo; see the documentation here

@nicktgr15
Author

I'm attaching a screenshot from jConsole showing the heap space utilisation in both cases, with verify: true (default) and with verify: false. For the two cases we got the following output:

verify: true (default)

curl -XPUT 'http://localhost:9200/_snapshot/s3_repository' -d '{
>     "type": "s3",
>     "settings":{"region":"eu-west-1","base_path":"pyxis/test/snapshots","max_restore_bytes_per_sec":"100mb","bucket":"prod-s3-common-storagebucket-sl9vplg1o48d"}
> }'
{"error":"OutOfMemoryError[Java heap space]","status":500}
[2015-03-31 15:43:25,265][INFO ][repositories             ] [Master Menace] update repository [s3_repository]
[2015-03-31 15:53:11,375][WARN ][repositories             ] [Master Menace] [s3_repository] failed to finish repository verification

verify: false

curl -XPUT 'http://localhost:9200/_snapshot/s3_repository' -d '{
>     "type": "s3",
>     "settings":{"region":"eu-west-1","base_path":"pyxis/test/snapshots","max_restore_bytes_per_sec":"100mb","bucket":"prod-s3-common-storagebucket-sl9vplg1o48d", "verify": false}
> }'
{"error":"AmazonClientException[Unable to unmarshall response (Failed to sanitize XML document destined for handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler). Response Code: 200, Response Text: OK]; nested: AmazonClientException[Failed to sanitize XML document destined for handler c
[2015-03-31 15:56:21,785][INFO ][repositories             ] [Master Menace] update repository [s3_repository]
[2015-03-31 16:05:33,932][WARN ][repositories             ] [Master Menace] [s3_repository] failed to finish repository verification

[screenshot: repository-verify]

We reduced the heap space to 300m in order to trigger the out of memory error sooner. As you can see in the first block, there is an "out of memory" error but no stack trace.

This was not reproduced on the live environment, and load conditions were not the same.

@tlrx
Member

tlrx commented Apr 1, 2015

@nicktgr15 thanks for the information. I think I now understand what happened in your cluster.

In Elasticsearch 1.4 we added repository verification: when registering a repository, each node of the cluster tries to write a file in the repository before the repository registration is acknowledged.

The current implementation of the verification process lists all the files at the root level of the repository... and in your case that represents too much data for the Java XML API, so the node hits an OOM.
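
For reference, this same verification can also be triggered manually on an already registered repository (a minimal sketch using the standard verify API and the repository name from your example):

curl -XPOST 'http://localhost:9200/_snapshot/s3_repository/_verify'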

First thing, I've created the pull request #10366 to avoid listing all files when verifying a repository.

Second thing: there is an error in the documentation. To disable repository verification you need to add the URL parameter verify=false when registering the repository. I've created the pull request #10365 to update the doc, but you can already try to register the repository with this parameter.
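
For example, a sketch of the same registration call you posted above, using the URL parameter instead of a setting:

curl -XPUT 'http://localhost:9200/_snapshot/s3_repository?verify=false' -d '{
    "type": "s3",
    "settings": {
        "region": "eu-west-1",
        "base_path": "pyxis/test/snapshots",
        "max_restore_bytes_per_sec": "100mb",
        "bucket": "prod-s3-common-storagebucket-sl9vplg1o48d"
    }
}'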

Last thing: in your case there is a good chance that disabling the repository verification will just defer the OOM. The snapshot process needs to list all the files of the repository to be able to make an incremental backup. The upcoming #8969 will improve the creation and deletion of snapshots in use cases like yours, but you should consider creating fewer indices (200 indices with 7 primary shards and 7 replicas = 2,800 shards, and 400 GB / 2,800 ≈ 142 MB per shard, where a shard could handle GBs of data).

You can also consider creating multiple repositories, for example snapshotting newly created indices into a new repository every day.
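
A minimal sketch of what a per-day repository could look like (the dated repository name and base_path are just an illustration, not a recommendation for your exact layout):

curl -XPUT 'http://localhost:9200/_snapshot/s3_repository_2015_04_01?verify=false' -d '{
    "type": "s3",
    "settings": {
        "region": "eu-west-1",
        "bucket": "prod-s3-common-storagebucket-sl9vplg1o48d",
        "base_path": "pyxis/test/snapshots/2015-04-01"
    }
}'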

@nicktgr15
Author

Thanks for the update @tlrx. As a temporary workaround we will be rotating the snapshots repository on a monthly basis. We haven't had a chance to try again with verify=false as a URL parameter; if we do, I will provide an update.

Regarding your suggestion about the number of shards in the cluster, is there a performance impact on the snapshotting process because of this?

In general, we know that there is a suggested 10 GB upper limit for shard size, and based on what you describe, we should aim for a minimum of 1 GB. Does this sound like the right strategy to optimise the number of shards we currently have in the cluster? At this point we can't reduce the number of indices, as it is a requirement the system must satisfy.

@tlrx
Member

tlrx commented Apr 2, 2015

Regarding your suggestion about the number of shards in the cluster, is there a performance impact on the snapshotting process because of this?

Definitely. The snapshot process iterates over each file of each shard in order to process and upload it. The more shards you have, the more files need to be snapshotted.

Does this sound like the right strategy to optimise the number of shards we currently have in the cluster?

Hard to tell if it's the right strategy without knowing exactly what your use case is. But from what I know of your cluster, I feel you would benefit from creating fewer but larger indices.

At this point we can't reduce the number of indices as it is a requirement the system should satisfy.

I'm curious to understand why the number of indices is a requirement for your system?

@nicktgr15
Author

It was a requirement when the system was designed, as it simplified the way we managed client data (e.g. backups per index, permissions). However, we are going to review our current approach, as the high number of indices seems to be an overhead. Thank you for your help.

@tlrx
Member

tlrx commented Apr 2, 2015

@nicktgr15 ok thanks, that's what I suspected.

You may be interested in the "Designing for Scale" chapter of the book Elasticsearch: The Definitive Guide. It's a great chapter that deals with use cases like yours.
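
For example, one of the patterns described there is sharing one large index between clients using routing and filtered aliases instead of one index per client. A sketch of what that looks like (the index, alias and field names here are hypothetical):

curl -XPOST 'http://localhost:9200/_aliases' -d '{
    "actions": [
        { "add": {
            "index": "shared_logs_2015.03.31",
            "alias": "client_acme",
            "routing": "acme",
            "filter": { "term": { "client_id": "acme" } }
        }}
    ]
}'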

I'm closing this issue since the related pull requests are already underway. Feel free to reopen if needed.

@tlrx tlrx closed this as completed Apr 2, 2015
tlrx added a commit that referenced this issue Apr 7, 2015
The current implementation of AbstractBlobContainer.deleteByPrefix() calls AbstractBlobContainer.deleteBlobsByFilter(), which calls BlobContainer.listBlobs() to delete files, resulting in loading all files in order to delete a few of them. This can be improved by calling BlobContainer.listBlobsByPrefix() directly.

This problem happened in #10344 when the repository verification process tried to delete a blob prefixed by "tests-" to ensure that the repository is accessible for the node. When doing so we have the following call graph: BlobStoreRepository.endVerification() -> BlobContainer.deleteByPrefix() -> AbstractBlobContainer.deleteByPrefix() -> AbstractBlobContainer.deleteBlobsByFilter() -> BlobContainer.listBlobs()... and boom.

Also, AbstractBlobContainer.listBlobsByPrefix() and BlobContainer.deleteBlobsByFilter() can be removed because they have the same drawbacks as AbstractBlobContainer.deleteByPrefix() and also list all blobs. Listing blobs by prefix can be done at the FsBlobContainer level.

Related to #10344
tlrx added a commit to tlrx/elasticsearch that referenced this issue Apr 7, 2015
tlrx added a commit that referenced this issue Apr 7, 2015
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015