
Snapshot repository registration API call causing "out of memory" errors #10344

Closed
nicktgr15 opened this issue Mar 31, 2015 · 10 comments

@nicktgr15

We are running an elasticsearch cluster with 27 nodes and we create about 200 new indices per day.
We keep data for the last 5 days, so in total we have about 1000 indices. Every day we take a snapshot of yesterday's 200 indices in S3 (so, no incremental backups).
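
For context, the daily snapshot is taken with a call along these lines (a minimal sketch; the index naming pattern and snapshot name here are illustrative, not our exact configuration):

curl -XPUT 'http://localhost:9200/_snapshot/s3_repository/snapshot-2015.03.30?wait_for_completion=false' -d '{
    "indices": "logs-2015.03.30-*",
    "include_global_state": false
}'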

After updating to 1.4.2 we've noticed that when we try to register a snapshots repository we end up with the master node running out of heap space and the cluster going into an unresponsive state. I'm attaching a screenshot where you can see the heap space usage on the master node after making a PUT request to register a snapshots repository.

[screenshot: jmx-master]

After inspecting a heap dump taken from the master node, we realised that it's trying to list the contents of our S3 repository. At the moment we keep all previous snapshots in our S3 repository, which means that it's impossible (in terms of time and resources) to list everything. I'm attaching a screenshot from Memory Analyser where you can see that 51% of the heap space (800 MB) is occupied by a Map storing 5,000,000 entries with PlainBlobMetadata objects as values and S3 locations as keys.

[screenshot: memoryanalyser_tree]

We recently updated to 1.4.2 (from 1.1.2), and we don't think we saw similar behaviour (i.e. listing the S3 repository contents) in 1.1.2. Could this be an issue that needs further investigation on your side, or could the way we are currently using the snapshots service be causing the problem?

Any suggestions/feedback would be welcome.

Thank you

@imotov
Contributor

imotov commented Mar 31, 2015

@nicktgr15 could you post the stack trace for the OOM error?

@tlrx
Member

tlrx commented Mar 31, 2015

we create about 200 new indices per day

How many shards do those 200 indices represent? What's the total size?

@nicktgr15
Author

Thanks for the responses!

@imotov I'm currently reproducing the issue in order to get a full stack trace, which I will attach to the ticket as soon as I have it.

@tlrx There are 7 primary shards and 7 replicas per index, so 200 indices are represented by 2800 shards. The total number of shards (for the 5 days retention policy) is 14000. The total size in bytes for the 200 indices created per day is close to 400GB (including replicas).

@tlrx
Member

tlrx commented Mar 31, 2015

@nicktgr15 thanks for your quick response

We suspect that the verification process failed in your case. Can you please try to register the repository without verification? You have to set verify: false when registering the repo; see the documentation here

@nicktgr15
Author

I'm attaching a screenshot from jConsole showing the heap space utilisation in both cases, with verify: true (default) and with verify: false. For the two cases we got the following output:

verify: true (default)

curl -XPUT 'http://localhost:9200/_snapshot/s3_repository' -d '{
>     "type": "s3",
>     "settings":{"region":"eu-west-1","base_path":"pyxis/test/snapshots","max_restore_bytes_per_sec":"100mb","bucket":"prod-s3-common-storagebucket-sl9vplg1o48d"}
> }'
{"error":"OutOfMemoryError[Java heap space]","status":500}
[2015-03-31 15:43:25,265][INFO ][repositories             ] [Master Menace] update repository [s3_repository]
[2015-03-31 15:53:11,375][WARN ][repositories             ] [Master Menace] [s3_repository] failed to finish repository verification

verify: false

curl -XPUT 'http://localhost:9200/_snapshot/s3_repository' -d '{
>     "type": "s3",
>     "settings":{"region":"eu-west-1","base_path":"pyxis/test/snapshots","max_restore_bytes_per_sec":"100mb","bucket":"prod-s3-common-storagebucket-sl9vplg1o48d", "verify": false}
> }'
{"error":"AmazonClientException[Unable to unmarshall response (Failed to sanitize XML document destined for handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler). Response Code: 200, Response Text: OK]; nested: AmazonClientException[Failed to sanitize XML document destined for handler c
[2015-03-31 15:56:21,785][INFO ][repositories             ] [Master Menace] update repository [s3_repository]
[2015-03-31 16:05:33,932][WARN ][repositories             ] [Master Menace] [s3_repository] failed to finish repository verification

[screenshot: repository-verify]

We reduced the heap space to 300m in order to trigger the out of memory error sooner. As you can see in the first block, there is an "out of memory" error but no stack trace.

This was not reproduced on the live environment, and load conditions were not the same.

@tlrx
Member

tlrx commented Apr 1, 2015

@nicktgr15 thanks for the information. I think I now understand what happened in your cluster.

In Elasticsearch 1.4 we added repository verification: when registering a repository, each node of the cluster tries to write a file in the repository before the repository registration is acknowledged.

The current implementation of the verification process lists all the files at the root level of the repository... and in your case that represents too much data for the Java XML API, so the node hits an OOM.
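
For reference, this same verification can also be triggered manually on an already registered repository (a minimal sketch using the standard verify API and the repository name from your example):

curl -XPOST 'http://localhost:9200/_snapshot/s3_repository/_verify'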

First thing, I've created the pull request #10366 to avoid listing all files when verifying a repository.

Second thing: there is an error in the documentation. To disable repository verification you need to add the URL parameter verify=false when registering the repository. I've created the pull request #10365 to update the doc, but you can already try to register the repository with this parameter.
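
For example, a sketch of the same registration call you posted above, using the URL parameter instead of a setting:

curl -XPUT 'http://localhost:9200/_snapshot/s3_repository?verify=false' -d '{
    "type": "s3",
    "settings": {
        "region": "eu-west-1",
        "base_path": "pyxis/test/snapshots",
        "max_restore_bytes_per_sec": "100mb",
        "bucket": "prod-s3-common-storagebucket-sl9vplg1o48d"
    }
}'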

Last thing: in your case there is a good chance that disabling the repository verification will just defer the OOM. The snapshot process needs to list all the files of the repository to be able to make an incremental backup. The upcoming #8969 will improve the creation and deletion of snapshots in use cases like yours, but you should consider creating fewer indices (200 indices with 7 primary shards and 7 replicas = 2,800 shards, and 400 GB / 2,800 ≈ 142 MB per shard, where a shard could handle GBs of data).

You can also consider creating multiple repositories, for example snapshotting newly created indices into a new repository every day.
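
A minimal sketch of what a per-day repository could look like (the dated repository name and base_path are just an illustration, not a recommendation for your exact layout):

curl -XPUT 'http://localhost:9200/_snapshot/s3_repository_2015_04_01?verify=false' -d '{
    "type": "s3",
    "settings": {
        "region": "eu-west-1",
        "bucket": "prod-s3-common-storagebucket-sl9vplg1o48d",
        "base_path": "pyxis/test/snapshots/2015-04-01"
    }
}'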

@nicktgr15
Author

Thanks for the update @tlrx. As a temporary workaround we will be rotating the snapshots repository on a monthly basis. We haven't had a chance to try again with verify=false as a URL parameter; if we do, I will provide an update.

Regarding your suggestion about the number of shards in the cluster, is there a performance impact on the snapshotting process because of this?

In general, we know that there is a suggested 10 GB upper limit for shard size, and based on what you describe, we should aim for a minimum of 1 GB. Does this sound like the right strategy to optimise the number of shards we currently have in the cluster? At this point we can't reduce the number of indices, as it is a requirement the system must satisfy.

@tlrx
Member

tlrx commented Apr 2, 2015

Regarding your suggestion about the number of shards in the cluster, is there a performance impact on the snapshotting process because of this?

Definitely. The snapshot process iterates over each file of each shard in order to process and upload it. The more shards you have, the more files need to be snapshotted.

Does this sound like the right strategy to optimise the number of shards we currently have in the cluster?

Hard to tell if it's the right strategy without knowing exactly what your use case is. But from what I know of your cluster, I feel you would benefit from creating fewer but larger indices.

At this point we can't reduce the number of indices as it is a requirement the system should satisfy.

I'm curious to understand why the number of indices is a requirement for your system?

@nicktgr15
Author

It was a requirement when the system was designed, as it simplified the way we managed client data (e.g. backups per index, permissions). However, we are going to review our current approach, as the high number of indices seems to be an overhead. Thank you for your help.

@tlrx
Member

tlrx commented Apr 2, 2015

@nicktgr15 ok thanks, that's what I suspected.

You may be interested in the "Designing for Scale" chapter of the book Elasticsearch: The Definitive Guide. It's a great chapter that deals with use cases like yours.
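
For example, one of the patterns described there is sharing one large index between clients using routing and filtered aliases instead of one index per client. A sketch of what that looks like (the index, alias and field names here are hypothetical):

curl -XPOST 'http://localhost:9200/_aliases' -d '{
    "actions": [
        { "add": {
            "index": "shared_logs_2015.03.31",
            "alias": "client_acme",
            "routing": "acme",
            "filter": { "term": { "client_id": "acme" } }
        }}
    ]
}'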

I'm closing this issue since the related pull requests are already underway. Feel free to reopen if needed.

@tlrx tlrx closed this as completed Apr 2, 2015
tlrx added a commit that referenced this issue Apr 7, 2015
The current implementation of AbstractBlobContainer.deleteByPrefix() calls AbstractBlobContainer.deleteBlobsByFilter(), which calls BlobContainer.listBlobs() to delete files, resulting in loading all files in order to delete a few of them. This can be improved by calling BlobContainer.listBlobsByPrefix() directly.

This problem happened in #10344 when the repository verification process tried to delete a blob prefixed by "tests-" to ensure that the repository is accessible for the node. When doing so we have the following call graph: BlobStoreRepository.endVerification() -> BlobContainer.deleteByPrefix() -> AbstractBlobContainer.deleteByPrefix() -> AbstractBlobContainer.deleteBlobsByFilter() -> BlobContainer.listBlobs()... and boom.

Also, AbstractBlobContainer.listBlobsByPrefix() and BlobContainer.deleteBlobsByFilter() can be removed because they have the same drawbacks as AbstractBlobContainer.deleteByPrefix() and also list all blobs. Listing blobs by prefix can be done at the FsBlobContainer level.

Related to #10344
tlrx added a commit to tlrx/elasticsearch that referenced this issue Apr 7, 2015
tlrx added a commit that referenced this issue Apr 7, 2015
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015