
Give the engine the whole index buffer size on init. #31105

Merged · 4 commits · Jun 6, 2018

Conversation

5 participants
@jpountz (Contributor) commented on Jun 5, 2018

Currently the engine is initialized with a hardcoded 256 MB RAM buffer. Elasticsearch
may never use more than that for a given shard; `IndexingMemoryController` only
has the power to flush segments to disk earlier when multiple shards are
actively indexing and collectively use too much memory.

While this amount of memory is enough for an index with few fields, and larger
RAM buffers are not expected to improve indexing speed in that case, it may be
too little for an index that has many fields.

Kudos to @bleskes for tracking this down while looking into a user who was reporting
**much** slower indexing speed after upgrading from 2.x to 5.6 with an index
that has about 20,000 fields.
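The idea can be sketched roughly as follows. This is an illustration only, with hypothetical class and method names, not the actual Elasticsearch code touched by this PR: instead of starting every engine with a fixed 256 MB buffer, the engine is handed the whole node-level index buffer size (which defaults to 10% of the heap), while `IndexingMemoryController` keeps its existing job of flushing early under pressure.

```java
// Illustrative sketch only; names are hypothetical, not the real
// Elasticsearch classes changed by this PR.
public class EngineBufferSketch {
    // Before this change: every engine started with a fixed 256 MB buffer.
    static final long LEGACY_BUFFER_BYTES = 256L * 1024 * 1024;

    // After: the engine is initialized with the whole per-node index
    // buffer size. IndexingMemoryController still flushes segments early
    // when multiple active shards collectively use too much memory.
    static long initialBufferBytes(long nodeIndexBufferBytes) {
        return nodeIndexBufferBytes; // no artificial 256 MB cap
    }

    public static void main(String[] args) {
        // e.g. a node with an 8 GB heap and the default 10% index buffer
        long nodeBuffer = 8L * 1024 * 1024 * 1024 / 10;
        System.out.println(initialBufferBytes(nodeBuffer) > LEGACY_BUFFER_BYTES);
    }
}
```

With many indexed fields, each field consumes per-field indexing state inside the buffer, which is why a fixed 256 MB ceiling hurt the 20,000-field index described above.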

@elasticmachine (Collaborator) commented on Jun 5, 2018 (comment minimized, body not shown)

@jpountz (Contributor, Author) commented on Jun 5, 2018

This change would probably be worth backporting to 5.6, though we should make it opt-in via a setting or a JVM command-line argument, since it might otherwise have a significant impact on runtime behavior.

@jpountz (Contributor, Author) commented on Jun 5, 2018

For the record this was introduced in 5.0 via #14121.

@jpountz jpountz requested a review from s1monw Jun 5, 2018

@s1monw (Contributor) commented on Jun 5, 2018

@jpountz I am +1 in general, but I am not sure we should give it all we have. I think having an upper bound is good, i.e. let's cap it at 1 GB at most?

@jpountz (Contributor, Author) commented on Jun 5, 2018

I like having protection too, but the fact that it doesn't really help anymore once several shards are actively indexing per node makes me wonder whether it is useful at all.

@s1monw (Contributor) commented on Jun 5, 2018

> I like having protection too, but the fact that it doesn't really help anymore once several shards are actively indexing per node makes me wonder whether it is useful at all.

I mean, I agree it won't help much. I am still hesitant to open it up all the way without an escape hatch. Can we have a sys property to set it explicitly if shit hits the fan?

@jpountz (Contributor, Author) commented on Jun 5, 2018

I added an escape hatch.
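The escape hatch can be pictured as a JVM system property that restores a hard cap on the buffer. A minimal sketch; the property name `es.index.memory.max_buffer.mb` below is made up for illustration and is not necessarily what the PR uses:

```java
// Illustrative escape-hatch sketch; the system property name is
// hypothetical, not the one introduced by this PR.
public class BufferEscapeHatch {
    static final String PROP = "es.index.memory.max_buffer.mb";

    // Returns the whole index buffer size unless the operator set a cap,
    // e.g. -Des.index.memory.max_buffer.mb=256 on the command line.
    static long effectiveBufferBytes(long wholeBufferBytes) {
        String override = System.getProperty(PROP);
        if (override == null) {
            return wholeBufferBytes;
        }
        long capBytes = Long.parseLong(override.trim()) * 1024 * 1024;
        return Math.min(wholeBufferBytes, capBytes);
    }

    public static void main(String[] args) {
        long oneGb = 1024L * 1024 * 1024;
        System.out.println(effectiveBufferBytes(oneGb)); // no cap set: 1073741824
        System.setProperty(PROP, "256");
        System.out.println(effectiveBufferBytes(oneGb)); // capped: 268435456
    }
}
```

A system property (rather than a dynamic cluster setting) suits an escape hatch here because the buffer size is fixed at engine init and an operator can set it per node on the command line without a rolling config change.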

@s1monw approved these changes on Jun 5, 2018

@bleskes (Member) commented on Jun 6, 2018

Lgtm2. Thx @jpountz

jpountz added commits to jpountz/elasticsearch and to elastic/elasticsearch that referenced this pull request on Jun 6, 2018

@jpountz jpountz merged commit e9fe371 into elastic:master Jun 6, 2018

3 checks passed:

- CLA: Commit author is a member of Elasticsearch
- elasticsearch-ci: Build finished.
- elasticsearch-ci/packaging-sample: Build finished.

@jpountz jpountz deleted the jpountz:fix/iw_memory_buffer branch Jun 6, 2018

jpountz added a commit that referenced this pull request Jun 6, 2018

Give the engine the whole index buffer size on init. (#31105)

@jpountz jpountz removed the v5.6.10 label Jun 13, 2018

@colings86 colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
