
Allow marking an index as cold and moving its in-memory items to disk #23546

Closed

trevan opened this issue Mar 11, 2017 · 15 comments
Assignees
Labels
:Data Management/Indices APIs APIs to create and manage indices and templates >feature resiliency

Comments

@trevan commented Mar 11, 2017

This is similar to #10869, but instead of closing/reopening indices, it would be nice to just have the index flush its in-memory items to disk. The output of /{index}/_segments?verbose=true shows a lot of BlockTreeItems, FST, and postings data held in memory. For a really old index, the ability to reduce its in-memory footprint while leaving it open would be beneficial. The close/reopen process frequently causes the cluster to go red, and if a node goes down we have to make sure that any closed indices on that dead node are reopened so that their replicas get moved to other nodes.

This could be a call like /_cache/clear that we run periodically, so the index itself doesn't have to worry about whether it is "cold". When the call runs, the index moves its in-memory data to disk; the next time a query hits it, it loads the data back into memory. Later, we would manually run the call again to move it back to disk.
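A minimal sketch of the lifecycle being proposed, in Python. This is purely illustrative: `ColdIndexReader`, `demote`, and `search` are hypothetical names, not Elasticsearch or Lucene APIs, and a dict stands in for the terms index/FST/postings structures.

```python
import os
import pickle
import tempfile


class ColdIndexReader:
    """Hypothetical reader that can spill its in-memory structures to disk
    and rebuild them lazily on the next query."""

    def __init__(self, terms):
        # In-memory structure (stands in for terms index / FST / postings).
        self._terms = dict(terms)
        self._spill_path = None

    def demote(self):
        """Move in-memory items to disk, as the proposed call would."""
        if self._terms is None:
            return
        fd, self._spill_path = tempfile.mkstemp(suffix=".spill")
        with os.fdopen(fd, "wb") as f:
            pickle.dump(self._terms, f)
        self._terms = None  # memory released until the next query

    def search(self, term):
        # The next query transparently reloads the structure from disk.
        if self._terms is None and self._spill_path is not None:
            with open(self._spill_path, "rb") as f:
                self._terms = pickle.load(f)
        return self._terms.get(term)


reader = ColdIndexReader({"error": [1, 7], "timeout": [3]})
reader.demote()                 # in-memory items now on disk
print(reader.search("error"))   # [1, 7] -- reloaded lazily on first query
```

The key trade-off the thread goes on to debate is exactly the `search` branch above: the reload is transparent, so any query can pull a cold index's structures back into memory.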

@s1monw (Contributor) commented Mar 12, 2017

While this sounds like an appealing idea, there are quite a few caveats attached to it. Essentially, this request is about opening an index reader lazily, once it is requested. But if you take a closer look, that means we need to load the in-memory data structures whenever the reader is accessed. Let's say you have 2k lazy indices in your cluster and you call _search: you will hit every index in the cluster, which in turn causes the cluster to load all of its lazy readers, likely ending in an out-of-memory (OOM) error. While I agree this would be nice, I am too concerned about a read request killing your cluster to make it a feature.

@s1monw s1monw closed this as completed Mar 12, 2017
@trevan (Author) commented Mar 12, 2017

@s1monw, the problem is that the current way to deal with lots of cold indices is to close/open them. So if I have a custom-built auto-opener, I can still have the issue where lots of indices are opened at the same time and run out of memory. Anyone who would use this feature already has to deal with that case, and you can put caveats around it.

But the open/close mechanism is more likely to cause issues in the long run: it probably takes longer to open an index than to warm up a lazy reader, the cluster goes red while the index is opening (and might not even recover if there are allocation issues), and closed indices aren't recovered automatically when a node goes down.

@s1monw (Contributor) commented Mar 12, 2017

> So if I have a custom-built auto-opener, I can still have the issue where lots of indices are opened at the same time and run out of memory. Anyone who would use this feature already has to deal with that case, and you can put caveats around it.

I disagree with this statement. Let me explain: opening/closing an index is an admin-like operation that isn't executed by accident, doesn't apply wildcards, and is (I hope) executed by a privileged user, so there is more control over it in general. But if we allow this to be triggered by _search, or by index patterns that expand to those lazy indices, it can be done by anybody, and it could hit you by surprise by letting your nodes go OOM.

> But the open/close mechanism is more likely to cause issues in the long run: it probably takes longer to open an index than to warm up a lazy reader, the cluster goes red while the index is opening (and might not even recover if there are allocation issues), and closed indices aren't recovered automatically when a node goes down.

We are currently in the process of designing a better open/close implementation that is also replicated, which might make it easier to allocate these indices and will prevent the cluster from blinking (going red for a while). I think we need a design that is safe and gives you an easier time here, but we are not there yet.

@trevan (Author) commented Mar 13, 2017

@s1monw, I think what you are missing is that those of us who need either this feature or #10869 will implement our own open/close mechanism, one that can be executed by accident and can be applied to wildcards. We need to be able to have lots of data available in ES without requiring a ton of RAM, so the only mechanism today is to close/open the indices, and to do it automatically when searches are made.

In the end, we need a way to reduce the RAM usage for really cold indices but still allow those indices to be easily searchable.

@s1monw (Contributor) commented Mar 15, 2017

> We need to be able to have lots of data available in ES without requiring a ton of RAM. So, the only mechanism is to close/open the indices and to do it automatically when searches are made.

I agree, but we can't just add a risky feature to fix a problem; we need a sustainable solution that deals with it. This suggestion has problems that I am not willing to sign up for as a compromise. If we make compromises here, they need to be safe!

@s1monw (Contributor) commented Mar 15, 2017

I will reopen this to use it as a discussion area for now.

@s1monw s1monw reopened this Mar 15, 2017
@s1monw s1monw self-assigned this Mar 15, 2017
@makeyang (Contributor)

How about adding a circuit breaker for the in-memory data structures when opening an index reader?
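A breaker along those lines would estimate the memory a reader's structures need before loading them, and reject the load instead of letting the node OOM. A minimal sketch, assuming an accounting object shared by all loads; `ReaderBreaker` and its methods are illustrative names, not Elasticsearch's actual circuit breaker API.

```python
class CircuitBreakingException(Exception):
    pass


class ReaderBreaker:
    """Tracks estimated bytes of loaded reader structures against a limit."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.used = 0

    def add_estimate(self, label, bytes_needed):
        # Reject the load up front rather than letting the node go OOM.
        if self.used + bytes_needed > self.limit:
            raise CircuitBreakingException(
                f"loading {label} would use {self.used + bytes_needed} bytes, "
                f"over the {self.limit} byte limit"
            )
        self.used += bytes_needed

    def release(self, bytes_freed):
        # Called when a reader's structures are evicted or demoted again.
        self.used = max(0, self.used - bytes_freed)


breaker = ReaderBreaker(limit_bytes=1_000_000)
breaker.add_estimate("index-a terms", 600_000)      # fits, load proceeds
try:
    breaker.add_estimate("index-b terms", 600_000)  # would exceed the limit
except CircuitBreakingException:
    print("tripped")  # the search fails, but the node survives
```

This turns the OOM scenario from the earlier comment (a _search fanning out to thousands of lazy indices) into a rejected request rather than a dead node, which is the trade-off a breaker buys you.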

@clintongormley clintongormley added the :Data Management/Indices APIs APIs to create and manage indices and templates label Mar 29, 2017
@jpountz (Contributor) commented Apr 25, 2017

If we do this, I'd like to make sure we are addressing an actual problem rather than the symptom of some misconfiguration (e.g. too many shards in a cluster). Also, how much memory are we talking about, and where is it spent? Maybe the right fix is to make Lucene indices more memory-efficient.

@trevan (Author) commented Apr 25, 2017

@jpountz, improving memory usage of Lucene could work for me as well.

This is really a cost issue. It is cheap to add more disk for cold-storage indices (we have them on iSCSI, which is slow, but these indices are used once a day at most). Adding more RAM, though, costs quite a bit.

https://discuss.elastic.co/t/why-is-my-heap-usage-always-high/45017/9 is my Discuss question asking whether there is something I can do in my configuration. The last reply from Mark Walkom is that I just need to add new nodes, which increases our overhead.

@jpountz (Contributor) commented Apr 25, 2017

To me the issue is that there are 17745 shards for only 5.6 billion docs. I'd recommend managing indices with the rollover and shrink APIs rather than using daily indices, in order to better utilize resources.

@trevan (Author) commented Apr 25, 2017

@jpountz, you need to read my last comment there. We are now doing rollover based on size instead of daily indices. As of today, we have 7300 shards with 55 billion docs, taking up 180TB of disk.

@jpountz (Contributor) commented Apr 26, 2017

180TB across 7300 shards means that shards are only about 25GB on average; I think you could still aim for larger shards.

That said, I agree this is the kind of scale that makes keeping these indices open quite costly. I don't have good ideas on how to improve it, but I don't think opening indices on demand is a good solution.
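For reference, the ~25GB average works out as follows (using binary units, 1 TB ≈ 1024 GB):

```python
total_tb = 180
shards = 7300

# Average shard size in GB.
avg_gb = total_tb * 1024 / shards
print(round(avg_gb, 1))  # 25.2
```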

@trevan (Author) commented Apr 26, 2017

@jpountz, yeah, we've been working on that. All of our most recent indices should average around 40GB per shard; it just takes a while to combine the older indices.

@trevan (Author) commented Jul 18, 2017

@s1monw, is the better open/close mechanism being designed for 5.x or 6.x? We are trying to handle opening/closing indices, rebalancing, and node removal better, and that seems to be the only way to accomplish this, unless there has been more discussion about it?

@s1monw (Contributor) commented Nov 9, 2018

Superseded by #34352 and #33888.

@s1monw s1monw closed this as completed Nov 9, 2018
7 participants