Lazy free of keys and databases #1748

Closed
antirez opened this Issue May 13, 2014 · 18 comments

antirez commented May 13, 2014

This issue is very long, so the gentle reader deserves a TL;DR:

DEL and FLUSHALL / FLUSHDB are blocking commands, slow when called with large values / DBs. This issue proposes a way to free objects incrementally in the background.

Status and alternate names

This issue is a work in progress proposal of a feature that would allow Redis to free keys and entire databases in an incremental way. Currently it is not meant to be turned into an implementation, but to serve as a starting point for the design.

The feature's alternate names (to make this issue easier to search) are:

  • Asynchronous or incremental deletion of objects.
  • Lazy freeing.
  • Non blocking delete.

The problem

When the DEL command (and other commands performing an implicit deletion of the previous value stored at a key, like SET) is called against a very big aggregate value, such as a big Hash, Set, Sorted Set, or List, Redis will try to reclaim the memory used by the key and the associated value.

For large objects, this takes a non-trivial amount of time. For reference, freeing a Redis List containing 1 million "foobar" items takes 200 milliseconds (0.2 seconds).

The same problem happens in a more visible way with the FLUSHALL and FLUSHDB commands, since they are required to walk the entire key space in order to release all the keys and associated values. Freeing a key, or an entire database, takes time proportional to the number of dynamic allocations we need to free(), so it is roughly proportional to the total number of objects we have (a key is an object, every element in an aggregate data type is an object, and so forth). It is easy for a FLUSHALL of a multi-gigabyte data set to take many seconds to execute.

Solutions

Fortunately overcoming this limitation is, from an implementation standpoint, not particularly hard. The simplest thing to do is to not actually free values (or entire dictionaries of values in the case of FLUSH), but to queue them into a free list. The free list is then processed incrementally in the serverCron() function like the other background tasks in Redis, or a special timer handler can be set up specifically for lazy freeing of objects, so that it can be called at a higher frequency than the Redis hz setting.

At every call, the lazy free function should try to perform a given number of deallocations.

The deallocation should be more aggressive if the maxmemory setting is enabled and we are near the memory limit. Probably a blocking free should be performed once we hit the limit.
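
A minimal sketch of the free-list idea in C follows. This is illustrative only, not the actual Redis code: the names lazyfreeEnqueue, lazyfreeCron and the surrounding types are assumptions.

#include <stdlib.h>

/* One entry per value whose release has been deferred. */
typedef struct lazyfree_obj {
    void *ptr;                     /* allocation to release */
    void (*free_fn)(void *ptr);    /* type-specific destructor */
    struct lazyfree_obj *next;
} lazyfree_obj;

static lazyfree_obj *lazyfree_head = NULL;
static size_t lazyfree_pending = 0;    /* objects still waiting to be freed */

/* Instead of freeing a big value right away, queue it. */
void lazyfreeEnqueue(void *ptr, void (*free_fn)(void *)) {
    lazyfree_obj *node = malloc(sizeof(*node));
    node->ptr = ptr;
    node->free_fn = free_fn;
    node->next = lazyfree_head;
    lazyfree_head = node;
    lazyfree_pending++;
}

/* Called from serverCron() or from a dedicated higher-frequency timer:
 * release at most 'budget' queued objects per invocation. */
void lazyfreeCron(size_t budget) {
    while (lazyfree_head && budget--) {
        lazyfree_obj *node = lazyfree_head;
        lazyfree_head = node->next;
        node->free_fn(node->ptr);
        free(node);
        lazyfree_pending--;
    }
}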

API

Having this as the default semantics of DEL, FLUSHALL and FLUSHDB is in my opinion dangerous, since the semantics of reclaiming memory asynchronously is notoriously more complex than freeing resources as soon as possible, and for most Redis users the default semantics is not problematic at all.

So the proposal is to add the following commands and command options:

  • LAZYDEL <key> ... <key>
  • FLUSHALL LAZY
  • FLUSHDB <db_id> LAZY

Those commands are equivalent to DEL, FLUSHALL, and FLUSHDB, with the sole difference of how the memory is reclaimed: incrementally versus in the usual blocking way.

LAZYDEL should probably be optimized to recognize when the object the user wants to release is actually small, so that it is better to just release it directly, as DEL would do. This way the user can switch an entire application to LAZYDEL without having to check too carefully when it is a good idea to use it and when it is not.
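
To illustrate the small-object fast path, here is a hedged sketch building on the hypothetical lazyfreeEnqueue() above; the threshold value and the idea of passing an allocation count are assumptions, not the real implementation.

/* Threshold is an assumption for illustration, not a real Redis constant. */
#define LAZYDEL_DIRECT_FREE_THRESHOLD 64

void lazyfreeEnqueue(void *ptr, void (*free_fn)(void *)); /* from the sketch above */

/* Free small values synchronously (as DEL would), defer only big ones. */
void lazyDelValue(void *val, size_t allocation_count, void (*free_fn)(void *)) {
    if (allocation_count < LAZYDEL_DIRECT_FREE_THRESHOLD) {
        free_fn(val);                  /* few allocations: just free now */
    } else {
        lazyfreeEnqueue(val, free_fn); /* big aggregate: defer to background */
    }
}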

Lazy freeing of expired / evicted keys

The same reasoning could be extended to keys expired / evicted from the database when they reach their time to live, or because the maxmemory limit is reached. However this should probably be addressed, at least for eviction due to maxmemory, in a different issue, since it requires logic to understand how viable it is for us to not reclaim memory ASAP when we reach the memory limit.

More design effort and work is needed in order to turn this specification into an actual plan that can be turned into code. This feature is scheduled for Redis 3.2.

antirez added this to the Redis >= 3.2 milestone May 13, 2014

bobrik commented May 13, 2014

Do you think it is a bad idea to make freeing behavior configurable? I tend to think that making all flush/delete operations lazy via a config option could be a good idea.

Anyway, even with LAZYDEL it seems possible to do rename-command LAZYDEL DEL to achieve the same behavior.

xiaost commented May 15, 2014

Cool 👍

But what about expiring a key which contains millions of items?
Can it be lazy?

antirez commented May 15, 2014

@bobrik I think it is better if there is a semantic distinction in the client using Redis, so IMHO it is better that the client states what it means by using the right command. Lazy free is not a "better free", it is just a semantically different way to reclaim memory, which is non blocking (good!) and less predictable (bad!).

antirez commented May 15, 2014

@xiaost Good point! Adding this to the proposal.

mattsta commented May 15, 2014

Some thoughts on edge cases and operation order:

  • Would lazy delete immediately remove the entire keyspace then free data in the background?
    • For example, if an instance has a million keys, what does this return?
FLUSHALL LAZY
KEYS *
  • If we run a pipeline of SET a b; FLUSHALL LAZY; GET a, does a return nil?
  • If we run a pipeline of SET a b; FLUSHALL LAZY; SET a z, are we guaranteed the second SET will continue to exist?
  • I guess for FLUSH{ALL,DB} LAZY we could (see the sketch after this list):
    • immediately create new, empty, DBs
    • move existing DBs into non-user-accessible storage
      • then, cron can iterate over any DBs in the non-user-selectable DB space to free data.
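
A rough sketch of that swap-and-free-later idea, using invented stand-in types (this is not the real Redis dict/db code):

#include <stdlib.h>

typedef struct keyspace {
    /* stand-in for the per-DB hash table of key -> value */
    size_t used;
} keyspace;

#define MAX_PARKED 16
static keyspace *parked[MAX_PARKED];   /* "non-user-accessible" storage */
static int parked_count = 0;

/* FLUSHDB LAZY: clients immediately see an empty DB (KEYS * returns nothing,
 * a following SET goes into the fresh table), while the old table is parked
 * and drained incrementally by the cron cycle. */
void flushDbLazy(keyspace **db) {
    if (parked_count < MAX_PARKED) parked[parked_count++] = *db;
    *db = calloc(1, sizeof(keyspace));
}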

peterbourgon commented May 15, 2014

For example, if an instance has a million keys, what does this return? FLUSHALL LAZY; KEYS *

I hope it will immediately return zero keys...

antirez commented May 15, 2014

@mattsta semantically the behavior of commands is identical to non-lazy freeing. What changes is not the behavior of commands, but the behavior of memory release, which is postponed to the future.

So suppose today you have the following workload:

  • Add millions of keys as fast as possible.
  • FLUSHALL
  • Add millions of keys as fast as possible again (the new version of something or alike).
  • FLUSHALL

If you know that you have, for example, 10 GB of memory, and that is enough for your mass-add operation, the later FLUSHALL will reclaim the memory, and in the next iteration you can mass-add again.

If you turn the FLUSHALL into its lazy freeing version, this is no longer true, and as you start adding the new set of keys, you may still have most of the previous memory occupied. This is just an example of how the semantics of memory release becomes more complex, and why I don't think it is a good idea to make this the default behavior of those commands.

josiahcarlson commented Aug 6, 2014

I can describe several different methods for lazy data deletion that offer user-tunable overhead for deletion, deletion whose cost is effectively hidden from execution time but is able to keep up even during the scenario that @antirez described, or a combination of the two.

The implementation is not simple, but it's also not complex. Let me know if you are curious/interested.

antirez commented Jan 31, 2015

@josiahcarlson what you describe is basically no different from what you get with expires + maxmemory: the eviction is proportionally more aggressive as we get near the memory limit. However there is a difference: during expires we have a clear understanding of how much memory we are over. With lazy freeing there would be no such help, so we need to resort to something else, something that can be tracked very quickly. One metric could be: the number of objects pending to be freed. Bottom line: to get adaptive, you need to understand how bad the condition is at a given moment in order to adapt.

antirez commented Jan 31, 2015

Re-reading the thread, there is more to say if the argument is: let's make lazy freeing the default. Without lazy freeing, it is provable that peak memory is smaller in all cases, and in certain cases peak memory can be much smaller with blocking free. Since we know how desirable that is, given that RSS is easy to raise and hard to lower, there are many workloads where you want to just free ASAP and be sure that memory was actually reclaimed. Most of the time it is the right thing to do.

Moreover it is worth noting that blocking free is more efficient even without accounting for latency: putting the object into the list and processing it incrementally, to later free it, is more work compared to just freeing it ASAP. However, what we can do is make LAZYFREE itself smart enough that whoever really wants this kind of behavior can feel comfortable using it by default, which means:

  1. It should be smart enough to delete the object in a blocking way when doing so is actually faster than, or as fast as, putting it into a free list. This covers 99% of cases with normal objects composed of a few allocations.
  2. The freeing performed in the background should be done in a way that avoids pathological conditions as much as possible. This may be obtained in multiple ways. For instance we can do some work in the Redis serverCron() timer, but we should also free a few objects for every event loop cycle, like expired keys eviction does. This way we already have some adaptive quality built in, which scales with the number of queries we are receiving. This is good since, to fill the DB, users need to execute commands. If every command execution consumes a few objects from the lazy list, it will be hard to accumulate too many objects.
  3. We should track some metric, for example the number of objects to free. This is a good metric since the work to do is proportional to the number of allocations, and not to the size of each allocation (not entirely true, but it's a good approximation: unmapping a large portion of memory may be costly). This way we can adapt the work we do both in the timer and for every command.

The function to lazily free objects may closely resemble the one used to expire keys: it may be called with an argument that tells whether we are in a fast path or a slow path, so that the function adapts to the circumstances.
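
As a hedged illustration of the fast-path / slow-path split and the pending-objects metric, here is a sketch reusing the hypothetical lazyfreeCron() and lazyfree_pending from the earlier example; the names and numbers are assumptions, not the actual Redis implementation.

#include <stddef.h>

extern size_t lazyfree_pending;          /* pending allocations, from above */
extern void lazyfreeCron(size_t budget); /* frees up to 'budget' objects */

#define LAZYFREE_CYCLE_FAST 0   /* invoked from every event loop cycle */
#define LAZYFREE_CYCLE_SLOW 1   /* invoked from the serverCron() timer */

void lazyFreeCycle(int type) {
    /* Small base effort on the fast path so commands never pay noticeable
     * latency; a larger budget on the periodic slow path. */
    size_t budget = (type == LAZYFREE_CYCLE_FAST) ? 16 : 256;

    /* Adapt to pressure: the more objects are pending, the harder we work,
     * since the cost is roughly proportional to the number of allocations. */
    if (lazyfree_pending > 1000000) budget *= 8;
    else if (lazyfree_pending > 100000) budget *= 4;

    lazyfreeCron(budget);
}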

antirez commented Jan 31, 2015

Yet another note: when we delete a key, it is trivial to track the number of objects that are pending to be freed. We can just, for instance, add the cardinality of the deleted object to a counter. However with FLUSHALL LAZY things are a bit more complex: we can't do that in this specific case, which adds some complexity to the above reasoning. Sorry for the spam, but better to track these ideas here instead of re-thinking everything every time. 🔁

antirez commented Feb 1, 2015

Ok, I think I have a solution for the problem of FLUSHALL LAZY / FLUSHDB LAZY databases not being accounted for. The database structures should be moved into a list of databases to free. In the lazy-freeing cycle, we then do two things instead of just freeing objects:

  1. If the list of databases to free is non-empty, we iterate over the DBs, a few tens of keys per iteration, to move objects from the database to the free list, updating the count of objects to free as we add new things into the free list.
  2. As usual, for every iteration we actually free a few objects.

This way, while freeing a database does not immediately change the perceived pressure on the system, the count of objects to free will rapidly be updated, especially if step 1 is designed to do more work than step 2 under low pressure.
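
A toy sketch of that two-step cycle follows; the types and names are invented stand-ins, only meant to show the counter update and the step-1 / step-2 interleaving.

#include <stdlib.h>

typedef struct toy_value { size_t cardinality; void *payload; } toy_value;
typedef struct toy_db    { toy_value **values; size_t count; } toy_db;

static toy_db *dbs_to_free[16];          /* databases parked by FLUSH* LAZY */
static int dbs_to_free_count = 0;

static toy_value *free_list[1024];       /* objects queued for release */
static size_t free_list_len = 0;
static size_t pending_allocations = 0;   /* the pressure metric */

void lazyFreeStep(void) {
    /* Step 1: drain a few tens of keys from parked DBs into the free list,
     * so the pending counter quickly reflects the flushed database. */
    int to_move = 30;
    while (dbs_to_free_count > 0 && to_move > 0) {
        toy_db *db = dbs_to_free[dbs_to_free_count-1];
        if (db->count == 0) {            /* fully drained: discard the DB */
            free(db->values);
            free(db);
            dbs_to_free_count--;
            continue;
        }
        toy_value *v = db->values[--db->count];
        if (free_list_len < 1024) {
            free_list[free_list_len++] = v;
            pending_allocations += v->cardinality;
        } else {
            free(v->payload);            /* list full: just free directly */
            free(v);
        }
        to_move--;
    }

    /* Step 2: as usual, actually free a few queued objects. */
    int to_free = 16;
    while (free_list_len > 0 && to_free-- > 0) {
        toy_value *v = free_list[--free_list_len];
        pending_allocations -= v->cardinality;
        free(v->payload);
        free(v);
    }
}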

antirez commented Sep 26, 2015

I blogged about the implementation I wrote for this issue here: http://antirez.com/news/93

badboy commented Sep 26, 2015

And in case anyone is looking for it, the implementation currently lives in the lazyfree branch.

nnog commented Sep 28, 2015

I realise this is a little late... but why not keep the command semantics simple and just enable lazy freeing for TTL expirations? So, instead of adding LAZY* commands/options, keep the original blocking semantics (with the associated maxmemory guarantees) for DEL/FLUSHALL; then, if the client wants the lazy (more dangerous) non-blocking speedup, they can just do EXPIRE mykey 0. TTL deletion already has weak guarantees about the key actually being deleted at the specified time, so nobody's usage assumptions could be broken. Then you'd only need to add one command, EXPIREALL (with EXPIREALL 0 == FLUSHALL LAZY), to give the full range of functionality described here. Seems a lot simpler to me.

Of course, TTL expiry could still do a blocking delete for small keys where appending to the freelist is more expensive than just freeing immediately. And you could possibly implement a special-case optimisation of the EXPIRE command with a 0 argument if the TTL machinery has overhead. These are just implementation optimisations, though.

antirez commented Sep 28, 2015

I think having UNLINK is a lot less convoluted, and I also think it's a good idea to instead have options controlling whether expiration (TTL) and eviction (LRU) are handled in a lazy way or not. It's a matter of design choices, so there is not always an objectively superior way of doing things. Btw the implementation you can find in the lazyfree branch works very well and is not dangerous at all. It is almost impossible to allocate memory at the same rate at which lazyfree is able to reclaim it.

antirez commented Oct 15, 2015

Implemented into unstable. Will be part of Redis 3.4.

antirez closed this Oct 15, 2015

andyfeller commented Nov 15, 2016

Any thoughts about documenting the new settings below within redis.conf? Appreciate the work you've put into documenting this as it's really helpful!

lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
slave-lazy-flush no