Module events for key modifications #12335

Open
guybe7 opened this issue Jun 21, 2023 · 4 comments

@guybe7
Collaborator

guybe7 commented Jun 21, 2023

Intro

Many modules need reliable events for keyspace changes.
Keyspace notifications (KSN) fall short here, mainly because they are fired after the action has already taken place, so by the time the module receives the notification, some information is already gone (for example, the “del” KSN is sent after the key is no longer in the dataset).

This design doc describes a generic insert/update/delete approach, but it might not fit complex non-JSON datatypes (streams and maybe future module types).
We define three basic events: Insert, Update, and Delete

Insert

  1. Whenever a new key is added to the DB
  2. Whenever a new element is inserted into a collection

Whole keys

An event of REDISMODULE_KEY_INSERT will be fired just after the insertion (so the new key is accessible from the module CB)
Additional information: keyname

Elements in collections

An event of REDISMODULE_ELEMENT_INSERT will be fired just after the insertion (so the new element is accessible from the module CB)
Additional information: keyname, element name/path

Update

Whenever an existing key is modified (excluding removal/insertion of elements, which the Insert and Delete sections cover)

Strings

An event of REDISMODULE_KEY_PRE_UPDATE will be fired, just before the update
Additional information: keyname
An event of REDISMODULE_KEY_POST_UPDATE will be fired, just after the update
Additional information: keyname

Elements in array-like collections

Irrelevant (array-like collections’ elements can only be removed/added, but not modified)

Elements in map-like collections

An event of REDISMODULE_ELEMENT_PRE_UPDATE will be fired, just before the update
Additional information: keyname, element name/path
An event of REDISMODULE_ELEMENT_POST_UPDATE will be fired, just after the update
Additional information: keyname, element name/path

Delete

  1. Whenever a key is deleted from the DB (overwrites don’t count as deletes)
  2. Whenever an element is deleted from a collection

Whole keys

An event of REDISMODULE_KEY_DELETE will be fired just before the deletion (so the key is still accessible from the module CB)
Additional information: keyname

Elements in collections

An event of REDISMODULE_ELEMENT_DELETE will be fired just before the deletion (so the element is still accessible from the module CB)
Additional information: keyname, element name/path

Note: we may want to take the extra step (well, it's already done in #9406) and report the reason for deletion (eviction, expiry, overwrite, etc.)
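The timing rules above (insert events fire after the change, delete events fire before it) can be illustrated with a toy keyspace. This is only a sketch under stated assumptions: `SketchDb`, `SketchEvent`, `sketch_insert`, `sketch_delete` and the callback type are hypothetical names made up for illustration, not part of the Redis module API, and the toy insert does not check for an existing key.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event ids loosely mirroring the names in this proposal. */
typedef enum { SKETCH_KEY_INSERT, SKETCH_KEY_DELETE } SketchEvent;

#define SKETCH_MAX_KEYS 8

/* Toy keyspace: a fixed array of name/value pairs (not how Redis stores keys). */
typedef struct {
    const char *names[SKETCH_MAX_KEYS];
    const char *values[SKETCH_MAX_KEYS];
    int used[SKETCH_MAX_KEYS];
} SketchDb;

typedef void (*SketchCallback)(SketchDb *db, SketchEvent ev, const char *keyname);

/* Returns the stored value, or NULL if the key is absent. */
const char *sketch_lookup(const SketchDb *db, const char *keyname) {
    for (int i = 0; i < SKETCH_MAX_KEYS; i++)
        if (db->used[i] && strcmp(db->names[i], keyname) == 0)
            return db->values[i];
    return NULL;
}

/* Insert: the event fires AFTER the key is stored, so the callback can read it. */
int sketch_insert(SketchDb *db, const char *keyname, const char *value,
                  SketchCallback cb) {
    assert(cb != NULL);
    for (int i = 0; i < SKETCH_MAX_KEYS; i++) {
        if (!db->used[i]) {
            db->names[i] = keyname;
            db->values[i] = value;
            db->used[i] = 1;
            cb(db, SKETCH_KEY_INSERT, keyname);
            return 1;
        }
    }
    return 0; /* keyspace full */
}

/* Delete: the event fires BEFORE removal, so the callback can still read the key. */
int sketch_delete(SketchDb *db, const char *keyname, SketchCallback cb) {
    assert(cb != NULL);
    for (int i = 0; i < SKETCH_MAX_KEYS; i++) {
        if (db->used[i] && strcmp(db->names[i], keyname) == 0) {
            cb(db, SKETCH_KEY_DELETE, keyname);
            db->used[i] = 0;
            return 1;
        }
    }
    return 0; /* no such key */
}

/* Example callback: remembers the value it could observe during the event. */
static const char *sketch_last_seen;
static int sketch_events_seen;

void sketch_record(SketchDb *db, SketchEvent ev, const char *keyname) {
    (void)ev;
    sketch_last_seen = sketch_lookup(db, keyname);
    sketch_events_seen++;
}
```

With this ordering the callback observes the value in both cases, which is the property a post-deletion “del” KSN cannot offer.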

Overwrite

Whenever an existing key/element is overwritten (e.g. SUNIONSTORE where dst exists)

An overwrite will just be a combo of “delete” followed by an “insert”
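A minimal sketch of that ordering, assuming a single value slot and an in-memory event log. `OwEvent`, `OwLog` and `ow_store` are hypothetical names used only for illustration and do not exist in Redis.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event ids for this sketch only. */
typedef enum { OW_EV_INSERT, OW_EV_DELETE } OwEvent;

/* Small fixed-size log so the emitted sequence can be inspected. */
typedef struct {
    OwEvent events[4];
    size_t count;
} OwLog;

static void ow_emit(OwLog *log, OwEvent ev) {
    assert(log->count < sizeof log->events / sizeof log->events[0]);
    log->events[log->count++] = ev;
}

/* Store `value` in `*slot`. If the slot already holds a value, the store is
 * reported as a delete (emitted while the old value is still in place)
 * followed by an insert (emitted once the new value is in place). */
void ow_store(const char **slot, const char *value, OwLog *log) {
    if (*slot != NULL) {
        ow_emit(log, OW_EV_DELETE); /* old value still readable here */
        *slot = NULL;
    }
    *slot = value;
    ow_emit(log, OW_EV_INSERT);     /* new value already readable here */
}
```

Storing into an empty slot logs a single insert; storing into an occupied slot logs a delete and then an insert, so a module only needs the two existing handlers to follow overwrites.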

Limitations / open questions

  1. How do we handle stream metadata changes, like adding/removing a consumer or adding/removing an entry in the PEL? Should we have events for these as well? If so, we need to provide the module CB with a bit more information than just the keyname and element path.

Another approach: KSN-like events

Another approach is to do something flexible like KSN (i.e. the events are free text) but to fire them either before, or both before and after, something changes in the dataset.
For example, before an INCR we fire a KSN-like event, "pre-incr", with the keyname and the RedisModuleKey passed as arguments to the module CB. Another "post-incr" event will be fired afterward.
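As a rough sketch of what the module side of this could look like, assuming the event name arrives as a plain string and the value is passed as a read-only pointer (a stand-in for the RedisModuleKey). The names `KsnLikeCallback`, `ksn_like_incr`, `IncrImages` and `ksn_like_capture` are hypothetical and not part of any existing API.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback shape: free-text event name plus key name and value. */
typedef void (*KsnLikeCallback)(const char *event, const char *keyname,
                                const long long *value, void *ctx);

/* Toy INCR: fire "pre-incr", apply the change, then fire "post-incr". */
void ksn_like_incr(const char *keyname, long long *value,
                   KsnLikeCallback cb, void *ctx) {
    assert(cb != NULL);
    cb("pre-incr", keyname, value, ctx);
    (*value)++;
    cb("post-incr", keyname, value, ctx);
}

/* What a module might collect: the value before and after the change. */
typedef struct {
    long long before;
    long long after;
    int unknown; /* events this module does not recognize */
} IncrImages;

/* The callback has to match on event strings, one branch per known event. */
void ksn_like_capture(const char *event, const char *keyname,
                      const long long *value, void *ctx) {
    IncrImages *img = ctx;
    (void)keyname;
    if (strcmp(event, "pre-incr") == 0)
        img->before = *value;
    else if (strcmp(event, "post-incr") == 0)
        img->after = *value;
    else
        img->unknown++;
}
```

The string matching in the callback is where this approach grows: every command the module cares about adds more branches and more event names to keep track of.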

Outro

The insert/update/delete approach is more generic and thus makes things a bit easier for module writers (only three cases to handle in the CB), but it is not as flexible as the KSN-like approach (which, in turn, would make the module CB quite long and complex).

Related issues

#12047, #12073, #1697, #2057, #3186, #6973

@guybe7
Collaborator Author

guybe7 commented Jun 21, 2023

@redis/core-team @joehu21 please review

@oranagra
Member

i'm not a fan of either. i think the one listed as an alternative approach requires too much knowledge from the module. it's not a standard interface, but rather a collection of partially documented strings.

the proposed one is cleaner, but it only covers certain types (maps, arrays, lists, and nested collections of these). as noted, it doesn't properly cover other types, like maybe HLL, bloom, streams with their metadata, and who knows what other odd types modules implement (i.e. we'll want to let modules that implement data types fire these events too)

@joehu21
Contributor

joehu21 commented Jun 21, 2023

An interesting proposal. Map-like types such as hash are widely used, which is addressed by the proposal. It's hard to cover all data types. I think this is a good initiative to enrich and complement the existing KSN. For module types such as JSON, the corresponding modules can be easily modified to fire the proposed new events.

@madolson
Contributor

The problem to me is that Redis doesn't really have building block operations that can be generalized. Compare it to a relational database which has the tuple, which is pretty easy to hook into insert, delete, update operations. You have the pre-image of the tuple and you have the post image. We have no building block here, since every datastructure is unique and modules present unbounded potential for integration.

I see two options:

  1. We think through the fundamental operations that can occur on each data structure, such as STRING_SET(prev_string, new_string), STRING_DELETE, LIST_PUSH(position, element), LIST_POP(position), LIST_COPY(existing_list), ZSET_ADD(key, score), ZSET_REMOVE, HASH_SET(existing), HASH_REMOVE, STREAM_XREAD(consumer). All commands will become compositions of these APIs; an MSET will repeatedly call STRING_SET(), for example. Modules would be able to define their own primitive operations. Modules would also be able to hook into these operations and define custom behavior that operates on their pre and post images. Within the engine, we could abstract away all of the actions on the core data structures to run through these core operations. We could even investigate doing something fancy like allowing pluggable datastructures here. We could have implemented our per-object memory overhead this way (see "Introduce CLUSTER SLOT-STATS command (#11422)", #11432). I imagine RL could port parts of their CRDT implementation here. We could support low level optimizations for power users or managed providers.

  2. Alternatively, we acknowledge there are no fundamental operations. The only primitives we are left with are the ones highlighted here, insert/update/delete, plus knowing which command was executed. I think this option is useful, but ultimately I think module developers will find the outlined approach very lacking.

I really think we should consider something like option 1 more deeply.
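To make option 1 a bit more concrete, here is a small sketch of a single primitive with a pre/post-image hook, and an MSET-like command built purely by composing it. Everything here is an assumption for illustration: `prim_string_set`, `prim_mset`, `StringSetHook`, `PrimSlot` and `PrimStats` are hypothetical names, and a real engine would deal with real key objects rather than string pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook: sees the previous and the new string for one key. */
typedef void (*StringSetHook)(const char *keyname, const char *prev_string,
                              const char *new_string, void *ctx);

/* Toy key: a name and a string value (NULL means the key does not exist). */
typedef struct {
    const char *name;
    const char *value;
} PrimSlot;

/* The primitive: every string write goes through here, so one hook sees all. */
void prim_string_set(PrimSlot *slot, const char *new_string,
                     StringSetHook hook, void *ctx) {
    const char *prev = slot->value; /* pre-image */
    slot->value = new_string;       /* post-image is now in place */
    if (hook != NULL)
        hook(slot->name, prev, new_string, ctx);
}

/* MSET-like command expressed only as repeated calls to the primitive. */
void prim_mset(PrimSlot *slots, const char *const *values, size_t n,
               StringSetHook hook, void *ctx) {
    for (size_t i = 0; i < n; i++)
        prim_string_set(&slots[i], values[i], hook, ctx);
}

/* Example hook: count writes, and how many replaced an existing value. */
typedef struct {
    size_t calls;
    size_t overwrites;
} PrimStats;

void prim_count_hook(const char *keyname, const char *prev_string,
                     const char *new_string, void *ctx) {
    PrimStats *stats = ctx;
    (void)keyname;
    (void)new_string;
    stats->calls++;
    if (prev_string != NULL)
        stats->overwrites++;
}
```

Because the command never touches the value directly, the hook is guaranteed to see every write with both images, without knowing which command triggered it.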

Projects
Status: Backlog
4 participants