Hi! I have a few questions about resume handlers:

- Are there any guarantees about call order with `.on.resume` vs. update and create? Will resume always get called, or could update sometimes be called instead?
- Is it possible to know that Kopf has finished processing the current set of resumes? That is, has it finished starting up and is now watching and (crucially) waiting for events?
- Is there any way I could see a resume event after I've processed a delete event?

I'd like to populate a cache with all instances of my CRD at start-up, then keep it up to date by following the create/update/delete events. I'd like to be able to know that my cache is in a good state before I start processing events, so that I can reference it. Is this doable?
Neither before nor after, but as part of create/update. Usually, the handlers are called in the order they are declared. The resume handlers are then just mixed into the list of handlers to call if this is an operator restart.
The order of handlers can be additionally controlled with the handler lifecycles. It seems they are accidentally undocumented, but they ARE a public interface: https://kopf.readthedocs.io/en/latest/packages/kopf.reactor.lifecycles/.
You can also make your own callback and control the order as you wish (and store the state in the status).
I will document the handler ordering control a bit later.
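To illustrate the concept (a toy model, not Kopf's actual code): a lifecycle is essentially a callable that receives the handlers still pending for an object and decides which of them to invoke in the current processing cycle. The function names below mirror those in `kopf.lifecycles`; the bodies are illustrative only.

```python
def one_by_one(handlers, **kwargs):
    # Invoke strictly one pending handler per processing cycle,
    # in declaration order.
    return handlers[:1]

def all_at_once(handlers, **kwargs):
    # Invoke every pending handler in the same cycle.
    return list(handlers)

# The reactor would call the chosen lifecycle on each cycle:
pending = ["configure", "allocate", "notify"]
assert one_by_one(pending) == ["configure"]
assert all_at_once(pending) == ["configure", "allocate", "notify"]
```

A custom callback with this signature can implement any ordering policy, e.g. sorting the handlers by a priority stored in the object's status.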
Both the resume+update or resume+create handlers will be called. However, if they point to the same function, they will be de-duplicated, and that function will be called only once per event, be that creation, update, or operator restart (i.e. object resuming).

Similarly, with fn1 registered for resuming, fn2 for creation, and fn3 for updates: either fn1+fn2 or fn1+fn3 will be called on operator restart if the object pre-existed; or just fn2 or fn3 if the event happened while the operator was up (fn1 will not be called, as there was no "resuming" of anything).
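The selection logic can be sketched outside of Kopf as a small model (illustrative only; the handler and cause names are hypothetical): handlers whose cause matches the current event are collected in declaration order and de-duplicated by function identity.

```python
def select_handlers(registry, active_causes):
    """Pick handlers registered for any active cause, in declaration
    order, de-duplicating by function identity."""
    seen, selected = set(), []
    for cause, fn in registry:
        if cause in active_causes and id(fn) not in seen:
            seen.add(id(fn))
            selected.append(fn)
    return selected

def fn1(**_): pass
def fn2(**_): pass
def fn3(**_): pass

registry = [("resume", fn1), ("create", fn2), ("update", fn3)]

# Operator restart over a pre-existing, not-yet-handled object:
assert select_handlers(registry, {"resume", "create"}) == [fn1, fn2]

# Creation while the operator is up (no resuming involved):
assert select_handlers(registry, {"create"}) == [fn2]

# One function registered for several causes runs only once:
shared = [("resume", fn2), ("create", fn2)]
assert select_handlers(shared, {"resume", "create"}) == [fn2]
```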
No, it is not possible at the moment to know when Kopf has finished the initial listing and started watching.
And I'm not sure if this is at all conceptually possible: the handlers are per-object, not per-CRD; the operator could finish the initial listing and start watching, while the resuming handlers continue being executed (e.g. retried, or just queued and waiting for some objects, if there are many).
As I write this, I've realised there is a bug in the implementation: if a resume handler fails, it will never be retried the way other handlers are, because any retry arrives as a regular status-update event, not the initial listing; the same goes for resume handlers that are not the first in the list. Created #113 out of this (it seems easily fixable).
The resuming will not happen after deletion or as part of deletion. If the object is gone (or is going to be gone soon), there is nothing to resume/monitor/handle. This can be seen in the `kopf.reactor.causation` module: the deletion events & states are the first-catchers, and the routine does not continue further if the object is in the deletion state or actually deleted.

Why should it do so? What is the use-case?
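That gating can be sketched as a cause-detection function that checks the deletion states first (a toy model; the real module is more involved, and the cause names here are hypothetical):

```python
def detect_cause(event_type, metadata, is_restart):
    """Deletion states are checked first; resuming only applies
    to live objects seen during an operator restart."""
    if event_type == "DELETED":
        return "gone"
    if metadata.get("deletionTimestamp"):
        return "deletion"
    if is_restart:
        return "resume"
    return "regular"

assert detect_cause("MODIFIED", {"deletionTimestamp": "2019-01-01T00:00:00Z"}, True) == "deletion"
assert detect_cause("MODIFIED", {}, True) == "resume"
assert detect_cause("DELETED", {}, True) == "gone"
```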
The in-memory caching is probably doable (in theory). I didn't try this yet. Usually, I keep the state on the objects themselves, not in memory (e.g. children's labels referring to the parents, or parent's status fields referring to the children).
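A cache along the lines the question describes could look like this (a toy model with plain functions; in a real operator these would be the bodies of the on-resume/on-create/on-update/on-delete handlers, keyed by the object's UID):

```python
cache = {}

def on_resume_or_create(uid, spec):
    # Both resuming (at operator restart) and creation
    # put the object into the cache.
    cache[uid] = spec

def on_update(uid, spec):
    cache[uid] = spec

def on_delete(uid):
    cache.pop(uid, None)

on_resume_or_create("uid-1", {"size": 1})
on_update("uid-1", {"size": 2})
assert cache == {"uid-1": {"size": 2}}
on_delete("uid-1")
assert cache == {}
```

Note the caveat from the rest of the thread: there is no built-in signal that the initial listing is complete, so the cache cannot be known to be "fully warm" before regular events start arriving.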
If there are any problems, please let me know.
Thanks for the detailed answers!

I don't have a use case for delete+resume; I just wanted to make sure it wasn't a case I had to handle. Glad I don't!
`kopf==0.23rc1` was pre-released (see the release notes). It fixes a lot of things with the on-resume handlers in #230:

- There can be more than one on-resume handler (previously, only the first one was executed).
- They can go after the on-create/on-update handlers (previously, they worked only if they were first in the row).
- Arbitrary or temporary errors in the on-resume handlers are now retried, as for all other handlers (previously, they were ignored).
- Sub-handlers should now be possible too (I didn't check, but the blocking issue was the same as for all of the above).
- And they are not repeated anymore once the object has been resumed.

The order of execution is the same as before: mixed with the regular handlers in the order of appearance.
As it turned out, contrary to what I said above, the on-resume handlers CAN be called when the object is marked for deletion, and they were actually supposed to be called before; they just never got that far due to the issues mentioned above.

I am now unsure about the desired behaviour.

On the one hand, such behaviour is the more expected one: the on-resumes should happen when the operator restarts, the object does exist (no matter whether it is marked for deletion or not), and the deletion handlers are yet to be executed. And the execution can take some time due to retries.

Skipping the on-resumes when the object is marked for deletion can have undesired side-effects: the object DOES exist, but the operator does not know about it after the restart (unlike any objects that are not marked for deletion). And the deletion handlers can in fact execute for a long time (due to retries), while the object still exists.
On the other hand, the deletion handlers are a natural place for cleaning up the system resources allocated for the object, e.g. threads or tasks.
If the on-delete and on-resume handlers are mixed in this case, the resources are allocated and released quickly and needlessly (in the best case), or can be allocated in `on.resume()` AFTER the release has happened in `on.delete()` (in the worst case), thus leading to memory leaks.

The solution to this would be to check whether the object is being deleted while allocating the resources in the on-resume handlers. But this leads to unnecessary code when the desired behaviour is, in most cases, to "ignore" the handler (which violates Kopf's mission of being simple and intuitive).
Decided to go both ways (#233): skip the on-resume handlers on deletions by default, but make it possible to mark them as deletion-safe (on the developer's responsibility).
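That decision can be sketched as a filter over the registered resume handlers (a toy model only; in Kopf itself the deletion-safety mark is an opt-in flag on the handler declaration, and the handler names here are hypothetical):

```python
def applicable_resume_handlers(handlers, marked_for_deletion):
    """handlers: list of (fn, deletion_safe) pairs.
    Skip resume handlers for objects being deleted, unless the
    handler is explicitly marked as deletion-safe."""
    if not marked_for_deletion:
        return [fn for fn, _ in handlers]
    return [fn for fn, deletion_safe in handlers if deletion_safe]

def restore_cache(**_): pass     # safe to run even while deleting
def allocate_task(**_): pass     # would leak if run during deletion

handlers = [(restore_cache, True), (allocate_task, False)]
assert applicable_resume_handlers(handlers, marked_for_deletion=False) == [restore_cache, allocate_task]
assert applicable_resume_handlers(handlers, marked_for_deletion=True) == [restore_cache]
```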
Pre-released as `kopf==0.23rc2`.
Nice! Thanks for the update.
One of the reasons I asked about this behaviour is that I wanted to construct a cache of the current state in memory in my operator. I noticed you are now doing this internally for Kopf. Is that accessible from operator code?
@Jc2k It was made internal-only for the beginning (as part of this massive refactoring release).
It is relatively easy now to add an extra field to the `ResourceMemory` class for arbitrary user fields, with the same semantics as `threading.local`, and pass it as `memo` into the handlers' kwargs.

Though, I have some fear that it will be abused by operator developers to store data that should be persistent and kept on the resource's status instead. That is why I didn't expose it initially.
On second thought, that will be their problem then. There are plenty of other ways to mis-design something anyway.
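The idea can be modeled with per-object scratch namespaces held in the operator's memory (a toy model; as noted above, such state is lost on restart, so anything persistent belongs on the resource's status):

```python
import types
from collections import defaultdict

# One scratch namespace per object UID, with threading.local-like
# semantics: arbitrary attributes, scoped to the object.
memos = defaultdict(types.SimpleNamespace)

def handle_event(uid):
    memo = memos[uid]  # same namespace for every event of this object
    memo.counter = getattr(memo, "counter", 0) + 1
    return memo.counter

assert handle_event("uid-1") == 1
assert handle_event("uid-1") == 2
assert handle_event("uid-2") == 1  # a different object, a fresh memo
```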
Implemented in #234. Released as `kopf==0.23rc3`.